Crustal origin of trench-parallel shear-wave fast polarizations in the Central Andes
NASA Astrophysics Data System (ADS)
Wölbern, I.; Löbl, U.; Rümpker, G.
2014-04-01
In this study, SKS and local S phases are analyzed to investigate variations of shear-wave splitting parameters along two dense seismic profiles across the central Andean Altiplano and Puna plateaus. In contrast to previous observations, the vast majority of the measurements reveal fast polarizations sub-parallel to the subduction direction of the Nazca plate with delay times between 0.3 and 1.2 s. Local phases show larger variations of fast polarizations and exhibit delay times ranging between 0.1 and 1.1 s. Two 70 km and 100 km wide sections along the Altiplano profile exhibit larger delay times and are characterized by fast polarizations oriented sub-parallel to major fault zones. Based on finite-difference wavefield calculations for anisotropic subduction zone models we demonstrate that the observations are best explained by fossil slab anisotropy with fast symmetry axes oriented sub-parallel to the slab movement in combination with a significant component of crustal anisotropy of nearly trench-parallel fast-axis orientation. From the modeling we exclude a sub-lithospheric origin of the observed strong anomalies due to the short-scale variations of the fast polarizations. Instead, our results indicate that anisotropy in the Central Andes generally reflects the direction of plate motion while the observed trench-parallel fast polarizations likely originate in the continental crust above the subducting slab.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Youcef
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
The Anisotropic Structure of South China Sea: Using OBS Data to Constrain Mantle Flow
NASA Astrophysics Data System (ADS)
Li, L.; Xue, M.; Yang, T.; Liu, C.; Hua, Q.; Xia, S.; Huang, H.; Le, B. M.; Huo, D.; Pan, M.
2015-12-01
The dynamic mechanism of the formation of the South China Sea (SCS) has been debated for decades. The anisotropic structure can provide useful insight into the complex evolution of the SCS by indicating its mantle flow direction and strength. In this study, we apply shear wave splitting methods to two half-year seismic datasets collected from 10 and 6 passive-source Ocean Bottom Seismometers (OBS), respectively. These OBSs were deployed along both sides of the extinct ridge in the central basin of the SCS by Tongji University in 2012 and 2013 and were successfully recovered in 2013 and 2015, respectively. After processing and inspecting the global and regional earthquakes of the 2012 dataset (local events are still being processed), measurements are made for 2 global events and 24 regional events at 5 OBSs using the tangential energy minimization, the smallest eigenvalue minimization, and the correlation methods. We also implement cluster analysis on the splitting results obtained for different time windows and for different frequency bands. For teleseismic core phases such as SKS and PKS, we find that the fast polarization direction beneath the central basin is approximately NE-SW, nearly parallel to the extinct ridge in the central basin of the SCS. For regional events, in contrast, the splitting analysis of S, PS and ScS phases shows much more complicated fast directions, as the ray path varies for different phases.
The fast directions observed can be divided into three groups: (1) for the events from the Eurasia plate, a gradual rotation of the fast polarization direction from NNE-SSW to ENE-WSW along the path from the inner Eurasia plate to the central SCS is observed, implying that the mantle flow is controlled by the India-Eurasia collision; (2) for the events located at the junction of the Pacific plate and the Philippine plate, the dominant fast direction is NW-SE, almost perpendicular to the Ryukyu Trench and sub-parallel to the absolute direction of the Philippine plate; (3) for the events that occurred in the SE direction near the Philippine Fault zone, the observed NE-SW fast direction is sub-parallel to the subduction direction of the Philippine plate.
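The smallest eigenvalue minimization named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' processing code: the synthetic wavelet, the sampling interval, the grid ranges, and the function names `synth_split` and `min_eigenvalue_search` are all assumptions made for illustration. The idea is to try every candidate fast direction and delay time, undo the trial splitting, and keep the pair that makes the corrected particle motion most nearly linear (smallest second eigenvalue of the two-component covariance matrix).

```python
import numpy as np

def synth_split(n=400, dt_s=0.05, pol_deg=75.0, phi_deg=30.0, delay_s=1.0):
    """Synthetic north/east traces for one anisotropic layer (illustrative only)."""
    t = (np.arange(n) - n // 2) * dt_s
    wavelet = -t * np.exp(-0.5 * t**2)          # Gaussian-derivative pulse
    lag = int(round(delay_s / dt_s))            # delay in samples
    a = np.radians(pol_deg - phi_deg)           # angle between polarization and fast axis
    fast = wavelet * np.cos(a)
    slow = np.roll(wavelet, lag) * np.sin(a)    # slow wave arrives later
    phi = np.radians(phi_deg)
    north = fast * np.cos(phi) - slow * np.sin(phi)
    east = fast * np.sin(phi) + slow * np.cos(phi)
    return north, east

def min_eigenvalue_search(north, east, dt_s, max_delay_s=3.0):
    """Grid search over fast direction (deg from north) and delay time (s)."""
    best_lam, best_phi, best_delay = np.inf, None, None
    max_lag = int(round(max_delay_s / dt_s))
    for phi_deg in range(180):
        phi = np.radians(phi_deg)
        # Rotate into the trial fast/slow frame.
        fast = north * np.cos(phi) + east * np.sin(phi)
        slow = -north * np.sin(phi) + east * np.cos(phi)
        for lag in range(max_lag + 1):
            # Advance the trial slow component to undo the delay.
            cov = np.cov(np.vstack([fast, np.roll(slow, -lag)]))
            lam2 = np.linalg.eigvalsh(cov)[0]   # smaller eigenvalue
            if lam2 < best_lam:
                best_lam, best_phi, best_delay = lam2, phi_deg, lag * dt_s
    return best_phi, best_delay
```

Calling `min_eigenvalue_search(*synth_split(), dt_s=0.05)` on the noise-free synthetic should recover the fast direction and delay that were put in. Real data would additionally need windowing, filtering, and an error estimate, none of which is sketched here.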
NASA Astrophysics Data System (ADS)
Takagi, R.; Okada, T.; Yoshida, K.; Townend, J.; Boese, C. M.; Baratin, L. M.; Chamberlain, C. J.; Savage, M. K.
2016-12-01
We estimate shear wave velocity anisotropy in the shallow crust near the Alpine fault using seismic interferometry of borehole vertical arrays. We utilized four borehole observations: two sensors are deployed in two boreholes of the Deep Fault Drilling Project on the hanging wall side, and the other two sites are located on the footwall side. Surface sensors deployed just above each borehole are used to make vertical arrays. By cross-correlating rotated horizontal seismograms observed by the borehole and surface sensors, we extracted polarized shear waves propagating from the bottom to the surface of each borehole. The extracted shear waves show a dependence of travel time on polarization angle, indicating shear wave anisotropy between the two sensors. On the hanging wall side, the estimated fast shear wave directions are parallel to the Alpine fault. Strong anisotropy of 20% is observed at the site within 100 m of the Alpine fault. The hanging wall consists of mylonite and schist characterized by fault-parallel foliation. In addition, acoustic borehole imaging reveals fractures parallel to the Alpine fault. The fault-parallel anisotropy suggests that structural anisotropy is predominant in the hanging wall, demonstrating consistency of geological and seismological observations. On the footwall side, on the other hand, the angle between the fast direction and the strike of the Alpine fault is 33-40 degrees. Since the footwall is composed of granitoid that may not have planar structure, stress-induced anisotropy is possibly predominant. The direction of maximum horizontal stress (SHmax) estimated from focal mechanisms of regional earthquakes is 55 degrees from the strike of the Alpine fault. A possible interpretation of the difference between the fast direction and the SHmax direction is a rotation of the stress field with depth near the Alpine fault. A similar rotation of the stress field with depth is also observed in the SAFOD borehole at the San Andreas fault.
NASA Astrophysics Data System (ADS)
Olive, Jean-Arthur; Pearce, Frederick; Rondenay, Stéphane; Behn, Mark D.
2014-04-01
Many subduction zones exhibit significant retrograde motion of their arc and trench. The observation of fast shear-wave velocities parallel to the trench in such settings has been inferred to represent trench-parallel mantle flow beneath a retreating slab. Here, we investigate this process by measuring seismic anisotropy in the shallow Aegean mantle. We carry out shear-wave splitting analysis on a dense array of seismometers across the Western Hellenic Subduction Zone, and find a pronounced zonation of anisotropy at the scale of the subduction zone. Fast SKS splitting directions subparallel to the trench-retreat direction dominate the region nearest to the trench. Fast splitting directions abruptly transition to trench-parallel above the corner of the mantle wedge, and rotate back to trench-normal over the back-arc. We argue that the trench-normal anisotropy near the trench is explained by entrainment of an asthenospheric layer beneath the shallow-dipping portion of the slab. Toward the volcanic arc this signature is overprinted by trench-parallel anisotropy in the mantle wedge, likely caused by a layer of strained serpentine immediately above the slab. Arcward steepening of the slab and horizontal divergence of mantle flow due to rollback may generate an additional component of sub-slab trench-parallel anisotropy in this region. Poloidal flow above the retreating slab is likely the dominant source of back-arc trench-normal anisotropy. We hypothesize that trench-normal anisotropy associated with significant entrainment of the asthenospheric mantle near the trench may be widespread but only observable at shallow-dipping subduction zones where stations nearest the trench do not overlie the mantle wedge.
NASA Astrophysics Data System (ADS)
Palmesi, P.; Abert, C.; Bruckner, F.; Suess, D.
2018-05-01
Fast stray field calculation is commonly considered of great importance for micromagnetic simulations, since it is the most time consuming part of the simulation. The Fast Multipole Method (FMM) has displayed linear O(N) parallelization behavior on many cores. This article investigates the error of a recent FMM approach approximating sources using linear—instead of constant—finite elements in the singular integral for calculating the stray field and the corresponding potential. After measuring performance in an earlier manuscript, this manuscript investigates the convergence of the relative L2 error for several FMM simulation parameters. Various scenarios either calculating the stray field directly or via potential are discussed.
Anisotropic Behaviour of Magnetic Power Spectra in Solar Wind Turbulence.
NASA Astrophysics Data System (ADS)
Banerjee, S.; Saur, J.; Gerick, F.; von Papen, M.
2017-12-01
Introduction: High altitude fast solar wind turbulence (SWT) shows different spectral properties as a function of the angle between the flow direction and the scale dependent mean magnetic field (Horbury et al., PRL, 2008). The average magnetic power contained in the near perpendicular direction (80º-90º) was found to be approximately 5 times larger than the average power in the parallel direction (0º-10º). In addition, the parallel power spectrum was found to give a steeper (-2) power law than the perpendicular power spectral density (PSD), which followed a near Kolmogorov slope (-5/3). Similar anisotropic behaviour has also been observed (Chen et al., MNRAS, 2011) for slow solar wind (SSW), but using a different method exploiting multi-spacecraft data of Cluster. Purpose: In the current study, using Ulysses data, we investigate (i) the anisotropic behaviour of near ecliptic slow solar wind using the same methodology (described below) as that of Horbury et al. (2008) and (ii) the dependence of the anisotropic behaviour of SWT on the heliospheric latitude. Method: We apply the wavelet method to calculate the turbulent power spectra of the magnetic field fluctuations parallel and perpendicular to the local mean magnetic field (LMF). According to Horbury et al., LMF for a given scale (or size) is obtained using an envelope of the envelope of that size. Results: (i) SSW intervals always show near -5/3 perpendicular spectra. Unlike the fast solar wind (FSW) intervals, for SSW we often find intervals where power parallel to the mean field is not observed. For a few intervals with sufficient power in the parallel direction, slow wind turbulence also exhibits -2 parallel spectra similar to FSW. (ii) The behaviours of parallel and perpendicular power spectra are found to be independent of the heliospheric latitude.
Conclusion: In the current study we do not find a significant influence of the heliospheric latitude on the spectral slopes of parallel and perpendicular magnetic spectra. This indicates that the spectral anisotropy in the parallel and perpendicular directions is governed by intrinsic properties of SWT.
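The spectral slopes quoted in the abstract above (-5/3 and -2) are power-law indices of the power spectral density. As a minimal sketch of how such an index can be read off a spectrum, the following Python fragment fits a straight line in log-log space. It is not the wavelet analysis the authors describe; the function name `spectral_index` and the frequency range in the usage note are assumptions for illustration.

```python
import numpy as np

def spectral_index(freq, psd):
    """Least-squares slope of log10(PSD) against log10(frequency)."""
    slope, _intercept = np.polyfit(np.log10(freq), np.log10(psd), 1)
    return slope
```

For an exact power law such as `psd = 2.0 * freq ** (-5 / 3)` on `freq = np.logspace(-3, -1, 50)`, the fitted slope matches the exponent to within rounding error. Measured spectra are noisy, so in practice the fit range and its uncertainty matter as much as the slope itself.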
NASA Astrophysics Data System (ADS)
Cao, L.; Kao, H.; Wang, K.; Wang, Z.
2016-12-01
Haida Gwaii is located along the transpressive Queen Charlotte margin between the Pacific (PA) and North America (NA) plates. The highly oblique relative plate motion is partitioned, with the strike-slip component accommodated by the Queen Charlotte Fault (QCF) and the convergent component by a thrust fault offshore. To understand how the presence of an obliquely subducting slab influences shear deformation of the plate boundary, we investigate mantle anisotropy by analyzing shear-wave splitting of teleseismic SKS phases recorded at 17 seismic stations in and around Haida Gwaii. We used the MFAST program to determine the polarization direction of the fast wave (φ) and the delay time (δt) between the fast and slow phases. The fast directions derived from stations on Haida Gwaii and two stations to the north on the Alaska Panhandle are predominantly margin-parallel (NNW). However, away from the plate boundary, the fast direction transitions to WSW-trending, very oblique or perpendicular to the plate boundary. Because the average delay time of 0.6-2.45 s is much larger than values based on an associated local S phase splitting analysis in the same study area, it is reasonable to infer that most of the anisotropy from our SKS analysis originates from the upper mantle and is associated with lattice-preferred orientation of anisotropic minerals. The margin-parallel fast direction within about 100 km of the QCF (average φ = -40º and δt = 1.2 s) is likely induced by the PA-NA shear motion. The roughly margin-normal fast directions farther away, although more scattered, are consistent with those previously observed in the NA continent and are attributed to the absolute motion of the NA plate. However, the transition between the two regimes based on our SKS analysis appears to be gradual, suggesting that the plate boundary shear influences a much broader region at mantle depths than would be inferred from the surface trace of the QCF.
We think this is due to the presence of a subducted portion of the Pacific plate. Because the slab travels mostly in the strike direction, it is expected to induce margin-parallel shear deformation of the mantle material. This result has important implications for the geodynamics of transpressive plate margins.
NASA Astrophysics Data System (ADS)
Lu, San; Artemyev, A. V.; Angelopoulos, V.
2017-11-01
Magnetotail current sheet thinning is a distinctive feature of substorm growth phase, during which magnetic energy is stored in the magnetospheric lobes. Investigation of charged particle dynamics in such thinning current sheets is believed to be important for understanding the substorm energy storage and the current sheet destabilization responsible for substorm expansion phase onset. We use Time History of Events and Macroscale Interactions during Substorms (THEMIS) B and C observations in 2008 and 2009 at 18 - 25 RE to show that during magnetotail current sheet thinning, the electron temperature decreases (cooling), and the parallel temperature decreases faster than the perpendicular temperature, leading to a decrease of the initially strong electron temperature anisotropy (isotropization). This isotropization cannot be explained by pure adiabatic cooling or by pitch angle scattering. We use test particle simulations to explore the mechanism responsible for the cooling and isotropization. We find that during the thinning, a fast decrease of a parallel electric field (directed toward the Earth) can speed up the electron parallel cooling, causing it to exceed the rate of perpendicular cooling, and thus lead to isotropization, consistent with observation. If the parallel electric field is too small or does not change fast enough, the electron parallel cooling is slower than the perpendicular cooling, so the parallel electron anisotropy grows, contrary to observation. The same isotropization can also be accomplished by an increasing parallel electric field directed toward the equatorial plane. Our study reveals the existence of a large-scale parallel electric field, which plays an important role in magnetotail particle dynamics during the current sheet thinning process.
A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions
NASA Technical Reports Server (NTRS)
Sun, Xian-He; Zhuang, Yu
1997-01-01
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth-order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible to sixth-order accurate algorithms and to more general elliptic equations.
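The general idea of an FFT-based direct solver with Neumann boundary conditions can be sketched in a few lines of Python. This is a deliberately reduced illustration and not the fourth-order compact scheme of the abstract above: it is one-dimensional, uses the standard second-order three-point stencil on a cell-centered grid, and solves u'' - sigma*u = f with sigma > 0 so that the operator is never singular. The function name `solve_neumann_1d` and the even-extension trick used in place of a dedicated cosine transform are assumptions made for this sketch.

```python
import numpy as np

def solve_neumann_1d(f, h, sigma):
    """Solve u'' - sigma*u = f with u'(0) = u'(1) = 0 on a cell-centered grid.

    f     : right-hand side at x_j = (j + 0.5) * h, j = 0..n-1
    h     : grid spacing (1/n)
    sigma : positive constant (assumed > 0 to keep the system nonsingular)
    """
    n = f.size
    # Even reflection turns the Neumann problem into a periodic one of length 2n.
    f_ext = np.concatenate([f, f[::-1]])
    m = np.arange(2 * n)
    # Fourier symbol of the periodic three-point Laplacian, shifted by -sigma.
    symbol = (2.0 * np.cos(np.pi * m / n) - 2.0) / h**2 - sigma
    u_ext = np.fft.ifft(np.fft.fft(f_ext) / symbol)
    return u_ext[:n].real
```

With the manufactured solution u(x) = cos(pi*x), for which f = -(pi**2 + sigma) * cos(pi*x), the computed values agree with the exact ones up to the expected second-order discretization error. Each Fourier mode is divided independently, which is also why solvers of this kind parallelize naturally.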
Large-scale trench-perpendicular mantle flow beneath northern Chile
NASA Astrophysics Data System (ADS)
Reiss, M. C.; Rumpker, G.; Woelbern, I.
2017-12-01
We investigate the anisotropic properties of the forearc region of the central Andean margin by analyzing shear-wave splitting from teleseismic and local earthquakes from the Nazca slab. The data stem from the Integrated Plate boundary Observatory Chile (IPOC) located in northern Chile, covering an approximately 120 km wide coastal strip between 17°-25° S with an average station spacing of 60 km. With partly over ten years of data, this data set is uniquely suited to address the long-standing debate about the mantle flow field at the South American margin and in particular whether the flow field beneath the slab is parallel or perpendicular to the trench. Our measurements yield two distinct anisotropic layers. The teleseismic measurements show a change of fast polarization directions from North to South along the trench, ranging from parallel to subparallel to the absolute plate motion and, given the geometry of absolute plate motion and strike of the trench, mostly perpendicular to the trench. Shear-wave splitting from local earthquakes shows fast polarizations roughly aligned trench-parallel but exhibiting short-scale variations, which are indicative of a relatively shallow source. Comparisons between fast polarization directions and the strike of the local fault systems yield a good agreement. We use forward modelling to test the influence of the upper layer on the teleseismic measurements. We show that the observed variations of teleseismic measurements along the trench are caused by the anisotropy in the upper layer. Accordingly, the mantle layer is best characterized by a fast axis parallel to the absolute plate motion, which is roughly trench-perpendicular. This anisotropy is likely caused by a combination of crystallographic preferred orientation of the mantle mineral olivine as fossilized anisotropy in the slab and entrained flow beneath the slab.
We interpret the upper anisotropic layer to be confined to the crust of the overriding continental plate. This is explained by the shape-preferred orientation of micro-cracks in relation to local fault zones which are oriented parallel to the overall strike of the Andean range. Our results do not provide any evidence for a significant contribution of trench-parallel mantle flow beneath the subducting slab to the measurements.
Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael
2012-06-01
We present l₁-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative self-consistent parallel imaging (SPIRiT). Like many iterative magnetic resonance imaging reconstructions, l₁-SPIRiT's image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing l₁-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of l₁-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT spoiled gradient echo (SPGR) sequence with up to 8× acceleration via Poisson-disc undersampling in the two phase-encoded directions.
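The abstract above mentions joint sparsity across channels and iterative soft-thresholding without giving the operator. A common way to combine the two is group soft-thresholding, in which all channels share one shrinkage factor per coefficient position, computed from the cross-channel magnitude. The Python sketch below shows that operator under this assumption; it is not taken from the l₁-SPIRiT implementation, and the function name `joint_soft_threshold` and the array layout are hypothetical.

```python
import numpy as np

def joint_soft_threshold(coeffs, lam):
    """Group soft-thresholding across channels (illustrative sketch).

    coeffs : array of shape (n_channels, n_coeffs), real or complex
             (for example, wavelet coefficients of each coil image)
    lam    : non-negative threshold
    """
    # Cross-channel magnitude at each coefficient position.
    norms = np.sqrt(np.sum(np.abs(coeffs) ** 2, axis=0, keepdims=True))
    # Shrink the joint magnitude by lam; positions below lam go to zero.
    scale = np.maximum(norms - lam, 0.0) / np.maximum(norms, np.finfo(float).tiny)
    return coeffs * scale
```

Because the factor is shared, a coefficient position is kept or discarded in all channels together, which is what makes the sparsity joint. The operation is elementwise over positions, so it maps directly onto the GPU and multi-core parallelism discussed in the abstract.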
Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael
2012-01-01
We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
Yue, Chao; Li, Wen; Reeves, Geoffrey D.; ...
2016-07-01
Interactions between interplanetary (IP) shocks and the Earth's magnetosphere manifest many important space physics phenomena including low-energy ion flux enhancements and particle acceleration. In order to investigate the mechanisms driving shock-induced enhancement of low-energy ion flux, we have examined two IP shock events that occurred when the Van Allen Probes were located near the equator while ionospheric and ground observations were available around the spacecraft footprints. We have found that, associated with the shock arrival, electromagnetic fields intensified, and low-energy ion fluxes, including H+, He+, and O+, were enhanced dramatically in both the parallel and perpendicular directions. During the 2 October 2013 shock event, both parallel and perpendicular flux enhancements lasted more than 20 min with larger fluxes observed in the perpendicular direction. In contrast, for the 15 March 2013 shock event, the low-energy perpendicular ion fluxes increased only in the first 5 min during an impulse of electric field, while the parallel flux enhancement lasted more than 30 min. In addition, ionospheric outflows were observed after shock arrivals. From a simple particle motion calculation, we found that the rapid response of low-energy ions is due to drifts of plasmaspheric population by the enhanced electric field. Furthermore, the fast acceleration in the perpendicular direction cannot solely be explained by E × B drift but betatron acceleration also plays a role. Adiabatic acceleration may also explain the fast response of the enhanced parallel ion fluxes, while ion outflows may contribute to the enhanced parallel fluxes that last longer than the perpendicular fluxes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yue, Chao; Li, Wen; Reeves, Geoffrey D.
Interactions between interplanetary (IP) shocks and the Earth's magnetosphere manifest many important space physics phenomena including low-energy ion flux enhancements and particle acceleration. In order to investigate the mechanisms driving shock-induced enhancement of low-energy ion flux, we have examined two IP shock events that occurred when the Van Allen Probes were located near the equator while ionospheric and ground observations were available around the spacecraft footprints. We have found that, associated with the shock arrival, electromagnetic fields intensified, and low-energy ion fluxes, including H+, He+, and O+, were enhanced dramatically in both the parallel and perpendicular directions. During the 2 October 2013 shock event, both parallel and perpendicular flux enhancements lasted more than 20 min with larger fluxes observed in the perpendicular direction. In contrast, for the 15 March 2013 shock event, the low-energy perpendicular ion fluxes increased only in the first 5 min during an impulse of electric field, while the parallel flux enhancement lasted more than 30 min. In addition, ionospheric outflows were observed after shock arrivals. From a simple particle motion calculation, we found that the rapid response of low-energy ions is due to drifts of plasmaspheric population by the enhanced electric field. Furthermore, the fast acceleration in the perpendicular direction cannot solely be explained by E × B drift but betatron acceleration also plays a role. Adiabatic acceleration may also explain the fast response of the enhanced parallel ion fluxes, while ion outflows may contribute to the enhanced parallel fluxes that last longer than the perpendicular fluxes.
Large-scale trench-normal mantle flow beneath central South America
NASA Astrophysics Data System (ADS)
Reiss, M. C.; Rümpker, G.; Wölbern, I.
2018-01-01
We investigate the anisotropic properties of the fore-arc region of the central Andean margin between 17-25°S by analyzing shear-wave splitting from teleseismic and local earthquakes from the Nazca slab. With partly over ten years of recording time, the data set is uniquely suited to address the long-standing debate about the mantle flow field at the South American margin and in particular whether the flow field beneath the slab is parallel or perpendicular to the trench. Our measurements suggest two anisotropic layers located within the crust and mantle beneath the stations, respectively. The teleseismic measurements show a moderate change of fast polarizations from North to South along the trench, ranging from parallel to subparallel to the absolute plate motion, and are oriented mostly perpendicular to the trench. Shear-wave splitting measurements from local earthquakes show fast polarizations roughly aligned trench-parallel but exhibit short-scale variations which are indicative of a relatively shallow origin. Comparisons between fast polarization directions from local earthquakes and the strike of the local fault systems yield a good agreement. To infer the parameters of the lower anisotropic layer we employ an inversion of the teleseismic waveforms based on two-layer models, where the anisotropy of the upper (crustal) layer is constrained by the results from the local splitting. The waveform inversion yields a mantle layer that is best characterized by a fast axis parallel to the absolute plate motion, which is more or less perpendicular to the trench. This orientation is likely caused by a combination of the fossil crystallographic preferred orientation of olivine within the slab and entrained mantle flow beneath the slab. The anisotropy within the crust of the overriding continental plate is explained by the shape-preferred orientation of micro-cracks in relation to local fault zones which are oriented parallel to the overall strike of the Andean range.
Our results do not provide any evidence for a significant contribution of trench-parallel mantle flow beneath the subducting slab.
Hybrid mechanosensing system to generate the polarity needed for migration in fish keratocytes
Okimura, Chika; Iwadate, Yoshiaki
2016-01-01
ABSTRACT Crawling cells can generate polarity for migration in response to forces applied from the substratum. Such reaction varies according to cell type: there are both fast- and slow-crawling cells. In response to periodic stretching of the elastic substratum, the intracellular stress fibers in slow-crawling cells, such as fibroblasts, rearrange themselves perpendicular to the direction of stretching, with the result that the shape of the cells extends in that direction; whereas fast-crawling cells, such as neutrophil-like differentiated HL-60 cells and Dictyostelium cells, which have no stress fibers, migrate perpendicular to the stretching direction. Fish epidermal keratocytes are another type of fast-crawling cell. However, they have stress fibers in the cell body, which gives them a typical slow-crawling cell structure. In response to periodic stretching of the elastic substratum, intact keratocytes rearrange their stress fibers perpendicular to the direction of stretching in the same way as fibroblasts and migrate parallel to the stretching direction, while blebbistatin-treated stress fiber-less keratocytes migrate perpendicular to the stretching direction, in the same way as seen in HL-60 cells and Dictyostelium cells. Our results indicate that keratocytes have a hybrid mechanosensing system that comprises elements of both fast- and slow-crawling cells, to generate the polarity needed for migration. PMID:27124267
Hybrid massively parallel fast sweeping method for static Hamilton-Jacobi equations
NASA Astrophysics Data System (ADS)
Detrixhe, Miles; Gibou, Frédéric
2016-10-01
The fast sweeping method is a popular algorithm for solving a variety of static Hamilton-Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.
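For orientation, the serial fast sweeping method that the abstract above builds on can be sketched for the simplest static Hamilton-Jacobi problem, the eikonal equation |grad u| = 1 with a point source. The Python code below is a plain serial version with the standard first-order upwind update and four alternating sweep orderings; it does not reproduce the hybrid parallel algorithm of the paper. The function name `fast_sweep_eikonal`, the unit speed, and the fixed number of sweep rounds are assumptions made for illustration.

```python
import numpy as np

def fast_sweep_eikonal(n, h, source, n_rounds=4):
    """First-order fast sweeping for |grad u| = 1 on an n x n grid.

    n        : number of grid points per side
    h        : grid spacing
    source   : (i, j) index where u = 0
    n_rounds : number of passes over the four sweep orderings (assumed sufficient)
    """
    u = np.full((n, n), np.inf)
    u[source] = 0.0
    fwd, bwd = range(n), range(n - 1, -1, -1)
    orderings = [(fwd, fwd), (bwd, fwd), (fwd, bwd), (bwd, bwd)]
    for _ in range(n_rounds):
        for rows, cols in orderings:
            for i in rows:
                for j in cols:
                    if (i, j) == source:
                        continue
                    # Smallest neighbour value along each grid axis.
                    a = min(u[i - 1, j] if i > 0 else np.inf,
                            u[i + 1, j] if i < n - 1 else np.inf)
                    b = min(u[i, j - 1] if j > 0 else np.inf,
                            u[i, j + 1] if j < n - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue  # no information has reached this point yet
                    if abs(a - b) >= h:
                        cand = min(a, b) + h  # one-sided update
                    else:
                        cand = 0.5 * (a + b + np.sqrt(2.0 * h * h - (a - b) ** 2))
                    u[i, j] = min(u[i, j], cand)
    return u
```

On a uniform grid with a central source, the result approximates the Euclidean distance to the source, with the first-order error largest along the diagonals. The Gauss-Seidel dependence between neighbouring updates inside each sweep is what makes the method hard to parallelize, which is the limitation the abstract refers to.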
Parallel Fast Multipole Method For Molecular Dynamics
2007-06-01
Parallel Fast Multipole Method For Molecular Dynamics THESIS Reid G. Ormseth, Captain, USAF AFIT/GAP/ENP/07-J02 DEPARTMENT OF THE AIR FORCE AIR...the United States Government. AFIT/GAP/ENP/07-J02 Parallel Fast Multipole Method For Molecular Dynamics THESIS Presented to the Faculty Department of...has also been provided by ‘The Art of Molecular Dynamics Simulation ’ by Dennis Rapaport. This work is the clearest treatment of the Fast Multipole
Seismic Anisotropy Beneath the Eastern Flank of the Rio Grande Rift
NASA Astrophysics Data System (ADS)
Benton, N. W.; Pulliam, J.
2015-12-01
Shear wave splitting was measured across the eastern flank of the Rio Grande Rift (RGR) to investigate mechanisms of upper mantle anisotropy. Earthquakes recorded at epicentral distances of 90°-130° from EarthScope Transportable Array (TA) and SIEDCAR (SC) broadband seismic stations were examined comprehensively, via the Matlab program "Splitlab", to determine whether SKS and SKKS phases indicated anisotropic properties. Splitlab allows waveforms to be rotated, filtered, and windowed interactively, and splitting measurements are made on a user-specified waveform segment via three independent methods simultaneously. To improve the signal-to-noise ratio and reliability, we stacked the error surfaces that resulted from grid searches in the measurements for each station location. Fast polarization directions near the Rio Grande Rift tend to be sub-parallel to the RGR but then change to angles that are consistent with North America's average plate motion, to the east. The surface erosional depression of the Pecos Valley coincides with fast polarization directions that are aligned in a more northerly direction than their neighbors, whereas the topographic high to the east coincides with an easterly change of the fast axis. The area above a mantle high-velocity anomaly, discovered separately via seismic tomography and possibly indicating thickened lithosphere, corresponds to unusually large delay times and fast polarization directions that are more closely aligned to a north-south orientation. The area of southeastern New Mexico that falls between the mantle fast anomaly and the Great Plains craton displays dramatically smaller delay times, as well as changes in fast axis directions toward the northeast. Changes in fast axis directions may indicate flow around the mantle anomaly; small delay times could indicate vertical or attenuated flow.
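The stacking step mentioned above can be sketched as follows. This assumes each measurement yields a positive error surface sampled on a common grid of trial fast directions and delay times, and that each surface is divided by its own minimum before summation; that normalization is one common choice and is an assumption here, not something the abstract states. The function name and the toy grids are hypothetical.

```python
def stack_error_surfaces(surfaces):
    """Sum min-normalized error surfaces; return (stack, (i_best, j_best)).

    Each surface is a list of rows indexed as [fast-direction][delay-time]
    and must be strictly positive. Illustrative sketch, not Splitlab's code.
    """
    n_phi = len(surfaces[0])
    n_dt = len(surfaces[0][0])
    stack = [[0.0] * n_dt for _ in range(n_phi)]
    for s in surfaces:
        s_min = min(min(row) for row in s)  # normalize so every event weighs equally
        for i in range(n_phi):
            for j in range(n_dt):
                stack[i][j] += s[i][j] / s_min
    # Grid indices of the stacked minimum: the station's best-fitting pair.
    best = min(((i, j) for i in range(n_phi) for j in range(n_dt)),
               key=lambda ij: stack[ij[0]][ij[1]])
    return stack, best


# Two toy 2 x 2 surfaces standing in for two events at one station.
stack, best = stack_error_surfaces([[[4.0, 2.0], [1.0, 3.0]],
                                    [[2.0, 2.0], [1.0, 4.0]]])
```

Stacking trades per-event detail for a single, better-constrained estimate per station, which is only meaningful if the anisotropy sampled by the different events is similar.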
Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Detrixhe, Miles, E-mail: mdetrixhe@engineering.ucsb.edu; University of California Santa Barbara, Santa Barbara, CA, 93106; Gibou, Frédéric, E-mail: fgibou@engineering.ucsb.edu
The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.
Deformation, crystal preferred orientations, and seismic anisotropy in the Earth's D″ layer
NASA Astrophysics Data System (ADS)
Tommasi, Andréa; Goryaeva, Alexandra; Carrez, Philippe; Cordier, Patrick; Mainprice, David
2018-06-01
We use a forward multiscale model that couples atomistic modeling of intracrystalline plasticity mechanisms (dislocation glide ± twinning) in MgSiO3 post-perovskite (PPv) and periclase (MgO) at lower mantle pressures and temperatures to polycrystal plasticity simulations to predict crystal preferred orientations (CPO) development and seismic anisotropy in D″. We model the CPO evolution in aggregates of 70% PPv and 30% MgO submitted to simple shear, axial shortening, and along corner-flow streamlines, which simulate changes in flow orientation similar to those expected at the transition between a downwelling and flow parallel to the core-mantle boundary (CMB) within D″ or between CMB-parallel flow and upwelling at the borders of the large low shear wave velocity provinces (LLSVP) in the lowermost mantle. Axial shortening results in alignment of PPv [010] axes with the shortening direction. Simple shear produces PPv CPO with a monoclinic symmetry that rapidly rotates towards parallelism between the dominant [100](010) slip system and the macroscopic shear. These predictions differ from MgSiO3 post-perovskite textures formed in diamond-anvil cell experiments, but agree with those obtained in simple shear and compression experiments using CaIrO3 post-perovskite. Development of CPO in PPv and MgO results in seismic anisotropy in D″. For shear parallel to the CMB, at low strain, the inclination of ScS, Sdiff, and SKKS fast polarizations and delay times vary depending on the propagation direction. At moderate and high shear strains, all S-waves are polarized nearly horizontally. Downwelling flow produces Sdiff, ScS, and SKKS fast polarization directions and birefringence that vary gradually as a function of the back-azimuth from nearly parallel to inclined by up to 70° to CMB and from null to ∼5%. 
Change in the flow to shear parallel to the CMB results in dispersion of the CPO, weakening of the anisotropy, and strong azimuthal variation of the S-wave splitting up to 250 km from the corner. Transition from horizontal shear to upwelling also produces weakening of the CPO and complex seismic anisotropy patterns, with dominantly inclined fast ScS and SKKS polarizations, over most of the upwelling path. Models that take into account twinning in PPv explain most observations of seismic anisotropy in D″, but heterogeneity of the flow at scales <1000 km is needed to comply with the seismological evidence for low apparent birefringence in D″.
Contrasting upper-mantle shear wave anisotropy across the transpressive Queen Charlotte margin
NASA Astrophysics Data System (ADS)
Cao, Lingmin; Kao, Honn; Wang, Kelin
2017-10-01
In order to investigate upper mantle and crustal anisotropy along the transpressive Queen Charlotte margin between the Pacific (PA) and North America (NA) plates, we conducted shear wave splitting analyses using 17 seismic stations in and around the island of Haida Gwaii, Canada. Despite the limited station coverage at present, our reconnaissance study does reveal a systematic pattern of mantle anisotropy in this region. Fast directions derived from teleseismic SKS-phase splitting are mostly margin-parallel (NNW-SSE) near the plate boundary but transition to predominantly E-W-trending farther away. We propose that the former is associated with the absolute motion of PA, and the latter reflects a transition from this direction to that of the absolute motion of NA. The broad width of the zone of transition from the PA to NA direction is probably caused by the very obliquely subducting PA slab that travels primarily in the margin-parallel direction. Anisotropy of Haida Gwaii based on local earthquakes features a fast direction that cannot be explained with regional stresses and is probably associated with local structural fabric within the overriding crust. Our preliminary shear wave splitting measurements and working hypotheses based on them will serve to guide more refined future studies to unravel details of the geometry and kinematics of the subducted PA slab, as well as the viscous coupling between the slab and upper mantle in other transpressive margins.
Solar Wind Proton Temperature Anisotropy: Linear Theory and WIND/SWE Observations
NASA Technical Reports Server (NTRS)
Hellinger, P.; Travnicek, P.; Kasper, J. C.; Lazarus, A. J.
2006-01-01
We present a comparison between WIND/SWE observations (Kasper et al., 2006) of $\beta_{\parallel p}$ and $T_{\perp p}/T_{\parallel p}$ (where $\beta_{\parallel p}$ is the proton parallel beta and $T_{\perp p}$ and $T_{\parallel p}$ are the perpendicular and parallel proton temperatures, respectively; here parallel and perpendicular indicate directions with respect to the ambient magnetic field) and predictions of the Vlasov linear theory. In the slow solar wind, the observed proton temperature anisotropy seems to be constrained by oblique instabilities, namely the mirror and the oblique fire hose, contrary to the results of the linear theory, which predicts a dominance of the proton cyclotron instability and the parallel fire hose. The fast solar wind core protons exhibit an anticorrelation between $\beta_{\parallel c}$ and $T_{\perp c}/T_{\parallel c}$ (where $\beta_{\parallel c}$ is the core proton parallel beta and $T_{\perp c}$ and $T_{\parallel c}$ are the perpendicular and parallel core proton temperatures, respectively) similar to that observed in the HELIOS data (Marsch et al., 2004).
Parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1975-01-01
The letter discusses the parallel and pipeline organization of fast-unitary-transform algorithms such as the fast Fourier transform, and points out the efficiency of a combined parallel-pipeline processor of a transform such as the Haar transform, in which $2^n - 1$ hardware 'butterflies' generate a transform of order $2^n$ every computation cycle.
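The butterfly count quoted above can be checked with a small software sketch of the fast Haar transform. Each stage pairs adjacent entries into a sum and a difference (one butterfly per pair) and then recurses on the sums only, so an input of length 2^n uses 2^(n-1) + 2^(n-2) + ... + 1 = 2^n - 1 butterflies. The function name is hypothetical, and the transform is left unnormalized (the unitary version scales each butterfly by 1/sqrt(2)) so that the example stays in exact integer arithmetic.

```python
def fast_haar(x):
    """Unnormalized fast Haar transform; returns (coefficients, butterfly_count).

    Illustrative software model, not the hardware organization of the letter.
    """
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    out = list(x)
    butterflies = 0
    length = n
    while length > 1:
        half = length // 2
        sums, diffs = [], []
        for k in range(half):
            a, b = out[2 * k], out[2 * k + 1]
            sums.append(a + b)   # low-pass output of the butterfly
            diffs.append(a - b)  # high-pass output of the butterfly
            butterflies += 1
        # Only the sums are processed again at the next stage.
        out[:length] = sums + diffs
        length = half
    return out, butterflies


coeffs, count = fast_haar([1, 2, 3, 4, 5, 6, 7, 8])
```

For this length-8 input the count is 7, matching 2^3 - 1.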
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B
2011-06-03
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
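The "multiplication of two vectors" at the core of spectral matching can be illustrated with a serial sketch. It assumes spectra have already been binned into equal-length intensity vectors and scores a query against every library entry with a normalized dot product. This is not the FastPaSS or SpectraST scoring code; the function names and the toy library are hypothetical.

```python
import math


def normalize(vec):
    """Scale a binned intensity vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def best_match(query, library):
    """Score `query` against every library spectrum; return (best_name, scores)."""
    q = normalize(query)
    scores = {
        name: sum(a * b for a, b in zip(q, normalize(spec)))
        for name, spec in library.items()
    }
    return max(scores, key=scores.get), scores


library = {"peptide_A": [1.0, 0.0, 0.0], "peptide_B": [0.0, 1.0, 1.0]}
name, scores = best_match([0.0, 2.0, 2.0], library)
```

Each library comparison is independent of the others, which is why the loop over library entries maps naturally onto many GPU threads.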
Complicated seismic anisotropy beneath south-central Mongolia and its geodynamic implications
NASA Astrophysics Data System (ADS)
Qiang, Zhengyang; Wu, Qingju; Li, Yonghua; Gao, Mengtan; Demberel, Sodnomsambuu; Ulzibat, Munkhuu; Sukhbaatar, Usnikh; Flesch, Lucy M.
2017-05-01
Two years of high-quality broadband seismic data from 69 temporary stations deployed in south-central Mongolia provide an opportunity to study the anisotropy-forming mechanisms in this area. The majority of shear wave splitting observations determined from the analysis of teleseismic SKS phase are characterized by NW-SE trending fast directions with large splitting delay times (greater than 2.0 s at six stations), which is inferred to be generated by active asthenospheric flow. The variation of the fast direction may be associated with deflection of asthenosphere around the deep Siberian cratonic keel at the base of the lithosphere. Several of the NE-SW trending fast directions with relatively small delay times observed in the Gobi Desert are parallel to the strike of the main faults and sutures, which may represent lithospheric deformation. In addition, it is inferred that small-scale hot mantle upwelling is responsible for generating a cluster of null measurements observed on the south of the Hentiy Mountain.
One-step trinary signed-digit arithmetic using an efficient encoding scheme
NASA Astrophysics Data System (ADS)
Salim, W. Y.; Fyath, R. S.; Ali, S. A.; Alam, Mohammad S.
2000-11-01
The trinary signed-digit (TSD) number system is of interest for ultra fast optoelectronic computing systems since it permits parallel carry-free addition and borrow-free subtraction of two arbitrary length numbers in constant time. In this paper, a simple coding scheme is proposed to encode the decimal number directly into the TSD form. The coding scheme enables one to perform parallel one-step TSD arithmetic operation. The proposed coding scheme uses only a 5-combination coding table instead of the 625-combination table reported recently for recoded TSD arithmetic technique.
LDRD final report on massively-parallel linear programming : the parPCx system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar
2005-02-01
This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.
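For orientation, the LP problem and the sparse linear system referred to above can be written out in a standard textbook form. The notation is an assumption for illustration: the abstract does not give the report's exact formulation. Here $A$ is the constraint matrix, $x$ the primal variables, $s$ the dual slacks, $X = \mathrm{diag}(x)$, $S = \mathrm{diag}(s)$, and $r$ a right-hand side assembled from the current residuals.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & c^{\top} x \\
\text{subject to} \quad & A x = b, \\
& x \ge 0,
\end{aligned}
\qquad\qquad
A D A^{\top} \, \Delta y = r, \quad D = X S^{-1}.
```

In primal-dual interior-point methods of this kind, a system of the second form (the normal equations) is typically solved at each predictor and corrector step, so the sparsity pattern of $A$ determines how well the work distributes across processors.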
Fast disk array for image storage
NASA Astrophysics Data System (ADS)
Feng, Dan; Zhu, Zhichun; Jin, Hai; Zhang, Jiangling
1997-01-01
A fast disk array is designed for large continuous image storage. It includes a high-speed data architecture and the technology of data striping and organization on the disk array. The high-speed data path, which is constructed from two dual-port RAMs and some control circuitry, is configured to transfer data between a host system and a plurality of disk drives. The bandwidth can be more than 100 MB/s if the data path is based on PCI (peripheral component interconnect). The organization of data stored on the disk array is similar to RAID 4. Data are striped across a plurality of disks, and each striping unit is equal to a track. I/O instructions are performed in parallel on the disk drives. An independent disk is used to store the parity information in the fast disk array architecture. By placing the parity generation circuit directly on the SCSI (or SCSI 2) bus, the parity information can be generated on the fly, with little effect on the parallel writing of data to the other disks. The fast disk array architecture designed in the paper can meet the demands of image storage.
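The RAID 4 style parity scheme described above can be sketched in software. The parity unit is the bytewise XOR of the data striping units, and any single lost unit is recovered by XOR-ing the surviving units with the parity. The function names and the two-byte "tracks" are hypothetical; the paper generates parity in hardware on the SCSI bus, which this sketch does not model.

```python
def parity(blocks):
    """Bytewise XOR of equal-length blocks (the content of the parity disk)."""
    p = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            p[i] ^= byte
    return bytes(p)


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the single missing data block from the survivors plus parity."""
    return parity(list(surviving_blocks) + [parity_block])


# Three data disks, one striping unit each (2 bytes stand in for a track).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
p = parity(data)
recovered = reconstruct([data[0], data[2]], p)  # disk 1 is lost
```

Since XOR is associative and commutative, the parity can be accumulated as each data unit streams past, which is what makes on-the-fly generation possible.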
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.
1991-01-01
Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
A note on parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1974-01-01
The parallel and pipeline organization of fast unitary transform algorithms such as the Fast Fourier Transform is discussed. The efficiency of a combined parallel-pipeline processor is pointed out for a transform such as the Haar transform, in which $2^n - 1$ hardware butterflies generate a transform of order $2^n$ every computation cycle.
Crystal alignments in the fast ice of Arctic Alaska
NASA Astrophysics Data System (ADS)
Weeks, W. F.; Gow, A. J.
1980-02-01
Field observations at 60 sites located in the fast or near-fast ice along a 1200-km stretch of the north coast of Alaska between the Bering Strait and Barter Island have shown that the great majority of the ice samples (95%) exhibit striking c axis alignments within the horizontal plane. In all cases the degree of preferred orientation increased with depth in the ice. Representative standard deviations around a mean direction in the horizontal plane are commonly less than ±10° for samples collected near the bottom of the ice. At a given site the mean c axis direction may vary as much as 20° with vertical location in the ice sheet. The c axis alignments in the nearshore region generally parallel the coast, with strong alignments occurring in the lagoon systems between the barrier islands and the coast and seaward of the barrier islands. In passes between islands and in entrances such as the opening to Kotzebue Sound the alignment is parallel to the channel. Only limited observations are available farther seaward over the inner (10- to 50-m isobaths) and outer (50-m isobath to shelf break) shelf regions. These indicate NE-SW and E-W alignments, respectively, in the Beaufort Sea north of Prudhoe Bay. The general patterns of the alignments support the correlation between the preferred c axis direction and the current direction at the ice/water interface suggested by Weeks and Gow (1978). A comparison between c axis alignments and instantaneous current measurements made at 42 locations shows that the most frequent current direction coincides with the mean c axis direction. At the one site where we were able to determine the current direction (52°T) over a longer period (7 hours), the agreement with the mean c axis direction (48°T) was excellent. Similarly, if only mean c axis directions determined in the nearshore region are considered, the most frequent deviation between the mean c axis direction and the trend of the adjacent shoreline, which is presumably parallel to the prevailing longshore currents, is 10° or less. 
The c axis alignments are believed to be the result of geometric selection, with the most favored orientation being that in which the current flows normal to the (0001) plates of ice that comprise the dendritic sea ice/seawater interface. The instantaneous current observations suggest SW nearshore currents along the Chukchi coast between SW of Point Lay and SW of the Rogers-Post Monument. In the vicinity of Barrow all currents measured along the Chukchi coast were toward the NE. Current directions along the Beaufort coast in the nearshore region were generally parallel to the coast, with 45% of the observations indicating currents toward the E and 55% currents toward the W.
Crustal anisotropy in the forearc of the Northern Cascadia Subduction Zone, British Columbia
NASA Astrophysics Data System (ADS)
Balfour, N. J.; Cassidy, J. F.; Dosso, S. E.
2012-01-01
This paper aims to identify sources and variations of crustal anisotropy from shear-wave splitting measurements in the forearc of the Northern Cascadia Subduction Zone of southwest British Columbia. Over 20 permanent stations and 15 temporary stations were available for shear-wave splitting analysis on ∼4500 event-station pairs for local crustal earthquakes. Results from 1100 useable shear-wave splitting measurements show spatial variations in fast directions, with margin-parallel fast directions at most stations and margin-perpendicular fast directions at stations in the northeast of the region. Crustal anisotropy is often attributed to stress and has been interpreted as the fast direction being related to the orientation of the maximum horizontal compressive stress. However, studies have also shown anisotropy can be complicated by crustal structure. Southwest British Columbia is a complex region of crustal deformation and some of the stations are located near large ancient faults. To use seismic anisotropy as a stress indicator requires identifying which stations are influenced by stress and which by structure. We determine the source of anisotropy at each station by comparing fast directions from shear-wave splitting results to the maximum horizontal compressive stress orientation determined from earthquake focal mechanism inversion. Most stations show agreement between the fast direction and the maximum horizontal compressive stress. This suggests that anisotropy is related to stress-aligned fluid-filled microcracks based on extensive dilatancy anisotropy. These stations are further analysed for temporal variations to lay groundwork for monitoring temporal changes in the stress over extended time periods. Determining the sources of variability in anisotropy can lead to a better understanding of the crustal structure and stress, and in the future may be used as a monitoring and mapping tool.
Are Fast Radio Bursts the Birthmark of Magnetars?
NASA Astrophysics Data System (ADS)
Lieu, Richard
2017-01-01
A model of fast radio bursts, which enlists young, short period extragalactic magnetars satisfying $B/P > 2 \times 10^{16}$ G s$^{-1}$ (1 G = 1 statvolt cm$^{-1}$) as the source, is proposed. When the parallel component $\boldsymbol{E}_\parallel$ of the surface electric field (under the scenario of a vacuum magnetosphere) of such pulsars approaches 5% of the critical field $E_c = m_e^2 c^3/(e\hbar)$ in strength, the field can readily decay via the Schwinger mechanism into electron-positron pairs, the back reaction of which causes $\boldsymbol{E}_\parallel$ to oscillate on a characteristic timescale smaller than the development of a spark gap. Thus, under this scenario, the open field line region of the pulsar magnetosphere is controlled by Schwinger pairs, and their large creation and acceleration rates enable the escaping pairs to coherently emit radio waves directly from the polar cap. The majority of the energy is emitted at frequencies ≲ 1 GHz where the coherent radiation has the highest yield, at a rate large enough to cause the magnetar to lose spin significantly over a timescale ≈ a few × $10^{-3}$ s, the duration of a fast radio burst. Owing to the circumstellar environment of a young magnetar, however, the ≲ 1 GHz radiation is likely to be absorbed or reflected by the overlying matter. It is shown that the brightness of the remaining (observable) frequencies of ≈ 1 GHz and above is on a par with a typical fast radio burst. Unless some spin-up mechanism is available to recover the original high rotation rate that triggered the Schwinger mechanism, the fast radio burst will not be repeated in the same magnetar.
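As a quick numerical check of the scale involved, the critical field $E_c = m_e^2 c^3/(e\hbar)$ quoted above can be evaluated from standard physical constants. The sketch below works in SI units and then converts to statvolt cm$^{-1}$, the Gaussian unit used in the abstract. The constant values are rounded CODATA figures, and the variable names are assumptions for illustration.

```python
# Physical constants in SI units (rounded CODATA values).
M_E = 9.1093837e-31       # electron mass, kg
C = 2.99792458e8          # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = 1.054571817e-34    # reduced Planck constant, J s

# 1 statvolt/cm expressed in V/m, for conversion to Gaussian units.
V_PER_M_PER_STATVOLT_PER_CM = 2.99792458e4


def schwinger_critical_field_si():
    """Critical field E_c = m_e^2 c^3 / (e hbar) in V/m."""
    return M_E ** 2 * C ** 3 / (E_CHARGE * HBAR)


e_c_si = schwinger_critical_field_si()
e_c_gauss = e_c_si / V_PER_M_PER_STATVOLT_PER_CM  # statvolt/cm
threshold = 0.05 * e_c_gauss  # the 5% level mentioned in the abstract
```

This gives roughly 1.3 × 10^18 V/m, or about 4.4 × 10^13 statvolt cm$^{-1}$, so the 5% level is of order 2 × 10^12 statvolt cm$^{-1}$.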
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
NASA Astrophysics Data System (ADS)
Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji
2013-03-01
This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of $4096^3$ computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
Waite, Gregory P.; Schutt, D.L.; Smith, Robert B.
2005-01-01
Teleseismic shear wave splitting measured at 56 continuous and temporary seismographs deployed in a 500 km by 600 km area around the Yellowstone hot spot indicates that fast anisotropy in the mantle is parallel to the direction of plate motion under most of the array. The average split time from all stations of 0.9 s is typical of continental stations. There is little evidence for plume-induced radial strain, suggesting that any contribution of gravitationally spreading plume material is undetectably small with respect to the plate motion velocity. Two stations within Yellowstone have splitting measurements indicating the apparent fast anisotropy direction (ϕ) is nearly perpendicular to plate motion. These stations are ∼30 km from stations with ϕ parallel to plate motion. The 70° rotation over 30 km suggests a shallow source of anisotropy; however, split times for these stations are more than 2 s. We suggest melt-filled, stress-oriented cracks in the lithosphere are responsible for the anomalous ϕ orientations within Yellowstone. Stations southeast of Yellowstone have measurements of ϕ oriented NNW to WNW at high angles to the plate motion direction. The Archean lithosphere beneath these stations may have significant anisotropy capable of producing the observed splitting.
Upper Mantle Responses to India-Eurasia Collision in Indochina, Malaysia, and the South China Sea
NASA Astrophysics Data System (ADS)
Hongsresawat, S.; Russo, R. M.
2016-12-01
We present new shear wave splitting and splitting intensity measurements from SK(K)S phases recorded at seismic stations of the Malaysian National Seismic Network. These results, in conjunction with results from Tibet and Yunnan provide a basis for testing the degree to which Indochina and South China Sea upper mantle fabrics are responses to India-Eurasia collision. Upper mantle fabrics derived from shear wave splitting measurements in Yunnan and eastern Tibet parallel geodetic surface motions north of 26°N, requiring transmission of tractions from upper mantle depths to surface, or consistent deformation boundary conditions throughout the upper 200 km of crust and mantle. Shear wave splitting fast trends and surface velocities diverge in eastern Yunnan and south of 26°N, indicating development of an asthenospheric layer that decouples crust and upper mantle, or corner flow above the subducted Indo-Burma slab. E-W fast shear wave splitting trends southwest of 26°N/104°E indicate strong gradients in any asthenospheric infiltration. Possible upper mantle flow regimes beneath Indochina include development of olivine b-axis anisotropic symmetry due to high strain and hydrous conditions in the syntaxis/Indo-Burma mantle wedge (i.e., southward flow), development of strong upper mantle corner flow in the Indo-Burma wedge with olivine a-axis anisotropic symmetry (i.e., westward flow), and simple asthenospheric flow due to eastward motion of Sundaland shearing underlying asthenosphere. Further south, shear-wave splitting delay times at Malaysian stations vary from 0.5 seconds on the Malay Peninsula to over 2 seconds at stations on Borneo. Splitting fast trends at Borneo stations and Singapore trend NE-SW, but in northern Peninsular Malaysia, the splitting fast polarization direction is NW-SE, parallel to the trend of the Peninsula. 
Thus, there is a sharp transition from low delay time and NW-SE fast polarization to high delay times and fast polarization directions that parallel the strike of the now-inoperative spreading center in the South China Sea. This transition appears to occur in the central portion of Peninsular Malaysia and may mark the boundary between Tethyan upper mantle extruded from the India-Asia collision zone and supra-subduction upper mantle of the Indonesian arc.
Sources of Seismic Hazard in British Columbia: What Controls Earthquakes in the Crust?
NASA Astrophysics Data System (ADS)
Balfour, Natalie Joy
This thesis examines processes causing faulting in the North American crust in the northern Cascadia subduction zone. A combination of seismological methods, including source mechanism determination, stress inversion and earthquake relocations are used to determine where earthquakes occur and what forces influence faulting. We also determine if forces that control faulting can be monitored using seismic anisotropy. Investigating the processes that contribute to faulting in the crust is important because these earthquakes pose significant hazard to the large population centres in British Columbia and Washington State. To determine where crustal earthquakes occur we apply double-difference earthquake relocation techniques to events in the Fraser River Valley, British Columbia, and the San Juan Islands, Washington. This technique is used to identify "hidden" active structures using both catalogue and waveform cross-correlation data. Results have significantly reduced uncertainty over routine catalogue locations and show lineations in areas of clustered seismicity. In the Fraser River Valley these lineations or streaks appear to be hidden structures that do not disrupt near-surface sediments; however, in the San Juan Islands the identified lineation can be related to recently mapped surface expressions of faults. To determine forces that influence faulting we investigate the orientation and sources of stress using Bayesian inversion results from focal mechanism data. More than ∼600 focal mechanisms from crustal earthquakes are calculated to identify the dominant style of faulting and inverted to estimate the principal stress orientations and the stress ratio. Results indicate the maximum horizontal compressive stress (SHmax) orientation changes with distance from the subduction interface, from margin-normal along the coast to margin-parallel further inland. 
We relate the margin-normal stress direction to subduction-related strain rates due to the locked interface between the North America and Juan de Fuca plates just west of Vancouver Island. Further from the margin the plates are coupled less strongly and the margin-parallel SHmax relates to the northward push of the Oregon Block. Active faults around the region are generally thrust faults that strike east-west and might accommodate the margin-parallel compression. Finally, we consider whether crustal anisotropy can be used as a stress monitoring tool in this region. We identify sources and variations of crustal anisotropy using shear-wave splitting analysis on local crustal earthquakes. Results show spatial variations in fast directions, with margin-parallel fast directions at most stations and margin-perpendicular fast directions at stations in the northeast of the region. Using seismic anisotropy as a stress indicator requires identifying which stations are primarily influenced by stress. We determine the source of anisotropy at each station by comparing fast directions from shear-wave splitting results to the SHmax orientation. Most stations show agreement between these directions, suggesting that anisotropy is stress-related. These stations are further analysed for temporal variations and show variation that could be associated with earthquakes (ML 3-5) and episodic tremor and slip events. The combination of earthquake relocations, source mechanisms, stress and anisotropy is unique and provides a better understanding of faulting and stress in the crust of northern Cascadia.
Raytracing and Direct-Drive Targets
NASA Astrophysics Data System (ADS)
Schmitt, Andrew J.; Bates, Jason; Fyfe, David; Eimerl, David
2013-10-01
Accurate simulation of the effects of laser imprinting and drive asymmetries in directly driven targets requires the ability to distinguish between raytrace noise and the intensity structure produced by the spatial and temporal incoherence of optical smoothing. We have developed and implemented a smoother raytrace algorithm for our MPI-parallel radiation hydrodynamics code, FAST3D. The underlying approach is to connect the rays into either sheets (in 2D) or volume-enclosing chunks (in 3D) so that the absorbed energy distribution continuously covers the propagation area illuminated by the laser. We will describe the status and show the different scalings encountered in 2D and 3D problems as the computational size, parallelization strategy, and number of rays are varied. Finally, we show results using the method in current NIKE experimental target simulations and in proposed symmetric and polar direct-drive target designs. Supported by US DoE/NNSA.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K.; Martin, Daniel B.
2011-01-01
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching) is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment. PMID:21545112
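The "multiplication of two vectors" mentioned in the abstract above can be illustrated with a minimal CPU sketch. This is not the FastPaSS CUDA kernel or the SpectraST scoring function; it is a generic normalized dot product over binned intensity vectors, and the names `normalized_dot`, `best_match`, and the toy library entries are hypothetical.

```python
import math


def normalized_dot(query, reference):
    """Normalized dot product of two equal-length binned intensity vectors."""
    if len(query) != len(reference):
        raise ValueError("spectra must be binned to the same length")
    dot = sum(q * r for q, r in zip(query, reference))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(
        sum(r * r for r in reference)
    )
    return dot / norm if norm > 0 else 0.0


def best_match(query, library):
    """Score the query against every library entry and return (name, score).

    Each comparison is independent of the others, which is the property
    that makes this step a natural fit for a GPU.
    """
    scores = {name: normalized_dot(query, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical toy library; the intensities are made up for illustration.
library = {
    "peptide_A": [0.0, 1.0, 0.0, 2.0],
    "peptide_B": [3.0, 0.0, 1.0, 0.0],
}
query = [0.0, 2.0, 0.0, 4.0]
```

Calling `best_match(query, library)` on the toy data selects `peptide_A`, whose vector is a scaled copy of the query. In a GPU version, each library comparison would typically be assigned to its own thread or thread block.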
NASA Astrophysics Data System (ADS)
Puzyrev, Vladimir; Torres-Verdín, Carlos; Calo, Victor
2018-05-01
The interpretation of resistivity measurements acquired in high-angle and horizontal wells is a critical technical problem in formation evaluation. We develop an efficient parallel 3-D inversion method to estimate the spatial distribution of electrical resistivity in the neighbourhood of a well from deep directional electromagnetic induction measurements. The methodology places no restriction on the spatial distribution of the electrical resistivity around arbitrary well trajectories. The fast forward modelling of triaxial induction measurements performed with multiple transmitter-receiver configurations employs a parallel direct solver. The inversion uses a pre-conditioned gradient-based method whose accuracy is improved using the Wolfe conditions to estimate optimal step lengths at each iteration. The large transmitter-receiver offsets, used in the latest generation of commercial directional resistivity tools, improve the depth of investigation to over 30 m from the wellbore. Several challenging synthetic examples confirm the feasibility of the full 3-D inversion-based interpretations for these distances, hence enabling the integration of resistivity measurements with seismic amplitude data to improve the forecast of the petrophysical and fluid properties. Employing parallel direct solvers for the triaxial induction problems allows for large reductions in computational effort, thereby opening the possibility to invert multiposition 3-D data in practical CPU times.
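The Wolfe conditions mentioned in the abstract above can be sketched as a simple acceptance test for a trial step length. The sketch assumes the weak Wolfe conditions with textbook constants `c1 = 1e-4` and `c2 = 0.9`, and a toy quadratic objective standing in for the resistivity misfit; the function name `satisfies_wolfe` is hypothetical and none of this is taken from the authors' inversion code.

```python
def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check the weak Wolfe conditions for step length alpha along direction p.

    f     : objective, maps a list of floats to a float
    grad  : gradient of f, maps a list of floats to a list of floats
    x     : current model (list of floats)
    p     : descent direction (list of floats)
    c1,c2 : assumed textbook constants with 0 < c1 < c2 < 1
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
    slope0 = dot(grad(x), p)  # directional derivative at the current model

    # Sufficient decrease (Armijo): the step must reduce f enough.
    sufficient_decrease = f(x_new) <= f(x) + c1 * alpha * slope0
    # Curvature: the slope along p must have flattened enough.
    curvature = dot(grad(x_new), p) >= c2 * slope0
    return sufficient_decrease and curvature


# Toy quadratic objective (an assumption, not the resistivity misfit).
def misfit(x):
    return sum(xi * xi for xi in x)


def misfit_grad(x):
    return [2.0 * xi for xi in x]


x0 = [1.0, 1.0]
p0 = [-g for g in misfit_grad(x0)]  # steepest-descent direction
```

With this toy problem, `alpha = 0.5` is accepted, `alpha = 1.0` fails the sufficient-decrease test, and `alpha = 0.01` fails the curvature test because the step is too short. A line search would adjust `alpha` until both conditions hold.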
Microstructure characterisation of Ti-6Al-4V from different additive manufacturing processes
NASA Astrophysics Data System (ADS)
Neikter, M.; Åkerfeldt, P.; Pederson, R.; Antti, M.-L.
2017-10-01
The focus of this work has been microstructure characterisation of Ti-6Al-4V manufactured by five different additive manufacturing (AM) processes. The microstructure features being characterised are the prior β size, grain boundary α and α lath thickness. It was found that material manufactured with powder bed fusion processes has smaller prior β grains than the material from directed energy deposition processes. The AM processes with fast cooling rates result in thinner α laths and also thinner, and in some cases discontinuous, grain boundary α. Furthermore, it has been observed that material manufactured with the directed energy deposition processes has parallel bands, except for one condition when the parameters were changed, while the powder bed fusion processes do not have any parallel bands.
Fast parallel approach for 2-D DHT-based real-valued discrete Gabor transform.
Tao, Liang; Kwan, Hon Keung
2009-12-01
Two-dimensional fast Gabor transform algorithms are useful for real-time applications due to the high computational complexity of the traditional 2-D complex-valued discrete Gabor transform (CDGT). This paper presents two block time-recursive algorithms for 2-D DHT-based real-valued discrete Gabor transform (RDGT) and its inverse transform and develops a fast parallel approach for the implementation of the two algorithms. The computational complexity of the proposed parallel approach is analyzed and compared with that of the existing 2-D CDGT algorithms. The results indicate that the proposed parallel approach is attractive for real-time image processing.
Chaotic flows and fast magnetic dynamos
NASA Technical Reports Server (NTRS)
Finn, John M.; Ott, Edward
1988-01-01
The kinematic dynamo problem is considered in the R(m) approaching infinity limit. It is shown that the magnetic field tends to concentrate on a zero volume fractal set; moreover, it displays arbitrarily fine-scaled oscillations between parallel and antiparallel directions. Consideration is given to the relationship between the dynamo growth rate and quantitative measures of chaos, such as the Liapunov exponent and topological entropy.
Gaglianese, A; Costagli, M; Ueno, K; Ricciardi, E; Bernardi, G; Pietrini, P; Cheng, K
2015-01-22
The main visual pathway that conveys motion information to the middle temporal complex (hMT+) originates from the primary visual cortex (V1), which, in turn, receives spatial and temporal features of the perceived stimuli from the lateral geniculate nucleus (LGN). In addition, visual motion information reaches hMT+ directly from the thalamus, bypassing the V1, through a direct pathway. We aimed at elucidating whether this direct route between LGN and hMT+ represents a 'fast lane' reserved to high-speed motion, as proposed previously, or it is merely involved in processing motion information irrespective of speeds. We evaluated functional magnetic resonance imaging (fMRI) responses elicited by moving visual stimuli and applied connectivity analyses to investigate the effect of motion speed on the causal influence between LGN and hMT+, independent of V1, using the Conditional Granger Causality (CGC) in the presence of slow and fast visual stimuli. Our results showed that at least part of the visual motion information from LGN reaches hMT+, bypassing V1, in response to both slow and fast motion speeds of the perceived stimuli. We also investigated whether motion speeds have different effects on the connections between LGN and functional subdivisions within hMT+: direct connections between LGN and MT-proper carry mainly slow motion information, while connections between LGN and MST carry mainly fast motion information. The existence of a parallel pathway that connects the LGN directly to hMT+ in response to both slow and fast speeds may explain why MT and MST can still respond in the presence of V1 lesions. Copyright © 2014 IBRO. Published by Elsevier Ltd. All rights reserved.
A study on crustal shear wave splitting in the western part of the Banda arc-continent collision
DOE Office of Scientific and Technical Information (OSTI.GOV)
Syuhada, E-mail: hadda9@gmail.com; Research Centre for Physics - Indonesian Institute of Sciences; Hananto, Nugroho D.
2016-03-11
We analyzed shear wave splitting parameters from local shallow (< 30 km) earthquakes recorded at six seismic stations in the western part of the Banda arc-continent collision. We determined fast polarization and delay time for 195 event-station pairs calculated from good signal-to-noise ratio waveforms. We observed evidence for shear wave splitting at all stations, with dominant fast polarization directions oriented about NE-SW, parallel to the collision direction of the Australian plate. However, minor fast polarization directions are oriented around NW-SE, perpendicular to the strike of the Timor Trough. Furthermore, the changes in fast azimuths with the earthquake-station back azimuth suggest that the crustal anisotropy in the study area is not uniform. Splitting delay times are within the range of 0.05 s to 0.8 s, with a mean value of 0.29±0.18 s. The majority of stations exhibit a weak tendency for delay times to increase with increasing hypocentral distance, suggesting that the main contribution to the anisotropy comes from the shallow crust. In addition, these variations in fast azimuths and delay times indicate that the crustal anisotropy in this region might be caused not only by extensive dilatancy anisotropy (EDA), but also by heterogeneous shallow structure such as the presence of foliations in the rock fabric and the fracture zones associated with active faults.
Mantle flow through a tear in the Nazca slab inferred from shear wave splitting
NASA Astrophysics Data System (ADS)
Lynner, Colton; Anderson, Megan L.; Portner, Daniel E.; Beck, Susan L.; Gilbert, Hersh
2017-07-01
A tear in the subducting Nazca slab is located between the end of the Pampean flat slab and normally subducting oceanic lithosphere. Tomographic studies suggest mantle material flows through this opening. The best way to probe this hypothesis is through observations of seismic anisotropy, such as shear wave splitting. We examine patterns of shear wave splitting using data from two seismic deployments in Argentina that lay updip of the slab tear. We observe a simple pattern of plate-motion-parallel fast splitting directions, indicative of plate-motion-parallel mantle flow, beneath the majority of the stations. Our observed splitting contrasts with previous observations to the north and south of the flat slab region. Since plate-motion-parallel splitting occurs only coincidentally with the slab tear, we propose mantle material flows through the opening resulting in Nazca plate-motion-parallel flow in both the subslab mantle and mantle wedge.
NASA Technical Reports Server (NTRS)
Hess, B. J.; Angelaki, D. E.
1997-01-01
The spatial organization of fast phase velocity vectors of the vestibulo-ocular reflex (VOR) was studied in rhesus monkeys during yaw rotations about an earth-horizontal axis that changed continuously the orientation of the head relative to gravity ("barbecue spit" rotation). In addition to a velocity component parallel to the rotation axis, fast phases also exhibited a velocity component that invariably was oriented along the momentary direction of gravity. As the head rotated through supine and prone positions, torsional components of fast phase velocity axes became prominent. Similarly, as the head rotated through left and right ear-down positions, fast phase velocity axes exhibited prominent vertical components. The larger the speed of head rotation the greater the magnitude of this fast phase component, which was collinear with gravity. The main sequence properties of VOR fast phases were independent of head position. However, peak amplitude as well as peak velocity of fast phases were both modulated as a function of head orientation, exhibiting a minimum in prone position. The results suggest that the fast phases of vestibulo-ocular reflexes not only redirect gaze and reposition the eye in the direction of head motion but also reorient the eye with respect to earth-vertical when the head moves relative to gravity. As further elaborated in the companion paper, the underlying mechanism could be described as a dynamic, gravity-dependent modulation of the coordinates of ocular rotations relative to the head.
Analysis techniques for diagnosing runaway ion distributions in the reversed field pinch
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, J., E-mail: jkim536@wisc.edu; Anderson, J. K.; Capecchi, W.
2016-11-15
An advanced neutral particle analyzer (ANPA) on the Madison Symmetric Torus measures deuterium ions of energy ranges 8-45 keV with an energy resolution of 2-4 keV and time resolution of 10 μs. Three different experimental configurations measure distinct portions of the naturally occurring fast ion distributions: fast ions moving parallel, anti-parallel, or perpendicular to the plasma current. On a radial-facing port, fast ions moving perpendicular to the current have the necessary pitch to be measured by the ANPA. With the diagnostic positioned on a tangent line through the plasma core, a chord integration over fast ion density, background neutral density, and local appropriate pitch defines the measured sample. The plasma current can be reversed to measure anti-parallel fast ions in the same configuration. Comparisons of energy distributions for the three configurations show an anisotropic fast ion distribution favoring high pitch ions.
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Zubair, Mohammad
1993-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors, nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because the routine itself exhibits less than ideal speedups. However, with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise, wall-normal, and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single processor time to complete a comparable simulation; however, it is estimated that a subgrid-scale model, which reduces the required number of grid points and becomes a large-eddy simulation (PSLES), would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
Surrogates for numerical simulations; optimization of eddy-promoter heat exchangers
NASA Technical Reports Server (NTRS)
Patera, Anthony T.; Patera, Anthony
1993-01-01
Although the advent of fast and inexpensive parallel computers has rendered numerous previously intractable calculations feasible, many numerical simulations remain too resource-intensive to be directly inserted in engineering optimization efforts. An attractive alternative to direct insertion considers models for computational systems: the expensive simulation is evoked only to construct and validate a simplified, input-output model; this simplified input-output model then serves as a simulation surrogate in subsequent engineering optimization studies. A simple 'Bayesian-validated' statistical framework for the construction, validation, and purposive application of static computer simulation surrogates is presented. As an example, dissipation-transport optimization of laminar-flow eddy-promoter heat exchangers is considered: parallel spectral element Navier-Stokes calculations serve to construct and validate surrogates for the flowrate and Nusselt number; these surrogates then represent the originating Navier-Stokes equations in the ensuing design process.
A Parallel Fast Sweeping Method for the Eikonal Equation
NASA Astrophysics Data System (ADS)
Baker, B.
2017-12-01
Recently, there has been an exciting emergence of probabilistic methods for travel time tomography. Unlike gradient-based optimization strategies, probabilistic tomographic methods are resistant to becoming trapped in a local minimum and provide a much better quantification of parameter resolution than, say, appealing to ray density or performing checkerboard reconstruction tests. The benefits associated with random sampling methods, however, are only realized by successive computation of predicted travel times in, potentially, strongly heterogeneous media. To this end, this abstract is concerned with expediting the solution of the Eikonal equation. While many Eikonal solvers use a fast marching method, the proposed solver will use the iterative fast sweeping method because the eight fixed sweep orderings in each iteration are natural targets for parallelization. To reduce the number of iterations and grid points required, the high-accuracy finite difference stencil of Noble et al., 2014 is implemented. A directed acyclic graph (DAG) is created with a priori knowledge of the sweep ordering and finite difference stencil. By performing a topological sort of the DAG, sets of independent nodes are identified as candidates for concurrent updating. Additionally, the proposed solver will also address scalability during earthquake relocation, a necessary step in local and regional earthquake tomography and a barrier to extending probabilistic methods from active source to passive source applications, by introducing an asynchronous parallel forward solve phase for all receivers in the network. Synthetic examples using the SEG over-thrust model will be presented.
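The fast sweeping idea described in the abstract above can be sketched in serial form. The sketch makes several assumptions that differ from the abstract: it is 2-D (so four sweep orderings rather than eight), it uses the standard first-order Godunov upwind update rather than the high-accuracy stencil cited there, and it does no DAG-based parallel scheduling. The function name `fast_sweep_2d` and its parameters are hypothetical.

```python
import math


def fast_sweep_2d(slowness, h, source, n_iter=4):
    """First-order fast sweeping solver for |grad T| = slowness on a 2-D grid.

    slowness : list of rows of slowness values (s/m)
    h        : grid spacing, assumed equal in both directions
    source   : (row, col) index of the point source, where T = 0
    n_iter   : number of passes; each pass runs all four sweep orderings
    """
    ny, nx = len(slowness), len(slowness[0])
    inf = float("inf")
    t = [[inf] * nx for _ in range(ny)]
    sy, sx = source
    t[sy][sx] = 0.0

    # The four fixed sweep orderings of the 2-D method.
    orderings = [
        (range(ny), range(nx)),
        (range(ny), range(nx - 1, -1, -1)),
        (range(ny - 1, -1, -1), range(nx)),
        (range(ny - 1, -1, -1), range(nx - 1, -1, -1)),
    ]
    for _ in range(n_iter):
        for rows, cols in orderings:
            for i in rows:
                for j in cols:
                    if (i, j) == (sy, sx):
                        continue
                    # Smallest neighbour in each grid direction (upwind values).
                    a = min(t[i - 1][j] if i > 0 else inf,
                            t[i + 1][j] if i < ny - 1 else inf)
                    b = min(t[i][j - 1] if j > 0 else inf,
                            t[i][j + 1] if j < nx - 1 else inf)
                    if a == inf and b == inf:
                        continue  # no information has reached this node yet
                    f = slowness[i][j] * h
                    if abs(a - b) >= f:
                        cand = min(a, b) + f  # one-sided update
                    else:
                        # Two-sided update: solve the discretized quadratic.
                        cand = 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))
                    t[i][j] = min(t[i][j], cand)
    return t


# Homogeneous toy model (an assumption): unit slowness, unit spacing.
model = [[1.0] * 5 for _ in range(5)]
times = fast_sweep_2d(model, 1.0, (0, 0))
```

In this toy model the travel time along a grid axis equals the distance from the source, while off-axis nodes carry the usual first-order discretization error. Within one sweep, nodes whose upwind neighbours are already final do not depend on each other, which is the kind of independence a topological sort of the update graph would expose.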
NASA Astrophysics Data System (ADS)
Idárraga-García, J.; Kendall, J.-M.; Vargas, C. A.
2016-09-01
To investigate the subduction dynamics in northwestern South America, we measured SKS and slab-related local S splitting at 38 seismic stations. Comparison between the delay times of both phases shows that most of the SKS splitting is due to entrained mantle flow beneath the subducting Nazca and Caribbean slabs. On the other hand, the fast polarizations of local S-waves are consistently aligned with regional faults, which implies the existence of a lithosphere-confined anisotropy in the overriding plate, and that the mantle wedge is not contributing significantly to the splitting. Also, we identified a clear change in SKS fast directions at the trace of the Caldas Tear (~5°N), which represents a variation in the subduction style. To the north of ~5°N, fast directions are consistently parallel to the flat subduction of the Caribbean plate-Panama arc beneath South America, while to the south fast polarizations are subparallel to the Nazca-South America subduction direction. A new change in the SKS splitting pattern is detected at ~2.8°N, which is related to another variation in the subduction geometry marked by the presence of a lithosphere-scale tearing structure, named here as Malpelo Tear; in this region, NE-SW-oriented SKS fast directions are consistent with the general dip direction of the underthrusting of the Carnegie Ridge beneath South America. Further inland, this NE-SW-trending mantle flow continues beneath the Eastern Cordillera of Colombia and Merida Andes of Venezuela. Finally, our results suggest that the subslab mantle flow in northwestern South America is strongly controlled by the presence of lithospheric tearing structures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zugarramurdi, A.; Debiossac, M.; Lunca-Popa, P.
2015-03-09
We present a grazing incidence fast atom diffraction (GIFAD) study of monolayer graphene on 6H-SiC(0001). This system shows a Moiré-like 13 × 13 superlattice above the reconstructed carbon buffer layer. The averaging property of GIFAD results in electronic and geometric corrugations that are well decoupled; the graphene honeycomb corrugation is only observed with the incident beam parallel to the zigzag direction while the geometric corrugation arising from the superlattice is revealed along the armchair direction. Full-quantum calculations of the diffraction patterns show the very high GIFAD sensitivity to the amplitude of the surface corrugation. The best agreement between the calculated and measured diffraction intensities yields a corrugation height of 0.27 ± 0.03 Å.
Petrović, Z Lj; Phelps, A V
2009-12-01
Absolute spectral emissivities for Doppler broadened H(alpha) profiles are measured and compared with predictions of energetic hydrogen ion, atom, and molecule behavior in low-current electrical discharges in H2 at very high electric field E to gas density N ratios E/N and low values of Nd , where d is the parallel-plate electrode separation. These observations reflect the energy and angular distributions for the excited atoms and quantitatively test features of multiple-scattering kinetic models in weakly ionized hydrogen in the presence of an electric field that are not tested by the spatial distributions of H(alpha) emission. Absolute spectral intensities agree well with predictions. Asymmetries in Doppler profiles observed parallel to the electric field at 4
Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.
Tao, Liang; Kwan, Hon Keung
2012-07-01
Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which makes them attractive for real-time image processing.
Efficient multitasking of Choleski matrix factorization on CRAY supercomputers
NASA Technical Reports Server (NTRS)
Overman, Andrea L.; Poole, Eugene L.
1991-01-01
A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
NASA Astrophysics Data System (ADS)
Maneva, Yana; Poedts, Stefaan
2017-04-01
The electromagnetic fluctuations in the solar wind represent a zoo of plasma waves with different properties, whose wavelengths range from largest fluid scales to the smallest dissipation scales. By nature the power spectrum of the magnetic fluctuations is anisotropic with different spectral slopes in parallel and perpendicular directions with respect to the background magnetic field. Furthermore, the magnetic field power spectra steepen as one moves from the inertial to the dissipation range and we observe multiple spectral breaks with different slopes in parallel and perpendicular direction at the ion scales and beyond. The turbulent dissipation of magnetic field fluctuations at the sub-ion scales is believed to go into local ion heating and acceleration, so that the spectral breaks are typically associated with particle energization. The gained energy can be in the form of anisotropic heating, formation of non-thermal features in the particle velocity distribution functions, and redistribution of the differential acceleration between the different ion populations. To study the relation between the evolution of the anisotropic turbulent spectra and the particle heating at the ion and sub-ion scales we perform a series of 2.5D hybrid simulations in a collisionless drifting proton-alpha plasma. We neglect the fast electron dynamics and treat the electrons as an isothermal fluid, whereas the protons and a minor population of alpha particles are evolved in a fully kinetic manner. We start with a given wave spectrum and study the evolution of the magnetic field spectral slopes as a function of the parallel and perpendicular wavenumbers. Simultaneously, we track the particle response and the energy exchange between the parallel and perpendicular scales. We observe anisotropic behavior of the turbulent power spectra with steeper slopes along the dominant energy-containing direction.
This means that for parallel and quasi-parallel waves we have steeper spectral slope in parallel direction, whereas for highly oblique waves the dissipation occurs predominantly in perpendicular direction and the spectral slopes are steeper across the background magnetic field. The value of the spectral slopes depends on the angle of propagation, the spectral range, as well as the plasma properties. In general the dissipation is stronger at small scales and the corresponding spectral slopes there are steeper. For parallel and quasi-parallel propagation the prevailing energy cascade remains along the magnetic field, whereas for initially isotropic oblique turbulence the cascade develops mainly in perpendicular direction.
Fast Time and Space Parallel Algorithms for Solution of Parabolic Partial Differential Equations
NASA Technical Reports Server (NTRS)
Fijany, Amir
1993-01-01
In this paper, fast time- and space-parallel algorithms for the solution of linear parabolic PDEs are developed. It is shown that the seemingly strictly serial iterations of the time-stepping procedure for solution of the problem can be completely decoupled.
Particle-in-cell studies of fast-ion slowing-down rates in cool tenuous magnetized plasma
NASA Astrophysics Data System (ADS)
Evans, Eugene S.; Cohen, Samuel A.; Welch, Dale R.
2018-04-01
We report on 3D-3V particle-in-cell simulations of fast-ion energy-loss rates in a cold, weakly-magnetized, weakly-coupled plasma where the electron gyroradius, ρe, is comparable to or less than the Debye length, λDe, and the fast-ion velocity exceeds the electron thermal velocity, a regime in which the electron response may be impeded. These simulations use explicit algorithms, spatially resolve ρe and λDe, and temporally resolve the electron cyclotron and plasma frequencies. For mono-energetic dilute fast ions with isotropic velocity distributions, these scaling studies of the slowing-down time, τs, versus fast-ion charge are in agreement with unmagnetized slowing-down theory; with an applied magnetic field, no consistent anisotropy between τs in the cross-field and field-parallel directions could be resolved. Scaling the fast-ion charge is confirmed as a viable way to reduce the required computational time for each simulation. The implications of these slowing down processes are described for one magnetic-confinement fusion concept, the small, advanced-fuel, field-reversed configuration device.
NASA Astrophysics Data System (ADS)
Russell, J. B.; Gaherty, J. B.; Lin, P. P.; Lizarralde, D.; Collins, J. A.; Hirth, G.; Evans, R. L.
2017-12-01
Observations of seismic anisotropy in the ocean basins are important for constraining deformation and melting processes in the upper mantle. The NoMelt OBS array was deployed on relatively pristine, 70 Ma seafloor in the central Pacific with the aim of constraining upper mantle circulation and the evolution of the lithosphere-asthenosphere system. Surface-waves traversing the array provide a unique opportunity to estimate a comprehensive set of anisotropic parameters. Azimuthal variations in Rayleigh-wave velocity over a period band of 15-180 s suggest strong anisotropic fabric both in the lithosphere and deep in the asthenosphere. High-frequency ambient noise (4-10 s) provides constraints on average VSV and VSH as well as azimuthal variations in both VS and VP in the upper ~10 km of the mantle. Our best fitting models require radial anisotropy in the uppermost mantle with VSH > VSV by 3 - 7% and as much as 2% radial anisotropy in the crust. Additionally, we find a strong azimuthal dependence for Rayleigh- and Love-wave velocities, with Rayleigh 2θ fast direction parallel to the fossil spreading direction (FSD) and Love 2θ and 4θ fast directions shifted 90° and 45° from the FSD, respectively. These are some of the first direct observations of the Love 2θ and 4θ azimuthal signal, which allows us to directly invert for anisotropic terms G, B, and E in the uppermost Pacific lithosphere, for the first time. Together, these observations of radial and azimuthal anisotropy provide a comprehensive picture of oceanic mantle fabric and are consistent with horizontal alignment of olivine with the a-axis parallel to fossil spreading and having an orthorhombic or hexagonal symmetry.
Scalable direct Vlasov solver with discontinuous Galerkin method on unstructured mesh.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, J.; Ostroumov, P. N.; Mustapha, B.
2010-12-01
This paper presents the development of parallel direct Vlasov solvers with the discontinuous Galerkin (DG) method for beam and plasma simulations in four dimensions. Both physical and velocity spaces are in two dimensions (2P2V) with unstructured mesh. Contrary to the standard particle-in-cell (PIC) approach for kinetic space plasma simulations, i.e., solving Vlasov-Maxwell equations, a direct method has been used in this paper. There are several benefits to solving a Vlasov equation directly, such as avoiding noise associated with a finite number of particles and the capability to capture fine structure in the plasma. The most challenging part of a direct Vlasov solver comes from higher dimensions, as the computational cost increases as N^(2d), where d is the dimension of the physical space. Recently, due to the fast development of supercomputers, the possibility has become more realistic. Many efforts have been made to solve Vlasov equations in low dimensions before; now more interest has focused on higher dimensions. Different numerical methods have been tried so far, such as the finite difference method, Fourier spectral method, finite volume method, and spectral element method. This paper is based on our previous efforts to use the DG method. The DG method has been proven to be very successful in solving Maxwell equations, and this paper is our first effort in applying the DG method to Vlasov equations. DG has shown several advantages, such as local mass matrix, strong stability, and easy parallelization. These are particularly suitable for Vlasov equations. Domain decomposition in high dimensions has been used for parallelization; this includes a highly scalable parallel two-dimensional Poisson solver. Benchmark results have been shown and simulation results will be reported.
NASA Astrophysics Data System (ADS)
Gershman, D. J.; Figueroa-Vinas, A.; Dorelli, J.; Goldstein, M. L.; Shuster, J. R.; Avanov, L. A.; Boardsen, S. A.; Stawarz, J. E.; Schwartz, S. J.; Schiff, C.; Lavraud, B.; Saito, Y.; Paterson, W. R.; Giles, B. L.; Pollock, C. J.; Strangeway, R. J.; Russell, C. T.; Torbert, R. B.; Moore, T. E.; Burch, J. L.
2017-12-01
Measurements from the Fast Plasma Investigation (FPI) on NASA's Magnetospheric Multiscale (MMS) mission have enabled unprecedented analyses of kinetic-scale plasma physics. FPI regularly provides estimates of current density and pressure gradients of sufficient accuracy to evaluate the relative contribution of terms in plasma equations of motion. In addition, high-resolution three-dimensional velocity distribution functions of both ions and electrons provide new insights into kinetic-scale processes. As an example, for a monochromatic kinetic Alfven wave (KAW) we find non-zero, but out-of-phase parallel current density and electric field fluctuations, providing direct confirmation of the conservative energy exchange between the wave field and particles. In addition, we use fluctuations in current density and magnetic field to calculate the perpendicular and parallel wavelengths of the KAW. Furthermore, examination of the electron velocity distribution inside the KAW reveals a population of electrons non-linearly trapped in the kinetic-scale magnetic mirror formed between successive wave peaks. These electrons not only contribute to the wave's parallel electric field but also account for over half of the density fluctuations within the wave, supplying an unexpected mechanism for maintaining quasi-neutrality in a KAW. Finally, we demonstrate that the employed wave vector determination technique is also applicable to broadband fluctuations found in Earth's turbulent magnetosheath.
NASA Astrophysics Data System (ADS)
Kiyani, Khurom; Chapman, Sandra; Osman, Kareem; Sahraoui, Fouad; Hnat, Bogdan
2014-05-01
The anisotropic nature of the scaling properties of solar wind magnetic turbulence fluctuations is investigated scale by scale using high cadence in situ magnetic field measurements from the Cluster, ACE and STEREO spacecraft missions in both fast and slow quiet solar wind conditions. The data span five decades in scales from the inertial range to the electron Larmor radius. We find a clear transition in scaling behaviour between the inertial and kinetic range of scales, which provides a direct, quantitative constraint on the physical processes that mediate the cascade of energy through these scales. In the inertial (magnetohydrodynamic) range the statistical nature of turbulent fluctuations is known to be anisotropic, both in the vector components of the magnetic field fluctuations (variance anisotropy) and in the spatial scales of these fluctuations (wavevector or k-anisotropy). We show for the first time that, when measuring parallel to the local magnetic field direction, the full statistical signature of the magnetic and Elsasser field fluctuations is that of a non-Gaussian globally scale-invariant process. This is distinct from the classic multi-exponent statistics observed when the local magnetic field is perpendicular to the flow direction. These observations suggest the weakness, or absence, of a parallel magnetofluid turbulence energy cascade. In contrast to the inertial range, there is a successive increase toward isotropy between parallel and transverse power at scales below the ion Larmor radius, with isotropy being achieved at the electron Larmor radius. Computing higher-order statistics, we show that the full statistical signature of both parallel and perpendicular fluctuations at scales below the ion Larmor radius is that of an isotropic globally scale-invariant non-Gaussian process. Lastly, we perform a survey of multiple intervals of quiet solar wind sampled under different plasma conditions (fast, slow wind; plasma beta etc.) and find that the above results on the scaling transition between inertial and kinetic range scales are qualitatively robust, and that quantitatively, there is a spread in the values of the scaling exponents.
Fast adaptive composite grid methods on distributed parallel architectures
NASA Technical Reports Server (NTRS)
Lemke, Max; Quinlan, Daniel
1992-01-01
The fast adaptive composite (FAC) grid method is compared with the asynchronous fast adaptive composite method (AFAC) under a variety of conditions including vectorization and parallelization. Results are given for distributed memory multiprocessor architectures (SUPRENUM, Intel iPSC/2 and iPSC/860). It is shown that the good performance of AFAC and its superiority over FAC in a parallel environment is a property of the algorithm and not dependent on peculiarities of any machine.
Design, fabrication and characterization of a micro-fluxgate intended for parallel robot application
NASA Astrophysics Data System (ADS)
Kirchhoff, M. R.; Bogdanski, G.; Büttgenbach, S.
2009-05-01
This paper presents a micro-magnetometer based on the fluxgate principle. Fluxgates detect the magnitude and direction of DC and low-frequency AC magnetic fields. The detectable flux density typically ranges from several tens of nT to about 1 mT. The introduced fluxgate sensor is fabricated using MEMS technologies, essentially UV depth lithography and electroplating for manufacturing high aspect ratio structures. It consists of helical copper coils around a soft magnetic nickel-iron (NiFe) core. The core is designed in so-called racetrack geometry, whereby the directional sensitivity of the sensor is considerably higher compared to common ring-core fluxgates. The electrical operation is based on analyzing the 2nd harmonic of the AC output signal. Configuration, manufacturing and selected characteristics of the fluxgate magnetometer are discussed in this work. The fluxgate forms the basis of an innovative angular sensor system for a parallel robot with HEXA-structure. Integrated into the passive joints of the parallel robot, the fluxgates are combined with permanent magnets rotating on the joint shafts. The magnet transmits the angular information via its magnetic orientation. In this way, the angles between the kinematic elements are measured, which allows self-calibration of the robot and the fast analytical solution of direct kinematics for an advanced workspace monitoring.
Size and Shape of the Distant Magnetotail
NASA Technical Reports Server (NTRS)
Sibeck, D.G.; Lin, R.-Q.
2014-01-01
We employ a global magnetohydrodynamic model to study the effects of the interplanetary magnetic field (IMF) strength and direction upon the cross-section of the magnetotail at lunar distances. The anisotropic pressure of draped magnetosheath magnetic field lines and the inclusion of a reconnection-generated standing slow mode wave fan bounded by a rotational discontinuity within the definition of the magnetotail result in cross-sections elongated in the direction parallel to the component of the IMF in the plane perpendicular to the Sun-Earth line. Tilted cross-tail plasma sheets separate the northern and southern lobes within these cross-sections. Greater fast mode speeds perpendicular than parallel to the draped magnetosheath magnetic field lines result in greater distances to the bow shock in the direction perpendicular than parallel to the component of the IMF in the plane transverse to the Sun-Earth line. The magnetotail cross-section responds rapidly to reconnected magnetic field lines requires no more than the magnetosheath convection time to appear at any distance downstream, and further adjustments of the cross-section in response to the anisotropic pressures of the draped magnetic field lines require no more than 10-20 minutes. Consequently, for typical ecliptic IMF orientations and strengths, the magnetotail cross-section is oblate while the bow shock is prolate.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Tong, Liqing; Liu, Kefu
2017-06-01
The purpose of impedance matching for a Marx generator and DBD lamp is to limit the output current of the Marx generator, provide a large discharge current at ignition, and obtain fast voltage rising/falling edges and large overshoot. In this paper, different impedance matching circuits (series inductor, parallel capacitor, and series inductor combined with parallel capacitor) are analyzed. It demonstrates that a series inductor could limit the Marx current. However, the discharge current is also limited. A parallel capacitor could provide a large discharge current, but the Marx current is also enlarged. A series inductor combined with a parallel capacitor takes full advantage of the inductor and capacitor, and avoids their shortcomings. Therefore, it is a good solution. Experimental results match the theoretical analysis well and show that both the series inductor and parallel capacitor improve the performance of the system. However, the series inductor combined with the parallel capacitor has the best performance. Compared with driving the DBD lamp with a Marx generator directly, an increase of 97.3% in radiant power and an increase of 59.3% in system efficiency are achieved using this matching circuit.
NASA Astrophysics Data System (ADS)
Palmesi, P.; Exl, L.; Bruckner, F.; Abert, C.; Suess, D.
2017-11-01
The long-range magnetic field is the most time-consuming part in micromagnetic simulations. Computational improvements can relieve problems related to this bottleneck. This work presents an efficient implementation of the Fast Multipole Method (FMM) for the magnetic scalar potential as used in micromagnetics. The novelty lies in extending FMM to linearly magnetized tetrahedral sources, making it interesting also for other areas of computational physics. We treat the near field directly and use (exact) numerical integration on the multipole expansion in the far field. This approach tackles important issues like the vectorial and continuous nature of the magnetic field. By using FMM the calculations scale linearly in time and memory.
Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fijany, A.; Milman, M.; Redding, D.
1994-12-31
In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optic system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated as Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other Fast Poisson Solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized, architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
Dynamic grid refinement for partial differential equations on parallel computers
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids to provide adaptive resolution and fast solution of PDEs. An asynchronous version of FAC, called AFAC, that completely eliminates the bottleneck to parallelism is presented. This paper describes the advantage that this algorithm has in adaptive refinement for moving singularities on multiprocessor computers. This work is applicable to the parallel solution of two- and three-dimensional shock tracking problems.
Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures
NASA Technical Reports Server (NTRS)
Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.
1998-01-01
In this paper a new algorithm, designated as Fast Invariant Imbedding algorithm, for solution of Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other Fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.
Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Fatoohi, Rod A.
1990-01-01
The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
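For orientation, the idea behind an FFT-based fast Poisson solver can be sketched in a few lines. This is not the multitasked Cray code described above: it is a serial illustration that assumes a doubly periodic square domain and a spectral (rather than finite-difference) Laplacian, and the function name and test problem are chosen here purely for illustration.

```python
import numpy as np


def solve_poisson_periodic(f, length=2 * np.pi):
    """Solve laplacian(u) = f on a periodic square of side `length`.

    The zero-wavenumber (mean) component of u is set to zero, since the
    periodic problem only determines u up to an additive constant.
    """
    n_y, n_x = f.shape
    # Angular wavenumbers along each axis.
    kx = 2 * np.pi * np.fft.fftfreq(n_x, d=length / n_x)
    ky = 2 * np.pi * np.fft.fftfreq(n_y, d=length / n_y)
    k_squared = kx[np.newaxis, :] ** 2 + ky[:, np.newaxis] ** 2

    f_hat = np.fft.fft2(f)
    u_hat = np.zeros_like(f_hat)
    nonzero = k_squared > 0
    # In Fourier space the Laplacian is multiplication by -|k|^2.
    u_hat[nonzero] = -f_hat[nonzero] / k_squared[nonzero]
    return np.fft.ifft2(u_hat).real


# Manufactured solution: u = sin(3x) cos(2y), so laplacian(u) = -13 u.
n = 64
grid = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
x, y = np.meshgrid(grid, grid)
u_exact = np.sin(3 * x) * np.cos(2 * y)
u_numeric = solve_poisson_periodic(-13.0 * u_exact)
max_error = np.max(np.abs(u_numeric - u_exact))
```

The transforms along each axis are independent one-dimensional FFTs, which is the property that makes this class of solver attractive for loop-level or subroutine-level parallelism.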
NASA Technical Reports Server (NTRS)
Farhat, Charbel
1998-01-01
In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.
Fast, Massively Parallel Data Processors
NASA Technical Reports Server (NTRS)
Heaton, Robert A.; Blevins, Donald W.; Davis, ED
1994-01-01
Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on "X" interconnection grid with external memory via high-capacity input/output bus. This approach to conditional operation nearly doubles speed of various arithmetic operations.
Lithospheric deformation in the Canadian Appalachians: evidence from shear wave splitting
NASA Astrophysics Data System (ADS)
Bastow, I. D.; Gilligan, A.; Watson, E.; Darbyshire, F. A.; Levin, V. L.; Menke, W. H.; Lane, V.; Boyce, A.; Liddell, M. V.; Petrescu, L.; Hawthorn, D.
2016-12-01
Plate-scale deformation is expected to impart seismic anisotropic fabrics on the lithosphere. Determination of the fast shear wave orientation (φ) and the delay time between the fast and slow split shear waves (δt) via SKS splitting can help place spatial and temporal constraints on lithospheric deformation. The Canadian Appalachians experienced multiple episodes of deformation during the Phanerozoic: accretionary collisions during the Palaeozoic prior to the collision between Laurentia and Gondwana, and rifting related to the Mesozoic opening of the North Atlantic. However, the extent to which extensional events have overprinted older orogenic trends is uncertain. We address this issue through measurements of seismic anisotropy beneath the Canadian Appalachians, computing shear wave splitting parameters (φ, δt) for new and existing seismic stations in Nova Scotia and New Brunswick. Average δt values of 1.2 s, relatively short length scale (≥100 km) splitting parameter variations, and a lack of correlation with absolute plate motion direction and mantle flow models, demonstrate that fossil lithospheric anisotropic fabrics dominate our results. Most fast directions parallel Appalachian orogenic trends observed at the surface, while δt values point towards coherent deformation of the crust and mantle lithosphere. Mesozoic rifting had minimal impact on our study area, except locally within the Bay of Fundy and in southern Nova Scotia, where fast directions are subparallel to the opening direction of Mesozoic rifting; associated δt values of >1 s require an anisotropic layer that spans both the crust and mantle, meaning the formation of the Bay of Fundy was not merely a thin-skinned tectonic event.
Lithospheric deformation in the Canadian Appalachians: evidence from shear wave splitting
NASA Astrophysics Data System (ADS)
Gilligan, Amy; Bastow, Ian D.; Watson, Emma; Darbyshire, Fiona A.; Levin, Vadim; Menke, William; Lane, Victoria; Hawthorn, David; Boyce, Alistair; Liddell, Mitchell V.; Petrescu, Laura
2016-08-01
Plate-scale deformation is expected to impart seismic anisotropic fabrics on the lithosphere. Determination of the fast shear wave orientation (ϕ) and the delay time between the fast and slow split shear waves (δt) via SKS splitting can help place spatial and temporal constraints on lithospheric deformation. The Canadian Appalachians experienced multiple episodes of deformation during the Phanerozoic: accretionary collisions during the Palaeozoic prior to the collision between Laurentia and Gondwana, and rifting related to the Mesozoic opening of the North Atlantic. However, the extent to which extensional events have overprinted older orogenic trends is uncertain. We address this issue through measurements of seismic anisotropy beneath the Canadian Appalachians, computing shear wave splitting parameters (ϕ, δt) for new and existing seismic stations in Nova Scotia and New Brunswick. Average δt values of 1.2 s, relatively short length scale (≥100 km) splitting parameter variations, and a lack of correlation with absolute plate motion direction and mantle flow models, demonstrate that fossil lithospheric anisotropic fabrics dominate our results. Most fast directions parallel Appalachian orogenic trends observed at the surface, while δt values point towards coherent deformation of the crust and mantle lithosphere. Mesozoic rifting had minimal impact on our study area, except locally within the Bay of Fundy and in southern Nova Scotia, where fast directions are subparallel to the opening direction of Mesozoic rifting; associated δt values of >1 s require an anisotropic layer that spans both the crust and mantle, meaning the formation of the Bay of Fundy was not merely a thin-skinned tectonic event.
A Domain Decomposition Parallelization of the Fast Marching Method
NASA Technical Reports Server (NTRS)
Herrmann, M.
2003-01-01
In this paper, the first domain decomposition parallelization of the Fast Marching Method for level sets has been presented. Parallel speedup has been demonstrated in both the optimal and non-optimal domain decomposition case. The parallel performance of the proposed method is strongly dependent on separately load balancing the number of nodes on each side of the interface. A load imbalance of nodes on either side of the domain leads to an increase in communication and rollback operations. Furthermore, the amount of inter-domain communication can be reduced by aligning the inter-domain boundaries with the interface normal vectors. In the case of optimal load balancing and aligned inter-domain boundaries, the proposed parallel FMM algorithm is highly efficient, reaching efficiency factors of up to 0.98. Future work will focus on the extension of the proposed parallel algorithm to higher order accuracy. Also, to further enhance parallel performance, the coupling of the domain decomposition parallelization to the G(sub 0)-based parallelization will be investigated.
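As background for the record above, the serial Fast Marching Method that such a parallelization builds on can be sketched as follows. This is a generic first-order version on a uniform 2D grid, solving |grad T| = 1/speed with a binary heap; it contains none of the domain decomposition, communication or rollback logic of the paper, and the function name and test grid are illustrative assumptions.

```python
import heapq

import numpy as np


def fast_marching(speed, sources, h=1.0):
    """First-order fast marching arrival times on a uniform 2D grid."""
    n_rows, n_cols = speed.shape
    arrival = np.full(speed.shape, np.inf)
    accepted = np.zeros(speed.shape, dtype=bool)
    heap = []
    for i, j in sources:
        arrival[i, j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def known(i, j):
        # Only accepted (finalized) nodes may act as upwind neighbours.
        if 0 <= i < n_rows and 0 <= j < n_cols and accepted[i, j]:
            return arrival[i, j]
        return np.inf

    def upwind_update(i, j):
        t_row = min(known(i - 1, j), known(i + 1, j))
        t_col = min(known(i, j - 1), known(i, j + 1))
        lo, hi = min(t_row, t_col), max(t_row, t_col)
        step = h / speed[i, j]
        if hi - lo >= step:
            # Only one axis contributes (also covers hi == inf).
            return lo + step
        # Two-axis quadratic solution of the upwind discretization.
        return 0.5 * (lo + hi + np.sqrt(2.0 * step**2 - (hi - lo) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if accepted[i, j]:
            continue  # stale heap entry
        accepted[i, j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n_rows and 0 <= nj < n_cols and not accepted[ni, nj]:
                t_new = upwind_update(ni, nj)
                if t_new < arrival[ni, nj]:
                    arrival[ni, nj] = t_new
                    heapq.heappush(heap, (t_new, ni, nj))
    return arrival


# Unit speed and a single point source: T approximates Euclidean distance.
times = fast_marching(np.ones((41, 41)), sources=[(20, 20)])
```

The strictly ordered acceptance of nodes by increasing arrival time is what makes the method inherently sequential, and is the reason a parallel version needs mechanisms such as the rollback operations mentioned in the abstract.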
Seismic anisotropy beneath South China Sea: using SKS splitting to constrain mantle flow
NASA Astrophysics Data System (ADS)
Xue, M.; Le, K.; Yang, T.
2011-12-01
The evolution of the South China Sea is under debate and several hypotheses have been proposed: (1) the collision of the India plate and the Eurasia plate; (2) the backward movement of the Pacific subduction plate; (3) mantle upwelling; and (4) combinations of the above hypotheses. All these causal mechanisms emphasize the contributions of deep structures to the evolution of the South China Sea. In this study we use earthquake data recorded by seismic stations surrounding the South China Sea to constrain the mantle flow beneath it. To fill the gap in seismic data in Vietnam, we deployed 4 seismic stations (VT01-VT04) in a roughly north-south orientation in Vietnam in Nov. 2009. We combine the VT dataset with the AD and MY datasets from IRIS and select 81 events for SKS splitting analysis. Measurements were made at 11 stations using the multi-event stacking procedure of Wolfe and Silver (1998). Our observed splitting directions in Vietnam are generally consistent with those of Bai et al. (2009). In northern Vietnam, the splitting times are around 1 s and the fast directions are NWW-SEE, parallel to the absolute plate motion as well as the motion of the Earth's surface, implying that the crust and the mantle are coupled in this region and are moving as a result of the collision of India and China. In southern Vietnam and Malaya, by contrast, the fast directions are NE-SW, almost perpendicular to the absolute plate motion as well as the surface motion of the Eurasia plate. However, the observed NE-SW direction is parallel to the subduction direction of the Australian plate and might be caused by NE-SW mantle flow induced by the subduction.
Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.
He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej
2011-12-01
Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
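To make the idea of a multiplicative update for symmetric NMF concrete, here is a minimal sketch that minimizes ||A - H Hᵀ||_F with a damped multiplicative rule. It is an assumption that this resembles the paper's algorithms: the rule shown is a commonly used damped update with factor `beta = 0.5`, and the function name, damping value and test matrix are illustrative rather than taken from the source. Each iteration uses only dense matrix-matrix products, the kind of operation provided by level 3 BLAS.

```python
import numpy as np


def symmetric_nmf(a, rank, n_iter=300, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric nonnegative matrix A by H @ H.T with H >= 0.

    Returns H and the Frobenius error before each iteration plus the
    final error, so errors[0] is the error at the random start.
    """
    rng = np.random.default_rng(seed)
    h = rng.random((a.shape[0], rank))
    errors = [np.linalg.norm(a - h @ h.T)]
    for _ in range(n_iter):
        numerator = a @ h                     # matrix-matrix product
        denominator = h @ (h.T @ h) + eps     # eps guards against 0/0
        # Damped multiplicative step; factors stay nonnegative.
        h *= (1.0 - beta) + beta * numerator / denominator
        errors.append(np.linalg.norm(a - h @ h.T))
    return h, errors


# Hypothetical test matrix with an exact rank-3 nonnegative factorization.
rng = np.random.default_rng(42)
h_true = rng.random((30, 3))
a = h_true @ h_true.T
h_fit, errors = symmetric_nmf(a, rank=3)
```

For clustering, a row of H is often normalized to sum to one and read as soft cluster memberships; that interpretation is an assumption here, not a statement about the paper's procedure.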
S-HARP: A parallel dynamic spectral partitioner
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sohn, A.; Simon, H.
1998-01-01
Computational science problems with adaptive meshes involve dynamic load balancing when implemented on parallel machines. This dynamic load balancing requires fast partitioning of computational meshes at run time. The authors present in this report a fast parallel dynamic partitioner, called S-HARP. The underlying principles of S-HARP are the fast feature of inertial partitioning and the quality feature of spectral partitioning. S-HARP partitions a graph from scratch, requiring no partition information from previous iterations. Two types of parallelism have been exploited in S-HARP, fine grain loop level parallelism and coarse grain recursive parallelism. The parallel partitioner has been implemented in Message Passing Interface on Cray T3E and IBM SP2 for portability. Experimental results indicate that S-HARP can partition a mesh of over 100,000 vertices into 256 partitions in 0.2 seconds on a 64 processor Cray T3E. S-HARP is much more scalable than other dynamic partitioners, giving over 15 fold speedup on 64 processors while ParaMeTiS1.0 gives a few fold speedup. Experimental results demonstrate that S-HARP is three to 10 times faster than the dynamic partitioners ParaMeTiS and Jostle on six computational meshes of size over 100,000 vertices.
A general purpose subroutine for fast fourier transform on a distributed memory parallel machine
NASA Technical Reports Server (NTRS)
Dubey, A.; Zubair, M.; Grosch, C. E.
1992-01-01
One issue which is central in developing a general purpose Fast Fourier Transform (FFT) subroutine on a distributed memory parallel machine is the data distribution. It is possible that different users would like to use the FFT routine with different data distributions. Thus, there is a need to design FFT schemes on distributed memory parallel machines which can support a variety of data distributions. An FFT implementation on a distributed memory parallel machine which works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation on a distributed memory parallel machine Intel iPSC/860 is evaluated.
Thermal diffusivity measurement of GaAs/AlGaAs thin-film structures
NASA Astrophysics Data System (ADS)
Chen, G.; Tien, C. L.; Wu, X.; Smith, J. S.
1994-05-01
This work develops a new measurement technique that determines the thermal diffusivity of thin films in both parallel and perpendicular directions, and presents experimental results on the thermal diffusivity of GaAs/AlGaAs-based thin-film structures. In the experiment, a modulated laser source heats up the sample and a fast-response temperature sensor patterned directly on the sample picks up the thermal response. From the phase delay between the heating source and the temperature sensor, the thermal diffusivity in either the parallel or perpendicular direction is obtained depending on the experimental configuration. The experiment is performed on a molecular-beam-epitaxy grown vertical-cavity surface-emitting laser (VCSEL) structure. The substrates of the samples are etched away to eliminate the effects of the interface between the film and the substrate. The results show that the thermal diffusivity of the VCSEL structure is 5-7 times smaller than that of its corresponding bulk media. The experiments also provide evidence on the anisotropy of thermal diffusivity caused solely by the effects of interfaces and boundaries of thin films.
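The link between phase delay and thermal diffusivity can be illustrated with the textbook one-dimensional thermal-wave relation. This is a simplified sketch, not the analysis used in the paper: it assumes a plane thermal wave in a homogeneous medium, for which the phase lag at distance x from a source modulated at frequency f is x / L with diffusion length L = sqrt(α / (π f)). The function names and numerical values are hypothetical.

```python
import math


def thermal_diffusion_length(diffusivity, frequency_hz):
    """Diffusion length L = sqrt(alpha / (pi * f)) of a 1D thermal wave."""
    return math.sqrt(diffusivity / (math.pi * frequency_hz))


def diffusivity_from_phase_lag(distance, frequency_hz, phase_lag_rad):
    """Invert phase_lag = distance / L for the thermal diffusivity alpha."""
    if phase_lag_rad <= 0.0:
        raise ValueError("phase lag must be positive")
    return math.pi * frequency_hz * distance**2 / phase_lag_rad**2


# Round-trip check with made-up values (SI units).
alpha_true = 1.0e-5          # m^2/s, hypothetical diffusivity
frequency = 1.0e3            # Hz, hypothetical modulation frequency
separation = 50.0e-6         # m, hypothetical heater-to-sensor distance
lag = separation / thermal_diffusion_length(alpha_true, frequency)
alpha_recovered = diffusivity_from_phase_lag(separation, frequency, lag)
```

In a real thin-film measurement the geometry, finite heater size and boundary conditions modify this relation, so the formula should be read only as the idealized limit.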
Overview and extensions of a system for routing directed graphs on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1988-01-01
Many problems can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from adjacent vertices. A method is given for parallelizing such problems on an SIMD machine model that uses only nearest neighbor connections for communication, and has no facility for local indirect addressing. Each vertex of the graph will be assigned to a processor in the machine. Rules for a labeling are introduced that support the use of a simple algorithm for movement of data along the edges of the graph. Additional algorithms are defined for addition and deletion of edges. Modifying or adding a new edge takes the same time as parallel traversal. This combination of architecture and algorithms defines a system that is relatively simple to build and can do fast graph processing. All edges can be traversed in parallel in time O(T), where T is empirically proportional to the average path length in the embedding times the average degree of the graph. Additionally, researchers present an extension to the above method which allows for enhanced performance by allowing some broadcasting capabilities.
Rotary fast tool servo system and methods
Montesanti, Richard C.; Trumper, David L.
2007-10-02
A high bandwidth rotary fast tool servo provides tool motion in a direction nominally parallel to the surface-normal of a workpiece at the point of contact between the cutting tool and workpiece. Three or more flexure blades having all ends fixed are used to form an axis of rotation for a swing arm that carries a cutting tool at a set radius from the axis of rotation. An actuator rotates a swing arm assembly such that a cutting tool is moved in and away from the lathe-mounted, rotating workpiece in a rapid and controlled manner in order to machine the workpiece. A pair of position sensors provides rotation and position information for a swing arm to a control system. A control system commands and coordinates motion of the fast tool servo with the motion of a spindle, rotating table, cross-feed slide, and in-feed slide of a precision lathe.
Rotary fast tool servo system and methods
Montesanti, Richard C [Cambridge, MA; Trumper, David L [Plaistow, NH; Kirtley, Jr., James L.
2009-08-18
A high bandwidth rotary fast tool servo provides tool motion in a direction nominally parallel to the surface-normal of a workpiece at the point of contact between the cutting tool and workpiece. Three or more flexure blades having all ends fixed are used to form an axis of rotation for a swing arm that carries a cutting tool at a set radius from the axis of rotation. An actuator rotates a swing arm assembly such that a cutting tool is moved in and away from the lathe-mounted, rotating workpiece in a rapid and controlled manner in order to machine the workpiece. One or more position sensors provides rotation and position information for a swing arm to a control system. A control system commands and coordinates motion of the fast tool servo with the motion of a spindle, rotating table, cross-feed slide, and in-feed slide of a precision lathe.
Particle-in-cell studies of fast-ion slowing-down rates in cool tenuous magnetized plasma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evans, Eugene S.; Cohen, Samuel A.; Welch, Dale R.
We report on 3D-3V particle-in-cell simulations of fast-ion energy-loss rates in a cold, weakly-magnetized, weakly-coupled plasma where the electron gyroradius, ρ_e, is comparable to or less than the Debye length, λ_De, and the fast-ion velocity exceeds the electron thermal velocity, a regime in which the electron response may be impeded. These simulations use explicit algorithms, spatially resolve ρ_e and λ_De, and temporally resolve the electron cyclotron and plasma frequencies. For mono-energetic dilute fast ions with isotropic velocity distributions, these scaling studies of the slowing-down time, τ_s, versus fast-ion charge are in agreement with unmagnetized slowing-down theory; with an applied magnetic field, no consistent anisotropy between τ_s in the cross-field and field-parallel directions could be resolved. Scaling the fast-ion charge is confirmed as a viable way to reduce the required computational time for each simulation. In conclusion, the implications of these slowing down processes are described for one magnetic-confinement fusion concept, the small, advanced-fuel, field-reversed configuration device.
Particle-in-cell studies of fast-ion slowing-down rates in cool tenuous magnetized plasma
Evans, Eugene S.; Cohen, Samuel A.; Welch, Dale R.
2018-04-05
We report on 3D-3V particle-in-cell simulations of fast-ion energy-loss rates in a cold, weakly-magnetized, weakly-coupled plasma where the electron gyroradius, ρ_e, is comparable to or less than the Debye length, λ_De, and the fast-ion velocity exceeds the electron thermal velocity, a regime in which the electron response may be impeded. These simulations use explicit algorithms, spatially resolve ρ_e and λ_De, and temporally resolve the electron cyclotron and plasma frequencies. For mono-energetic dilute fast ions with isotropic velocity distributions, these scaling studies of the slowing-down time, τ_s, versus fast-ion charge are in agreement with unmagnetized slowing-down theory; with an applied magnetic field, no consistent anisotropy between τ_s in the cross-field and field-parallel directions could be resolved. Scaling the fast-ion charge is confirmed as a viable way to reduce the required computational time for each simulation. In conclusion, the implications of these slowing down processes are described for one magnetic-confinement fusion concept, the small, advanced-fuel, field-reversed configuration device.
NASA Technical Reports Server (NTRS)
Chew, W. C.; Song, J. M.; Lu, C. C.; Weedon, W. H.
1995-01-01
In the first phase of our work, we have concentrated on laying the foundation to develop fast algorithms, including the use of recursive structure like the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We have also investigated the use of curvilinear patches to build a basic method of moments code where these acceleration techniques can be used later. In the second phase, which is mainly reported on here, we have concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and have been able to obtain some 3D scattering results. In order to understand the parallelization of codes on the Connection Machine, we have also studied the parallelization of 3D finite-difference time-domain (FDTD) code with PML material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC can be parallelized very well allowing us to solve within a minute a problem of over a million nodes. In addition, we have studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution to integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle, but are slower than LU decomposition when many incident angles are needed as in the monostatic RCS calculations.
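The role of the matrix-vector product in a conjugate-gradient solve can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a real symmetric positive-definite system, whereas scattering integral equations are complex and would need a suitable variant, and the names `conjugate_gradient` and `dense_matvec` are hypothetical. The point is that the solver only touches the matrix through `matvec`, which is the step a fast multipole scheme would replace.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # converged: residual norm below tolerance
            break
        ap = matvec(p)               # the only place the matrix is used
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def dense_matvec(a):
    """Hypothetical O(N^2) dense product; a fast method would stand in here."""
    return lambda v: [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


if __name__ == "__main__":
    solution = conjugate_gradient(dense_matvec([[4.0, 1.0], [1.0, 3.0]]), [1.0, 2.0])
    print(solution)
```

Each new right-hand side (for example, each incident angle) requires a fresh iterative solve, whereas an LU factorization is computed once and reused, which is consistent with the trade-off reported in the abstract.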
NASA Astrophysics Data System (ADS)
Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; Qin, Jian; Karpeev, Dmitry; Hernandez-Ortiz, Juan; de Pablo, Juan J.; Heinonen, Olle
2016-08-01
Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. The results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
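The quadratic cost of direct evaluation mentioned above can be made concrete with a short sketch. It assumes point charges and a bare 1/r kernel in arbitrary units, and the function names are hypothetical. This is the brute-force baseline that a fast multipole method avoids, not an FMM implementation.

```python
import math


def direct_potentials(positions, charges):
    """Potential at every point due to all other points, by direct summation."""
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):    # visit each unordered pair exactly once
            r = math.dist(positions[i], positions[j])
            phi[i] += charges[j] / r
            phi[j] += charges[i] / r
    return phi


def pair_count(n):
    """Number of kernel evaluations performed by the double loop above."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    print(direct_potentials([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0]))
    print(pair_count(1000), pair_count(2000))
```

Doubling N roughly quadruples the number of pair evaluations, which is the scaling the abstract identifies as the obstacle for large problems.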
Adropin induction of lipoprotein lipase expression in tilapia hepatocytes.
Lian, Anji; Wu, Keqiang; Liu, Tianqiang; Jiang, Nan; Jiang, Quan
2016-01-01
The peptide hormone adropin plays a role in energy homeostasis. However, information on the biological actions of adropin in non-mammalian species is still lacking. Using tilapia as a model, we examined the role of adropin in lipoprotein lipase (LPL) regulation in hepatocytes. To this end, the structural identity of tilapia adropin was established by 5'/3'-rapid amplification of cDNA ends (RACE). The transcripts of tilapia adropin were ubiquitously expressed in various tissues with the highest levels in the liver and hypothalamus. Prolonged fasting could elevate tilapia hepatic adropin gene expression, whereas no effect of fasting was observed on hypothalamic adropin gene levels. In primary cultures of tilapia hepatocytes, synthetic adropin was effective in stimulating LPL release, cellular LPL content, and total LPL production. The increase in LPL production also occurred with parallel rises in LPL gene levels. In parallel experiments, adropin could elevate cAMP production and up-regulate protein kinase A (PKA) and PKC activities. Using a pharmacological approach, cAMP/PKA and PLC/inositol trisphosphate (IP3)/PKC cascades were shown to be involved in adropin-stimulated LPL gene expression. Parallel inhibition of p38MAPK and Erk1/2, however, was not effective in this regard. Our findings provide, for the first time, evidence that adropin could stimulate LPL gene expression via direct actions in tilapia hepatocytes through the activation of multiple signaling mechanisms. © 2016 Society for Endocrinology.
Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; ...
2016-08-10
Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. Lastly, the results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
Exploiting Symmetry on Parallel Architectures.
NASA Astrophysics Data System (ADS)
Stiller, Lewis Benjamin
1995-01-01
This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Electromagnetic variable degrees of freedom actuator systems and methods
Montesanti, Richard C [Pleasanton, CA; Trumper, David L [Plaistow, NH; Kirtley, Jr., James L.
2009-02-17
The present invention provides a variable reluctance actuator system and method that can be adapted for simultaneous rotation and translation of a moving element by applying a normal-direction magnetic flux on the moving element. In a beneficial example arrangement, the moving element includes a swing arm that carries a cutting tool at a set radius from an axis of rotation so as to produce a rotary fast tool servo that provides a tool motion in a direction substantially parallel to the surface-normal of a workpiece at the point of contact between the cutting tool and workpiece. An actuator rotates a swing arm such that a cutting tool moves toward and away from a mounted rotating workpiece in a controlled manner in order to machine the workpiece. Position sensors provide rotation and displacement information for a swing arm to a control system. A control system commands and coordinates motion of the fast tool servo with the motion of a spindle, rotating table, cross-feed slide, and in-feed slide of a precision lathe.
NASA Astrophysics Data System (ADS)
Roy, Sunil K.; Kumar, M. Ravi; Davuluri, Srinagesh
2017-08-01
This study presents 106 splitting and 40 null measurements of source side anisotropy in subduction zones, utilizing direct S waves registered at two stations sited on the Indian continent, which show null shear wave splitting measurements for SKS phases. Our results suggest that trench-parallel anisotropy is dominant beneath the Philippines, Mariana, Izu-Bonin, and edge of the Java slab, while plate motion-parallel anisotropy is observed beneath the Solomon, Aegean, Japan, and Java slabs. Results from Kuril and Aleutian regions reveal trench-oblique anisotropy. We chose to interpret these observations primarily in terms of mantle flow beneath a subduction zone. While the two-dimensional (2-D) slab entrained flow model offers a simple explanation for trench-normal fast polarization azimuths (FPA), the trench-parallel FPA can be reconciled by extension due to slab rollback. The model that invokes age of the subducting lithosphere can explain anisotropy in the subslab, derived from rays recorded at the updip stations. However, when downdip stations are used, contributions from the slab and supraslab need to be considered. In Japan, anisotropy in the subslab mantle shallower than 300 km might be associated with trench-parallel mantle flow resulting in the alignment of FPA in the same direction. Anisotropy in the deeper part, above the transition zone, is probably associated with 2-D flow resulting in trench-normal FPA. Anisotropy in the Mariana Trench might be associated with trench-parallel mantle flow in the supraslab region, with similar deformation in the upper mantle and the transition zone.
NASA Astrophysics Data System (ADS)
Savage, M. K.; Ferrazzini, V.; Peltier, A.; Rivemale, E.; Mayor, J.; Schmid, A.; Brenguier, F.; Massin, F.; Got, J.-L.; Battaglia, J.; DiMuro, A.; Staudacher, T.; Rivet, D.; Taisne, B.; Shelley, A.
2015-05-01
The Piton de la Fournaise volcano exhibits frequent eruptions preceded by seismic swarms and is a good target to test hypotheses about magmatically induced variations in seismic wave properties. We use a permanent station network and a portable broadband network to compare seismic anisotropy measured via shear wave splitting with geodetic displacements, ratios of compressional to shear velocity (Vp/Vs), earthquake focal mechanisms, and ambient noise correlation analysis of surface wave velocities and to examine velocity and stress changes from 2000 through 2012. Fast directions align radially to the central cone and parallel to surface cracks and fissures, suggesting stress-controlled cracks. High Vp/Vs ratios under the summit compared with low ratios under the flank suggest spatial variations in the proportion of fluid-filled versus gas-filled cracks. Secular variations of fast directions (ϕ) and delay times (dt) between split shear waves are interpreted to sense changing crack densities and pressure. Delay times tend to increase while surface wave velocity decreases before eruptions. Rotations of ϕ may be caused by changes in either stress direction or fluid pressure. These changes usually correlate with GPS baseline changes. Changes in shear wave splitting measurements made on multiplets yield several populations with characteristic delay times, measured incoming polarizations, and fast directions, which change their proportion as a function of time. An eruption sequence on 14 October 2010 yielded over 2000 shear wave splitting measurements in a 14 h period, allowing high time resolution measurements to characterize the sequence. Stress directions from a propagating dike model qualitatively fit the temporal change in splitting.
NASA Astrophysics Data System (ADS)
Shao, Tongbin; Ji, Shaocheng; Oya, Shoma; Michibayashi, Katsuyoshi; Wang, Qian
2016-05-01
Measurements of crystallographic preferred orientations (CPO) and calculations of P- and S-wave velocities (Vp and Vs) and anisotropy were conducted on three quartz-mica schists and one felsic mylonite, which are representative of typical metamorphic rocks deformed in the middle crust beneath the southeastern Tibetan plateau. Results show that the schists have Vp anisotropy (AVp) ranging from 16.4% to 25.5% and maximum Vs anisotropy [AVs(max)] between 21.6% and 37.8%. The mylonite has lower AVp and AVs(max) but slightly higher foliation anisotropy, which are 13.2%, 18.5%, and 3.07%, respectively, due to the lower content and CPO strength of mica. With increasing mica content, the deformed rocks tend to form transverse isotropy (TI) with fast velocities in the foliation plane and slow velocities normal to the foliation. However, the presence of prismatic minerals (e.g., amphibole and sillimanite) forces the overall symmetry to deviate from TI. An increase in feldspar content reduces the bulk anisotropy caused by mica or quartz because the fast-axis of feldspar aligns parallel to the slow-axis of mica and/or quartz. The effect of quartz on seismic properties of mica-bearing rocks is complex, depending on its content and prevailing slip system. The greatest shear-wave splitting and fastest Vp both occur for propagation directions within the foliation plane, consistent with the fast Pms (S-wave converted from P-wave at the Moho) polarization directions in the west Yunnan where mica/amphibole-bearing rocks have developed pervasive subvertical foliation and subhorizontal lineation. The fast Pms directions are perpendicular to the approximately E-W orienting fast SKS (S-wave traversing the core as P-wave) directions, indicating a decoupling at the Moho interface between the crust and mantle beneath the region. 
The seismic data are inconsistent with the model of crustal channel flow as the latter should produce a subhorizontal foliation where vertically incident shear waves suffer little splitting.
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
Zhu, Lei; Yin, Qiuyuan; Irwin, David M; Zhang, Shuyi
2015-01-01
Bats are an ideal mammalian group for exploring adaptations to fasting due to their large variety of diets and because fasting is a regular part of their life cycle. Mammals fed on a carbohydrate-rich diet experience a rapid decrease in blood glucose levels during a fast, thus, the development of mechanisms to resist the consequences of regular fasts, experienced on a daily basis, must have been crucial in the evolution of frugivorous bats. Phosphoenolpyruvate carboxykinase 1 (PEPCK1, encoded by the Pck1 gene) is the rate-limiting enzyme in gluconeogenesis and is largely responsible for the maintenance of glucose homeostasis during fasting in fruit-eating bats. To test whether Pck1 has experienced adaptive evolution in frugivorous bats, we obtained Pck1 coding sequence from 20 species of bats, including five Old World fruit bats (OWFBs) (Pteropodidae) and two New World fruit bats (NWFBs) (Phyllostomidae). Our molecular evolutionary analyses of these sequences revealed that Pck1 was under purifying selection in both Old World and New World fruit bats with no evidence of positive selection detected in either ancestral branch leading to fruit bats. Interestingly, however, six specific amino acid substitutions were detected on the ancestral lineage of OWFBs. In addition, we found considerable evidence for parallel evolution, at the amino acid level, between the PEPCK1 sequences of Old World fruit bats and New World fruit bats. Test for parallel evolution showed that four parallel substitutions (Q276R, R503H, I558V and Q593R) were driven by natural selection. Our study provides evidence that Pck1 underwent parallel evolution between Old World and New World fruit bats, two lineages of mammals that feed on a carbohydrate-rich diet and experience regular periods of fasting as part of their life cycle.
Irwin, David M.; Zhang, Shuyi
2015-01-01
Bats are an ideal mammalian group for exploring adaptations to fasting due to their large variety of diets and because fasting is a regular part of their life cycle. Mammals fed on a carbohydrate-rich diet experience a rapid decrease in blood glucose levels during a fast, thus, the development of mechanisms to resist the consequences of regular fasts, experienced on a daily basis, must have been crucial in the evolution of frugivorous bats. Phosphoenolpyruvate carboxykinase 1 (PEPCK1, encoded by the Pck1 gene) is the rate-limiting enzyme in gluconeogenesis and is largely responsible for the maintenance of glucose homeostasis during fasting in fruit-eating bats. To test whether Pck1 has experienced adaptive evolution in frugivorous bats, we obtained Pck1 coding sequence from 20 species of bats, including five Old World fruit bats (OWFBs) (Pteropodidae) and two New World fruit bats (NWFBs) (Phyllostomidae). Our molecular evolutionary analyses of these sequences revealed that Pck1 was under purifying selection in both Old World and New World fruit bats with no evidence of positive selection detected in either ancestral branch leading to fruit bats. Interestingly, however, six specific amino acid substitutions were detected on the ancestral lineage of OWFBs. In addition, we found considerable evidence for parallel evolution, at the amino acid level, between the PEPCK1 sequences of Old World fruit bats and New World fruit bats. Test for parallel evolution showed that four parallel substitutions (Q276R, R503H, I558V and Q593R) were driven by natural selection. Our study provides evidence that Pck1 underwent parallel evolution between Old World and New World fruit bats, two lineages of mammals that feed on a carbohydrate-rich diet and experience regular periods of fasting as part of their life cycle. PMID:25807515
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1989-01-01
The data parallel implementation of a particle simulation for hypersonic rarefied flow described by Dagum associates a single parallel data element with each particle in the simulation. The simulated space is divided into discrete regions called cells containing a variable and constantly changing number of particles. The implementation requires a global sort of the parallel data elements so as to arrange them in an order that allows immediate access to the information associated with cells in the simulation. Described here is a very fast algorithm for performing the necessary ranking of the parallel data elements. The performance of the new algorithm is compared with that of the microcoded instruction for ranking on the Connection Machine.
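The ranking step described above can be illustrated with a serial counting sort. This is a sketch under stated assumptions, not the data-parallel algorithm of the report: the function name `rank_by_cell` and the flat list of cell indices are hypothetical. It computes, for each particle, the position it would occupy if all particles were stored contiguously by cell.

```python
def rank_by_cell(cell_of_particle, num_cells):
    """Return (ranks, offsets): each particle's sorted slot and each cell's start."""
    # Count how many particles currently sit in each cell.
    counts = [0] * num_cells
    for c in cell_of_particle:
        counts[c] += 1

    # An exclusive prefix sum turns the counts into the first slot of each cell.
    offsets = [0] * num_cells
    running = 0
    for c in range(num_cells):
        offsets[c] = running
        running += counts[c]

    # Hand each particle the next free slot in its cell (stable within a cell).
    next_slot = offsets[:]
    ranks = [0] * len(cell_of_particle)
    for p, c in enumerate(cell_of_particle):
        ranks[p] = next_slot[c]
        next_slot[c] += 1
    return ranks, offsets


if __name__ == "__main__":
    print(rank_by_cell([2, 0, 1, 0, 2], 3))
```

With the ranks in hand, the particles of cell c occupy slots offsets[c] up to the start of the next cell, which gives the immediate per-cell access the abstract refers to.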
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2012-01-10
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
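The sequence of steps in the claim (one-dimensional FFTs, a redistribution, then one-dimensional FFTs in the second dimension) can be sketched on a single machine. This is an illustration only: the in-memory `transpose` stands in for the network "all-to-all" step, the randomized ordering of the patent is not modeled, and all function names are hypothetical. Array dimensions are assumed to be powers of two.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft1d(x[0::2]), fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(a):
    """Stand-in for the all-to-all redistribution between nodes."""
    return [list(col) for col in zip(*a)]


def fft2d(a):
    rows = [fft1d(r) for r in a]              # first 1D FFT, along rows
    cols = [fft1d(c) for c in transpose(rows)]  # redistribute, then second 1D FFT
    return transpose(cols)                    # restore the original layout


def dft2d_naive(a):
    """Direct 2D DFT from the definition, used only as a reference check."""
    m, n = len(a), len(a[0])
    return [[sum(a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)]
            for u in range(m)]


if __name__ == "__main__":
    print(fft2d([[1, 0], [0, 0]]))
```

In a distributed setting each node would hold a block of rows, so the transpose is where all of the inter-node communication occurs.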
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2008-01-01
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
Single-cell analysis by ICP-MS/MS as a fast tool for cellular bioavailability studies of arsenite.
Meyer, S; López-Serrano, A; Mitze, H; Jakubowski, N; Schwerdtle, T
2018-01-24
Single-cell inductively coupled plasma mass spectrometry (SC-ICP-MS) has become a powerful and fast tool to evaluate the elemental composition at a single-cell level. In this study, the cellular bioavailability of arsenite (incubation of 25 and 50 μM for 0-48 h) has been successfully assessed by SC-ICP-MS/MS for the first time directly after re-suspending the cells in water. This procedure avoids the normally arising cell membrane permeabilization caused by cell fixation methods (e.g. methanol fixation). The reliability and feasibility of this SC-ICP-MS/MS approach with a limit of detection of 0.35 fg per cell was validated by conventional bulk ICP-MS/MS analysis after cell digestion and parallel measurement of sulfur and phosphorus.
Chen, Ying-Ying; Chang, Li-Te; Chen, Hung-Wei; Yang, Chia-Ying; Hsin, Ling-Wei
2017-03-13
A fast and facile synthesis of a series of 4-nitrophenyl 2-azidoethylcarbamate derivatives as activated urea building blocks was developed. The N-Fmoc-protected 2-aminoethyl mesylates derived from various commercially available N-Fmoc-protected α-amino acids, including those having functionalized side chains with acid-labile protective groups, were directly transformed into 4-nitrophenyl 2-azidoethylcarbamate derivatives in 1 h via a one-pot two-step reaction. These urea building blocks were utilized for the preparation of a series of urea moiety-containing mitoxantrone-amino acid conjugates in 75-92% yields and parallel solution-phase synthesis of a urea compound library consisting of 30 members in 38-70% total yields.
Coherent field propagation between tilted planes.
Stock, Johannes; Worku, Norman Girma; Gross, Herbert
2017-10-01
Propagating electromagnetic light fields between nonparallel planes is of special importance, e.g., within the design of novel computer-generated holograms or the simulation of optical systems. In contrast to the extensively discussed evaluation between parallel planes, the diffraction-based propagation of light onto a tilted plane is more burdensome, since discrete fast Fourier transforms cannot be applied directly. In this work, we propose a quasi-fast algorithm (O(N^3 log N)) that deals with this problem. Based on a proper decomposition into three rotations, the vectorial field distribution is calculated on a tilted plane using the spectrum of plane waves. The algorithm works on equidistant grids, so neither nonuniform Fourier transforms nor an explicit complex interpolation is necessary. The proposed algorithm is discussed in detail and applied to several examples of practical interest.
NASA Astrophysics Data System (ADS)
Walther, M.; Plenefisch, T.; Rümpker, G.
2014-02-01
Upper mantle anisotropy beneath Germany is investigated through the measurements and analysis of shear-wave splitting using SKS phases. We analysed teleseismic events recorded by 24 broadband stations of the German Regional Seismic Network (GRSN) and three broadband stations of the Gräfenberg-Array (GRF). These permanent German networks cover an area extending from the Alps in the south up to the Northern German basin towards north. In comparison to several former studies that are based either on short observation periods or that are restricted to limited areas of Germany, we resort to 22 yr of the GRSN (1991-2012) and 34 yr of GRF data archive (1979-2012). Due to the huge amount of data, we applied a fully automatic procedure to determine SKS splitting parameters from archived recordings and also applied strong quality constraints to obtain reliable solutions. From our analysis, two main features are obvious: For the stations in the middle and southern part of Germany we found homogeneous E-W to ENE-WSW fast-axis directions. In contrast, stations in NE-Germany exhibit a NW-SE oriented fast axis. Both findings can be correlated to major tectonic features in Central Europe. The E-W to ENE-WSW orientations in the middle and southern part of Germany are nearly parallel to the strike of the Variscan mountain belts, whereas the NW-SE direction in NE-Germany corresponds to the orientation of the nearby Tornquist-Teisseyre suture zone. For the southern part of Germany, there are indications for an alignment of the fast axis parallel to the curvature of the nearby Alps. Apart from the more large-scale features there are two stations (BFO and CLZ) which seem to have an imprint related to the regional geodynamic setting, namely the rifting in the Southern Rhine Graben and the formation of the Harz Mountains, respectively. We conclude that the observed regional variations of splitting parameter over Germany advocate for a mostly lithospheric route of the anisotropy. 
Furthermore, variations of the splitting parameters with respect to the azimuths of the incoming waves, as observed at some stations, point to vertical varying anisotropy. For some stations (BFO, RUE) the inversions for two anisotropic layers revealed directions of the fast axes that are similar to the strike directions of the surrounding tectonic units. For other stations, the confidence regions are too large for a tectonic interpretation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Malony, Allen D; Shende, Sameer
This is the final progress report for the FastOS (Phase 2) (FastOS-2) project with Argonne National Laboratory and the University of Oregon (UO). The project started at UO on July 1, 2008 and ran until April 30, 2010, at which time a six-month no-cost extension began. The FastOS-2 work at UO delivered excellent results in all research work areas:
* scalable parallel monitoring
* kernel-level performance measurement
* parallel I/O system measurement
* large-scale and hybrid application performance measurement
* online scalable performance data reduction and analysis
* binary instrumentation
NASA Astrophysics Data System (ADS)
Sato, Yasuhiro; Furuki, Makoto; Tian, Minquan; Iwasa, Izumi; Pu, Lyong Sun; Tatsuura, Satoshi
2002-04-01
We demonstrated ultrafast single-shot multichannel demultiplexing by using a squarylium dye J aggregate film as an optical Kerr medium. High efficiency and fast recovery of the optical Kerr responses were achieved when a signal-pulse wavelength was close to the absorption peak of the J aggregate film with off-resonant excitation. The on/off ratio in demultiplexing of 1 Tb/s signals was improved to be approximately 5. By introducing time delay to both horizontal and vertical directions, we succeeded in directly observing the conversion of 1 Tb/s serial signals into two-dimensionally arranged parallel signals.
NASA Astrophysics Data System (ADS)
Kan, Guangyuan; He, Xiaoyan; Ding, Liuqian; Li, Jiren; Hong, Yang; Zuo, Depeng; Ren, Minglei; Lei, Tianjie; Liang, Ke
2018-01-01
Hydrological model calibration has been a hot issue for decades. The shuffled complex evolution method developed at the University of Arizona (SCE-UA) has been proved to be an effective and robust optimization approach. However, its computational efficiency deteriorates significantly when the amount of hydrometeorological data increases. In recent years, the rise of heterogeneous parallel computing has brought hope for the acceleration of hydrological model calibration. This study proposed a parallel SCE-UA method and applied it to the calibration of a watershed rainfall-runoff model, the Xinanjiang model. The parallel method was implemented on heterogeneous computing systems using OpenMP and CUDA. Performance testing and sensitivity analysis were carried out to verify its correctness and efficiency. Comparison results indicated that heterogeneous parallel computing-accelerated SCE-UA converged much more quickly than the original serial version and possessed satisfactory accuracy and stability for the task of fast hydrological model calibration.
NASA Astrophysics Data System (ADS)
Cao, Yi; Jung, Haemyeong; Song, Shuguang
2018-01-01
Though extensively studied, the roles of olivine crystal preferred orientations (CPOs or fabrics) in affecting the seismic anisotropies in the Earth's upper mantle are rather complicated and still not fully known. In this study, we attempted to address this issue by analyzing the seismic anisotropies [e.g., P-wave anisotropy (AVp), S-wave polarization anisotropy (AVs), radial anisotropy (ξ), and Rayleigh wave anisotropy (G)] of the Songshugou peridotites (dunite dominated) in the Qinling orogen in central China, based on our previously reported olivine CPOs. The seismic anisotropy patterns of olivine aggregates in our studied samples are well consistent with the prediction for their olivine CPO types; and the magnitude of seismic anisotropies shows a striking positive correlation with equilibrium pressure and temperature (P-T) conditions. Significant reductions of seismic anisotropies (AVp, max. AVs, and G) are observed in porphyroclastic dunite compared to coarse- and fine-grained dunites, as the results of olivine CPO transition (from A-/D-type in coarse-grained dunite, through AG-type-like in porphyroclastic dunite, to B-type-like in fine-grained dunite) and strength variation (weakening: A-/D-type → AG-type-like; strengthening: AG-type-like → B-type-like) during dynamic recrystallization. The transition of olivine CPOs from A-/D-type to B-/AG-type-like in the forearc mantle may weaken the seismic anisotropies and deviate the fast velocity direction and the fast S-wave polarization direction from trench-perpendicular to trench-oblique direction with the cooling and aging of forearc mantle. Depending on the size and distribution of the peridotite body such as the Songshugou peridotites, B- and AG-type-like olivine CPOs can be an additional (despite minor) local contributor to the orogen-parallel fast velocity direction and fast shear-wave polarization direction in the orogenic crust such as in the Songshugou area in Qinling orogen.
Anisotropic Rayleigh-wave phase velocities beneath northern Vietnam
NASA Astrophysics Data System (ADS)
Legendre, Cédric P.; Zhao, Li; Huang, Win-Gee; Huang, Bor-Shouh
2015-02-01
We explore the Rayleigh-wave phase-velocity structure beneath northern Vietnam over a broad period range of 5 to 250 s. We use the two-stations technique to derive the dispersion curves from the waveforms of 798 teleseismic events recorded by a set of 23 broadband seismic stations deployed in northern Vietnam. These dispersion curves are then inverted for both isotropic and azimuthally anisotropic Rayleigh-wave phase-velocity maps in the period range of 10 to 50 s. Main findings include a crustal expression of the Red River Shear Zone and the Song Ma Fault. Northern Vietnam displays a northeast/southwest dichotomy in the lithosphere with fast velocities beneath the South China Block and slow velocities beneath the Simao Block and between the Red River Fault and the Song Da Fault. The anisotropy in the region is relatively simple, with a high amplitude and fast directions parallel to the Red River Shear Zone in the western part. In the eastern part, the amplitudes are generally smaller and the fast axis displays more variations with periods.
Wideband aperture array using RF channelizers and massively parallel digital 2D IIR filterbank
NASA Astrophysics Data System (ADS)
Sengupta, Arindam; Madanayake, Arjuna; Gómez-García, Roberto; Engeberg, Erik D.
2014-05-01
Wideband receive-mode beamforming applications in wireless location, electronically-scanned antennas for radar, RF sensing, microwave imaging and wireless communications require digital aperture arrays that offer a relatively constant far-field beam over several octaves of bandwidth. Several beamforming schemes including the well-known true time-delay and the phased array beamformers have been realized using either finite impulse response (FIR) or fast Fourier transform (FFT) digital filter-sum based techniques. These beamforming algorithms offer the desired selectivity at the cost of a high computational complexity and frequency-dependent far-field array patterns. A novel approach to receiver beamforming is the use of massively parallel 2-D infinite impulse response (IIR) fan filterbanks for the synthesis of relatively frequency independent RF beams at an order of magnitude lower multiplier complexity compared to FFT or FIR filter based conventional algorithms. The 2-D IIR filterbanks demand fast digital processing that can support several octaves of RF bandwidth, and fast analog-to-digital converters (ADCs) for RF-to-bits type direct conversion of wideband antenna element signals. Fast digital implementation platforms that can realize high-precision recursive filter structures necessary for real-time beamforming, at RF radio bandwidths, are also desired. We propose a novel technique that combines a passive RF channelizer, multichannel ADC technology, and single-phase massively parallel 2-D IIR digital fan filterbanks, realized at low complexity using FPGA and/or ASIC technology. There exists native support for a larger bandwidth than the maximum clock frequency of the digital implementation technology. We also strive to achieve More-than-Moore throughput by processing a wideband RF signal having content with N-fold (B = N Fclk/2) bandwidth compared to the maximum clock frequency Fclk Hz of the digital VLSI platform under consideration. 
Such an increase in bandwidth is achieved without use of polyphase signal processing or time-interleaved ADC methods. That is, all digital processors operate at the same Fclk clock frequency without phasing, while wideband operation is achieved by sub-sampling of narrower sub-bands at the RF channelizer outputs.
Long-range interactions and parallel scalability in molecular simulations
NASA Astrophysics Data System (ADS)
Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko
2007-01-01
Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.
A parallel-vector algorithm for rapid structural analysis on high-performance computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1990-01-01
A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers demonstrate the accuracy and speed of the method.
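The idea of using column heights to skip zeros can be illustrated with a minimal serial sketch. It is not the authors' code: the matrix is held as a dense list of lists for readability rather than in variable-band storage, no parallel or vector features are shown, and the names `column_starts`, `skyline_cholesky` and `solve` are hypothetical.

```python
import math

def column_starts(A):
    """Row index of the first nonzero entry in each column of the upper triangle."""
    n = len(A)
    return [next(i for i in range(j + 1) if A[i][j] != 0.0) for j in range(n)]

def skyline_cholesky(A):
    """Return upper-triangular U with A = U^T U for a symmetric positive-definite A.

    Entries above the first nonzero of a column (outside the column height)
    stay zero in U, so the loops never visit them.
    """
    n = len(A)
    first = column_starts(A)
    U = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(first[j], j):
            k0 = max(first[i], first[j])  # rows above k0 contribute only zeros
            s = A[i][j] - sum(U[k][i] * U[k][j] for k in range(k0, i))
            U[i][j] = s / U[i][i]
        d = A[j][j] - sum(U[k][j] ** 2 for k in range(first[j], j))
        U[j][j] = math.sqrt(d)
    return U

def solve(U, b):
    """Solve U^T U x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: U^T y = b
        y[i] = (b[i] - sum(U[k][i] * y[k] for k in range(i))) / U[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: U x = y
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

# Small symmetric positive-definite test matrix with unequal column heights.
A = [[4.0, 1.0, 0.5, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.5, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
b = [7.5, 12.0, 18.5, 19.0]  # equals A times [1, 2, 3, 4]
x = solve(skyline_cholesky(A), b)
print([round(v, 6) for v in x])
```

In a production code the savings come from storing only the entries inside each column height; the dense layout here merely keeps the index arithmetic visible.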
Research on the Application of Fast-steering Mirror in Stellar Interferometer
NASA Astrophysics Data System (ADS)
Mei, R.; Hu, Z. W.; Xu, T.; Sun, C. S.
2017-07-01
For a stellar interferometer, the fast-steering mirror (FSM) is widely utilized to correct wavefront tilt caused by atmospheric turbulence and internal instrumental vibration due to its high resolution and fast response frequency. In this study, the non-coplanar error between the FSM and actuator deflection axis introduced by manufacture, assembly, and adjustment is analyzed. Via a numerical method, the additional optical path difference (OPD) caused by the above factors is studied, and its effects on the tracking accuracy of the stellar interferometer are also discussed. On the other hand, the starlight parallelism between the beams of two arms is one of the main factors of the loss of fringe visibility. By analyzing the influence of wavefront tilt caused by the atmospheric turbulence on fringe visibility, a simple and efficient real-time correction scheme of starlight parallelism is proposed based on a single array detector. The feasibility of this scheme is demonstrated by laboratory experiment. The results show that, after correction by the fast-steering mirror, starlight parallelism preliminarily meets the wavefront-tilt requirement of the stellar interferometer.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trull, J.; Wang, B.; Parra, A.
2015-06-01
Pulse compression in dispersive strontium barium niobate crystal with a random size and distribution of the anti-parallel orientated nonlinear domains is observed via transverse second harmonic generation. The dependence of the transverse width of the second harmonic trace along the propagation direction allows for the determination of the initial chirp and duration of pulses in the femtosecond regime. This technique permits a real-time analysis of the pulse evolution and facilitates fast in-situ correction of pulse chirp acquired in the propagation through an optical system.
GPU real-time processing in NA62 trigger system
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-01-01
A commercial Graphics Processing Unit (GPU) is used to build a fast Level 0 (L0) trigger system tested parasitically with the TDAQ (Trigger and Data Acquisition systems) of the NA62 experiment at CERN. In particular, the parallel computing power of the GPU is exploited to perform real-time fitting in the Ring Imaging CHerenkov (RICH) detector. Direct GPU communication using an FPGA-based board has been used to reduce the data transmission latency. The performance of the system for multi-ring reconstruction obtained during the NA62 physics run will be presented.
Fast I/O for Massively Parallel Applications
NASA Technical Reports Server (NTRS)
OKeefe, Matthew T.
1996-01-01
The two primary goals for this report were the design, construction and modeling of parallel disk arrays for scientific visualization and animation, and a study of the IO requirements of highly parallel applications. In addition, further work is reported on the parallel display systems required to project and animate the very high-resolution frames resulting from our supercomputing simulations in ocean circulation and compressible gas dynamics.
Ergül, Özgür
2011-11-01
Fast and accurate solutions of large-scale electromagnetics problems involving homogeneous dielectric objects are considered. Problems are formulated with the electric and magnetic current combined-field integral equation and discretized with the Rao-Wilton-Glisson functions. Solutions are performed iteratively by using the multilevel fast multipole algorithm (MLFMA). For the solution of large-scale problems discretized with millions of unknowns, MLFMA is parallelized on distributed-memory architectures using a rigorous technique, namely, the hierarchical partitioning strategy. Efficiency and accuracy of the developed implementation are demonstrated on very large problems involving as many as 100 million unknowns.
Romanowicz, Barbara; Cao, Aimin; Godwal, Budhiram; ...
2016-01-06
Using an updated data set of ballistic PKIKP travel time data at antipodal distances, we test different models of anisotropy in the Earth's innermost inner core (IMIC) and obtain significantly better fits for a fast axis aligned with Earth's rotation axis, rather than a quasi-equatorial direction, as proposed recently. Reviewing recent results on the single crystal structure and elasticity of iron at core conditions, we find that an hcp structure with the fast c axis parallel to Earth's rotation is more likely but a body-centered cubic structure with the [111] axis aligned in that direction results in very similar predictions for seismic anisotropy. These models are therefore not distinguishable based on current seismological data. In addition, to match the seismological observations, the inferred strength of anisotropy in the IMIC (6–7%) implies almost perfect alignment of iron crystals, an intriguing, albeit unlikely situation, especially in the presence of heterogeneity, which calls for further studies. Key points: the fast axis of anisotropy in the central part of the inner core is aligned with Earth's axis of rotation; the structure of iron in the inner core is most likely hcp, not bcc; it is not currently possible to distinguish between hcp and bcc structures from seismic observations.
Digital tomosynthesis mammography using a parallel maximum-likelihood reconstruction method
NASA Astrophysics Data System (ADS)
Wu, Tao; Zhang, Juemin; Moore, Richard; Rafferty, Elizabeth; Kopans, Daniel; Meleis, Waleed; Kaeli, David
2004-05-01
A parallel reconstruction method, based on an iterative maximum likelihood (ML) algorithm, is developed to provide fast reconstruction for digital tomosynthesis mammography. Tomosynthesis mammography acquires 11 low-dose projections of a breast by moving an x-ray tube over a 50° angular range. In parallel reconstruction, each projection is divided into multiple segments along the chest-to-nipple direction. Using the 11 projections, segments located at the same distance from the chest wall are combined to compute a partial reconstruction of the total breast volume. The shape of the partial reconstruction forms a thin slab, angled toward the x-ray source at a projection angle 0°. The reconstruction of the total breast volume is obtained by merging the partial reconstructions. The overlap region between neighboring partial reconstructions and neighboring projection segments is utilized to compensate for the incomplete data at the boundary locations present in the partial reconstructions. A serial execution of the reconstruction is compared to a parallel implementation, using clinical data. The serial code was run on a PC with a single PentiumIV 2.2GHz CPU. The parallel implementation was developed using MPI and run on a 64-node Linux cluster using 800MHz Itanium CPUs. The serial reconstruction for a medium-sized breast (5cm thickness, 11cm chest-to-nipple distance) takes 115 minutes, while a parallel implementation takes only 3.5 minutes. The reconstruction time for a larger breast using a serial implementation takes 187 minutes, while a parallel implementation takes 6.5 minutes. No significant differences were observed between the reconstructions produced by the serial and parallel implementations.
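The reported timings can be turned into speedup and parallel-efficiency figures with a short calculation. This sketch assumes that all 64 nodes of the cluster took part in each parallel run, which the abstract does not state explicitly. It also ignores that the serial and parallel runs used different CPUs (a 2.2 GHz Pentium IV versus 800 MHz Itanium), so the efficiency values are only indicative. The function name is hypothetical.

```python
def speedup_and_efficiency(t_serial_min, t_parallel_min, n_nodes):
    """Return (speedup, efficiency) from serial and parallel wall-clock times."""
    speedup = t_serial_min / t_parallel_min
    return speedup, speedup / n_nodes

# Timings in minutes, taken from the abstract; the node count of 64 is assumed.
cases = [("medium-sized breast", 115.0, 3.5), ("larger breast", 187.0, 6.5)]
for label, t_serial, t_parallel in cases:
    s, e = speedup_and_efficiency(t_serial, t_parallel, 64)
    print(f"{label}: speedup {s:.1f}x, efficiency {e:.0%}")
```

Under these assumptions the speedups are roughly 33 and 29, or about half of the ideal factor of 64.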
Equatorial anisotropy of the Earth's inner-inner core
NASA Astrophysics Data System (ADS)
Song, X.; Wang, T.; Xia, H.
2015-12-01
Anisotropy of Earth's inner core is a key to understanding its evolution and the generation of the Earth's magnetic field. All the previous inner core anisotropy models have assumed a cylindrical anisotropy with the symmetry axis parallel (or nearly parallel) to the Earth's spin axis. However, we have recently found that the fast axis in the inner part of the inner core is close to the equator from inner-core waves extracted from earthquake coda. We obtained inner core phases, PKIIKP2 and PKIKP2 (round-trip phases between the station and its antipode that passes straight through the center of the Earth and that is reflected from the inner core boundary, respectively), from stacks of autocorrelations of the coda of large earthquakes (10,000~40,000 s after Mw>=7.0 earthquakes) at seismic station clusters around the world. We observed large variations of up to 10 s along equatorial paths in the differential travel times PKIIKP2 - PKIKP2, which are sensitive to inner-core structure. The observations can be explained by a cylindrical anisotropy in the inner inner core (IIC) (with a radius of slightly less than half the inner core radius) that has a fast axis aligned near the equator and a cylindrical anisotropy in the outer inner core (OIC) that has a fast axis along the north-south direction. We have obtained more observations using the combination of autocorrelations and cross-correlations at low-latitude station arrays. The results further confirm that the IIC has an equatorial anisotropy and a pattern different from the OIC. The equatorial fast axis of the IIC is near Central America and Southeast Asia. The drastic change in the fast axis and the form of anisotropy from the IIC to the OIC may suggest a phase change of the iron or a major shift in the crystallization and deformation during the formation and growth of the inner core.
Abrishami, V; Bilbao-Castro, J R; Vargas, J; Marabini, R; Carazo, J M; Sorzano, C O S
2015-10-01
We describe a fast and accurate method for the reconstruction of macromolecular complexes from a set of projections. Direct Fourier inversion (in which the Fourier Slice Theorem plays a central role) is a solution for dealing with this inverse problem. Unfortunately, in the single particle field the set of projections provides a non-equidistantly sampled version of the macromolecule Fourier transform and, therefore, a direct Fourier inversion may not be an optimal solution. In this paper, we introduce a gridding-based direct Fourier method for the three-dimensional reconstruction approach that uses a weighting technique to compute a uniformly sampled Fourier transform. Moreover, the contrast transfer function of the microscope, which is a limiting factor in pursuing a high resolution reconstruction, is corrected by the algorithm. Parallelization of this algorithm, both on threads and on multiple CPUs, makes the process of three-dimensional reconstruction even faster. The experimental results show that our proposed gridding-based direct Fourier reconstruction is slightly more accurate than similar existing methods and presents a lower computational complexity both in terms of time and memory, thereby allowing its use on larger volumes. The algorithm is fully implemented in the open-source Xmipp package and is downloadable from http://xmipp.cnb.csic.es. Copyright © 2015 Elsevier B.V. All rights reserved.
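The Fourier Slice Theorem on which direct Fourier inversion rests can be checked numerically in a lower-dimensional analogue. The sketch below is not taken from Xmipp: it uses a tiny 2-D image and a single axis-aligned projection, for which the 1-D discrete Fourier transform of the projection equals the zero-frequency row of the image's 2-D transform. The helper names `dft` and `dft2` are hypothetical, and the naive transforms are chosen for clarity rather than speed.

```python
import cmath

def dft(x):
    """Naive 1-D discrete Fourier transform."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def dft2(img):
    """Naive 2-D DFT, returned as F[ky][kx]."""
    ny, nx = len(img), len(img[0])
    rows = [dft(r) for r in img]  # transform along x
    cols = [dft([rows[y][kx] for y in range(ny)]) for kx in range(nx)]  # along y
    return [[cols[kx][ky] for kx in range(nx)] for ky in range(ny)]

# Arbitrary small test image (4 rows by 5 columns).
image = [[1.0, 2.0, 0.0, 3.0, 1.0],
         [0.0, 1.0, 4.0, 1.0, 0.0],
         [2.0, 0.0, 1.0, 0.0, 2.0],
         [1.0, 3.0, 0.0, 2.0, 1.0]]

# Projection along y: sum every column of the image.
projection = [sum(row[x] for row in image) for x in range(len(image[0]))]

slice_from_projection = dft(projection)
central_slice = dft2(image)[0]  # the ky = 0 row of the 2-D transform

max_diff = max(abs(a - b) for a, b in zip(slice_from_projection, central_slice))
print(f"largest difference between the two slices: {max_diff:.2e}")
```

Projections at other angles give slices along rotated lines through the origin, which fall between grid points. That non-uniform sampling is the situation the gridding and weighting step in the abstract addresses.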
Performance of parallel computation using CUDA for solving the one-dimensional elasticity equations
NASA Astrophysics Data System (ADS)
Darmawan, J. B. B.; Mungkasi, S.
2017-01-01
In this paper, we investigate the performance of parallel computation in solving the one-dimensional elasticity equations. Elasticity equations are widely used in engineering science, so solving them quickly and efficiently is desirable. We therefore propose the use of parallel computation, implemented with NVIDIA's CUDA. Our results show that parallel computation using CUDA offers a substantial advantage when the computation is large in scale.
Parallel MR imaging: a user's guide.
Glockner, James F; Hu, Houchun H; Stanley, David W; Angelos, Lisa; King, Kevin
2005-01-01
Parallel imaging is a recently developed family of techniques that take advantage of the spatial information inherent in phased-array radiofrequency coils to reduce acquisition times in magnetic resonance imaging. In parallel imaging, the number of sampled k-space lines is reduced, often by a factor of two or greater, thereby significantly shortening the acquisition time. Parallel imaging techniques have only recently become commercially available, and the wide range of clinical applications is just beginning to be explored. The potential clinical applications primarily involve reduction in acquisition time, improved spatial resolution, or a combination of the two. Improvements in image quality can be achieved by reducing the echo train lengths of fast spin-echo and single-shot fast spin-echo sequences. Parallel imaging is particularly attractive for cardiac and vascular applications and will likely prove valuable as 3-T body and cardiovascular imaging becomes part of standard clinical practice. Limitations of parallel imaging include reduced signal-to-noise ratio and reconstruction artifacts. It is important to consider these limitations when deciding when to use these techniques. (c) RSNA, 2005.
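The trade-off between shorter acquisition and reduced signal-to-noise ratio can be made concrete with a rough estimate. The sketch assumes a simple Cartesian sequence that acquires one k-space line per repetition time (TR). It also uses the standard SENSE relation, in which SNR falls by a factor of g times the square root of the reduction factor R; that relation comes from the general parallel-imaging literature, not from this abstract. The function name and the example numbers are hypothetical.

```python
import math

def parallel_imaging_estimate(n_phase_encodes, tr_ms, r, g_factor=1.0):
    """Estimate scan time (s) and relative SNR for reduction factor r.

    Assumes one phase-encode line per TR and SNR_accel = SNR_full / (g * sqrt(r)).
    """
    lines_acquired = math.ceil(n_phase_encodes / r)
    scan_time_s = lines_acquired * tr_ms / 1000.0
    relative_snr = 1.0 / (g_factor * math.sqrt(r))
    return scan_time_s, relative_snr

# Illustrative numbers only: 256 phase-encode lines and TR = 500 ms.
for r in (1, 2, 3):
    t, snr = parallel_imaging_estimate(256, 500.0, r)
    print(f"R = {r}: scan time {t:.0f} s, relative SNR {snr:.2f}")
```

With an ideal coil geometry (g = 1), halving the number of sampled lines halves the scan time and leaves about 71% of the original SNR; real g-factors above 1 reduce it further.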
FastID: Extremely Fast Forensic DNA Comparisons
2017-05-19
FastID: Extremely Fast Forensic DNA Comparisons Darrell O. Ricke, PhD Bioengineering Systems & Technologies Massachusetts Institute of...Technology Lincoln Laboratory Lexington, MA USA Darrell.Ricke@ll.mit.edu Abstract—Rapid analysis of DNA forensic samples can have a critical impact on...time sensitive investigations. Analysis of forensic DNA samples by massively parallel sequencing is creating the next gold standard for DNA
fast_protein_cluster: parallel and optimized clustering of large-scale protein modeling data.
Hung, Ling-Hong; Samudrala, Ram
2014-06-15
fast_protein_cluster is a fast, parallel and memory efficient package used to cluster 60 000 sets of protein models (with up to 550 000 models per set) generated by the Nutritious Rice for the World project. fast_protein_cluster is an optimized and extensible toolkit that supports Root Mean Square Deviation after optimal superposition (RMSD) and Template Modeling score (TM-score) as metrics. RMSD calculations using a laptop CPU are 60× faster than qcprot and 3× faster than current graphics processing unit (GPU) implementations. New GPU code further increases the speed of RMSD and TM-score calculations. fast_protein_cluster provides novel k-means and hierarchical clustering methods that are up to 250× and 2000× faster, respectively, than Clusco, and identify significantly more accurate models than Spicker and Clusco. fast_protein_cluster is written in C++ using OpenMP for multi-threading support. Custom streaming Single Instruction Multiple Data (SIMD) extensions and advanced vector extension intrinsics code accelerate CPU calculations, and OpenCL kernels support AMD and Nvidia GPUs. fast_protein_cluster is available under the M.I.T. license. (http://software.compbio.washington.edu/fast_protein_cluster) © The Author 2014. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Accardo, N.; Wiens, D. A.; Hernandez, S.; Aster, R. C.; Nyblade, A.; Anandakrishnan, S.; Huerta, A. D.; Wilson, T. J.
2011-12-01
We constrain azimuthal anisotropy in the Antarctic upper mantle using shear wave splitting parameters obtained from teleseismic SKS, SKKS, and PKS phases recorded at 30 broad-band seismometers deployed in West Antarctica, and the Transantarctic Mountains as a part of POLENET/ANET. The first seismometers were deployed in late 2007 and additional seismometers were deployed in 2008 and 2009. The seismometers generally operate year-round using solar power, insulated boxes, and either rechargeable AGM or primary lithium batteries. We used an eigenvalue technique to linearize the rotated and shifted shear wave particle motions and determine the best splitting parameters. Robust windows around the individual phases were chosen using the Teanby cluster-analysis algorithm. We visually inspected all results and assigned a quality rating based on factors including signal-to-noise ratios, particle motions, and error contours. The best results for each station were then stacked to get an average splitting direction and delay time. The delay times range from 0.33 to 1.33 s, but generally average about 1 s. We conclude that the splitting results from anisotropy in the upper mantle, since the large splitting times cannot be accumulated in the relatively thin crust (20-30 km) of the region. Overall, fast directions in West Antarctica are at large angles to the direction of Antarctic absolute plate motion in either hotspot or no-net rotation frameworks, showing that the anisotropic fabric does not result from shear associated with the motion of Antarctica over the mantle. The West Antarctic fast directions are also much different than those found in East Antarctica by previous studies. We suggest that the East Antarctic splitting results from anisotropy frozen into the cold cratonic continental lithosphere, whereas West Antarctic splitting is related to Cenozoic tectonism. 
Stations within the West Antarctic Rift System (WARS), a region of Cenozoic extension, show fast directions subparallel to the inferred WARS extension direction. Stations located in the Ellsworth-Whitmore Mountains (EWM) show fast directions parallel to those found within WARS. Furthermore, results from WARS and from EWM all show relatively large splitting times of 0.6 - 1.33 s. These results suggest upper mantle anisotropy that results from mantle flow and deformation related to the extensional deformation of the region. Two stations were installed in the Pensacola Mountains which are located grid-north of the EWM. The results from this region deviate from the dominant fast orientation seen in WARS but appear to be approximately perpendicular to the strike of the mountain range. Stations in Marie Byrd Land (MBL) show inconsistent fast directions and a wide range of delay times (0.3 - 0.9 s), perhaps as a result of complex mantle fabric related to a possible MBL hotspot.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter; ...
2016-06-30
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
Unweighted least squares phase unwrapping by means of multigrid techniques
NASA Astrophysics Data System (ADS)
Pritt, Mark D.
1995-11-01
We present a multigrid algorithm for unweighted least squares phase unwrapping. This algorithm applies Gauss-Seidel relaxation schemes to solve the Poisson equation on smaller, coarser grids and transfers the intermediate results to the finer grids. This approach forms the basis of our multigrid algorithm for weighted least squares phase unwrapping, which is described in a separate paper. The key idea of our multigrid approach is to maintain the partial derivatives of the phase data in separate arrays and to correct these derivatives at the boundaries of the coarser grids. This maintains the boundary conditions necessary for rapid convergence to the correct solution. Although the multigrid algorithm is an iterative algorithm, we demonstrate that it is nearly as fast as the direct Fourier-based method. We also describe how to parallelize the algorithm for execution on a distributed-memory parallel processor computer or a network-cluster of workstations.
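The core idea in the abstract above, Gauss-Seidel relaxation on coarser grids with results transferred back to finer grids, can be sketched with a generic textbook V-cycle. This is only an illustration under stated assumptions: it is one-dimensional, uses zero Dirichlet boundaries, and does not include the paper's separate derivative arrays or its boundary corrections. All function names are hypothetical.

```python
# Textbook 1D multigrid V-cycle for -u'' = f on [0, 1] with u(0) = u(1) = 0.
# Illustrative only: the paper works in 2D and adds its own boundary handling.


def gauss_seidel(u, f, h, sweeps):
    """Relax u in place with lexicographic Gauss-Seidel sweeps."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """Return r = f - A u at interior points (zero on the boundary)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def v_cycle(u, f, h):
    """One V-cycle on a grid with len(u) - 1 intervals (a power of two)."""
    n = len(u) - 1
    if n == 2:
        # Coarsest grid: a single interior unknown, solved exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return
    gauss_seidel(u, f, h, 2)  # pre-smoothing
    r = residual(u, f, h)
    nc = n // 2
    rc = [0.0] * (nc + 1)
    for j in range(1, nc):  # full-weighting restriction of the residual
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    ec = [0.0] * (nc + 1)
    v_cycle(ec, rc, 2.0 * h)  # solve the coarse-grid error equation
    for j in range(nc + 1):  # copy the correction at coincident points
        u[2 * j] += ec[j]
    for j in range(nc):  # linear interpolation at in-between points
        u[2 * j + 1] += 0.5 * (ec[j] + ec[j + 1])
    gauss_seidel(u, f, h, 2)  # post-smoothing


def solve_poisson(n=64, cycles=20):
    """Solve -u'' = 2, whose exact solution is u(x) = x (1 - x)."""
    h = 1.0 / n
    u = [0.0] * (n + 1)
    f = [2.0] * (n + 1)
    for _ in range(cycles):
        v_cycle(u, f, h)
    return u
```

Calling `solve_poisson()` returns grid values that can be compared against x(1 - x) at the nodes; the coarse grids remove the smooth error components that plain Gauss-Seidel damps only slowly.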
Magnetosheath Filamentary Structures Formed by Ion Acceleration at the Quasi-Parallel Bow Shock
NASA Technical Reports Server (NTRS)
Omidi, N.; Sibeck, D.; Gutynska, O.; Trattner, K. J.
2014-01-01
Results from 2.5-D electromagnetic hybrid simulations show the formation of field-aligned, filamentary plasma structures in the magnetosheath. They begin at the quasi-parallel bow shock and extend far into the magnetosheath. These structures exhibit anticorrelated, spatial oscillations in plasma density and ion temperature. Closer to the bow shock, magnetic field variations associated with density and temperature oscillations may also be present. Magnetosheath filamentary structures (MFS) form primarily in the quasi-parallel sheath; however, they may extend to the quasi-perpendicular magnetosheath. They occur over a wide range of solar wind Alfvénic Mach numbers and interplanetary magnetic field directions. At lower Mach numbers with lower levels of magnetosheath turbulence, MFS remain highly coherent over large distances. At higher Mach numbers, magnetosheath turbulence decreases the level of coherence. Magnetosheath filamentary structures result from localized ion acceleration at the quasi-parallel bow shock and the injection of energetic ions into the magnetosheath. The localized nature of ion acceleration is tied to the generation of fast magnetosonic waves at and upstream of the quasi-parallel shock. The increased pressure in flux tubes containing the shock accelerated ions results in the depletion of the thermal plasma in these flux tubes and the enhancement of density in flux tubes void of energetic ions. This results in the observed anticorrelation between ion temperature and plasma density.
The multigrid preconditioned conjugate gradient method
NASA Technical Reports Server (NTRS)
Tatebe, Osamu
1993-01-01
A multigrid preconditioned conjugate gradient method (MGCG method), which uses the multigrid method as a preconditioner of the PCG method, is proposed. The multigrid method has inherently high parallelism and improves convergence of long-wavelength components, which is important in iterative methods. By using this method as a preconditioner of the PCG method, an efficient method with high parallelism and fast convergence is obtained. First, a necessary condition that the multigrid preconditioner must meet in order to satisfy the requirements of a preconditioner of the PCG method is considered. Next, numerical experiments show the behavior of the MGCG method and that it is superior to both the ICCG method and the multigrid method in terms of fast convergence and high parallelism. This fast convergence is explained by an eigenvalue analysis of the preconditioned matrix. From this analysis of the multigrid preconditioner, it is found that the MGCG method converges in very few iterations and that the multigrid preconditioner is a desirable preconditioner for the conjugate gradient method.
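To make the role of the preconditioner concrete, here is a minimal sketch of the preconditioned conjugate gradient iteration with a pluggable preconditioner. It is not the MGCG method itself: a simple Jacobi (diagonal) preconditioner stands in where MGCG would apply a multigrid cycle, and the function names are hypothetical. The matrix is assumed symmetric positive definite.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def jacobi_preconditioner(A):
    """Stand-in preconditioner: divide the residual by the diagonal of A.

    MGCG would replace this function with one multigrid cycle.
    """
    diag = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, diag)]


def pcg(A, b, precond, tol=1e-12, max_iter=100):
    """Preconditioned conjugate gradient for symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x with x = 0
    z = precond(r)                   # preconditioned residual
    p = list(z)                      # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:   # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

For example, `pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], jacobi_preconditioner([[4.0, 1.0], [1.0, 3.0]]))` solves a small 2-by-2 system. The preconditioner only needs to map a residual to an approximate error, which is why a multigrid cycle can be dropped in.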
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yang; Pinterich, Tamara; Wang, Jian
2018-03-30
Rapid measurement of submicron particle size distributions enables the characterization of aerosols with fast changing properties, and is often necessary for measurements onboard mobile platforms (e.g., research aircraft). Aerosol mobility size distribution is commonly measured by a scanning mobility particle sizer (SMPS), which relies on voltage scanning or stepping to classify particles of different sizes, and may take up to several minutes to obtain a complete size spectrum of aerosol particles. The recently developed fast integrated mobility spectrometer (FIMS) with enhanced dynamic size range classifies and detects particles from 10 to ~600 nm simultaneously, allowing submicron aerosol mobility size distributions to be captured at a time resolution of 1 second. In this study, we present a detailed data inversion routine for deriving aerosol size distribution from FIMS measurements. The inversion routine takes into consideration the FIMS transfer function, particle penetration efficiency in the FIMS, and multiple charging of aerosols. The accuracy of the FIMS measurement is demonstrated by comparing parallel FIMS and SMPS measurements of stable aerosols with a wide range of size spectrum shapes, including ambient aerosols and aerosols classified by a differential mobility analyzer (DMA). The FIMS and SMPS-derived size distributions show excellent agreement for all aerosols tested. In addition, total number concentrations of ambient aerosols were integrated from 1 Hz FIMS size distributions, and compared with those directly measured by a condensation particle counter (CPC) operated in parallel. Finally, the integrated and measured total particle concentrations agree well within 5%.
Fast parallel algorithm for slicing STL based on pipeline
NASA Astrophysics Data System (ADS)
Ma, Xulong; Lin, Feng; Yao, Bo
2016-05-01
In the field of additive manufacturing, current research on data processing mainly focuses on slicing large STL files or complicated CAD models. A parallel algorithm offers great advantages for improving efficiency and reducing slicing time. However, traditional algorithms cannot make full use of multi-core CPU hardware resources. In this paper, a fast parallel algorithm is presented to speed up data processing. A pipeline mode is adopted to design the parallel algorithm, and the complexity of the pipeline algorithm is analyzed theoretically. To evaluate the performance of the new algorithm, the effects of the number of threads and the number of layers are investigated in a series of experiments. The experimental results show that the number of threads and the number of layers are two significant factors in the speedup ratio. Speedup increases with the number of threads, in good agreement with Amdahl's law, and it also increases with the number of layers, in agreement with Gustafson's law. The new algorithm uses topological information to compute contours in parallel. Another parallel algorithm based on data parallelism is used in the experiments to show that the pipeline parallel mode is more efficient. Finally, a case study demonstrates the performance of the new parallel algorithm. Compared with the serial slicing algorithm, the new pipeline parallel algorithm makes full use of multi-core CPU hardware and accelerates the slicing process; compared with the data parallel slicing algorithm, it achieves a much higher speedup ratio and efficiency.
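The two scaling laws cited in this abstract can be written down in a few lines. The sketch below uses the standard textbook forms; the parallel fraction `p` is an assumed input for illustration and is not a value reported by the paper.

```python
def amdahl_speedup(p, n):
    """Amdahl's law: fixed problem size, parallel fraction p, n threads."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Gustafson's law: problem size grows with n, parallel fraction p."""
    return (1.0 - p) + p * n


if __name__ == "__main__":
    # Hypothetical parallel fraction of 0.9, chosen only for illustration.
    for threads in (1, 2, 4, 8):
        print(threads, amdahl_speedup(0.9, threads), gustafson_speedup(0.9, threads))
```

Amdahl's law saturates at 1 / (1 - p) as threads are added to a fixed workload, while Gustafson's law keeps growing when the workload (here, the number of layers) grows with the available parallelism.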
NASA Technical Reports Server (NTRS)
Robbins, John S.
1988-01-01
Unit removed with minimal disturbance. Valve inlet and outlet ports adjacent to each other on same side of valve body. Ports inserted into special manifold on fluid line. Valve body attached to manifold by four bolts or, alternatively, by toggle clamps. Electromechanical actuator moves in direction parallel to fluid line to open and close valve. When necessary to clean valve, removed simply by opening bolts or toggle clamps. No need to move or separate ports of fluid line. Valve useful where disturbance of fluid line detrimental or where fast maintenance essential - in oil and chemical industries, automotive vehicles, aircraft, and powerplants.
Coherent diffraction imaging by moving a lens.
Shen, Cheng; Tan, Jiubin; Wei, Ce; Liu, Zhengjun
2016-07-25
A moveable lens is used to determine amplitude and phase on the object plane. The extended fractional Fourier transform is introduced to describe single-lens imaging. We put forward a fast convolution-based algorithm for the transform. Combined with a parallel iterative phase retrieval algorithm, it is applied to reconstruct the complex amplitude of the object. Compared with inline holography, our method is simple to implement. Because no oversampling is required, the computational load is lower. The proposed method is also more accurate than direct focusing measurement when imaging small objects.
Fox, W.; Sciortino, F.; v. Stechow, A.; ...
2017-03-21
We report detailed laboratory observations of the structure of a reconnection current sheet in a two-fluid plasma regime with a guide magnetic field. We observe and quantitatively analyze the quadrupolar electron pressure variation in the ion-diffusion region, as originally predicted by extended magnetohydrodynamics simulations. The projection of the electron pressure gradient parallel to the magnetic field contributes significantly to balancing the parallel electric field, and the resulting cross-field electron jets in the reconnection layer are diamagnetic in origin. Furthermore, these results demonstrate how parallel and perpendicular force balance are coupled in guide field reconnection and confirm basic theoretical models of the importance of electron pressure gradients for obtaining fast magnetic reconnection.
Automatic recognition of vector and parallel operations in a higher level language
NASA Technical Reports Server (NTRS)
Schneck, P. B.
1971-01-01
A compiler for recognizing statements of a FORTRAN program which are suited for fast execution on a parallel or pipeline machine such as Illiac-4, Star or ASC is described. The technique employs interval analysis to provide flow information to the vector/parallel recognizer. Where profitable the compiler changes scalar variables to subscripted variables. The output of the compiler is an extension to FORTRAN which shows parallel and vector operations explicitly.
Real-time trajectory optimization on parallel processors
NASA Technical Reports Server (NTRS)
Psiaki, Mark L.
1993-01-01
A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm that is suitable for real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on three example problems: the Goddard problem, the acceleration-limited, planar minimum-time to the origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32 nodes instead of 1 node to solve a 64-stage Goddard problem.
Unexpectedly Fast Phonon-Assisted Exciton Hopping between Carbon Nanotubes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davoody, A. H.; Karimi, F.; Arnold, M. S.; ...
2017-06-05
Carbon-nanotube (CNT) aggregates are promising light-absorbing materials for photovoltaics. The hopping rate of excitons between CNTs directly affects the efficiency of these devices. We theoretically investigate phonon-assisted exciton hopping, where excitons scatter with phonons into a same-tube transition state, followed by intertube Coulomb scattering into the final state. Second-order hopping between bright excitonic states is as fast as the first-order process (~1 ps). For perpendicular CNTs, the high rate stems from the high density of phononic states; for parallel CNTs, the reason lies in relaxed selection rules. Moreover, second-order exciton transfer between dark and bright states, facilitated by phonons with large angular momentum, has rates comparable to bright-to-bright transfer, so dark excitons provide an additional pathway for energy transfer in CNT composites. Furthermore, as dark excitons are difficult to probe in experiment, predictive theory is critical for understanding exciton dynamics in CNT composites.
Fast reversible learning based on neurons functioning as anisotropic multiplex hubs
NASA Astrophysics Data System (ADS)
Vardi, Roni; Goldental, Amir; Sheinin, Anton; Sardi, Shira; Kanter, Ido
2017-05-01
Neural networks are composed of neurons and synapses, which are responsible for learning in a slow adaptive dynamical process. Here we experimentally show that neurons act like independent anisotropic multiplex hubs, which relay and mute incoming signals following their input directions. Theoretically, the observed information routing enriches the computational capabilities of neurons by allowing, for instance, equalization among different information routes in the network, as well as high-frequency transmission of complex time-dependent signals constructed via several parallel routes. In addition, hubs of this kind adaptively eliminate very noisy neurons from the dynamics of the network, preventing masking of information transmission. The timescales for these features are several seconds at most, as opposed to the imprint of information by synaptic plasticity, a process which exceeds minutes. The results open a path toward understanding fast and adaptive learning in higher cognitive brain functions.
Radio frequency-assisted fast superconducting switch
DOE Office of Scientific and Technical Information (OSTI.GOV)
Solovyov, Vyacheslav; Li, Qiang
A radio frequency-assisted fast superconducting switch is described. A superconductor is closely coupled to a radio frequency (RF) coil. To turn the switch "off," i.e., to induce a transition to the normal, resistive state in the superconductor, a voltage burst is applied to the RF coil. This voltage burst is sufficient to induce a current in the coupled superconductor. The combination of the induced current with any other direct current flowing through the superconductor is sufficient to exceed the critical current of the superconductor at the operating temperature, inducing a transition to the normal, resistive state. A by-pass MOSFET may be configured in parallel with the superconductor to act as a current shunt, allowing the voltage across the superconductor to drop below a certain value, at which time the superconductor undergoes a transition to the superconducting state and the switch is reset.
NASA Astrophysics Data System (ADS)
Goossens, Bart; Aelterman, Jan; Luong, Hiêp; Pižurica, Aleksandra; Philips, Wilfried
2011-09-01
The shearlet transform is a recent sibling in the family of geometric image representations that provides a traditional multiresolution analysis combined with a multidirectional analysis. In this paper, we present a fast DFT-based analysis and synthesis scheme for the 2D discrete shearlet transform. Our scheme conforms to the continuous shearlet theory to a high extent, provides perfect numerical reconstruction (up to floating point rounding errors) in a non-iterative scheme and is highly suitable for parallel implementation (e.g. FPGA, GPU). We show that our discrete shearlet representation is also a tight frame and the redundancy factor of the transform is around 2.6, independent of the number of analysis directions. Experimental denoising results indicate that the transform performs as well as or even better than several related multiresolution transforms, while having a significantly lower redundancy factor.
Development of fast cooling pulsed magnets at the Wuhan National High Magnetic Field Center.
Peng, Tao; Sun, Quqin; Zhao, Jianlong; Jiang, Fan; Li, Liang; Xu, Qiang; Herlach, Fritz
2013-12-01
Pulsed magnets with fast cooling channels have been developed at the Wuhan National High Magnetic Field Center. Between the inner and outer sections of a coil wound with a continuous length of CuNb wire, G10 rods with cross section 4 mm × 5 mm were inserted as spacers around the entire circumference, parallel to the coil axis. The free space between adjacent rods is 6 mm. The liquid nitrogen flows freely in the channels between these rods, and in the direction perpendicular to the rods through grooves provided in the rods. For a typical 60 T pulsed magnetic field with pulse duration of 40 ms, the cooling time between subsequent pulses is reduced from 160 min to 35 min. Subsequently, the same technology was applied to a 50 T magnet with 300 ms pulse duration. The cooling time of this magnet was reduced from 480 min to 65 min.
The magnetic field and turbulence of the cosmic web measured using a brilliant fast radio burst.
Ravi, V; Shannon, R M; Bailes, M; Bannister, K; Bhandari, S; Bhat, N D R; Burke-Spolaor, S; Caleb, M; Flynn, C; Jameson, A; Johnston, S; Keane, E F; Kerr, M; Tiburzi, C; Tuntsov, A V; Vedantham, H K
2016-12-09
Fast radio bursts (FRBs) are millisecond-duration events thought to originate beyond the Milky Way galaxy. Uncertainty surrounding the burst sources, and their propagation through intervening plasma, has limited their use as cosmological probes. We report on a mildly dispersed (dispersion measure 266.5 ± 0.1 parsecs per cubic centimeter), exceptionally intense (120 ± 30 janskys), linearly polarized, scintillating burst (FRB 150807) that we directly localize to 9 square arc minutes. On the basis of a low Faraday rotation (12.0 ± 0.7 radians per square meter), we infer negligible magnetization in the circum-burst plasma and constrain the net magnetization of the cosmic web along this sightline to <21 nanogauss, parallel to the line-of-sight. The burst scintillation suggests weak turbulence in the ionized intergalactic medium. Copyright © 2016, American Association for the Advancement of Science.
Unbiased free energy estimates in fast nonequilibrium transformations using Gaussian mixtures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Procacci, Piero
2015-04-21
In this paper, we present an improved method for obtaining unbiased estimates of the free energy difference between two thermodynamic states using the work distribution measured in nonequilibrium driven experiments connecting these states. The method is based on the assumption that any observed work distribution is given by a mixture of Gaussian distributions, whose normal components are identical in either direction of the nonequilibrium process, with weights regulated by the Crooks theorem. Using the prototypical example for the driven unfolding/folding of deca-alanine, we show that the predicted behavior of the forward and reverse work distributions, assuming a combination of only two Gaussian components with Crooks derived weights, explains surprisingly well the striking asymmetry in the observed distributions at fast pulling speeds. The proposed methodology opens the way for a perfectly parallel implementation of Jarzynski-based free energy calculations in complex systems.
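When the forward work distribution is a Gaussian mixture, the Jarzynski average exp(-beta * dF) = <exp(-beta * W)> has a closed form, because each Gaussian component with mean mu and standard deviation sigma contributes exp(-beta * mu + beta^2 * sigma^2 / 2). The sketch below evaluates that closed form. It is a hedged illustration of the underlying identity only, not the paper's estimator, which also uses the reverse distribution and Crooks-derived weights. The function name and arguments are hypothetical, and weights are assumed positive.

```python
import math


def free_energy_from_mixture(weights, means, sigmas, beta):
    """Jarzynski free energy for a Gaussian-mixture forward work distribution.

    dF = -(1/beta) * ln( sum_i w_i * exp(-beta * mu_i + 0.5 * beta^2 * sigma_i^2) )
    Weights must be positive; they are normalized to sum to one.
    """
    total = sum(weights)
    log_terms = [
        math.log(w / total) - beta * mu + 0.5 * (beta * s) ** 2
        for w, mu, s in zip(weights, means, sigmas)
    ]
    # Log-sum-exp keeps the evaluation stable when the exponents are large.
    m = max(log_terms)
    log_sum = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return -log_sum / beta
```

For a single component this reduces to the familiar Gaussian result dF = mu - beta * sigma^2 / 2, so the mean work overestimates the free energy by the dissipated term beta * sigma^2 / 2.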
NASA Technical Reports Server (NTRS)
Barnard, Stephen T.; Simon, Horst; Lasinski, T. A. (Technical Monitor)
1994-01-01
The design of a parallel implementation of multilevel recursive spectral bisection is described. The goal is to implement a code that is fast enough to enable dynamic repartitioning of adaptive meshes.
A Decade of Shear-Wave Splitting Observations in Alaska
NASA Astrophysics Data System (ADS)
Bellesiles, A. K.; Christensen, D. H.; Abers, G. A.; Hansen, R. A.; Pavlis, G. L.; Song, X.
2010-12-01
Over the last decade four PASSCAL experiments have been conducted in different regions of Alaska. ARCTIC, BEAAR and MOOS form a north-south transect across the state, from the Arctic Ocean to Prince William Sound, while the STEEP experiment is currently deployed to the east of that line in the St Elias Mountains of Southeastern Alaska. Shear-wave splitting observations from these networks, in addition to several permanent stations of the Alaska Earthquake Information Center, were determined in an attempt to understand mantle flow under Alaska in a variety of different geologic settings. Results show two dominant splitting patterns in Alaska, separated by the subducted Pacific Plate. North of the subducted Pacific Plate fast directions are parallel to the trench (along strike of the subducted Pacific Plate) indicating large scale mantle flow in the northeast-southwest direction with higher anisotropy (splitting times) within the mantle wedge. Within or below the Pacific Plate fast directions are normal to the trench in the direction of Pacific Plate convergence. In addition to these two prominent splitting patterns there are several regions that do not match either of these trends. These more complex regions, which include the results from STEEP, could be due to several factors including effects from the edge of the Pacific Plate. The increase of station coverage that Earthscope will bring to Alaska will aid in developing a more complete model for anisotropy and mantle flow in Alaska.
Dynamics of landfast sea ice near Jangbogo Antarctic Research Station observed by SAR interferometry
NASA Astrophysics Data System (ADS)
Lee, H.; Han, H.
2015-12-01
Landfast sea ice is a type of sea ice adjacent to the coast and immobile for a certain period of time. It is important to analyze the temporal and spatial variation of landfast ice because it has significant influences on the marine ecosystem and the safe operation of icebreaker vessels. However, it has been a difficult task for both remote sensing and in situ observation to discriminate landfast ice from other types of sea ice, such as pack ice, and also to understand the dynamics and internal stress-strain of fast ice. In this study, we identify landfast ice and its annual variation in Terra Nova Bay (74° 37' 4"S, 164° 13' 7"E), East Antarctica, where Jangbogo Antarctic Research Station was constructed in 2014, by using Interferometric Synthetic Aperture Radar (InSAR) technology. We generated 38 interferograms having temporal baselines of 1-9 days out of 62 COSMO-SkyMed SAR images over Terra Nova Bay obtained from December 2010 to January 2012. Landfast ice began to melt in November 2011 when air temperature rose above the freezing point but lasted more than two months to the end of the study period in January 2012. No meaningful relationship was found between sea ice extent and wind and current. Glacial strain (~67 cm/day) is similar to tidal strain (~40 cm) so that they appear similar in one-day InSAR. As glacial stress is cumulative while tidal stress is oscillatory, InSAR images with weekly temporal baseline (7~9 days) revealed that a consistent motion of Campbell Glacier Tongue (CGT) is pushing the sea ice continuously to make interferometric fringes parallel to the glacier-sea ice contacts. Glacial interferometric fringe is parallel to the glacier-sea ice contact lines while tidal strain should be parallel to the coastlines defined by sea shore and glacier tongue.
The DDInSAR operation removed the consistent glacial strain, leaving tidal strain alone, so that the response of fast ice to tide can be used to deduce physical properties of sea ice in various ice stages. One-day InSAR images revealed that fast ice is not attached to CGT in the early ice formation stages, whereas later the two couple with each other so that the entire glacial motion of up to 67 cm/day is transferred directly to fast ice. In the final thawing stage just before ice breakage, ocean waves travelling through the fast ice are also observed by one-day InSAR.
Motion streaks in fast motion rivalry cause orientation-selective suppression.
Apthorp, Deborah; Wenderoth, Peter; Alais, David
2009-05-14
We studied binocular rivalry between orthogonally translating arrays of random Gaussian blobs and measured the strength of rivalry suppression for static oriented probes. Suppression depth was quantified by expressing monocular probe thresholds during dominance relative to thresholds during suppression. Rivalry between two fast motions or two slow motions was compared in order to test the suggestion that fast-moving objects leave oriented "motion streaks" due to temporal integration (W. S. Geisler, 1999). If fast motions do produce motion streaks, then fast motion rivalry might also entail rivalry between the orthogonal streak orientations. We tested this using a static oriented probe that was aligned either parallel to the motion trajectory (hence collinear with the "streaks") or was orthogonal to the trajectory, predicting that rivalry suppression would be greater for parallel probes, and only for rivalry between fast motions. Results confirmed that suppression depth did depend on probe orientation for fast motion but not for slow motion. Further experiments showed that threshold elevations for the oriented probe during suppression exhibited clear orientation tuning. However, orientation-tuned elevations were also present during dominance, suggesting within-channel masking as the basis of the extra-deep suppression. In sum, the presence of orientation-dependent suppression in fast motion rivalry is consistent with the "motion streaks" hypothesis.
Novel Optical Processor for Phased Array Antenna.
1992-10-20
parallel glass slide into the signal beam optical loop. The parallel glass acts like a variable phase shifter to the signal beam simulating phase drift... A list of possible designs is given as follows [table of candidate designs garbled in extraction; legible entries include TeO2]... subject to achievable acoustic frequency, the preferred materials are the slow shear wave in TeO2, the fast shear wave in TeO2 or the shear waves in
Bit-parallel arithmetic in a massively-parallel associative processor
NASA Technical Reports Server (NTRS)
Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.
1992-01-01
A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m^2) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.
NASA Astrophysics Data System (ADS)
Kim, Stephan D.; Luo, Jiajun; Buchholz, D. Bruce; Chang, R. P. H.; Grayson, M.
2016-09-01
A modular time division multiplexer (MTDM) device is introduced to enable parallel measurement of multiple samples with both fast and slow decay transients spanning from millisecond to month-long time scales. This is achieved by dedicating a single high-speed measurement instrument for rapid data collection at the start of a transient, and by multiplexing a second low-speed measurement instrument for slow data collection of several samples in parallel for the later transients. The MTDM is a high-level design concept that can in principle measure an arbitrary number of samples, and the low cost implementation here allows up to 16 samples to be measured in parallel over several months, reducing the total ensemble measurement duration and equipment usage by as much as an order of magnitude without sacrificing fidelity. The MTDM was successfully demonstrated by simultaneously measuring the photoconductivity of three amorphous indium-gallium-zinc-oxide thin films with 20 ms data resolution for fast transients and an uninterrupted parallel run time of over 20 days. The MTDM has potential applications in many areas of research that manifest response times spanning many orders of magnitude, such as photovoltaics, rechargeable batteries, amorphous semiconductors such as silicon and amorphous indium-gallium-zinc-oxide.
Conjugate gradient based projection - A new explicit methodology for frictional contact
NASA Technical Reports Server (NTRS)
Tamma, Kumar K.; Li, Maocheng; Sha, Desong
1993-01-01
With special attention towards the applicability to parallel computation or vectorization, a new and effective explicit approach for linear complementarity formulations involving a conjugate gradient based projection methodology is proposed in this study for contact problems with Coulomb friction. The overall objectives are focussed towards providing an explicit methodology of computation for the complete contact problem with friction. In this regard, the primary idea for solving the linear complementarity formulations stems from an established search direction which is projected to a feasible region determined by the non-negative constraint condition; this direction is then applied to the Fletcher-Reeves conjugate gradient method, resulting in a powerful explicit methodology which possesses high accuracy, excellent convergence characteristics, and fast computational speed, and which is relatively simple to implement for contact problems involving Coulomb friction.
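The idea of projecting a search direction onto the non-negative feasible region and feeding it to a Fletcher-Reeves update can be sketched as follows. This is an illustrative sketch only, not the authors' algorithm: it assumes a symmetric positive definite matrix `A`, treats the problem as minimizing 0.5 x·Ax − b·x subject to x ≥ 0, omits friction entirely, and restarts the conjugate direction whenever a step is cut short by the constraint. The function name `projected_fr_cg` is hypothetical.

```python
def projected_fr_cg(A, b, iters=200, tol=1e-12):
    """Sketch: minimise 0.5*x.A.x - b.x subject to x >= 0 (A assumed SPD)."""
    n = len(b)
    x = [0.0] * n
    d_prev = [0.0] * n
    pg_prev_sq = 0.0
    for _ in range(iters):
        g = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        # Projected gradient: ignore components that would push x below zero.
        pg = [g[i] if (x[i] > 0.0 or g[i] < 0.0) else 0.0 for i in range(n)]
        pg_sq = sum(v * v for v in pg)
        if pg_sq <= tol:
            break
        # Fletcher-Reeves factor (zero on the first step and after a restart).
        beta = pg_sq / pg_prev_sq if pg_prev_sq > 0.0 else 0.0
        d = [-pg[i] + beta * d_prev[i] for i in range(n)]
        Ad = [sum(A[i][j] * d[j] for j in range(n)) for i in range(n)]
        # Exact line search for the quadratic objective along d.
        alpha = -sum(g[i] * d[i] for i in range(n)) / sum(d[i] * Ad[i] for i in range(n))
        # Largest step that keeps every component non-negative.
        alpha_max = min((-x[i] / d[i] for i in range(n) if d[i] < 0.0), default=alpha)
        clipped = alpha_max < alpha
        alpha = min(alpha, alpha_max)
        x = [max(0.0, x[i] + alpha * d[i]) for i in range(n)]
        # Restart the conjugate direction when the constraint cut the step.
        if clipped:
            d_prev, pg_prev_sq = [0.0] * n, 0.0
        else:
            d_prev, pg_prev_sq = d, pg_sq
    return x


# Second unknown would be negative without the constraint, so it stays at zero.
print(projected_fr_cg([[2.0, 0.0], [0.0, 1.0]], [2.0, -1.0]))
```

Every operation is a matrix-vector product or an element-wise update, which is the property that makes explicit schemes of this kind attractive for vectorization.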
Coherent radio-frequency detection for narrowband direct comb spectroscopy.
Anstie, James D; Perrella, Christopher; Light, Philip S; Luiten, Andre N
2016-02-22
We demonstrate a scheme for coherent narrowband direct optical frequency comb spectroscopy. An extended cavity diode laser is injection locked to a single mode of an optical frequency comb, frequency shifted, and used as a local oscillator to optically down-mix the interrogating comb on a fast photodetector. The high spectral coherence of the injection lock generates a microwave frequency comb at the output of the photodiode with very narrow features, enabling spectral information to be further down-mixed to RF frequencies, allowing optical transmittance and phase to be obtained using electronics commonly found in the lab. We demonstrate two methods for achieving this step: a serial mode-by-mode approach and a parallel dual-comb approach, with the Cs D1 transition at 894 nm as a test case.
NASA Astrophysics Data System (ADS)
Tartan, Chloe C.; Salter, Patrick S.; Booth, Martin J.; Morris, Stephen M.; Elston, Steve J.
2016-09-01
Direct Laser Writing (DLW) by two-photon photopolymerization (TPP) enables the fabrication of micron-scale polymeric structures in soft matter systems. The technique has implications in a broad range of optics and photonics; in particular fast-switching liquid crystal (LC) modes for the development of next generation display technologies. In this paper, we report two different methodologies using our TPP-based fabrication technique. Two explicit examples are provided of voltage-dependent LC director profiles that are inherently unstable, but which appear to be promising candidates for fast-switching photonics applications. In the first instance, 1 μm-thick periodic walls of polymer network are written into a planar aligned (parallel rubbed) nematic pi-cell device containing a nematic LC-monomer mixture. The structures are fabricated when the device is electrically driven into a fast-switching nematic LC state and aberrations induced by the device substrates are corrected for by virtue of the adaptive optics elements included within the DLW setup. Optical polarizing microscopy images taken post-fabrication reveal that polymer walls oriented perpendicular to the rubbing direction promote the stability of the so-called optically compensated bend mode upon removal of the externally applied field. In the second case, polymer walls are written in a nematic LC-optically adhesive glue mixture. A polymer-LCs-polymer-slices or 'POLICRYPS' template is formed by immersing the device in acetone post-fabrication to remove any remaining non-crosslinked material. Injecting the resultant series of polymer microchannels (1 μm-thick) with a short-pitch, chiral nematic LC mixture leads to the spontaneous alignment of a fast-switching chiral nematic mode, where the helical axis lies parallel to the glass substrates.
Optimal contrast between the bright and dark states of the uniform lying helix alignment is achieved when the structures are spaced at the order of the device thickness, which was also found to be the case for the achiral system. The high resolution DLW technique limits structures to the focal spot size of the beam, 1 μm in diameter, such that the transmittance is expected to be significantly enhanced relative to other stabilization techniques. Moreover, both devices remain stable under electrical and thermal cycling.
Zeki, Semir
2016-10-01
Results from a variety of sources, some many years old, lead ineluctably to a re-appraisal of the twin strategies of hierarchical and parallel processing used by the brain to construct an image of the visual world. Contrary to common supposition, there are at least three 'feed-forward' anatomical hierarchies that reach the primary visual cortex (V1) and the specialized visual areas outside it, in parallel. These anatomical hierarchies do not conform to the temporal order with which visual signals reach the specialized visual areas through V1. Furthermore, neither the anatomical hierarchies nor the temporal order of activation through V1 predict the perceptual hierarchies. The latter shows that we see (and become aware of) different visual attributes at different times, with colour leading form (orientation) and directional visual motion, even though signals from fast-moving, high-contrast stimuli are among the earliest to reach the visual cortex (of area V5). Parallel processing, on the other hand, is much more ubiquitous than commonly supposed but is subject to a barely noticed but fundamental aspect of brain operations, namely that different parallel systems operate asynchronously with respect to each other and reach perceptual endpoints at different times. This re-assessment leads to the conclusion that the visual brain is constituted of multiple, parallel and asynchronously operating task- and stimulus-dependent hierarchies (STDH); which of these parallel anatomical hierarchies have temporal and perceptual precedence at any given moment is stimulus and task related, and dependent on the visual brain's ability to undertake multiple operations asynchronously. © 2016 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Equatorial anisotropy of the Earth's inner inner core from autocorrelations of earthquake coda
NASA Astrophysics Data System (ADS)
Wang, T.; Song, X.; Xia, H.
2014-12-01
The anisotropic structure of the inner core seems complex with significant depth and lateral variations. An innermost inner core has been suggested with a distinct form of anisotropy, but it has considerable uncertainties in its form, size, or even existence. All the previous inner-core anisotropy models have assumed a cylindrical anisotropy with the symmetry axis parallel (or nearly parallel) to the Earth's spin axis. In this study, we obtain inner-core phases, PKIIKP2 and PKIKP2 (the round-trip phases between the station and its antipode that passes straight through the center of the Earth and that is reflected from the inner-core boundary, respectively), from stackings of autocorrelations of earthquake coda at seismic station clusters around the world. The differential travel times PKIIKP2 - PKIKP2, which are sensitive to inner-core structure, show fast arrivals at high latitudes. However, we also observed large variations of up to 10 s along equatorial paths. These observations can be explained by a cylindrical anisotropy in the inner inner core (IIC) (with a radius of slightly less than half the inner core radius) that has a fast axis aligned near the equator and a cylindrical anisotropy in the outer inner core (OIC) that has a fast axis along the north-south direction. The equatorial fast axis of the IIC is near Central America and Southeast Asia. The form of the anisotropy in the IIC is distinctly different from that in the OIC and the anisotropy amplitude in the IIC is about 70% stronger than in the OIC. The different forms of anisotropy may be explained by a two-phase system of iron in the inner core (hcp in the OIC and bcc in the IIC). These results may suggest a major shift of the tectonics of the inner core during its formation and growth.
3D Deformation and Evolution of Mediterranean Basins: Insights From Crustal and Mantle Anisotropy
NASA Astrophysics Data System (ADS)
Lebedev, S.; Endrun, B.; Meier, T. M.; Adam, J.; Tirel, C.
2010-12-01
The slow convergence of Africa and Eurasia has been accompanied by spectacular tectonic activity within the Mediterranean. The evolution and retreat of multiple subduction zones has brought about pervasive deformation of continental back-arc basins. Continental deformation in the Mediterranean is at rates among the highest globally, and with diverse patterns and boundary conditions. Better understanding of this deformation promises important new insights into the dynamics of continents, and numerous competing models have been put forward. The lack of consensus to date is in large part due to the paucity of observational constraints on the deformation and flow within the deep crust and lithospheric mantle. Observations of seismic anisotropy provide constraints on deformation at depth. Array analysis of surface waves, in particular, can resolve variations in anisotropic fabric both laterally and as a function of depth. Analyses of other data types, including SKS splitting and Pn anisotropy, cross-validate and complement surface-wave constraints on anisotropy. Recent seismic-anisotropy imaging in the North Tyrrhenian and the Aegean indicates widespread diffuse deformation within the lithosphere, some of it with previously unknown patterns. Anisotropy shows the layering of finite strain in the crust and mantle. It reveals complex, depth-dependent flow patterns within the extending lithosphere and underlying asthenosphere. In the northern Aegean, fast shear-wave propagation directions within the mantle lithosphere are N-S, parallel to the direction of current extension. This indicates that the brittle upper crust, undergoing both stretching and bookshelf-like faulting on NE-SW trending faults, is underlain by a viscous mantle lithosphere that is flowing straight in the direction of the N-S extension. 
In the south-central Aegean, deforming weakly at present, anisotropic fabric in the lower crust trends parallel to the direction of paleo-extension in the Miocene; this fabric is a record of pervasive crustal flow that accompanied the exhumation of metamorphic core complexes at that time. In the North Tyrrhenian, extension over the last 10 m.y. has also caused exhumation of metamorphic rocks, with stretching lineations recording an E-W extension direction. Anisotropic fabric in both the lower crust and mantle lithosphere matches this direction, confirming that viscous flow within both layers has accommodated the extension. Previously observed SKS-wave splitting in the northern and central Aegean shows predominantly NE-SW fast-propagation directions and is likely to indicate current and recent flow in the asthenosphere due to the rapid retreat of the Hellenic subduction zone. In the North Tyrrhenian, anisotropy also changes at the lithosphere-asthenosphere boundary. Whereas the lithosphere preserves the E-W trending fabric that is a record of recent extension, the asthenosphere shows NW-SE trending fabric that indicates asthenospheric flow parallel to the Apennines and the trench, probably related to the complex configuration of the subducting slabs beneath the Alps and the Apennines.
Energetic particles in laboratory, space and astrophysical plasmas
NASA Astrophysics Data System (ADS)
McClements, K. G.; Turnyanskiy, M. R.
2017-01-01
Some recent studies of energetic particles in laboratory, space and astrophysical plasmas are discussed, and a number of common themes identified. Such comparative studies can elucidate the underlying physical processes. For example microwave bursts observed during edge localised modes (ELMs) in the mega amp spherical tokamak (MAST) can be attributed to energetic electrons accelerated by parallel electric fields associated with the ELMs. The very large numbers of electrons known to be accelerated in solar flares must also arise from parallel electric fields, and the demonstration of energetic electron production during ELMs suggests close links at the kinetic level between ELMs and flares. Energetic particle studies in solar flares have focussed largely on electrons rather than ions, since bremsstrahlung from deka-keV electrons provides the best available explanation of flare hard x-ray emission. However ion acceleration (but not electron acceleration) has been observed during merging startup of plasmas in MAST with dimensionless parameters similar to those of the solar corona during flares. Recent measurements in the Earth’s radiation belts demonstrate clearly a direct link between ion cyclotron emission (ICE) and fast particle population inversion, supporting the hypothesis that ICE in tokamaks is driven by fast particle distributions of this type. Shear Alfvén waves in plasmas with beta less than the electron to ion mass ratio have a parallel electric field that, in the solar corona, could accelerate electrons to hard x-ray-emitting energies; an extension of this calculation to plasmas with Alfvén speed arbitrarily close to the speed of light suggests that the mechanism could play a role in the production of cosmic ray electrons.
Short-term gas dispersion in idealised urban canopy in street parallel with flow direction
NASA Astrophysics Data System (ADS)
Chaloupecká, Hana; Jaňour, Zbyněk; Nosek, Štěpán
2016-03-01
Chemical attacks (e.g. Syria 2014-15 chlorine, 2013 sarin or Iraq 2006-7 chlorine) as well as chemical plant disasters (e.g. Spain 2015 nitric oxide, ferric chloride; Texas 2014 methyl mercaptan) threaten mankind. In these crisis situations, gas clouds are released. Dispersion of gas clouds is the issue of interest investigated in this paper. The paper describes wind tunnel experiments of dispersion from a ground-level point gas source. The source is situated in a model of an idealised urban canopy. The short-duration releases of the passive contaminant ethane are created by an electromagnetic valve. The gas cloud concentrations are measured at individual places at the height of the human breathing zone within a street parallel with the flow direction by a Fast-response Ionisation Detector. The simulations of the gas release for each measurement position are repeated many times under the same experimental setup to obtain representative datasets. These datasets are analysed to compute puff characteristics (arrival time, leaving time and duration). The results indicate that the mean value of the dimensionless arrival time can be described as a growing linear function of the dimensionless coordinate in the street parallel with the flow direction where the gas source is situated. The same might be stated about the dimensionless leaving time as well as the dimensionless duration; however, these fits are worse. Utilising a linear function, we might also estimate statistical characteristics of the datasets other than the dataset means (medians, trimeans). The datasets of the dimensionless arrival time, the dimensionless leaving time and the dimensionless duration can be fitted by the generalized extreme value distribution (GEV) in all sampling positions except one.
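The linear dependence of the mean dimensionless arrival time on the dimensionless streamwise coordinate can be checked with an ordinary least-squares fit. The sketch below is only an illustration: the function name `linear_fit` is hypothetical, and the numbers in the usage example are made up, not data from the study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up values: dimensionless coordinate vs. mean dimensionless arrival time.
coords = [0.0, 1.0, 2.0, 3.0]
arrivals = [1.0, 3.0, 5.0, 7.0]
slope, intercept = linear_fit(coords, arrivals)
print(slope, intercept)
```

The same function could be applied to medians or trimeans of the repeated releases instead of the means, in line with the abstract's remark about other statistical characteristics.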
Seismic Anisotropy And Upper Mantle Structure In SE Brazil
NASA Astrophysics Data System (ADS)
Heintz, M.; Vauchez, A.; Assumpcao, M.; Egydio-Silva, M.
We present preliminary shear wave splitting measurements performed in south-east Brazil, a geologically complex region. Seismic anisotropy is the result of a preferred orientation of anisotropic minerals (olivine) in the upper mantle, due to deformation. Splitting parameters Ø (direction of the fastest S wave) are compared to large-scale tectonic structures of the area, in order to infer to what extent the deformations in the upper mantle and in the crust are mechanically coupled. The field of study is a region of 1000 by 1000 km, along the Atlantic coast from São Paulo to 500 km north of Rio de Janeiro. This region is made up of large-scale geological units: the southern termination of the São Francisco craton, of Archean age, surrounded by two Neoproterozoic belts (the Ribeira belt to the east and the Brasilia belt to the west), and the Parana basin, which is a vast flood basalt region. The teleseisms used were recorded by 39 seismological stations well distributed in the region of interest. The results highlight the fact that the orientations of the polarization plane of the fast split shear wave vary a lot in this region, and the measurements can be split into five groups: directions parallel to the NE-SW trend of the Ribeira belt, directions parallel to the NW-SE trend of the Brasilia belt, directions along the NE-SW Transbrasiliano lineament, directions parallel to the absolute plate motion (APM), which is E-W in this region, and directions turning around a cylindrical low-velocity anomaly imaged in the Parana basin and supposed to be the fossil conduit of the Tristan da Cunha plume head.
Seismic properties of lawsonite eclogites from the southern Motagua fault zone, Guatemala
NASA Astrophysics Data System (ADS)
Kim, Daeyeong; Wallis, Simon; Endo, Shunsuke; Ree, Jin-Han
2016-05-01
We present new data on the crystal preferred orientation (CPO) and seismic properties of omphacite and lawsonite in extremely fresh eclogite from the southern Motagua fault zone, Guatemala, to discuss the seismic anisotropy of subducting oceanic crust. The CPO of omphacite is characterized by (010)[001], and it shows P-wave seismic anisotropies (AVP) of 1.4%-3.2% and S-wave seismic anisotropies (AVS) of 1.4%-2.7%. Lawsonite exhibits (001) planes parallel to the foliation and [010] axes parallel to the lineation, and seismic anisotropies of 1.7%-6.6% AVP and 3.4%-14.7% AVS. The seismic anisotropy of a rock mass consisting solely of omphacite and lawsonite is 1.2%-4.1% AVP and 1.8%-6.8% AVS. For events that propagate more or less parallel to the maximum extension direction, X, the fast S-wave velocity (VS) polarization is parallel to Z in the Y-Z section (rotated from the X-Z section), causing trench-normal seismic anisotropy for orthogonal subduction. Based on the high modal abundance and strong fabric of lawsonite, the AVS of eclogites is estimated as ~11.7% in the case that lawsonite makes up ~75% of the rock mass. On this basis, we suggest that lawsonite in both blueschist and eclogite may play important roles in the formation of the complex pattern of seismic anisotropy observed in NE Japan: weak trench-parallel anisotropy in the forearc basin domains and trench-normal anisotropy in the backarc region.
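Percentages such as AVP and AVS are commonly computed from the maximum and minimum velocities as 200 (Vmax − Vmin) / (Vmax + Vmin). The abstract does not state which definition was used, so the sketch below assumes this common convention; the function name and the example velocities are hypothetical.

```python
def anisotropy_percent(v_max, v_min):
    """Velocity anisotropy in percent, assuming AV = 200 * (Vmax - Vmin) / (Vmax + Vmin)."""
    return 200.0 * (v_max - v_min) / (v_max + v_min)


# Hypothetical P-wave velocities in km/s, not values from the study.
print(round(anisotropy_percent(8.2, 8.0), 2))
```

For AVS the same expression is usually evaluated per propagation direction, with the fast and slow shear-wave velocities in place of Vmax and Vmin.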
NASA Astrophysics Data System (ADS)
Hermens, Ulrike; Pothen, Mario; Winands, Kai; Arntz, Kristian; Klocke, Fritz
2018-02-01
Laser-induced periodic surface structures (LIPSS), which have found applications particularly in the field of surface functionalization, have been investigated for many years. The direction of these ripple structures, which have a nanoscale periodicity, can be manipulated by changing the laser polarization. For industrial use, it is useful to manipulate the direction of these structures automatically and to obtain smooth changes of their orientation without any visible inhomogeneity. However, no system solution currently exists that controls the polarization direction in a fully automated way within a single software solution. In this paper, a system solution is presented that includes a liquid crystal polarizer to control the polarization direction. It is synchronized with a scanner, a dynamic beam expander and a five-axis system. It provides fast switching times and small step sizes. First results of fabricated structures are also presented. In a systematic study, the conjunction of LIPSS with different orientations in two parallel line scans has been investigated.
Large amplitude MHD waves upstream of the Jovian bow shock
NASA Technical Reports Server (NTRS)
Goldstein, M. L.; Smith, C. W.; Matthaeus, W. H.
1983-01-01
Observations of large amplitude magnetohydrodynamic (MHD) waves upstream of Jupiter's bow shock are analyzed. The waves are found to be right circularly polarized in the solar wind frame, which suggests that they are propagating in the fast magnetosonic mode. A complete spectral and minimum variance eigenvalue analysis of the data was performed. The power spectrum of the magnetic fluctuations contains several peaks. The fluctuations at 2.3 mHz have a direction of minimum variance along the direction of the average magnetic field. The direction of minimum variance of these fluctuations lies at approximately 40 deg. to the magnetic field and is parallel to the radial direction. We argue that these fluctuations are waves excited by protons reflected off the Jovian bow shock. The inferred speed of the reflected protons is about two times the solar wind speed in the plasma rest frame. A linear instability analysis is presented which suggests an explanation for many of the observed features.
A velocity map imaging mass spectrometer for photofragments of fast ion beams
NASA Astrophysics Data System (ADS)
Johnston, M. David; Pearson, Wright L.; Wang, Greg; Metz, Ricardo B.
2018-01-01
We present the details of a fast ion velocity map imaging mass spectrometer that is capable of imaging the photofragments of trap-cooled (≥7 K) ions produced in a versatile ion source. The new instrument has been used to study the predissociation of N2O+ produced by electric discharge and the direct dissociation of Al2+ formed by laser ablation. The instrument's resolution is currently limited by the diameter of the collimating iris to a value of Δv/v = 7.6%. Photofragment images of N2O+ show that when the predissociative state is changed from 2Σ+(200) to 2Σ+(300) the dominant product channel shifts from a spin-forbidden ground state, N (4S) + NO+(v = 5), to a spin-allowed pathway, N*(2D) + NO+. The first photofragment images of Al2+ confirm the existence of a directly dissociative parallel transition (2Σ+u ← 2Σ+g) that yields products with a large amount of kinetic energy. D0 of ground state Al2+ (2Σ+g) measured from these images is 138 ± 5 kJ/mol, which is consistent with the published literature.
Depth-varying azimuthal anisotropy in the Tohoku subduction channel
NASA Astrophysics Data System (ADS)
Liu, Xin; Zhao, Dapeng
2017-09-01
We determine a detailed 3-D model of azimuthal anisotropy tomography of the Tohoku subduction zone from the Japan Trench outer-rise to the back-arc near the Japan Sea coast, using a large number of high-quality P and S wave arrival-time data of local earthquakes recorded by the dense seismic network on the Japan Islands. Depth-varying seismic azimuthal anisotropy is revealed in the Tohoku subduction channel. The shallow portion of the Tohoku megathrust zone (<30 km depth) generally exhibits trench-normal fast-velocity directions (FVDs) except for the source area of the 2011 Tohoku-oki earthquake (Mw 9.0) where the FVD is nearly trench-parallel, whereas the deeper portion of the megathrust zone (at depths of ∼30-50 km) mainly exhibits trench-parallel FVDs. Trench-normal FVDs are revealed in the mantle wedge beneath the volcanic front and the back-arc. The Pacific plate mainly exhibits trench-parallel FVDs, except for the top portion of the subducting Pacific slab where visible trench-normal FVDs are revealed. A qualitative tectonic model is proposed to interpret such anisotropic features, suggesting transposition of earlier fabrics in the oceanic lithosphere into subduction-induced new structures in the subduction channel.
Advances in locally constrained k-space-based parallel MRI.
Samsonov, Alexey A; Block, Walter F; Arunachalam, Arjun; Field, Aaron S
2006-02-01
In this article, several theoretical and methodological developments regarding k-space-based, locally constrained parallel MRI (pMRI) reconstruction are presented. A connection between Parallel MRI with Adaptive Radius in k-Space (PARS) and GRAPPA methods is demonstrated. The analysis provides a basis for unified treatment of both methods. Additionally, a weighted PARS reconstruction is proposed, which may absorb different weighting strategies for improved image reconstruction. Next, a fast and efficient method for pMRI reconstruction of data sampled on non-Cartesian trajectories is described. In the new technique, the computational burden associated with the numerous matrix inversions in the original PARS method is drastically reduced by limiting direct calculation of reconstruction coefficients to only a few reference points. The rest of the coefficients are found by interpolating between the reference sets, which is possible due to the similar configuration of points participating in reconstruction for highly symmetric trajectories, such as radial and spirals. As a result, the time requirements are drastically reduced, which makes it practical to use pMRI with non-Cartesian trajectories in many applications. The new technique was demonstrated with simulated and actual data sampled on radial trajectories. Copyright 2006 Wiley-Liss, Inc.
[CMACPAR, a modified parallel neuro-controller for control processes].
Ramos, E; Surós, R
1999-01-01
CMACPAR is a parallel neurocontroller oriented to real-time systems such as control processes. Its main characteristics are a fast learning algorithm, a reduced number of calculations, great generalization capacity, local learning and intrinsic parallelism. This type of neurocontroller is used in real-time applications required by refineries, hydroelectric centers, factories, etc. In this work we present the analysis and the parallel implementation of a modified scheme of the Cerebellar Model CMAC for the n-dimensional space projection using a medium-granularity parallel neurocontroller. The proposed memory management allows for a significant reduction in training time and required memory size.
Wiens, Curtis N.; Artz, Nathan S.; Jang, Hyungseok; McMillan, Alan B.; Reeder, Scott B.
2017-01-01
Purpose To develop an externally calibrated parallel imaging technique for three-dimensional multispectral imaging (3D-MSI) in the presence of metallic implants. Theory and Methods A fast, ultrashort echo time (UTE) calibration acquisition is proposed to enable externally calibrated parallel imaging techniques near metallic implants. The proposed calibration acquisition uses a broadband radiofrequency (RF) pulse to excite the off-resonance induced by the metallic implant, fully phase-encoded imaging to prevent in-plane distortions, and UTE to capture rapidly decaying signal. The performance of the externally calibrated parallel imaging reconstructions was assessed using phantoms and in vivo examples. Results Phantom and in vivo comparisons to self-calibrated parallel imaging acquisitions show that significant reductions in acquisition times can be achieved using externally calibrated parallel imaging with comparable image quality. Acquisition time reductions are particularly large for fully phase-encoded methods such as spectrally resolved fully phase-encoded three-dimensional (3D) fast spin-echo (SR-FPE), in which scan time reductions of up to 8 min were obtained. Conclusion A fully phase-encoded acquisition with broadband excitation and UTE enabled externally calibrated parallel imaging for 3D-MSI, eliminating the need for repeated calibration regions at each frequency offset. Significant reductions in acquisition time can be achieved, particularly for fully phase-encoded methods like SR-FPE. PMID:27403613
NASA Astrophysics Data System (ADS)
Lebedev, M.; Collet, O.; Bona, A.; Gurevich, B.
2015-12-01
Estimations of hydrocarbon and water resources as well as reservoir management during production are the main challenges facing the resource recovery industry nowadays. The recently discovered reservoirs are not only deep but they are also located in complicated geological formations. Hence, the effect of anisotropy on reservoir imaging becomes significant. Shear wave (S-wave) splitting has been observed in the field and laboratory experiments for decades. Despite the fact that S-wave splitting is widely used for evaluation of subsurface anisotropy, the effects of stresses as well as fluid saturation on anisotropy have not been understood in detail. In this paper we present the laboratory study of the effect of stress and saturation on S-wave splitting for a Bentheim sandstone sample. The cubic sample (50 mm3, porosity 22%, density 1890 kg/m3) was placed into a true-triaxial cell. The sample was subjected to several combinations of stresses varying from 0 to 10 MPa and applied to the sample in two directions (X and Y), while no stress was applied to the sample in the Z-direction. The sample's bedding was oriented nearly parallel to the Y-Z plane. The ultrasonic S-waves were excited at a frequency of 0.5 MHz by a piezoelectric transducer and were propagating in the Z-direction. Upon wave arrival onto the free surface the displacement of the surface was monitored by a Laser Doppler interferometer. Hodograms of the central point of the dry sample (Fig. 1) demonstrate how S-wave polarizations for both "fast" and "slow" S-waves change when increasing the stress in the X direction, while the stress in direction Y is kept constant at 3 MPa. Polarization of the fast S wave is shifted towards the X-axis (axis of the maximum stress). While both S-wave velocities increase with stress, the anisotropy level remains the same. No shift of polarization of fast wave was observed when the stress along the Y-axis was kept at 3 MPa, while the stress along the X-axis was increasing.
However, in that case, S-wave splitting is more prominent. The fast S-wave velocity increases with increasing stress while the slow S-wave velocity starts decreasing after 5 MPa, indicating possible cracks opening in the Y-direction. Interestingly, no change in anisotropy was observed for the water-saturated sample.
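A polarization direction can be read off a hodogram by taking the principal axis of the particle-motion covariance. The sketch below shows one generic way to do this, assuming the X and Y displacement components are sampled at the same instants; it is not taken from the study, and the function name `polarization_angle_deg` is hypothetical.

```python
import math


def polarization_angle_deg(ux, uy):
    """Angle (degrees from the X axis towards Y) of the dominant motion direction.

    Uses the principal axis of the 2x2 covariance of the displacement samples.
    """
    n = len(ux)
    mx, my = sum(ux) / n, sum(uy) / n
    cxx = sum((a - mx) ** 2 for a in ux)
    cyy = sum((b - my) ** 2 for b in uy)
    cxy = sum((a - mx) * (b - my) for a, b in zip(ux, uy))
    return math.degrees(0.5 * math.atan2(2.0 * cxy, cxx - cyy))


# Synthetic linear particle motion along a direction 30 degrees from X.
ts = [-1.0, -0.5, 0.0, 0.5, 1.0]
ux = [t * math.cos(math.radians(30.0)) for t in ts]
uy = [t * math.sin(math.radians(30.0)) for t in ts]
print(polarization_angle_deg(ux, uy))
```

Applied to a time window containing only the fast arrival, a shift of this angle towards 0 degrees would correspond to the polarization rotating towards the X-axis.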
Optical frequency comb Faraday rotation spectroscopy
NASA Astrophysics Data System (ADS)
Johansson, Alexandra C.; Westberg, Jonas; Wysocki, Gerard; Foltynowicz, Aleksandra
2018-05-01
We demonstrate optical frequency comb Faraday rotation spectroscopy (OFC-FRS) for broadband interference-free detection of paramagnetic species. The system is based on a femtosecond doubly resonant optical parametric oscillator and a fast-scanning Fourier transform spectrometer (FTS). The sample is placed in a DC magnetic field parallel to the light propagation. Efficient background suppression is implemented via switching the direction of the field on consecutive FTS scans and subtracting the consecutive spectra, which enables long-term averaging. In this first demonstration, we measure the entire Q- and R-branches of the fundamental band of nitric oxide in the 5.2-5.4 µm range and achieve good agreement with a theoretical model.
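The background suppression by field reversal can be illustrated with a short sketch. It assumes that the Faraday rotation signal changes sign when the magnetic field direction is reversed while the background does not, so that the half-difference of two consecutive scans isolates the signal. The function name `faraday_signal` and the numbers are hypothetical.

```python
def faraday_signal(scan_plus, scan_minus):
    """Half-difference of two spectra recorded with opposite field directions."""
    if len(scan_plus) != len(scan_minus):
        raise ValueError("scans must have the same length")
    return [(p - m) / 2.0 for p, m in zip(scan_plus, scan_minus)]


# Made-up spectra: a common background plus a signal that flips sign with the field.
background = [1.0, 2.0, 3.0]
signal = [0.5, -0.25, 0.0]
scan_plus = [b + s for b, s in zip(background, signal)]
scan_minus = [b - s for b, s in zip(background, signal)]
print(faraday_signal(scan_plus, scan_minus))
```

Because slowly drifting background terms cancel pairwise, the differences from many scan pairs can be averaged over long times.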
Parallel fast multipole boundary element method applied to computational homogenization
NASA Astrophysics Data System (ADS)
Ptaszny, Jacek
2018-01-01
In the present work, a fast multipole boundary element method (FMBEM) and a parallel computer code for 3D elasticity problems are developed and applied to the computational homogenization of a solid containing spherical voids. The system of equations is solved by using the GMRES iterative solver. The boundary of the body is discretized by using quadrilateral serendipity elements with adaptive numerical integration. Operations related to a single GMRES iteration, performed by traversing the corresponding tree structure upwards and downwards, are parallelized by using the OpenMP standard. The assignment of tasks to threads is based on the assumption that the tree nodes at which the moment transformations are initialized can be partitioned into disjoint sets of equal or approximately equal size and assigned to the threads. The achieved speedup as a function of the number of threads is examined.
Increased Energy Delivery for Parallel Battery Packs with No Regulated Bus
NASA Astrophysics Data System (ADS)
Hsu, Chung-Ti
In this dissertation, a new approach to paralleling different battery types is presented. A method for controlling charging/discharging of different battery packs by using low-cost bi-directional switches instead of DC-DC converters is proposed. The proposed system architecture, algorithms, and control techniques allow batteries with different chemistry, voltage, and SOC to be properly charged and discharged in parallel without causing safety problems. The physical design and cost of the energy management system are substantially reduced. Additionally, specific types of failures in the maximum power point tracking (MPPT) in a photovoltaic (PV) system when tracking only the load current of a DC-DC converter are analyzed. A periodic nonlinear load current causes problems for MPPT realized by the conventional perturb and observe (P&O) algorithm. A modified MPPT algorithm is proposed that still requires only typically measured signals, yet is suitable for both linear and periodic nonlinear loads. Moreover, for a modular DC-DC converter using several converters in parallel, the input power from PV panels is processed and distributed at the module level. Methods for properly implementing distributed MPPT are studied. A new approach to efficient MPPT under partial shading conditions is presented. The power stage architecture achieves a fast input current change rate by combining a current-adjustable converter with a few converters operating at a constant current.
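For readers unfamiliar with the conventional perturb and observe (P&O) scheme mentioned in this abstract, the following is a minimal sketch of its basic loop. It is not the dissertation's modified algorithm and does not model a periodic nonlinear load; the power curve `pv_power`, its 17 V peak, and all parameter values are hypothetical choices made only for illustration.

```python
def pv_power(voltage):
    """Hypothetical single-peak PV power curve with its maximum at 17 V."""
    return max(0.0, 100.0 - (voltage - 17.0) ** 2)


def perturb_and_observe(power_fn, v_start, step, iterations):
    """Conventional P&O: keep perturbing in the same direction while the
    measured power rises, and reverse direction when it falls."""
    voltage = v_start
    direction = 1.0
    last_power = power_fn(voltage)
    history = [voltage]
    for _ in range(iterations):
        voltage += direction * step          # perturb the operating point
        power = power_fn(voltage)            # observe the new power
        if power < last_power:
            direction = -direction           # power dropped: reverse
        last_power = power
        history.append(voltage)
    return history


trace = perturb_and_observe(pv_power, v_start=12.0, step=0.5, iterations=60)
```

On this static curve the operating point climbs to the peak and then oscillates within one step of it, which is the steady-state behavior usually attributed to P&O.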
NASA Astrophysics Data System (ADS)
Du, Xiaoping; Wang, Yang; Liu, Hao
2018-04-01
A space object in a highly elliptical orbit always appears as a single image point on ground-based imaging equipment, so its shape and attitude are difficult to resolve and identify directly. In this paper a novel algorithm is presented for the estimation of spacecraft shape. An apparent magnitude model suitable for the inversion of object information such as shape and attitude is established based on the analysis of photometric characteristics. A parallel adaptive shape inversion algorithm based on UKF was designed after deriving the dynamic equation of the nonlinear, Gaussian system, including the influence of various drag forces. The results of a simulation study demonstrate the viability and robustness of the new filter and its fast convergence rate. It achieves high-accuracy inversion of combined shapes, especially for cube and cylinder buses. Even with sparse photometric data, it still maintains a high inversion success rate.
NASA Astrophysics Data System (ADS)
Lin, Na; Jia, Zhe; Wang, Zhihui; Zhao, Hui; Ai, Guo; Song, Xiangyun; Bai, Ying; Battaglia, Vincent; Sun, Chengdong; Qiao, Juan; Wu, Kai; Liu, Gao
2017-10-01
The structural degradation of commercial lithium-ion battery (LIB) graphite anodes with different cycling numbers and charge rates was investigated by focused ion beam (FIB) and scanning electron microscopy (SEM). Cross-section images of the graphite anode obtained by FIB milling show that cracks, resulting from the volume expansion of the graphite electrode during long-term cycling, formed parallel to the current collector. The cracks occur in the bulk of graphite particles near the lithium insertion surface and might derive from the stress induced during lithiation and de-lithiation cycles. Subsequently, cracking takes place along grain boundaries of the polycrystalline graphite, but only in the direction parallel to the current collector. Furthermore, fast-charged graphite electrodes are more prone to crack formation, since the tensile strength of graphite is more likely to be exceeded at higher charge rates. Therefore, for long-term or high-charge-rate LIB applications, the tensile strength of the graphite anode should be taken into account.
Parallel heuristics for scalable community detection
Lu, Hao; Halappanavar, Mahantesh; Kalyanaraman, Ananth
2015-08-14
Community detection has become a fundamental operation in numerous graph-theoretic applications. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real world graphs derived from multiple application domains. Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in comparable number or fewer iterations, while providing real speedups of up to 16x using 32 threads.
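The quantity that the Louvain method optimizes can be made concrete with a short sketch. The function below computes the standard Newman modularity of a given partition for an undirected, unweighted graph; it is not the Louvain heuristic itself, nor the parallel heuristics of this paper, and the toy graph is a hypothetical example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)          # edges with both ends in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    total_degree = defaultdict(int)   # sum of node degrees per community
    for node, d in degree.items():
        total_degree[community[node]] += d
    # Q = sum over communities of (intra-edge fraction - expected fraction)
    return sum(intra[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


# Two triangles joined by a single bridge edge (hypothetical toy graph).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Placing each triangle in its own community gives a higher modularity than lumping all nodes together, which is the kind of improvement a modularity-optimizing heuristic searches for.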
The Mercury System: Embedding Computation into Disk Drives
2004-08-20
enabling technologies to build extremely fast data search engines . We do this by moving the search closer to the data, and performing it in hardware...engine searches in parallel across a disk or disk surface 2. System Parallelism: Searching is off-loaded to search engines and main processor can
Feng, Yanqiu; Song, Yanli; Wang, Cong; Xin, Xuegang; Feng, Qianjin; Chen, Wufan
2013-10-01
To develop and test a new algorithm for fast direct Fourier transform (DrFT) reconstruction of MR data on non-Cartesian trajectories composed of lines with equally spaced points. The DrFT, which is normally used as a reference in evaluating the accuracy of other reconstruction methods, can reconstruct images directly from non-Cartesian MR data without interpolation. However, DrFT reconstruction involves substantially intensive computation, which makes the DrFT impractical for clinical routine applications. In this article, the Chirp transform algorithm was introduced to accelerate the DrFT reconstruction of radial and Periodically Rotated Overlapping ParallEL Lines with Enhanced Reconstruction (PROPELLER) MRI data located on the trajectories that are composed of lines with equally spaced points. The performance of the proposed Chirp transform algorithm-DrFT algorithm was evaluated by using simulation and in vivo MRI data. After implementing the algorithm on a graphics processing unit, the proposed Chirp transform algorithm-DrFT algorithm achieved an acceleration of approximately one order of magnitude, and the speed-up factor was further increased to approximately three orders of magnitude compared with the traditional single-thread DrFT reconstruction. Implementing the Chirp transform algorithm-DrFT algorithm on the graphics processing unit can efficiently calculate the DrFT reconstruction of the radial and PROPELLER MRI data. Copyright © 2012 Wiley Periodicals, Inc.
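The idea behind a chirp transform can be illustrated in one dimension. The sketch below evaluates a Fourier sum at equally spaced but otherwise arbitrary frequencies by rewriting it as a convolution (Bluestein's identity) and computing that convolution with FFTs. It is a generic illustration under assumed names (`chirp_dft`, `f0`, `df`), not the authors' reconstruction code, and it omits everything specific to radial or PROPELLER trajectories and to GPUs.

```python
import numpy as np


def chirp_dft(x, f0, df, m):
    """Evaluate X[k] = sum_n x[n] * exp(-2j*pi*(f0 + k*df)*n) for k = 0..m-1
    using the identity n*k = (n**2 + k**2 - (k - n)**2) / 2."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    idx_n = np.arange(n)
    idx_k = np.arange(m)
    # Input times the start-frequency phase and the "down" chirp.
    a = x * np.exp(-2j * np.pi * f0 * idx_n) * np.exp(-1j * np.pi * df * idx_n**2)
    # "Up" chirp kernel for lags j = -(n-1) .. m-1.
    lags = np.arange(-(n - 1), m)
    h = np.exp(1j * np.pi * df * lags**2)
    # Circular convolution of size >= n+m-1 is alias-free on the needed outputs.
    size = 1 << (n + m - 2).bit_length()
    conv = np.fft.ifft(np.fft.fft(a, size) * np.fft.fft(h, size))
    return conv[n - 1:n - 1 + m] * np.exp(-1j * np.pi * df * idx_k**2)


rng = np.random.default_rng(0)
x = rng.standard_normal(50) + 1j * rng.standard_normal(50)
f0, df, m = -0.13, 0.0071, 37
fast = chirp_dft(x, f0, df, m)
# Direct O(n*m) evaluation of the same sum, used only as a reference.
direct = np.array([np.sum(x * np.exp(-2j * np.pi * (f0 + k * df) * np.arange(50)))
                   for k in range(m)])
```

The direct sum costs on the order of n*m operations, while the FFT-based version costs on the order of (n+m) log(n+m), which is the source of the algorithmic speed-up before any hardware acceleration.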
Fast segmentation of satellite images using SLIC, WebGL and Google Earth Engine
NASA Astrophysics Data System (ADS)
Donchyts, Gennadii; Baart, Fedor; Gorelick, Noel; Eisemann, Elmar; van de Giesen, Nick
2017-04-01
Google Earth Engine (GEE) is a parallel geospatial processing platform, which harmonizes access to petabytes of freely available satellite images. It provides a very rich API, allowing development of dedicated algorithms to extract useful geospatial information from these images. At the same time, modern GPUs provide thousands of computing cores, which are mostly not utilized in this context. In recent years, WebGL has become a popular and well-supported API, allowing fast image processing directly in web browsers. In this work, we will evaluate the applicability of WebGL to enable fast segmentation of satellite images. A new implementation of a Simple Linear Iterative Clustering (SLIC) algorithm using GPU shaders will be presented. SLIC is a simple and efficient method to decompose an image into visually homogeneous regions. It adapts a k-means clustering approach to generate superpixels efficiently. While this approach will be hard to scale, due to a significant amount of data to be transferred to the client, it should significantly improve exploratory possibilities and simplify development of dedicated algorithms for geoscience applications. Our prototype implementation will be used to improve surface water detection of the reservoirs using multispectral satellite imagery.
Explicit and Implicit Processes Constitute the Fast and Slow Processes of Sensorimotor Learning.
McDougle, Samuel D; Bond, Krista M; Taylor, Jordan A
2015-07-01
A popular model of human sensorimotor learning suggests that a fast process and a slow process work in parallel to produce the canonical learning curve (Smith et al., 2006). Recent evidence supports the subdivision of sensorimotor learning into explicit and implicit processes that simultaneously subserve task performance (Taylor et al., 2014). We set out to test whether these two accounts of learning processes are homologous. Using a recently developed method to assay explicit and implicit learning directly in a sensorimotor task, along with a computational modeling analysis, we show that the fast process closely resembles explicit learning and the slow process approximates implicit learning. In addition, we provide evidence for a subdivision of the slow/implicit process into distinct manifestations of motor memory. We conclude that the two-state model of motor learning is a close approximation of sensorimotor learning, but it is unable to describe adequately the various implicit learning operations that forge the learning curve. Our results suggest that a wider net be cast in the search for the putative psychological mechanisms and neural substrates underlying the multiplicity of processes involved in motor learning. Copyright © 2015 the authors 0270-6474/15/359568-12$15.00/0.
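The two-state model referred to in this abstract can be written as a pair of coupled update equations, one per process, both driven by the same error. The sketch below simulates it for a constant perturbation. The retention and learning-rate values are illustrative assumptions of the kind reported for this model family, not parameters fitted in this study, and the mapping of the fast and slow states onto explicit and implicit learning is the paper's empirical claim, not something the code demonstrates.

```python
import numpy as np


def two_state_model(perturbation, a_fast=0.59, b_fast=0.21,
                    a_slow=0.992, b_slow=0.02):
    """Simulate fast and slow adaptation states.

    Each state decays with retention factor a_* and learns from the
    shared error with rate b_*. Parameter values are assumed defaults.
    """
    n = len(perturbation)
    fast = np.zeros(n + 1)
    slow = np.zeros(n + 1)
    for t in range(n):
        error = perturbation[t] - (fast[t] + slow[t])
        fast[t + 1] = a_fast * fast[t] + b_fast * error
        slow[t + 1] = a_slow * slow[t] + b_slow * error
    return fast, slow


# Constant unit perturbation over 200 trials (hypothetical schedule).
fast, slow = two_state_model(np.ones(200))
total = fast + slow
```

With these values the fast state dominates the first trials and then declines, while the slow state builds up gradually and carries most of the adaptation late in training.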
Preferential Heating of Oxygen 5+ Ions by Finite-Amplitude Oblique Alfven Waves
NASA Technical Reports Server (NTRS)
Maneva, Yana G.; Vinas, Adolfo; Araneda, Jamie; Poedts, Stefaan
2016-01-01
Minor ions in the fast solar wind are known to have higher temperatures and to flow faster than protons in interplanetary space. In this study we combine previous research on parametric instability theory and 2.5D hybrid simulations to study the onset of preferential heating of Oxygen 5+ ions by large-scale finite-amplitude Alfven waves in the collisionless fast solar wind. We consider an initially non-drifting isotropic multi-species plasma, consisting of isothermal massless fluid electrons, kinetic protons and kinetic Oxygen 5+ ions. The external energy source for the plasma heating and energization is a set of oblique monochromatic Alfven-cyclotron waves. The waves have been created by rotating the direction of the initial parallel pump, which is a solution of the multi-fluid plasma dispersion relation. We consider propagation angles theta less than or equal to 30 deg. The obliquely propagating Alfven pump waves lead to strong diffusion in the ion phase space, resulting in highly anisotropic heavy ion velocity distribution functions and proton beams. We discuss the application of the model to the problems of preferential heating of minor ions in the solar corona and the fast solar wind.
Explicit and Implicit Processes Constitute the Fast and Slow Processes of Sensorimotor Learning
Bond, Krista M.; Taylor, Jordan A.
2015-01-01
A popular model of human sensorimotor learning suggests that a fast process and a slow process work in parallel to produce the canonical learning curve (Smith et al., 2006). Recent evidence supports the subdivision of sensorimotor learning into explicit and implicit processes that simultaneously subserve task performance (Taylor et al., 2014). We set out to test whether these two accounts of learning processes are homologous. Using a recently developed method to assay explicit and implicit learning directly in a sensorimotor task, along with a computational modeling analysis, we show that the fast process closely resembles explicit learning and the slow process approximates implicit learning. In addition, we provide evidence for a subdivision of the slow/implicit process into distinct manifestations of motor memory. We conclude that the two-state model of motor learning is a close approximation of sensorimotor learning, but it is unable to describe adequately the various implicit learning operations that forge the learning curve. Our results suggest that a wider net be cast in the search for the putative psychological mechanisms and neural substrates underlying the multiplicity of processes involved in motor learning. PMID:26134640
Computer-Aided Parallelizer and Optimizer
NASA Technical Reports Server (NTRS)
Jin, Haoqiang
2011-01-01
The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems
NASA Technical Reports Server (NTRS)
Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael
2000-01-01
The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.
A fast, parallel algorithm for distance-dependent calculation of crystal properties
NASA Astrophysics Data System (ADS)
Stein, Matthew
2017-12-01
A fast, parallel algorithm for distance-dependent calculation and simulation of crystal properties is presented along with speedup results and methods of application. An illustrative example is used to compute the Lennard-Jones lattice constants up to 32 significant figures for 4 ≤ p ≤ 30 in the simple cubic, face-centered cubic, body-centered cubic, hexagonal-close-pack, and diamond lattices. In most cases, the known precision of these constants is more than doubled, and in some cases, corrected from previously published figures. The tools and strategies to make this computation possible are detailed along with application to other potentials, including those that model defects.
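A Lennard-Jones lattice constant of the kind computed here is a sum of 1/r^p over all lattice sites other than the origin, with distances measured in units of the nearest-neighbor spacing. The sketch below evaluates it for the simple cubic lattice by brute-force summation inside a cube. This is only a low-precision illustration under an assumed `cutoff` parameter; it is not the paper's algorithm and comes nowhere near 32 significant figures.

```python
import itertools


def simple_cubic_lattice_sum(p, cutoff=8):
    """Sum 1/r**p over nonzero simple cubic lattice points whose integer
    coordinates lie in [-cutoff, cutoff] (nearest-neighbor distance 1)."""
    total = 0.0
    span = range(-cutoff, cutoff + 1)
    for i, j, k in itertools.product(span, repeat=3):
        r2 = i * i + j * j + k * k
        if r2:                       # skip the origin itself
            total += r2 ** (-p / 2)
    return total


l12 = simple_cubic_lattice_sum(12)
```

For p = 12 the terms fall off so quickly that a small cube already reproduces the textbook value of about 6.2021; for small p the direct sum converges slowly, which is one reason specialized methods are needed for high precision.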
Effects of ATC automation on precision approaches to closely spaced parallel runways
NASA Technical Reports Server (NTRS)
Slattery, R.; Lee, K.; Sanford, B.
1995-01-01
Improved navigational technology (such as the Microwave Landing System and the Global Positioning System) installed in modern aircraft will enable air traffic controllers to better utilize available airspace. Consequently, arrival traffic can fly approaches to parallel runways separated by smaller distances than are currently allowed. Previous simulation studies of advanced navigation approaches have found that controller workload is increased when there is a combination of aircraft that are capable of following advanced navigation routes and aircraft that are not. Research into Air Traffic Control automation at Ames Research Center has led to the development of the Center-TRACON Automation System (CTAS). The Final Approach Spacing Tool (FAST) is the component of the CTAS used in the TRACON area. The work in this paper examines, via simulation, the effects of FAST used for aircraft landing on closely spaced parallel runways. The simulation contained various combinations of aircraft, equipped and unequipped with advanced navigation systems. A set of simulations was run both manually and with an augmented set of FAST advisories to sequence aircraft, assign runways, and avoid conflicts. The results of the simulations are analyzed, measuring the airport throughput, aircraft delay, loss of separation, and controller workload.
Distributed Function Mining for Gene Expression Programming Based on Fast Reduction.
Deng, Song; Yue, Dong; Yang, Le-chan; Fu, Xiong; Feng, Ya-zhou
2016-01-01
For high-dimensional and massive data sets, traditional centralized gene expression programming (GEP) or improved algorithms lead to increased run-time and decreased prediction accuracy. To solve this problem, this paper proposes a new improved algorithm called distributed function mining for gene expression programming based on fast reduction (DFMGEP-FR). In DFMGEP-FR, fast attribution reduction in binary search algorithms (FAR-BSA) is proposed to quickly find the optimal attribution set, and the function consistency replacement algorithm is given to solve integration of the local function model. Thorough comparative experiments for DFMGEP-FR, centralized GEP and the parallel gene expression programming algorithm based on simulated annealing (parallel GEPSA) are included in this paper. For the waveform, mushroom, connect-4 and musk datasets, the comparative results show that the average time-consumption of DFMGEP-FR drops by 89.09%, 88.85%, 85.79% and 93.06%, respectively, in contrast to centralized GEP and by 12.5%, 8.42%, 9.62% and 13.75%, respectively, compared with parallel GEPSA. Six well-studied UCI test data sets demonstrate the efficiency and capability of our proposed DFMGEP-FR algorithm for distributed function mining.
Automatic Multilevel Parallelization Using OpenMP
NASA Technical Reports Server (NTRS)
Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)
2002-01-01
In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.
NASA Astrophysics Data System (ADS)
Zhao, L.; Boehmer, H.; Edrich, D.; Heidbrink, W.; McWilliams, R.; Zimmerman, D.; Leneman, D.
2003-10-01
To study fast-ion transport, a 3-cm diameter, 17 MHz, ~80 W, ~3 mA argon source launches ~500 eV ions in the LArge Plasma Device (LAPD). The beam is diagnosed with a gridded analyzer and, on a test stand at Irvine, laser-induced fluorescence (LIF). Neutral scattering is important near the source. The measured beam energy can be more than 100 eV larger than the accelerating voltage applied to the extraction grids. In LAPD the profile of the pulsed ion beam is measured at various axial locations between z=0.3-6.0 m from the source. When the beam velocity is parallel to the solenoidal field (0°), evidence of peristaltic focusing, beam attenuation, and radial scattering is observed. At an angle of 22° with respect to the field the beam follows the expected helical trajectory. Three meters axially from the source, strong attenuation and elongation of the beam in the direction of the gyro-angle are observed. The data are compared with classical Coulomb and neutral scattering theory.
Fast segmentation of stained nuclei in terabyte-scale, time resolved 3D microscopy image stacks.
Stegmaier, Johannes; Otte, Jens C; Kobitski, Andrei; Bartschat, Andreas; Garcia, Ariel; Nienhaus, G Ulrich; Strähle, Uwe; Mikut, Ralf
2014-01-01
Automated analysis of multi-dimensional microscopy images has become an integral part of modern research in life science. Most available algorithms that provide sufficient segmentation quality, however, are infeasible for a large amount of data due to their high complexity. In this contribution we present a fast parallelized segmentation method that is especially suited for the extraction of stained nuclei from microscopy images, e.g., of developing zebrafish embryos. The idea is to transform the input image based on gradient and normal directions in the proximity of detected seed points such that it can be handled by straightforward global thresholding like Otsu's method. We evaluate the quality of the obtained segmentation results on a set of real and simulated benchmark images in 2D and 3D and show the algorithm's superior performance compared to other state-of-the-art algorithms. We achieve an up to ten-fold decrease in processing times, allowing us to process large data sets while still providing reasonable segmentation results.
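The global thresholding step named in this abstract, Otsu's method, picks the threshold that maximizes the between-class variance of the intensity histogram. The sketch below implements only that step with NumPy; the gradient-based image transform and seed detection described by the authors are not reproduced, and the bimodal test data are synthetic assumptions.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the histogram bin edge that maximizes between-class variance.
    Values below the returned threshold form the darker class."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)                    # probability of class 0
    w1 = 1.0 - w0                              # probability of class 1
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)        # both classes non-empty
    between = np.zeros(bins)
    mu0 = cum_mean[valid] / w0[valid]
    mu1 = (total_mean - cum_mean[valid]) / w1[valid]
    between[valid] = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    split = int(np.argmax(between))            # class 0 = bins 0..split
    return edges[split + 1]


# Synthetic bimodal intensities: dark background and bright nuclei.
rng = np.random.default_rng(0)
dark = rng.normal(0.2, 0.03, 1000)
bright = rng.normal(0.8, 0.03, 1000)
threshold = otsu_threshold(np.concatenate([dark, bright]))
```

Because the method needs only a histogram and a few cumulative sums, it is cheap enough to apply to very large image stacks once the image has been transformed so that a single global threshold is adequate.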
NASA Astrophysics Data System (ADS)
Qiang, Ji
2017-10-01
A three-dimensional (3D) Poisson solver with longitudinal periodic and transverse open boundary conditions can have important applications in beam physics of particle accelerators. In this paper, we present a fast efficient method to solve the Poisson equation using a spectral finite-difference method. This method uses a computational domain that contains the charged particle beam only and has a computational complexity of O(Nu log Nmode), where Nu is the total number of unknowns and Nmode is the maximum number of longitudinal or azimuthal modes. This saves both computational time and memory compared with using an artificial boundary condition in a large extended computational domain. The new 3D Poisson solver is parallelized using a message passing interface (MPI) on multi-processor computers and shows a reasonable parallel performance up to hundreds of processor cores.
Fast data reconstructed method of Fourier transform imaging spectrometer based on multi-core CPU
NASA Astrophysics Data System (ADS)
Yu, Chunchao; Du, Debiao; Xia, Zongze; Song, Li; Zheng, Weijian; Yan, Min; Lei, Zhenggang
2017-10-01
An imaging spectrometer acquires a two-dimensional spatial image and a one-dimensional spectrum at the same time, which makes it highly useful in color and spectral measurement, true-color image synthesis, military reconnaissance, and similar applications. To achieve fast reconstruction of Fourier transform imaging spectrometer data, the paper designs an optimized reconstruction algorithm based on OpenMP parallel computing, which was then applied to data processing for the HyperSpectral Imager of the Chinese 'HJ-1' satellite. The results show that the method based on multi-core parallel computing makes effective use of multi-core CPU hardware resources and significantly improves the efficiency of spectrum reconstruction. If the technique is applied on a workstation with more cores, it will be possible to process Fourier transform imaging spectrometer data in real time on a single computer.
Potential Application of a Graphical Processing Unit to Parallel Computations in the NUBEAM Code
NASA Astrophysics Data System (ADS)
Payne, J.; McCune, D.; Prater, R.
2010-11-01
NUBEAM is a comprehensive computational Monte Carlo based model for neutral beam injection (NBI) in tokamaks. NUBEAM computes NBI-relevant profiles in tokamak plasmas by tracking the deposition and the slowing of fast ions. At the core of NUBEAM are vector calculations used to track fast ions. These calculations have recently been parallelized to run on MPI clusters. However, cost and interlink bandwidth limit the ability to fully parallelize NUBEAM on an MPI cluster. Recent implementation of double precision capabilities for Graphical Processing Units (GPUs) presents a cost effective and high performance alternative or complement to MPI computation. Commercially available graphics cards can achieve up to 672 GFLOPS double precision and can handle hundreds of thousands of threads. The ability to execute at least one thread per particle simultaneously could significantly reduce the execution time and the statistical noise of NUBEAM. Progress on implementation on a GPU will be presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, J.; Alpan, F. A.; Fischer, G.A.
2011-07-01
Traditional two-dimensional (2D)/one-dimensional (1D) SYNTHESIS methodology has been widely used to calculate fast neutron (>1.0 MeV) fluence exposure to reactor pressure vessel in the belt-line region. However, it is expected that this methodology cannot provide accurate fast neutron fluence calculation at elevations far above or below the active core region. A three-dimensional (3D) parallel discrete ordinates calculation for ex-vessel neutron dosimetry on a Westinghouse 4-Loop XL Pressurized Water Reactor has been done. It shows good agreement between the calculated results and measured results. Furthermore, the results show very different fast neutron flux values at some of the former plate locations and elevations above and below an active core than those calculated by a 2D/1D SYNTHESIS method. This indicates that for certain irregular reactor internal structures, where the fast neutron flux has a very strong local effect, it is required to use a 3D transport method to calculate accurate fast neutron exposure. (authors)
Wiens, Curtis N; Artz, Nathan S; Jang, Hyungseok; McMillan, Alan B; Reeder, Scott B
2017-06-01
To develop an externally calibrated parallel imaging technique for three-dimensional multispectral imaging (3D-MSI) in the presence of metallic implants. A fast, ultrashort echo time (UTE) calibration acquisition is proposed to enable externally calibrated parallel imaging techniques near metallic implants. The proposed calibration acquisition uses a broadband radiofrequency (RF) pulse to excite the off-resonance induced by the metallic implant, fully phase-encoded imaging to prevent in-plane distortions, and UTE to capture rapidly decaying signal. The performance of the externally calibrated parallel imaging reconstructions was assessed using phantoms and in vivo examples. Phantom and in vivo comparisons to self-calibrated parallel imaging acquisitions show that significant reductions in acquisition times can be achieved using externally calibrated parallel imaging with comparable image quality. Acquisition time reductions are particularly large for fully phase-encoded methods such as spectrally resolved fully phase-encoded three-dimensional (3D) fast spin-echo (SR-FPE), in which scan time reductions of up to 8 min were obtained. A fully phase-encoded acquisition with broadband excitation and UTE enabled externally calibrated parallel imaging for 3D-MSI, eliminating the need for repeated calibration regions at each frequency offset. Significant reductions in acquisition time can be achieved, particularly for fully phase-encoded methods like SR-FPE. Magn Reson Med 77:2303-2309, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
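One way to read the phrase "mapping the transitive reduction to a shortest-path problem" is sketched below. The assumption, not confirmed by the abstract, is that each edge carries a confidence in (0, 1], that its cost is the negative logarithm of that confidence, and that a direct edge is dropped when some indirect path has a lower total cost. The function names and the toy network are hypothetical and are not taken from the fastBMA code base.

```python
import heapq
import math


def shortest_costs(graph, source):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative costs."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue                              # stale heap entry
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist


def prune_indirect_edges(confidence):
    """Drop edge u->v when an indirect path from u to v is cheaper than the
    direct edge, using cost = -log(confidence) (assumed weighting)."""
    graph = {u: {v: -math.log(p) for v, p in nbrs.items()}
             for u, nbrs in confidence.items()}
    kept = {}
    for u, nbrs in graph.items():
        dist = shortest_costs(graph, u)
        # The direct edge survives only if it is itself a shortest path.
        kept[u] = {v: confidence[u][v] for v, cost in nbrs.items()
                   if dist[v] >= cost - 1e-12}
    return kept


# Hypothetical regulator network: A->C is weaker than the chain A->B->C.
network = {"A": {"B": 0.9, "C": 0.5}, "B": {"C": 0.9}}
pruned = prune_indirect_edges(network)
strong = prune_indirect_edges({"A": {"B": 0.9, "C": 0.85}, "B": {"C": 0.9}})
```

Under this weighting, multiplying confidences along a path corresponds to adding costs, so a single shortest-path search per source node is enough to decide which direct edges are redundant.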
Cold Electrons as the Drivers of Parallel, Electrostatic Waves in Asymmetric Reconnection
NASA Astrophysics Data System (ADS)
Holmes, J.; Ergun, R.; Newman, D. L.; Wilder, F. D.; Schwartz, S. J.; Goodrich, K.; Eriksson, S.; Torbert, R. B.; Russell, C. T.; Lindqvist, P. A.; Giles, B. L.; Pollock, C. J.; Le Contel, O.; Strangeway, R. J.; Burch, J. L.
2016-12-01
The Magnetospheric MultiScale mission (MMS) has observed several instances of asymmetric reconnection at Earth's magnetopause, where plasma from the magnetosheath encounters that of the magnetosphere. On Earth's dayside, the magnetosphere is often made up of a two-component distribution of cold (<< 10 eV) and hot (~1 keV) plasma, sometimes including the cold ion plume. Magnetosheath plasma is primarily warm (~100 eV) post-shock solar wind. Where they meet, magnetopause reconnection alters the magnetic topology such that these two populations are left cohabiting a field line and rapidly mix. There have been several events observed by MMS where the Fast Plasma Instrument (FPI) clearly shows cold ions near the diffusion region impinging upon the warm magnetosheath population. In many of these, we also see patches of strong electrostatic waves parallel to the magnetic field - a smoking gun for rapid mixing via nonlinear processes. Cold ions alone are too slow to create the same waves; solving for roots of a simplified dispersion relation shows the electron population damps out the ion modes. From this, we infer the presence of cold electrons; in one notable case found by Wilder et al. 2016 (in review), they have been observed directly by FPI. Vlasov simulations of plasma mixing for a number of these events closely reproduce the observed electric field signatures. We conclude from numerical analysis and direct MMS observations that cold plasma mixing, including cold electrons, is the primary driver of parallel electrostatic waves observed near the electron diffusion region in asymmetric magnetic reconnection.
Nonlinear Evolution of Short-wavelength Torsional Alfvén Waves
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shestov, S. V.; Nakariakov, V. M.; Ulyanov, A. S.
2017-05-10
We analyze nonlinear evolution of torsional Alfvén waves in a straight magnetic flux tube filled in with a low-β plasma, and surrounded with a plasma of lower density. Such magnetic tubes model, in particular, a segment of a coronal loop or a polar plume. The wavelength is taken comparable to the tube radius. We perform a numerical simulation of the wave propagation using ideal magnetohydrodynamics. We find that a torsional wave nonlinearly induces three kinds of compressive flows: the parallel flow at the Alfvén speed, which constitutes a bulk plasma motion along the magnetic field, the tube wave, and also transverse flows in the radial direction, associated with sausage fast magnetoacoustic modes. In addition, the nonlinear torsional wave steepens and its propagation speed increases. The latter effect leads to the progressive distortion of the torsional wave front, i.e., nonlinear phase mixing. Because of the intrinsic non-uniformity of the torsional wave amplitude across the tube radius, the nonlinear effects are more pronounced in regions with higher wave amplitudes. They are always absent at the axes of the flux tube. In the case of a linear radial profile of the wave amplitude, the nonlinear effects are localized in an annulus region near the tube boundary. Thus, the parallel compressive flows driven by torsional Alfvén waves in the solar and stellar coronae, are essentially non-uniform in the perpendicular direction. The presence of additional sinks for the wave energy reduces the efficiency of the nonlinear parallel cascade in torsional Alfvén waves.
Nonlinear Evolution of Short-wavelength Torsional Alfvén Waves
NASA Astrophysics Data System (ADS)
Shestov, S. V.; Nakariakov, V. M.; Ulyanov, A. S.; Reva, A. A.; Kuzin, S. V.
2017-05-01
We analyze nonlinear evolution of torsional Alfvén waves in a straight magnetic flux tube filled in with a low-β plasma, and surrounded with a plasma of lower density. Such magnetic tubes model, in particular, a segment of a coronal loop or a polar plume. The wavelength is taken comparable to the tube radius. We perform a numerical simulation of the wave propagation using ideal magnetohydrodynamics. We find that a torsional wave nonlinearly induces three kinds of compressive flows: the parallel flow at the Alfvén speed, which constitutes a bulk plasma motion along the magnetic field, the tube wave, and also transverse flows in the radial direction, associated with sausage fast magnetoacoustic modes. In addition, the nonlinear torsional wave steepens and its propagation speed increases. The latter effect leads to the progressive distortion of the torsional wave front, i.e., nonlinear phase mixing. Because of the intrinsic non-uniformity of the torsional wave amplitude across the tube radius, the nonlinear effects are more pronounced in regions with higher wave amplitudes. They are always absent at the axes of the flux tube. In the case of a linear radial profile of the wave amplitude, the nonlinear effects are localized in an annulus region near the tube boundary. Thus, the parallel compressive flows driven by torsional Alfvén waves in the solar and stellar coronae, are essentially non-uniform in the perpendicular direction. The presence of additional sinks for the wave energy reduces the efficiency of the nonlinear parallel cascade in torsional Alfvén waves.
Anisotropic tomography of the Atlantic ocean
NASA Astrophysics Data System (ADS)
Silveira, G.; Stutzmann, E.
2003-04-01
We present a regional three-dimensional model of the Atlantic Ocean with anisotropy. The model, derived from Rayleigh and Love phase velocity measurements, is defined from the Moho down to 300 km depth with a lateral resolution of about 500 km and is presented in terms of average isotropic S-wave velocity, azimuthal anisotropy and transverse isotropy. The cratons beneath North America, Brazil and Africa are clearly associated with fast S-wave velocity anomalies. The Mid Atlantic Ridge is a shallow structure in the North Atlantic corresponding to a negative velocity anomaly down to about 150 km depth. In contrast, the ridge negative signature is visible in the South Atlantic down to the deepest depth inverted, that is 300 km depth. This difference is probably related to the presence of hot-spots along or close to the ridge axis in the South Atlantic and may indicate a different mechanism for the ridge between the North and South Atlantic. Negative velocity anomalies are clearly associated with hot-spots from the surface down to at least 300 km depth; they are much broader than the supposed size of the hot-spots and seem to be connected along a North-South direction. Down to 100 km depth, a fast S-wave velocity anomaly extends from Africa into the Atlantic Ocean within the zone defined as the Africa superswell area. This result indicates that the hot material rising from below does not reach the surface in this area but may be pushing the lithosphere upward. In most parts of the Atlantic, the azimuthal anisotropy directions remain stable with increasing depth. Close to the ridge, the fast S-wave velocity direction is roughly parallel to the sea floor spreading direction. The hot-spot anisotropy signature is striking beneath Bermuda, Cape Verde and Fernando Noronha islands where the fast S-wave velocity direction seems to diverge radially from the hot-spots.
The Atlantic average radial anisotropy is similar to that of the PREM model, that is positive down to about 220 km, but with slightly smaller amplitude and null deeper. Cratons have a lower than average radial anisotropy. As for the velocities, there is a difference between North and South Atlantic. Most hot-spots and the South Atlantic ridge are associated with positive radial anisotropy perturbation whereas the North Atlantic ridge corresponds to negative radial anisotropy perturbation.
Zhu, Xiang; Zhang, Dianwen
2013-01-01
We present a fast, accurate and robust parallel Levenberg-Marquardt minimization optimizer, GPU-LMFit, which is implemented on a graphics processing unit for high-performance, scalable parallel model fitting. GPU-LMFit can provide a dramatic speed-up in massive model fitting analyses to enable real-time automated pixel-wise parametric imaging microscopy. We demonstrate the performance of GPU-LMFit in applications to superresolution localization microscopy and fluorescence lifetime imaging microscopy. PMID:24130785
A distributed parallel storage architecture and its potential application within EOSDIS
NASA Technical Reports Server (NTRS)
Johnston, William E.; Tierney, Brian; Feuquay, Jay; Butzer, Tony
1994-01-01
We describe the architecture, implementation, and use of a scalable, high performance, distributed-parallel data storage system developed in the ARPA funded MAGIC gigabit testbed. A collection of wide area distributed disk servers operate in parallel to provide logical block level access to large data sets. Operated primarily as a network-based cache, the architecture supports cooperation among independently owned resources to provide fast, large-scale, on-demand storage to support data handling, simulation, and computation.
Engine-start Control Strategy of P2 Parallel Hybrid Electric Vehicle
NASA Astrophysics Data System (ADS)
Xiangyang, Xu; Siqi, Zhao; Peng, Dong
2017-12-01
A smooth and fast engine-start process is important to parallel hybrid electric vehicles with an electric motor mounted in front of the transmission. However, there are some challenges during the engine-start control. Firstly, the electric motor must simultaneously provide a stable driving torque to ensure the drivability and a compensative torque to drag the engine before ignition. Secondly, engine-start time is a trade-off control objective because both fast start and smooth start have to be considered. To solve these problems, this paper first analyzed the resistance of the engine start process, and established a physical model in MATLAB/Simulink. Then a model-based coordinated control strategy among engine, motor and clutch was developed. Two basic control strategies for the fast start and smooth start processes were studied. Simulation results showed that the control objectives were realized by applying the given control strategies, which can meet different requirements from the driver.
On the inversion of geodetic integrals defined over the sphere using 1-D FFT
NASA Astrophysics Data System (ADS)
García, R. V.; Alejo, C. A.
2005-08-01
An iterative method is presented which performs inversion of integrals defined over the sphere. The method is based on one-dimensional fast Fourier transform (1-D FFT) inversion and is implemented with the projected Landweber technique, which is used to solve constrained least-squares problems reducing the associated 1-D cyclic-convolution error. The results obtained are as precise as the direct matrix inversion approach, but with better computational efficiency. A case study uses the inversion of Hotine’s integral to obtain gravity disturbances from geoid undulations. Numerical convergence is also analyzed and comparisons with respect to the direct matrix inversion method using conjugate gradient (CG) iteration are presented. Like the CG method, the number of iterations needed to get the optimum (i.e., small) error decreases as the measurement noise increases. Nevertheless, for discrete data given over a whole parallel band, the method can be applied directly without implementing the projected Landweber method, since no cyclic convolution error exists.
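The projected Landweber iteration named in this abstract can be sketched in its generic form, x_{k+1} = P(x_k + tau * A^T (b - A x_k)), where P projects onto the constraint set. The sketch below is only an illustration under stated assumptions: it uses a small dense matrix rather than the 1-D FFT convolution of the paper, a non-negativity constraint chosen for the example, and a step size tau below 2 divided by the largest squared singular value of A. All names and numbers are hypothetical.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def clip_nonnegative(x):
    """Projection onto the set x >= 0 (the constraint assumed for this example)."""
    return [max(0.0, xi) for xi in x]


def projected_landweber(a, b, project, step, iterations):
    """Iterate x <- project(x + step * A^T (b - A x)), starting from zero."""
    at = transpose(a)
    x = [0.0] * len(a[0])
    for _ in range(iterations):
        residual = [bi - yi for bi, yi in zip(b, matvec(a, x))]
        correction = matvec(at, residual)
        x = project([xi + step * ci for xi, ci in zip(x, correction)])
    return x


# Toy problem: the unconstrained least-squares solution is [1, -1];
# with x >= 0 enforced, the constrained solution is [1, 0].
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, -1.0]
solution = projected_landweber(A, b, clip_nonnegative, step=0.4, iterations=100)
```

The step of 0.4 satisfies the convergence bound here because the largest squared singular value of A is 4, so the limit is 2 / 4 = 0.5.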
Adaptive multiple super fast simulated annealing for stochastic microstructure reconstruction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ryu, Seun; Lin, Guang; Sun, Xin
2013-01-01
Fast image reconstruction from statistical information is critical in image fusion from multimodality chemical imaging instrumentation to create high-resolution images with large domains. Stochastic methods have been widely used for image reconstruction from two-point correlation functions. The main challenge is to increase the efficiency of reconstruction. A novel simulated annealing method is proposed for fast solution of image reconstruction. Combining the advantages of very fast cooling schedules, dynamic adaptation and parallelization, the new simulated annealing algorithm increases the efficiency by several orders of magnitude, making large-domain image fusion feasible.
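A minimal sketch of annealing-based reconstruction from a two-point correlation function follows. It is not the authors' algorithm: it assumes a periodic one-dimensional binary image, pixel-swap moves that preserve the number of filled pixels, a plain geometric cooling schedule in place of the very fast schedule and dynamic adaptation described above, and no parallelization. All names and parameter values are hypothetical.

```python
import math
import random


def two_point_correlation(pixels, max_lag):
    """S2(r): fraction of positions i where pixel i and pixel i+r (periodic) are both 1."""
    n = len(pixels)
    return [sum(pixels[i] * pixels[(i + r) % n] for i in range(n)) / n
            for r in range(max_lag + 1)]


def energy(pixels, target_s2):
    """Squared mismatch between the image's S2 and the target S2."""
    s2 = two_point_correlation(pixels, len(target_s2) - 1)
    return sum((a - b) ** 2 for a, b in zip(s2, target_s2))


def anneal(target_s2, n, n_ones, steps=2000, t0=1e-2, decay=0.995, seed=0):
    """Return (best_state, best_energy, initial_energy) after swap-move annealing."""
    rng = random.Random(seed)
    state = [1] * n_ones + [0] * (n - n_ones)
    rng.shuffle(state)
    current = energy(state, target_s2)
    initial = current
    best_state, best = state[:], current
    temperature = t0
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if state[i] == state[j]:
            continue  # swapping equal pixels changes nothing
        state[i], state[j] = state[j], state[i]
        candidate = energy(state, target_s2)
        delta = candidate - current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate  # accept the swap (Metropolis rule)
            if current < best:
                best_state, best = state[:], current
        else:
            state[i], state[j] = state[j], state[i]  # reject: undo the swap
        temperature *= decay  # geometric cooling after each attempted swap
    return best_state, best, initial


# Hypothetical reference image: alternating pairs of filled and empty pixels.
reference = [1, 1, 0, 0] * 8
target = two_point_correlation(reference, 8)
best_state, best_energy, initial_energy = anneal(target, n=32, n_ones=16)
```

Swap moves keep the volume fraction fixed, so only the spatial arrangement is optimized; recomputing the full correlation after every swap is the costly step that faster schedules and parallel evaluation aim to reduce.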
Glover, William A; Atienza, Ederlyn E; Nesbitt, Shannon; Kim, Woo J; Castor, Jared; Cook, Linda; Jerome, Keith R
2016-01-01
Quantitative DNA detection of cytomegalovirus (CMV) and BK virus (BKV) is critical in the management of transplant patients. Quantitative laboratory-developed procedures for CMV and BKV have been described in which much of the processing is automated, resulting in rapid, reproducible, and high-throughput testing of transplant patients. To increase the efficiency of such assays, the performance and stability of four commercial preassembled frozen fast qPCR master mixes (Roche FastStart Universal Probe Master Mix with Rox, Bio-Rad SsoFast Probes Supermix with Rox, Life Technologies TaqMan FastAdvanced Master Mix, and Life Technologies Fast Universal PCR Master Mix), in combination with in-house designed primers and probes, were evaluated using controls and standards from standard CMV and BK assays. A subsequent parallel evaluation using patient samples was performed comparing the performance of freshly prepared assay mixes versus aliquoted frozen master mixes made with two of the fast qPCR mixes (Life Technologies TaqMan FastAdvanced Master Mix, and Bio-Rad SsoFast Probes Supermix with Rox), chosen based on their performance and compatibility with existing PCR cycling conditions. The data demonstrate that the frozen master mixes retain excellent performance over a period of at least 10 weeks. During the parallel testing using clinical specimens, no difference in quantitative results was observed between the preassembled frozen master mixes and freshly prepared master mixes. Preassembled fast real-time qPCR frozen master mixes perform well and represent an additional strategy laboratories can implement to reduce assay preparation times, and to minimize technical errors and effort necessary to perform clinical PCR. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Xue, M.; Li, L.; Chen, L.
2016-12-01
The South China Sea (SCS) is located on the continental margin of the Eurasian plate, where different geological blocks and tectonic plates interact. The dynamic mechanism of the formation of the SCS has been debated for decades. In this study, we first synthesize our geophysical results obtained in the South China Sea, including an updated 3D velocity model from surface tomography using surrounding land stations and regional earthquakes, and shear wave splitting results obtained at surrounding land stations and OBS, using local, regional, and teleseismic earthquakes. The observed splitting results in the South China Sea are complex: the fast polarization direction beneath the central basin is approximately NE-SW, nearly parallel to the extinct ridge in the central basin of the SCS; however, the fast axis within the slab is trench-parallel outside the ridge subduction region. In 3D velocity models, subducting slabs are observed as dipping high velocity anomalies, and discontinuous low velocities are observed above the subducting slab, as well as in the basin. How are the splitting observations connected with the velocity models? How are the observations linked to one another? How are the observations in the central basin linked with the surrounding region? We aim to link these observations with one another as well as with newly published results from geophysics, geochemistry, and geology in this region. Such a synthesis will improve our understanding of the evolution of the South China Sea and facilitate new ideas.
Fast, Parallel and Secure Cryptography Algorithm Using Lorenz's Attractor
NASA Astrophysics Data System (ADS)
Marco, Anderson Gonçalves; Martinez, Alexandre Souto; Bruno, Odemir Martinez
A novel cryptography method based on the Lorenz's attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow and the other, parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented, discussed and show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher.
Components of action potential repolarization in cerebellar parallel fibres.
Pekala, Dobromila; Baginskas, Armantas; Szkudlarek, Hanna J; Raastad, Morten
2014-11-15
Repolarization of the presynaptic action potential is essential for transmitter release, excitability and energy expenditure. Little is known about repolarization in thin, unmyelinated axons forming en passant synapses, which represent the most common type of axons in the mammalian brain's grey matter. We used rat cerebellar parallel fibres, an example of typical grey matter axons, to investigate the effects of K(+) channel blockers on repolarization. We show that repolarization is composed of a fast tetraethylammonium (TEA)-sensitive component, determining the width and amplitude of the spike, and a slow margatoxin (MgTX)-sensitive depolarized after-potential (DAP). These two components could be recorded at the granule cell soma as antidromic action potentials and from the axons with a newly developed miniaturized grease-gap method. A considerable proportion of fast repolarization remained in the presence of TEA, MgTX, or both. This residual was abolished by the addition of quinine. The importance of proper control of fast repolarization was demonstrated by somatic recordings of antidromic action potentials. In these experiments, the relatively broad K(+) channel blocker 4-aminopyridine reduced the fast repolarization, resulting in bursts of action potentials forming on top of the DAP. We conclude that repolarization of the action potential in parallel fibres is supported by at least three groups of K(+) channels. Differences in their temporal profiles allow relatively independent control of the spike and the DAP, whereas overlap of their temporal profiles provides robust control of axonal bursting properties.
Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data.
Li, Wenyuan; Gong, Ke; Li, Qingjiao; Alber, Frank; Zhou, Xianghong Jasmine
2015-03-15
Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools to study spatial genome organization. Removing biases from chromatin contact matrices generated by such techniques is a critical preprocessing step of subsequent analyses. The continuing decline of sequencing costs has led to an ever-improving resolution of the Hi-C data, resulting in very large matrices of chromatin contacts. Such large matrices, however, pose a great challenge to the memory usage and speed of their normalization. Therefore, there is an urgent need for fast and memory-efficient methods for normalization of Hi-C data. We developed Hi-Corrector, an easy-to-use, open source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability: the software is capable of normalizing Hi-C data of any size in reasonable times; (ii) memory efficiency: the sequential version can run on any single computer with very limited memory, no matter how little; (iii) fast speed: the parallel version can run very fast on multiple computing nodes with limited local memory. The sequential version is implemented in ANSI C and can be easily compiled on any system; the parallel version is implemented in ANSI C with the MPI library (a standardized and portable parallel environment designed for solving large-scale scientific problems). The package is freely available at http://zhoulab.usc.edu/Hi-Corrector/. © The Author 2014. Published by Oxford University Press.
Massively Parallel Processing for Fast and Accurate Stamping Simulations
NASA Astrophysics Data System (ADS)
Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu
2005-08-01
The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are the two most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to support the fast VDP and tooling readiness. Since 1997, General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.
Automatic Multilevel Parallelization Using OpenMP
NASA Technical Reports Server (NTRS)
Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)
2002-01-01
In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.
Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor
NASA Technical Reports Server (NTRS)
Tong, Charles; Swarztrauber, Paul N.
1989-01-01
Design alternatives for ordered fast Fourier transform (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end, the ordering and computational phases of the FFT were combined, and sequence-to-processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely standard-order and A-order, which can be implemented with equal ease on the Connection Machine, where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions, which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.
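The transmission counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below simply evaluates the two stated formulas and their difference; the function names and the sample values of d and r are hypothetical and are not taken from the paper.

```python
def standard_order_transmissions(d, r):
    """Parallel transmissions for a standard-order FFT: d + r/2 + 1."""
    return d + r / 2 + 1


def a_order_transmissions(d, r):
    """Parallel transmissions for an A-order FFT: 2d - r/2."""
    return 2 * d - r / 2


def savings(d, r):
    """Transmissions saved by A-order; algebraically equal to r - d + 1."""
    return standard_order_transmissions(d, r) - a_order_transmissions(d, r)


# Illustrative sizes only: N = 2**20 elements on P = 2**16 processors.
d, r = 16, 20
standard = standard_order_transmissions(d, r)  # 16 + 10 + 1 = 27.0
a_order = a_order_transmissions(d, r)          # 32 - 10 = 22.0
saved = savings(d, r)                          # 27 - 22 = 5.0 = r - d + 1
```

Subtracting the two expressions gives (d + r/2 + 1) - (2d - r/2) = r - d + 1, which agrees with the saving stated in the abstract.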
Parallel processing in the honeybee olfactory pathway: structure, function, and evolution.
Rössler, Wolfgang; Brill, Martin F
2013-11-01
Animals face highly complex and dynamic olfactory stimuli in their natural environments, which require fast and reliable olfactory processing. Parallel processing is a common principle of sensory systems supporting this task, for example in visual and auditory systems, but its role in olfaction remained unclear. Studies in the honeybee focused on a dual olfactory pathway. Two sets of projection neurons connect glomeruli in two antennal-lobe hemilobes via lateral and medial tracts in opposite sequence with the mushroom bodies and lateral horn. Comparative studies suggest that this dual-tract circuit represents a unique adaptation in Hymenoptera. Imaging studies indicate that glomeruli in both hemilobes receive redundant sensory input. Recent simultaneous multi-unit recordings from projection neurons of both tracts revealed widely overlapping response profiles strongly indicating parallel olfactory processing. Whereas lateral-tract neurons respond fast with broad (generalistic) profiles, medial-tract neurons are odorant specific and respond slower. In analogy to "what-" and "where" subsystems in visual pathways, this suggests two parallel olfactory subsystems providing "what-" (quality) and "when" (temporal) information. Temporal response properties may support across-tract coincidence coding in higher centers. Parallel olfactory processing likely enhances perception of complex odorant mixtures to decode the diverse and dynamic olfactory world of a social insect.
NASA Astrophysics Data System (ADS)
Varga, Robert J.; Horst, Andrew J.; Gee, Jeffrey S.; Karson, Jeffrey A.
2008-08-01
Rare, fault-bounded escarpments expose natural cross sections of ocean crust in several areas and provide an unparalleled opportunity to study the end products of tectonic and magmatic processes that operated at depth beneath oceanic spreading centers. We mapped the geologic structure of ocean crust produced at the East Pacific Rise (EPR) and now exposed along steep cliffs of the Pito Deep Rift near the northern edge of the Easter microplate. The upper oceanic crust in this area is typified by basaltic lavas underlain by a sheeted dike complex comprising northeast striking, moderately to steeply southeast dipping dikes. Paleomagnetic remanence of oriented blocks of dikes collected with both Alvin and Jason II indicates clockwise rotation of ~61° related to rotation of the microplate, indicating structural coupling between the microplate and crust of the Nazca Plate to the north. The consistent southeast dip of dikes formed as the result of tilting at the EPR shortly after their injection. Anisotropy of magnetic susceptibility of dikes provides well-defined magmatic flow directions that are dominantly dike-parallel and shallowly plunging. Corrected to their original EPR orientation, magma flow is interpreted as near-horizontal and parallel to the ridge axis. These data provide the first direct evidence from sheeted dikes in ocean crust for along-axis magma transport. These results also suggest that lateral transport in dikes is important even at fast spreading ridges where a laterally continuous subaxial magma chamber is present.
Turbomachinery CFD on parallel computers
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Milner, Edward J.; Quealy, Angela; Townsend, Scott E.
1992-01-01
The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. Particularly, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts are focused on implementing the average-passage turbomachinery model on MIMD, distributed memory parallel computers. Performance results are given for inviscid, single blade row and viscous, multistage applications on several parallel computers, including networked workstations.
Shear wave splitting and shear wave splitting tomography of the southern Puna plateau
NASA Astrophysics Data System (ADS)
Calixto, Frank J.; Robinson, Danielle; Sandvol, Eric; Kay, Suzanne; Abt, David; Fischer, Karen; Heit, Ben; Yuan, Xiaohui; Comte, Diana; Alvarado, Patricia
2014-11-01
We have investigated the seismic anisotropy beneath the Central Andean southern Puna plateau by applying shear wave splitting analysis and shear wave splitting tomography to local S waves and teleseismic SKS, SKKS and PKS phases. Overall, a very complex pattern of fast directions throughout the southern Puna plateau region and a circular pattern of fast directions around the region of the giant Cerro Galan ignimbrite complex are observed. In general, teleseismic lag times are much greater than those for local events which are interpreted to reflect a significant amount of sub and inner slab anisotropy. The complex pattern observed from shear wave splitting analysis alone is the result of a complex 3-D anisotropic structure under the southern Puna plateau. Our application of shear wave splitting tomography provides a 3-D model of anisotropy in the southern Puna plateau that shows different patterns depending on the driving mechanism of upper-mantle flow and seismic anisotropy. The trench parallel a-axes in the continental lithosphere above the slab east of 68W may be related to deformation of the overriding continental lithosphere since it is under compressive stresses which are orthogonal to the trench. The more complex pattern below the Cerro Galan ignimbrite complex and above the slab is interpreted to reflect delamination of continental lithosphere and upwelling of hot asthenosphere. The a-axes beneath the Cerro Galan, Cerro Blanco and Carachi Pampa volcanic centres at 100 km depth show some weak evidence for vertically orientated fast directions, which could be due to vertical asthenospheric flow around a delaminated block. Additionally, our splitting tomographic model shows that there is a significant amount of seismic anisotropy beneath the slab. The subslab mantle west of 68W shows roughly trench parallel horizontal a-axes that are probably driven by slab roll back and the relatively small coupling between the Nazca slab and the underlying mantle. 
In contrast, the subslab region (i.e. depths greater than 200 km) east of 68W shows a circular pattern of a-axes centred on a region with small strength of anisotropy (Cerro Galan and its eastern edge) which suggest the dominant mechanism is a combination of slab roll back and flow driven by an overlying abnormally heated slab or possibly a slab gap. There seems to be some evidence for vertical flow below the slab at depths of 200-400 km driven by the abnormally heated slab or slab gap. This cannot be resolved by the tomographic inversion due to the lack of ray crossings in the subslab mantle.
2015-06-01
cient parallel code for applying the operator. Our method constructs a polynomial preconditioner using a nonlinear least squares (NLLS) algorithm. We show...apply the underlying operator. Such a preconditioner can be very attractive in scenarios where one has a highly efficient parallel code for applying...repeatedly solve a large system of linear equations where one has an extremely fast parallel code for applying an underlying fixed linear operator
Density-based parallel skin lesion border detection with webCL
2015-01-01
Background Dermoscopy is a highly effective and noninvasive imaging technique used in the diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand-drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. Methods A fast density-based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client-side computing capabilities to use available hardware resources such as multiple cores and GPUs. The developed WebCL-parallel density-based skin lesion border detection method runs efficiently from internet browsers. Results Previous research indicates that one of the highest accuracy rates can be achieved using density-based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect can be mitigated when they are implemented in parallel. In this study, a density-based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on heterogeneous platforms (e.g. tablets, smartphones, multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in clinical settings. For this, we used WebCL, an emerging technology that enables an HTML5 Web browser to execute code in parallel on heterogeneous platforms. We describe WebCL and our parallel algorithm design.
In addition, we tested the parallel code on 100 dermoscopy images and showed the execution speedups with respect to the serial version. Results indicate that the parallel (WebCL) version and the serial version of the density-based lesion border detection method generate the same accuracy rates for 100 dermoscopy images, with a mean border error of 6.94%, a mean recall of 76.66%, and a mean precision of 99.29%. Moreover, the WebCL version's speedup factor for lesion border detection on the 100 dermoscopy images averages around 491.2. Conclusions Given the large number of high-resolution dermoscopy images in a usual clinical setting, along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing of dermoscopy images becomes obvious. In this paper, we introduce WebCL and its use for biomedical image processing applications. WebCL is a JavaScript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, the WebCL-parallel version of density-based skin lesion border detection introduced in this study can supplement expert dermatologists and aid them in the early diagnosis of skin lesions. While WebCL is currently an emerging technology, a full adoption of WebCL into the HTML5 standard would allow this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources, including multiple cores and GPUs on a local machine, and allows compiled code to run directly from the Web browser. PMID:26423836
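The accuracy figures quoted in this abstract (border error, recall, precision) are pixel-wise comparisons between a detected lesion mask and a reference mask. The abstract does not define them, so the following is only a minimal sketch under assumed definitions: precision and recall in the usual sense, and border error as the mismatched (XOR) area divided by the reference area. The function name `border_metrics` and the toy masks are hypothetical and not taken from the paper, and the paper's own border error may be normalized differently.

```python
import numpy as np

def border_metrics(predicted, truth):
    """Pixel-wise precision, recall and XOR border error for two binary masks.

    Assumed definitions (not taken from the abstract):
      precision    = TP / (TP + FP)
      recall       = TP / (TP + FN)
      border_error = (FP + FN) / (TP + FN), i.e. XOR area over reference area
    """
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(predicted & truth)    # lesion pixels found
    fp = np.count_nonzero(predicted & ~truth)   # background marked as lesion
    fn = np.count_nonzero(~predicted & truth)   # lesion pixels missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    border_error = (fp + fn) / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, border_error

# Toy example: the detected region lies inside the reference lesion,
# so precision is perfect while recall is not.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:4] = True          # 6 reference lesion pixels
predicted = np.zeros((4, 4), dtype=bool)
predicted[1:3, 1:3] = True      # 4 detected pixels, all inside the lesion

precision, recall, border_error = border_metrics(predicted, truth)
print(precision, recall, border_error)
```

A detection that stays inside the true lesion gives high precision with lower recall, which is the same qualitative pattern as the reported means.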
Density-based parallel skin lesion border detection with webCL.
Lemon, James; Kockara, Sinan; Halic, Tansel; Mete, Mutlu
2015-01-01
Dermoscopy is a highly effective and noninvasive imaging technique used in the diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand-drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. A fast density-based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client-side computing capabilities to use available hardware resources such as multiple cores and GPUs. The developed WebCL-parallel density-based skin lesion border detection method runs efficiently from internet browsers. Previous research indicates that one of the highest accuracy rates can be achieved using density-based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect can be mitigated when they are implemented in parallel. In this study, a density-based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on heterogeneous platforms (e.g. tablets, smartphones, multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in clinical settings. For this, we used WebCL, an emerging technology that enables an HTML5 Web browser to execute code in parallel on heterogeneous platforms. We describe WebCL and our parallel algorithm design.
In addition, we tested the parallel code on 100 dermoscopy images and showed the execution speedups with respect to the serial version. Results indicate that the parallel (WebCL) version and the serial version of the density-based lesion border detection method generate the same accuracy rates for 100 dermoscopy images, with a mean border error of 6.94%, a mean recall of 76.66%, and a mean precision of 99.29%. Moreover, the WebCL version's speedup factor for lesion border detection on the 100 dermoscopy images averages around 491.2. Given the large number of high-resolution dermoscopy images in a usual clinical setting, along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing of dermoscopy images becomes obvious. In this paper, we introduce WebCL and its use for biomedical image processing applications. WebCL is a JavaScript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, the WebCL-parallel version of density-based skin lesion border detection introduced in this study can supplement expert dermatologists and aid them in the early diagnosis of skin lesions. While WebCL is currently an emerging technology, a full adoption of WebCL into the HTML5 standard would allow this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources, including multiple cores and GPUs on a local machine, and allows compiled code to run directly from the Web browser.
NASA Astrophysics Data System (ADS)
Margheriti, L.; Ferulano, M. F.; Di Bona, M.
2006-11-01
Shear wave splitting is measured at 14 seismic stations in the Reggio Emilia region above local background seismicity and two sequences of seismic events. The good quality of the waveforms together with the favourable distribution of earthquake foci allows us to place strong constraints on the geometry and the depth of the anisotropic volume. It is about 60 km² wide and located between 6 and 11 km depth, inside Mesozoic age carbonate rocks. The splitting results also suggest the presence of a shallower anisotropic layer about 1 km thick and a few km wide in the Pliocene-Quaternary alluvium above the Mesozoic layer. The fast polarization directions (N30°E) are approximately parallel to the maximum horizontal stress (σ1 is SSW-NNE) in the region and also parallel to the strike of the main structural features in the Reggio Emilia area. The size of the delay times suggests about 4.5 per cent shear wave velocity anisotropy. These parameters agree with an interpretation of seismic anisotropy in terms of the extensive-dilatancy anisotropy model which considers the rock volume to be pervaded by fluid-saturated microcracks aligned by the active stress field. We cannot completely rule out the contribution of aligned macroscopic fractures as the cause of the shear wave anisotropy even if the parallel shear wave polarizations we found are diagnostic of transverse isotropy with a horizontal axis of symmetry. This symmetry is commonly explained by parallel stress-aligned microcracks.
GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit
Pronk, Sander; Páll, Szilárd; Schulz, Roland; Larsson, Per; Bjelkmar, Pär; Apostolov, Rossen; Shirts, Michael R.; Smith, Jeremy C.; Kasson, Peter M.; van der Spoel, David; Hess, Berk; Lindahl, Erik
2013-01-01
Motivation: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on massive scale in clusters, web servers, distributed computing or cloud resources. Results: Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations. Availability: GROMACS is an open source and free software available from http://www.gromacs.org. Contact: erik.lindahl@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23407358
[Metabolic study of the initial period of fasting in the king penguin chick].
Cherel, Y; Le Maho, Y
1985-01-01
There is an 80% decrease in the specific daily change in body mass (dm/m dt) during the first 5-6 days of fasting in king penguin chicks, which characterizes period I of fasting. Parallel decreases in plasma alanine and uric acid concentrations suggest an important reduction in protein degradation. Plasma concentrations of beta-hydroxybutyrate and glucose are high, respectively 1.3 and 12.5 mmol·L⁻¹, and do not change significantly.
NASA Astrophysics Data System (ADS)
Reerink, Thomas J.; van de Berg, Willem Jan; van de Wal, Roderik S. W.
2016-11-01
This paper accompanies the second OBLIMAP open-source release. The package is developed to map climate fields between a general circulation model (GCM) and an ice sheet model (ISM) in both directions by using optimal aligned oblique projections, which minimize distortions. The curvature of the surfaces of the GCM and ISM grid differ, both grids may be irregularly spaced and the ratio of the grids is allowed to differ largely. OBLIMAP's stand-alone version is able to map data sets that differ in various aspects on the same ISM grid. Each grid may either coincide with the surface of a sphere, an ellipsoid or a flat plane, while the grid types might differ. Re-projection of, for example, ISM data sets is also facilitated. This is demonstrated by relevant applications concerning the major ice caps. As the stand-alone version also applies to the reverse mapping direction, it can be used as an offline coupler. Furthermore, OBLIMAP 2.0 is an embeddable GCM-ISM coupler, suited for high-frequency online coupled experiments. A new fast scan method is presented for structured grids as an alternative for the former time-consuming grid search strategy, realising a performance gain of several orders of magnitude and enabling the mapping of high-resolution data sets with a much larger number of grid nodes. Further, a highly flexible masked mapping option is added. The limitation of the fast scan method with respect to unstructured and adaptive grids is discussed together with a possible future parallel Message Passing Interface (MPI) implementation.
Plagioclase-dominated Seismic Anisotropy in the Basin and Range Lower Crust
NASA Astrophysics Data System (ADS)
Bernard, R. E.; Behr, W. M.
2017-12-01
Observations of seismic anisotropy have the ability to provide important information on deformation and structures within the lithosphere. While the mechanisms controlling seismic anisotropy in the upper mantle are fairly well understood (i.e., olivine "lattice preferred orientation" or LPO), less is known about the minerals and structures controlling regional lower crustal anisotropy. We use lower crustal xenoliths from young cinder cones in the eastern Mojave/western Basin and Range to investigate mineral LPOs and their effect on seismic anisotropy. Lower crustal gabbros were collected from two areas roughly 80 km apart — the Cima and Deadman Lake Volcanic Fields. Lower crustal fabrics measured using EBSD are dominated by LPOs in plagioclase associated with both plastic deformation and magmatic flow. In all fabric types, plagioclase LPOs produce seismic fast axes oriented perpendicular to the foliation plane. This is in contrast to mantle peridotite xenoliths from the same locations, which preserve olivine LPOs with fast axes aligned parallel to the foliation plane. The orthogonal orientations of mantle and lower crustal fast axes relative to foliation implies that even where fabric development in both layers is coeval and kinematically compatible, their measured anisotropies can be perpendicular to each other, therefore appearing anti-correlated when measured seismically. Furthermore, our observation of plagioclase-dominated LPO and negligible concentrations of mica is at odds with the common assumption that lower crustal anisotropy is dominated by micaceous minerals, whose slow axes reliably align parallel to lineation or flow. In contrast, our data show that for plagioclase, fast axes align perpendicular to flow and the slow axes are variably aligned within the foliation plane. 
Therefore, for a crustal section dominated by plagioclase LPO with assumed horizontal foliation, there would be a vertical rather than a horizontal axis of symmetry, resulting in a lack of azimuthal anisotropy and minimal shear wave splitting for vertically propagating waves. Crustal seismic studies in this type of setting may only be able to identify crustal flow planes, but not flow directions. These findings may be generally applicable to regions of significant mafic volcanism and lower crustal magmatic underplating.
Anti-oxidative cellular protection effect of fasting-induced autophagy as a mechanism for hormesis.
Moore, Michael N; Shaw, Jennifer P; Ferrar Adams, Dawn R; Viarengo, Aldo
2015-06-01
The aim of this investigation was to test the hypothesis that fasting-induced augmented lysosomal autophagic turnover of cellular proteins and organelles will reduce potentially harmful lipofuscin (age-pigment) formation in cells by more effectively removing oxidatively damaged proteins. An animal model (marine snail--common periwinkle, Littorina littorea) was used to experimentally test this hypothesis. Snails were deprived of algal food for 7 days to induce an augmented autophagic response in their hepatopancreatic digestive cells (hepatocyte analogues). This treatment resulted in a 25% reduction in the cellular content of lipofuscin in the digestive cells of the fasting animals in comparison with snails fed ad libitum on green alga (Ulva lactuca). Similar findings have previously been observed in the digestive cells of marine mussels subjected to copper-induced oxidative stress. Additional measurements showed that fasting significantly increased cellular health based on lysosomal membrane stability, and reduced lipid peroxidation and lysosomal/cellular triglyceride. These findings support the hypothesis that fasting-induced augmented autophagic turnover of cellular proteins has an anti-oxidative cytoprotective effect by more effectively removing damaged proteins, resulting in a reduction in the formation of potentially harmful proteinaceous aggregates such as lipofuscin. The inference from this study is that autophagy is important in mediating hormesis. An increase was demonstrated in physiological complexity with fasting, using graph theory in a directed cell physiology network (digraph) model to integrate the various biomarkers. This was commensurate with increased health status, and supportive of the hormesis hypothesis. 
The potential role of enhanced autophagic lysosomal removal of damaged proteins in the evolutionary acquisition of stress tolerance in intertidal molluscs is discussed and parallels are drawn with the growing evidence for the involvement of autophagy in hormesis and anti-ageing processes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Azimuthal Anisotropy beneath the Contiguous United States Revealed by Shear Wave Splitting
NASA Astrophysics Data System (ADS)
Liu, K. H.; Yang, B.; Liu, Y.; Dahm, H. H.; Refayee, H. A.; Gao, S. S.
2017-12-01
We have produced a uniformly-measured XKS (including SKS, SKKS, and PKS) splitting database for the contiguous United States and adjacent areas. The database consists of about 30,000 pairs of splitting parameters from 3185 stations. Both the fast orientations and splitting times show systematic spatial variations. The vast majority of the fast orientations are in agreement with the absolute plate motion (APM) direction computed under a fixed hot-spot reference frame. Spatial coherency analysis of the splitting parameters indicates that for the majority of the study area, where a single layer of anisotropy with a horizontal axis of symmetry is inferred, the source of anisotropy is located in the rheologically transitional zone between the lithosphere and asthenosphere. Beneath the western U.S., the previously recognized semi-circular feature of the fast orientations has a much greater spatial coverage, extending to northern Mexico and the Rio Grande Rift. The fast orientations are parallel to the western, southern, and southeastern edges of the North American Craton and can be interpreted by simple shear strain associated with mantle flow around the cratonic keel. The combination of anisotropy induced by this around keel flow and the APM can effectively explain the E-W fast orientations beneath the southern margin of the North American Craton and NE U.S., as well as the nearly N-S fast orientations and small splitting times observed in the SE U.S. The splitting times show a systematic decrease from both the western and eastern U.S. toward the central U.S., where the thickness of the lithosphere is the largest in the study area. This trend can be explained by the reduced efficiency of anisotropy development at greater depth, as well as by the lack of around keel flow in the continental interior.
A Comprehensive Seismic Characterization of the Cove Fort-Sulphurdale Geothermal Site, Utah
NASA Astrophysics Data System (ADS)
Zhang, H.; Li, J.; Zhang, X.; Liu, Y.; Kuleli, H. S.; Toksoz, M. N.
2012-12-01
The Cove Fort-Sulphurdale geothermal area is located in the transition zone between the extensional Basin and Range Province to the west and the uplifted Colorado Plateau to the east. The region around the geothermal site has the highest heat flow values in Utah, over 260 mW m⁻². To better understand the structure around the geothermal site, the MIT group deployed 10 seismic stations for a period of one year from August 2010. The local seismic network detected over 500 local earthquakes, from which ~200 events located within the network were selected for further analysis. Our seismic analysis is focused on three aspects: seismic velocity and attenuation tomography, seismic event focal mechanism analysis, and seismic shear wave splitting analysis. First P- and S-wave arrivals are picked manually and then the waveform cross-correlation technique is applied to obtain more accurate differential times between event pairs observed on common stations. The double-difference tomography method of Zhang and Thurber (2003) is used to simultaneously determine Vp and Vs models and seismic event locations. For the attenuation tomography, we first calculate t* values from spectrum fitting and then invert them to get Q models based on known velocity models and seismic event locations. Due to the limited station coverage and relatively low signal-to-noise ratio, many seismic waveforms do not have clear first P arrival polarities, and as a result the conventional focal mechanism determination method relying on the polarity information is not applicable. Therefore, we used the full waveform matching method of Li et al. (2010) to determine event focal mechanisms. For the shear wave splitting analysis, we used the cross-correlation method to determine the delay times between fast and slow shear waves and the polarization angles of fast shear waves.
The delay times are further used to image the three-dimensional distribution of anisotropy percentage with the shear wave splitting tomography method of Zhang et al. (2007). Overall, the velocity is lower and the attenuation is higher in the western part of the study region. Correspondingly, the anisotropy is also stronger there, indicating that fractures may be more developed in the western part. The average polarization directions of the fast shear waves at each station mostly point NNE. The focal mechanism analysis of selected events shows that the normal faulting events strike NNE, and that the strike-slip events have strikes parallel either to the NNE-trending faults or to their conjugates. Assuming the maximum horizontal stress (SHmax) is parallel to the strike of the normal faulting events and bisects the two fault planes of the strike-slip events, the inverted source mechanisms suggest a NNE-oriented maximum horizontal stress regime. This area is under W-E tensional stress, which means the maximum compressional stress should in general be in the N-E or NNE direction. The combination of shear wave splitting and focal mechanism analysis suggests that in this region the faults and fractures are aligned in the NNE direction.
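The cross-correlation measurement of splitting parameters mentioned in this abstract can be illustrated with a small grid search: rotate the two horizontal components through trial azimuths, shift the trial slow component by trial lags, and keep the pair that maximizes the normalized correlation. This is a minimal sketch of the generic rotation-correlation idea, not the authors' code; the function name `rotation_correlation`, the synthetic wavelet, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def rotation_correlation(north, east, dt, max_lag_s=0.5, angle_step_deg=1.0):
    """Grid search for fast azimuth (deg from north) and delay time (s).

    For each trial azimuth the horizontals are rotated into trial fast/slow
    components; the slow one is advanced by each trial lag and the absolute
    normalized cross-correlation is evaluated.
    """
    max_lag = int(round(max_lag_s / dt))
    best_cc, best_angle, best_delay = -1.0, 0.0, 0.0
    for angle in np.arange(0.0, 180.0, angle_step_deg):
        a = np.deg2rad(angle)
        fast = north * np.cos(a) + east * np.sin(a)    # projection on azimuth
        slow = -north * np.sin(a) + east * np.cos(a)   # projection on azimuth + 90
        for lag in range(max_lag + 1):                 # slow arrives later: lag >= 0
            f = fast[:len(fast) - lag]
            s = slow[lag:]
            denom = np.linalg.norm(f) * np.linalg.norm(s)
            if denom == 0.0:
                continue
            cc = abs(np.dot(f, s)) / denom
            if cc > best_cc:
                best_cc, best_angle, best_delay = cc, angle, lag * dt
    return best_angle, best_delay, best_cc

# Synthetic test signal (assumed values): fast azimuth 30 deg, delay 0.2 s.
dt = 0.01
t = np.arange(400) * dt

def wavelet(time):
    # Simple antisymmetric pulse centred at 2 s (illustrative only).
    return (time - 2.0) * np.exp(-((time - 2.0) / 0.1) ** 2)

phi = np.deg2rad(30.0)
fast_true = wavelet(t)
slow_true = 0.8 * wavelet(t - 0.2)
north = fast_true * np.cos(phi) - slow_true * np.sin(phi)
east = fast_true * np.sin(phi) + slow_true * np.cos(phi)

phi_est, delay_est, cc_max = rotation_correlation(north, east, dt)
print(phi_est, delay_est, cc_max)
```

Restricting the search to non-negative lags removes the 90° ambiguity between the fast and slow labels; real data would additionally need windowing around the S arrival and an uncertainty estimate, which this sketch omits.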
Direct imaging detectors for electron microscopy
NASA Astrophysics Data System (ADS)
Faruqi, A. R.; McMullan, G.
2018-01-01
Electronic detectors used for imaging in electron microscopy are reviewed in this paper. Much of the detector technology is based on the developments in microelectronics, which have allowed the design of direct detectors with fine pixels, fast readout and which are sufficiently radiation hard for practical use. Detectors included in this review are hybrid pixel detectors, monolithic active pixel sensors based on CMOS technology and pnCCDs, which share one important feature: they are all direct imaging detectors, relying on directly converting energy in a semiconductor. Traditional methods of recording images in the electron microscope such as film and CCDs, are mentioned briefly along with a more detailed description of direct electronic detectors. Many applications benefit from the use of direct electron detectors and a few examples are mentioned in the text. In recent years one of the most dramatic advances in structural biology has been in the deployment of the new backthinned CMOS direct detectors to attain near-atomic resolution molecular structures with electron cryo-microscopy (cryo-EM). The development of direct detectors, along with a number of other parallel advances, has seen a very significant amount of new information being recorded in the images, which was not previously possible-and this forms the main emphasis of the review.
Liu, Peilu; Li, Xinghua; Li, Haopeng; Su, Zhikun; Zhang, Hongxu
2017-01-01
To improve the accuracy of the focusing time delay in ultrasonic phased arrays, we analyzed the original interpolation Cascade-Integrator-Comb (CIC) filter and propose an 8× interpolation CIC filter parallel algorithm, so that interpolation and multichannel decomposition can be processed simultaneously. Moreover, we summarize the general formula of the parallel algorithm for an arbitrary interpolation factor and establish an ultrasonic phased array focusing time delay system based on the 8× interpolation CIC filter parallel algorithm. The improved algorithmic structure reduces additions by 12.5% and multiplications by 29.2%, while the computation remains very fast. To address the known problems of the CIC filter, we compensated it; the compensated CIC filter's pass band is flatter, its transition band is steeper, and its stop band attenuation is higher. Finally, we verified the feasibility of this algorithm on a Field Programmable Gate Array (FPGA). With a system clock of 125 MHz, after 8× interpolation filtering and decomposition, the time delay accuracy of the defect echo reaches 1 ns. Simulation and experimental results both show that the proposed algorithm is feasible. Because of its fast calculation, small computational cost and high resolution, this algorithm is especially suitable for applications that require high time delay accuracy and fast detection. PMID:29023385
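The 1 ns figure follows from the clock arithmetic: a 125 MHz clock gives an 8 ns sample period, and 8× interpolation divides that by eight. The sketch below shows this together with a textbook serial CIC interpolator (comb stages at the low rate, zero-stuffing, integrator stages at the high rate). It is not the parallel decomposition proposed in the paper, which the abstract does not spell out; the function name `cic_interpolate` and its parameters are assumptions for illustration.

```python
import numpy as np

def cic_interpolate(x, rate=8, stages=1):
    """Textbook CIC interpolator with differential delay 1 (integer arithmetic).

    1. `stages` comb filters at the input rate: y[n] = x[n] - x[n-1]
    2. zero-stuffing by `rate`
    3. `stages` integrators at the output rate: y[n] = y[n-1] + x[n]
    The overall DC gain is rate**(stages - 1).
    """
    y = np.asarray(x, dtype=np.int64)
    for _ in range(stages):
        y = np.diff(y, prepend=0)              # comb section
    up = np.zeros(len(y) * rate, dtype=np.int64)
    up[::rate] = y                             # insert rate-1 zeros between samples
    for _ in range(stages):
        up = np.cumsum(up)                     # integrator section
    return up

# Delay resolution after interpolation: 1 / (clock * factor)
clock_hz = 125e6
factor = 8
delay_step_ns = 1e9 / (clock_hz * factor)
print(delay_step_ns)                           # 8 ns sample period divided by 8

# A single-stage CIC interpolator behaves as a zero-order hold.
out = cic_interpolate([1, 2, 3], rate=factor, stages=1)
print(out.tolist())
```

With one stage the output simply repeats each input sample `rate` times; higher stage counts smooth the result further at the cost of gain growth, which hardware implementations handle by widening the registers.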
Liu, Peilu; Li, Xinghua; Li, Haopeng; Su, Zhikun; Zhang, Hongxu
2017-10-12
To improve the accuracy of the focusing time delay in ultrasonic phased arrays, we analyzed the original interpolation Cascade-Integrator-Comb (CIC) filter and propose an 8× interpolation CIC filter parallel algorithm, so that interpolation and multichannel decomposition can be processed simultaneously. Moreover, we summarize the general formula of the parallel algorithm for an arbitrary interpolation factor and establish an ultrasonic phased array focusing time delay system based on the 8× interpolation CIC filter parallel algorithm. The improved algorithmic structure reduces additions by 12.5% and multiplications by 29.2%, while the computation remains very fast. To address the known problems of the CIC filter, we compensated it; the compensated CIC filter's pass band is flatter, its transition band is steeper, and its stop band attenuation is higher. Finally, we verified the feasibility of this algorithm on a Field Programmable Gate Array (FPGA). With a system clock of 125 MHz, after 8× interpolation filtering and decomposition, the time delay accuracy of the defect echo reaches 1 ns. Simulation and experimental results both show that the proposed algorithm is feasible. Because of its fast calculation, small computational cost and high resolution, this algorithm is especially suitable for applications that require high time delay accuracy and fast detection.
FastQuery: A Parallel Indexing System for Scientific Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chou, Jerry; Wu, Kesheng; Prabhat,
2011-07-29
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies such as FastBit can significantly improve access to these datasets by augmenting the user data with indexes and other secondary information. However, a challenge is that the indexes assume the relational data model but the scientific data generally follows the array data model. To match the two data models, we design a generic mapping mechanism and implement an efficient input and output interface for reading and writing the data and their corresponding indexes. To take advantage of the emerging many-core architectures, we also develop a parallel strategy for indexing using threading technology. This approach complements our on-going MPI-based parallelization efforts. We demonstrate the flexibility of our software by applying it to two of the most commonly used scientific data formats, HDF5 and NetCDF. We present two case studies using data from a particle accelerator model and a global climate model. We also conducted a detailed performance study using these scientific datasets. The results show that FastQuery speeds up the query time by a factor of 2.5x to 50x, and it reduces the indexing time by a factor of 16 on 24 cores.
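To make the idea of augmenting array data with an index concrete, here is a minimal sketch of a binned bitmap index answering a range query. It illustrates the general bitmap-index technique only and does not use or reproduce the FastBit or FastQuery APIs; the function names `build_bitmap_index` and `query_greater`, the bin edges, and the random data are all hypothetical.

```python
import numpy as np

def build_bitmap_index(values, edges):
    """One boolean bitmap per bin; bin b holds edges[b-1] <= v < edges[b]."""
    bin_ids = np.searchsorted(edges, values, side="right")
    return [bin_ids == b for b in range(len(edges) + 1)]

def query_greater(values, edges, bitmaps, threshold):
    """Answer `values > threshold` using the bitmaps plus one candidate check."""
    k = int(np.searchsorted(edges, threshold, side="right"))  # bin holding threshold
    hits = np.zeros(len(values), dtype=bool)
    for b in range(k + 1, len(bitmaps)):
        hits |= bitmaps[b]                     # bins entirely above the threshold
    # Only the boundary bin needs the raw values. A real index would read just
    # the candidate rows; this sketch compares the whole array for brevity.
    hits |= bitmaps[k] & (values > threshold)
    return hits

rng = np.random.default_rng(0)
values = rng.uniform(0.0, 100.0, size=1000)    # stand-in for one array variable
edges = np.arange(10, 100, 10)                 # bin boundaries 10, 20, ..., 90
bitmaps = build_bitmap_index(values, edges)

hits = query_greater(values, edges, bitmaps, 42.5)
print(np.count_nonzero(hits))
```

Because each bin's bitmap can be built independently of the others, index construction of this kind divides naturally across threads, which is the sort of parallelism the abstract refers to.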
Plana-Ruiz, S; Portillo, J; Estradé, S; Peiró, F; Kolb, Ute; Nicolopoulos, S
2018-06-06
A general method to set illuminating conditions for selectable beam convergence and probe size is presented in this work for Transmission Electron Microscopes (TEM) fitted with µs/pixel fast beam scanning control, (S)TEM, and an annular dark field detector. The case of interest of beam convergence and probe size, which enables diffraction pattern indexation, is then used as a starting point in this work to add 100 Hz precession to the beam while imaging the specimen at a fast rate and keeping the projector system in diffraction mode. The described systematic alignment method for the adjustment of beam precession on the specimen plane while scanning at fast rates is mainly based on the sharpness of the precessed STEM image. The complete alignment method for parallel condition and precession, Quasi-Parallel PED-STEM, is presented in block diagram scheme, as it has been tested on a variety of instruments. The immediate application of this methodology is that it renders the TEM column ready for the acquisition of Precessed Electron Diffraction Tomographies (EDT) as well as for the acquisition of slow Precessed Scanning Nanometer Electron Diffraction (SNED). Examples of the quality of the Precessed Electron Diffraction (PED) patterns and PED-STEM alignment images are presented with corresponding probe sizes and convergence angles. Copyright © 2018. Published by Elsevier B.V.
Upper mantle seismic anisotropy beneath Northern Peru from shear wave splitting analysis.
NASA Astrophysics Data System (ADS)
Franca, G. S.; Condori, C.; Tavera, H.; Eakin, C. M.; Beck, S. L.
2017-12-01
Beneath much of Peru lies the largest region of flat-slab subduction in the world today. The origins and dynamics of the Peruvian flat-slab however remain elusive, particularly in the north away from the Nazca Ridge. Studies of seismic anisotropy can potentially provide us with insight into the dynamics of recent and past deformational processes in the upper mantle. In this study, we conduct shear wave splitting analysis to investigate seismic anisotropy across the northern extent of the Peruvian flat-slab for the first time. For the analysis, we used arrivals of SKS, SKKS and PKS phases from teleseismic events (88° < Δ < 150°) recorded at 30 broadband seismic stations from the Peruvian permanent and portable seismic networks, and international networks (CTBTO and RSBR-Brazil). The preliminary results reveal a complex anisotropy pattern with variations along strike. In the northernmost region, the average delay times range between 1.0 s and 1.2 s, with fast directions predominantly ENE-WSW, approximately perpendicular to the trench and parallel to the subduction direction of the Nazca plate. Meanwhile, towards the central region of Peru, the predominant fast direction changes to SE-NW, oblique to the trench but consistent with the pattern seen previously over the southern extent of the flat-slab by Eakin et al. (2013, 2015). These characteristics suggest a fundamental difference between the anisotropic structures, and therefore underlying mantle processes, beneath the northern and central portions of the Peruvian flat-slab.
Parallelized direct execution simulation of message-passing parallel programs
NASA Technical Reports Server (NTRS)
Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.
1994-01-01
As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
Rideaux, Reuben; Apthorp, Deborah; Edwards, Mark
2015-02-12
Recent findings have indicated that the capacity to consolidate multiple items into visual short-term memory in parallel varies as a function of the type of information. That is, while color can be consolidated in parallel, evidence suggests that orientation cannot. Here we investigated the capacity to consolidate multiple motion directions in parallel and reexamined this capacity using orientation. This was achieved by determining the shortest exposure duration necessary to consolidate a single item, then examining whether two items, presented simultaneously, could be consolidated in that time. The results show that parallel consolidation of direction and orientation information is possible, and that parallel consolidation of direction appears to be limited to two. Additionally, we demonstrate the importance of adequate separation between feature intervals used to define items when attempting to consolidate in parallel, suggesting that when multiple items are consolidated in parallel, as opposed to serially, the resolution of representations suffers. Finally, we used facilitation of spatial attention to show that the deterioration of item resolution occurs during parallel consolidation, as opposed to storage. © 2015 ARVO.
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry
1998-01-01
This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
NASA Astrophysics Data System (ADS)
Horst, A. J.; Varga, R. J.; Gee, J. S.; Karson, J. A.
2008-12-01
Escarpments bounding the Pito Deep Rift expose cross-sections into ~3 Ma oceanic crust accreted at a super-fast spreading (>140 mm/yr) segment of the East Pacific Rise (EPR). Dikes within the sheeted dike complex persistently strike NE, parallel to local abyssal hill lineaments and magnetic anomaly stripes, and dip SE, outward and away from the EPR. During the Pito Deep 2005 Cruise, both ALVIN and JASON II used the Geocompass to fully orient a total of 69 samples [63 basaltic dikes, 6 massive gabbros] collected in situ. Paleomagnetic analyses of these oriented samples provide a quantitative constraint of kinematics of structural rotations of dikes. Magnetic remanence of dike samples indicates a dominant normal polarity with almost all directions rotated clockwise from the expected direction. The most geologically plausible model to account for these dispersions using these data coupled with the general orientation of the dikes incorporates two different structural rotations: 1) A horizontal-axis rotation that occurred near the EPR axis, related to sub-axial subsidence, and 2) A clockwise vertical-axis rotation, associated with the rotation of the Easter microplate consistent with current models. Additionally, the anisotropy of magnetic susceptibility (AMS) of dike samples indicates rock fabric and magmatic flow direction within dikes. In most samples, two of three AMS eigenvectors lie near the dike plane orientations. Generally, Kmin lies perpendicular to dike planes, while Kmax is often shallow within the dike planes, indicating dominantly subhorizontal magma flow. Steep Kmax in a few samples indicates vertical flow directions that suggest either primary flow or gravitational back-flow during waning stages of dike intrusion. These results provide the first direct evidence for primarily horizontal magma flow in sheeted dikes of super-fast spread oceanic crust. 
Results for Pito Deep Rift and previous results for Hess Deep Rift reveal outward dipping dikes that are interpreted as a result of subaxial spreading processes that are not evident from surface studies of spreading centers. Both areas show evidence of subaxial subsidence during accretion and lateral magmatic flow in the sheeted dike complex.
NASA Technical Reports Server (NTRS)
Kouznetsov, Igor; Lotko, William
1995-01-01
The 'radial' transport of energy by internal ULF waves, stimulated by dayside magnetospheric boundary oscillations, is analyzed in the framework of one-fluid magnetohydrodynamics. (The term radial is used here to denote the direction orthogonal to geomagnetic flux surfaces.) The model for the inhomogeneous magnetospheric plasma and background magnetic field is axisymmetric and includes radial and parallel variations in the magnetic field, magnetic curvature, plasma density, and low but finite plasma pressure. The radial mode structure of the coupled fast and intermediate MHD waves is determined by numerical solution of the inhomogeneous wave equation; the parallel mode structure is characterized by a Wentzel-Kramers-Brillouin (WKB) approximation. Ionospheric dissipation is modeled by allowing the parallel wave number to be complex. For boundary oscillations with frequencies in the range from 10 to 48 mHz, and using a dipole model for the background magnetic field, the combined effects of magnetic curvature and finite plasma pressure are shown to (1) enhance the amplitude of field line resonances by as much as a factor of 2 relative to values obtained in a cold plasma or box-model approximation for the dayside magnetosphere; (2) increase the energy flux delivered to a given resonance by a factor of 2-4; and (3) broaden the spectral width of the resonance by a factor of 2-3. The effects are attributed to the existence of an 'Alfven buoyancy oscillation,' which approaches the usual shear mode Alfven wave at resonance but, unlike the shear Alfven mode, is dispersive at short perpendicular wavelengths. The form of dispersion is analogous to that of an internal atmospheric gravity wave, with the magnetic tension of the curved background field providing the restoring force and allowing radial propagation of the mode. 
For nominal dayside parameters, the propagation band of the Alfven buoyancy wave occurs between the location of its (field line) resonance and that of the fast mode cutoff that exists at larger radial distances.
FleCSPH - a parallel and distributed SPH implementation based on the FleCSI framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Junghans, Christoph; Loiseau, Julien
2017-06-20
FleCSPH is a multi-physics compact application that exercises FleCSI parallel data structures for tree-based particle methods. In particular, FleCSPH implements a smoothed-particle hydrodynamics (SPH) solver for the solution of Lagrangian problems in astrophysics and cosmology. FleCSPH includes support for gravitational forces using the fast multipole method (FMM).
On the suitability of the connection machine for direct particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonard
1990-01-01
The algorithmic structure of the vectorizable Stanford particle simulation (SPS) method was examined and reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel form, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation, and a master/slave algorithm is developed to minimize communication cost in large table look-ups. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of the lack of a broad-based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms specifically of use for direct particle simulation but which also expand the data parallel diction.
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TRIANGULATED SURFACES*
Fu, Zhisong; Jeong, Won-Ki; Pan, Yongsheng; Kirby, Robert M.; Whitaker, Ross T.
2012-01-01
This paper presents an efficient, fine-grained parallel algorithm for solving the Eikonal equation on triangular meshes. The Eikonal equation, and the broader class of Hamilton–Jacobi equations to which it belongs, have a wide range of applications from geometric optics and seismology to biological modeling and analysis of geometry and images. The ability to solve such equations accurately and efficiently provides new capabilities for exploring and visualizing parameter spaces and for solving inverse problems that rely on such equations in the forward model. Efficient solvers on state-of-the-art, parallel architectures require new algorithms that are not, in many cases, optimal, but are better suited to synchronous updates of the solution. In previous work [W. K. Jeong and R. T. Whitaker, SIAM J. Sci. Comput., 30 (2008), pp. 2512–2534], the authors proposed the fast iterative method (FIM) to efficiently solve the Eikonal equation on regular grids. In this paper we extend the fast iterative method to solve Eikonal equations efficiently on triangulated domains on the CPU and on parallel architectures, including graphics processors. We propose a new local update scheme that provides solutions of first-order accuracy for both architectures. We also propose a novel triangle-based update scheme and its corresponding data structure for efficient irregular data mapping to parallel single-instruction multiple-data (SIMD) processors. We provide detailed descriptions of the implementations on a single CPU, a multicore CPU with shared memory, and SIMD architectures with comparative results against state-of-the-art Eikonal solvers. PMID:22641200
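The active-list idea behind the fast iterative method can be sketched in a few lines. The block below is only an illustration under stated assumptions: it works on a regular 2D grid (the setting of the earlier Jeong and Whitaker work), runs serially, uses unit speed and a first-order upwind update, and does not reproduce the triangle-based update scheme or data structure of the paper. All function names are hypothetical.

```python
import math

def solve_local(T, i, j, h=1.0, speed=1.0):
    """First-order upwind update for |grad T| = 1/speed at grid node (i, j)."""
    n, m = len(T), len(T[0])
    # Smallest neighbour value along each grid axis (inf outside the grid).
    a = min(T[i - 1][j] if i > 0 else math.inf,
            T[i + 1][j] if i < n - 1 else math.inf)
    b = min(T[i][j - 1] if j > 0 else math.inf,
            T[i][j + 1] if j < m - 1 else math.inf)
    lo = min(a, b)
    if math.isinf(lo):
        return math.inf              # no upwind information yet
    f = h / speed
    if abs(a - b) >= f:              # only one axis contributes
        return lo + f
    return 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))

def neighbors(i, j, n, m):
    """Yield the 4-connected neighbours of (i, j) that lie inside the grid."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        p, q = i + di, j + dj
        if 0 <= p < n and 0 <= q < m:
            yield p, q

def fast_iterative_method(n, m, sources, eps=1e-12):
    """Serial sketch of an active-list Eikonal solver on an n x m grid."""
    T = [[math.inf] * m for _ in range(n)]
    for i, j in sources:
        T[i][j] = 0.0
    fixed = set(sources)
    active = {nb for s in sources for nb in neighbors(*s, n, m)} - fixed
    while active:
        next_active = set()
        for i, j in active:
            old = T[i][j]
            new = min(old, solve_local(T, i, j))
            T[i][j] = new
            if old == new or abs(old - new) < eps:
                # Converged: drop the node and wake neighbours it can improve.
                for p, q in neighbors(i, j, n, m):
                    if (p, q) in fixed or (p, q) in active or (p, q) in next_active:
                        continue
                    cand = solve_local(T, p, q)
                    if cand < T[p][q]:
                        T[p][q] = cand
                        next_active.add((p, q))
            else:
                next_active.add((i, j))   # still changing: keep it active
        active = next_active
    return T
```

Calling `fast_iterative_method(5, 5, [(0, 0)])` returns approximate distances from the corner node. Because every node in the active list is updated from its neighbours alone, the inner loop is the part that a SIMD or GPU implementation would run concurrently.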
Final report for the Tera Computer TTI CRADA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidson, G.S.; Pavlakos, C.; Silva, C.
1997-01-01
Tera Computer and Sandia National Laboratories have completed a CRADA, which examined the Tera Multi-Threaded Architecture (MTA) for use with large codes of importance to industry and DOE. The MTA is an innovative architecture that uses parallelism to mask latency between memories and processors. The physical implementation is a parallel computer with high cross-section bandwidth and GaAs processors designed by Tera, which support many small computation threads and fast, lightweight context switches between them. When any thread blocks while waiting for memory accesses to complete, another thread immediately begins execution so that high CPU utilization is maintained. The Tera MTA parallel computer has a single, global address space, which is appealing when porting existing applications to a parallel computer. This ease of porting is further enabled by compiler technology that helps break computations into parallel threads. DOE and Sandia National Laboratories were interested in working with Tera to further develop this computing concept. While Tera Computer would continue the hardware development and compiler research, Sandia National Laboratories would work with Tera to ensure that their compilers worked well with important Sandia codes, most particularly CTH, a shock physics code used for weapon safety computations. In addition to that important code, Sandia National Laboratories would complete research on a robotic path planning code, SANDROS, which is important in manufacturing applications, and would evaluate the MTA performance on this code. Finally, Sandia would work directly with Tera to develop 3D visualization codes, which would be appropriate for use with the MTA. Each of these tasks has been completed to the extent possible, given that Tera has just completed the MTA hardware. All of the CRADA work had to be done on simulators.
Seismic anisotropy and mantle creep in young orogens
Meissner, R.; Mooney, W.D.; Artemieva, I.
2002-01-01
Seismic anisotropy provides evidence for the physical state and tectonic evolution of the lithosphere. We discuss the origin of anisotropy at various depths, and relate it to tectonic stress, geotherms and rheology. The anisotropy of the uppermost mantle is controlled by the orthorhombic mineral olivine, and may result from ductile deformation, dynamic recrystallization or annealing. Anisotropy beneath young orogens has been measured for the seismic phase Pn that propagates in the uppermost mantle. This anisotropy is interpreted as being caused by deformation during the most recent thermotectonic event, and thus provides information on the process of mountain building. Whereas tectonic stress and many structural features in the upper crust are usually orientated perpendicular to the structural axis of mountain belts, Pn anisotropy is aligned parallel to the structural axis. We interpret this to indicate mountain-parallel ductile (i.e. creeping) deformation in the uppermost mantle that is a consequence of mountain-perpendicular compressive stresses. The preferred orientation of the fast axes of some anisotropic minerals, such as olivine, is known to be in the creep direction, a consequence of the anisotropy of strength and viscosity of orientated minerals. In order to explain the anisotropy of the mantle beneath young orogens we extend the concept of crustal 'escape' (or 'extrusion') tectonics to the uppermost mantle. We present rheological model calculations to support this hypothesis. Mountain-perpendicular horizontal stress (determined in the upper crust) and mountain-parallel seismic anisotropy (in the uppermost mantle) require a zone of ductile decoupling in the middle or lower crust of young mountain belts. Examples for stress and mountain-parallel Pn anisotropy are given for Tibet, the Alpine chains, and young mountain ranges in the Americas. Finally, we suggest a simple model for initiating mountain parallel creep.
Fast Fourier Transform algorithm design and tradeoffs
NASA Technical Reports Server (NTRS)
Kamin, Ray A., III; Adams, George B., III
1988-01-01
The Fast Fourier Transform (FFT) is a mainstay of certain numerical techniques for solving fluid dynamics problems. The Connection Machine CM-2 is the target for an investigation into the design of multidimensional Single Instruction Stream/Multiple Data (SIMD) parallel FFT algorithms for high performance. Critical algorithm design issues are discussed, necessary machine performance measurements are identified and made, and the performance of the developed FFT programs is measured. Fast Fourier Transform programs are compared to the currently best Cray-2 FFT program.
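As background for why the FFT maps naturally onto SIMD hardware, here is a minimal sketch of an iterative radix-2 transform in plain Python. It is not the CM-2 algorithm from the report; it is a textbook Cooley-Tukey variant, chosen because every butterfly within one stage is independent of the others and could in principle be executed in lockstep. The function name is an assumption made for this example.

```python
import cmath

def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]
    # Bit-reversal permutation puts the inputs in butterfly order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages; butterflies inside one stage do not depend on each other.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1.0 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        size *= 2
    return a
```

On a data parallel machine the two inner loops would be replaced by one operation applied to all `n/2` butterflies of a stage at once, which leaves the data movement between stages as the main design question.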
A High Order, Locally-Adaptive Method for the Navier-Stokes Equations
NASA Astrophysics Data System (ADS)
Chan, Daniel
1998-11-01
I have extended the FOSLS method of Cai, Manteuffel and McCormick (1997) and implemented it within the framework of a spectral element formulation using the Legendre polynomial basis function. The FOSLS method solves the Navier-Stokes equations as a system of coupled first-order equations and provides the ellipticity that is needed for fast iterative matrix solvers like multigrid to operate efficiently. Each element is treated as an object and its properties are self-contained. Only C^0 continuity is imposed across element interfaces; this design allows local grid refinement and coarsening without the burden of having an elaborate data structure, since only information along element boundaries is needed. With the FORTRAN 90 programming environment, I can maintain a high computational efficiency by employing a hybrid parallel processing model. The OpenMP directives provide parallelism at the loop level, which is executed on a shared-memory SMP, and the MPI protocol allows the distribution of elements to a cluster of SMPs connected via a commodity network. This talk will provide timing results and a comparison with a second-order finite difference method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...
2016-10-27
Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Na; Jia, Zhe; Wang, Zhihui; ...
2017-10-01
In this paper, the structure degradation of commercial lithium-ion battery (LIB) graphite anodes with different cycling numbers and charge rates was investigated by focused ion beam (FIB) and scanning electron microscopy (SEM). The cross-section image of the graphite anode obtained by FIB milling shows that cracks, which result in the volume expansion of the graphite electrode during long-term cycling, form parallel to the current collector. The cracks occur in the bulk of graphite particles near the lithium insertion surface, which might derive from the stress induced during lithiation and de-lithiation cycles. Subsequently, cracking takes place along grain boundaries of the polycrystalline graphite, but only in the direction parallel to the current collector. Furthermore, fast-charged graphite electrodes are more prone to form cracks since the tensile strength of graphite is more likely to be surpassed at higher charge rates. Therefore, for long-term or high-charge-rate LIB applications, the tensile strength of the graphite anode should be taken into account.
A critical analysis of shock models for chondrule formation
NASA Astrophysics Data System (ADS)
Stammler, Sebastian M.; Dullemond, Cornelis P.
2014-11-01
In recent years many models of chondrule formation have been proposed. One of those models is the processing of dust in shock waves in protoplanetary disks. In this model, the dust and the chondrule precursors are overrun by shock waves, which heat them up by frictional heating and thermal exchange with the gas. In this paper we reanalyze the nebular shock model of chondrule formation and focus on the downstream boundary condition. We show that for large-scale plane-parallel chondrule-melting shocks the postshock equilibrium temperature is too high to avoid volatile loss. Even if we include radiative cooling in lateral directions out of the disk plane into our model (thereby breaking strict plane-parallel geometry) we find that for a realistic vertical extent of the solar nebula disk the temperature decline is not fast enough. On the other hand, if we assume that the shock is entirely optically thin so that particles can radiate freely, the cooling rates are too high to produce the observed chondrules textures. Global nebular shocks are therefore problematic as the primary sources of chondrules.
Influence of fast advective flows on pattern formation of Dictyostelium discoideum
Bae, Albert; Zykov, Vladimir; Bodenschatz, Eberhard
2018-01-01
We report experimental and numerical results on pattern formation of self-organizing Dictyostelium discoideum cells in a microfluidic setup under a constant buffer flow. The external flow advects the signaling molecule cyclic adenosine monophosphate (cAMP) downstream, while the chemotactic cells attached to the solid substrate are not transported with the flow. At high flow velocities, elongated cAMP waves are formed that cover the whole length of the channel and propagate both parallel and perpendicular to the flow direction. While the wave period and transverse propagation velocity are constant, parallel wave velocity and the wave width increase linearly with the imposed flow. We also observe that the acquired wave shape is highly dependent on the wave generation site and the strength of the imposed flow. We compared the wave shape and velocity with numerical simulations performed using a reaction-diffusion model and found excellent agreement. These results are expected to play an important role in understanding the process of pattern formation and aggregation of D. discoideum that may experience fluid flows in its natural habitat. PMID:29590179
Crosetto, D.B.
1996-12-31
The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.
Crosetto, Dario B.
1996-01-01
The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.
Carrera, Mónica; Gallardo, José M; Pascual, Santiago; González, Ángel F; Medina, Isabel
2016-06-16
Anisakids are fish-borne parasites that are responsible for a large number of human infections and allergic reactions around the world. World health organizations and food safety authorities aim to control and prevent this emerging health problem. In the present work, a new method for the fast monitoring of these parasites is described. The strategy is divided into three steps: (i) purification of thermostable proteins from fish-borne parasites (Anisakids), (ii) in-solution HIFU trypsin digestion and (iii) monitoring of several peptide markers by parallel reaction monitoring (PRM) mass spectrometry. This methodology allows the fast detection of Anisakids in <2 h. An affordable assay utilizing this methodology will facilitate testing for regulatory and safety applications. This work describes, for the first time, protein biomarker discovery and fast monitoring for the identification and detection of Anisakids in fishery products. The strategy is based on the purification of thermostable proteins, the use of accelerated in-solution trypsin digestions under an ultrasonic field provided by High-Intensity Focused Ultrasound (HIFU) and the monitoring of several peptide biomarkers by Parallel Reaction Monitoring (PRM) Mass Spectrometry in a linear ion trap mass spectrometer. The workflow allows the unequivocal detection of Anisakids in <2 h. The present strategy constitutes the fastest method for Anisakids detection; its application in the food quality control area could provide authorities with an effective and rapid method to guarantee consumer safety. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Sheykina, Nadiia; Bogatina, Nina
Four orientations of roots relative to the static and alternating components of the magnetic field were studied. In the first variant, the static magnetic field was directed parallel to the gravitation vector and the alternating magnetic field perpendicular to the static one; the roots were directed perpendicular to both field components and to the gravitation vector. In this variant, negative gravitropism of cress roots was observed. In the second variant, the static magnetic field was directed parallel to the gravitation vector and the alternating magnetic field perpendicular to the static one; the roots were directed parallel to the alternating magnetic field. In the third variant, the alternating magnetic field was directed parallel to the gravitation vector and the static magnetic field perpendicular to it; the roots were directed perpendicular to both field components and to the gravitation vector. In the fourth variant, the alternating magnetic field was directed parallel to the gravitation vector and the static magnetic field perpendicular to it; the roots were directed parallel to the static magnetic field. In all cases, the frequency of the alternating magnetic field was equal to the cyclotron frequency of Ca ions. In the second, third and fourth variants, gravitropism was positive, but the gravitropic reaction speeds differed. In the second and fourth variants, the gravitropic reaction speed coincided, within error limits, with that under Earth's conditions. In the third variant, the gravitropic reaction was slowed substantially.
High performance Python for direct numerical simulations of turbulent flows
NASA Astrophysics Data System (ADS)
Mortensen, Mikael; Langtangen, Hans Petter
2016-06-01
Direct Numerical Simulation (DNS) of the Navier-Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython that is found to match the speed of C++. The solvers are written from scratch in Python, including the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores. A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition, 7 lines of code are required to execute a transform.
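The slab pattern described above (transform the locally complete axis, exchange data so that the remaining axis becomes local, transform again) can be illustrated without MPI. The sketch below is not the authors' code: it is a 2D analogue in plain Python in which the "ranks" are emulated by list slices, the all-to-all exchange is emulated by a global transpose, and a naive DFT stands in for the serial FFT routine. The names `dft` and `slab_fft2` are hypothetical.

```python
import cmath

def dft(v):
    """Naive O(n^2) 1D DFT; a stand-in for any serial FFT routine."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def slab_fft2(field, nranks):
    """2D DFT of a square N x N list of lists with an emulated slab decomposition."""
    n = len(field)
    if n % nranks:
        raise ValueError("N must be divisible by the number of ranks")
    rows = n // nranks
    # Each emulated rank owns `rows` consecutive rows (its slab).
    slabs = [field[r * rows:(r + 1) * rows] for r in range(nranks)]
    # Step 1: every rank transforms along the axis it holds completely.
    slabs = [[dft(row) for row in slab] for slab in slabs]
    # Step 2: emulated all-to-all exchange; rank r receives columns
    # r*rows .. (r+1)*rows - 1, each now complete along the other axis.
    full = [row for slab in slabs for row in slab]
    slabs = [[[full[i][c] for i in range(n)]
              for c in range(r * rows, (r + 1) * rows)]
             for r in range(nranks)]
    # Step 3: transform along the axis that has just become local.
    slabs = [[dft(col) for col in slab] for slab in slabs]
    # The result is held transposed; undo that for the caller.
    out_t = [col for slab in slabs for col in slab]
    return [[out_t[c][r] for c in range(n)] for r in range(n)]
```

In a real distributed code, step 2 is where the communication cost sits, which is presumably why the choice between slab and pencil decompositions matters for scaling.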
Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.
Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter
2015-01-01
To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
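As a reading aid for the abstract above, here is a minimal CPU-only sketch of Smith-Waterman local alignment that reports the score together with counts of matches, mismatches and gaps. It is not PaSWAS code and does not use a GPU; the function name, the linear gap penalty and the scoring values are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b (hypothetical scoring scheme).

    Returns (score, matches, mismatches, gaps) for one optimal alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    matches = mismatches = gaps = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                matches += 1
            else:
                mismatches += 1
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            gaps += 1
            i -= 1
        else:
            gaps += 1
            j -= 1
    return best, matches, mismatches, gaps


if __name__ == "__main__":
    print(smith_waterman("ACGT", "ACTT"))
```

Each cell depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that independence is what makes the algorithm a natural fit for graphics hardware.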
Low cost automated whole smear microscopy screening system for detection of acid fast bacilli.
Law, Yan Nei; Jian, Hanbin; Lo, Norman W S; Ip, Margaret; Chan, Mia Mei Yuk; Kam, Kai Man; Wu, Xiaohua
2018-01-01
In countries with high tuberculosis (TB) burden, there is an urgent need for rapid, large-scale screening to detect smear-positive patients. We developed a computer-aided whole smear screening system that focuses in real time, captures images and provides diagnostic grading, for both bright-field and fluorescence microscopy, for detection of acid-fast bacilli (AFB) from respiratory specimens. The aim was to evaluate the performance of the dual-mode screening system in AFB diagnostic algorithms on concentrated smears with auramine O (AO) staining, as well as direct smears with AO and Ziehl-Neelsen (ZN) staining, using mycobacterial culture results as the gold standard. Adult patient sputum samples submitted for M. tuberculosis culture were divided into three batches for staining: direct AO-stained, direct ZN-stained and concentrated AO-stained smears. All slides were graded by an experienced microscopist, in parallel with the automated whole smear screening system. Sensitivity and specificity of a TB diagnostic algorithm using the screening system alone, and in combination with a microscopist, were evaluated. Of 488 direct AO-stained smears, 228 were culture positive. These yielded a sensitivity of 81.6% and specificity of 74.2%. Of 334 direct smears with ZN staining, 142 were culture positive, which gave a sensitivity of 70.4% and specificity of 76.6%. Of 505 concentrated smears with AO staining, 250 were culture positive, giving a sensitivity of 86.4% and specificity of 71.0%. To further improve performance, machine grading was confirmed by manual smear grading when the number of AFBs detected fell within an uncertainty range. These combined results gave significant improvement in specificity (AO-direct: 85.4%; ZN-direct: 85.4%; AO-concentrated: 92.5%) and slight improvement in sensitivity while requiring only limited manual workload. Our system achieved high sensitivity without substantially compromising specificity when compared to culture results. Significant improvement in specificity was obtained when uncertain results were confirmed by manual smear grading. This approach has the potential to substantially reduce the workload of microscopists in high-burden countries.
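The sensitivity and specificity figures quoted in the abstract above follow from a standard 2x2 comparison against culture. The sketch below shows the arithmetic for the direct AO-stained batch. The abstract reports only the totals (488 smears, 228 culture positive) and the percentages; the four cell counts used here are back-calculated from those figures as an assumption and are not reported by the authors.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of culture-positive smears detected
    specificity = tn / (tn + fp)   # share of culture-negative smears cleared
    return sensitivity, specificity


if __name__ == "__main__":
    # Assumed counts, back-calculated from the reported percentages:
    # 228 culture-positive and 488 - 228 = 260 culture-negative smears.
    sens, spec = sensitivity_specificity(tp=186, fn=42, tn=193, fp=67)
    print(f"sensitivity {100 * sens:.1f}%, specificity {100 * spec:.1f}%")
```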
Fast MPEG-CDVS Encoder With GPU-CPU Hybrid Computing
NASA Astrophysics Data System (ADS)
Duan, Ling-Yu; Sun, Wei; Zhang, Xinfeng; Wang, Shiqi; Chen, Jie; Yin, Jianxiong; See, Simon; Huang, Tiejun; Kot, Alex C.; Gao, Wen
2018-05-01
The compact descriptors for visual search (CDVS) standard from the ISO/IEC Moving Picture Experts Group (MPEG) has succeeded in enabling interoperability for efficient and effective image retrieval by standardizing the bitstream syntax of compact feature descriptors. However, the intensive computation of the CDVS encoder hinders its wide deployment in industry for large-scale visual search. In this paper, we revisit the merits of the low-complexity design of CDVS core techniques and present a very fast CDVS encoder that leverages the massively parallel execution resources of the GPU. We shift the computation-intensive and parallel-friendly modules to state-of-the-art GPU platforms, in which thread block allocation and memory access are jointly optimized to eliminate performance loss. In addition, operations with heavy data dependence are allocated to the CPU to relieve the GPU of extra, unnecessary computation. Furthermore, we demonstrate that the proposed fast CDVS encoder works well with convolutional neural network approaches, which also leverage the advantages of GPU platforms, and yields significant performance improvements. Comprehensive experimental results on benchmarks show that the fast CDVS encoder using GPU-CPU hybrid computing is promising for scalable visual search.
Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas
2016-04-01
Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station; however, because of multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new software tool, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source code and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
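The workflow in the abstract above (many invocations of an unmodified serial program, with failed subtasks resubmitted) can be sketched on a single machine. This is not mpiWrapper and does not use MPI: a thread pool from the Python standard library stands in for the processing units, and the names `run_subtask` and `run_all`, the retry limit and the toy commands are assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_subtask(command, max_attempts=3):
    """Run one unmodified serial program; resubmit it if it fails.

    Returns (stdout, attempts_used). Hypothetical helper.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip(), attempt
    raise RuntimeError(f"subtask failed {max_attempts} times: {command}")


def run_all(commands, workers=4):
    """Distribute independent subtasks over a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as the commands.
        return list(pool.map(run_subtask, commands))


if __name__ == "__main__":
    # Toy subtasks: each one is a separate Python process printing a square.
    tasks = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(5)]
    for output, attempts in run_all(tasks):
        print(output, attempts)
```

Threads are sufficient in this sketch because the work happens in child processes; a cluster version would replace the pool with message passing between nodes.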
Fast Segmentation of Stained Nuclei in Terabyte-Scale, Time Resolved 3D Microscopy Image Stacks
Stegmaier, Johannes; Otte, Jens C.; Kobitski, Andrei; Bartschat, Andreas; Garcia, Ariel; Nienhaus, G. Ulrich; Strähle, Uwe; Mikut, Ralf
2014-01-01
Automated analysis of multi-dimensional microscopy images has become an integral part of modern research in life science. Most available algorithms that provide sufficient segmentation quality, however, are infeasible for a large amount of data due to their high complexity. In this contribution we present a fast parallelized segmentation method that is especially suited for the extraction of stained nuclei from microscopy images, e.g., of developing zebrafish embryos. The idea is to transform the input image based on gradient and normal directions in the proximity of detected seed points such that it can be handled by straightforward global thresholding like Otsu’s method. We evaluate the quality of the obtained segmentation results on a set of real and simulated benchmark images in 2D and 3D and show the algorithm’s superior performance compared to other state-of-the-art algorithms. We achieve an up to ten-fold decrease in processing times, allowing us to process large data sets while still providing reasonable segmentation results. PMID:24587204
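The abstract above relies on global thresholding with Otsu's method as its final step. The sketch below shows only that step, on a flat list of integer intensities; it does not reproduce the authors' seed-based image transform, and the function name and the 256-level default are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity <= t form the background class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w_bg = 0          # number of background pixels so far
    sum_bg = 0        # intensity sum of background pixels so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue              # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                 # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50   # toy bimodal "image"
    t = otsu_threshold(image)
    print(sum(1 for p in image if p > t))
```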
NASA Technical Reports Server (NTRS)
Ierotheou, C.; Johnson, S.; Leggett, P.; Cross, M.; Evans, E.; Jin, Hao-Qiang; Frumkin, M.; Yan, J.; Biegel, Bryan (Technical Monitor)
2001-01-01
The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. Historically, the lack of a programming standard for using directives and the rather limited performance due to scalability have affected the take-up of this programming model approach. Significant progress has been made in hardware and software technologies; as a result, the performance of parallel programs with compiler directives has also improved. The introduction of an industrial standard for shared-memory programming with directives, OpenMP, has also addressed the issue of portability. In this study, we have extended the computer aided parallelization toolkit (developed at the University of Greenwich) to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline the way in which loop types are categorized and how efficient OpenMP directives can be defined and placed using the in-depth interprocedural analysis that is carried out by the toolkit. We also discuss the application of the toolkit on the NAS Parallel Benchmarks and a number of real-world application codes. This work not only demonstrates the great potential of using the toolkit to quickly parallelize serial programs but also the good performance achievable on up to 300 processors for hybrid message passing and directive-based parallelizations.
Shear wave splitting and crustal anisotropy at the Mid-Atlantic Ridge, 35°N
NASA Astrophysics Data System (ADS)
Barclay, Andrew H.; Toomey, Douglas R.
2003-08-01
Shear wave splitting observed in microearthquake data at the axis of the Mid-Atlantic Ridge near 35°N has a fast polarization direction that is parallel to the trend of the axial valley. The time delays between fast and slow S wave arrivals range from 35 to 180 ms, with an average of 90 ms, and show no relationship with ray path length, source-to-receiver azimuth, or receiver location. The anisotropy is attributed to a shallow distribution of vertical, fluid-filled cracks, aligned parallel to the trend of the axial valley. Joint modeling of the shear wave anisotropy and coincident P wave anisotropy results, using recent theoretical models for the elasticity of a porous medium with aligned cracks, suggests that the crack distribution that causes the observed P wave anisotropy can account for at most 10 ms of the shear wave delay. Most of the shear wave delay thus likely accrues within the shallowmost 500 m (seismic layer 2A), and the percent S wave anisotropy within this highly fissured layer is 8-30%. Isolated, fluid-filled cracks at 500 m to 3 km depth that are too thin or too shallow to be detected by the P wave experiment may also contribute to the shear wave delays. The joint analysis of P and S wave anisotropy is an important approach for constraining the crack distributions in the upper oceanic crust and is especially suited for seismically active hydrothermal systems at slow and intermediate spreading mid-ocean ridges.
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)
1998-01-01
This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes
NASA Technical Reports Server (NTRS)
Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)
2000-01-01
The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
Effect of parallel electric fields on the ponderomotive stabilization of MHD instabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Litwin, C.; Hershkowitz, N.
The contribution of the wave electric field component E_∥, parallel to the magnetic field, to the ponderomotive stabilization of curvature-driven instabilities is evaluated and compared to the transverse component contribution. For the experimental density range, in which the stability is primarily determined by the m = 1 magnetosonic wave, this contribution is found to be dominant and stabilizing when the electron temperature is neglected. For sufficiently high electron temperatures the dominant fast wave is found to be axially evanescent. In the same limit, E_∥ becomes radially oscillating. It is concluded that the increased electron temperature near the plasma surface reduces the magnitude of ponderomotive effects.
Parallelizing alternating direction implicit solver on GPUs
USDA-ARS's Scientific Manuscript database
We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...
Simultaneous Multi-Slice fMRI using Spiral Trajectories
Zahneisen, Benjamin; Poser, Benedikt A.; Ernst, Thomas; Stenger, V. Andrew
2014-01-01
Parallel imaging methods using multi-coil receiver arrays have been shown to be effective for increasing MRI acquisition speed. However parallel imaging methods for fMRI with 2D sequences show only limited improvements in temporal resolution because of the long echo times needed for BOLD contrast. Recently, Simultaneous Multi-Slice (SMS) imaging techniques have been shown to increase fMRI temporal resolution by factors of four and higher. In SMS fMRI multiple slices can be acquired simultaneously using Echo Planar Imaging (EPI) and the overlapping slices are un-aliased using a parallel imaging reconstruction with multiple receivers. The slice separation can be further improved using the “blipped-CAIPI” EPI sequence that provides a more efficient sampling of the SMS 3D k-space. In this paper a blipped-spiral SMS sequence for ultra-fast fMRI is presented. The blipped-spiral sequence combines the sampling efficiency of spiral trajectories with the SMS encoding concept used in blipped-CAIPI EPI. We show that blipped spiral acquisition can achieve almost whole brain coverage at 3 mm isotropic resolution in 168 ms. It is also demonstrated that the high temporal resolution allows for dynamic BOLD lag time measurement using visual/motor and retinotopic mapping paradigms. The local BOLD lag time within the visual cortex following the retinotopic mapping stimulation of expanding flickering rings is directly measured and easily translated into an eccentricity map of the cortex. PMID:24518259
NASA Astrophysics Data System (ADS)
Cossette, Élise; Schneider, David; Audet, Pascal; Grasemann, Bernhard; Habler, Gerlinde
2015-05-01
The crystallographic preferred orientations (CPOs) were measured on a suite of samples representative of different structural depths along the West Cycladic Detachment System, Greece. Electron backscatter diffraction (EBSD) analyses were conducted on calcitic and mica schists, impure quartzites, and a blueschist, and the average seismic properties of the rocks were calculated with the Voigt-Reuss-Hill average of the single minerals' elastic stiffness tensors. The calcitic and quartzitic rocks have P- and S-wave velocity anisotropies (AVp, AVs) averaging 8.1% and 7.1%, respectively. The anisotropy increases with depth, as represented by the blueschist, with AVp averaging 20.3% and AVs averaging 14.5%, due to the content of aligned glaucophane and mica, which strongly control the seismic properties of the rocks. Localised anisotropies of very high magnitude are caused by the presence of mica schists, which possess the strongest anisotropies, with values of ~25% for AVp and AVs. The directions of the fast and slow P-wave velocities are parallel and perpendicular to the foliation, respectively, for most samples. The fast propagation direction has the same NE-SW orientation as the lithospheric stretching direction experienced in the Cyclades since the Late Oligocene. The maximum shear wave anisotropy is subhorizontal, similarly concordant with mineral alignment that developed during extension in the Aegean. Our results strongly favour radial over azimuthal anisotropy in the Aegean mid-crust.
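Two quantities named in the abstract above can be made concrete with a short sketch: the Voigt-Reuss-Hill average and the percentage velocity anisotropy. The study applies the average to full elastic stiffness tensors weighted by the measured CPOs; the version below is a deliberate simplification to a single scalar modulus per mineral. The anisotropy definition AV = 200 (Vmax - Vmin) / (Vmax + Vmin) is the commonly used one and is assumed here, since the abstract does not state it. All input values in the example are hypothetical.

```python
def voigt_reuss_hill(fractions, moduli):
    """Scalar Voigt, Reuss and Hill averages for a mineral mixture.

    fractions: volume fractions summing to 1
    moduli:    one elastic modulus per mineral (same units, e.g. GPa)
    """
    voigt = sum(f * m for f, m in zip(fractions, moduli))        # uniform strain
    reuss = 1.0 / sum(f / m for f, m in zip(fractions, moduli))  # uniform stress
    hill = 0.5 * (voigt + reuss)                                 # arithmetic mean
    return voigt, reuss, hill


def anisotropy_percent(v_max, v_min):
    """Percentage anisotropy from the fastest and slowest velocities."""
    return 200.0 * (v_max - v_min) / (v_max + v_min)


if __name__ == "__main__":
    # Hypothetical two-mineral rock and hypothetical velocities in km/s.
    print(voigt_reuss_hill([0.5, 0.5], [40.0, 80.0]))
    print(anisotropy_percent(7.0, 6.0))
```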
Very fast motion planning for highly dexterous-articulated robots
NASA Technical Reports Server (NTRS)
Challou, Daniel J.; Gini, Maria; Kumar, Vipin
1994-01-01
Due to the inherent danger of space exploration, the need for greater use of teleoperated and autonomous robotic systems in space-based applications has long been apparent. Autonomous and semi-autonomous robotic devices have been proposed for carrying out routine functions associated with scientific experiments aboard the shuttle and space station. Finally, research into the use of such devices for planetary exploration continues. To accomplish their assigned tasks, all such autonomous and semi-autonomous devices will require the ability to move themselves through space without hitting themselves or the objects which surround them. In space it is important to execute the necessary motions correctly when they are first attempted because repositioning is expensive in terms of both time and resources (e.g., fuel). Finally, such devices will have to function in a variety of different environments. Given these constraints, a means for fast motion planning to insure the correct movement of robotic devices would be ideal. Unfortunately, motion planning algorithms are rarely used in practice because of their computational complexity. Fast methods have been developed for detecting imminent collisions, but the more general problem of motion planning remains computationally intractable. However, in this paper we show how the use of multicomputers and appropriate parallel algorithms can substantially reduce the time required to synthesize paths for dexterous articulated robots with a large number of joints. We have developed a parallel formulation of the Randomized Path Planner proposed by Barraquand and Latombe. We have shown that our parallel formulation is capable of formulating plans in a few seconds or less on various parallel architectures including: the nCUBE2 multicomputer with up to 1024 processors (nCUBE2 is a registered trademark of the nCUBE corporation), and a network of workstations.
Fast Computation and Assessment Methods in Power System Analysis
NASA Astrophysics Data System (ADS)
Nagata, Masaki
Power system analysis is essential for efficient and reliable power system operation and control. Recently, online security assessment systems have become important, as more efficient use of power networks is increasingly required. In this article, fast power system analysis techniques such as contingency screening, parallel processing and intelligent systems applications are briefly surveyed from the viewpoint of their application to online dynamic security assessment.
Parallel Plate System for Collecting Data Used to Determine Viscosity
NASA Technical Reports Server (NTRS)
Ethridge, Edwin C. (Inventor); Kaukler, William (Inventor)
2013-01-01
A parallel-plate system collects data used to determine viscosity. A first plate is coupled to a translator so that the first plate can be moved along a first direction. A second plate has a pendulum device coupled thereto such that the second plate is suspended above and parallel to the first plate. The pendulum device constrains movement of the second plate to a second direction that is aligned with the first direction and is substantially parallel thereto. A force measuring device is coupled to the second plate for measuring force along the second direction caused by movement of the second plate.
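The abstract above describes the apparatus but not how viscosity is derived from the collected data. One textbook route, assumed here and not taken from the patent, treats the fluid between the plates as Newtonian in simple shear, so that viscosity is shear stress divided by shear rate: eta = (F / A) / (v / h). The function and parameter names and the example values are hypothetical.

```python
def viscosity_from_plate_data(force_n, gap_m, area_m2, speed_m_s):
    """Dynamic viscosity in Pa*s for simple shear between parallel plates.

    Assumes a Newtonian fluid and a linear velocity profile across the gap.
    """
    shear_stress = force_n / area_m2     # Pa
    shear_rate = speed_m_s / gap_m       # 1/s
    return shear_stress / shear_rate


if __name__ == "__main__":
    # Hypothetical reading: 0.02 N on a 0.01 m^2 plate, 1 mm gap, 2 mm/s.
    print(viscosity_from_plate_data(0.02, 0.001, 0.01, 0.002))
```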
JSD: Parallel Job Accounting on the IBM SP2
NASA Technical Reports Server (NTRS)
Saphir, William; Jones, James Patton; Walter, Howard (Technical Monitor)
1995-01-01
The IBM SP2 is one of the most promising parallel computers for scientific supercomputing - it is fast and usually reliable. One of its biggest problems is a lack of robust and comprehensive system software. Among other things, this software allows a collection of Unix processes to be treated as a single parallel application. It does not, however, provide accounting for parallel jobs other than what is provided by AIX for the individual process components. Without parallel job accounting, it is not possible to monitor system use, measure the effectiveness of system administration strategies, or identify system bottlenecks. To address this problem, we have written jsd, a daemon that collects accounting data for parallel jobs. jsd records information in a format that is easily machine- and human-readable, allowing us to extract the most important accounting information with very little effort. jsd also notifies system administrators in certain cases of system failure.
The cost of parallel consolidation into visual working memory.
Rideaux, Reuben; Edwards, Mark
2016-01-01
A growing body of evidence indicates that information can be consolidated into visual working memory in parallel. Initially, it was suggested that color information could be consolidated in parallel while orientation was strictly limited to serial consolidation (Liu & Becker, 2013). However, we recently found evidence suggesting that both orientation and motion direction items can be consolidated in parallel, with different levels of accuracy (Rideaux, Apthorp, & Edwards, 2015). Here we examine whether there is a cost associated with parallel consolidation of orientation and direction information by comparing performance, in terms of precision and guess rate, on a target recall task where items are presented either sequentially or simultaneously. The results compellingly indicate that motion direction can be consolidated in parallel, but the evidence for orientation is less conclusive. Further, we find that there is a twofold cost associated with parallel consolidation of direction: Both the probability of failing to consolidate one (or both) item/s increases and the precision at which representations are encoded is reduced. Additionally, we find evidence indicating that the increased consolidation failure may be due to interference between items presented simultaneously, and is moderated by item similarity. These findings suggest that a biased competition model may explain differences in parallel consolidation between features.
When fellow customers behave badly: Witness reactions to employee mistreatment by customers.
Hershcovis, M Sandy; Bhatnagar, Namita
2017-11-01
In 3 experiments, we examined how customers react after witnessing a fellow customer mistreat an employee. Drawing on the deontic model of justice, we argue that customer mistreatment of employees leads witnesses (i.e., other customers) to leave larger tips, engage in supportive employee-directed behaviors, and evaluate employees more positively (Studies 1 and 2). We also theorize that witnesses develop less positive treatment intentions and more negative retaliatory intentions toward perpetrators, with anger and empathy acting as parallel mediators of our perpetrator- and target-directed outcomes, respectively. In Study 1, we conducted a field experiment that examined real customers' target-directed reactions to witnessed mistreatment in the context of a fast-food restaurant. In Study 2, we replicated Study 1 findings in an online vignette experiment, and extended it by examining more severe mistreatment and perpetrator-directed responses. In Study 3, we demonstrated that employees who respond to mistreatment uncivilly are significantly less likely to receive the positive outcomes found in Studies 1 and 2 than those who respond neutrally. We discuss the implications of our findings for theory and practice. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Fast Face-Recognition Optical Parallel Correlator Using High Accuracy Correlation Filter
NASA Astrophysics Data System (ADS)
Watanabe, Eriko; Kodate, Kashiko
2005-11-01
We designed and fabricated a fully automatic fast face recognition optical parallel correlator [E. Watanabe and K. Kodate: Appl. Opt. 44 (2005) 5666] based on the VanderLugt principle. The implementation of an as-yet unattained ultra high-speed system was aided by reconfiguring the system to make it suitable for easier parallel processing, as well as by composing a higher accuracy correlation filter and high-speed ferroelectric liquid crystal-spatial light modulator (FLC-SLM). In running trial experiments using this system (dubbed FARCO), we succeeded in acquiring remarkably low error rates of 1.3% for false match rate (FMR) and 2.6% for false non-match rate (FNMR). Given the results of our experiments, the aim of this paper is to examine methods of designing correlation filters and arranging database image arrays for even faster parallel correlation, underlining the issues of calculation technique, quantization bit rate, pixel size and shift from optical axis. The correlation filter has shown excellent performance and higher precision than classical correlation and the joint transform correlator (JTC). Moreover, arrangement of multi-object reference images leads to 10-channel correlation signals, as sharply marked as those of a single channel. This experimental result demonstrates great potential for achieving a processing speed of 10,000 faces/s.
Bit error rate tester using fast parallel generation of linear recurring sequences
Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.
2003-05-06
A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
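The central step in the abstract above, raising a companion matrix to a power so that one matrix multiplication advances the generator by many bits, can be checked with a small sketch over GF(2). The polynomial x^4 + x + 1, the state layout and all function names are assumptions chosen for illustration; the patent's own matrix form and its polynomial selection rule are not reproduced. The example uses n = 3 bits and k = 2 generators, so the exponent is n*k = 6.

```python
def mat_mul_gf2(a, b):
    """Multiply two square 0/1 matrices modulo 2."""
    size = len(a)
    return [[sum(a[i][t] & b[t][j] for t in range(size)) & 1
             for j in range(size)] for i in range(size)]


def mat_pow_gf2(m, power):
    """Raise a square 0/1 matrix to a non-negative power modulo 2."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = m
    while power:
        if power & 1:
            result = mat_mul_gf2(result, base)
        base = mat_mul_gf2(base, base)
        power >>= 1
    return result


def mat_vec_gf2(m, v):
    """Apply a 0/1 matrix to a 0/1 state vector modulo 2."""
    return [sum(x & y for x, y in zip(row, v)) & 1 for row in m]


def companion_matrix(taps):
    """Shift rows plus a feedback row; taps[i] multiplies state bit i."""
    degree = len(taps)
    rows = [[int(j == i + 1) for j in range(degree)]
            for i in range(degree - 1)]
    rows.append(list(taps))
    return rows


def step(state, taps):
    """Advance the shift register by one bit."""
    feedback = sum(c & s for c, s in zip(taps, state)) & 1
    return state[1:] + [feedback]


if __name__ == "__main__":
    taps = [1, 1, 0, 0]          # assumed recurrence s[t+4] = s[t+1] + s[t]
    n_bits, k_generators = 3, 2  # hypothetical sizes
    decimation = mat_pow_gf2(companion_matrix(taps), n_bits * k_generators)
    state = [1, 0, 0, 0]
    stepped = state
    for _ in range(n_bits * k_generators):
        stepped = step(stepped, taps)
    print(mat_vec_gf2(decimation, state) == stepped)
```

Each row of the resulting matrix corresponds to one XOR tree in hardware, which is why a sparser matrix means fewer gates and a shorter propagation delay.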
Magnetic fabric and Petrofabric of Amphibolites in Eastern Himalaya Syntaxis
NASA Astrophysics Data System (ADS)
Li, Wenjing; Zhang, Junfeng; Xu, Haijun
2017-04-01
The Himalaya orogenic belt was formed by the collision of the Eurasian and Indian plates. There are two syntaxes along the orogenic belt, where the lower crust is extruded because of strong stress and deep melting. Our samples are from the eastern Himalaya syntaxis, near Mount Namchabarwa. Sample TO-38 is composed of hornblende, garnet, plagioclase, quartz, ilmenite, magnetite and rutile. The hornblende grains are strongly deformed and have a clear lineation. The garnets, in contrast, are relatively strong and undeformed; they have a white rim of retrograde minerals with S-C fabric. The ilmenite grains are widely distributed and are also deformed, with a slight SPO parallel to the lineation. The magnetite grains are almost cubic with no SPO. We obtained the magnetic fabric of sample TO-38 from anisotropy of magnetic susceptibility (AMS) measurements, and the crystallographic fabrics from EBSD analysis. In hornblende, [001] forms a well-defined point maximum parallel to the lineation; poles to {110} and {010} plot as a girdle normal to the foliation. The ilmenite fabric shows a less pronounced concentration of [0001] axes normal to the foliation and a weak alignment of [11-20] axes subparallel to the lineation. Magnetite is scarce and shows no LPO. The AMS measurements show that the maximum susceptibility direction corresponds to the lineation and is also parallel to the [11-20] axis of ilmenite and the [001] axis of hornblende. The minimum susceptibility direction is parallel to the [0001] axis of ilmenite. The thermomagnetic curves and values of bulk susceptibility reveal a magnetic mineralogy dominated by a mixed contribution of paramagnetic minerals and magnetite. The mean susceptibility ranges from 7.06×10^-3 SI to 33.1×10^-3 SI. We also calculated the seismic anisotropy of the amphibolites; the fast P wave propagates in the lineation direction with an anisotropy of 11.5%. The shear-wave splitting polarization is also along the lineation, with an anisotropy of 6%. According to recent geophysical observations, the Tibetan mid-lower crust has strong anisotropy, which favors an amphibolite-facies mid-lower crust beneath Tibet. Therefore, the correlation between the petrofabric and magnetic fabric of amphibolites can be applied to interpret the deformation and evolution history of the Tibetan Plateau.
Efficient implementation of parallel three-dimensional FFT on clusters of PCs
NASA Astrophysics Data System (ADS)
Takahashi, Daisuke
2003-05-01
In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm on clusters of PCs. The three-dimensional FFT algorithm can be altered into a block three-dimensional FFT algorithm to reduce the number of cache misses. We show that the block three-dimensional FFT algorithm improves performance by utilizing the cache memory effectively. We use the block three-dimensional FFT algorithm to implement the parallel three-dimensional FFT algorithm. We succeeded in obtaining performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
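The abstract does not give the algorithm itself, so the following is only a minimal sketch of the property that such schemes rely on: a 3D FFT separates into passes of 1D FFTs along each axis, and each pass can be applied slab by slab. The function names, the slab size `block`, and the 8×8×8 test array are assumptions for illustration; NumPy stands in for the hand-tuned kernels, and no claim is made that this reproduces the cache behavior or the cluster implementation described above.

```python
import numpy as np

def fft3d_by_axes(x):
    """3D FFT computed as three passes of 1D FFTs, one pass per axis."""
    y = np.fft.fft(x, axis=2)
    y = np.fft.fft(y, axis=1)
    return np.fft.fft(y, axis=0)

def fft3d_blocked(x, block=4):
    """Same transform, but each pass works on one slab of `block` planes at a time.

    Slab-wise processing is the basic idea behind blocking for cache reuse;
    the slab size here is a hypothetical parameter, not a tuned value.
    """
    y = np.array(x, dtype=complex)
    n0, n1, _ = y.shape
    for s in range(0, n0, block):          # passes along axes 2 and 1, slab by slab
        slab = np.fft.fft(y[s:s + block], axis=2)
        y[s:s + block] = np.fft.fft(slab, axis=1)
    for s in range(0, n1, block):          # pass along axis 0, slabs taken along axis 1
        y[:, s:s + block] = np.fft.fft(y[:, s:s + block], axis=0)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 8)) + 1j * rng.standard_normal((8, 8, 8))
max_err = np.max(np.abs(fft3d_by_axes(x) - np.fft.fftn(x)))
```

In a distributed setting the slabs would live on different nodes and the switch between passes would require a data transpose; that communication step is deliberately left out of this sketch.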
NASA Astrophysics Data System (ADS)
Shi, Wei; Hu, Xiaosong; Jin, Chao; Jiang, Jiuchun; Zhang, Yanru; Yip, Tony
2016-05-01
With the development and popularization of electric vehicles, effective management and diagnosis technology for battery systems is urgently needed. In this work, we design a parallel battery model, according to equivalent circuits of parallel voltage and branch current, to study effects of imbalanced currents on parallel large-format LiFePO4/graphite battery systems. Taking a 60 Ah LiFePO4/graphite battery system manufactured by ATL (Amperex Technology Limited, China) as an example, causes of imbalanced currents in the parallel connection are analyzed using our model, and the associated effect mechanisms on long-term stability of each single battery are examined. Theoretical and experimental results show that continuously increasing imbalanced currents during cycling are mainly responsible for the capacity fade of LiFePO4/graphite parallel batteries. Suppressing variations of branch currents is thus a good way to avoid fast performance fade of parallel battery systems.
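To make the notion of imbalanced branch currents concrete, here is a minimal sketch assuming the simplest possible equivalent circuit: each cell is an ideal voltage source in series with one internal resistance, and all cells share the same terminal voltage. The function name and the numerical values (3.30 V, 2 mΩ, 3 mΩ, 60 A) are hypothetical and are not taken from the paper, whose model is likely more detailed.

```python
def branch_currents(ocv, resistance, total_current):
    """Terminal voltage and per-branch currents for cells connected in parallel.

    Assumed model: branch i is a source ocv[i] (volts) in series with
    resistance[i] (ohms); discharge current is counted as positive.
    Kirchhoff's laws give one shared terminal voltage v with
    sum_i (ocv[i] - v) / resistance[i] == total_current.
    """
    conductance = [1.0 / r for r in resistance]
    weighted = sum(e * g for e, g in zip(ocv, conductance))
    v = (weighted - total_current) / sum(conductance)
    return v, [(e - v) * g for e, g in zip(ocv, conductance)]

# Two cells with equal open-circuit voltage but unequal internal resistance.
v_term, currents = branch_currents([3.30, 3.30], [0.002, 0.003], 60.0)
```

With equal open-circuit voltages the load divides in inverse proportion to the internal resistances, so the lower-resistance cell carries the larger share. This is the kind of imbalance the abstract links to capacity fade.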
NASA Astrophysics Data System (ADS)
Schultz, A.
2010-12-01
3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth and the latency required to send and receive network communication packets are limiting factors in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading graphics processing unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provide the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system.
We describe our ongoing efforts to achieve massive parallelization on a novel hybrid GPU testbed machine currently configured with 12 Intel Westmere Xeon CPU cores (or 24 parallel computational threads) with 96 GB DDR3 system memory, 4 GPU subsystems which in aggregate contain 960 NVidia Tesla GPU cores with 16 GB dedicated DDR3 GPU memory, and a second interleaved bank of 4 GPU subsystems containing in aggregate 1792 NVidia Fermi GPU cores with 12 GB dedicated DDR5 GPU memory. We are applying domain decomposition methods to a modified version of Weiss' (2001) 3D frequency domain full physics EM finite difference code, an open source GPL licensed f90 code available for download from www.OpenEM.org. This will be the core of a new hybrid 3D inversion that parallelizes frequencies across CPUs and individual forward solutions across GPUs. We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
Procacci, Piero
2016-06-27
We present a new release (6.0β) of the ORAC program [Marsili et al. J. Comput. Chem. 2010, 31, 1106-1116] with a hybrid OpenMP/MPI (open multiprocessing message passing interface) multilevel parallelism tailored for generalized ensemble (GE) and fast switching double annihilation (FS-DAM) nonequilibrium technology aimed at evaluating the binding free energy in drug-receptor systems on high performance computing platforms. The production of the GE or FS-DAM trajectories is handled using a weak scaling parallel approach on the MPI level only, while a strong scaling force decomposition scheme is implemented for intranode computations with shared memory access at the OpenMP level. The efficiency, simplicity, and inherent parallel nature of the ORAC implementation of the FS-DAM algorithm make the code a possible effective tool for second-generation high-throughput virtual screening in drug discovery and design. The code, along with documentation, testing, and ancillary tools, is distributed under the provisions of the General Public License and can be freely downloaded at www.chim.unifi.it/orac .
The OpenMP Implementation of NAS Parallel Benchmarks and its Performance
NASA Technical Reports Server (NTRS)
Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry
1999-01-01
As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstrated that OpenMP can achieve very good results for parallelization on a shared memory system, but effective use of memory and cache is very important.
Hardware design and implementation of fast DOA estimation method based on multicore DSP
NASA Astrophysics Data System (ADS)
Guo, Rui; Zhao, Yingxiao; Zhang, Yue; Lin, Qianqiang; Chen, Zengping
2016-10-01
In this paper, we present a high-speed real-time signal processing hardware platform based on a multicore digital signal processor (DSP). The platform combines high performance computing, low power consumption, large-capacity data storage and high speed data transmission, which enable it to meet the constraints of real-time direction of arrival (DOA) estimation. To reduce the high computational complexity of the DOA estimation algorithm, a novel real-valued MUSIC estimator is used. The algorithm is decomposed into several independent steps and the time consumption of each step is measured. Based on these timing statistics, we present a new parallel processing strategy that distributes the task of DOA estimation to different cores of the hardware platform. Experimental results demonstrate that the processing capability of the platform meets the constraints of real-time DOA estimation.
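For readers unfamiliar with MUSIC, the sketch below shows the standard complex-valued estimator for a uniform linear array. It is not the real-valued variant used in the paper, whose details the abstract does not give. The array geometry (8 sensors, half-wavelength spacing), the single source at 20 degrees, the noise level, and all names are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def steering(n_sensors, angles_deg, spacing=0.5):
    """Steering vectors of a uniform linear array; spacing is in wavelengths."""
    m = np.arange(n_sensors)[:, None]
    phase = 2j * np.pi * spacing * m * np.sin(np.deg2rad(angles_deg))[None, :]
    return np.exp(phase)

def music_spectrum(snapshots, n_sources, scan_deg):
    """Standard MUSIC pseudospectrum from (n_sensors, n_snapshots) data."""
    n_sensors, n_snap = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_snap      # sample covariance
    _, vecs = np.linalg.eigh(cov)                      # eigenvalues in ascending order
    noise_space = vecs[:, : n_sensors - n_sources]     # smallest-eigenvalue vectors
    proj = noise_space.conj().T @ steering(n_sensors, scan_deg)
    return 1.0 / np.sum(np.abs(proj) ** 2, axis=0)

# Synthetic test case: one narrowband source plus weak white noise.
rng = np.random.default_rng(1)
n_sensors, n_snap, true_deg = 8, 200, 20.0
source = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.1 * (rng.standard_normal((n_sensors, n_snap))
               + 1j * rng.standard_normal((n_sensors, n_snap)))
x = steering(n_sensors, np.array([true_deg])) * source[None, :] + noise

scan = np.arange(-90.0, 90.0, 0.1)
estimate = scan[np.argmax(music_spectrum(x, 1, scan))]
```

The three stages (covariance estimation, eigendecomposition, spectrum scan) are independent steps of the kind the abstract describes timing and distributing across cores; the scan in particular is trivially parallel over angles.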
Shallow Mantle Anisotropy Beneath the Juan de Fuca Plate
NASA Astrophysics Data System (ADS)
VanderBeek, Brandon P.; Toomey, Douglas R.
2017-11-01
The anisotropic fabric of the oceanic mantle lithosphere is often assumed to parallel paleo-relative plate motion (RPM). However, we find evidence that this assumption is invalid beneath the Juan de Fuca (JdF) plate. Using travel times of seismic energy propagating through the topmost mantle, we find that the fast direction of P wave propagation is rotated 18° ± 3° counterclockwise to the paleo-spreading direction and strikes between Pacific-JdF relative and JdF absolute plate motion (APM). The mean mantle velocity is 7.85 ± 0.02 km/s with 4.6% ± 0.4% anisotropy. Synthesis of the plate-averaged Pn anisotropy signal with measurements of Pn anisotropy beneath the JdF Ridge and SKS splits across the JdF plate suggests that the anisotropic structure of the topmost mantle continues to evolve away from the spreading center to more closely align with APM. We infer that the oceanic mantle lithosphere may record the influence of both paleo-RPM and paleo-APM.
A fast new algorithm for a robot neurocontroller using inverse QR decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morris, A.S.; Khemaissia, S.
2000-01-01
A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by the QR decomposition approaches.
An efficient parallel algorithm for matrix-vector multiplication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrickson, B.; Leland, R.; Plimpton, S.
The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
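The n/√p term in the communication cost can be motivated by a two-dimensional block decomposition, in which each processor owns one block of the matrix and touches only n/√p entries of the input and output vectors. The sketch below simulates such a decomposition serially with NumPy. It is an illustration under that assumption only and not the hypercube algorithm of the paper; the function name and the grid size `q` (so that p = q²) are hypothetical.

```python
import numpy as np

def block_matvec(a, x, q):
    """Compute y = a @ x on a simulated q-by-q processor grid (p = q * q).

    Processor (i, j) owns block (i, j) of the matrix, needs the j-th segment
    of x (n / q entries) and contributes a partial result to the i-th segment
    of y. The accumulation over j stands in for a reduction along grid rows.
    """
    n = a.shape[0]
    if a.shape != (n, n) or n % q != 0:
        raise ValueError("expected a square matrix whose size is divisible by q")
    b = n // q
    y = np.zeros(n, dtype=np.result_type(a, x))
    for i in range(q):
        for j in range(q):
            local_block = a[i * b:(i + 1) * b, j * b:(j + 1) * b]
            x_segment = x[j * b:(j + 1) * b]
            y[i * b:(i + 1) * b] += local_block @ x_segment
    return y

rng = np.random.default_rng(2)
a = rng.standard_normal((12, 12))
x = rng.standard_normal(12)
y = block_matvec(a, x, q=3)
```

Because the block boundaries do not depend on where the nonzeros are, the amount of vector data each simulated processor handles is the same for dense and sparse matrices, which is consistent with the sparsity-independent cost quoted above.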
NASA Astrophysics Data System (ADS)
Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike
The portable software FoSSI is introduced that, in combination with additional free solver software packages, allows for an efficient and scalable parallel solution of large sparse linear equation systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, which is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equation systems and associated preconditioners at the same time, even in parallel on separate MPI communicators. The second special feature of FoSSI is the task specifier, a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver through a single subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is community software, proven in an adaptive 2D atmosphere model and a 3D primitive-equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI variant.
Code development in this direction is in an advanced state under the name ScOPES: the Scalable Open Parallel sparse linear Equations Solver.
Walter, Alexander M.; Pinheiro, Paulo S.; Verhage, Matthijs; Sørensen, Jakob B.
2013-01-01
Neurotransmitter release depends on the fusion of secretory vesicles with the plasma membrane and the release of their contents. The final fusion step displays higher-order Ca2+ dependence, but also upstream steps depend on Ca2+. After deletion of the Ca2+ sensor for fast release – synaptotagmin-1 – slower Ca2+-dependent release components persist. These findings have provoked working models involving parallel releasable vesicle pools (Parallel Pool Models, PPM) driven by alternative Ca2+ sensors for release, but no slow release sensor acting on a parallel vesicle pool has been identified. We here propose a Sequential Pool Model (SPM), assuming a novel Ca2+-dependent action: a Ca2+-dependent catalyst that accelerates both forward and reverse priming reactions. While both models account for fast fusion from the Readily-Releasable Pool (RRP) under control of synaptotagmin-1, the origins of slow release differ. In the SPM the slow release component is attributed to the Ca2+-dependent refilling of the RRP from a Non-Releasable upstream Pool (NRP), whereas the PPM attributes slow release to a separate slowly-releasable vesicle pool. Using numerical integration we compared model predictions to data from mouse chromaffin cells. Like the PPM, the SPM explains biphasic release, Ca2+-dependence and pool sizes in mouse chromaffin cells. In addition, the SPM accounts for the rapid recovery of the fast component after strong stimulation, where the PPM fails. The SPM also predicts the simultaneous changes in release rate and amplitude seen when mutating the SNARE-complex. Finally, it can account for the loss of fast- and the persistence of slow release in the synaptotagmin-1 knockout by assuming that the RRP is depleted, leading to slow and Ca2+-dependent fusion from the NRP. 
We conclude that the elusive ‘alternative Ca2+ sensor’ for slow release might be the upstream priming catalyst, and that a sequential model effectively explains Ca2+-dependent properties of secretion without assuming parallel pools or sensors. PMID:24339761
Development of fast parallel multi-technique scanning X-ray imaging at Synchrotron Soleil
NASA Astrophysics Data System (ADS)
Medjoubi, K.; Leclercq, N.; Langlois, F.; Buteau, A.; Lé, S.; Poirier, S.; Mercère, P.; Kewish, C. M.; Somogyi, A.
2013-10-01
A fast multimodal scanning X-ray imaging scheme is prototyped at Soleil Synchrotron. It permits the simultaneous acquisition of complementary information on the sample structure, composition and chemistry by measuring transmission, differential phase contrast, small-angle scattering, and X-ray fluorescence by dedicated detectors with ms dwell time per pixel. The results of the proof of principle experiments are presented in this paper.
Fast MPEG-CDVS Encoder With GPU-CPU Hybrid Computing.
Duan, Ling-Yu; Sun, Wei; Zhang, Xinfeng; Wang, Shiqi; Chen, Jie; Yin, Jianxiong; See, Simon; Huang, Tiejun; Kot, Alex C; Gao, Wen
2018-05-01
The compact descriptors for visual search (CDVS) standard from the ISO/IEC Moving Picture Experts Group has succeeded in enabling interoperability for efficient and effective image retrieval by standardizing the bitstream syntax of compact feature descriptors. However, the intensive computation of a CDVS encoder hinders its wide deployment in industry for large-scale visual search. In this paper, we revisit the merits of the low-complexity design of CDVS core techniques and present a very fast CDVS encoder that leverages the massively parallel execution resources of the graphics processing unit (GPU). We shift the computation-intensive and parallel-friendly modules to state-of-the-art GPU platforms, in which the thread block allocation and the memory access mechanism are jointly optimized to eliminate performance loss. In addition, operations with heavy data dependence are allocated to the CPU to relieve the GPU of extra, unnecessary computation. Furthermore, we demonstrate that the proposed fast CDVS encoder works well with convolutional neural network approaches, which allows the advantages of GPU platforms to be leveraged harmoniously and yields significant performance improvements. Comprehensive experimental results on benchmarks show that the fast CDVS encoder using GPU-CPU hybrid computing is promising for scalable visual search.
Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter
2015-01-20
While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
PCTDSE: A parallel Cartesian-grid-based TDSE solver for modeling laser-atom interactions
NASA Astrophysics Data System (ADS)
Fu, Yongsheng; Zeng, Jiaolong; Yuan, Jianmin
2017-01-01
We present a parallel Cartesian-grid-based time-dependent Schrödinger equation (TDSE) solver for modeling laser-atom interactions. It can simulate the single-electron dynamics of atoms in arbitrary time-dependent vector potentials. We use a split-operator method combined with fast Fourier transforms (FFT) on a three-dimensional (3D) Cartesian grid. Parallelization is realized using a 2D decomposition strategy based on the Message Passing Interface (MPI) library, which results in good parallel scaling on modern supercomputers. We give simple applications for the hydrogen atom using benchmark problems taken from the references and obtain repeatable results. Extensions to other laser-atom systems are straightforward and require minimal modifications of the source code.
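The split-operator FFT technique named above can be sketched in one dimension. The example below is a deliberately reduced illustration, not the PCTDSE code: it uses atomic units, a static harmonic potential instead of a laser-driven atom, a single spatial dimension, and no MPI decomposition. Grid size, box length, time step, and all names are assumptions.

```python
import numpy as np

def split_operator_step(psi, potential, k, dt):
    """One second-order (Strang) split-operator step in atomic units.

    Half step with the potential in position space, full kinetic step in
    momentum space via FFT, then another half step with the potential.
    """
    half_v = np.exp(-0.5j * potential * dt)
    psi = half_v * psi
    psi = np.fft.ifft(np.exp(-0.5j * k ** 2 * dt) * np.fft.fft(psi))
    return half_v * psi

# Hypothetical 1D setup: displaced Gaussian in a harmonic well.
n, length, dt = 256, 40.0, 0.01
dx = length / n
x = (np.arange(n) - n // 2) * dx
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)      # angular wavenumbers of the grid
potential = 0.5 * x ** 2

psi = np.exp(-0.5 * (x - 1.0) ** 2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalize on the grid

for _ in range(1000):
    psi = split_operator_step(psi, potential, k, dt)

norm = np.sum(np.abs(psi) ** 2) * dx
```

Each factor in the step is a pure phase, so the propagation is unitary and the norm stays at 1 up to rounding, which makes norm conservation a convenient sanity check. In 3D the same structure applies with `np.fft.fftn`, and the FFT is the part that a 2D domain decomposition would have to distribute.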
An 81.6 μW FastICA processor for epileptic seizure detection.
Yang, Chia-Hsiang; Shih, Yi-Hsin; Chiueh, Herming
2015-02-01
To improve the performance of epileptic seizure detection, independent component analysis (ICA) is applied to multi-channel signals to separate artifacts and signals of interest. FastICA is an efficient algorithm to compute ICA. To reduce the energy dissipation, eigenvalue decomposition (EVD) is utilized in the preprocessing stage to reduce the convergence time of iterative calculation of ICA components. EVD is computed efficiently through an array structure of processing elements running in parallel. Area-efficient EVD architecture is realized by leveraging the approximate Jacobi algorithm, leading to a 77.2% area reduction. By choosing a proper memory element and a reduced wordlength, the power and area of storage memory are reduced by 95.6% and 51.7%, respectively. The chip area is minimized through fixed-point implementation and architectural transformations. Given a latency constraint of 0.1 s, an 86.5% area reduction is achieved compared to the direct-mapped architecture. Fabricated in 90 nm CMOS, the core area of the chip is 0.40 mm². The FastICA processor, part of an integrated epileptic control SoC, dissipates 81.6 μW at 0.32 V. The computation delay of a frame of 256 samples for 8 channels is 84.2 ms. Compared to prior work, 0.5% power dissipation, 26.7% silicon area, and 3.4× computation speedup are achieved. The performance of the chip was verified on a human dataset.
Parallel processing in a host plus multiple array processor system for radar
NASA Technical Reports Server (NTRS)
Barkan, B. Z.
1983-01-01
Host plus multiple array processor architecture is demonstrated to yield a modular, fast, and cost-effective system for radar processing. Software methodology for programming such a system is developed. Parallel processing with pipelined data flow among the host, array processors, and discs is implemented. Theoretical analysis of performance is made and experimentally verified. The broad class of problems to which the architecture and methodology can be applied is indicated.
Fast Whole-Engine Stirling Analysis
NASA Technical Reports Server (NTRS)
Dyson, Rodger W.; Wilson, Scott D.; Tew, Roy C.; Demko, Rikako
2006-01-01
This presentation discusses a whole-engine simulation approach for physical consistency, REV regenerator modeling, grid layering for smoothness and quality, conjugate heat transfer method adjustment, a high-speed, low-cost parallel cluster, and debugging.
NASA Astrophysics Data System (ADS)
Evangelidis, Christos
2017-04-01
The upper mantle anisotropy pattern across the entire Hellenic subduction zone has been analyzed for fast polarization directions and delay times to investigate the complex 3D pattern of mantle flow around the subducting slab. Previous studies incorporate a significant number of measurements in the back-arc area of the Aegean and in two cross-sections along the Hellenic subduction system. However, the transitional area from oceanic to continental subduction in the Western Hellenic trench has not been adequately sampled so far. Moreover, the eastern termination of the Hellenic subduction and the possible origin of a trench-parallel anisotropy remain unclear. Here, I focus on the two possible ends of the high-curvature Hellenic arc. I have measured SKS splitting parameters at all broadband stations of the Hellenic Unified Seismic Network (HUSN) that had not been measured before, concentrating especially on the transitional area from the oceanic to the continental subduction system. In addition, applying the source-side splitting technique to teleseismic S-wave records from intermediate-depth earthquakes in the Hellenic trench increases the number of anisotropy measurements in regions where no stations are installed. In western Greece, the Hellenic subduction system is separated by the Cephalonia Transform Fault (CTF), a dextral offset of 100 km, into northern and southern segments, which are characterized by different convergence rates and slab composition. Recent seismic data show that north of the CTF there is subducted continental lithosphere, in contrast to the region south of the CTF, where the ongoing subduction is oceanic. The new measurements, combined with previously published observations, provide the most complete up-to-date spatial coverage for the area. Generally, the pronounced zonation of seismic anisotropy across the subduction zone, as inferred from other studies, is also observed here.
Fast SKS splitting directions are trench-normal in the region nearest to the trench. The fast splitting directions change abruptly to trench-parallel above the corner of the mantle wedge and rotate back to trench-normal over the back-arc. Additionally, beneath western Greece, between the western Gulf of Corinth in the south and the Epirus-Thessaly area in the north, a transitional anisotropy pattern emerges that possibly depicts the passage from the continental to the oceanic subducted slabs and the subslab mantle flow due to the trench retreat. At the eastern side of the Hellenic arc, from eastern Crete to the Dodecanese Islands, the inferred subslab measurements of anisotropy show a general trench-perpendicular pattern. This area is characterized as a STEP fault region with multiple trench-normal strike-slip faults. The difference between the fast roll-back in the Aegean and the slow lithospheric processes in western Anatolia is accommodated by a broad shear zone of lithospheric deformation and a possible slab tear, inferred from seismic tomography and geophysical studies but with a relatively unknown geometry. Thus, the observed anisotropy pattern possibly resembles the 3D return flow around the slab edge that is caused by the inferred slab break.
NASA Astrophysics Data System (ADS)
John, B. E.; Cheadle, M. J.; Gee, J. S.; Coogan, L. A.; Gillis, K. M.
2017-12-01
During January and February 2017, the 42-day RV Atlantis PMaG cruise mapped and sampled in-situ fast spread lower crust for 35 km along a flow line at Pito Deep Rift (northeastern Easter microplate). There, ridge-perpendicular escarpments bound Pito Deep and expose up to 3 km sections of crust parallel to the paleo-spreading direction, providing a unique opportunity to test models for the architecture of fast spread lower ocean crust (the plutonic section). Shipboard operations included a >57,000 km² multi-beam survey; ten Sentry dives over 70 km² (nominal m-scale resolution) to facilitate acquisition of detailed magnetic and bathymetric data and to optimize Jason II dive siting for rock sampling and geologic mapping; and nine Jason II dives in 4 areas, recovering >400 samples of gabbroic lower crust, of which 80% are approximately oriented. Combined Sentry mapping and Jason II sampling and imaging of one area provide the most detailed documentation of in situ gabbroic crust (>3 km² of seafloor, over a 1000+ m vertical section) ever completed. Significantly, the area exposes distinct lateral variation in rock type: in the west, 100 m of Fe-Ti oxide rich gabbroic rocks overlie gabbro and olivine gabbro; however, to the east, exposures of primitive, layered troctolitic rocks extend to within 100 m below the dike-gabbro transition. Equivalent troctolitic rocks are found 13 km to the southeast parallel to a flow line, implying shallow primitive rocks are a characteristic feature of EPR lower crust at this location. The high-level position of troctolitic rocks is best explained by construction in a shallow, near steady-state melt lens at a ridge segment center, with some form of gabbro glacier flow active during formation of at least the uppermost lower ocean crust (Perk et al., 2007).
Lateral variation in rock type (adjacent oxide gabbro, gabbro, olivine-rich gabbro and troctolite) over short distances taken with complexity in magmatic fabric orientation (mineral and grain size layering, mineral shape preferred orientation/foliation) on the 100m-scale is perhaps surprising. Post cruise evaluation is required to determine how much of this variability is due to rift related tectonics and/or slumping. If such deformation is limited, our mapping will require adding complexity to current models of fast spread lower ocean crust.
On some Aitken-like acceleration of the Schwarz method
NASA Astrophysics Data System (ADS)
Garbey, M.; Tromeur-Dervout, D.
2002-12-01
In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrid methods for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that the algorithm has a high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, for computer architectures whose performance is limited by memory bandwidth rather than by the flop performance of the CPU. This is nowadays the case for most parallel computers using the RISC processor architecture. We will illustrate this highly desirable property of our algorithm with large-scale computing experiments.
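The acceleration idea named in the abstract above can be illustrated on a scalar sequence. This is a minimal sketch, not the authors' solver: it assumes a hypothetical iteration whose error shrinks by a constant factor per step (the "linear rate of convergence" the abstract refers to), and the names `aitken` and `linear_iteration` are illustrative. In the Aitken-Schwarz setting the same extrapolation would be applied to interface values between subdomains rather than to a single number.

```python
def aitken(x0, x1, x2):
    """Aitken delta-squared extrapolation from three successive iterates."""
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:
        # The sequence has already stagnated; nothing to extrapolate.
        return x2
    return x0 - (x1 - x0) ** 2 / denom


def linear_iteration(x, rate=0.9, limit=2.0):
    """Hypothetical iteration whose error is multiplied by `rate` each step."""
    return limit + rate * (x - limit)


x0 = 0.0
x1 = linear_iteration(x0)
x2 = linear_iteration(x1)

# Plain iteration is still far from the limit after two steps,
# while the extrapolated value recovers it up to rounding.
accelerated = aitken(x0, x1, x2)
```

For an exactly linear error recurrence, three iterates are enough to recover the limit, which is why a slowly converging (but cheap and communication-light) Schwarz sweep can still yield an efficient solver.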
Fast Numerical Solution of the Plasma Response Matrix for Real-time Ideal MHD Control
Glasser, Alexander; Kolemen, Egemen; Glasser, Alan H.
2018-03-26
To help effectuate near real-time feedback control of ideal MHD instabilities in tokamak geometries, a parallelized version of A.H. Glasser’s DCON (Direct Criterion of Newcomb) code is developed. To motivate the numerical implementation, we first solve DCON’s δW formulation with a Hamilton-Jacobi theory, elucidating analytical and numerical features of the ideal MHD stability problem. The plasma response matrix is demonstrated to be the solution of an ideal MHD Riccati equation. We then describe our adaptation of DCON with numerical methods natural to solutions of the Riccati equation, parallelizing it to enable its operation in near real-time. We replace DCON’s serial integration of perturbed modes (which satisfy a singular Euler-Lagrange equation) with a domain-decomposed integration of state transition matrices. Output is shown to match results from DCON with high accuracy, and with computation time < 1 s. Such computational speed may enable active feedback ideal MHD stability control, especially in plasmas whose ideal MHD equilibria evolve with inductive timescale τ ≳ 1 s, as in ITER. Further potential applications of this theory are discussed.
Novel molecular targets for kRAS downregulation: promoter G-quadruplexes
2016-11-01
conditions, and described the structure as having mixed parallel/anti-parallel loops of lengths 2:8:10 in the 5’-3’ direction. Using selective small...and anti-parallel loop directionality of lengths 4:10:8 in the 5’–3’ direction, three tetrads stacked, and involving guanines in runs B, C, E, and F...a tri-stacked structure incorporating runs B, C, E and F with intervening loops of 2, 10, and 8 bases in the 5’–3’ direction. G = black circles, C
NASA Technical Reports Server (NTRS)
Le, G.; Lu, G.; Strangeway, R. J.; Pfaff, R. F., Jr.; Vondrak, Richard R. (Technical Monitor)
2001-01-01
We present in this paper an investigation of IMF-By related plasma convection and cusp field-aligned currents using FAST data and the AMIE model during a prolonged interval with large positive IMF By and northward Bz conditions (By/Bz much greater than 1). Using the FAST single trajectory observations to validate the global convection patterns at key times and key locations, we have demonstrated that the AMIE procedure provides a reasonably good description of plasma circulations in the ionosphere during this interval. Our results show that the plasma convection in the ionosphere is consistent with the anti-parallel merging model. When the IMF has a strongly positive By component under northward conditions, we find that the global plasma convection forms two cells oriented nearly along the Sun-Earth line in the ionosphere. In the northern hemisphere, the dayside cell has clockwise convection mainly circulating within the polar cap on open field lines. A second cell with counterclockwise convection is located on the nightside, circulating across the polar cap boundary. The observed two-cell convection pattern appears to be driven by reconnection along the anti-parallel merging lines poleward of the cusp extending toward the dusk side when IMF By/Bz is much greater than 1. The magnetic tension force on the newly reconnected field lines drives the plasma to move from dusk to dawn in the polar cusp region near the polar cap boundary. The field-aligned currents in the cusp region flow downward into the ionosphere. The return field-aligned currents extend into the polar cap in the center of the dayside convection cell. The field-aligned currents are closed through the Pedersen currents in the ionosphere, which flow poleward from the polar cap boundary along the electric field direction.
NASA Astrophysics Data System (ADS)
Shi, Sheng-bing; Chen, Zhen-xing; Qin, Shao-gang; Song, Chun-yan; Jiang, Yun-hong
2014-09-01
With the development of science and technology, photoelectric equipment now combines visible, infrared, and laser systems, and its degree of integration, informatization, and complexity is higher than in the past. Parallelism and jumpiness of the optical axis are important performance characteristics of photoelectric equipment and directly affect aiming, ranging, orientation, and so on. Jumpiness of the optical axis directly affects the hit precision of precision point-strike weapons, but facilities for testing this performance are lacking. In this paper, a test system for measuring the parallelism and jumpiness of the optical axis is devised. Accurate aiming is not necessary, and data processing is digital, when testing parallelism. The system can directly test the parallelism of multiple axes, of the aiming axis and the laser emission axis, and of the laser emission axis and the laser receiving axis, and it is the first to measure the jumpiness of the optical axis of an optical sighting device. It is a universal test system.
Biomechanical Comparison of Parallel and Crossed Suture Repair for Longitudinal Meniscus Tears.
Milchteim, Charles; Branch, Eric A; Maughon, Ty; Hughey, Jay; Anz, Adam W
2016-04-01
Longitudinal meniscus tears are commonly encountered in clinical practice. Meniscus repair devices have been previously tested and presented; however, prior studies have not evaluated repair construct designs head to head. This study compared a new-generation meniscus repair device, SpeedCinch, with a similar established device, Fast-Fix 360, and a parallel repair construct to a crossed construct. Both devices utilize self-adjusting No. 2-0 ultra-high molecular weight polyethylene (UHMWPE) and 2 polyether ether ketone (PEEK) anchors. The hypotheses were that crossed suture repair constructs would have higher failure loads and stiffness than simple parallel constructs and that the newer repair device would perform similarly to the established device. The study design was a controlled laboratory study. Sutures were placed in an open fashion into the body and posterior horn regions of the medial and lateral menisci in 16 cadaveric knees. Evaluation of 2 repair devices and 2 repair constructs created 4 groups: 2 parallel vertical sutures created with the Fast-Fix 360 (2PFF), 2 crossed vertical sutures created with the Fast-Fix 360 (2XFF), 2 parallel vertical sutures created with the SpeedCinch (2PSC), and 2 crossed vertical sutures created with the SpeedCinch (2XSC). After open placement of the repair construct, each meniscus was explanted and tested to failure on a uniaxial material testing machine. All data were checked for normality of distribution, and 1-way analysis of variance by ranks was chosen to evaluate for statistical significance of maximum failure load and stiffness between groups. Statistical significance was defined as P < .05. The mean maximum failure loads ± 95% CI (range) were 89.6 ± 16.3 N (125.7-47.8 N) (2PFF), 72.1 ± 11.7 N (103.4-47.6 N) (2XFF), 71.9 ± 15.5 N (109.4-41.3 N) (2PSC), and 79.5 ± 25.4 N (119.1-30.9 N) (2XSC). Interconstruct comparison revealed no statistical difference between all 4 constructs regarding maximum failure loads (P = .49).
Stiffness values were also similar, with no statistical difference on comparison (P = .28). Both devices in the current study had similar failure load and stiffness when 2 vertical or 2 crossed sutures were tested in cadaveric human menisci. Simple parallel vertical sutures perform similarly to crossed suture patterns at the time of implantation.
NASA Astrophysics Data System (ADS)
Pastori, M.; Piccinini, D.; Margheriti, L.; Improta, L.; Valoroso, L.; Chiaraluce, L.; Chiarabba, C.
2009-10-01
Shear wave splitting is measured at 19 seismic stations of a temporary network deployed in the Val d'Agri area to record low-magnitude seismic activity. The splitting results suggest the presence of an anisotropic layer between the surface and 15 km depth (i.e. above the hypocentres). The dominant fast polarization direction strikes NW-SE parallel to the Apennines orogen and is approximately parallel to the maximum horizontal stress in the region, as well as to major normal faults bordering the Val d'Agri basin. The size of the normalized delay times in the study region is about 0.01 s km⁻¹, suggesting 4.5 percent shear wave velocity anisotropy (SWVA). On the south-western flank of the basin, where most of the seismicity occurs, we found larger values of normalized delay times, between 0.017 and 0.02 s km⁻¹. These high values suggest a SWVA of 10 percent. These parameters agree with an interpretation of seismic anisotropy in terms of the Extensive-Dilatancy Anisotropy (EDA) model, which considers the rock volume to be pervaded by fluid-saturated microcracks aligned by the active stress field. Anisotropic parameters are consistent with borehole image logs from deep exploration wells in the Val d'Agri oil field that detect pervasive fluid-saturated microcracks striking NW-SE parallel to the maximum horizontal stress in the carbonatic reservoir. However, we cannot rule out the contribution of aligned macroscopic fractures because the main Quaternary normal faults are parallel to the maximum horizontal stress. The strong anisotropy and the concentration of seismicity testify to active deformation along the SW flank of the basin.
An embedded multi-core parallel model for real-time stereo imaging
NASA Astrophysics Data System (ADS)
He, Wenjing; Hu, Jian; Niu, Jingyu; Li, Chuanrong; Liu, Guangyu
2018-04-01
Real-time processing on embedded systems will enhance the application capability of stereo imaging for LiDAR and hyperspectral sensors. Research on task partitioning and scheduling strategies for embedded multiprocessor systems started relatively late compared with that for PCs. In this paper, a parallel model for stereo imaging on an embedded multi-core processing platform is studied and verified. After analyzing the computational load, throughput capacity, and buffering requirements, a two-stage pipeline parallel model based on message transmission is established. This model can be applied to fast stereo imaging for airborne sensors with various characteristics. To demonstrate the feasibility and effectiveness of the parallel model, parallel software based on the 8-core DSP processor TMS320C6678 was designed using test flight data. The results indicate that the design performed well in workload distribution and achieved a speed-up ratio of up to 6.4.
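The structure of a two-stage, message-based pipeline like the one described in the abstract above can be sketched in a few lines. Everything here is an assumption for illustration: the stage bodies are placeholders rather than stereo-imaging kernels, the names `stage_one`, `stage_two`, and `run_pipeline` are hypothetical, and Python threads stand in for DSP cores only to show the message flow and bounded buffering, not to reproduce the reported performance.

```python
import queue
import threading

SENTINEL = None  # message that tells the second stage no more data is coming


def stage_one(frames, outbox):
    """First stage: placeholder preprocessing, then pass the result on as a message."""
    for frame in frames:
        outbox.put([2 * value for value in frame])
    outbox.put(SENTINEL)


def stage_two(inbox, results):
    """Second stage: placeholder reduction of each message it receives."""
    while True:
        message = inbox.get()
        if message is SENTINEL:
            break
        results.append(sum(message))


def run_pipeline(frames, buffer_slots=2):
    """Run both stages concurrently, coupled by a bounded message queue."""
    channel = queue.Queue(maxsize=buffer_slots)  # bounded buffer between stages
    results = []
    first = threading.Thread(target=stage_one, args=(frames, channel))
    second = threading.Thread(target=stage_two, args=(channel, results))
    first.start()
    second.start()
    first.join()
    second.join()
    return results


def parallel_efficiency(speedup, cores):
    """Fraction of ideal speed-up achieved on the given number of cores."""
    return speedup / cores


outputs = run_pipeline([[1, 2], [3, 4]])
efficiency = parallel_efficiency(6.4, 8)
```

The bounded queue plays the role of the buffering requirement mentioned in the abstract: the first stage blocks when the buffer is full, so neither stage can run arbitrarily far ahead of the other. With the reported speed-up of 6.4 on 8 cores, the efficiency helper gives 0.8.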
NASA Technical Reports Server (NTRS)
Zhang, Meng; Maxworthy, Tony
1999-01-01
It has long been recognized that flow in the melt can have a profound influence on the dynamics of a solidifying interface and hence the quality of the solid material. In particular, flow affects the heat and mass transfer, and causes spatial and temporal variations in the flow and melt composition. This results in a crystal with nonuniform physical properties. Flow can be generated by buoyancy, expansion or contraction upon phase change, and thermo-soluto capillary effects. In general, these flows can not be avoided and can have an adverse effect on the stability of the crystal structures. This motivates crystal growth experiments in a microgravity environment, where buoyancy-driven convection is significantly suppressed. However, transient accelerations (g-jitter) caused by the acceleration of the spacecraft can affect the melt, while convection generated from the effects other than buoyancy remain important. Rather than bemoan the presence of convection as a source of interfacial instability, Hurle in the 1960s suggested that flow in the melt, either forced or natural convection, might be used to stabilize the interface. Delves considered the imposition of both a parabolic velocity profile and a Blasius boundary layer flow over the interface. He concluded that fast stirring could stabilize the interface to perturbations whose wave vector is in the direction of the fluid velocity. Forth and Wheeler considered the effect of the asymptotic suction boundary layer profile. They showed that the effect of the shear flow was to generate travelling waves parallel to the flow with a speed proportional to the Reynolds number. There have been few quantitative, experimental works reporting on the coupling effect of fluid flow and morphological instabilities. Huang studied plane Couette flow over cells and dendrites. It was found that this flow could greatly enhance the planar stability and even induce the cell-planar transition. 
A rotating impeller was buried inside the sample cell, driven by an outside rotating magnet, in order to generate the flow. However, it appears that this was not a well-controlled flow and may also have been unsteady. In the present experimental study, we want to study how a forced parallel shear flow in a Hele-Shaw cell interacts with the directionally solidifying crystal interface. The comparison of experimental data shows that the parallel shear flow in a Hele-Shaw cell has a strong stabilizing effect on the planar interface by damping the existing initial perturbations. The flow also shows a stabilizing effect on the cellular interface by slightly reducing the exponential growth rate of cells. The left-right symmetry of cells is broken by the flow, with cells tilting toward the incoming flow direction. The tilting angle increases with the velocity ratio. The experimental results are explained through the parallel flow effect on lateral solute transport. The phenomenon of cells tilting against the flow is consistent with the numerical result of Dantzig and Chao.
Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.
Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio
2014-07-05
A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D-FFT) was developed and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program is limited in its degree of parallelization by the slab-type 3D-FFT. The volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on a large and fine calculation cell (2048³ grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent parallel scalability on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of a chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems. Copyright © 2014 Wiley Periodicals, Inc.
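The slab-type limitation mentioned in the abstract above follows from a simple counting argument, sketched below. The sketch assumes that every process must own at least one grid index along each decomposed axis; the function name `max_ranks` is hypothetical, and whether the program maps processes to nodes or to cores is not stated in the abstract.

```python
def max_ranks(n, decomposed_axes):
    """Upper bound on the process count for an n**3 grid split along the given number of axes."""
    return n ** decomposed_axes


n = 2048             # grid points per dimension, from the abstract
nodes = 16384        # node count, from the abstract
cores_per_node = 8   # cores per node, from the abstract

slab_limit = max_ranks(n, 1)        # slab: only one axis is distributed
volumetric_limit = max_ranks(n, 3)  # volumetric: all three axes are distributed

total_cores = nodes * cores_per_node
points_per_core = n ** 3 // total_cores  # even share of grid points per core
```

Under these assumptions a slab decomposition of the 2048³ grid can occupy at most 2,048 processes, fewer than the 16,384 nodes used, whereas a volumetric decomposition leaves ample room and gives each of the 131,072 cores 65,536 grid points.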
Fast experiments for structure elucidation of small molecules: Hadamard NMR with multiple receivers.
Gierth, Peter; Codina, Anna; Schumann, Frank; Kovacs, Helena; Kupče, Ēriks
2015-11-01
We propose several significant improvements to the PANSY (Parallel NMR SpectroscopY) experiments PANSY-COSY and PANSY-TOCSY. The improved versions of these experiments provide sufficient spectral information for structure elucidation of small organic molecules from just two 2D experiments. The PANSY-TOCSY-Q experiment has been modified to allow for simultaneous acquisition of three different types of NMR spectra: 1D ¹³C of non-protonated carbon sites, 2D TOCSY, and multiplicity-edited 2D HETCOR. In addition, the J-filtered 2D PANSY-gCOSY experiment records a 2D HH gCOSY spectrum in parallel with a ¹J-filtered HC long-range HETCOR spectrum and offers simplified data processing. In addition to parallel acquisition, further time savings are feasible because of significantly smaller F1 spectral windows as compared to the indirect detection experiments. Use of cryoprobes and multiple receivers can significantly alleviate the sensitivity issues that are usually associated with the so-called direct detection experiments. In cases where experiments are sampling limited rather than sensitivity limited, further reduction of experiment time is achieved by using Hadamard encoding. In favorable cases the total recording time for the two PANSY experiments can be reduced to just 40 s. The proposed PANSY experiments provide sufficient information to allow the CMCse software package (Bruker) to solve structures of small organic molecules. Copyright © 2015 John Wiley & Sons, Ltd.
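The principle behind Hadamard encoding, as invoked in the abstract above, can be shown with a toy calculation. This is only a sketch of the encode/decode algebra: the four "channel" amplitudes are made-up numbers, the function names are hypothetical, and no pulse-sequence detail from the paper is represented. Each scan records a sum of all channels with signs taken from one row of a Hadamard matrix, and the individual channels are recovered by applying the matrix again.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def encode(h, signals):
    """Each scan is a signed sum of all channel signals, one scan per matrix row."""
    return [sum(sign * s for sign, s in zip(row, signals)) for row in h]


def decode(h, scans):
    """Recover the channels: the Sylvester matrix is symmetric and H*H = n*I."""
    n = len(h)
    return [sum(h[i][k] * scans[k] for k in range(n)) / n for i in range(n)]


h4 = hadamard(4)
channels = [3.0, 0.0, 5.0, 1.0]   # hypothetical per-frequency amplitudes
scans = encode(h4, channels)
recovered = decode(h4, scans)
```

Because only as many scans as selected channels are needed, the number of increments can be far smaller than in a conventionally sampled indirect dimension, which is the source of the time saving in sampling-limited cases.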
Plasma Physics Calculations on a Parallel Macintosh Cluster
NASA Astrophysics Data System (ADS)
Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter
2000-03-01
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Plasma Physics Calculations on a Parallel Macintosh Cluster
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.; Kokelaar, Pieter R.
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 Mflops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
NASA Astrophysics Data System (ADS)
Qin, Cheng-Zhi; Zhan, Lijun
2012-06-01
As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. 
The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
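A small sequential sketch can make the flow-accumulation step described above concrete. It is not the authors' MFD-md algorithm or their GPU strategy: it assumes a simplified multiple-flow-direction rule in which each cell splits its flow among lower neighbors in proportion to slope, and it orders cells by in-degree (a standard topological traversal) so that every cell is processed once, after all of its upslope contributors. All names are illustrative.

```python
from collections import deque

NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def flow_fractions(dem):
    """For each cell, the fraction of its flow sent to each strictly lower neighbor."""
    rows, cols = len(dem), len(dem[0])
    fractions = {}
    for r in range(rows):
        for c in range(cols):
            slopes = {}
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and dem[nr][nc] < dem[r][c]:
                    distance = (dr * dr + dc * dc) ** 0.5
                    slopes[(nr, nc)] = (dem[r][c] - dem[nr][nc]) / distance
            total = sum(slopes.values())
            fractions[(r, c)] = {cell: s / total for cell, s in slopes.items()}
    return fractions


def flow_accumulation(dem):
    """Accumulate one unit of flow per cell downslope, in topological order."""
    fractions = flow_fractions(dem)
    accumulation = {cell: 1.0 for cell in fractions}
    indegree = {cell: 0 for cell in fractions}
    for targets in fractions.values():
        for target in targets:
            indegree[target] += 1
    # Cells with no upslope contributors are ready immediately.
    ready = deque(cell for cell, degree in indegree.items() if degree == 0)
    while ready:
        cell = ready.popleft()
        for target, fraction in fractions[cell].items():
            accumulation[target] += fraction * accumulation[cell]
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return accumulation


ramp = flow_accumulation([[3, 2, 1]])
plane = flow_accumulation([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
```

Cells whose in-degree reaches zero at the same time are independent of one another, which is the kind of property a parallel implementation can exploit; the depression and flat-area preprocessing mentioned in the abstract is not covered here.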
Fast and confident: postdicting eyewitness identification accuracy in a field study.
Sauerland, Melanie; Sporer, Siegfried L
2009-03-01
The combined postdictive value of postdecision confidence, decision time, and Remember-Know-Familiar (RKF) judgments as markers of identification accuracy was evaluated with 10 targets and 720 participants. In a pedestrian area, passers-by were asked for directions. Identifications were made from target-absent or target-present lineups. Fast (optimum time boundary at 6 seconds) and confident (optimum confidence boundary at 90%) witnesses were highly accurate, slow and nonconfident witnesses highly inaccurate. Although this combination of postdictors was clearly superior to using either postdictor by itself these combinations refer only to a subsample of choosers. Know answers were associated with higher identification performance than Familiar answers, with no difference between Remember and Know answers. The results of participants' post hoc decision time estimates paralleled those with measured decision times. To explore decision strategies of nonchoosers, three subgroups were formed according to their reasons given for rejecting the lineup. Nonchoosers indicating that the target had simply been absent made faster and more confident decisions than nonchoosers stating lack of confidence or lack of memory. There were no significant differences with regard to identification performance across nonchooser groups. (PsycINFO Database Record (c) 2009 APA, all rights reserved).
Exploration of high harmonic fast wave heating on the National Spherical Torus Experiment
NASA Astrophysics Data System (ADS)
Wilson, J. R.; Bell, R. E.; Bernabei, S.; Bitter, M.; Bonoli, P.; Gates, D.; Hosea, J.; LeBlanc, B.; Mau, T. K.; Medley, S.; Menard, J.; Mueller, D.; Ono, M.; Phillips, C. K.; Pinsker, R. I.; Raman, R.; Rosenberg, A.; Ryan, P.; Sabbagh, S.; Stutman, D.; Swain, D.; Takase, Y.; Wilgen, J.
2003-05-01
High harmonic fast wave (HHFW) heating has been proposed as a particularly attractive means for plasma heating and current drive in the high beta plasmas that are achievable in spherical torus (ST) devices. The National Spherical Torus Experiment (NSTX) [M. Ono, S. M. Kaye, S. Neumeyer et al., in Proceedings of the 18th IEEE/NPSS Symposium on Fusion Engineering, Albuquerque, 1999 (IEEE, Piscataway, NJ, 1999), p. 53] is such a device. An rf heating system has been installed on the NSTX to explore the physics of HHFW heating, current drive via rf waves and for use as a tool to demonstrate the attractiveness of the ST concept as a fusion device. To date, experiments have demonstrated many of the theoretical predictions for HHFW. In particular, strong wave absorption on electrons over a wide range of plasma parameters and wave parallel phase velocities, wave acceleration of energetic ions, and indications of current drive for directed wave spectra have been observed. In addition HHFW heating has been used to explore the energy transport properties of NSTX plasmas, to create H-mode discharges with a large fraction of bootstrap current and to control the plasma current profile during the early stages of the discharge.
Fast autonomous holographic adaptive optics
NASA Astrophysics Data System (ADS)
Andersen, G.
2010-07-01
We have created a new adaptive optics system using a holographic modal wavefront sensing method capable of autonomous (computer-free) closed-loop control of a MEMS deformable mirror. A multiplexed hologram is recorded using the maximum and minimum actuator positions on the deformable mirror as the "modes". On reconstruction, an input beam will be diffracted into pairs of focal spots; the intensity ratio of a particular pair determines the absolute wavefront phase at a particular actuator location. The wavefront measurement is made using a fast, sensitive photo-detector array such as a multi-pixel photon counter. This information is then used to directly control each actuator in the MEMS DM without the need for any computer in the loop. We present initial results of a 32-actuator prototype device. We further demonstrate that, being an all-optical, parallel processing scheme, the speed is independent of the number of actuators. In fact, the limitations on speed are ultimately determined by the maximum driving speed of the DM actuators themselves. Finally, being modal in nature, the system is largely insensitive to both obscuration and scintillation. This should make it ideal for laser beam transmission or imaging under highly turbulent conditions.
REPEATING FAST RADIO BURSTS FROM HIGHLY MAGNETIZED PULSARS TRAVELING THROUGH ASTEROID BELTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Z. G.; Wang, J. S.; Huang, Y. F.
Very recently, Spitler et al. and Scholz et al. reported their detections of 16 additional bright bursts in the direction of the fast radio burst (FRB) 121102. This repeating FRB is inconsistent with all of the catastrophic event models put forward previously for hypothetically non-repeating FRBs. Here, we propose a different model, in which highly magnetized pulsars travel through the asteroid belts of other stars. We show that a repeating FRB could originate from such a pulsar encountering a large number of asteroids in the belt. During each pulsar-asteroid impact, an electric field induced outside of the asteroid has such a large component parallel to the stellar magnetic field that electrons are torn off the asteroidal surface and accelerated to ultra-relativistic energies instantaneously. The subsequent movement of these electrons along magnetic field lines will cause coherent curvature radiation, which can account for all of the properties of an FRB. In addition, this model can self-consistently explain the typical duration, luminosity, and repetitive rate of the 17 bursts of FRB 121102. The predicted occurrence rate of repeating FRB sources may imply that our model would be testable in the next few years.
Highly Parallel Alternating Directions Algorithm for Time Dependent Problems
NASA Astrophysics Data System (ADS)
Ganzha, M.; Georgiev, K.; Lirkov, I.; Margenov, S.; Paprzycki, M.
2011-11-01
In our work, we consider the time dependent Stokes equation on a finite time interval and on a uniform rectangular mesh, written in terms of velocity and pressure. For this problem, a parallel algorithm based on a novel direction splitting approach is developed. Here, the pressure equation is derived from a perturbed form of the continuity equation, in which the incompressibility constraint is penalized in a negative norm induced by the direction splitting. The scheme used in the algorithm is composed of two parts: (i) velocity prediction, and (ii) pressure correction. This is a Crank-Nicolson-type two-stage time integration scheme for two and three dimensional parabolic problems in which the second-order derivative, with respect to each space variable, is treated implicitly while the other variable is made explicit at each time sub-step. In order to achieve good parallel performance, the solution of the Poisson problem for the pressure correction is replaced by solving a sequence of one-dimensional second order elliptic boundary value problems in each spatial direction. The parallel code is implemented using the standard MPI functions and tested on two modern parallel computer systems. The performed numerical tests demonstrate a good level of parallel efficiency and scalability of the studied direction-splitting-based algorithm.
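The practical appeal of replacing a Poisson solve with one-dimensional problems is that each of them, once discretized with second-order differences, is a tridiagonal system solvable in linear time. The sketch below shows this for a generic 1D operator of the form (I - r D2), where D2 is the standard second-difference matrix; the operator form, the homogeneous Dirichlet boundary treatment, and the names `thomas` and `solve_1d` are assumptions for illustration and are not taken from the paper.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_1d(rhs, r):
    """Solve (I - r * D2) u = rhs along one grid line, with zero boundary values assumed."""
    n = len(rhs)
    lower = [0.0] + [-r] * (n - 1)   # lower[0] is unused
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * (n - 1) + [0.0]   # upper[n-1] is unused
    return thomas(lower, diag, upper, rhs)


# Right-hand side built by hand from the known solution [1, 2, 3] with r = 0.5.
line_solution = solve_1d([1.0, 2.0, 5.0], 0.5)
```

In a direction-splitting scheme such a solve would be repeated independently for every grid line in a given direction, so the lines can be distributed across processes; how the paper partitions the lines and exchanges data with MPI is not described in the abstract.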
Portion sizes and obesity: responses of fast-food companies.
Young, Lisa R; Nestle, Marion
2007-07-01
Because the sizes of food portions, especially of fast food, have increased in parallel with rising rates of overweight, health authorities have called on fast-food chains to decrease the sizes of menu items. From 2002 to 2006, we examined responses of fast-food chains to such calls by determining the current sizes of sodas, French fries, and hamburgers at three leading chains and comparing them to sizes observed in 1998 and 2002. Although McDonald's recently phased out its largest offerings, current items are similar to 1998 sizes and greatly exceed those offered when the company opened in 1955. Burger King and Wendy's have increased portion sizes, even while health authorities are calling for portion size reductions. Fast-food portions in the United States are larger than in Europe. These observations suggest that voluntary efforts by fast-food companies to reduce portion sizes are unlikely to be effective, and that policy approaches are needed to reduce energy intake from fast food.
Fast word reading in pure alexia: "fast, yet serial".
Bormann, Tobias; Wolfer, Sascha; Hachmann, Wibke; Neubauer, Claudia; Konieczny, Lars
2015-01-01
Pure alexia is a severe impairment of word reading in which individuals process letters serially with a pronounced length effect. Yet, there is considerable variation in the performance of alexic readers with generally very slow, but also occasionally fast responses, an observation addressed rarely in previous reports. It has been suggested that "fast" responses in pure alexia reflect residual parallel letter processing or that they may even be subserved by an independent reading system. Four experiments assessed fast and slow reading in a participant (DN) with pure alexia. Two behavioral experiments investigated frequency, neighborhood, and length effects in forced fast reading. Two further experiments measured eye movements when DN was forced to read quickly, or could respond faster because words were easier to process. Taken together, there was little support for the proposal that "qualitatively different" mechanisms or reading strategies underlie both types of responses in DN. Instead, fast responses are argued to be generated by the same serial-reading strategy.
SKS Splitting and the Scale of Vertical Coherence of the Taiwan Mountain Belt
NASA Astrophysics Data System (ADS)
Kuo, Ban-Yuan; Lin, Shu-Chuan; Lin, Yi-Wei
2018-02-01
Many continental orogens feature a pattern of SKS shear wave splitting with fast polarization directions parallel to the mountain fabrics and delay times of 1-2 s, implying that the crust and lithosphere deform consistently. In the Taiwan arc-continent collision zone, similar pattern of SKS splitting exists, and thereby lithospheric scale deformation due to collision has been assumed. However, recent dynamic modeling demonstrated that the SKS splitting in Taiwan can be generated by the toroidal flow in the asthenosphere induced by the subduction of the Philippine Sea plate and the Eurasian plate. To further evaluate this hypothesis, we analyzed a new data set using a quantitative approach. The results show that models with slab geometries constrained by seismicity explain the observed fast splitting direction to within 25°, whereas the misfit grows to 50-60° if the toroidal flow is disrupted by the presence of a sizable aseismic slab beneath central Taiwan as often suggested by tomographic imaging. However, small sized aseismic slab or detached slab fragment can potentially reconcile the splitting observations. We estimated the scale of vertical coherence to be 10-40 km in the lithosphere and 100-150 km in the asthenosphere, making the former unfavorable for accumulating large delay times. The low coherence is caused by the subduction of the Eurasian plate that creates complex deformation different from what characterizes the compressional tectonics above the plate. This suggests that the mountain building in Taiwan is a shallow process, rather than lithospheric in scale.
Lateral Variations in SKS Splitting Across the MAGIC Array, Central Appalachians
NASA Astrophysics Data System (ADS)
Aragon, John C.; Long, Maureen D.; Benoit, Margaret H.
2017-11-01
The eastern margin of North America has been shaped by several cycles of supercontinent assembly. These past episodes of orogenesis and continental rifting have likely deformed the lithosphere, but the extent, style, and geometry of this deformation remain poorly known. Measurements of seismic anisotropy in the upper mantle can shed light on past lithospheric deformation, but may also reveal contributions from present-day mantle flow in the asthenosphere. Here we examine SKS waveforms and measure splitting of SKS phases recorded by the MAGIC experiment, a dense transect of seismic stations across the central Appalachians. Our measurements constrain small-scale lateral variations in azimuthal anisotropy and reveal distinct regions of upper mantle anisotropy. Stations within the present-day Appalachian Mountains exhibit fast splitting directions roughly parallel to the strike of the mountains and delay times of about 1.0 s. To the west, transverse component waveforms for individual events reveal lateral variability in anisotropic structure. Stations immediately to the east of the mountains exhibit complicated splitting patterns, more null SKS arrivals, and a distinct clockwise rotation of fast directions. The observed variability in splitting behavior argues for contributions from both the lithosphere and the asthenospheric mantle. We infer that the sharp lateral transition in splitting behavior at the eastern edge of the Appalachians is controlled by a change in anisotropy in the lithospheric mantle. We hypothesize that beneath the Appalachians, SKS splitting reflects lithospheric deformation associated with Appalachian orogenesis, while just to the east this anisotropic signature was modified by Mesozoic rifting.
Parallel/distributed direct method for solving linear systems
NASA Technical Reports Server (NTRS)
Lin, Avi
1990-01-01
A new family of parallel schemes for directly solving linear systems is presented and analyzed. It is shown that these schemes exhibit near-optimal performance and enjoy several important features: (1) For large enough linear systems, the design of the appropriate parallel algorithm is insensitive to the number of processors, as its performance grows monotonically with them; (2) It is especially good for large matrices, with dimensions large relative to the number of processors in the system; (3) It can be used in both distributed parallel computing environments and tightly coupled parallel computing systems; and (4) This set of algorithms can be mapped onto any parallel architecture without any major programming difficulties or algorithmic changes.
2015-03-01
Heavy Oxide Inorganic Scintillator Crystals for Direct Detection of Fast Neutrons Based on Inelastic Scattering. Author: Philip R. Rusiecki. Abstract (truncated in the source): Heavy oxide inorganic scintillators may prove viable in the detection of fast neutrons based on the mechanism of ...
NASA Astrophysics Data System (ADS)
Gao, Y.; Wang, Q.; Shi, Y.
2017-12-01
Orogenic belts and strong deformation characterize the northeastern margin of the Tibetan Plateau, where the crust and upper mantle are seismically anisotropic. This study uses records from permanent seismic stations and portable seismic arrays, and applies body-wave analysis techniques, to obtain the spatial distribution of anisotropy in the northeastern front zone of the Tibetan Plateau. Using records of small local earthquakes, we study shear-wave splitting in the upper crust. The polarization of the fast shear wave (PFS) can be obtained, and the PFS is considered to be parallel to the strike of the cracks as well as to the direction of maximum horizontal compressive stress. However, the results show a strong influence from tectonic structures such as faults, suggesting multiple influences including both stress and faults. The spatial distribution of seismic anisotropy in the study zone varies over short distances: the PFS at a station on the strike-slip fault is quite different from the PFS at a station just hundreds of meters away from the fault. Using teleseismic waveforms, we obtained seismic anisotropy in the whole crust from receiver functions. The PFS directions from Pms receiver functions are consistent, generally WNW, and the time delay of the slow S phases is significant. Using SKS, PKS and SKKS phases, we detect seismic anisotropy in the upper mantle by splitting analysis. The fast directions of these phases are also consistent, generally WNW, similar to those from receiver functions but with larger time delays. This suggests significant seismic anisotropy in the crust and that crustal deformation is coherent with that in the upper mantle. Seismic anisotropy in the upper crust, in the whole crust, and in the upper mantle is discussed in terms of both differences and tectonic implications [Grateful to the support by NSFC Project 41474032].
The development of GPU-based parallel PRNG for Monte Carlo applications in CUDA Fortran
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kargaran, Hamed, E-mail: h-kargaran@sbu.ac.ir; Minuchehr, Abdolhamid; Zolfaghari, Ahmad
The implementation of Monte Carlo simulation in CUDA Fortran requires fast random number generation with good statistical properties on the GPU. In this study, a GPU-based parallel pseudo random number generator (GPPRNG) has been proposed for use in high performance computing systems. According to the type of GPU memory usage, the GPU scheme is divided into two work modes, GLOBAL-MODE and SHARED-MODE. To generate parallel random numbers based on the independent sequence method, a combination of the middle-square method and a chaotic map along with the Xorshift PRNG has been employed. Implementation of our developed PPRNG on a single GPU showed a speedup of 150x and 470x (with respect to the speed of the PRNG on a single CPU core) for GLOBAL-MODE and SHARED-MODE, respectively. To evaluate the accuracy of our developed GPPRNG, its performance was compared to that of some other commercially available PPRNGs, such as MATLAB, FORTRAN and the Miller-Park algorithm, by employing specific standard tests. The results of this comparison showed that the GPPRNG developed in this study can be used as a fast and accurate tool for computational science applications.
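As an illustration of the independent sequence method mentioned in the abstract above, the following is a minimal sketch only: it uses Marsaglia's classic 32-bit Xorshift step (shifts 13, 17, 5) and gives each simulated "thread" its own seed. The seed values and the helper name `make_streams` are hypothetical; the paper's own seeding (middle-square method plus chaotic map) and its CUDA Fortran code are not reproduced here.

```python
MASK32 = 0xFFFFFFFF  # keep the state within 32 bits, as a hardware register would


def xorshift32(state):
    """One step of Marsaglia's 32-bit Xorshift generator (shift triple 13, 17, 5)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def make_streams(seeds, count):
    """Generate one sequence of uniform numbers in [0, 1) per seed.

    Each seed stands in for one GPU thread owning an independent sequence.
    The seeds are assumed to be supplied by some external seeding scheme.
    """
    streams = []
    for seed in seeds:
        state = seed & MASK32
        if state == 0:
            raise ValueError("Xorshift requires a nonzero seed")
        values = []
        for _ in range(count):
            state = xorshift32(state)
            values.append(state / 2**32)
        streams.append(values)
    return streams


if __name__ == "__main__":
    # Hypothetical seeds, one per simulated thread.
    for stream in make_streams([1, 2463534242, 88675123], 3):
        print(stream)
```

Distinct seeds alone do not guarantee statistically independent streams, which is presumably why the authors combine several generators and run standard test batteries.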
Algorithm for fast event parameters estimation on GEM acquired data
NASA Astrophysics Data System (ADS)
Linczuk, Paweł; Krawczyk, Rafał D.; Poźniak, Krzysztof T.; Kasprowicz, Grzegorz; Wojeński, Andrzej; Chernyshova, Maryna; Czarski, Tomasz
2016-09-01
We present a study of a software-hardware environment for developing fast computation methods with high throughput and low latency, which can be used as a back-end in High Energy Physics (HEP) and other High Performance Computing (HPC) systems that receive a large volume of input from an electronic-sensor-based front-end. Parallelization possibilities are discussed and tested on Intel HPC solutions, with consideration of applications to Gas Electron Multiplier (GEM) measurement systems.
Hardware-efficient implementation of digital FIR filter using fast first-order moment algorithm
NASA Astrophysics Data System (ADS)
Cao, Li; Liu, Jianguo; Xiong, Jun; Zhang, Jing
2018-03-01
As the digital finite impulse response (FIR) filter can be transformed into the shift-add form of multiple small-sized first-order moments, based on the existing fast first-order moment algorithm, this paper presents a novel multiplier-less structure to calculate any number of sequential filtering results in parallel. The theoretical analysis of its hardware and time complexities reveals that by appropriately setting the degree of parallelism and the decomposition factor of a fixed word width, the proposed structure may achieve better area-time efficiency than the existing two-dimensional (2-D) memoryless-based filter. To evaluate the performance concretely, the proposed designs for different numbers of taps, along with the existing 2-D memoryless-based filters, are synthesized by Synopsys Design Compiler with a 0.18-μm SMIC library. The comparisons show that the proposed design has lower area-time complexity and power consumption when the number of filter taps is larger than 48.
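The general idea behind a first-order moment formulation can be sketched as follows. This is an illustrative reading of the abstract, not the authors' hardware structure: one FIR output sum_k h[k]*x[k] is regrouped as sum_m m*a[m], where a[m] collects the samples whose coefficient equals m, and the moment is then evaluated with additions only. The function name and the restriction to non-negative integer coefficients are assumptions made for the sketch.

```python
def fir_output_by_moment(coeffs, window):
    """Compute sum(h * x) without multiplications.

    coeffs: non-negative integer filter coefficients (quantized taps).
    window: the input samples currently aligned with those taps.
    """
    if len(coeffs) != len(window):
        raise ValueError("coeffs and window must have the same length")
    if any(h < 0 for h in coeffs):
        raise ValueError("this sketch assumes non-negative integer coefficients")

    # Step 1: bucket the samples by coefficient value, a[m] = sum of x with h == m.
    buckets = [0] * (max(coeffs) + 1)
    for h, x in zip(coeffs, window):
        buckets[h] += x

    # Step 2: first-order moment sum(m * a[m]) via two running accumulations.
    # Sweeping m downward, running holds sum of a[j] for j >= m, and adding
    # it once per level counts each a[j] exactly j times in total.
    running = 0
    total = 0
    for m in range(len(buckets) - 1, 0, -1):
        running += buckets[m]
        total += running
    return total


if __name__ == "__main__":
    print(fir_output_by_moment([3, 1, 2], [5, 7, 11]))  # 3*5 + 1*7 + 2*11
```

In this form the cost depends on the coefficient word width rather than on multiplier hardware, which is consistent with the abstract's mention of a decomposition factor for a fixed word width.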
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids
NASA Technical Reports Server (NTRS)
Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids (global and local) to provide adaptive resolution and fast solution of PDEs. Like all such methods, it offers parallelism by using possibly many disconnected patches per level, but is hindered by the need to handle these levels sequentially. The finest levels must therefore wait for processing to be essentially completed on all the coarser ones. A recently developed asynchronous version of FAC, called AFAC, completely eliminates this bottleneck to parallelism. This paper describes timing results for AFAC, coupled with a simple load balancing scheme, applied to the solution of elliptic PDEs on an Intel iPSC hypercube. These tests include performance of certain processes necessary in adaptive methods, including moving grids and changing refinement. A companion paper reports on numerical and analytical results for estimating convergence factors of AFAC applied to very large scale examples.
Parallel algorithms for placement and routing in VLSI design. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Brouwer, Randall Jay
1991-01-01
The computational requirements for high quality synthesis, analysis, and verification of very large scale integration (VLSI) designs have rapidly increased with the fast growing complexity of these designs. Research in the past has focused on the development of heuristic algorithms, special purpose hardware accelerators, or parallel algorithms for the numerous design tasks to decrease the time required for solution. Two new parallel algorithms are proposed for two VLSI synthesis tasks, standard cell placement and global routing. The first algorithm, a parallel algorithm for global routing, uses hierarchical techniques to decompose the routing problem into independent routing subproblems that are solved in parallel. Results are then presented which compare the routing quality to the results of other published global routers and which evaluate the speedups attained. The second algorithm, a parallel algorithm for cell placement and global routing, hierarchically integrates a quadrisection placement algorithm, a bisection placement algorithm, and the previous global routing algorithm. Unique partitioning techniques are used to decompose the various stages of the algorithm into independent tasks which can be evaluated in parallel. Finally, results are presented which evaluate the various algorithm alternatives and compare the algorithm performance to other placement programs. Measurements are presented on the parallel speedups available.
NASA Technical Reports Server (NTRS)
Wigton, Larry
1996-01-01
Improving the numerical linear algebra routines for use in new Navier-Stokes codes, specifically Tim Barth's unstructured grid code, with spin-offs to TRANAIR is reported. A fast distance calculation routine for Navier-Stokes codes using the new one-equation turbulence models is written. The primary focus of this work was devoted to improving matrix-iterative methods. New algorithms have been developed which activate the full potential of classical Cray-class computers as well as distributed-memory parallel computers.
NASA Astrophysics Data System (ADS)
Laitinen, Timo; Effenberger, Frederic; Kopp, Andreas; Dalla, Silvia
2018-02-01
Insights into the processes of Solar Energetic Particle (SEP) propagation are essential for understanding how solar eruptions affect the radiation environment of near-Earth space. SEP propagation is influenced by turbulent magnetic fields in the solar wind, resulting in stochastic transport of the particles from their acceleration site to Earth. While the conventional approach for SEP modelling focuses mainly on the transport of particles along the mean Parker spiral magnetic field, multi-spacecraft observations suggest that the cross-field propagation shapes the SEP fluxes at Earth strongly. However, adding cross-field transport of SEPs as spatial diffusion has been shown to be insufficient in modelling the SEP events without use of unrealistically large cross-field diffusion coefficients. Recently, Laitinen et al. [ApJL 773 (2013b); A&A 591 (2016)] demonstrated that the early-time propagation of energetic particles across the mean field direction in turbulent fields is not diffusive, with the particles propagating along meandering field lines. This early-time transport mode results in fast access of the particles across the mean field direction, in agreement with the SEP observations. In this work, we study the propagation of SEPs within the new transport paradigm, and demonstrate the significance of turbulence strength on the evolution of the SEP radiation environment near Earth. We calculate the transport parameters consistently using a turbulence transport model, parametrised by the SEP parallel scattering mean free path at 1 AU, $\lambda_{\parallel}^{*}$, and show that the parallel and cross-field transport are connected, with conditions resulting in slow parallel transport corresponding to wider events. We find a scaling $\sigma_{\phi,\max} \propto (1/\lambda_{\parallel}^{*})^{1/4}$ for the Gaussian fitting of the longitudinal distribution of maximum intensities. The longitudes with highest intensities are shifted towards the west for strong scattering conditions. Our results emphasise the importance of understanding both the SEP transport and the interplanetary turbulence conditions for modelling and predicting the SEP radiation environment at Earth.
PoPLAR: Portal for Petascale Lifescience Applications and Research
2013-01-01
Background We are focusing specifically on fast data analysis and retrieval in bioinformatics that will have a direct impact on the quality of human health and the environment. The exponential growth of data generated in biology research, from small atoms to big ecosystems, necessitates an increasingly large computational component to perform analyses. Novel DNA sequencing technologies and complementary high-throughput approaches--such as proteomics, genomics, metabolomics, and meta-genomics--drive data-intensive bioinformatics. While individual research centers or universities could once provide for these applications, this is no longer the case. Today, only specialized national centers can deliver the level of computing resources required to meet the challenges posed by rapid data growth and the resulting computational demand. Consequently, we are developing massively parallel applications to analyze the growing flood of biological data and contribute to the rapid discovery of novel knowledge. Methods The efforts of previous National Science Foundation (NSF) projects provided for the generation of parallel modules for widely used bioinformatics applications on the Kraken supercomputer. We have profiled and optimized the code of some of the scientific community's most widely used desktop and small-cluster-based applications, including BLAST from the National Center for Biotechnology Information (NCBI), HMMER, and MUSCLE; scaled them to tens of thousands of cores on high-performance computing (HPC) architectures; made them robust and portable to next-generation architectures; and incorporated these parallel applications in science gateways with a web-based portal. Results This paper will discuss the various developmental stages, challenges, and solutions involved in taking bioinformatics applications from the desktop to petascale with a front-end portal for very-large-scale data analysis in the life sciences. 
Conclusions This research will help to bridge the gap between the rate of data generation and the speed at which scientists can study this data. The ability to rapidly analyze data at such a large scale is having a significant, direct impact on science achieved by collaborators who are currently using these tools on supercomputers. PMID:23902523
Fast realization of nonrecursive digital filters with limits on signal delay
NASA Astrophysics Data System (ADS)
Titov, M. A.; Bondarenko, N. N.
1983-07-01
Attention is given to the problem of achieving a fast realization of nonrecursive digital filters with the aim of reducing signal delay. It is shown that a realization wherein the impulse response of the filter is divided into blocks satisfies the delay requirements and is almost as economical in terms of the number of multiplications as conventional fast convolution. In addition, the block method leads to a reduction in the required memory size and in the number of additions; the short-convolution procedure is substantially simplified. Finally, the block method facilitates the parallelization of computations owing to the simple transfers between subfilters.
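The block decomposition described in this abstract can be sketched as follows, under stated assumptions: the impulse response is cut into blocks of a hypothetical length `block_len`, each block acts as an independent subfilter, and each subfilter output is delayed by its block offset before summation. Plain direct convolution stands in for the fast short-convolution procedure of the paper, which is not reproduced here.

```python
def direct_fir(h, x):
    """Reference nonrecursive filter: y[n] = sum_k h[k] * x[n - k], len(y) == len(x)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def block_fir(h, x, block_len):
    """Same filter, computed as a sum of delayed subfilters of length block_len.

    Each subfilter depends only on the input, so the partial results could be
    computed in parallel and merged by a simple delayed addition.
    """
    if block_len < 1:
        raise ValueError("block_len must be positive")
    y = [0] * len(x)
    for offset in range(0, len(h), block_len):
        sub_h = h[offset:offset + block_len]   # one block of the impulse response
        partial = direct_fir(sub_h, x)         # independent subfilter output
        for n in range(offset, len(x)):        # delay by the block offset and add
            y[n] += partial[n - offset]
    return y


if __name__ == "__main__":
    h = [1, 2, 3, 4, 5]
    x = [1, 0, 2, -1, 3, 4, 0, 1]
    print(block_fir(h, x, 2) == direct_fir(h, x))
```

Because every subfilter is short, its first output samples are available after only a few inputs, which is the source of the reduced signal delay the abstract refers to.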
Seismic anisotropy across the east African plateau from shear wave splitting analysis
NASA Astrophysics Data System (ADS)
Bagley, B. C.; Nyblade, A.; Mulibo, G.; Tugume, F.
2011-12-01
Previous studies of the east African plateau reveal complicated patterns of seismic anisotropy that are not easily explained by a single mechanism. The pattern is defined by rift-parallel fast directions for stations within or near Cenozoic rift valleys, and near-null results in Precambrian terrains away from the rift. Data from 65 temporary Africa Array stations deployed between 2007 and 2011 are being used to make new shear wave splitting measurements. The stations span the east African plateau and cover both the eastern and western branches of the east African rift system, as well as unrifted Proterozoic and Archean terrains in Uganda, Kenya, Tanzania, and Zambia. Through analysis of shear wave splitting we will better constrain the distribution of seismic anisotropy, and from it gain new insight into the tectonic evolution of east Africa.
Lee, Hangyeore; Mun, Dong-Gi; Bae, Jingi; Kim, Hokeun; Oh, Se Yeon; Park, Young Soo; Lee, Jae-Hyuk; Lee, Sang-Won
2015-08-21
We report a new and simple design of a fully automated dual-online ultra-high pressure liquid chromatography system. The system employs only two nano-volume switching valves (a two-position four port valve and a two-position ten port valve) that direct solvent flows from two binary nano-pumps for parallel operation of two analytical columns and two solid phase extraction (SPE) columns. Despite the simple design, the sDO-UHPLC offers many advantageous features that include high duty cycle, back flushing sample injection for fast and narrow zone sample injection, online desalting, high separation resolution and high intra/inter-column reproducibility. This system was applied to analyze proteome samples not only in high throughput deep proteome profiling experiments but also in high throughput MRM experiments.
High-contrast imaging in the cloud with klipReduce and Findr
NASA Astrophysics Data System (ADS)
Haug-Baltzell, Asher; Males, Jared R.; Morzinski, Katie M.; Wu, Ya-Lin; Merchant, Nirav; Lyons, Eric; Close, Laird M.
2016-08-01
Astronomical data sets are growing ever larger, and the area of high contrast imaging of exoplanets is no exception. With the advent of fast, low-noise detectors operating at 10 to 1000 Hz, huge numbers of images can be taken during a single hours-long observation. High frame rates offer several advantages, such as improved registration, frame selection, and improved speckle calibration. However, advanced image processing algorithms are computationally challenging to apply. Here we describe a parallelized, cloud-based data reduction system developed for the Magellan Adaptive Optics VisAO camera, which is capable of rapidly exploring tens of thousands of parameter sets affecting the Karhunen-Loève image processing (KLIP) algorithm to produce high-quality direct images of exoplanets. We demonstrate these capabilities with a visible wavelength high contrast data set of a hydrogen-accreting brown dwarf companion.
Mechanisms for Rapid Adaptive Control of Motion Processing in Macaque Visual Cortex.
McLelland, Douglas; Baker, Pamela M; Ahmed, Bashir; Kohn, Adam; Bair, Wyeth
2015-07-15
A key feature of neural networks is their ability to rapidly adjust their function, including signal gain and temporal dynamics, in response to changes in sensory inputs. These adjustments are thought to be important for optimizing the sensitivity of the system, yet their mechanisms remain poorly understood. We studied adaptive changes in temporal integration in direction-selective cells in macaque primary visual cortex, where specific hypotheses have been proposed to account for rapid adaptation. By independently stimulating direction-specific channels, we found that the control of temporal integration of motion at one direction was independent of motion signals driven at the orthogonal direction. We also found that individual neurons can simultaneously support two different profiles of temporal integration for motion in orthogonal directions. These findings rule out a broad range of adaptive mechanisms as being key to the control of temporal integration, including untuned normalization and nonlinearities of spike generation and somatic adaptation in the recorded direction-selective cells. Such mechanisms are too broadly tuned, or occur too far downstream, to explain the channel-specific and multiplexed temporal integration that we observe in single neurons. Instead, we are compelled to conclude that parallel processing pathways are involved, and we demonstrate one such circuit using a computer model. This solution allows processing in different direction/orientation channels to be separately optimized and is sensible given that, under typical motion conditions (e.g., translation or looming), speed on the retina is a function of the orientation of image components. Many neurons in visual cortex are understood in terms of their spatial and temporal receptive fields. It is now known that the spatiotemporal integration underlying visual responses is not fixed but depends on the visual input. 
For example, neurons that respond selectively to motion direction integrate signals over a shorter time window when visual motion is fast and a longer window when motion is slow. We investigated the mechanisms underlying this useful adaptation by recording from neurons as they responded to stimuli moving in two different directions at different speeds. Computer simulations of our results enabled us to rule out several candidate theories in favor of a model that integrates across multiple parallel channels that operate at different time scales. Copyright © 2015 the authors 0270-6474/15/3510268-13$15.00/0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Jiajia; Wang, Yuming; McIntosh, Scott W.
We combine observations of the Coronal Multi-channel Polarimeter and the Atmospheric Imaging Assembly on board the Solar Dynamics Observatory to study the characteristic properties of (propagating) Alfvénic motions and quasi-periodic intensity disturbances in polar plumes. This unique combination of instruments highlights the physical richness of the processes taking place at the base of the (fast) solar wind. The (parallel) intensity perturbations with intensity enhancements around 1% have an apparent speed of 120 km s⁻¹ (in both the 171 and 193 Å passbands) and a periodicity of 15 minutes, while the (perpendicular) Alfvénic wave motions have a velocity amplitude of 0.5 km s⁻¹, a phase speed of 830 km s⁻¹, and a shorter period of 5 minutes on the same structures. These observations illustrate a scenario where the excited Alfvénic motions are propagating along an inhomogeneously loaded magnetic field structure such that the combination could be a potential progenitor of the magnetohydrodynamic turbulence required to accelerate the fast solar wind.
NASA Astrophysics Data System (ADS)
Homuth, B.; Löbl, U.; Batte, A. G.; Link, K.; Kasereka, C. M.; Rümpker, G.
2016-09-01
Shear-wave splitting measurements from local and teleseismic earthquakes are used to investigate the seismic anisotropy in the upper mantle beneath the Rwenzori region of the East African Rift system. At most stations, shear-wave splitting parameters obtained from individual earthquakes exhibit only minor variations with backazimuth. We therefore employ a joint inversion of SKS waveforms to derive hypothetical one-layer parameters. The corresponding fast polarizations are generally rift parallel and the average delay time is about 1 s. Shear phases from local events within the crust are characterized by an average delay time of 0.04 s. Delay times from local mantle earthquakes are in the range of 0.2 s. This observation suggests that the dominant source region for seismic anisotropy beneath the rift is located within the mantle. We use finite-frequency waveform modeling to test different models of anisotropy within the lithosphere/asthenosphere system of the rift. The results show that the rift-parallel fast polarizations are consistent with horizontal transverse isotropy (HTI anisotropy) caused by rift-parallel magmatic intrusions or lenses located within the lithospheric mantle—as it would be expected during the early stages of continental rifting. Furthermore, the short-scale spatial variations in the fast polarizations observed in the southern part of the study area can be explained by effects due to sedimentary basins of low isotropic velocity in combination with a shift in the orientation of anisotropic fabrics in the upper mantle. A uniform anisotropic layer in relation to large-scale asthenospheric mantle flow is less consistent with the observed splitting parameters.
An accurate, fast, and scalable solver for high-frequency wave propagation
NASA Astrophysics Data System (ADS)
Zepeda-Núñez, L.; Taus, M.; Hewett, R.; Demanet, L.
2017-12-01
In many science and engineering applications, solving time-harmonic high-frequency wave propagation problems quickly and accurately is of paramount importance. For example, in geophysics, particularly in oil exploration, such problems can be the forward problem in an iterative process for solving the inverse problem of subsurface inversion. It is important to solve these wave propagation problems accurately in order to efficiently obtain meaningful solutions of the inverse problems: low order forward modeling can hinder convergence. Additionally, due to the volume of data and the iterative nature of most optimization algorithms, the forward problem must be solved many times. Therefore, a fast solver is necessary to make solving the inverse problem feasible. For time-harmonic high-frequency wave propagation, obtaining both speed and accuracy is historically challenging. Recently, there have been many advances in the development of fast solvers for such problems, including methods which have linear complexity with respect to the number of degrees of freedom. While most methods scale optimally only in the context of low-order discretizations and smooth wave speed distributions, the method of polarized traces has been shown to retain optimal scaling for high-order discretizations, such as hybridizable discontinuous Galerkin methods and for highly heterogeneous (and even discontinuous) wave speeds. The resulting fast and accurate solver is consequently highly attractive for geophysical applications. To date, this method relies on a layered domain decomposition together with a preconditioner applied in a sweeping fashion, which has limited straight-forward parallelization. In this work, we introduce a new version of the method of polarized traces which reveals more parallel structure than previous versions while preserving all of its other advantages. 
We achieve this by further decomposing each layer and applying the preconditioner to these new components separately and in parallel. We demonstrate that this produces an even more effective and parallelizable preconditioner for a single right-hand side. As before, additional speed can be gained by pipelining several right-hand-sides.
A fast ultrasonic simulation tool based on massively parallel implementations
NASA Astrophysics Data System (ADS)
Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain
2014-02-01
This paper presents a CIVA optimized ultrasonic inspection simulation tool, which takes advantage of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on the Kirchhoff approximation, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, and planar rectangular defects or side-drilled holes of small diameter. Validations of the model accuracy and performance measurements are presented.
Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor
NASA Technical Reports Server (NTRS)
Tong, Charles; Swarztrauber, Paul N.
1991-01-01
The present evaluation of alternative, massively parallel hypercube processor-applicable designs for ordered radix-2 decimation-in-frequency FFT algorithms gives attention to the reduction of computation time-dominating communication. A combination of the order and computational phases of the FFT is accordingly employed, in conjunction with sequence-to-processor maps which reduce communication. Two orderings, 'standard' and 'cyclic', in which the order of the transform is the same as that of the input sequence, can be implemented with ease on the Connection Machine (where orderings are determined by geometries and priorities). A parallel method for trigonometric coefficient computation is presented which does not employ trigonometric functions or interprocessor communication.
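For readers unfamiliar with the terminology, the following serial sketch shows what an ordered radix-2 decimation-in-frequency FFT computes. It is a textbook formulation, not the hypercube algorithm of the paper: here the butterfly (computational) phase and the bit-reversal (ordering) phase are kept separate, whereas the abstract describes combining them to reduce communication. The function names are illustrative.

```python
import cmath


def fft_dif_ordered(x):
    """Radix-2 decimation-in-frequency FFT returning naturally ordered output."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Computational phase: butterflies with halving span, twiddle applied
    # to the lower output. The result is left in bit-reversed order.
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                u = a[start + k]
                v = a[start + k + span]
                a[start + k] = u + v
                a[start + k + span] = (u - v) * w
        span //= 2

    # Ordering phase: undo the bit reversal so that out[k] is the k-th coefficient.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r = int(format(i, "b").zfill(bits)[::-1], 2) if bits else 0
        out[r] = a[i]
    return out


def naive_dft(x):
    """O(n^2) reference transform used only to check the FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    data = [1, 2, 0, -1, 3, 0.5, -2, 4]
    err = max(abs(p - q) for p, q in zip(fft_dif_ordered(data), naive_dft(data)))
    print(err < 1e-9)
```

On a distributed machine the ordering phase is pure data movement between processors, which is why the paper treats it as a communication problem.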
Fast parallel molecular algorithms for DNA-based computation: factoring integers.
Chang, Weng-Long; Guo, Minyi; Ho, Michael Shan-Hui
2005-06-01
The RSA public-key cryptosystem is an algorithm that encrypts input data into an unrecognizable form and decrypts the unrecognizable data back into its original form. The security of the RSA public-key cryptosystem is based on the difficulty of factoring the product of two large prime numbers. This paper demonstrates how to factor the product of two large prime numbers, and is a breakthrough in basic biological operations using a molecular computer. To achieve this, we propose three DNA-based algorithms, for a parallel subtractor, a parallel comparator, and parallel modular arithmetic, that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that public-key cryptosystems are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations.
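Why factoring the modulus breaks RSA can be shown with a toy example. The sketch below uses conventional trial division on a deliberately tiny textbook modulus (n = 61 × 53); it says nothing about the DNA-based algorithms of the paper, and the function names are illustrative. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def trial_factor(n):
    """Return (p, q) with p * q == n and p the smallest factor, by trial division."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factor")


def recover_private_exponent(n, e):
    """Derive the RSA private exponent d from the public key once n is factored."""
    p, q = trial_factor(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse of e modulo phi


if __name__ == "__main__":
    n, e = 3233, 17                  # toy public key
    message = 65
    cipher = pow(message, e, n)      # encryption with the public key only
    d = recover_private_exponent(n, e)
    print(d, pow(cipher, d, n))      # recovered exponent and decrypted message
```

For moduli of practical size the trial-division loop is hopeless, and that gap is the difficulty the abstract refers to.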
NASA Astrophysics Data System (ADS)
Wang, Yue; Yu, Jingjun; Pei, Xu
2018-06-01
A new forward kinematics algorithm for the mechanism of 3-RPS (R: Revolute; P: Prismatic; S: Spherical) parallel manipulators is proposed in this study. This algorithm is primarily based on the special geometric conditions of the 3-RPS parallel mechanism, and it eliminates the errors produced by parasitic motions to improve and ensure accuracy. Specifically, the errors can be less than 10^-6. In this method, only the group of solutions that is consistent with the actual situation of the platform is obtained rapidly. This algorithm substantially improves calculation efficiency because the selected initial values are reasonable, and all the formulas in the calculation are analytical. This novel forward kinematics algorithm is well suited for real-time and high-precision control of the 3-RPS parallel mechanism.
Computational electromagnetics: the physics of smooth versus oscillatory fields.
Chew, W C
2004-03-15
This paper starts by discussing the difference in the physics between solutions to Laplace's equation (static) and Maxwell's equations for dynamic problems (Helmholtz equation). Their differing physical characters are illustrated by how the two fields convey information away from their source point. The paper elucidates the fact that their differing physical characters affect the use of Laplacian field and Helmholtz field in imaging. They also affect the design of fast computational algorithms for electromagnetic scattering problems. Specifically, a comparison is made between fast algorithms developed using wavelets, the simple fast multipole method, and the multi-level fast multipole algorithm for electrodynamics. The impact of the physical characters of the dynamic field on the parallelization of the multi-level fast multipole algorithm is also discussed. The relationship of diagonalization of translators to group theory is presented. Finally, future areas of research for computational electromagnetics are described.
Fast Mapping Across Time: Memory Processes Support Children's Retention of Learned Words.
Vlach, Haley A; Sandhofer, Catherine M
2012-01-01
Children's remarkable ability to map linguistic labels to referents in the world is commonly called fast mapping. The current study examined children's (N = 216) and adults' (N = 54) retention of fast-mapped words over time (immediately, after a 1-week delay, and after a 1-month delay). The fast mapping literature often characterizes children's retention of words as consistently high across timescales. However, the current study demonstrates that learners forget word mappings at a rapid rate. Moreover, these patterns of forgetting parallel forgetting functions of domain-general memory processes. Memory processes are critical to children's word learning and the role of one such process, forgetting, is discussed in detail - forgetting supports extended mapping by promoting the memory and generalization of words and categories.
Sabouni, Abas; Pouliot, Philippe; Shmuel, Amir; Lesage, Frederic
2014-01-01
This paper introduces a fast and efficient solver for simulating the induced (eddy) current distribution in the brain during a transcranial magnetic stimulation procedure. This solver has been integrated with MRI and neuronavigation software to accurately model the electromagnetic field and show the eddy current in the head almost in real time. To examine the performance of the proposed technique, we used a 3D anatomically accurate MRI model of a 25-year-old female subject.
Crust-mantle Coupling Seismogenic Mechanism in Sichuan-Yunnan Region
NASA Astrophysics Data System (ADS)
Qiang, H.; Pei, L. S.; Yuan, Z. W.; Dong, L. S.
2016-12-01
The intracrustal weak zone controls the strength of interaction between crust and mantle, restricts the coupling between lithospheric layers, and also affects the mode of interaction between blocks. This effect can be analyzed by comparing deformation and stress at different depths. We compute the velocity field from GPS time series data provided by 81 base stations from 1999 to 2015. Combining these with previous SKS shear-wave splitting data, we analyze the characteristics of horizontal deformation. The mantle convection stress field at the base of the lithosphere in the Sichuan-Yunnan region is calculated using 11 36 spherical harmonic coefficients of the gravity model EGM2008. Meanwhile, the focal mechanisms of 1131 earthquakes that have occurred in the Sichuan-Yunnan region since 2000 are collected and organized. Based on this systematic research, we argue that uneven development of stress is the key to strain energy accumulation, and that the vertical coupling between layers greatly influences the interaction of blocks. Stress delamination exists in blocks that contain an intracrustal weak zone, stress in the edge areas changes significantly in the horizontal and vertical directions, and the seismic risk of the crust above the weak layer is higher. We choose 81 stations in the research area, download their coordinate time series, and use univariate linear regression to obtain each station's average velocity, as shown in figure 1(a), the velocity vector diagram. When propagating through an anisotropic medium, the SKS wave splits into a fast wave polarized parallel to the symmetry axis and a slow wave polarized perpendicular to it.
The fast-wave polarization direction is considered to reflect the lattice-preferred orientation of mantle peridotite under the local stress direction, and thus the deformation of the upper mantle; the delay time between the split waves characterizes the thickness and strength of the anisotropic layer. We collected SKS shear-wave splitting parameters for 130 stations in the area from the literature, including Wang Chunyong et al. [1] and Chang Lijun [2] (as shown in figure 1(b)). Figure 1(c) shows that the GPS crustal deformation directions of the Northwest Yunnan block and the Lhasa block are consistent.
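The station-velocity step described above (a straight-line fit to each coordinate time series) can be sketched as an ordinary least-squares slope. This is an illustrative assumption about the regression, not the authors' processing chain: the function name `station_velocity` is hypothetical, the sample values are synthetic, and real GPS processing would also handle offsets, seasonal terms and outliers.

```python
def station_velocity(times, positions):
    """Least-squares slope of position against time (position units per time unit)."""
    if len(times) != len(positions) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    n = len(times)
    mean_t = sum(times) / n
    mean_x = sum(positions) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tx = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, positions))
    return s_tx / s_tt


# Synthetic east-component series: 3.5 mm/yr drift starting from 10 mm in 1999.
years = [1999.0, 2000.0, 2001.0, 2002.0, 2003.0]
east_mm = [10.0 + 3.5 * (t - 1999.0) for t in years]
velocity_east = station_velocity(years, east_mm)
```

Applying the same fit to the north component gives the second component of the horizontal velocity vector for a station.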
NASA Astrophysics Data System (ADS)
Tonegawa, Takashi; Fukao, Yoshio; Fujie, Gou; Takemura, Shunsuke; Takahashi, Tsutomu; Kodaira, Shuichi
2015-12-01
In the northwestern Pacific, the elastic properties of marine sediments, including P-wave velocities (Vp) and S-wave velocities (Vs), have recently been constrained by active seismic surveys. However, information on S anisotropy associated with the alignments of fractures and fabric remains elusive. To obtain such information, we used ambient noise records observed by ocean-bottom seismometers at 254 sites in the northwestern Pacific to calculate the auto-correlation functions for the S reflection retrieval from the top of the basement. For these S reflections, we measured differential travel times and polarized directions to reveal the potential geographical systematic distribution of S anisotropy. As a result, the observed differential times between fast and slow axes were at most 0.05 s. The fast polarization axes tend to align in the trench-parallel direction in the outer rise region. In particular, their directions changed systematically in accordance with the direction of the trench axis, which changes sharply across the junction of the Kuril and Japan trenches. We consider that a contributing factor for the obtained S anisotropy within marine sediments in the outer rise region is primarily aligned fractures due to the tensional stresses associated with the bending of the Pacific Plate. Moreover, numerical simulations conducted by using the three-dimensional (3D) finite difference method for isotropic and anisotropic media indicate that the successful extraction of S anisotropic information from the S reflection observed in this study is obtained from near-vertically propagating S waves due to extremely low Vs within marine sediments. In addition, we conducted an additional numerical simulation with a realistic velocity model to confirm whether S reflections below the basement can be extracted or not. The resultant auto-correlation function shows only S reflections from the top of the basement.
It appears that such near-vertically propagating S waves obscure S reflections from interfaces below the basement.
Smart Optical Material Characterization System and Method
NASA Technical Reports Server (NTRS)
Choi, Sang Hyouk (Inventor); Park, Yeonjoon (Inventor)
2015-01-01
Disclosed is a system and method for characterizing optical materials, using steps and equipment for generating a coherent laser light, filtering the light to remove high order spatial components, collecting the filtered light and forming a parallel light beam, splitting the parallel beam into a first direction and a second direction wherein the parallel beam travelling in the second direction travels toward the material sample so that the parallel beam passes through the sample, applying various physical quantities to the sample, reflecting the beam travelling in the first direction to produce a first reflected beam, reflecting the beam that passes through the sample to produce a second reflected beam that travels back through the sample, combining the second reflected beam after it travels back through the sample with the first reflected beam, sensing the light beam produced by combining the first and second reflected beams, and processing the sensed beam to determine sample characteristics and properties.
A Fast Algorithm for Massively Parallel, Long-Term, Simulation of Complex Molecular Dynamics Systems
NASA Technical Reports Server (NTRS)
Jaramillo-Botero, Andres; Goddard, William A, III; Fijany, Amir
1997-01-01
The advances in theory and computing technology over the last decade have led to enormous progress in applying atomistic molecular dynamics (MD) methods to the characterization, prediction, and design of chemical, biological, and material systems.
Parallel VLSI architecture emulation and the organization of APSA/MPP
NASA Technical Reports Server (NTRS)
Odonnell, John T.
1987-01-01
The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a Vax. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms.
Automatic Management of Parallel and Distributed System Resources
NASA Technical Reports Server (NTRS)
Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.
1990-01-01
Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul
2013-01-01
Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spanned by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.
Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu
2018-04-20
A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
Orthorectification by Using Gpgpu Method
NASA Astrophysics Data System (ADS)
Sahin, H.; Kulur, S.
2012-07-01
Thanks to the nature of graphics processing, newly released products offer highly parallel processing units with high memory bandwidth and computational power of more than a teraflop. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors with very fast computing capabilities and high memory bandwidth compared to central processing units (CPUs). Data-parallel computation can be briefly described as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capabilities has attracted the attention of researchers dealing with complex problems that need intensive calculations. This interest has given rise to the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". Graphics processors are powerful hardware that is cheap and affordable, so they have become an alternative to computer processors. Graphics chips, which were once standard application hardware, have been transformed into modern, powerful and programmable processors to meet overall needs. Especially in recent years, the use of graphics processing units in general purpose computation has led researchers and developers to this point. The biggest problem is that graphics processing units use programming models that differ from current programming methods. Therefore, efficient GPU programming requires re-coding the current program algorithm while considering the limitations and the structure of the graphics hardware. Currently, multi-core processors cannot be programmed using traditional programming methods. The event procedure programming method cannot be used for programming multi-core processors. GPUs are especially effective at repeating the same computing steps for many data elements when high accuracy is needed.
Thus, they complete the computation more quickly and accurately. Compared to GPUs, CPUs, which perform just one computation at a time according to the flow control, are slower. This structure can be evaluated for various applications of computer technology. This study covers how general purpose parallel programming and the computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm is coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language. Results provided by this method were compared with traditional CPU programming. In the other application, projective rectification is coded using the GPGPU method and the CUDA programming language. The results of the program were evaluated and compared for sample images of various sizes. The GPGPU method is especially useful for repeating the same computations on highly dense data, thus finding the solution quickly.
NASA Astrophysics Data System (ADS)
Tschiersch, R.; Nemschokmichal, S.; Bogaczyk, M.; Meichsner, J.
2017-10-01
Single self-stabilized discharge filaments were investigated in the plane-parallel electrode configuration. The barrier discharge was operated inside a gap of 3 mm shielded by glass plates to both electrodes, using helium-nitrogen mixtures and a square-wave feeding voltage at a frequency of 2 kHz. The combined application of electrical measurements, ICCD camera imaging, optical emission spectroscopy and surface charge diagnostics via the electro-optic Pockels effect allowed the correlation of the discharge development in the volume and on the dielectric surfaces. The formation criteria and existence regimes were found by systematic variation of the nitrogen admixture to helium, the total pressure and the feeding voltage amplitude. Single self-stabilized discharge filaments can be operated over a wide parameter range, foremost, by significant reduction of the voltage amplitude after the operation in the microdischarge regime. Here, the outstanding importance of the surface charge memory effect on the long-term stability was pointed out by the recalculated spatio-temporally resolved gap voltage. The optical emission revealed discharge characteristics that are partially reminiscent of both the glow-like barrier discharge and the microdischarge regime, such as a Townsend pre-phase, a fast cathode-directed ionization front during the breakdown and radially propagating surface discharges during the afterglow.
Reverse engineering and analysis of large genome-scale gene networks
Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas
2013-01-01
Reverse engineering the whole-genome networks of complex multicellular organisms continues to remain a challenge. While simpler models easily scale to large number of genes and gene expression datasets, more accurate models are compute intensive limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
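The mutual information score at the heart of this kind of network inference can be illustrated with a simple equal-width histogram estimator. This is a deliberately simplified stand-in, not TINGe's B-spline formulation or its permutation testing; the function name `mutual_information` and the `bins` parameter are assumptions made for the sketch.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of mutual information (in nats) between two expression profiles."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("profiles must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid zero width for constant profiles
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    count_x, count_y = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written with raw counts.
        mi += (c / n) * math.log((c * n) / (count_x[i] * count_y[j]))
    return mi
```

In a network setting, such a score would be computed for every gene pair and an edge kept only when the score is statistically significant; the pairwise computations are independent, which is what makes the problem amenable to parallelization.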
SDS: A Framework for Scientific Data Services
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong, Bin; Byna, Surendra; Wu, Kesheng
2013-10-31
Large-scale scientific applications typically write their data to parallel file systems with organizations designed to achieve fast write speeds. Analysis tasks frequently read the data in a pattern that is different from the write pattern, and therefore experience poor I/O performance. In this paper, we introduce a prototype framework for bridging the performance gap between write and read stages of data access from parallel file systems. We call this framework Scientific Data Services, or SDS for short. This initial implementation of SDS focuses on reorganizing previously written files into data layouts that benefit read patterns, and transparently directs read calls to the reorganized data. SDS follows a client-server architecture. The SDS Server manages partial or full replicas of reorganized datasets and serves SDS Clients' requests for data. The current version of the SDS client library supports HDF5 programming interface for reading data. The client library intercepts HDF5 calls and transparently redirects them to the reorganized data. The SDS client library also provides a querying interface for reading part of the data based on user-specified selective criteria. We describe the design and implementation of the SDS client-server architecture, and evaluate the response time of the SDS Server and the performance benefits of SDS.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...
2013-01-01
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numerical results.
Measuring signal-to-noise ratio in partially parallel imaging MRI
Goerner, Frank L.; Clarke, Geoffrey D.
2011-01-01
Purpose: To assess five different methods of signal-to-noise ratio (SNR) measurement for partially parallel imaging (PPI) acquisitions. Methods: Measurements were performed on a spherical phantom and three volunteers using a multichannel head coil on a clinical 3T MRI system to produce echo planar, fast spin echo, gradient echo, and balanced steady state free precession image acquisitions. Two different PPI acquisitions, generalized autocalibrating partially parallel acquisition algorithm and modified sensitivity encoding with acceleration factors (R) of 2–4, were evaluated and compared to nonaccelerated acquisitions. Five standard SNR measurement techniques were investigated and Bland–Altman analysis was used to determine agreement between the various SNR methods. The estimated g-factor values, associated with each method of SNR calculation and PPI reconstruction method, were also subjected to assessments that considered the effects on SNR due to reconstruction method, phase encoding direction, and R-value. Results: Only two SNR measurement methods produced g-factors in agreement with theoretical expectations (g ≥ 1). Bland–Altman tests demonstrated that these two methods also gave the most similar results relative to the other three measurements. R-value was the only factor of the three we considered that showed significant influence on SNR changes. Conclusions: Non-signal methods used in SNR evaluation do not produce results consistent with expectations in the investigated PPI protocols. Two of the methods studied provided the most accurate and useful results. Of these two methods, it is recommended, when evaluating PPI protocols, the image subtraction method be used for SNR calculations due to its relative accuracy and ease of implementation. PMID:21978049
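The image subtraction method recommended above is commonly computed from two identically acquired images: the signal is the mean over a region of interest (ROI), and the noise is the standard deviation of the difference image divided by the square root of two. The sketch below follows that common textbook form together with the usual relation g = SNR_full / (SNR_accel · √R); both formulas and the function names are assumptions for illustration, not details taken from the paper's protocol.

```python
import math
import statistics


def snr_difference_method(roi_a, roi_b):
    """SNR from the same ROI in two repeated acquisitions (flattened pixel lists)."""
    if len(roi_a) != len(roi_b) or len(roi_a) < 2:
        raise ValueError("ROIs must have equal length of at least two pixels")
    mean_signal = statistics.fmean((a + b) / 2 for a, b in zip(roi_a, roi_b))
    difference = [a - b for a, b in zip(roi_a, roi_b)]
    # Subtracting two images doubles the noise variance, hence the sqrt(2).
    noise_sd = statistics.stdev(difference) / math.sqrt(2)
    return mean_signal / noise_sd


def g_factor(snr_full, snr_accelerated, r):
    """Geometry factor implied by the SNR loss at acceleration factor r."""
    return snr_full / (snr_accelerated * math.sqrt(r))
```

A g-factor below 1 from such a calculation would signal a problem with the SNR measurement itself, which is the consistency check the abstract applies to the five methods.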
Slattery, Stuart R.
2015-12-02
In this study we analyze and extend mesh-free algorithms for three-dimensional data transfer problems in partitioned multiphysics simulations. We first provide a direct comparison between a mesh-based weighted residual method using the common-refinement scheme and two mesh-free algorithms leveraging compactly supported radial basis functions: one using a spline interpolation and one using a moving least square reconstruction. Through the comparison we assess both the conservation and accuracy of the data transfer obtained from each of the methods. We do so for a varying set of geometries with and without curvature and sharp features and for functions with and without smoothness and with varying gradients. Our results show that the mesh-based and mesh-free algorithms are complementary with cases where each was demonstrated to perform better than the other. We then focus on the mesh-free methods by developing a set of algorithms to parallelize them based on sparse linear algebra techniques. This includes a discussion of fast parallel radius searching in point clouds and restructuring the interpolation algorithms to leverage data structures and linear algebra services designed for large distributed computing environments. The scalability of our new algorithms is demonstrated on a leadership class computing facility using a set of basic scaling studies. Finally, these scaling studies show that for problems with reasonable load balance, our new algorithms for both spline interpolation and moving least square reconstruction demonstrate both strong and weak scalability using more than 100,000 MPI processes with billions of degrees of freedom in the data transfer operation.
The transcriptomics of ecological convergence between 2 limnetic coregonine fishes (Salmonidae).
Derome, N; Bernatchez, L
2006-12-01
Species living in comparable habitats often display strikingly similar patterns of specialization, suggesting that natural selection can lead to predictable evolutionary changes. Elucidating the genomic basis underlying such adaptive phenotypic changes is a major goal in evolutionary biology. Increasing evidence indicates that natural selection would first modulate gene regulation during the process of population divergence. Previously, we showed that parallel phenotypic adaptations of the dwarf whitefish (Coregonus clupeaformis) ecotype to the limnetic trophic niche involved parallel transcriptional changes at the same genes involved in muscle contraction and energetic metabolism relative to the sympatric normal ecotype. Here, we tested whether the same genes are also implicated in a limnetic specialist species, the cisco (Coregonus artedi), which is the most likely competitor of dwarf whitefish. Significant upregulation was detected in cisco at the same 6 candidate genes functionally involved in modulating swimming activity, namely 5 variants of a major protein of fast muscle and 1 putative catalytic crystallin enzyme. Moreover, 3 of 5 variants and the same putative catalytic crystallin enzyme were upregulated in cisco relative to the dwarf ecotype, indicating a greater physiological potential of the former for exploiting the limnetic trophic niche. This study provides the first empirical evidence that recent, parallel phenotypic evolution toward the use of the same ecological niche occupied by a specialist competitor involved similar adaptive changes in expression at the same genes. As such, this study provides strong support to the general hypothesis that directional selection acting on gene regulation may promote rapid phenotypic divergence and ultimately speciation.
Parallel equilibrium current effect on existence of reversed shear Alfvén eigenmodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Hua-sheng, E-mail: huashengxie@gmail.com; Xiao, Yong, E-mail: yxiao@zju.edu.cn
2015-02-15
A new fast global eigenvalue code, where the terms are segregated according to their physics contents, is developed to study Alfvén modes in tokamak plasmas, particularly, the reversed shear Alfvén eigenmode (RSAE). Numerical calculations show that the parallel equilibrium current corresponding to the kink term is strongly unfavorable for the existence of the RSAE. An improved criterion for the RSAE existence is given for cases with and without the parallel equilibrium current. In the limits of ideal magnetohydrodynamics (MHD) and zero-pressure, the toroidicity effect is the main possible favorable factor for the existence of the RSAE, which is however usually small. This suggests that it is necessary to include additional physics such as kinetic term in the MHD model to overcome the strong unfavorable effect of the parallel current in order to enable the existence of RSAE.
Symplectic molecular dynamics simulations on specially designed parallel computers.
Borstnik, Urban; Janezic, Dusanka
2005-01-01
We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
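For readers unfamiliar with symplectic integration, the sketch below shows velocity Verlet, a standard symplectic scheme, applied to a one-dimensional harmonic oscillator. It is not the SISM itself, which additionally treats high-frequency vibrations analytically; the function name and the test system are assumptions chosen only to show the bounded energy error that makes symplectic methods attractive for long MD runs.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v by `steps` velocity-Verlet steps of size dt."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update with current acceleration
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


def harmonic_force(x):
    """Restoring force of a unit-stiffness spring."""
    return -x


# Unit mass released from x = 1 at rest; exact energy is 0.5 at all times.
x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)
energy_end = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
```

The time step in such schemes is limited by the fastest motion in the system, which is why treating high-frequency vibrations analytically, as the abstract describes, allows longer steps.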
Fast parallel 3D profilometer with DMD technology
NASA Astrophysics Data System (ADS)
Hou, Wenmei; Zhang, Yunbo
2011-12-01
The confocal microscope has been a powerful tool for three-dimensional profile analysis. The single-mode confocal microscope is limited by scanning speed. This paper presents a 3D profilometer prototype of a parallel confocal microscope based on a DMD (Digital Micromirror Device). In this system the DMD takes the place of the Nipkow disk, a classical parallel scanning scheme, to realize a parallel lateral scanning technique. Operated with a certain pattern, the DMD generates a virtual pinhole array which separates the light into multiple beams. The key parameters that affect the measurement (pinhole size and the lateral scanning distance) can be configured conveniently by different patterns sent to the DMD chip. To avoid disturbance between two virtual pinholes working at the same time, a scanning strategy is adopted. Depth response curves, both axial and abaxial, were extracted. Measurement experiments have been carried out on a silicon structured sample, and an axial resolution of 55 nm is achieved.
Parallel Continuous Flow: A Parallel Suffix Tree Construction Tool for Whole Genomes
Farreras, Montse
2014-01-01
The construction of suffix trees for very long sequences is essential for many applications, and it plays a central role in the bioinformatic domain. With the advent of modern sequencing technologies, biological sequence databases have grown dramatically. The methodologies required to analyze these data have also become more complex every day, requiring fast queries to multiple genomes. In this article, we present parallel continuous flow (PCF), a parallel suffix tree construction method that is suitable for very long genomes. We tested our method for the suffix tree construction of the entire human genome, about 3 GB. We showed that PCF can scale gracefully as the size of the input genome grows. Our method can work with an efficiency of 90% with 36 processors and 55% with 172 processors. We can index the human genome in 7 minutes using 172 processes. PMID:24597675
Parallelization and automatic data distribution for nuclear reactor simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebrock, L.M.
1997-07-01
Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine cannot run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.
A Massively Parallel Code for Polarization Calculations
NASA Astrophysics Data System (ADS)
Akiyama, Shizuka; Höflich, Peter
2001-03-01
We present an implementation of our Monte-Carlo radiation transport method for rapidly expanding, NLTE atmospheres for massively parallel computers which utilizes both the distributed and shared memory models. This allows us to take full advantage of the fast communication and low latency inherent to nodes with multiple CPUs, and to stretch the limits of scalability with the number of nodes compared to a version which is based on the shared memory model. Test calculations on a local 20-node Beowulf cluster with dual CPUs showed an improved scalability by about 40%.
Determination of backbone chain direction of PDA using FFM
NASA Astrophysics Data System (ADS)
Jo, Sadaharu; Okamoto, Kentaro; Takenaga, Mitsuru
2010-01-01
The effect of backbone chains on friction force was investigated on both Langmuir-Blodgett (LB) films of 10,12-heptacosadiynoic acid and the (0 1 0) surfaces of single crystals of 2,4-hexadiene-1,6-diol using friction force microscopy (FFM). It was observed that friction force decreased when the scanning direction was parallel to the [0 0 1] direction in both samples. Moreover, friction force decreased when the scanning direction was parallel to the crystallographic [1 0 2], [1 0 1], [1 0 0] and [1 0 1¯] directions in only the single crystals. For the LB films, the [0 0 1] direction corresponds to the backbone chain direction of 10,12-heptacosadiynoic acid. For the single crystals, both the [0 0 1] and [1 0 1] directions correspond to the backbone chain direction, and the [1 0 2], [1 0 0] and [1 0 1¯] directions correspond to the low-index crystallographic direction. In both the LB films and single crystals, the friction force was minimized when the directions of scanning and the backbone chain were parallel.
A parallel approach of COFFEE objective function to multiple sequence alignment
NASA Astrophysics Data System (ADS)
Zafalon, G. F. D.; Visotaky, J. M. V.; Amorim, A. R.; Valêncio, C. R.; Neves, L. A.; de Souza, R. C. G.; Machado, J. M.
2015-09-01
Computational tools to assist genomic analyses are increasingly necessary because of the rapid growth in the amount of available data. Given the high computational cost of deterministic algorithms for sequence alignment, many works concentrate on developing heuristic approaches to multiple sequence alignment. However, selecting an approach that offers solutions with good biological significance and feasible execution time is a great challenge. This work therefore presents the parallelization of the processing steps of the MSA-GA tool, using the multithread paradigm in the execution of the COFFEE objective function. The standard objective function implemented in the tool is the Weighted Sum of Pairs (WSP), which produces some distortions in the final alignments when sets of sequences with low similarity are aligned. In previous studies we therefore implemented the COFFEE objective function in the tool to smooth these distortions. Although the COFFEE objective function increases execution time, it contains steps that can be executed in parallel. With the improvements implemented in this work, the new approach runs 24% faster than the sequential approach with COFFEE. Moreover, the multithreaded COFFEE approach is more efficient than WSP because, besides being slightly faster, it gives better biological results.
Hesford, Andrew J; Tillett, Jason C; Astheimer, Jeffrey P; Waag, Robert C
2014-08-01
Accurate and efficient modeling of ultrasound propagation through realistic tissue models is important to many aspects of clinical ultrasound imaging. Simplified problems with known solutions are often used to study and validate numerical methods. Greater confidence in a time-domain k-space method and a frequency-domain fast multipole method is established in this paper by analyzing results for realistic models of the human breast. Models of breast tissue were produced by segmenting magnetic resonance images of ex vivo specimens into seven distinct tissue types. After confirming with histologic analysis by pathologists that the model structures mimicked in vivo breast, the tissue types were mapped to variations in sound speed and acoustic absorption. Calculations of acoustic scattering by the resulting model were performed on massively parallel supercomputer clusters using parallel implementations of the k-space method and the fast multipole method. The efficient use of these resources was confirmed by parallel efficiency and scalability studies using large-scale, realistic tissue models. Comparisons between the temporal and spectral results were performed in representative planes by Fourier transforming the temporal results. An RMS field error less than 3% throughout the model volume confirms the accuracy of the methods for modeling ultrasound propagation through human breast.
NASA Astrophysics Data System (ADS)
Wang, Tai-Han; Huang, Da-Nian; Ma, Guo-Qing; Meng, Zhao-Hai; Li, Ye
2017-06-01
With the continuous development of full tensor gradiometer (FTG) measurement techniques, three-dimensional (3D) inversion of FTG data is becoming increasingly used in oil and gas exploration. In the fast processing and interpretation of large-scale high-precision data, the use of the graphics processing unit (GPU) and preconditioning methods are very important in the data inversion. In this paper, an improved preconditioned conjugate gradient algorithm is proposed by combining the symmetric successive over-relaxation (SSOR) technique and the incomplete Cholesky decomposition conjugate gradient algorithm (ICCG). Since preparing the preconditioner requires extra time, a parallel implementation based on GPU is proposed. The improved method is then applied in the inversion of noise-contaminated synthetic data to prove its adaptability in the inversion of 3D FTG data. Results show that the parallel SSOR-ICCG algorithm based on NVIDIA Tesla C2050 GPU achieves a speedup of approximately 25 times that of a serial program using a 2.0 GHz Central Processing Unit (CPU). Real airborne gravity-gradiometry data from Vinton salt dome (southwest Louisiana, USA) are also considered. Good results are obtained, which verifies the efficiency and feasibility of the proposed parallel method in fast inversion of 3D FTG data.
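To illustrate what SSOR preconditioning of conjugate gradients involves, here is a minimal serial NumPy sketch. It is not the authors' SSOR-ICCG algorithm or their GPU code: it uses a plain SSOR preconditioner on a small dense symmetric positive-definite test matrix, and the function names, the relaxation factor `omega`, and the test problem are all assumptions made for this example.

```python
import numpy as np

def ssor_apply(A, r, omega=1.0):
    """Apply the SSOR preconditioner inverse, z = M^{-1} r, where
    M = (D + omega*L) D^{-1} (D + omega*U) / (omega * (2 - omega))."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)
    U = np.triu(A, k=1)
    y = np.linalg.solve(D + omega * L, r)        # forward sweep
    z = np.linalg.solve(D + omega * U, D @ y)    # backward sweep
    return omega * (2.0 - omega) * z

def pcg(A, b, omega=1.0, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive-definite A."""
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b - A @ x
    z = ssor_apply(A, r, omega)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k + 1
        z = ssor_apply(A, r, omega)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Small test problem: 1D Laplacian (tridiagonal, symmetric positive definite).
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, iterations = pcg(A, b, omega=1.2)
```

The two triangular solves in `ssor_apply` are the part that is inherently sequential and therefore the natural target when moving such a solver to a GPU.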
Elastic Properties of 3D-Printed Rock Models: Dry and Saturated Cracks
NASA Astrophysics Data System (ADS)
Huang, L.; Stewart, R.; Dyaur, N.
2014-12-01
Many regions of subsurface interest are, or will be, fractured. In addition, these zones may be subject to varying saturations and stresses. New 3D printing techniques using different materials and structures provide opportunities to understand porous or fractured materials and fluid effects on their elastic properties. We use a 3D printer (Stratasys Dimension SST 768) to print two rock models: a solid octahedral prism and a porous cube with thousands of penny-shaped cracks. The printing material is ABS thermal plastic with a density of 1.04 g/cm3. After printing, we measure the elastic properties of the models, both dry and 100% saturated with water. Both models exhibit VTI (Vertical Transverse Isotropic) symmetry due to the layering (about 0.25 mm thick) of the printing process. The prism has a density of 0.96 g/cm3 before saturation and 1.00 g/cm3 after saturation. Its effective porosity is calculated to be 4 %. We use ultrasonic transducers (500 kHz) to measure both P- and shear-wave velocities, and the raw material has a P-wave velocity of 1.89 km/s and a shear-wave velocity of 0.91 km/s. After saturation, P-wave velocity in the prism increases from 1.81 km/s to 1.84 km/s in the direction parallel to layering and from 1.73 km/s to 1.81 km/s in the direction perpendicular to layering. The fast shear-wave velocity decreases from 0.88 km/s to 0.87 km/s and the slow shear-wave velocity decreases from 0.82 km/s to 0.81 km/s. The cube, printed with penny-shaped cracks, gives a density of 0.79 g/cm3 and a porosity of 24 %. We measure its P-wave velocity as 1.78 km/s and 1.68 km/s in the direction parallel and perpendicular to the layering, respectively. Its fast shear-wave velocity is 0.88 km/s and slow shear-wave velocity is 0.70 km/s. The penny-shaped cracks have significant influence on the elastic properties of the 3D-printed rock models.
To better understand and explain the fluid effects on the elastic properties of the models, we apply the extended anisotropic Gassmann's equations to predict the effects of saturation changes. We find that the predictions match observations from the experimental data within 1 % difference.
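The porosity values quoted in this abstract can be reproduced from the stated densities with two standard relations. The sketch below is an interpretation, not the authors' documented procedure: it assumes a water density of 1.00 g/cm3, takes effective porosity as the mass gained on saturation per unit volume, and takes total porosity from the bulk density relative to the raw ABS density.

```python
RHO_WATER = 1.00   # g/cm3, assumed
RHO_ABS = 1.04     # g/cm3, density of the printing material (from the abstract)

def effective_porosity(rho_dry, rho_saturated, rho_fluid=RHO_WATER):
    """Connected pore fraction inferred from the density gain on saturation."""
    return (rho_saturated - rho_dry) / rho_fluid

def total_porosity(rho_bulk, rho_solid=RHO_ABS):
    """Pore fraction inferred from bulk density relative to the solid density."""
    return 1.0 - rho_bulk / rho_solid

prism_effective = effective_porosity(rho_dry=0.96, rho_saturated=1.00)
cube_total = total_porosity(rho_bulk=0.79)

print(f"prism effective porosity: {prism_effective:.0%}")
print(f"cube total porosity:      {cube_total:.0%}")
```

Both results round to the 4 % and 24 % reported above, which suggests the two figures are different kinds of porosity, although the abstract does not say so explicitly.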
NASA Astrophysics Data System (ADS)
Ravenna, Matteo; Lebedev, Sergei
2018-04-01
Seismic anisotropy provides important information on the deformation history of the Earth's interior. Rayleigh and Love surface-waves are sensitive to and can be used to determine both radial and azimuthal shear-wave anisotropies at depth, but parameter trade-offs give rise to substantial model non-uniqueness. Here, we explore the trade-offs between isotropic and anisotropic structure parameters and present a suite of methods for the inversion of surface-wave, phase-velocity curves for radial and azimuthal anisotropies. One Markov chain Monte Carlo (McMC) implementation inverts Rayleigh and Love dispersion curves for a radially anisotropic shear velocity profile of the crust and upper mantle. Another McMC implementation inverts Rayleigh phase velocities and their azimuthal anisotropy for profiles of vertically polarized shear velocity and its depth-dependent azimuthal anisotropy. The azimuthal anisotropy inversion is fully non-linear, with the forward problem solved numerically at different azimuths for every model realization, which ensures that any linearization biases are avoided. The computations are performed in parallel, in order to reduce the computing time. The often challenging issue of data noise estimation is addressed by means of a Hierarchical Bayesian approach, with the variance of the noise treated as an unknown during the radial anisotropy inversion. In addition to the McMC inversions, we also present faster, non-linear gradient-search inversions for the same anisotropic structure. The results of the two approaches are mutually consistent; the advantage of the McMC inversions is that they provide a measure of uncertainty of the models. Applying the method to broad-band data from the Baikal-central Mongolia region, we determine radial anisotropy from the crust down to the transition-zone depths. 
Robust negative anisotropy (Vsh < Vsv) in the asthenosphere, at 100-300 km depths, presents strong new evidence for a vertical component of asthenospheric flow. This is consistent with an upward flow from below the thick lithosphere of the Siberian Craton to below the thinner lithosphere of central Mongolia, likely to give rise to decompression melting and the scattered, sporadic volcanism observed in the Baikal Rift area, as proposed previously. Inversion of phase-velocity data from west-central Italy for azimuthal anisotropy reveals a clear change in the shear-wave fast-propagation direction at 70-100 km depths, near the lithosphere-asthenosphere boundary. The orientation of the fabric in the lithosphere is roughly E-W, parallel to the direction of stretching over the last 10 m.y. The orientation of the fabric in the asthenosphere is NW-SE, matching the fast directions inferred from shear-wave splitting and probably indicating the direction of the asthenospheric flow.
Catastrophic onset of fast magnetic reconnection with a guide field
NASA Astrophysics Data System (ADS)
Cassak, P. A.; Drake, J. F.; Shay, M. A.
2007-05-01
It was recently shown that the slow (collisional) Sweet-Parker and the fast (collisionless) Hall magnetic reconnection solutions simultaneously exist for a wide range of resistivities; reconnection is bistable [Cassak, Shay, and Drake, Phys. Rev. Lett., 95, 235002 (2005)]. When the thickness of the dissipation region becomes smaller than a critical value, the Sweet-Parker solution disappears and fast reconnection ensues, potentially explaining how large amounts of magnetic free energy can accrue without significant release before the onset of fast reconnection. Two-fluid numerical simulations extending the previous results for anti-parallel reconnection (where the critical thickness is the ion skin depth) to component reconnection with a large guide field (where the critical thickness is the thermal ion Larmor radius) are presented. Applications to laboratory experiments of magnetic reconnection and the sawtooth crash are discussed.
Capacitively coupled pickup in MCP-based photodetectors using a conductive metallic anode
NASA Astrophysics Data System (ADS)
Angelico, E.; Seiss, T.; Adams, B.; Elagin, A.; Frisch, H.; Spieglan, E.
2017-02-01
We have designed and tested a robust 20×20 cm2 thin metal film internal anode capacitively coupled to an external array of signal pads or micro-strips for use in fast microchannel plate photodetectors. The internal anode, in this case a 10 nm-thick NiCr film deposited on a 96% pure Al2O3 3 mm-thick ceramic plate and connected to HV ground, provides the return path for the electron cascade charge. The multi-channel pickup array consists of a printed-circuit card or glass plate with metal signal pickups on one side and the signal ground plane on the other. The pickup can be put in close proximity to the bottom outer surface of the sealed photodetector, with no electrical connections through the photodetector hermetic vacuum package other than a single ground connection to the internal anode. Two pickup patterns were tested using a small commercial MCP-PMT as the signal source: 1) parallel 50 Ω 25-cm-long micro-strips with an analog bandwidth of 1.5 GHz, and 2) a 20×20 cm2 array of 2-dimensional square 'pads' with sides of 1.27 cm or 2.54 cm. The rise-time of the fast input pulse is maintained for both pickup patterns. For the pad pattern, we observe 80% of the directly coupled amplitude. For the strip pattern we measure 34% of the directly coupled amplitude on the central strip of a broadened signal. The physical decoupling of the photodetector from the pickup pattern allows easy customization for different applications while maintaining high analog bandwidth.
Excitation of the hydrogen atom by fast-electron impact in the presence of a laser field
NASA Astrophysics Data System (ADS)
Bhattacharya, Manabesh; Sinha, C.; Sil, N. C.
1991-08-01
An approach has been developed to study the excitation of a ground-state H atom to the n=2 level under the simultaneous action of fast-electron impact and a monochromatic, linearly polarized, homogeneous laser beam. The laser frequency is assumed to be low (soft-photon limit) so that a stationary-state perturbation theory can be applied as is done in the adiabatic theory. An elegant method has been developed in the present work to construct the dressed excited-state wave functions of the H atom using first-order perturbation theory in the parabolic coordinate representation. By virtue of this method, the problem arising due to the degeneracy of the excited states of the H atom has been successfully overcome. The main advantage of the present approach is that the dressed wave function has been obtained in terms of a finite number of Laguerre polynomials instead of an infinite summation occurring in the usual perturbative treatment. The amplitude for direct excitation (without exchange) has been obtained in closed form. Numerical results for differential cross sections are presented for individual excitations to different Stark manifolds as well as for excitations to the n=2 level at high energies (100 and 200 eV) and for field directions both parallel and perpendicular to the incident electron momentum. Extension to a higher order of perturbation is also possible in the present approach for the construction of the dressed states, and the electron-exchange effect can also be taken into account without any further approximation.
Microchannel gel electrophoretic separation systems and methods for preparing and using
Herr, Amy E; Singh, Anup K; Throckmorton, Daniel J
2015-02-24
A micro-analytical platform for performing electrophoresis-based immunoassays was developed by integrating photopolymerized cross-linked polyacrylamide gels within a microfluidic device. The microfluidic immunoassays are performed by gel electrophoretic separation and quantifying analyte concentration based upon conventional polyacrylamide gel electrophoresis (PAGE). To retain biological activity of proteins and maintain intact immune complexes, native PAGE conditions were employed. Both direct (non-competitive) and competitive immunoassay formats are demonstrated in microchips for detecting toxins and biomarkers (cytokines, c-reactive protein) in bodily fluids (serum, saliva, oral fluids). Further, a description of gradient gels fabrication is included, in an effort to describe methods we have developed for further optimization of on-chip PAGE immunoassays. The described chip-based PAGE immunoassay method enables immunoassays that are fast (minutes) and require very small amounts of sample (less than a few microliters). Use of microfabricated chips as a platform enables integration, parallel assays, automation and development of portable devices.
Harper, J C; Aittomäki, K; Borry, P; Cornel, M C; de Wert, G; Dondorp, W; Geraedts, J; Gianaroli, L; Ketterson, K; Liebaers, I; Lundin, K; Mertes, H; Morris, M; Pennings, G; Sermon, K; Spits, C; Soini, S; van Montfoort, A P A; Veiga, A; Vermeesch, J R; Viville, S; Macek, M
2018-01-01
Two leading European professional societies, the European Society of Human Genetics and the European Society for Human Reproduction and Embryology, have worked together since 2004 to evaluate the impact of fast research advances at the interface of assisted reproduction and genetics, including their application into clinical practice. In September 2016, the expert panel met for the third time. The topics discussed highlighted important issues covering the impacts of expanded carrier screening, direct-to-consumer genetic testing, voiding of the presumed anonymity of gamete donors by advanced genetic testing, advances in the research of genetic causes underlying male and female infertility, utilisation of massively parallel sequencing in preimplantation genetic testing and non-invasive prenatal screening, mitochondrial replacement in human oocytes, and additionally, issues related to cross-generational epigenetic inheritance following IVF and germline genome editing. The resulting paper represents a consensus of both professional societies involved.
The Level 0 Pixel Trigger system for the ALICE experiment
NASA Astrophysics Data System (ADS)
Aglieri Rinella, G.; Kluge, A.; Krivda, M.; ALICE Silicon Pixel Detector project
2007-01-01
The ALICE Silicon Pixel Detector contains 1200 readout chips. Fast-OR signals indicate the presence of at least one hit in the 8192 pixel matrix of each chip. The 1200 bits are transmitted every 100 ns on 120 data readout optical links using the G-Link protocol. The Pixel Trigger System extracts and processes them to deliver an input signal to the Level 0 trigger processor targeting a latency of 800 ns. The system is compact, modular and based on FPGA devices. The architecture allows the user to define and implement various trigger algorithms. The system uses advanced 12-channel parallel optical fiber modules operating at 1310 nm as optical receivers and 12 deserializer chips closely packed in small area receiver boards. Alternative solutions with multi-channel G-Link deserializers implemented directly in programmable hardware devices were investigated. The design of the system and the progress of the ALICE Pixel Trigger project are described in this paper.
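The link figures in this abstract imply a simple data budget, worked through in the Python sketch below. The multiplicity trigger at the end is a hypothetical example of the kind of algorithm such a system could host; the abstract does not describe any specific algorithm, and the rate shown counts Fast-OR bits only, not G-Link protocol overhead.

```python
CHIPS = 1200        # readout chips, one Fast-OR bit each
LINKS = 120         # optical readout links
PERIOD_NS = 100     # Fast-OR bits are transmitted every 100 ns

bits_per_link = CHIPS // LINKS             # Fast-OR bits carried per link per period
aggregate_gbit_s = CHIPS / PERIOD_NS       # bits per ns equals Gbit/s

def multiplicity_trigger(fast_or_bits, threshold):
    """Hypothetical algorithm: fire when at least `threshold` chips report a hit."""
    if len(fast_or_bits) != CHIPS:
        raise ValueError("expected one Fast-OR bit per chip")
    return sum(fast_or_bits) >= threshold

# Example frame with hits on three chips.
frame = [0] * CHIPS
for chip in (3, 250, 999):
    frame[chip] = 1

fired = multiplicity_trigger(frame, threshold=2)
```

Each link therefore carries the Fast-OR bits of 10 chips, and the trigger logic sees 12 Gbit/s of Fast-OR payload in total.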
Microchannel gel electrophoretic separation systems and methods for preparing and using
Herr, Amy; Singh, Anup K; Throckmorton, Daniel J
2013-09-03
A micro-analytical platform for performing electrophoresis-based immunoassays was developed by integrating photopolymerized cross-linked polyacrylamide gels within a microfluidic device. The microfluidic immunoassays are performed by gel electrophoretic separation and quantifying analyte concentration based upon conventional polyacrylamide gel electrophoresis (PAGE). To retain biological activity of proteins and maintain intact immune complexes, native PAGE conditions were employed. Both direct (non-competitive) and competitive immunoassay formats are demonstrated in microchips for detecting toxins and biomarkers (cytokines, c-reactive protein) in bodily fluids (serum, saliva, oral fluids). Further, a description of gradient gels fabrication is included, in an effort to describe methods we have developed for further optimization of on-chip PAGE immunoassays. The described chip-based PAGE immunoassay method enables immunoassays that are fast (minutes) and require very small amounts of sample (less than a few microliters). Use of microfabricated chips as a platform enables integration, parallel assays, automation and development of portable devices.
Caloric restriction improves health and survival of rhesus monkeys.
Mattison, Julie A; Colman, Ricki J; Beasley, T Mark; Allison, David B; Kemnitz, Joseph W; Roth, George S; Ingram, Donald K; Weindruch, Richard; de Cabo, Rafael; Anderson, Rozalyn M
2017-01-17
Caloric restriction (CR) without malnutrition extends lifespan and delays the onset of age-related disorders in most species but its impact in nonhuman primates has been controversial. In the late 1980s two parallel studies were initiated to determine the effect of CR in rhesus monkeys. The University of Wisconsin study reported a significant positive impact of CR on survival, but the National Institute on Aging study detected no significant survival effect. Here we present a direct comparison of longitudinal data from both studies including survival, bodyweight, food intake, fasting glucose levels and age-related morbidity. We describe differences in study design that could contribute to differences in outcomes, and we report species specificity in the impact of CR in terms of optimal onset and diet. Taken together these data confirm that health benefits of CR are conserved in monkeys and suggest that CR mechanisms are likely translatable to human health.
Flight- and ground-test correlation study of BMDO SDS materials: Phase 1 report
NASA Technical Reports Server (NTRS)
Chung, Shirley Y.; Brinza, David E.; Minton, Timothy K.; Stiegman, Albert E.; Kenny, James T.; Liang, Ranty H.
1993-01-01
The NASA Evaluation of Oxygen Interactions with Materials-3 (EOIM-3) experiment served as a test bed for a variety of materials that are candidates for Ballistic Missile Defense Organization (BMDO) space assets. The materials evaluated on this flight experiment were provided by BMDO contractors and technology laboratories. A parallel ground exposure evaluation was conducted using the FAST atomic-oxygen simulation facility at Physical Sciences, Inc. The EOIM-3 materials were exposed to an atomic oxygen fluence of approximately 2.3 x 10(exp 2) atoms/sq. cm. The ground-exposed materials' fluence of 2.0 - 2.5 x 10(exp 2) atoms/sq. cm permits direct comparison of ground-exposed materials' performance with that of the flight-exposed specimens. The results from the flight test conducted aboard STS-46 and the correlative ground exposure are presented in this publication.
Caloric restriction improves health and survival of rhesus monkeys
Mattison, Julie A.; Colman, Ricki J.; Beasley, T. Mark; Allison, David B.; Kemnitz, Joseph W.; Roth, George S.; Ingram, Donald K.; Weindruch, Richard; de Cabo, Rafael; Anderson, Rozalyn M.
2017-01-01
Caloric restriction (CR) without malnutrition extends lifespan and delays the onset of age-related disorders in most species but its impact in nonhuman primates has been controversial. In the late 1980s two parallel studies were initiated to determine the effect of CR in rhesus monkeys. The University of Wisconsin study reported a significant positive impact of CR on survival, but the National Institute on Aging study detected no significant survival effect. Here we present a direct comparison of longitudinal data from both studies including survival, bodyweight, food intake, fasting glucose levels and age-related morbidity. We describe differences in study design that could contribute to differences in outcomes, and we report species specificity in the impact of CR in terms of optimal onset and diet. Taken together these data confirm that health benefits of CR are conserved in monkeys and suggest that CR mechanisms are likely translatable to human health. PMID:28094793
Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units
USDA-ARS's Scientific Manuscript database
This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
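For readers unfamiliar with Parallel Cyclic Reduction, the NumPy sketch below shows the idea on a single tridiagonal system, such as one line of an ADI sweep. It is a serial illustration under assumed names (`pcr_solve`, `shift`), not the CCHE2D CUDA Fortran solver: every row is updated independently at each step, which is what makes the method map well to GPU threads.

```python
import numpy as np

def shift(v, s, fill):
    """Return w with w[i] = v[i - s] where that index exists, else `fill`."""
    out = np.full_like(v, fill)
    if s > 0:
        out[s:] = v[:-s]
    elif s < 0:
        out[:s] = v[-s:]
    else:
        out[:] = v
    return out

def pcr_solve(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by parallel cyclic reduction."""
    a, b, c, d = (np.array(v, dtype=float) for v in (a, b, c, d))
    n = b.size
    a[0] = 0.0       # no neighbour below the first row
    c[-1] = 0.0      # no neighbour above the last row
    s = 1
    while s < n:
        # Neighbour rows at distance s; out-of-range rows act as identity rows.
        a_lo, b_lo, c_lo, d_lo = shift(a, s, 0.0), shift(b, s, 1.0), shift(c, s, 0.0), shift(d, s, 0.0)
        a_hi, b_hi, c_hi, d_hi = shift(a, -s, 0.0), shift(b, -s, 1.0), shift(c, -s, 0.0), shift(d, -s, 0.0)
        alpha = -a / b_lo
        gamma = -c / b_hi
        b_new = b + alpha * c_lo + gamma * a_hi
        d_new = d + alpha * d_lo + gamma * d_hi
        a = alpha * a_lo      # row i now couples to row i - 2s
        c = gamma * c_hi      # row i now couples to row i + 2s
        b, d = b_new, d_new
        s *= 2
    return d / b              # all couplings eliminated

# Small diagonally dominant example, checked against a dense solve.
n = 7
a = -np.ones(n)
b = 4.0 * np.ones(n)
c = -np.ones(n)
d = np.arange(1.0, n + 1.0)
x_pcr = pcr_solve(a, b, c, d)

dense = np.diag(b) + np.diag(a[1:], k=-1) + np.diag(c[:-1], k=1)
x_ref = np.linalg.solve(dense, d)
```

The loop runs about log2(n) times, compared with the n sequential steps of the Thomas algorithm, at the cost of more total arithmetic.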
Teleseismic SKS splitting beneath East Antarctica using broad-band stations around Soya Coast
NASA Astrophysics Data System (ADS)
Usui, Y.; Kanao, M.
2006-12-01
We observed shear wave splitting of SKS waves from digital seismographs recorded at 5 stations around Soya Coast in the Lutzow-Holm Bay, East Antarctica. Their recording systems are composed of a three-component broadband seismometer (CMG-40T), a digital recording unit and a solar power battery supply. The events used were selected from 1999 to 2004 and phase arrival times were calculated using the IASPEI91 earth model (Kennet, 1995). In general, we chose the data from earthquakes with m>6.0 and a distance range 85° < Δ < 130° for the most prominent SKS waves. We used the methods of Silver and Chan (1991) for the inversion of anisotropy parameters and estimated the splitting parameters φ (fast polarization direction) and δt (delay time between split waves) assuming a single layer of hexagonal symmetry with a horizontal symmetry axis. The weighted averages of all splitting parameters (φ, δt) for each station are AKR (30±4, 1.30±0.2), LNG (58±6, 1.27±0.2), SKL (67±10, 0.94±0.2), SKV (40±6, 1.28±0.3) and TOT (52±8, 1.26±0.3), where the weights are inversely proportional to the standard deviations for each solution. Compared with typical SKS delay times of about 1.2 s (Silver and Chan 1991; Vinnik et al., 1992), our results show generally the same value. In a previous study, Kubo and Hiramatsu (1998) estimated the splitting parameters for Syowa station (SYO), which is located near the stations used here in East Antarctica, and obtained (49±3, 0.70±0.1). Although this is consistent with our results for the fast polarization direction, our δt values are larger than that of SYO. The difference may be due to either a different incident angle or a more complex anisotropic structure. We found that the fast polarization direction is systematically parallel to the coastline in the Lutzow-Holm Bay, East Antarctica, which is consistent with NE-SW paleo compressional stress.
The absolute plate motion based on HS2-NUVEL1 (Gripp and Gordon, 1990), which may reflect the present horizontal mantle flow, shows a direction of N120°E and a velocity of 1 cm/yr in this study region. Since it does not coincide with the fast polarization direction (the difference is about 50°~90°), we conclude that the mechanism of observed anisotropy is lattice preferred orientation of olivine along the mantle flow which caused NE-SW paleo compressional stress. In future work, we will carry out an analysis that assumes more complex anisotropic systems, such as a two-layer model of azimuthal anisotropy, because we found possible azimuthal variations of the splitting parameters at a few stations.
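The quoted 50° to 90° mismatch between fast directions and plate motion can be checked directly from the station averages. The Python sketch below treats both quantities as axial orientations, so that differences fold into the range 0° to 90°; the helper name is illustrative.

```python
def axial_difference(a_deg, b_deg):
    """Smallest angle between two orientations (axes), in degrees, within 0..90."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)

# Station-averaged fast polarization directions from the abstract (degrees).
fast_directions = {"AKR": 30, "LNG": 58, "SKL": 67, "SKV": 40, "TOT": 52}
PLATE_MOTION_AZIMUTH = 120.0   # N120°E, from HS2-NUVEL1 as quoted above

differences = {
    station: axial_difference(phi, PLATE_MOTION_AZIMUTH)
    for station, phi in fast_directions.items()
}

for station, diff in differences.items():
    print(f"{station}: {diff:.0f} degrees from absolute plate motion")
```

The five differences fall between 53° and 90°, in line with the range stated in the abstract.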
GPU-based ultra-fast dose calculation using a finite size pencil beam model.
Gu, Xuejun; Choi, Dongju; Men, Chunhua; Pan, Hubert; Majumdar, Amitava; Jiang, Steve B
2009-10-21
Online adaptive radiation therapy (ART) is an attractive concept that promises the ability to deliver an optimal treatment in response to the inter-fraction variability in patient anatomy. However, it has yet to be realized due to technical limitations. Fast dose deposition coefficient calculation is a critical component of the online planning process that is required for plan optimization of intensity-modulated radiation therapy (IMRT). Computer graphics processing units (GPUs) are well suited to provide the requisite fast performance for the data-parallel nature of dose calculation. In this work, we develop a dose calculation engine based on a finite-size pencil beam (FSPB) algorithm and a GPU parallel computing framework. The developed framework can accommodate any FSPB model. We test our implementation in the case of a water phantom and the case of a prostate cancer patient with varying beamlet and voxel sizes. All testing scenarios achieved speedup ranging from 200 to 400 times when using an NVIDIA Tesla C1060 card in comparison with a 2.27 GHz Intel Xeon CPU. The computational time for calculating dose deposition coefficients for a nine-field prostate IMRT plan with this new framework is less than 1 s. This indicates that the GPU-based FSPB algorithm is well suited for online re-planning for adaptive radiotherapy.
A practical implementation of multi-frequency widefield frequency-domain FLIM
Chen, Hongtao
2013-01-01
Widefield frequency-domain fluorescence lifetime imaging microscopy (FD-FLIM) is a fast and accurate method to measure the fluorescence lifetime, especially in kinetic studies in biomedical research. However, the small range of modulation frequencies available in commercial instruments limits the applications of this technique. Here we describe a practical implementation of multi-frequency widefield FD-FLIM using a pulsed supercontinuum laser and a direct digital synthesizer. In this instrument we use a pulse to modulate the image intensifier rather than the more conventional sine wave modulation. This allows parallel multi-frequency FLIM measurement using the Fast Fourier Transform and the cross-correlation technique, which permits precise and simultaneous isolation of individual frequencies. In addition, the pulse modulation at the cathode of the image intensifier restored the optical resolution that is lost to the defocusing effect when the voltage at the cathode is sinusoidally modulated. Furthermore, in our implementation of this technique, data can be graphically analyzed by the phasor method while data are acquired, which allows easy fit-free lifetime analysis of FLIM images. Here our measurements of standard fluorescent samples and a Förster resonance energy transfer pair demonstrate that the widefield multi-frequency FLIM system is a valuable and simple tool in fluorescence imaging studies. PMID:23296945
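The phasor method mentioned above maps each lifetime measurement to a point (g, s) at a given modulation frequency. The Python sketch below uses the standard textbook relations for a single-exponential decay; the 4 ns lifetime and the three frequencies are made-up illustration values, not data from the paper.

```python
import math

def phasor(tau_s, freq_hz):
    """Phasor coordinates (g, s) of a single-exponential decay with lifetime tau."""
    wt = 2.0 * math.pi * freq_hz * tau_s
    g = 1.0 / (1.0 + wt * wt)
    s = wt / (1.0 + wt * wt)
    return g, s

def lifetime_from_phasor(g, s, freq_hz):
    """Phase lifetime recovered from a phasor point: tau = s / (omega * g)."""
    return s / (2.0 * math.pi * freq_hz * g)

tau = 4e-9                               # hypothetical 4 ns lifetime
frequencies = [20e6, 40e6, 80e6]         # hypothetical modulation frequencies

points = {f: phasor(tau, f) for f in frequencies}
recovered = {f: lifetime_from_phasor(g, s, f) for f, (g, s) in points.items()}
```

For a single-exponential decay every point lies on the semicircle (g - 0.5)^2 + s^2 = 0.25, and the same lifetime is recovered at each frequency; points inside the semicircle indicate a mixture of lifetimes.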
Exploration of High Harmonic Fast Wave Heating on the National Spherical Torus Experiment
DOE Office of Scientific and Technical Information (OSTI.GOV)
J.R. Wilson; R.E. Bell; S. Bernabei
2003-02-11
High Harmonic Fast Wave (HHFW) heating has been proposed as a particularly attractive means for plasma heating and current drive in the high-beta plasmas that are achievable in spherical torus (ST) devices. The National Spherical Torus Experiment (NSTX) [Ono, M., Kaye, S.M., Neumeyer, S., et al., Proceedings, 18th IEEE/NPSS Symposium on Fusion Engineering, Albuquerque, 1999, (IEEE, Piscataway, NJ (1999), p. 53.)] is such a device. A radio-frequency (rf) heating system has been installed on NSTX to explore the physics of HHFW heating and current drive via rf waves, and for use as a tool to demonstrate the attractiveness of the ST concept as a fusion device. To date, experiments have demonstrated many of the theoretical predictions for HHFW. In particular, strong wave absorption on electrons over a wide range of plasma parameters and wave parallel phase velocities, wave acceleration of energetic ions, and indications of current drive for directed wave spectra have been observed. In addition, HHFW heating has been used to explore the energy transport properties of NSTX plasmas, to create H-mode (high-confinement mode) discharges with a large fraction of bootstrap current and to control the plasma current profile during the early stages of the discharge.
Sub-second pencil beam dose calculation on GPU for adaptive proton therapy.
da Silva, Joakim; Ansorge, Richard; Jena, Rajesh
2015-06-21
Although proton therapy delivered using scanned pencil beams has the potential to produce better dose conformity than conventional radiotherapy, the created dose distributions are more sensitive to anatomical changes and patient motion. Therefore, the introduction of adaptive treatment techniques where the dose can be monitored as it is being delivered is highly desirable. We present a GPU-based dose calculation engine relying on the widely used pencil beam algorithm, developed for on-line dose calculation. The calculation engine was implemented from scratch, with each step of the algorithm parallelized and adapted to run efficiently on the GPU architecture. To ensure fast calculation, it employs several application-specific modifications and simplifications, and a fast scatter-based implementation of the computationally expensive kernel superposition step. The calculation time for a skull base treatment plan using two beam directions was 0.22 s on an Nvidia Tesla K40 GPU, whereas a test case of a cubic target in water from the literature took 0.14 s to calculate. The accuracy of the patient dose distributions was assessed by calculating the γ-index with respect to a gold standard Monte Carlo simulation. The passing rates were 99.2% and 96.7%, respectively, for the 3%/3 mm and 2%/2 mm criteria, matching those produced by a clinical treatment planning system.
Ramdane, Said; Daoudi-Gueddah, Doria
2011-08-01
We retrospectively examined the concurrent relationships between fasting plasma total cholesterol, triglyceride, and glucose levels, and Alzheimer's disease (AD), in a clinical setting-based study. Total cholesterol level was higher in patients with AD compared to elderly controls; triglyceride and glucose levels did not significantly differ between the 2 groups. Respective plotted trajectories of change in cholesterol level across age were fairly parallel. No significant difference in total cholesterol levels was recorded between patients with AD classified by the Clinical Dementia Rating (CDR) score subgroups. These results suggest that patients with AD have relatively mild total hypercholesterolemia, normal triglyceridemia, and normal fasting plasma glucose level. Mild total hypercholesterolemia seems to be permanent across age and across dementia severity staging, and fairly parallels the trajectory of age-related change in total cholesterolemia of healthy controls. We speculate that this pattern of biochemical parameters may be present long before (at least a decade before) the symptomatic onset of the disease.
Spectral Anisotropy of Magnetic Field Fluctuations around Ion Scales in the Fast Solar Wind
NASA Astrophysics Data System (ADS)
Wang, X.; Tu, C.; He, J.; Marsch, E.; Wang, L.
2016-12-01
The power spectra of magnetic field at ion scales are significantly influenced by waves and structures. In this work, we study the ΘRB angle dependence of the contribution of waves on the spectral index of the magnetic field. Wavelet technique is applied to the high time-resolution magnetic field data from WIND spacecraft measurements in the fast solar wind. It is found that around ion scales, the parallel spectrum has a slope of -4.6±0.1 originally. When we remove the waves, which correspond to the data points with relatively larger value of magnetic helicity, the parallel spectrum gets shallower gradually to -3.2±0.2. However, the perpendicular spectrum does not change significantly during the wave-removal process, and its slope remains -3.1±0.1. It means that when the waves are removed from the original data, the spectral anisotropy gets weaker. This result may help us understand the physical nature of the spectral anisotropy around the ion scales.
Seismic anisotropy and large-scale deformation of the Eastern Alps
NASA Astrophysics Data System (ADS)
Bokelmann, Götz; Qorbani, Ehsan; Bianchi, Irene
2013-12-01
Mountain chains at the Earth's surface result from deformation processes within the Earth. Such deformation processes can be observed by seismic anisotropy, via the preferred alignment of elastically anisotropic minerals. The Alps show complex deformation at the Earth's surface. In contrast, we show here that observations of seismic anisotropy suggest a relatively simple pattern of internal deformation. Together with earlier observations from the Western Alps, the SKS shear-wave splitting observations presented here show one of the clearest examples yet of mountain chain-parallel fast orientations worldwide, with a simple pattern nearly parallel to the trend of the mountain chain. In the Eastern Alps, the fast orientations do not connect with neighboring mountain chains, neither the present-day Carpathians, nor the present-day Dinarides. In that region, the lithosphere is thin and the observed anisotropy thus resides within the asthenosphere. The deformation is consistent with the eastward extrusion toward the Pannonian basin that was previously suggested based on seismicity and surface geology.
NASA Technical Reports Server (NTRS)
Miller, R. H.; Gombosi, T. I.; Gary, S. P.; Winske, D.
1991-01-01
The direction of propagation of low frequency magnetic fluctuations generated by cometary ion pick-up is examined by means of 1D electromagnetic hybrid simulations. The newborn ions are injected at a constant rate, and the helicity and direction of propagation of magnetic fluctuations are explored for cometary ion injection angles of 0 and 90 deg relative to the solar wind magnetic field. The parameter eta represents the relative contribution of wave energy propagating in the direction away from the comet, parallel to the beam. For small (quasi-parallel) injection angles eta was found to be of order unity, while for larger (quasi-perpendicular) angles eta was found to be of order 0.5.
A fast algorithm for computer aided collimation gamma camera (CACAO)
NASA Astrophysics Data System (ADS)
Jeanguillaume, C.; Begot, S.; Quartuccio, M.; Douiri, A.; Franck, D.; Pihet, P.; Ballongue, P.
2000-08-01
The computer aided collimation gamma camera is aimed at breaking down the resolution-sensitivity trade-off of the conventional parallel hole collimator. It uses larger and longer holes, having an added linear movement at the acquisition sequence. A dedicated algorithm including shift and sum, deconvolution, parabolic filtering and rotation is described. Examples of reconstruction are given. This work shows that a simple and fast algorithm, based on a diagonal dominant approximation of the problem, can be derived. It gives a practical solution to the CACAO reconstruction problem.
Large-Constraint-Length, Fast Viterbi Decoder
NASA Technical Reports Server (NTRS)
Collins, O.; Dolinar, S.; Hsu, In-Shek; Pollara, F.; Olson, E.; Statman, J.; Zimmerman, G.
1990-01-01
Scheme for efficient interconnection makes VLSI design feasible. Concept for fast Viterbi decoder provides for processing of convolutional codes of constraint length K up to 15 and rates of 1/2 to 1/6. Fully parallel (but bit-serial) architecture developed for decoder of K = 7 implemented in single dedicated VLSI circuit chip. Contains six major functional blocks. VLSI circuits perform branch metric computations, add-compare-select operations, and then store decisions in traceback memory. Traceback processor reads appropriate memory locations and puts out decoded bits. Used as building block for decoders of larger K.
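The three stages named in this brief (branch metric computation, add-compare-select, traceback) can be sketched in software for a much smaller code. The example below is a hard-decision decoder for an assumed K = 3, rate 1/2 code with the common generators (7, 5) in octal; it is an illustration only and does not reproduce the bit-serial VLSI architecture or the K = 7 to 15 codes described above.

```python
K = 3                       # constraint length (assumed small for illustration)
GENS = (0b111, 0b101)       # generator polynomials (7, 5 octal), rate 1/2
N_STATES = 1 << (K - 1)

def parity(x):
    return bin(x).count("1") & 1

def encode(bits):
    """Convolutional encoder starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state          # newest bit in the top position
        out.extend(parity(reg & g) for g in GENS)
        state = reg >> 1
    return out

def decode(received, n_bits):
    """Hard-decision Viterbi decoding with full traceback."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)    # encoder is known to start in state 0
    history = []                              # traceback memory: one decision list per step
    n_out = len(GENS)
    for t in range(n_bits):
        r = received[t * n_out:(t + 1) * n_out]
        new_metrics = [inf] * N_STATES
        decisions = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [parity(reg & g) for g in GENS]
                branch = sum(e != x for e, x in zip(expected, r))  # branch metric (Hamming)
                nxt = reg >> 1
                candidate = metrics[state] + branch                # add
                if candidate < new_metrics[nxt]:                   # compare and select
                    new_metrics[nxt] = candidate
                    decisions[nxt] = (state, b)
        metrics = new_metrics
        history.append(decisions)
    # Traceback from the best final state, reading stored decisions backwards
    state = min(range(N_STATES), key=lambda s: metrics[s])
    bits = []
    for decisions in reversed(history):
        state, b = decisions[state]
        bits.append(b)
    return bits[::-1]

message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]      # last two zeros flush the encoder
coded = encode(message)
coded[3] ^= 1                                  # inject a single channel bit error
decoded = decode(coded, len(message))
```

In hardware the inner loop over states is what runs fully in parallel; here it is serial, which is why software decoders scale poorly as K grows.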
NASA Astrophysics Data System (ADS)
Lynner, Colton; Long, Maureen D.
2015-06-01
Measurements of seismic anisotropy are commonly used to constrain deformation in the upper mantle. Observations of anisotropy at mid-mantle depths are, however, relatively sparse. In this study we probe the anisotropic structure of the mid-mantle (transition zone and uppermost lower mantle) beneath the Japan, Izu-Bonin, and South America subduction systems. We present source-side shear wave splitting measurements for direct teleseismic S phases from earthquakes deeper than 300 km that have been corrected for the effects of upper mantle anisotropy beneath the receiver. In each region, we observe consistent splitting with delay times as large as 1 s, indicating the presence of anisotropy at mid-mantle depths. Clear splitting of phases originating from depths as great as ~600 km argues for a contribution from anisotropy in the uppermost lower mantle as well as the transition zone. Beneath Japan, fast splitting directions are perpendicular or oblique to the slab strike and do not appear to depend on the propagation direction of the waves. Beneath South America and Izu-Bonin, splitting directions vary from trench-parallel to trench-perpendicular and have an azimuthal dependence, indicating lateral heterogeneity. Our results provide evidence for the presence of laterally variable anisotropy and are indicative of variable deformation and dynamics at mid-mantle depths in the vicinity of subducting slabs.
Ultra-fast laser microprocessing of medical polymers for cell engineering applications.
Ortiz, R; Moreno-Flores, S; Quintana, I; Vivanco, MdM; Sarasua, J R; Toca-Herrera, J L
2014-04-01
Picosecond laser micromachining technology (PLM) has been employed as a tool for the fabrication of 3D structured substrates. These substrates have been used as supports in the in vitro study of the effect of substrate topography on cell behavior. Different micropatterns were PLM-generated on polystyrene (PS) and poly-L-lactide (PLLA) and employed to study cellular proliferation and morphology of breast cancer cells. The laser-induced microstructures included parallel lines of comparable width to that of a single cell (which in this case is roughly 20 μm), and the fabrication of square-like compartments of a much larger area than a single cell (250,000 μm²). The results obtained from this in vitro study showed that though the laser treatment altered substrate roughness, it did not noticeably affect the adhesion and proliferation of the breast cancer cells. However, pattern direction directly affected cell proliferation, leading to a guided growth of cell clusters along the pattern direction. When cultured in square-like compartments, cells remained confined inside these for eleven incubation days. According to these results, laser micromachining with ultra-short laser pulses is a suitable method to directly modify the cell microenvironment in order to induce a predefined cellular behavior and to study the effect of the physical microenvironment on cell proliferation. Copyright © 2013 Elsevier B.V. All rights reserved.
Wang, Zihao; Chen, Yu; Zhang, Jingrong; Li, Lun; Wan, Xiaohua; Liu, Zhiyong; Sun, Fei; Zhang, Fa
2018-03-01
Electron tomography (ET) is an important technique for studying the three-dimensional structures of the biological ultrastructure. Recently, ET has reached sub-nanometer resolution for investigating the native and conformational dynamics of macromolecular complexes by combining with the sub-tomogram averaging approach. Due to the limited sampling angles, ET reconstruction typically suffers from the "missing wedge" problem. Using a validation procedure, iterative compressed-sensing optimized nonuniform fast Fourier transform (NUFFT) reconstruction (ICON) demonstrates its power in restoring validated missing information for a low-signal-to-noise ratio biological ET dataset. However, the huge computational demand has become a bottleneck for the application of ICON. In this work, we implemented a parallel acceleration technology ICON-many integrated core (MIC) on Xeon Phi cards to address the huge computational demand of ICON. During this step, we parallelize the element-wise matrix operations and use the efficient summation of a matrix to reduce the cost of matrix computation. We also developed parallel versions of NUFFT on MIC to achieve a high acceleration of ICON by using more efficient fast Fourier transform (FFT) calculation. We then proposed a hybrid task allocation strategy (two-level load balancing) to improve the overall performance of ICON-MIC by making full use of the idle resources on Tianhe-2 supercomputer. Experimental results using two different datasets show that ICON-MIC has high accuracy in biological specimens under different noise levels and a significant acceleration, up to 13.3×, compared with the CPU version. Further, ICON-MIC has good scalability efficiency and overall performance on Tianhe-2 supercomputer.
Kepper, Nick; Ettig, Ramona; Dickmann, Frank; Stehr, Rene; Grosveld, Frank G; Wedemann, Gero; Knoch, Tobias A
2010-01-01
Especially in the life-science and health-care sectors, huge IT requirements are imminent due to the large and complex systems to be analysed and simulated. Grid infrastructures play a rapidly increasing role here for research, diagnostics, and treatment, since they provide the necessary large-scale resources efficiently. Whereas grids were first used for huge number crunching of trivially parallelizable problems, parallel high-performance computing is increasingly required. Here, we show for the prime example of molecular dynamic simulations how the presence of large grid clusters, including very fast network interconnects within grid infrastructures, now allows efficient parallel high-performance grid computing and thus combines the benefits of dedicated super-computing centres and grid infrastructures. The demands for this service class are the highest since the user group has very heterogeneous requirements: i) two to many thousands of CPUs, ii) different memory architectures, iii) huge storage capabilities, and iv) fast communication via network interconnects are all needed in different combinations and must be considered in a highly dedicated manner to reach the highest performance efficiency. Beyond this, advanced and dedicated i) interaction with users, ii) management of jobs, iii) accounting, and iv) billing not only combine classic with parallel high-performance grid usage but, more importantly, are also able to increase the efficiency of IT resource providers. Consequently, the mere "yes-we-can" becomes a huge opportunity for, e.g., the life-science and health-care sectors as well as grid infrastructures, by reaching a higher level of resource efficiency.
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding, model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) type computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) type parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
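The tree construction described here (sort, then divide recursively along x, y and z, with every node treated as a monopole) can be sketched serially. This is a minimal illustration that assumes the particle count is a power of two and replaces the parallel SIMD sorts with Python's built-in `sorted`; the dictionary layout and the name `build_tree` are hypothetical.

```python
def build_tree(particles, axis=0):
    """Build a completely balanced binary tree.

    particles: list of (mass, (x, y, z)); the length is assumed to be a power of two.
    Each node stores its monopole moment: total mass and centre of mass.
    """
    if len(particles) == 1:
        mass, pos = particles[0]
        return {"mass": mass, "com": tuple(pos), "children": None}
    # Sort along the current axis and split the sorted list in half
    ordered = sorted(particles, key=lambda p: p[1][axis])
    half = len(ordered) // 2
    nxt = (axis + 1) % 3                       # cycle x -> y -> z -> x
    left = build_tree(ordered[:half], nxt)
    right = build_tree(ordered[half:], nxt)
    mass = left["mass"] + right["mass"]
    com = tuple(
        (left["mass"] * a + right["mass"] * b) / mass
        for a, b in zip(left["com"], right["com"])
    )
    return {"mass": mass, "com": com, "children": (left, right)}

def depth(node):
    """Number of levels below this node (0 for a leaf)."""
    if node["children"] is None:
        return 0
    return 1 + max(depth(child) for child in node["children"])

# Four unit-mass particles on the corners of a unit square in the z = 0 plane
corners = [(1.0, (0.0, 0.0, 0.0)), (1.0, (1.0, 0.0, 0.0)),
           (1.0, (0.0, 1.0, 0.0)), (1.0, (1.0, 1.0, 0.0))]
root = build_tree(corners)
```

Because every split is exactly in half, the tree has depth log2(N) and each leaf is paired with exactly one sibling, which is what keeps the structure regular enough for a SIMD machine.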
Evaluation and application of a fast module in a PLC based interlock and control system
NASA Astrophysics Data System (ADS)
Zaera-Sanz, M.
2009-08-01
The LHC Beam Interlock system requires a controller performing a simple matrix function to collect the different beam dump requests. To satisfy the expected safety level of the Interlock, the system should be robust and reliable. The PLC is a promising candidate to fulfil both aspects but is too slow to meet the expected response time, which is of the order of microseconds. Siemens has introduced a so-called fast module (FM352-5 Boolean Processor). It provides independent and extremely fast control of a process within a larger control system using an onboard processor, a Field Programmable Gate Array (FPGA), to execute code in parallel, which results in extremely fast scan times. It is interesting to investigate its features and to evaluate it as a possible candidate for the beam interlock system. This paper reports the results of this study. It could also be useful for other applications requiring fast processing using a PLC.
Current drive with combined electron cyclotron wave and high harmonic fast wave in tokamak plasmas
NASA Astrophysics Data System (ADS)
Li, J. C.; Gong, X. Y.; Dong, J. Q.; Wang, J.; Zhang, N.; Zheng, P. W.; Yin, C. Y.
2016-12-01
The current driven by combined electron cyclotron wave (ECW) and high harmonic fast wave is investigated using the GENRAY/CQL3D package. It is shown that no significant synergetic current is found in a range of cases with a combined ECW and fast wave (FW). This result is consistent with a previous study [Harvey et al., in Proceedings of IAEA TCM on Fast Wave Current Drive in Reactor Scale Tokamaks (Synergy and Complimentarily with LHCD and ECRH), Arles, France, IAEA, Vienna, 1991]. However, a positive synergy effect does appear with the FW in the lower hybrid range of frequencies. This positive synergy effect can be explained using a picture of the electron distribution function induced by the ECW and a very high harmonic fast wave (helicon). The dependence of the synergy effect on the radial position of the power deposition, the wave power, the wave frequency, and the parallel refractive index is also analyzed, both numerically and physically.
Liu, Tiemin; Kong, Dong; Shah, Bhavik P.; Ye, Chianping; Koda, Shuichi; Saunders, Arpiar; Ding, Jun B.; Yang, Zongfang; Sabatini, Bernardo L.; Lowell, Bradford B.
2012-01-01
AgRP neuron activity drives feeding and weight gain while that of nearby POMC neurons does the opposite. However, the role of excitatory glutamatergic input in controlling these neurons is unknown. To address this question, we generated mice lacking NMDA receptors (NMDARs) on either AgRP or POMC neurons. Deletion of NMDARs from AgRP neurons markedly reduced weight, body fat and food intake whereas deletion from POMC neurons had no effect. Activation of AgRP neurons by fasting, as assessed by c-Fos, Agrp and Npy mRNA expression, AMPA receptor-mediated EPSCs, depolarization and firing rates, required NMDARs. Furthermore, AgRP but not POMC neurons have dendritic spines and increased glutamatergic input onto AgRP neurons caused by fasting was paralleled by an increase in spines, suggesting fasting induced synaptogenesis and spinogenesis. Thus glutamatergic synaptic transmission and its modulation by NMDARs play key roles in controlling AgRP neurons and determining the cellular and behavioral response to fasting. PMID:22325203
Direct Observation of Parallel Folding Pathways Revealed Using a Symmetric Repeat Protein System
Aksel, Tural; Barrick, Doug
2014-01-01
Although progress has been made to determine the native fold of a polypeptide from its primary structure, the diversity of pathways that connect the unfolded and folded states has not been adequately explored. Theoretical and computational studies predict that proteins fold through parallel pathways on funneled energy landscapes, although experimental detection of pathway diversity has been challenging. Here, we exploit the high translational symmetry and the direct length variation afforded by linear repeat proteins to directly detect folding through parallel pathways. By comparing folding rates of consensus ankyrin repeat proteins (CARPs), we find a clear increase in folding rates with increasing size and repeat number, although the size of the transition states (estimated from denaturant sensitivity) remains unchanged. The increase in folding rate with chain length, as opposed to a decrease expected from typical models for globular proteins, is a clear demonstration of parallel pathways. This conclusion is not dependent on extensive curve-fitting or structural perturbation of protein structure. By globally fitting a simple parallel-Ising pathway model, we have directly measured nucleation and propagation rates in protein folding, and have quantified the fluxes along each path, providing a detailed energy landscape for folding. This finding of parallel pathways differs from results from kinetic studies of repeat-proteins composed of sequence-variable repeats, where modest repeat-to-repeat energy variation coalesces folding into a single, dominant channel. Thus, for globular proteins, which have much higher variation in local structure and topology, parallel pathways are expected to be the exception rather than the rule. PMID:24988356
Scan Directed Load Balancing for Highly-Parallel Mesh-Connected Computers
1991-07-01
AD-A242 045 Scan Directed Load Balancing for Highly-Parallel Mesh-Connected Computers. Edoardo S. Biagioni, Jan F. Prins... Department of Computer Science, University of North Carolina, Chapel Hill, N.C. 27599-3175 USA. biagioni@cs.unc.edu, prins@cs.unc.edu. Abstract: Scan Directed... MasPar Computer Corporation. Bibliography: [1] Edoardo S. Biagioni. Scan Directed Load Balancing. PhD thesis, University of North Carolina, Chapel Hill
Nerve injury affects the capillary supply in rat slow and fast muscles differently.
Cebasek, Vita; Radochová, Barbora; Ribaric, Samo; Kubínová, Lucie; Erzen, Ida
2006-02-01
The goal of this study was to determine the acute effects of permanent denervation on the length density of the capillary network in rat slow soleus (SOL) and fast extensor digitorum longus (EDL) muscles and the effect of short-lasting reinnervation in slow muscle only. Denervation was performed by cutting the sciatic nerve. Both muscles were excised 2 weeks later. Reinnervation was studied 4 weeks after nerve crush in SOL muscle only. Capillaries and muscle fibres were visualised by triple immunofluorescent staining with antibodies against CD31 and laminin and with fluorescein-labelled Griffonia (Bandeira) simplicifolia lectin. A recently developed stereological approach allowing the estimation of the length of capillaries adjacent to each individual fibre (Lcap/Lfib) was employed. Three-dimensional virtual test grids were applied to stacks of optical images captured with a confocal microscope and their intersections with capillaries and muscle fibres were counted. Interrelationships among capillaries and muscle fibres were demonstrated with maximum intensity projection of the acquired stacks of optical images. The course of capillaries in EDL seemed to be parallel to the fibre axes, whereas in SOL, their preferential direction deviated from the fibre axes and formed more cross-connections among neighbouring capillaries. Lcap/Lfib was clearly reduced in denervated SOL but remained unchanged in EDL, although the muscle fibres significantly atrophied in both muscle types. When soleus muscle was reinnervated, capillary length per unit fibre length was completely restored. The physiological background for the different responses of the capillary network in slow and fast muscle is discussed.
NASA Astrophysics Data System (ADS)
Zhao, G.; Liu, J.; Chen, B.; Guo, R.; Chen, L.
2017-12-01
Forward modeling of gravitational fields at large scale requires considering the curvature of the Earth and evaluating Newton's volume integral in spherical coordinates. To acquire fast and accurate gravitational effects for subsurface structures, the subsurface mass distribution is usually discretized into small spherical prisms (called tesseroids). The gravity fields of tesseroids are generally calculated numerically. One of the commonly used numerical methods is the 3D Gauss-Legendre quadrature (GLQ). However, the traditional GLQ integration suffers from low computational efficiency and relatively poor accuracy when the observation surface is close to the source region. We developed a fast and highly accurate 3D GLQ integration based on the equivalence of kernel matrices, adaptive discretization and parallelization using OpenMP. The kernel matrix equivalence strategy increases efficiency and reduces memory consumption by calculating and storing identical matrix elements in each kernel matrix only once. In this method, the adaptive discretization strategy is used to improve the accuracy. The numerical investigations show that the execution time of the proposed method is reduced by two orders of magnitude compared with the traditional method without these optimization strategies. High accuracy can also be guaranteed no matter how close the computation points are to the source region. In addition, the algorithm reduces the memory requirement by a factor of N compared with the traditional method, where N is the number of discretization cells of the source region in the longitudinal direction. This makes large-scale gravity forward modeling and inversion with a fine discretization possible.
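The traditional 3D GLQ step that this abstract starts from can be sketched as follows. This is a minimal illustration using a fixed 2-point rule per dimension and the standard spherical-coordinate potential kernel r'² cos φ' / ℓ; it does not include the kernel-matrix equivalence, adaptive discretization or OpenMP parallelization of the proposed method, and all function names are hypothetical.

```python
import math

G_NEWTON = 6.674e-11                      # gravitational constant, m^3 kg^-1 s^-2
# 2-point Gauss-Legendre rule on [-1, 1]
NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
WEIGHTS = (1.0, 1.0)

def glq_3d(f, r_lim, phi_lim, lam_lim):
    """Integrate f(r, phi, lam) over a tesseroid with a 2x2x2 GLQ rule."""
    limits = (r_lim, phi_lim, lam_lim)
    mid = [0.5 * (hi + lo) for lo, hi in limits]
    half = [0.5 * (hi - lo) for lo, hi in limits]
    total = 0.0
    for xi, wi in zip(NODES, WEIGHTS):
        r = mid[0] + half[0] * xi
        for xj, wj in zip(NODES, WEIGHTS):
            phi = mid[1] + half[1] * xj
            for xk, wk in zip(NODES, WEIGHTS):
                lam = mid[2] + half[2] * xk
                total += wi * wj * wk * f(r, phi, lam)
    return total * half[0] * half[1] * half[2]

def tesseroid_mass(rho, r_lim, phi_lim, lam_lim):
    """Mass of a homogeneous tesseroid (volume element r^2 cos(phi))."""
    return glq_3d(lambda r, phi, lam: rho * r * r * math.cos(phi),
                  r_lim, phi_lim, lam_lim)

def tesseroid_potential(obs, rho, r_lim, phi_lim, lam_lim):
    """Gravitational potential at obs = (r, phi, lam); angles in radians."""
    r0, phi0, lam0 = obs

    def kernel(r, phi, lam):
        cos_psi = (math.sin(phi0) * math.sin(phi)
                   + math.cos(phi0) * math.cos(phi) * math.cos(lam - lam0))
        dist = math.sqrt(r0 * r0 + r * r - 2.0 * r0 * r * cos_psi)
        return r * r * math.cos(phi) / dist

    return G_NEWTON * rho * glq_3d(kernel, r_lim, phi_lim, lam_lim)

# A 1 km thick tesseroid of density 2670 kg/m^3, observed about 630 km above it
r_lim, phi_lim, lam_lim = (6370e3, 6371e3), (0.0, 0.01), (0.0, 0.01)
v = tesseroid_potential((7000e3, 0.005, 0.005), 2670.0, r_lim, phi_lim, lam_lim)
```

With a fixed low-order rule like this one, accuracy degrades as the observation point approaches the tesseroid, which is the weakness that adaptive subdivision of the source is meant to address.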
NASA Astrophysics Data System (ADS)
Zhu, Dan; Shang, Jing; Ye, Xiaodong; Shen, Jian
2016-12-01
Understanding macromolecular structures and interactions is important but difficult, because macromolecules adopt versatile conformations and aggregate states, which vary with environmental conditions and histories. In this work, two polyamides with parallel or anti-parallel dipoles along the linear backbone, named ABAB (parallel) and AABB (anti-parallel), have been studied. By using a combination of methods, the phase behaviors of the polymers during aggregation and gelation, i.e., the formation or dissociation of nuclei and fibrils, clusters of fibrils, and cluster-cluster aggregation, have been revealed. Such abundant phase behaviors are dominated by the inter-chain interactions, including dispersion, polarity and hydrogen bonding, and are correlated with the solubility parameters of the solvents, the temperature, and the polymer concentration. The results of X-ray diffraction and fast-mode dielectric relaxation indicate that AABB possesses a more rigid conformation than ABAB. Because AABB aggregates form long fibers while ABAB forms hairy fibril clusters, the gelation concentration in toluene is 1 w/v% for AABB, lower than the 3 w/v% for ABAB.
Method and apparatus for offloading compute resources to a flash co-processing appliance
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing -bung
2015-10-13
Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage with asynchronous migration between the burst buffer nodes and slower more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.
Fast iterative censoring CFAR algorithm for ship detection from SAR images
NASA Astrophysics Data System (ADS)
Gu, Dandan; Yue, Hui; Zhang, Yuan; Gao, Pengcheng
2017-11-01
Ship detection is one of the essential techniques for ship recognition from synthetic aperture radar (SAR) images. This paper presents a fast iterative detection procedure to eliminate the influence of target returns on the estimation of local sea clutter distributions for constant false alarm rate (CFAR) detectors. A fast block detector is first employed to extract potential target sub-images; then, an iterative censoring CFAR algorithm is used to detect ship candidates from each target block adaptively and efficiently, where parallel detection is available, and the statistical parameters of the G0 distribution, which fits local sea clutter well, can be quickly estimated based on an integral image operator. Experimental results of TerraSAR-X images demonstrate the effectiveness of the proposed technique.
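The combination of iterative censoring and an integral image operator can be illustrated with a deliberately simplified sketch. It assumes a plain mean-based threshold (pixel greater than `factor` times the local clutter mean) in place of the G0 clutter model used in the paper, and omits the block pre-detector; window size, factor and all names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 in constant time."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

def censoring_cfar(img, half=2, factor=3.0, max_iters=5):
    """Iteratively re-estimate local clutter with detected pixels censored."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]               # 1 marks a detected target pixel
    for _ in range(max_iters):
        # Censor current detections so they do not bias the clutter estimate
        clutter = [[0.0 if mask[y][x] else img[y][x] for x in range(w)] for y in range(h)]
        valid = [[0 if mask[y][x] else 1 for x in range(w)] for y in range(h)]
        ii_val, ii_cnt = integral_image(clutter), integral_image(valid)
        new_mask = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - half), min(h, y + half + 1)
                x0, x1 = max(0, x - half), min(w, x + half + 1)
                count = box_sum(ii_cnt, y0, x0, y1, x1)
                if count == 0:                        # no clutter samples left in the window
                    new_mask[y][x] = mask[y][x]
                    continue
                mean = box_sum(ii_val, y0, x0, y1, x1) / count
                new_mask[y][x] = 1 if img[y][x] > factor * mean else 0
        if new_mask == mask:                          # detections stopped changing
            break
        mask = new_mask
    return mask

# Uniform clutter with a single bright scatterer in the centre
scene = [[1.0] * 7 for _ in range(7)]
scene[3][3] = 50.0
detections = censoring_cfar(scene)
```

The integral images make each window statistic cost four lookups regardless of window size, and each pixel's decision within an iteration is independent of the others, which is what makes parallel detection straightforward.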
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier
1992-01-01
Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Radiography and partial tomography of wood with thermal neutrons
NASA Astrophysics Data System (ADS)
Osterloh, K.; Fratzscher, D.; Schwabe, A.; Schillinger, B.; Zscherpel, U.; Ewert, U.
2011-09-01
The high effective neutron absorption coefficient of hydrogen (48.5 cm²/g), which is due to scattering, allows neutrons to reveal hydrocarbon structures with more contrast than X-rays, but at the same time limits the sample size and thickness that can be investigated. Many planar shaped objects, particularly wood samples, are sufficiently thin to allow thermal neutrons to transmit through the sample in a direction perpendicular to the planar face but not in a parallel direction, due to increased thickness. Often, this is an obstacle that prevents some tomographic reconstruction algorithms from obtaining desired results because of inadequate information or presence of distracting artifacts due to missing projections. This can be true for samples such as the distribution of glue in glulam (boards of wooden layers glued together), or the course of partially visible annual rings in trees where the features of interest are parallel to the planar surface of the sample. However, it should be possible to study these features by rotating the specimen within a limited angular range. In principle, this approach has been shown previously in a study with fast neutrons [2]. A study of this kind was performed at the Antares facility of FRM II in Garching with a 2.6×10⁷ /(cm² s) thermal neutron beam. The limit of penetration was determined for a wooden step wedge carved from a 2 cm×4 cm block of wood in comparison to other materials such as heavy metals and Lucite as specimens rich in hydrogen. The depth of the steps was 1 cm, the height 0.5 cm. The annual ring structures were clearly detectable up to 2 cm thickness. Wooden specimens, i.e. shivers, from a sunken old ship have been subjected to tomography. Not visible from the outside, clear radial structures have been found that are typical for certain kinds of wood. This insight was impaired in a case where the specimen had been soaked with ethylene glycol.
In another large sample study, a planar board made of glulam has been studied to show the glued layers. This study shows not only the limits of penetration in wood but also demonstrates access to structures perpendicular to the surface in larger planar objects by tomography with fast neutrons, even with incomplete sets of projection data that cover an angular range of only 90° or even 60°.
NASA Astrophysics Data System (ADS)
Shahzad, M.; Rizvi, H.; Panwar, A.; Ryu, C. M.
2017-06-01
We have re-visited the existence criterion of the reverse shear Alfven eigenmodes (RSAEs) in the presence of the parallel equilibrium current by numerically solving the eigenvalue equation using a fast eigenvalue solver code KAES. The parallel equilibrium current can bring in the kink effect and is known to be strongly unfavorable for the RSAE. We have numerically estimated the critical value of the toroidicity factor Qtor in a circular tokamak plasma, above which RSAEs can exist, and compared it to the analytical one. The difference between the numerical and analytical critical values is small for low frequency RSAEs, but it increases as the frequency of the mode increases, becoming greater for higher poloidal harmonic modes.
Parallel processing approach to transform-based image coding
NASA Astrophysics Data System (ADS)
Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.
1991-06-01
This paper describes a flexible parallel processing architecture designed for use in real-time video processing. The system consists of floating-point DSP processors connected to each other via fast serial links; each processor has access to a globally shared memory. A multiple bus architecture in combination with a dual ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed, and results are presented with image statistics and data rates. Finally, techniques for accelerating the system throughput are analyzed, and results from the application of one such modification are described.
A convenient and accurate parallel Input/Output USB device for E-Prime.
Canto, Rosario; Bufalari, Ilaria; D'Ausilio, Alessandro
2011-03-01
Psychological and neurophysiological experiments require the accurate control of timing and synchrony for Input/Output signals. For instance, a typical Event-Related Potential (ERP) study requires an extremely accurate synchronization of stimulus delivery with recordings. This is typically done via computer software such as E-Prime, and fast communications are typically assured by the Parallel Port (PP). However, the PP is an old and disappearing technology that, for example, is no longer available on portable computers. Here we propose a convenient USB device enabling parallel I/O capabilities. We tested this device against the PP on both a desktop and a laptop machine in different stress tests. Our data demonstrate the accuracy of our system, which suggests that it may be a good substitute for the PP with E-Prime.
Merlin - Massively parallel heterogeneous computing
NASA Technical Reports Server (NTRS)
Wittie, Larry; Maples, Creve
1989-01-01
Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.
A transient-enhanced NMOS low dropout voltage regulator with parallel feedback compensation
NASA Astrophysics Data System (ADS)
Han, Wang; Lin, Tan
2016-02-01
This paper presents a transient-enhanced NMOS low-dropout regulator (LDO) for portable applications with parallel feedback compensation. The parallel feedback structure adds a dynamic zero to get an adequate phase margin with a load current variation from 0 to 1 A. A class-AB error amplifier and a fast charging/discharging unit are adopted to enhance the transient performance. The proposed LDO has been implemented in a 0.35 μm BCD process. From experimental results, the regulator can operate with a minimum dropout voltage of 150 mV at a maximum 1 A load and IQ of 165 μA. Under the full range load current step, the voltage undershoot and overshoot of the proposed LDO are reduced to 38 mV and 27 mV respectively.
2D-RBUC for efficient parallel compression of residuals
NASA Astrophysics Data System (ADS)
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with an efficient SIMD parallel decompression. The residuals originate from lossy or near lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and seamlessly reconstructed surface, the newly proposed method trades off a small efficiency degradation for a non-negligible compression ratio benefit (measured up to 91%).
Depth variations of P-wave azimuthal anisotropy beneath East Asia
NASA Astrophysics Data System (ADS)
Wei, W.; Zhao, D.; Xu, J.
2017-12-01
We present a new P-wave anisotropic tomographic model beneath East Asia by inverting a total of 1,488,531 P wave arrival-time data recorded by the regional seismic networks in East Asia and temporary seismic arrays deployed on the Tibetan Plateau. Our results provide important new insights into the subducting Indian, Pacific and Philippine Sea plates and mantle dynamics in East Asia. Our tomographic images show that the northern limit of the subducting Indian plate has reached the Jinsha River suture in eastern Tibet. A striking variation of P-wave azimuthal anisotropy is revealed in the Indian lithosphere: the fast velocity direction (FVD) is NE-SW beneath the Indian continent, whereas the FVD is arc parallel beneath the Himalaya and Tibetan Plateau, which may reflect re-orientation of minerals due to lithospheric extension, in response to the India-Eurasia collision. The FVD in the subducting Philippine Sea plate beneath the Ryukyu arc is NE-SW (trench parallel), which is consistent with the spreading direction of the West Philippine Basin during its initial opening stage, suggesting that it may reflect the fossil anisotropy. A circular pattern of FVDs is revealed around the Philippine Sea slab beneath SE China. We suggest that it reflects asthenospheric strain caused by toroidal mantle flow around the edge of the subducting slab. We find a striking variation of the FVD with depth in the subducting Pacific slab beneath the Northeast Japan arc. It may be caused by slab dehydration that changed elastic properties of the slab with depth. The FVD in the mantle wedge beneath the Northeast Japan and Ryukyu arcs is trench normal, which reflects subduction-induced convection. Beneath the Kuril and Izu-Bonin arcs where oblique subduction occurs, the FVD in the mantle wedge is nearly normal to the moving direction of the downgoing Pacific plate, suggesting that the oblique subduction together with the complex slab morphology has disturbed the mantle flow.
MMS Observations of Parallel Electric Fields During a Quasi-Perpendicular Bow Shock Crossing
NASA Astrophysics Data System (ADS)
Goodrich, K.; Schwartz, S. J.; Ergun, R.; Wilder, F. D.; Holmes, J.; Burch, J. L.; Gershman, D. J.; Giles, B. L.; Khotyaintsev, Y. V.; Le Contel, O.; Lindqvist, P. A.; Strangeway, R. J.; Russell, C.; Torbert, R. B.
2016-12-01
Previous observations of the terrestrial bow shock have frequently shown large-amplitude fluctuations in the parallel electric field. These parallel electric fields are seen as both nonlinear solitary structures, such as double layers and electron phase-space holes, and short-wavelength waves, which can reach amplitudes greater than 100 mV/m. The Magnetospheric Multi-Scale (MMS) Mission has crossed the Earth's bow shock more than 200 times. The parallel electric field signatures observed in these crossings are seen in very discrete packets and evolve over time scales of less than a second, indicating the presence of a wealth of kinetic-scale activity. The high time resolution of the Fast Plasma Investigation (FPI) available on MMS offers greater detail of the kinetic-scale physics that occur at bow shocks than ever before, allowing greater insight into the overall effect of these observed electric fields. We present a characterization of these parallel electric fields found in a single bow shock event and how it reflects the kinetic-scale activity that can occur at the terrestrial bow shock.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishioka, K.; Nakamura, Y.; Nishimura, S.
A moment approach to calculate neoclassical transport in non-axisymmetric torus plasmas composed of multiple ion species is extended to include the external parallel momentum sources due to unbalanced tangential neutral beam injections (NBIs). The momentum sources that are included in the parallel momentum balance are calculated from the collision operators of background particles with fast ions. This method is applied for the clarification of the physical mechanism of the neoclassical parallel ion flows and the multi-ion species effect on them in Heliotron J NBI plasmas. It is found that parallel ion flow can be determined by the balance between the parallel viscosity and the external momentum source in the region where the external source is much larger than the thermodynamic force driven source in the collisional plasmas. This is because the friction between C⁶⁺ and D⁺ prevents a large difference between C⁶⁺ and D⁺ flow velocities in such plasmas. The C⁶⁺ flow velocities, which are measured by the charge exchange recombination spectroscopy system, are numerically evaluated with this method. It is shown that the experimentally measured C⁶⁺ impurity flow velocities do not clearly contradict the neoclassical estimations, and the dependence of parallel flow velocities on the magnetic field ripples is consistent in both results.
A new fast direct solver for the boundary element method
NASA Astrophysics Data System (ADS)
Huang, S.; Liu, Y. J.
2017-09-01
A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank matrix. The hierarchical off-diagonal low-rank matrix can be decomposed into the multiplication of several diagonal block matrices. The inverse of the hierarchical off-diagonal low-rank matrix can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximate the coefficient matrix of the BEM with the hierarchical off-diagonal low-rank matrix is proposed. Compared to the current fast direct solver based on the hierarchical off-diagonal low-rank matrix, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with more than 200,000 unknowns in total are presented. The results show that the new fast direct solver can be applied to solve large 3-D BEM models accurately and with better efficiency compared with the conventional BEM.
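The role of the Sherman-Morrison-Woodbury formula mentioned above can be illustrated with a minimal sketch. It is not the authors' hierarchical solver: it assumes a single-level matrix of the form diag(d) + U V, where U (n × k) and V (k × n) are a low-rank pair, and the function name `woodbury_solve` is hypothetical.

```python
import numpy as np

def woodbury_solve(d, U, V, b):
    """Solve (diag(d) + U @ V) x = b via the Sherman-Morrison-Woodbury identity.

    Assumption for this sketch: the "easy" part of the matrix is diagonal,
    so its inverse is an element-wise division. Only a small k x k system
    is solved densely, where k is the rank of the update.
    """
    Ainv_b = b / d                      # A^{-1} b with A = diag(d)
    Ainv_U = U / d[:, None]             # A^{-1} U, shape (n, k)
    k = U.shape[1]
    S = np.eye(k) + V @ Ainv_U          # k x k "capacitance" matrix
    return Ainv_b - Ainv_U @ np.linalg.solve(S, V @ Ainv_b)

# Illustrative data only (not taken from the paper).
rng = np.random.default_rng(0)
n, k = 200, 5
d = 2.0 + rng.random(n)
U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((k, n))
b = rng.standard_normal(n)
x = woodbury_solve(d, U, V, b)
```

The cost is dominated by forming V A⁻¹U and solving a k × k system, instead of factorizing the full n × n matrix. A hierarchical solver would apply the same identity recursively over nested blocks.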
Control of parallel manipulators using force feedback
NASA Technical Reports Server (NTRS)
Nanua, Prabjot
1994-01-01
Two control schemes are compared for parallel robotic mechanisms actuated by hydraulic cylinders. One scheme, the 'rate based scheme', uses the position and rate information only for feedback. The second scheme, the 'force based scheme' feeds back the force information also. The force control scheme is shown to improve the response over the rate control one. It is a simple constant gain control scheme better suited to parallel mechanisms. The force control scheme can be easily modified for the dynamic forces on the end effector. This paper presents the results of a computer simulation of both the rate and force control schemes. The gains in the force based scheme can be individually adjusted in all three directions, whereas the adjustment in just one direction of the rate based scheme directly affects the other two directions.
Suzuki, Miwa; Lee, Andrew Y; Vázquez-Medina, José Pablo; Viscarra, Jose A; Crocker, Daniel E; Ortiz, Rudy M
2015-05-15
Fibroblast growth factor (FGF)-21 is secreted from the liver, pancreas, and adipose in response to prolonged fasting/starvation to facilitate lipid and glucose metabolism. Northern elephant seals naturally fast for several months, maintaining a relatively elevated metabolic rate to satisfy their energetic requirements. Thus, to better understand the impact of prolonged food deprivation on FGF21-associated changes, we analyzed the expression of FGF21, FGF receptor-1 (FGFR1), β-klotho (KLB; a co-activator of FGFR) in adipose, and plasma FGF21, glucose and 3-hydroxybutyrate in fasted elephant seal pups. Expression of FGFR1 and KLB mRNA decreased 98% and 43%, respectively, with fasting duration. While the 80% decrease in mean adipose FGF21 mRNA expression with fasting did not reach statistical significance, it paralleled the 39% decrease in plasma FGF21 concentrations suggesting that FGF21 is suppressed with fasting in elephant seals. Data demonstrate an atypical response of FGF21 to prolonged fasting in a mammal suggesting that FGF21-mediated mechanisms have evolved differentially in elephant seals. Furthermore, the typical fasting-induced, FGF21-mediated actions such as the inhibition of lipolysis in adipose may not be required in elephant seals as part of a naturally adapted mechanism to support their unique metabolic demands during prolonged fasting. Copyright © 2015 Elsevier Inc. All rights reserved.
Usefulness of non-fasting lipid parameters in children.
Kubo, Toshihide; Takahashi, Kyohei; Furujo, Mahoko; Hyodo, Yuki; Tsuchiya, Hiroki; Hattori, Mariko; Fujinaga, Shoko; Urayama, Kenji
2017-01-01
This study assessed whether non-fasting lipid markers could be substituted for fasting markers in screening for dyslipidemia, whether direct measurement of non-fasting low-density lipoprotein cholesterol [LDL-C (D)] could be substituted for the calculation of fasting LDL-C [LDL-C (F)], and the utility of measuring non-high-density lipoprotein cholesterol (non-HDL-C). In 33 children, the lipid profile was measured in the non-fasting and fasting states within 24 h. Correlations were examined between non-fasting LDL-C (D) or non-HDL-C levels and fasting LDL-C (F) levels. Non-fasting triglyceride (TG), total cholesterol (TC), HDL-C, LDL-C (D), and non-HDL-C levels were all significantly higher than the fasting levels, but the mean difference was within 10% (except for TG). Non-fasting LDL-C (D) and non-HDL-C levels were strongly correlated with the fasting LDL-C (F) levels. In conclusion, except for TG, non-fasting lipid parameters are useful when screening children for dyslipidemia. Direct measurement of non-fasting LDL-C and calculation of non-fasting non-HDL-C could replace the calculation of fasting LDL-C because of convenience.
A fast pulse design for parallel excitation with gridding conjugate gradient.
Feng, Shuo; Ji, Jim
2013-01-01
Parallel excitation (pTx) is recognized as a crucial technique in high-field MRI for addressing transmit field inhomogeneity. However, designing pTx pulses can be undesirably time consuming. In this work, we propose a pulse design method with gridding conjugate gradient (CG) based on the small-tip-angle approximation. The two major time-consuming matrix-vector multiplications are replaced by two operators that involve only FFT and gridding. Simulation results show that the proposed method is 3 times faster than the conventional method and that the memory cost is reduced by a factor of 1000.
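The general idea of replacing explicit matrix-vector products with fast operators inside CG can be sketched as follows. This is an illustration only, not the authors' pulse design: as a stand-in for the FFT-and-gridding operators, the forward operator here is assumed to be a circular convolution applied through the FFT, and all names (`cg_normal_equations`, `apply_A`, `apply_AH`) are hypothetical.

```python
import numpy as np

def cg_normal_equations(apply_A, apply_AH, b, lam=0.0, n_iter=200, tol=1e-10):
    """Matrix-free CG for (A^H A + lam*I) x = A^H b.

    A is never stored; only functions applying A and its adjoint are needed.
    """
    rhs = apply_AH(b)
    x = np.zeros_like(rhs)
    r = rhs.copy()                       # residual for the initial guess x = 0
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(n_iter):
        Ap = apply_AH(apply_A(p)) + lam * p
        alpha = rs_old / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Stand-in operator (an assumption of this sketch): circular convolution
# whose frequency response K is chosen to be well conditioned.
rng = np.random.default_rng(1)
n = 64
K = 1.0 + 0.5 * np.exp(2j * np.pi * rng.random(n))

def apply_A(x):
    return np.fft.ifft(K * np.fft.fft(x))

def apply_AH(y):
    return np.fft.ifft(np.conj(K) * np.fft.fft(y))

x_true = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = apply_A(x_true)
x_est = cg_normal_equations(apply_A, apply_AH, b)
```

Each CG iteration costs two FFT pairs and stores only a few length-n vectors, which is the kind of time and memory saving the abstract attributes to avoiding explicit system matrices.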
NASA Astrophysics Data System (ADS)
Alves Júnior, A. A.; Sokoloff, M. D.
2017-10-01
MCBooster is a header-only, C++11-compliant library that provides routines to generate and perform calculations on large samples of phase space Monte Carlo events. To achieve superior performance, MCBooster can perform most of its calculations in parallel using CUDA- and OpenMP-enabled devices. MCBooster is built on top of the Thrust library and runs on Linux systems. This contribution summarizes the main features of MCBooster. A basic description of the user interface and some examples of applications are provided, along with measurements of performance in a variety of environments.
Jacobsohn, D.H.; Merrill, L.C.
1959-01-20
An improved parallel addition unit is described which is especially adapted for use in electronic digital computers and characterized by propagation of the carry signal through each of a plurality of denominationally ordered stages within a minimum time interval. In its broadest aspects, the invention incorporates a fast multistage parallel digital adder including a plurality of adder circuits, carry-propagation circuit means in all but the most significant digit stage, means for conditioning each carry-propagation circuit during the time period in which information is placed into the adder circuits, and means coupling carry-generation portions of the adder circuit to the carry propagating means.
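The distinction between carry generation and carry propagation in a multistage binary adder can be shown with a short software model. This is a generic textbook illustration, not the patented circuit; the function names and the little-endian bit-list representation are assumptions of the sketch.

```python
def to_bits(value, width):
    """Little-endian list of `width` bits for a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))

def add_with_carry_chain(a_bits, b_bits, carry_in=0):
    """Add two equal-length little-endian bit lists.

    Per stage: generate = a AND b (the stage creates a carry by itself),
    propagate = a XOR b (the stage passes an incoming carry along).
    The propagate signals depend only on the inputs, so they can be
    prepared while the operands are being loaded.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    generate = [a & b for a, b in zip(a_bits, b_bits)]
    propagate = [a ^ b for a, b in zip(a_bits, b_bits)]
    carry = carry_in
    sum_bits = []
    for g, p in zip(generate, propagate):
        sum_bits.append(p ^ carry)       # sum bit of this stage
        carry = g | (p & carry)          # carry into the next stage
    return sum_bits, carry

width = 8
s_bits, c_out = add_with_carry_chain(to_bits(200, width), to_bits(100, width))
total = from_bits(s_bits) + (c_out << width)
```

In hardware, the speed gain comes from how quickly the carry travels through stages whose propagate signal is already set; the loop above models only the logic, not the timing.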
Perceptual learning in visual search: fast, enduring, but non-specific.
Sireteanu, R; Rettenbach, R
1995-07-01
Visual search has been suggested as a tool for isolating visual primitives. Elementary "features" were proposed to involve parallel search, while serial search is necessary for items without a "feature" status, or, in some cases, for conjunctions of "features". In this study, we investigated the role of practice in visual search tasks. We found that, under some circumstances, initially serial tasks can become parallel after a few hundred trials. Learning in visual search is far less specific than learning of visual discriminations and hyperacuity, suggesting that it takes place at another level in the central visual pathway, involving different neural circuits.
A Fast parallel tridiagonal algorithm for a class of CFD applications
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Sun, Xian-He
1996-01-01
The parallel diagonal dominant (PDD) algorithm is an efficient tridiagonal solver. This paper presents for study a variation of the PDD algorithm, the reduced PDD algorithm. The new algorithm maintains the minimum communication provided by the PDD algorithm, but has a reduced operation count. The PDD algorithm also has a smaller operation count than the conventional sequential algorithm for many applications. Accuracy analysis is provided for the reduced PDD algorithm for symmetric Toeplitz tridiagonal (STT) systems. Implementation results on Langley's Intel Paragon and IBM SP2 show that both the PDD and reduced PDD algorithms are efficient and scalable.
NASA Technical Reports Server (NTRS)
Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve
2004-01-01
The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for the MPP with MPI uses the concept of maintaining a similar code structure between the whole domain and the portions after decomposition. Hence the model follows the same integration for single and multiple tasks (CPUs). Also, it provides for minimal changes to the original code, so it is easily modified and/or managed by the model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime, which is overlaid on partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions have speedups of about 99% up to 256 tasks but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computations than the compressible version; 3) equal or approximately equal numbers of slices between the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation for computation.
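The halo scheme described above can be mimicked without MPI in a few lines. The sketch below is an assumption-laden toy, not GCE code: it uses a one-dimensional periodic domain, a one-cell halo, a three-point stencil, and plain NumPy arrays standing in for MPI tasks; all function names are hypothetical.

```python
import numpy as np

def laplacian_periodic(u):
    """Three-point stencil on the whole (undecomposed) periodic domain."""
    return np.roll(u, 1) - 2.0 * u + np.roll(u, -1)

def split_with_halo(u, n_tasks):
    """Slice u into n_tasks pieces, each padded with one halo cell per side."""
    return [np.concatenate(([0.0], chunk, [0.0]))
            for chunk in np.array_split(u, n_tasks)]

def exchange_halos(parts):
    """Update halos from neighbouring pieces (stands in for MPI messages)."""
    n = len(parts)
    for i, part in enumerate(parts):
        part[0] = parts[(i - 1) % n][-2]   # last interior cell of left neighbour
        part[-1] = parts[(i + 1) % n][1]   # first interior cell of right neighbour

def local_laplacian(part):
    """Same stencil, applied to interior cells using only local data."""
    return part[:-2] - 2.0 * part[1:-1] + part[2:]

rng = np.random.default_rng(2)
u = rng.standard_normal(37)
parts = split_with_halo(u, n_tasks=4)
exchange_halos(parts)                      # communication stage
decomposed = np.concatenate([local_laplacian(p) for p in parts])  # compute stage
whole = laplacian_periodic(u)
```

After the exchange, each piece computes with local data only, and the concatenated result matches the whole-domain computation, which mirrors the goal of keeping the same integration for single and multiple tasks.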
NASA Technical Reports Server (NTRS)
Voellmer, George
1992-01-01
Compliant element for robot wrist accepts small displacements in one direction only (to first approximation). Three such elements combined to obtain translational compliance along three orthogonal directions, without rotational compliance along any of them. Element is double-blade flexure joint in which two sheets of spring steel attached between opposing blocks, forming rectangle. Blocks moved parallel to each other in one direction only. Sheets act as double cantilever beams deforming in S-shape, keeping blocks parallel.
rfpipe: Radio interferometric transient search pipeline
NASA Astrophysics Data System (ADS)
Law, Casey J.
2017-10-01
rfpipe supports Python-based analysis of radio interferometric data (especially from the Very Large Array) and searches for fast radio transients. It extends the rtpipe library (ascl:1706.002) with new approaches to parallelization, acceleration, and more portable data products. rfpipe can run in standalone mode or in a cluster environment.
USDA-ARS's Scientific Manuscript database
New, faster methods have been developed for analysis of vitamin D and triacylglycerols that eliminate hours of wet chemistry and preparative chromatography, while providing more information than classical methods for analysis. Unprecedented detail is provided by combining liquid chromatography with ...
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm
NASA Technical Reports Server (NTRS)
Povitsky, A.
1998-01-01
In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
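For reference, the forward and backward steps of the serial Thomas algorithm, on which the pipelined variant is built, can be sketched as below. This is the standard sequential method for one tridiagonal line, not the reformulated pipelined solver; the function name and array layout are assumptions of the sketch.

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the serial Thomas algorithm.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    # Forward step: eliminate the sub-diagonal.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Backward step: back substitution from the last unknown.
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Illustrative diagonally dominant system (not from the paper).
rng = np.random.default_rng(3)
n = 50
a = rng.random(n)
c = rng.random(n)
b = 4.0 + rng.random(n)
d = rng.standard_normal(n)
x = thomas_solve(a, b, c, d)
```

Both loops are sequential along a line: the backward step cannot start before the forward step reaches the end. When a line is split across processors, this dependency is the source of the idle time that the pipelined and reformulated schemes try to fill.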
Risk of misclassification with a non-fasting lipid profile in secondary cardiovascular prevention.
Klop, Boudewijn; Hartong, Simone C C; Vermeer, Henricus J; Schoofs, Mariette W C J; Kofflard, Marcel J M
2017-09-01
Routine fasting is not necessary for measuring the lipid profile according to the latest European consensus. However, LDL-C tends to be lower in the non-fasting state with risk of misclassification. The extent of misclassification in secondary cardiovascular prevention with a non-fasting lipid profile was investigated. 329 patients on lipid lowering therapy for secondary cardiovascular prevention measured a fasting and non-fasting lipid profile. Cut-off values for LDL-C, non-HDL-C and apolipoprotein B were set at <1.8 mmol/l, <2.6 mmol/l and <0.8 g/l, respectively. Study outcomes were net misclassification with non-fasting LDL-C (calculated using the Friedewald formula), direct LDL-C, non-HDL-C and apolipoprotein B. Net misclassification <10% was considered clinically irrelevant. Mean age was 68.3±8.5 years and the majority were men (79%). Non-fasting measurements resulted in lower LDL-C (-0.2±0.4 mmol/l, P<0.001), direct LDL-C (-0.1±0.2 mmol/l, P=0.001), non-HDL-C (-0.1±0.4 mmol/l, P=0.004) and apolipoprotein B (-0.02±0.10 g/l, P=0.004). 36.0% of the patients reached a fasting LDL-C target of <1.8 mmol/l with a significant net misclassification of 10.7% (95% CI 6.4-15.0%) in the non-fasting state. In the non-fasting state net misclassification with direct LDL-C was 5.7% (95% CI 2.1-9.2%), 4.0% (95% CI 1.0-7.4%) with non-HDL-C and 4.1% (95% CI 1.1-9.1%) with apolipoprotein B. Use of non-fasting LDL-C as treatment target in secondary cardiovascular prevention resulted in significant misclassification with subsequent risk of undertreatment, whereas non-fasting direct LDL-C, non-HDL-C and apolipoprotein B are reliable parameters. Copyright © 2017 Elsevier B.V. All rights reserved.
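How a calculated LDL-C value can cross the treatment target between the fasting and non-fasting state can be shown with a small worked sketch. It uses the Friedewald formula in mmol/l (LDL-C = TC − HDL-C − TG/2.2) and the 1.8 mmol/l target quoted above; the patient values are hypothetical and chosen only for illustration, and the function names are assumptions.

```python
LDL_TARGET_MMOL_L = 1.8   # cut-off quoted in the abstract

def friedewald_ldl(total_chol, hdl, triglycerides):
    """Calculated LDL-C in mmol/l: TC - HDL-C - TG/2.2 (Friedewald formula)."""
    return total_chol - hdl - triglycerides / 2.2

def non_hdl(total_chol, hdl):
    """Non-HDL-C in mmol/l, which does not depend on triglycerides."""
    return total_chol - hdl

def at_target(ldl):
    return ldl < LDL_TARGET_MMOL_L

# Hypothetical patient: only triglycerides rise after a meal.
fasting_ldl = friedewald_ldl(total_chol=3.6, hdl=1.2, triglycerides=1.1)
non_fasting_ldl = friedewald_ldl(total_chol=3.6, hdl=1.2, triglycerides=1.76)
misclassified = at_target(non_fasting_ldl) and not at_target(fasting_ldl)
```

With these illustrative numbers the fasting value is about 1.9 mmol/l (above target) and the non-fasting value about 1.6 mmol/l (below target), so the patient would appear adequately treated only in the non-fasting state. Non-HDL-C is unchanged at 2.4 mmol/l in both states because triglycerides do not enter its calculation.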
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yee, Seonghwan, E-mail: Seonghwan.Yee@Beaumont.edu; Gao, Jia-Hong
Purpose: To investigate whether the direction of spin-lock field, either parallel or antiparallel to the rotating magnetization, has any effect on the spin-lock MRI signal and further on the quantitative measurement of T1ρ, in a clinical 3 T MRI system. Methods: The effects of inverted spin-lock field direction were investigated by acquiring a series of spin-lock MRI signals for an American College of Radiology MRI phantom, while the spin-lock field direction was switched between the parallel and antiparallel directions. The acquisition was performed for different spin-locking methods (i.e., for the single- and dual-field spin-locking methods) and for different levels of clinically feasible spin-lock field strength, ranging from 100 to 500 Hz, while the spin-lock duration was varied in the range from 0 to 100 ms. Results: When the spin-lock field was inverted into the antiparallel direction, the rate of MRI signal decay was altered and the T1ρ value, when compared to the value for the parallel field, was clearly different. Different degrees of such direction-dependency were observed for different spin-lock field strengths. In addition, the dependency was much smaller when the parallel and the antiparallel fields are mixed together in the dual-field method. Conclusions: The spin-lock field direction could impact the MRI signal and further the T1ρ measurement in a clinical MRI system.
NASA Astrophysics Data System (ADS)
Evangelidis, C. P.
2017-12-01
The segmentation and differentiation of subducting slabs have considerable effects on mantle convection and tectonics. The Hellenic subduction zone is a complex convergent margin with strong curvature and fast slab rollback. The upper mantle seismic anisotropy in the region is studied focusing at its western and eastern edges in order to explore the effects of possible slab segmentation on mantle flow and fabrics. Complementary to new SKS shear-wave splitting measurements in regions not adequately sampled so far, the source-side splitting technique is applied to constrain the depth of anisotropy and to densify measurements. In the western Hellenic arc, a trench-normal subslab anisotropy is observed near the trench. In the forearc domain, source-side and SKS measurements reveal a trench-parallel pattern. This indicates subslab trench-parallel mantle flow, associated with return flow due to the fast slab rollback. The passage from continental to oceanic subduction in the western Hellenic zone is illustrated by a forearc transitional anisotropy pattern. This indicates subslab mantle flow parallel to a NE-SW smooth ramp that possibly connects the two subducted slabs. A young tear fault initiated at the Kefalonia Transform Fault is likely not entirely developed, as this trench-parallel anisotropy pattern is observed along the entire western Hellenic subduction system, even following this horizontal offset between the two slabs. At the eastern side of the Hellenic subduction zone, subslab source-side anisotropy measurements show a general trench-normal pattern. These are associated with mantle flow through a possible ongoing tearing of the oceanic lithosphere in the area. Although the exact geometry of this slab tear is relatively unknown, SKS trench-parallel measurements imply that the tear has not reached the surface yet. 
Further exploration of the Hellenic subduction system is necessary; denser seismic networks should be deployed at both its edges in order to achieve a more definite image of the structure and geodynamics of this area.
Tetreault, J.; Jones, C.H.; Erslev, E.; Larson, S.; Hudson, M.; Holdaway, S.
2008-01-01
Significant fold-axis-parallel slip is accommodated in the folded strata of the Grayback monocline, northeastern Front Range, Colorado, without visible large strike-slip displacement on the fold surface. In many cases, oblique-slip deformation is partitioned; fold-axis-normal slip is accommodated within folds, and fold-axis-parallel slip is resolved onto adjacent strike-slip faults. Unlike partitioning strike-parallel slip onto adjacent strike-slip faults, fold-axis-parallel slip has deformed the forelimb of the Grayback monocline. Mean compressive paleostress orientations in the forelimb are deflected 15°-37° clockwise from the regional paleostress orientation of the northeastern Front Range. Paleomagnetic directions from the Permian Ingleside Formation in the forelimb are rotated 16°-42° clockwise about a bedding-normal axis relative to the North American Permian reference direction. The paleostress and paleomagnetic rotations increase with the bedding dip angle and decrease along strike toward the fold tip. These measurements allow for 50-120 m of fold-axis-parallel slip within the forelimb, depending on the kinematics of strike-slip shear. This resolved horizontal slip is nearly equal in magnitude to the ~180 m vertical throw across the fold. For 200 m of oblique-slip displacement (120 m of strike slip and 180 m of reverse slip), the true shortening direction across the fold is N90°E, indistinguishable from the regionally inferred direction of N90°E and quite different from the S53°E fold-normal direction. Recognition of this deformational style means that significant amounts of strike slip can be accommodated within folds without axis-parallel surficial faulting. © 2008 Geological Society of America.
HeinzelCluster: accelerated reconstruction for FORE and OSEM3D.
Vollmar, S; Michel, C; Treffert, J T; Newport, D F; Casey, M; Knöss, C; Wienhard, K; Liu, X; Defrise, M; Heiss, W D
2002-08-07
Using iterative three-dimensional (3D) reconstruction techniques for reconstruction of positron emission tomography (PET) is not feasible on most single-processor machines due to the excessive computing time needed, especially so for the large sinogram sizes of our high-resolution research tomograph (HRRT). In our first approach to reducing reconstruction time, we transform the 3D scan into the format of a two-dimensional (2D) scan with sinograms that can be reconstructed independently using Fourier rebinning (FORE) and a fast 2D reconstruction method. On our dedicated reconstruction cluster (seven four-processor systems, Intel PIII@700 MHz, switched fast ethernet and Myrinet, Windows NT Server), we process these 2D sinograms in parallel. We have achieved a speedup > 23 using 26 processors and also compared results for different communication methods (RPC, Syngo, Myrinet GM). The other approach is to parallelize OSEM3D (implementation of C. Michel), which has produced the best results for HRRT data so far and is more suitable for an adequate treatment of the sinogram gaps that result from the detector geometry of the HRRT. We have implemented two levels of parallelization for our dedicated cluster (a shared memory fine-grain level on each node utilizing all four processors and a coarse-grain level allowing for 15 nodes), reducing the time for one core iteration from over 7 h to about 35 min.
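The figures quoted in this abstract can be turned into the usual scaling metrics. The short Python sketch below is illustrative only: the function names are hypothetical, and it treats the reported "speedup > 23" and "over 7 h" as exact values, so the results are approximate bounds rather than measurements.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Speedup divided by processor count; 1.0 would be ideal scaling."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


def speedup_from_times(before_minutes: float, after_minutes: float) -> float:
    """Ratio of the run time before parallelization to the run time after."""
    return before_minutes / after_minutes


# FORE path: speedup > 23 on 26 processors (taken here as exactly 23).
fore_efficiency = parallel_efficiency(23.0, 26)

# OSEM3D path: "over 7 h" to "about 35 min" per core iteration
# (taken here as exactly 420 min and 35 min).
osem_speedup = speedup_from_times(7 * 60, 35)

print(f"FORE efficiency (lower bound): {fore_efficiency:.3f}")
print(f"OSEM3D speedup (approximate):  {osem_speedup:.1f}x")
```

Under these assumptions the FORE path runs at roughly 88% parallel efficiency, and the OSEM3D iteration is about twelve times faster.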
NASA Astrophysics Data System (ADS)
Labombard, Brian
2013-10-01
A "Mirror Langmuir Probe" (MLP) diagnostic has been used to interrogate edge plasma profiles and turbulence in Alcator C-Mod with unprecedented detail, yielding fundamental insights on the Quasi-Coherent Mode (QCM) - a mode that regulates plasma density and impurities in EDA H-modes without ELMs. The MLP employs a fast-switching, self-adapting bias scheme, recording density, electron temperature and plasma potential simultaneously at high bandwidth (~1 MHz) on each of four separate electrodes on a scanning probe. Temporal dynamics are followed in detail; wavenumber-frequency spectra and phase relationships are readily deduced. Poloidal field fluctuations are recorded separately with a two-coil, scanning probe. Results from ohmic L-mode and H-mode plasmas are reported, including key observations of the QCM: The QCM lives in a region of positive radial electric field, with a mode width (~3 mm) that spans open and closed field line regions. Remarkably large amplitude (~30%), sinusoidal bursts in density, electron temperature and plasma potential fluctuations are observed that are in phase; potential lags density by at most 10 degrees. Propagation velocity of the mode corresponds to the sum of local E × B and electron diamagnetic drift velocities - quantities that are deduced directly from time-averaged profiles. Poloidal magnetic field fluctuations project to parallel current densities of ~5 amps/cm² in the mode layer, with significant parallel electromagnetic induction. Electron force balance is examined, unambiguously identifying the mode type. It is found that fluctuations in parallel electron pressure gradient are roughly balanced by the sum of electrostatic and electromotive forces. Thus the primary mode structure of the QCM is that of a drift-Alfven wave. Work supported by US DoE award DE-FC02-99ER54512.
Limits to high-speed simulations of spiking neural networks using general-purpose computers.
Zenke, Friedemann; Gerstner, Wulfram
2014-01-01
To understand how the central nervous system performs computations using recurrent neuronal circuitry, simulations have become an indispensable tool for theoretical neuroscience. To study neuronal circuits and their ability to self-organize, increasing attention has been directed toward synaptic plasticity. In particular spike-timing-dependent plasticity (STDP) creates specific demands for simulations of spiking neural networks. On the one hand a high temporal resolution is required to capture the millisecond timescale of typical STDP windows. On the other hand network simulations have to evolve over hours up to days, to capture the timescale of long-term plasticity. To do this efficiently, fast simulation speed is the crucial ingredient rather than large neuron numbers. Using different medium-sized network models consisting of several thousands of neurons and off-the-shelf hardware, we compare the simulation speed of the simulators: Brian, NEST and Neuron as well as our own simulator Auryn. Our results show that real-time simulations of different plastic network models are possible in parallel simulations in which numerical precision is not a primary concern. Even so, the speed-up margin of parallelism is limited and boosting simulation speeds beyond one tenth of real-time is difficult. By profiling simulation code we show that the run times of typical plastic network simulations encounter a hard boundary. This limit is partly due to latencies in the inter-process communications and thus cannot be overcome by increased parallelism. Overall, these results show that to study plasticity in medium-sized spiking neural networks, adequate simulation tools are readily available which run efficiently on small clusters. However, to run simulations substantially faster than real-time, special hardware is a prerequisite.
Block-Parallel Data Analysis with DIY2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Peterka, Tom
DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.
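The block-parallel abstraction described here (decompose data into blocks, assign blocks to processing elements, iterate over the blocks) can be illustrated with a small serial Python sketch. This is not DIY2's API, which is a C++ library; the names `decompose`, `assign_round_robin` and `foreach_block` are hypothetical, and the "workers" are simulated in a single process.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def decompose(n_items: int, n_blocks: int) -> List[Tuple[int, int]]:
    """Split range(n_items) into contiguous half-open [start, stop) blocks."""
    base, extra = divmod(n_items, n_blocks)
    bounds = []
    start = 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)  # first `extra` blocks get one more
        bounds.append((start, start + size))
        start += size
    return bounds


def assign_round_robin(n_blocks: int, n_workers: int) -> Dict[int, List[int]]:
    """Map each worker id to the list of block ids it owns."""
    owners: Dict[int, List[int]] = {w: [] for w in range(n_workers)}
    for b in range(n_blocks):
        owners[b % n_workers].append(b)
    return owners


def foreach_block(data: Sequence[int], bounds, owners, worker: int,
                  fn: Callable[[Sequence[int]], int]) -> Dict[int, int]:
    """Apply fn to every block owned by one worker; return results per block."""
    return {b: fn(data[bounds[b][0]:bounds[b][1]]) for b in owners[worker]}


data = list(range(10))
bounds = decompose(len(data), 4)
owners = assign_round_robin(4, 2)

# Each simulated worker reduces its own blocks; a final merge combines them.
partial: Dict[int, int] = {}
for w in owners:
    partial.update(foreach_block(data, bounds, owners, w, sum))
total = sum(partial.values())
print(bounds, owners, total)
```

Because the per-block function never looks outside its own block, the same loop could be run by one thread or many without changing the result, which is the property the abstract attributes to the block-parallel form.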
Ambient Noise Tomography of the Northwestern U.S. and the Adjacent Juan de Fuca and Gorda Plates
NASA Astrophysics Data System (ADS)
Wang, H.; Feng, L.; Tian, Y.; Ritzwoller, M. H.
2017-12-01
The NSF Cascadia Initiative (CI) experiment includes 4-year deployments of ocean bottom seismometers (OBSs) on the Juan de Fuca and Gorda Plates. The CI experiment provides an unprecedented opportunity to investigate the crustal and upper mantle structure of this region. The 259 OBSs switched between Cascadia North in Years 1 and 3 and Cascadia South in Years 2 and 4 at around 160 different sites. Using the OBSs together with 89 stations near the Pacific coast, we estimate empirical Green's functions (EGFs) between station pairs by cross-correlating ambient noise recorded on their vertical components. Unlike continental stations, the OBSs are contaminated mainly by tilt and compliance noise at low frequencies (<0.1 Hz), which obscures the coherent ambient noise and makes it more difficult to retrieve reliable EGFs. Compliance noise comes from the seafloor deformation under gravity waves and its strength depends mostly on the pressure signal, so compliance noise can be reduced significantly using the pressure record. Tilt noise is induced by currents near the seafloor, and the horizontal records are dominated by tilt noise at frequencies below 0.1 Hz. Tilt noise on the vertical components can be reduced using the horizontal components. The "denoised" cross-correlations provide more reliable EGFs with a higher signal-to-noise ratio (SNR). Based on the estimated EGFs from the "denoised" vertical records, we use frequency-time analysis (FTAN) to retrieve the dispersion curve of Rayleigh waves between station pairs. Using the Rayleigh wave dispersion curves, we perform seismic tomography to construct isotropic and azimuthally anisotropic phase velocity maps at periods from about 8 to 30 s across the Juan de Fuca and Gorda plates, extending up onto the continent. Previous studies have shown that the fast axis directions of 2ψ azimuthal anisotropy align parallel to present-day plate motion directions at longer periods and parallel to paleospreading directions at shorter periods.
We also investigate the relationship between azimuthal anisotropy and plate motion direction across the Juan de Fuca and Gorda plates.
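The core operation behind the empirical Green's function estimate, cross-correlating records from a station pair and reading off the lag of the correlation peak, can be sketched in a few lines of Python. This is a toy illustration under strong assumptions: the two "records" are a synthetic pulse and a delayed copy of it, not ambient noise, and none of the tilt or compliance corrections, stacking, or FTAN steps described above are included. The function name `cross_correlate` is hypothetical.

```python
def cross_correlate(a, b, max_lag):
    """Return {lag: sum_i a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(a)):
            j = i + lag
            if 0 <= j < len(b):  # ignore samples that fall off the record
                total += a[i] * b[j]
        out[lag] = total
    return out


# Synthetic records: station B sees the same pulse as station A, 3 samples later.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0]
delay = 3
station_a = pulse + [0.0] * 10
station_b = [0.0] * delay + pulse + [0.0] * (10 - delay)

cc = cross_correlate(station_a, station_b, max_lag=5)
best_lag = max(cc, key=cc.get)
print("lag of correlation peak (samples):", best_lag)
```

With real data the peak lag, multiplied by the sampling interval, approximates the inter-station travel time of the dominant surface wave; here it simply recovers the imposed delay.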
Use Computer-Aided Tools to Parallelize Large CFD Applications
NASA Technical Reports Server (NTRS)
Jin, H.; Frumkin, M.; Yan, J.
2000-01-01
Porting applications to high performance parallel computers is always a challenging task. It is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem has become even more severe. Today, scalability and high performance mostly involve handwritten parallel programs using message-passing libraries (e.g. MPI). However, this process is very difficult and often error-prone. The recent reemergence of shared memory parallel (SMP) architectures, such as the cache coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space. The user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs for SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C and C++ to express shared memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives may not necessarily enhance performance. In the worst cases, they can create erroneous results. While vendors have provided tools to perform error-checking and profiling, automation in directive insertion is very limited and often fails on large programs, primarily due to the lack of a sufficiently thorough data dependence analysis. To overcome this deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization.
CAPO is aimed at taking advantage of detailed inter-procedural dependence analysis provided by CAPTools, developed by the University of Greenwich, to reduce potential errors made by users. Earlier tests on NAS Benchmarks and ARC3D have demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D and INS3D. These codes are widely used for solving Navier-Stokes equations with complicated boundary conditions and turbulence models in multiple zones. Each comprises from 50K to 100K lines of FORTRAN77. As an example, CAPO took 77 hours to complete the data dependence analysis of OVERFLOW on a workstation (SGI, 175MHz, R10K processor). A fair amount of effort was spent on correcting false dependencies due to lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process. The OpenMP version was generated within a day after the analysis was completed. Due to the sequential algorithms involved, code sections in TLNS3D and INS3D need to be restructured by hand to produce more efficient parallel codes. An included figure shows preliminary test results of the generated OVERFLOW with several test cases in a single zone. The MPI data points for the small test case were taken from a handcoded MPI version. CAPO's version achieved an 18-fold speedup on 32 nodes of the SGI O2K. For the small test case, it outperformed the MPI version. These results are very encouraging, but further work is needed. For example, although CAPO attempts to place directives on the outermost parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy. In particular, it lacks support for parallelization at the multi-zone level.
Future work will emphasize the development of methodology to work at the multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformations is also needed.
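One way to read the reported 18-fold speedup on 32 nodes is through the Karp-Flatt metric, which estimates the effective serial fraction implied by a measured speedup. The Python sketch below is an illustration added here, not an analysis from the abstract; the function name is hypothetical, and the metric lumps true serial work together with parallel overhead.

```python
def karp_flatt_serial_fraction(speedup: float, processors: int) -> float:
    """Experimentally determined serial fraction e = (1/S - 1/p) / (1 - 1/p)."""
    if processors < 2:
        raise ValueError("need at least two processors")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


speedup, nodes = 18.0, 32  # figures quoted for the OpenMP OVERFLOW run
efficiency = speedup / nodes
serial_fraction = karp_flatt_serial_fraction(speedup, nodes)

print(f"parallel efficiency:        {efficiency:.4f}")
print(f"implied serial fraction e:  {serial_fraction:.4f}")
```

Under this reading, an efficiency of about 56% corresponds to an effective serial fraction of roughly 2.5%, which shows how a small non-parallel share can limit scaling at 32 nodes.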
ERIC Educational Resources Information Center
Gil, Arturo; Peidró, Adrián; Reinoso, Óscar; Marín, José María
2017-01-01
This paper presents a tool, LABEL, oriented to the teaching of parallel robotics. The application, organized as a set of tools developed using Easy Java Simulations, enables the study of the kinematics of parallel robotics. A set of classical parallel structures was implemented such that LABEL can solve the inverse and direct kinematic problem of…
NASA Astrophysics Data System (ADS)
Al-Omari, S.
2006-12-01
The photophysical properties of the hexapyropheophorbide-a (P6) compound were studied using both steady-state and time-resolved spectroscopy. It was found that neighboring pyropheophorbide-a (pyroPheo) molecules, covalently linked to each other through carbon chains, could stack. This structural property makes possible the formation of two different types of energy traps, which could be resolved experimentally. One of them is formed via face-to-face stacking of two pyroPheo molecules with the transition dipole moments parallel to each other. The second type of energy trap, which has an oblique or orthogonal geometry of the transition dipole moments of the interacting pyroPheo molecules, gives the dominant contribution to the fluorescence signal at the registration wavelength. In either case, the dipole-dipole Förster energy transfer between pyroPheo molecules caused a very fast and efficient delivery of the excitation to a trap. As a result, the fluorescence and singlet oxygen quantum yields of P6 were reduced by factors of four and three, respectively, compared to the values for the reference bispyropheophorbide-a (P2) compound.
NASA Astrophysics Data System (ADS)
Hartmann, Jana; Steib, Frederik; Zhou, Hao; Ledig, Johannes; Nicolai, Lars; Fündling, Sönke; Schimpke, Tilman; Avramescu, Adrian; Varghese, Tansen; Trampert, Achim; Straßburg, Martin; Lugauer, Hans-Jürgen; Wehmann, Hergo-Heinrich; Waag, Andreas
2017-10-01
GaN fins are 3D architectures elongated in one direction parallel to the substrate surface. They have the geometry of walls with a large height to width ratio as well as small footprints. When appropriate symmetry directions of the GaN buffer are used, the sidewalls are formed by non-polar {1 1 -2 0} planes, making the fins particularly suitable for many device applications like LEDs, FETs, lasers, sensors or waveguides. The influence of growth parameters like temperature, pressure, V/III ratio and total precursor flow on the fin structures is analyzed. Based on these results, a 2-temperature-step-growth was developed, leading to fins with smooth side and top facets, fast vertical growth rates and good homogeneity along their length as well as over different mask patterns. For the core-shell growth of fin LED heterostructures, the 2-temperature-step-growth shows much smoother sidewalls and less crystal defects in the InGaN QW and p-GaN shell compared to structures with cores grown in just one step. Electroluminescence spectra of the 2-temperature-step-grown fin LED are demonstrated.
Photodissociation dynamics of H2O at 111.5 nm by a vacuum ultraviolet free electron laser
NASA Astrophysics Data System (ADS)
Wang, Heilong; Yu, Yong; Chang, Yao; Su, Shu; Yu, Shengrui; Li, Qinming; Tao, Kai; Ding, Hongli; Yang, Jaiyue; Wang, Guanglei; Che, Li; He, Zhigang; Chen, Zhichao; Wang, Xingan; Zhang, Weiqing; Dai, Dongxu; Wu, Guorong; Yuan, Kaijun; Yang, Xueming
2018-03-01
Photodissociation dynamics of H2O via the F̃ state at 111.5 nm were investigated using the high resolution H-atom Rydberg "tagging" time-of-flight (TOF) technique, in combination with the tunable vacuum ultraviolet free electron laser at the Dalian Coherent Light Source. The product translational energy distributions and angular distributions in both parallel and perpendicular directions were derived from the recorded TOF spectra. Based on these distributions, the quantum state distributions and angular anisotropy parameters of OH (X) and OH (A) products have been determined. For the OH (A) + H channel, highly rotationally excited OH (A) products have been observed. These products are ascribed to a fast direct dissociation on the B̃ ¹A₁ state surface after multi-step internal conversions from the initial excited F̃ state to the B̃ state. For the OH (X) + H channel, in contrast, very highly rotationally excited OH (X) products with moderate vibrational excitation are revealed and attributed to dissociation via a nonadiabatic pathway through the two well-known conical intersections between the B̃-state and the X̃-state surfaces.
Study of Historical 4B/X17 Mega Flare on 28 October 2003 (P58)
NASA Astrophysics Data System (ADS)
Uddin, W.; Chandra, R.; Ali, S. S.
2006-11-01
We analysed multi-wavelength data of the extremely energetic 4B/X17.2 class parallel-ribbon solar flare of 28 October 2003, which occurred in NOAA 10486. The flare was well observed in H-alpha at ARIES, Nainital and by various space-based (SOHO, TRACE, RHESSI, WIND etc.) and ground-based observatories. The H-alpha observations show the stretching/detwisting and eruption of a helically twisted S-shaped (sigmoid) filament in the South-West direction of the active region, with a bright shock front followed by a rapid increase in the intensity and area of the gigantic flare. The flare is associated with a bright, fast, full-halo Earth-directed CME, strong type II, III and IV radio bursts, an intense proton event and a GLE. It seems that the filament eruption triggered the halo CME because the helical structure is clearly visible in the SOHO/LASCO C2 and C3 images. This indicates helicity transfer from the chromosphere to the corona and interplanetary medium. The magnetic field of the flaring region was highly complex, with high magnetic shear. From the above analysis we feel that the energy buildup/release process of this unique flare supports the helically twisted magnetic flux rope model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wiyono, Samsul H., E-mail: samsul.wiyono@bmkg.go.id; Indonesia’s Agency for Meteorology Climatology and Geophysics, Jakarta 10610; Nugraha, Andri Dian, E-mail: nugraha@gf.itb.ac.id
2015-04-24
Determining seismic anisotropy allows us to understand the deformation processes that occurred in the past and those occurring at present. In this study, we performed shear wave splitting analysis to characterize seismic anisotropy beneath the Sunda-Banda subduction-collision zone. About 1,610 XKS waveforms from the INATEWS-BMKG networks were analyzed. The measurements show that the fast polarization direction is consistent with a trench-perpendicular orientation, although several stations show a different orientation. We also compared the fast polarization direction with absolute plate motion in the no-net-rotation and hotspot frames. The results show that absolute plate motion in both frames correlates strongly with the fast polarization direction. This strong correlation can be interpreted as indicating that the dominant anisotropy may lie in the asthenosphere.
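Comparing a fast polarization direction with a plate-motion azimuth requires an axial angular difference, because a fast axis at azimuth φ is the same axis as one at φ + 180°. A minimal Python sketch of that comparison follows; the function name and the example azimuths are hypothetical and are not values from the study.

```python
def axial_difference(az1_deg: float, az2_deg: float) -> float:
    """Smallest angle in [0, 90] degrees between two undirected axes."""
    d = abs(az1_deg - az2_deg) % 180.0
    return min(d, 180.0 - d)


# Hypothetical (fast direction, plate-motion azimuth) pairs in degrees.
examples = [(10.0, 170.0), (350.0, 170.0), (45.0, 135.0)]
for fast_dir, apm in examples:
    print(fast_dir, apm, axial_difference(fast_dir, apm))
```

A difference near 0° means the fast axis is aligned with plate motion, while a value near 90° means it is perpendicular, regardless of how the two azimuths happen to be written.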
Fast cat-eye effect target recognition based on saliency extraction
NASA Astrophysics Data System (ADS)
Li, Li; Ren, Jianlin; Wang, Xingbin
2015-09-01
Background complexity is a main cause of false detections in cat-eye target recognition. Human vision has a selective attention property that helps it find salient targets in complex unknown scenes quickly and precisely. In this paper, we propose a novel cat-eye effect target recognition method named Multi-channel Saliency Processing before Fusion (MSPF). This method combines traditional cat-eye target recognition with the selective characteristics of visual attention. Furthermore, parallel processing enables it to achieve fast recognition. Experimental results show that the proposed method performs better in accuracy, robustness and speed than other methods.
NASA Technical Reports Server (NTRS)
Sanyal, Soumya; Jain, Amit; Das, Sajal K.; Biswas, Rupak
2003-01-01
In this paper, we propose a distributed approach for mapping a single large application to a heterogeneous grid environment. To minimize the execution time of the parallel application, we distribute the mapping overhead to the available nodes of the grid. This approach not only provides a fast mapping of tasks to resources but is also scalable. We adopt a hierarchical grid model and accomplish the job of mapping tasks to this topology using a scheduler tree. Results show that our three-phase algorithm provides high quality mappings, and is fast and scalable.
Parallel aeroelastic computations for wing and wing-body configurations
NASA Technical Reports Server (NTRS)
Byun, Chansup
1994-01-01
The objective of this research is to develop computationally efficient methods for solving fluid-structural interaction problems by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures on parallel computers. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.
Seismic Imaging of the crust and upper mantle beneath Afar, Ethiopia
NASA Astrophysics Data System (ADS)
Hammond, J. O.; Kendall, J. M.; Stuart, G. W.; Ebinger, C. J.
2009-12-01
In March 2007, 41 seismic stations were deployed in northeast Ethiopia. These stations recorded until October 2009, whereupon the array was condensed to 13 stations. Here we show estimates of crustal structure derived from receiver functions and of upper mantle velocity structure derived from tomography and shear-wave splitting, using the first 2.5 years of data. Bulk crustal structure has been determined by H-k stacking of receiver functions. Crustal thickness varies from ~45 km on the rift margins to ~16 km beneath the northeastern Afar stations. Estimates of Vp/Vs show normal continental crust values (1.7-1.8) on the rift margins, and very high values (2.0-2.2) in Afar, similar to results for the Main Ethiopian Rift (MER). This supports ideas of high levels of melt in the crust beneath the Ethiopian Rift. Additionally, we use a common conversion point migration technique to obtain high resolution images of crustal structure beneath the region. Both techniques show a linear region of thin crust (~16 km) trending north-south, the same trend as the Red Sea rift. SKS-wave splitting results show a general northeast-southwest fast direction in the MER, systematically rotating to a more north-south fast direction towards the Red Sea. Additionally, stations close to the recent Dabbahu diking episode show sharp changes in fast direction over small lateral distances (40° over <30 km), with fast directions overlying the Dabbahu segment aligning parallel with the recent diking. This supports ideas of melt-dominated anisotropy beneath the Ethiopian rift. The magnitude of splitting in this region is smaller than that seen at the MER, suggesting that a thinner region of melt, or less focused melt, is causing the anisotropy. Seismic tomography inversions show that in the top 150 km low velocities highlight plate boundaries.
The low velocity anomalies extend from the Main Ethiopian Rift NE, towards Djibouti, and from Djibouti NW towards the Dabbahu segment. The lowest velocities exist on the rift margins, supporting ideas of preferential melt generation at these regions of high strain. This includes a region of low velocity close to the edge of the proposed location of the Danakil microplate. Outside of these focused regions the velocities are relatively fast. Below ~250 km the anomaly broadens to cover most of the Afar region, with only the rift margins remaining fast. At transition zone depths little anomaly is seen beneath Afar, but some low velocities remain present beneath the MER. These studies suggest that in northern Ethiopia the Red Sea rift is dominant. The presence of thin crust beneath northern Afar suggests that the Red Sea rift is creating oceanic-like crust in this region. The lack of deep mantle low velocity anomalies beneath Afar suggests that a typical narrow conduit plume does not exist in this region; rather, the velocity models seem more similar to passive upwelling of material beneath Afar.
Ion acceleration and heating by kinetic Alfvén waves associated with magnetic reconnection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liang, Ji; Lin, Yu; Johnson, Jay R.; ...
2017-09-19
A previous study on the generation and signatures of kinetic Alfvén waves (KAWs) associated with magnetic reconnection in a current sheet revealed that KAWs are a common feature during reconnection [Liang et al. J. Geophys. Res.: Space Phys. 121, 6526 (2016)]. In this paper, ion acceleration and heating by the KAWs generated during magnetic reconnection are investigated with a three-dimensional (3-D) hybrid model. It is found that in the outflow region, a fraction of inflow ions are accelerated by the KAWs generated in the leading bulge region of reconnection, and their parallel velocities gradually increase up to slightly super-Alfvénic. As a result of wave-particle interactions, an accelerated ion beam forms in the direction of the anti-parallel magnetic field, in addition to the core ion population, leading to the development of non-Maxwellian velocity distributions, which include a trapped population with parallel velocities consistent with the wave speed. The ions are then heated in both parallel and perpendicular directions. In the parallel direction, the heating results from nonlinear Landau resonance of trapped ions. In the perpendicular direction, however, evidence of stochastic heating by the KAWs is found during the acceleration stage, with an increase of magnetic moment μ. The coherence between the perpendicular ion temperature T⊥ and the perpendicular electric and magnetic fields of KAWs also provides evidence for perpendicular heating by KAWs. The parallel and perpendicular heating of the accelerated beam occur simultaneously, leading to the development of temperature anisotropy with T⊥ > T∥. The heating rate agrees with the damping rate of the KAWs, and the heating is dominated by the accelerated ion beam.
In the later stage, with the increase of the fraction of the accelerated ions, interaction between the accelerated beam and the core population also contributes to the ion heating, ultimately leading to overlap of the beams and an overall anisotropy with T⊥ > T∥.
NASA Astrophysics Data System (ADS)
Nouri-Borujerdi, Ali; Moazezi, Arash
2018-01-01
The current study investigates the conjugate heat transfer characteristics of laminar flow in a backward-facing step channel. All of the channel walls are insulated except the thick lower wall, which is subject to a constant temperature. The upper wall includes an insulated obstacle perpendicular to the flow direction. The effects of obstacle height and location on the fluid flow and heat transfer are numerically explored for Reynolds numbers in the range 10 ≤ Re ≤ 300. The incompressible Navier-Stokes and thermal energy equations are solved simultaneously in the fluid region by an upwind compact finite-difference scheme based on flux-difference splitting in conjunction with the artificial compressibility method. In the thick wall, the energy equation reduces to the Laplace equation. A multi-block approach is used to perform parallel computing and reduce the CPU time. Each block is modeled separately by sharing boundary conditions with its neighbors. The program was written in FORTRAN with the OpenMP API. The results show that the multi-block parallel computing method is a simple, robust scheme with high performance and high-order accuracy. Moreover, the results demonstrate that increasing the Reynolds number and obstacle height, as well as decreasing the horizontal distance between the obstacle and the step, improves the heat transfer.
Modularized Parallel Neutron Instrument Simulation on the TeraGrid
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Meili; Cobb, John W; Hagen, Mark E
2007-01-01
In order to build a bridge between the TeraGrid (TG), a national-scale cyberinfrastructure resource, and neutron science, the Neutron Science TeraGrid Gateway (NSTG) is focused on introducing productive HPC usage to the neutron science community, primarily the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). Monte Carlo simulations are used as a powerful tool for instrument design and optimization at SNS. One of the successful efforts of a collaboration team composed of NSTG HPC experts and SNS instrument scientists is the development of a software facility named PSoNI, Parallelizing Simulations of Neutron Instruments. By parallelizing the traditional serial instrument simulation on TeraGrid resources, PSoNI quickly computes full instrument simulations at sufficient statistical levels for instrument design. Following the successful commissioning of SNS, by the end of 2007 three of the five commissioned instruments in the SNS target station will be available for initial users. Advanced instrument study, proposal feasibility evaluation, and experiment planning are on the immediate schedule of SNS, which pose further requirements such as flexibility and high runtime efficiency on fast instrument simulation. PSoNI has been redesigned to meet the new challenges and a preliminary version is developed on TeraGrid. This paper explores the motivation and goals of the new design, and the improved software structure. Further, it describes the realized new features seen from MPI-parallelized McStas running high-resolution design simulations of the SEQUOIA and BSS instruments at SNS. A discussion regarding future work, which is targeted to do fast simulation for automated experiment adjustment and comparing models to data in analysis, is also presented.
When fast logic meets slow belief: Evidence for a parallel-processing model of belief bias.
Trippas, Dries; Thompson, Valerie A; Handley, Simon J
2017-05-01
Two experiments pitted the default-interventionist account of belief bias against a parallel-processing model. According to the former, belief bias occurs because a fast, belief-based evaluation of the conclusion pre-empts a working-memory demanding logical analysis. In contrast, according to the latter both belief-based and logic-based responding occur in parallel. Participants were given deductive reasoning problems of variable complexity and instructed to decide whether the conclusion was valid on half the trials or to decide whether the conclusion was believable on the other half. When belief and logic conflict, the default-interventionist view predicts that it should take less time to respond on the basis of belief than logic, and that the believability of a conclusion should interfere with judgments of validity, but not the reverse. The parallel-processing view predicts that beliefs should interfere with logic judgments only if the processing required to evaluate the logical structure exceeds that required to evaluate the knowledge necessary to make a belief-based judgment, and vice versa otherwise. Consistent with this latter view, for the simplest reasoning problems (modus ponens), judgments of belief resulted in lower accuracy than judgments of validity, and believability interfered more with judgments of validity than the converse. For problems of moderate complexity (modus tollens and single-model syllogisms), the interference was symmetrical, in that validity interfered with belief judgments to the same degree that believability interfered with validity judgments. For the most complex (three-term multiple-model syllogisms), conclusion believability interfered more with judgments of validity than vice versa, in spite of the significant interference from conclusion validity on judgments of belief.
On a model of three-dimensional bursting and its parallel implementation
NASA Astrophysics Data System (ADS)
Tabik, S.; Romero, L. F.; Garzón, E. M.; Ramos, J. I.
2008-04-01
A mathematical model for the simulation of three-dimensional bursting phenomena and its parallel implementation are presented. The model consists of four nonlinearly coupled partial differential equations that include fast and slow variables, and exhibits bursting in the absence of diffusion. The differential equations have been discretized by means of a linearly-implicit finite difference method, second-order accurate in both space and time, on equally-spaced grids. The resulting system of linear algebraic equations at each time level has been solved by means of the Preconditioned Conjugate Gradient (PCG) method. Three different parallel implementations of the proposed mathematical model have been developed; two of these implementations, i.e., the MPI and the PETSc codes, are based on a message passing paradigm, while the third one, i.e., the OpenMP code, is based on a shared address space paradigm. These three implementations are evaluated on two current high performance parallel architectures, i.e., a dual-processor cluster and a Shared Distributed Memory (SDM) system. A novel representation of the results that emphasizes the most relevant factors that affect the performance of the parallel implementations is proposed. The comparative analysis of the computational results shows that the MPI and the OpenMP implementations are about twice as efficient as the PETSc code on the SDM system. It is also shown that, for the conditions reported here, the nonlinear dynamics of the three-dimensional bursting phenomena exhibits three stages characterized by asynchronous, synchronous and then asynchronous oscillations, before a quiescent state is reached. It is also shown that the fast system reaches steady state in much less time than the slow variables.
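The abstract above names the Preconditioned Conjugate Gradient method but gives no details of the preconditioner or matrix. The following is a minimal serial sketch only, assuming a Jacobi (diagonal) preconditioner and a small tridiagonal test matrix; the function name `pcg`, its parameters, and the test problem are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def pcg(A, b, inv_diag, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A.

    inv_diag holds 1/diag(A), i.e. a Jacobi preconditioner (an assumption;
    the abstract does not say which preconditioner was used).
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x              # initial residual
    z = inv_diag * r           # preconditioned residual
    p = z.copy()               # first search direction
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)  # step length along p
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * b_norm:
            break
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p  # new conjugate direction
        rz = rz_new
    return x

# Hypothetical test problem: 1-D Laplacian-like SPD matrix.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, 1.0 / np.diag(A))
```

In a parallel setting the matrix-vector product and the dot products are the steps that would be distributed; this sketch does not attempt that.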
The Dorsal Visual System Predicts Future and Remembers Past Eye Position
Morris, Adam P.; Bremmer, Frank; Krekelberg, Bart
2016-01-01
Eye movements are essential to primate vision but introduce potentially disruptive displacements of the retinal image. To maintain stable vision, the brain is thought to rely on neurons that carry both visual signals and information about the current direction of gaze in their firing rates. We have shown previously that these neurons provide an accurate representation of eye position during fixation, but whether they are updated fast enough during saccadic eye movements to support real-time vision remains controversial. Here we show that not only do these neurons carry a fast and accurate eye-position signal, but also that they support in parallel a range of time-lagged variants, including predictive and postdictive signals. We recorded extracellular activity in four areas of the macaque dorsal visual cortex during a saccade task, including the lateral and ventral intraparietal areas (LIP, VIP), and the middle temporal (MT) and medial superior temporal (MST) areas. As reported previously, neurons showed tonic eye-position-related activity during fixation. In addition, they showed a variety of transient changes in activity around the time of saccades, including relative suppression, enhancement, and pre-saccadic bursts for one saccade direction over another. We show that a hypothetical neuron that pools this rich population activity through a weighted sum can produce an output that mimics the true spatiotemporal dynamics of the eye. Further, with different pooling weights, this downstream eye position signal (EPS) could be updated long before (<100 ms) or after (<200 ms) an eye movement. The results suggest a flexible coding scheme in which downstream computations have access to past, current, and future eye positions simultaneously, providing a basis for visual stability and delay-free visually-guided behavior. PMID:26941617
Wen, Qiuting; Kodiweera, Chandana; Dale, Brian M; Shivraman, Giri; Wu, Yu-Chien
2018-01-01
To accelerate high-resolution diffusion imaging, rotating single-shot acquisition (RoSA) with composite reconstruction is proposed. Acceleration was achieved by acquiring only one rotating single-shot blade per diffusion direction, and high-resolution diffusion-weighted (DW) images were reconstructed by using similarities of neighboring DW images. A parallel imaging technique was implemented in RoSA to further improve the image quality and acquisition speed. RoSA performance was evaluated by simulation and human experiments. A brain tensor phantom was developed to determine an optimal blade size and rotation angle by considering similarity in DW images, off-resonance effects, and k-space coverage. With the optimal parameters, RoSA MR pulse sequence and reconstruction algorithm were developed to acquire human brain data. For comparison, multishot echo planar imaging (EPI) and conventional single-shot EPI (SS-EPI) sequences were performed with matched scan time, resolution, field of view, and diffusion directions. The simulation indicated an optimal blade size of 48 × 256 and a 30° rotation angle. For 1 × 1 mm² in-plane resolution, RoSA was 12 times faster than the multishot acquisition with comparable image quality. With the same acquisition time as SS-EPI, RoSA provided superior image quality and minimum geometric distortion. RoSA offers fast, high-quality, high-resolution diffusion images. The composite image reconstruction is model-free and compatible with various diffusion computation approaches including parametric and nonparametric analyses. Magn Reson Med 79:264-275, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
High-speed spectral domain optical coherence tomography using non-uniform fast Fourier transform
Chan, Kenny K. H.; Tang, Shuo
2010-01-01
The useful imaging range in spectral domain optical coherence tomography (SD-OCT) is often limited by the depth-dependent sensitivity fall-off. Processing SD-OCT data with the non-uniform fast Fourier transform (NFFT) can improve the sensitivity fall-off at maximum depth by more than 5 dB, together with a 30-fold decrease in processing time compared to the fast Fourier transform with cubic spline interpolation method. NFFT can also improve local signal-to-noise ratio (SNR) and reduce image artifacts introduced in post-processing. Combined with parallel processing, NFFT is shown to have the ability to process up to 90k A-lines per second. High-speed SD-OCT imaging is demonstrated at camera-limited 100 frames per second on an ex-vivo squid eye. PMID:21258551
Parallel, stochastic measurement of molecular surface area.
Juba, Derek; Varshney, Amitabh
2008-08-01
Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
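To illustrate the general idea of a stochastic, progressive surface-area estimate, here is a minimal CPU sketch. It is not the authors' GPU algorithm, whose details the abstract does not give. It assumes the molecule is a union of spheres, samples random points on each sphere, and counts the fraction not buried inside any other sphere. The function name `estimate_surface_area` and its parameters are hypothetical.

```python
import numpy as np

def estimate_surface_area(centers, radii, samples_per_atom=2000, seed=0):
    """Monte Carlo estimate of the exposed area of a union of spheres."""
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)
    radii = np.asarray(radii, dtype=float)
    total = 0.0
    for i, (c, r) in enumerate(zip(centers, radii)):
        # Uniform directions on the unit sphere via normalized Gaussians.
        v = rng.normal(size=(samples_per_atom, 3))
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        pts = c + r * v
        # A sample is exposed if it lies outside every other sphere.
        exposed = np.ones(samples_per_atom, dtype=bool)
        for j, (cj, rj) in enumerate(zip(centers, radii)):
            if j != i:
                exposed &= np.linalg.norm(pts - cj, axis=1) >= rj
        # Scale the sphere's full area by its exposed fraction.
        total += 4.0 * np.pi * r**2 * exposed.mean()
    return total

# Two overlapping unit spheres one unit apart (analytic area is 6*pi).
area = estimate_surface_area([[0, 0, 0], [1, 0, 0]], [1.0, 1.0])
```

Raising `samples_per_atom` refines the estimate, which mirrors the progressive behavior described in the abstract. The exposed sample points themselves could serve as the surface points mentioned for point-based rendering.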
Parallel momentum input by tangential neutral beam injections in stellarator and heliotron plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishimura, S., E-mail: nishimura.shin@lhd.nifs.ac.jp; Nakamura, Y.; Nishioka, K.
The configuration dependence of parallel momentum inputs to target plasma particle species by tangentially injected neutral beams is investigated in non-axisymmetric stellarator/heliotron model magnetic fields by assuming the existence of magnetic flux-surfaces. In parallel friction integrals of the full Rosenbluth-MacDonald-Judd collision operator in thermal particles' kinetic equations, numerically obtained eigenfunctions are used for excluding trapped fast ions that cannot contribute to the friction integrals. It is found that the momentum inputs to thermal ions strongly depend on magnetic field strength modulations on the flux-surfaces, while the input to electrons is insensitive to the modulation. In future plasma flow studies requiring flow calculations of all particle species in more general non-symmetric toroidal configurations, the eigenfunction method investigated here will be useful.
The fast and the slow of skilled bimanual rhythm production: parallel versus integrated timing.
Krampe, R T; Kliegl, R; Mayr, U; Engbert, R; Vorberg, D
2000-02-01
Professional pianists performed 2 bimanual rhythms at a wide range of different tempos. The polyrhythmic task required the combination of 2 isochronous sequences (3 against 4) between the hands; in the syncopated rhythm task successive keystrokes formed intervals of identical (isochronous) durations. At slower tempos, pianists relied on integrated timing control merging successive intervals between the hands into a common reference frame. A timer-motor model is proposed based on the concepts of rate fluctuation and the distinction between target specification and timekeeper execution processes as a quantitative account of performance at slow tempos. At rapid rates expert pianists used hand-independent, parallel timing control. As an alternative to a model based on a single central clock, the findings support a model of flexible control structures with multiple timekeepers that can work in parallel to accommodate specific task constraints.
NASA Astrophysics Data System (ADS)
Xie, Lizhe; Hu, Yining; Chen, Yang; Shi, Luyao
2015-03-01
Projection and back-projection are the most computationally expensive parts of Computed Tomography (CT) reconstruction. Parallelization strategies using GPU computing techniques have been introduced. In this paper we present a new parallelization scheme for both projection and back-projection. The proposed method is based on the CUDA technology developed by NVIDIA Corporation. Instead of building a complex model, we aimed to optimize the existing algorithm and make it suitable for CUDA implementation so as to gain computation speed. Besides making use of the texture fetching operation, which speeds up interpolation, we fixed the number of samples in the computation of projection to ensure the synchronization of blocks and threads, thus preventing the latency caused by inconsistent computational complexity. Experimental results demonstrate the computational efficiency and imaging quality of the proposed method.
NASA Technical Reports Server (NTRS)
Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Jost, Gabriele
2004-01-01
In this paper we describe the parallelization of the multi-zone code versions of the NAS Parallel Benchmarks employing multi-level OpenMP parallelism. For our study we use the NanosCompiler, which supports nesting of OpenMP directives and provides clauses to control the grouping of threads, load balancing, and synchronization. We report the benchmark results, compare the timings with those of different hybrid parallelization paradigms, and discuss OpenMP implementation issues which affect the performance of multi-level parallel applications.
[Parallelisms in the sound signals of domestic sheep and Northern fur seals].
Nikol'skiĭ, A A; Lisitsina, T Iu
2011-01-01
The parallelisms in communicative behavior of domestic sheep and Northern fur seals within a herd are accompanied by parallelisms in the parameters of a sound signal, the calling scream. This signal ensures ties between babies and their mothers at a long distance. The basis of the parallelisms is amplitude modulation at two levels: one is a direct amplitude modulation of the carrier frequency, and the other is modulation of the carrier frequency oscillation. Parallelisms in the signal oscillatory process result in corresponding parallelisms in the structure of its frequency spectrum.
Parallel imaging of knee cartilage at 3 Tesla.
Zuo, Jin; Li, Xiaojuan; Banerjee, Suchandrima; Han, Eric; Majumdar, Sharmila
2007-10-01
To evaluate the feasibility and reproducibility of quantitative cartilage imaging with parallel imaging at 3T and to determine the impact of the acceleration factor (AF) on morphological and relaxation measurements. An eight-channel phased-array knee coil was employed for conventional and parallel imaging on a 3T scanner. The imaging protocol consisted of a T2-weighted fast spin echo (FSE), a 3D-spoiled gradient echo (SPGR), a custom 3D-SPGR T1rho, and a 3D-SPGR T2 sequence. Parallel imaging was performed with an array spatial sensitivity technique (ASSET). The left knees of six healthy volunteers were scanned with both conventional and parallel imaging (AF = 2). Morphological parameters and relaxation maps from parallel imaging (AF = 2) were comparable with those from the conventional method. The intraclass correlation coefficients (ICC) of the two methods for cartilage volume, mean cartilage thickness, T1rho, and T2 were 0.999, 0.977, 0.964, and 0.969, respectively, demonstrating excellent reproducibility. No significant measurement differences were found when AF reached 3 despite the low signal-to-noise ratio (SNR). The study demonstrated that parallel imaging can be applied to current knee cartilage quantification at AF = 2 without degrading measurement accuracy, with good reproducibility, while effectively reducing scan time. Shorter imaging times can be achieved with higher AF at the cost of SNR. (c) 2007 Wiley-Liss, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
On the dimensionally correct kinetic theory of turbulence for parallel propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaelzer, R.; Ziebell, L. F.; Yoon, P. H. (E-mail: rudi.gaelzer@ufrgs.br, yoonp@umd.edu, 007gasun@khu.ac.kr, luiz.ziebell@ufrgs.br)
2015-03-15
Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes the turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, terms associated with the spontaneous fluctuations in particle and wave kinetic equations in their theory contain proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.
Parallel image reconstruction for 3D positron emission tomography from incomplete 2D projection data
NASA Astrophysics Data System (ADS)
Guerrero, Thomas M.; Ricci, Anthony R.; Dahlbom, Magnus; Cherry, Simon R.; Hoffman, Edward T.
1993-07-01
The problem of excessive computational time in 3D Positron Emission Tomography (3D PET) reconstruction is defined, and we present an approach for solving this problem through the construction of an inexpensive parallel processing system and the adoption of the FAVOR algorithm. Currently, the 3D reconstruction of the 610 images of a total body procedure would require 80 hours and the 3D reconstruction of the 620 images of a dynamic study would require 110 hours. An inexpensive parallel processing system for 3D PET reconstruction is constructed from the integration of board-level products from multiple vendors. The system achieves its computational performance through the use of 6U VME four i860 processor boards; the processor boards from five manufacturers are discussed from our perspective. The new 3D PET reconstruction algorithm FAVOR (FAst VOlume Reconstructor), which promises a substantial speed improvement, is adopted. Preliminary results from parallelizing FAVOR are utilized in formulating architectural improvements for this problem. In summary, we are addressing the problem of excessive computational time in 3D PET image reconstruction through the construction of an inexpensive parallel processing system and the parallelization of a 3D reconstruction algorithm that uses the incomplete data set that is produced by current PET systems.
YAPPA: a Compiler-Based Parallelization Framework for Irregular Applications on MPSoCs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lovergine, Silvia; Tumeo, Antonino; Villa, Oreste
Modern embedded systems include hundreds of cores. Because of the difficulty in providing a fast, coherent memory architecture, these systems usually rely on non-coherent, non-uniform memory architectures with private memories for each core. However, programming these systems poses significant challenges. The developer must extract large amounts of parallelism, while orchestrating communication among cores to optimize application performance. These issues become even more significant with irregular applications, which present data sets that are difficult to partition, unpredictable memory accesses, unbalanced control flow and fine-grained communication. Hand-optimizing every single aspect is hard and time-consuming, and it often does not lead to the expected performance. There is a growing gap between such complex and highly-parallel architectures and the high level languages used to describe the specification, which were designed for simpler systems and do not consider these new issues. In this paper we introduce YAPPA (Yet Another Parallel Programming Approach), a compilation framework for the automatic parallelization of irregular applications on modern MPSoCs based on LLVM. We start by considering an efficient parallel programming approach for irregular applications on distributed memory systems. We then propose a set of transformations that can reduce the development and optimization effort. The results of our initial prototype confirm the correctness of the proposed approach.
Parallel versus Sequential Processing in Print and Braille Reading
ERIC Educational Resources Information Center
Veispak, Anneli; Boets, Bart; Ghesquiere, Pol
2012-01-01
In the current study we investigated word, pseudoword and story reading in Dutch-speaking braille and print readers. To examine developmental patterns, these reading skills were assessed in both children and adults. The results reveal that braille readers read less accurately and more slowly than print readers. While item length has no impact on word…
Avoiding Defect Nucleation during Equilibration in Molecular Dynamics Simulations with ReaxFF
2015-04-01
respectively. All simulations are performed using the LAMMPS computer code.12 2 Fig. 1 a) Initial and b) final configurations of the molecular centers...Plimpton S. Fast parallel algorithms for short-range molecular dynamics. J Comput Phys. 1995;117:1–19. (Software available at http://lammps.sandia.gov
Cryogenic liquid-level detector
NASA Technical Reports Server (NTRS)
Hamlet, J.
1978-01-01
Detector is designed for quick assembly, fast response, and good performance under vibratory stress. Its basic parallel-plate open configuration can be adapted to any length and allows its calibration scale factor to be predicted accurately. When compared with discrete level sensors, continuous reading sensor was found to be superior if there is sloshing, boiling, or other disturbance.
Dijkstra, Hildebrand; Dorrius, Monique D; Wielema, Mirjam; Pijnappel, Ruud M; Oudkerk, Matthijs; Sijens, Paul E
2016-12-01
To assess if specificity can be increased when semiautomated breast lesion analysis of quantitative diffusion-weighted imaging (DWI) is implemented after dynamic contrast-enhanced (DCE-) magnetic resonance imaging (MRI) in the workup of BI-RADS 3 and 4 breast lesions larger than 1 cm. In all, 120 consecutive patients (mean age, 48 years; age range, 23-75 years) with 139 breast lesions (≥1 cm) were examined (2010-2014) with 1.5T DCE-MRI and DWI (b = 0, 50, 200, 500, 800, 1000 s/mm²), and the BI-RADS classification and histopathology were obtained. For each lesion, malignancy was excluded using voxelwise semiautomated breast lesion analysis based on previously defined thresholds for the apparent diffusion coefficient (ADC) and the three intravoxel incoherent motion (IVIM) parameters: molecular diffusion (D_slow), microperfusion (D_fast), and the fraction of D_fast (f_fast). The sensitivity (Se), specificity (Sp), and negative predictive value (NPV) based on only IVIM parameters combined in parallel (D_slow, D_fast, and f_fast), or the ADC or the BI-RADS classification by DCE-MRI were compared. Subsequently, the Se, Sp, and NPV of the combination of the BI-RADS classification by DCE-MRI followed by the IVIM parameters in parallel (or the ADC) were compared. In all, 23 of 139 breast lesions were benign. Se and Sp of DCE-MRI were 100% and 30.4% (NPV = 100%). Se and Sp of IVIM parameters in parallel were 92.2% and 52.2% (NPV = 57.1%) and for the ADC 95.7% and 17.4%, respectively (NPV = 44.4%). In all, 26 of 139 lesions were classified as BI-RADS 3 (n = 7) or BI-RADS 4 (n = 19). DCE-MRI combined with ADC (Se = 99.1%, Sp = 34.8%) or IVIM (Se = 99.1%, Sp = 56.5%) did significantly improve (P = 0.016) Sp of DCE-MRI alone for workup of BI-RADS 3 and 4 lesions (NPV = 92.9%). 
Quantitative DWI has a lower NPV compared to DCE-MRI for evaluation of breast lesions and may therefore not be able to replace DCE-MRI; when implemented after DCE-MRI as problem solver for BI-RADS 3 and 4 lesions, the combined specificity improves significantly. J. Magn. Reson. Imaging 2016;44:1642-1649. © 2016 International Society for Magnetic Resonance in Medicine.
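The sensitivity, specificity, and NPV figures in the abstract above follow from a 2 × 2 confusion table. The sketch below shows the arithmetic. The abstract does not report the per-cell counts, so the ones used here are an assumption: they were back-calculated from the stated percentages with 116 malignant and 23 benign lesions. The helper name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, NPV) from confusion-table counts."""
    se = tp / (tp + fn)    # fraction of malignant lesions detected
    sp = tn / (tn + fp)    # fraction of benign lesions correctly excluded
    npv = tn / (tn + fn)   # fraction of negative calls that are truly benign
    return se, sp, npv

# Counts inferred from the reported percentages (assumption, not source data):
# 139 lesions in total, of which 23 benign and 116 malignant.
dce = diagnostic_metrics(tp=116, fn=0, tn=7, fp=16)
ivim = diagnostic_metrics(tp=107, fn=9, tn=12, fp=11)
adc = diagnostic_metrics(tp=111, fn=5, tn=4, fp=19)

for name, (se, sp, npv) in [("DCE-MRI", dce), ("IVIM", ivim), ("ADC", adc)]:
    print(f"{name}: Se={se:.1%} Sp={sp:.1%} NPV={npv:.1%}")
```

Under these inferred counts the low NPV of IVIM alone comes from its 9 false negatives set against only 12 true negatives, which is consistent with the abstract's conclusion that DWI cannot replace DCE-MRI.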
NASA Astrophysics Data System (ADS)
Mukherjee, Anamitra; Patel, Niravkumar D.; Bishop, Chris; Dagotto, Elbio
2015-06-01
Lattice spin-fermion models are important to study correlated systems where quantum dynamics allows for a separation between slow and fast degrees of freedom. The fast degrees of freedom are treated quantum mechanically while the slow variables, generically referred to as the "spins," are treated classically. At present, exact diagonalization coupled with classical Monte Carlo (ED + MC) is extensively used to solve numerically a general class of lattice spin-fermion problems. In this common setup, the classical variables (spins) are treated via the standard MC method while the fermion problem is solved by exact diagonalization. The "traveling cluster approximation" (TCA) is a real-space variant of the ED + MC method that allows spin-fermion problems to be solved on lattices with up to 10^3 sites. In this publication, we present a novel reorganization of the TCA algorithm in a manner that can be efficiently parallelized. This allows us to solve generic spin-fermion models easily on 10^4 lattice sites and with some effort on 10^5 lattice sites, representing the record lattice sizes studied for this family of models.
Fast precalculated triangular mesh algorithm for 3D binary computer-generated holograms.
Yang, Fan; Kaczorowski, Andrzej; Wilkinson, Tim D
2014-12-10
A new method for constructing computer-generated holograms using a precalculated triangular mesh is presented. The speed of calculation can be increased dramatically by exploiting both the precalculated base triangle and GPU parallel computing. Unlike algorithms using point-based sources, this method can reconstruct a more vivid 3D object instead of a "hollow image." In addition, there is no need to do a fast Fourier transform for each 3D element every time. A ferroelectric liquid crystal spatial light modulator is used to display the binary hologram within our experiment and the hologram of a base right triangle is produced by utilizing just a one-step Fourier transform in the 2D case, which can be expanded to the 3D case by multiplying by a suitable Fresnel phase plane. All 3D holograms generated in this paper are based on Fresnel propagation; thus, the Fresnel plane is treated as a vital element in producing the hologram. A GeForce GTX 770 graphics card with 2 GB memory is used to achieve parallel computing.
Subramanian, Sankaran; Koscielniak, Janusz W.; Devasahayam, Nallathamby; Pursley, Randall H.; Pohida, Thomas J.; Krishna, Murali C.
2007-01-01
Rapid field scanning on the order of T/s, using high-frequency sinusoidal or triangular sweep fields superimposed on the main Zeeman field, was used for direct detection of signals without low-frequency field modulation. Simultaneous application of space-encoding rotating field gradients was employed to perform fast CW EPR imaging using direct detection that could, in principle, approach the speed of pulsed FT EPR imaging. The method takes advantage of the well-known rapid-scan strategy in CW NMR and EPR that allows arbitrarily fast field sweep and the simultaneous application of spinning gradients that allows fast spatial encoding. This leads to fast functional EPR imaging and, depending on the spin concentration, spectrometer sensitivity and detection bandwidth, can provide improved temporal resolution that is important to interrogate dynamics of spin perfusion, pharmacokinetics, spectral spatial imaging, dynamic oximetry, etc. PMID:17350865
Seismic anisotropy of 70 Ma Pacific-plate upper mantle
NASA Astrophysics Data System (ADS)
Mark, H. F.; Lizarralde, D.; Collins, J. A.; Miller, N. C.; Hirth, G.; Gaherty, J. B.; Evans, R. L.
2017-12-01
We present a new measurement of seismic anisotropy and velocity gradients in the Pacific-plate upper mantle based on data from the NoMelt experiment. The seismic velocity structure of oceanic lithosphere reflects the processes involved in its formation at mid-ocean ridges and subsequent evolution off-axis. Increasing mantle depletion with depth due to melt extraction predicts negative velocity gradients, as does cooling with age. Alignment of olivine by corner flow predicts azimuthal anisotropy. Some models predict the strength of anisotropy should decrease with depth. Measurements of uppermost mantle velocities have not fully verified these predictions. Observations of direct Pn phases demonstrate that positive velocity gradients exist; and anisotropy measurements, while consistent with strain-induced olivine alignment, vary widely and generally suggest weaker fabric development than is observed in ophiolite samples. These discrepancies raise questions about the extent to which mantle structure evolves through time due to processes such as cracking and alteration, and hinder the use of seismic measurements to make more detailed inferences on aspects of lithospheric formation processes. We have measured anisotropy and vertical velocity gradients to 10 km below the Moho on 70 Ma lithosphere between the Clarion and Clipperton fracture zones. The lithosphere at the study site has not been obviously affected by tectonic or magmatic events since its formation. We find 6.2% anisotropy at the Moho with a mean velocity of 8.14 km/s and the fast direction parallel to paleospreading. Velocity gradients are estimated at 0.02 km/s/km in the fast direction and near 0 km/s/km in the slow direction. The gradient estimates can be explained by aligned microcracks oriented perpendicular to spreading that close with depth. Cracks are expected to close by 10 km below the Moho. 
At that depth the strength of anisotropy increases to 9%, close to the strength estimated from ophiolite fabrics. These results are consistent with observed olivine fabrics and the predicted effects of lithospheric formation processes, and suggest that lithospheric evolution is modest even at 70 Ma, involving microcracks oriented by a stress field consistent with thermal contraction.
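The numbers quoted in this abstract can be checked against each other with a short calculation. The sketch below is not the authors' method: it assumes anisotropy is defined as (v_fast - v_slow) / v_mean with v_mean the average of the two, that the slow-direction gradient is exactly zero, and that the fast-direction gradient is constant over the 10 km interval.

```python
def fast_slow_from_anisotropy(v_mean, anisotropy):
    """Split a mean velocity into fast and slow values.

    Assumes anisotropy = (v_fast - v_slow) / v_mean and
    v_mean = (v_fast + v_slow) / 2.
    """
    half = 0.5 * anisotropy * v_mean
    return v_mean + half, v_mean - half

def anisotropy_from_fast_slow(v_fast, v_slow):
    """Inverse of the definition assumed above."""
    v_mean = 0.5 * (v_fast + v_slow)
    return (v_fast - v_slow) / v_mean

# Values quoted in the abstract
v_mean_moho = 8.14   # km/s, mean velocity at the Moho
a_moho = 0.062       # 6.2% anisotropy at the Moho
grad_fast = 0.02     # km/s per km, fast direction
grad_slow = 0.0      # km/s per km, slow direction ("near 0", taken as 0 here)
depth_km = 10.0      # depth below the Moho at which cracks are expected to close

v_fast, v_slow = fast_slow_from_anisotropy(v_mean_moho, a_moho)

# Linear extrapolation of both velocities to 10 km below the Moho
v_fast_deep = v_fast + grad_fast * depth_km
v_slow_deep = v_slow + grad_slow * depth_km
a_deep = anisotropy_from_fast_slow(v_fast_deep, v_slow_deep)

print(f"Moho:  fast {v_fast:.3f} km/s, slow {v_slow:.3f} km/s")
print(f"10 km: fast {v_fast_deep:.3f} km/s, slow {v_slow_deep:.3f} km/s")
print(f"anisotropy at 10 km below the Moho: {100 * a_deep:.1f}%")
```

Under these assumptions the extrapolated anisotropy comes out at roughly 8.6%, which is close to the approximately 9% stated in the abstract.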
NASA Astrophysics Data System (ADS)
Sun, Jicheng; Gao, Xinliang; Lu, Quanming; Chen, Lunjin; Liu, Xu; Wang, Xueyi; Tao, Xin; Wang, Shui
2017-05-01
In this paper, we use a 1-D particle-in-cell (PIC) simulation model consisting of three species (cold electrons, cold ions, and an energetic ion ring) to investigate the spectral structures of magnetosonic waves excited by ring-distribution protons in the Earth's magnetosphere, and the dynamics of charged particles during the excitation of magnetosonic waves. As the wave normal angle decreases, the spectral range of the excited magnetosonic waves becomes broader, with the upper frequency limit extending beyond the lower hybrid resonant frequency, and the discrete spectrum tends to merge into a continuous one. This dependence on wave normal angle is consistent with linear theory. The effects of magnetosonic waves on the background cold plasma populations also vary with wave normal angle. For exactly perpendicular magnetosonic waves (parallel wave number k|| = 0), there is no energization in the parallel direction for either background cold protons or electrons, because the fluctuating electric field component in the parallel direction is negligible. In contrast, the perpendicular energization of the background plasma is rather significant: cold protons follow unmagnetized motion, while cold electrons follow drift motion due to the wave electric fields. For magnetosonic waves with a finite k||, there exists a non-negligible parallel fluctuating electric field, leading to significant and rapid energization of cold electrons in the parallel direction. These cold electrons can also be efficiently energized in the perpendicular direction through interaction with the magnetosonic wave fields in that direction. However, cold protons can only be heated in the perpendicular direction, which is likely caused by higher-order resonances with the magnetosonic waves. The potential impacts of magnetosonic waves on the energization of the background cold plasma in the Earth's inner magnetosphere are also discussed.
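For orientation on the frequency scale mentioned above, the following minimal sketch estimates the lower hybrid resonant frequency relative to the proton gyrofrequency. It uses the standard high-density approximation f_LH ≈ sqrt(f_ce * f_cp), which is an assumption about the plasma regime and is not taken from the simulation setup; the magnetic field strength is a hypothetical value chosen only for illustration.

```python
import math

# Physical constants (SI, CODATA values)
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_PROTON = 1.67262192369e-27  # proton mass, kg
M_ELECTRON = 9.1093837015e-31 # electron mass, kg

def gyrofrequency_hz(b_tesla, mass_kg):
    """Cyclotron frequency f_c = e * B / (2 * pi * m) for a singly charged particle."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * mass_kg)

def lower_hybrid_hz(b_tesla):
    """High-density approximation: f_LH ~ sqrt(f_ce * f_cp).

    Valid only when the plasma frequency is much larger than the
    electron gyrofrequency (an assumption made for this sketch).
    """
    f_ce = gyrofrequency_hz(b_tesla, M_ELECTRON)
    f_cp = gyrofrequency_hz(b_tesla, M_PROTON)
    return math.sqrt(f_ce * f_cp)

# Hypothetical background field (assumption, not from the abstract)
b_field = 200e-9  # 200 nT

f_cp = gyrofrequency_hz(b_field, M_PROTON)
f_lh = lower_hybrid_hz(b_field)

# In this approximation the ratio is independent of B: sqrt(m_p / m_e)
ratio = math.sqrt(M_PROTON / M_ELECTRON)

print(f"proton gyrofrequency:   {f_cp:.2f} Hz")
print(f"lower hybrid frequency: {f_lh:.1f} Hz")
print(f"f_LH / f_cp:            {ratio:.1f}")
```

In this approximation the lower hybrid frequency sits at about 43 proton gyrofrequencies regardless of the field strength, which sets the scale for how many discrete spectral lines could fit below it if, as is commonly described for ion-ring-driven magnetosonic waves, the lines are spaced at roughly the proton gyrofrequency.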