Sample records for t-sph parallel code

  1. GRADSPMHD: A parallel MHD code based on the SPH formalism

    NASA Astrophysics Data System (ADS)

    Vanaverbeke, S.; Keppens, R.; Poedts, S.

    2014-03-01

    We present GRADSPMHD, a completely Lagrangian parallel magnetohydrodynamics code based on the SPH formalism. The implementation of the equations of SPMHD in the “GRAD-h” formalism assembles known results, including the derivation of the discretized MHD equations from a variational principle, the inclusion of time-dependent artificial viscosity, resistivity, and conductivity terms, as well as the inclusion of a mixed hyperbolic/parabolic correction scheme for satisfying the ∇·B = 0 constraint on the magnetic field. The code uses a tree-based formalism for neighbor finding and can optionally use the tree code for computing the self-gravity of the plasma. The structure of the code closely follows the framework of our parallel GRADSPH FORTRAN 90 code, which we added previously to the CPC program library. We demonstrate the capabilities of GRADSPMHD by running 1-, 2-, and 3-dimensional standard benchmark tests, and we find good agreement with previous work done by other researchers. The code is also applied to the problem of simulating the magnetorotational instability in 2.5D shearing box tests as well as in global simulations of magnetized accretion disks. We find good agreement with available results on this subject in the literature. Finally, we discuss the performance of the code on a parallel supercomputer with distributed memory architecture.

    Catalogue identifier: AERP_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AERP_v1_0.html
    Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 620503
    No. of bytes in distributed program, including test data, etc.: 19837671
    Distribution format: tar.gz
    Programming language: FORTRAN 90/MPI
    Computer: HPC cluster
    Operating system: Unix
    Has the code been vectorized or parallelized?: Yes, parallelized using MPI
    RAM: ~30 MB for a
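
    The mixed hyperbolic/parabolic ∇·B correction mentioned above is commonly implemented as a Dedner-style cleaning update. Below is a minimal per-particle sketch of that idea; the function name, the explicit Euler step, and the assumption that div B and grad ψ have already been evaluated with the code's SPH operators are illustrative choices, not GRADSPMHD's actual routines.

    ```python
    import numpy as np

    def dedner_clean_step(B, psi, divB, grad_psi, c_h, tau, dt):
        """One explicit step of mixed hyperbolic/parabolic divergence cleaning
        (Dedner et al. 2002 style), used to control the div(B) = 0 constraint.

        B        : (N, 3) magnetic field per particle
        psi      : (N,)   cleaning scalar per particle
        divB     : (N,)   SPH estimate of div(B) at each particle
        grad_psi : (N, 3) SPH estimate of grad(psi) at each particle
        c_h      : hyperbolic cleaning speed (typically the fast wave speed)
        tau      : parabolic damping time-scale
        """
        # hyperbolic transport: psi absorbs div(B) errors and carries them away;
        # the psi/tau term damps them parabolically
        psi_new = psi + dt * (-c_h**2 * divB - psi / tau)
        # the cleaning field pushes B toward a divergence-free configuration
        B_new = B + dt * (-grad_psi)
        return B_new, psi_new
    ```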

  2. ZENO: N-body and SPH Simulation Codes

    NASA Astrophysics Data System (ADS)

    Barnes, Joshua E.

    2011-02-01

    The ZENO software package integrates N-body and SPH simulation codes with a large array of programs to generate initial conditions and analyze numerical simulations. Written in C, the ZENO system is portable between Mac, Linux, and Unix platforms. It is in active use at the Institute for Astronomy (IfA), at NRAO, and possibly elsewhere. ZENO programs can perform a wide range of simulation and analysis tasks. While many of these programs were first created for specific projects, they embody algorithms of general applicability and embrace a modular design strategy, so existing code is easily applied to new tasks. Major elements of the system include:

    - Structured data file utilities facilitate basic operations on binary data, including import/export of ZENO data to other systems.
    - Snapshot generation routines create particle distributions with various properties. Systems with user-specified density profiles can be realized in collisionless or gaseous form; multiple spherical and disk components may be set up in mutual equilibrium.
    - Snapshot manipulation routines permit the user to sift, sort, and combine particle arrays, translate and rotate particle configurations, and assign new values to data fields associated with each particle.
    - Simulation codes include both pure N-body and combined N-body/SPH programs: pure N-body codes are available in both uniprocessor and parallel versions, and SPH codes offer a wide range of options for gas physics, including isothermal, adiabatic, and radiating models.
    - Snapshot analysis programs calculate temporal averages, evaluate particle statistics, measure shapes and density profiles, compute kinematic properties, and identify and track objects in particle distributions.
    - Visualization programs generate interactive displays and produce still images and videos of particle distributions; the user may specify arbitrary color schemes and viewing transformations.

  3. GASOLINE: Smoothed Particle Hydrodynamics (SPH) code

    NASA Astrophysics Data System (ADS)

    N-Body Shop

    2017-10-01

    Gasoline solves the equations of gravity and hydrodynamics in astrophysical problems, including simulations of planets, stars, and galaxies. It uses an SPH method that features correct mixing behavior in multiphase fluids and minimal artificial viscosity. This method is identical to the SPH method used in the ChaNGa code (ascl:1105.005), allowing users to extend results to problems requiring >100,000 cores. Gasoline uses a fast, memory-efficient O(N log N) KD-Tree to solve Poisson's Equation for gravity and avoids artificial viscosity in non-shocking compressive flows.
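
    Gasoline's own KD-tree is a purpose-built C structure serving both gravity and neighbor finding. As a rough illustration of the O(N log N) idea only, here is a neighbor search using SciPy's cKDTree; the use of SciPy is an assumption made for demonstration, not Gasoline's API.

    ```python
    import numpy as np
    from scipy.spatial import cKDTree

    # N particles in a unit box; 2h is a typical SPH search radius
    rng = np.random.default_rng(42)
    pos = rng.random((10_000, 3))
    h = 0.02

    tree = cKDTree(pos)                                  # build is O(N log N)
    neighbours = tree.query_ball_point(pos, r=2.0 * h)   # all pairs within 2h

    # e.g. the number of interaction partners of particle 0
    print(len(neighbours[0]))
    ```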

  4. FleCSPH - a parallel and distributed SPH implementation based on the FleCSI framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Junghans, Christoph; Loiseau, Julien

    2017-06-20

    FleCSPH is a multi-physics compact application that exercises FleCSI parallel data structures for tree-based particle methods. In particular, FleCSPH implements a smoothed-particle hydrodynamics (SPH) solver for the solution of Lagrangian problems in astrophysics and cosmology. FleCSPH includes support for gravitational forces using the fast multipole method (FMM).

  5. Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures

    NASA Astrophysics Data System (ADS)

    Sandalski, Stou

    Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU-accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP-parallelized C++ and OpenCL and includes octree-based hydrodynamic and gravitational acceleration. The design relies on object-oriented methodologies in order to provide a flexible and modular framework that can be easily extended and modified by the user. Several pre-built scenarios for simulating collisions of polytropes and black-hole accretion are provided. The code is released under the MIT Open Source license and publicly available at http://code.google.com/p/neptune-sph/.

  6. Draft Genome Sequence of Methylovulum psychrotolerans Sph1T, an Obligate Methanotroph from Low-Temperature Environments.

    PubMed

    Oshkin, Igor Y; Miroshnikov, Kirill K; Belova, Svetlana E; Korzhenkov, Aleksei A; Toshchakov, Stepan V; Dedysh, Svetlana N

    2018-03-15

    Methylovulum psychrotolerans Sph1T is an aerobic, obligate methanotroph, which was isolated from cold methane seeps in West Siberia. This bacterium possesses only a particulate methane monooxygenase and is widely distributed in low-temperature environments. Strain Sph1T has the genomic potential for biosynthesis of hopanoids required for the maintenance of intracytoplasmic membranes.

  7. Implicit SPH v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Kyungjoo; Parks, Michael L.; Perego, Mauro

    2016-11-09

    The ISPH code was developed to solve multi-physics meso-scale flow problems using an implicit SPH method. In particular, the code provides solutions for incompressible flow, multiphase flow, and electro-kinetic flows.

  8. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
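
    The "efficient shared-memory allocation" question the abstract raises usually centers on how particles are laid out in memory for neighbor searches. Below is a minimal sketch of one such layout, a cell-linked list that sorts particles by cell so each cell's members are contiguous; the function and variable names are hypothetical, not taken from the paper.

    ```python
    import numpy as np

    def build_cell_list(pos, box, cell_size):
        """Bin particles into uniform cells and sort them by cell index.

        Sorting particles by cell gives contiguous memory access for each
        cell's members -- the kind of data layout whose efficiency differs
        between GPU (coalesced loads) and CPU/MIC (cache lines) devices.
        """
        ncell = np.maximum((box / cell_size).astype(int), 1)
        idx3 = np.minimum((pos / cell_size).astype(int), ncell - 1)
        flat = (idx3[:, 0] * ncell[1] + idx3[:, 1]) * ncell[2] + idx3[:, 2]
        order = np.argsort(flat)           # permutation grouping particles by cell
        starts = np.searchsorted(flat[order], np.arange(ncell.prod()))
        return order, starts

    pos = np.random.default_rng(0).random((1000, 3))
    order, starts = build_cell_list(pos, box=np.ones(3), cell_size=0.1)
    ```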

  9. The SPH consistency problem and some astrophysical applications

    NASA Astrophysics Data System (ADS)

    Klapp, Jaime; Sigalotti, Leonardo; Rendon, Otto; Gabbasov, Ruslan; Torres, Ayax

    2017-11-01

    We discuss the SPH kernel and particle consistency problem and demonstrate that SPH has a limiting second-order convergence rate. We also present a solution to the SPH consistency problem. We present examples of how SPH implementations that are not mathematically consistent may lead to erroneous results. The new formalism has been implemented into the Gadget 2 code, including an improved scheme for the artificial viscosity. We present results for the "Standard Isothermal Test Case" of gravitational collapse and fragmentation of protostellar molecular cores that produce a very different evolution than with the standard SPH theory. A further application of accretion onto a black hole is presented.
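
    The particle-consistency problem can be made concrete by checking the discrete kernel moments on a disordered particle distribution: zeroth-order consistency requires Σ_j W_ij V_j = 1 and first-order consistency Σ_j (x_j − x_i) W_ij V_j = 0. The toy 1D check below (not the authors' code) shows how particle disorder breaks both conditions.

    ```python
    import numpy as np

    def cubic_spline_w(q):
        # standard 1D cubic spline kernel (normalization sigma = 2/3)
        return (2.0 / 3.0) * np.where(
            q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
            np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

    # perturbed 1D lattice: disorder spoils the discrete moment conditions
    rng = np.random.default_rng(1)
    dx = 0.01
    x = np.arange(0.0, 1.0, dx) + 0.3 * dx * rng.standard_normal(100)
    h = 1.2 * dx
    i = 50                                   # probe particle away from the edges
    q = np.abs(x - x[i]) / h
    W = cubic_spline_w(q) / h
    M0 = np.sum(W * dx)                      # = 1 for zeroth-order consistency
    M1 = np.sum((x - x[i]) * W * dx)         # = 0 for first-order consistency
    print(f"M0 = {M0:.4f} (want 1), M1 = {M1:.2e} (want 0)")
    ```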

  10. Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A; Kabel, A.; Lee, L.

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.

  11. Enforcing dust mass conservation in 3D simulations of tightly coupled grains with the PHANTOM SPH code

    NASA Astrophysics Data System (ADS)

    Ballabio, G.; Dipierro, G.; Veronesi, B.; Lodato, G.; Hutchison, M.; Laibe, G.; Price, D. J.

    2018-06-01

    We describe a new implementation of the one-fluid method in the SPH code PHANTOM to simulate the dynamics of dust grains in gas protoplanetary discs. We revise and extend previously developed algorithms by computing the evolution of a new fluid quantity that produces a more accurate and numerically controlled evolution of the dust dynamics. Moreover, by limiting the stopping time of uncoupled grains that violate the assumptions of the terminal velocity approximation, we avoid fatal numerical errors in mass conservation. We test and validate our new algorithm by running 3D SPH simulations of a large range of disc models with tightly and marginally coupled grains.
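
    A minimal sketch of the two ideas named in the abstract follows: recovering the dust fraction from a better-behaved evolved variable, and capping the stopping time of marginally coupled grains. The sketch assumes the evolved quantity is s = sqrt(eps * rho) and uses a simple cap; the paper's precise variable and limiting criterion differ in detail.

    ```python
    import numpy as np

    def dust_fraction_from_evolved(s, rho):
        """Recover the dust fraction from an evolved variable, assumed here to
        be s = sqrt(eps * rho); by construction the recovered fraction stays
        non-negative, avoiding the negative-mass states mentioned above."""
        return np.clip(s**2 / rho, 0.0, 1.0)

    def limited_stopping_time(t_stop, dt):
        """Cap the stopping time so the terminal velocity approximation
        (t_stop << dt) is not violated -- an illustrative cap, not PHANTOM's
        actual criterion."""
        return np.minimum(t_stop, dt)
    ```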

  12. Synthesis of Aromatic Thiolate-Protected Gold Nanomolecules by Core Conversion: The Case of Au36(SPh-tBu)24.

    PubMed

    Theivendran, Shevanuja; Dass, Amala

    2017-08-01

    Ultrasmall nanomolecules (<2 nm) such as Au25(SCH2CH2Ph)18, Au38(SCH2CH2Ph)24, and Au144(SCH2CH2Ph)60 are well studied and can be prepared using established synthetic procedures. No such synthetic protocols that result in high-yield products from commercially available starting materials exist for Au36(SPh-X)24. Here, we report a synthetic procedure for the large-scale synthesis of highly stable Au36(SPh-X)24 with a yield of ∼42%. Au36(SPh-X)24 was conveniently synthesized by using tert-butylbenzenethiol (HSPh-tBu, TBBT) as the ligand, giving a more stable product with better shelf life and higher yield than previously reported for making Au36(SPh)24 from thiophenol (PhSH). The choice of thiol, solvent, and reaction conditions were modified for the optimization of the synthetic procedure. The purposes of this work are to (1) optimize the existing procedure to obtain stable product with better yield, (2) develop a scalable synthetic procedure, (3) demonstrate the superior stability of Au36(SPh-tBu)24 when compared to Au36(SPh)24, and (4) demonstrate the reproducibility and robustness of the optimized synthetic procedure.

  13. Murine recessive hereditary spherocytosis, sph/sph, is caused by a mutation in the erythroid alpha-spectrin gene.

    PubMed

    Wandersee, N J; Birkenmeier, C S; Gifford, E J; Mohandas, N; Barker, J E

    2000-01-01

    Spectrin, a heterodimer of alpha- and beta-subunits, is the major protein component of the red blood cell membrane skeleton. The mouse mutation, sph, causes an alpha-spectrin-deficient hereditary spherocytosis with the severe phenotype typical of recessive hereditary spherocytosis in humans. The sph mutation maps to the erythroid alpha-spectrin locus, Spna1, on Chromosome 1. Scanning electron microscopy, osmotic gradient ektacytometry, cDNA cloning, RT-PCR, nucleic acid sequencing, and Northern blot analyses were used to characterize the wild type and sph alleles of the Spna1 locus. Our results confirm the spherocytic nature of sph/sph red blood cells and document a mild spherocytic transition in the +/sph heterozygotes. Sequencing of the full length coding region of the Spna1 wild type allele from the C57BL/6J strain of mice reveals a 2414 residue deduced amino acid sequence that shows the typical 106-amino-acid repeat structure previously described for other members of the spectrin protein family. Sequence analysis of RT-PCR clones from sph/sph alpha-spectrin mRNA identified a single base deletion in repeat 5 that would cause a frame shift and premature termination of the protein. This deletion was confirmed in sph/sph genomic DNA. Northern blot analyses of the distribution of Spna1 mRNA in non-erythroid tissues detects the expression of 8, 2.5 and 2.0 kb transcripts in adult heart. These results predict the heart as an additional site where alpha-spectrin mutations may produce a phenotype and raise the possibility that a novel functional class of small alpha-spectrin isoforms may exist.

  14. SPHYNX: an accurate density-based SPH method for astrophysical applications

    NASA Astrophysics Data System (ADS)

    Cabezón, R. M.; García-Senz, D.; Figueira, J.

    2017-10-01

    Aims: Hydrodynamical instabilities and shocks are ubiquitous in astrophysical scenarios. Therefore, an accurate numerical simulation of these phenomena is mandatory to correctly model and understand many astrophysical events, such as supernovas, stellar collisions, or planetary formation. In this work, we attempt to address many of the problems that a commonly used technique, smoothed particle hydrodynamics (SPH), has when dealing with subsonic hydrodynamical instabilities or shocks. To that aim we built a new SPH code named SPHYNX that includes many of the recent advances in the SPH technique along with some new ones, which we present here. Methods: SPHYNX is of Newtonian type and grounded in the Euler-Lagrange formulation of the smoothed-particle hydrodynamics technique. Its distinctive features are: the use of an integral approach to estimating the gradients; the use of a flexible family of interpolators called sinc kernels, which suppress the pairing instability; and the incorporation of a new type of volume element which provides a better partition of unity. Unlike other modern formulations, which consider volume elements linked to pressure, our volume element choice relies on density. SPHYNX is, therefore, a density-based SPH code. Results: A novel computational hydrodynamic code oriented to astrophysical applications is described, discussed, and validated in the following pages. The ensuing code conserves mass, linear and angular momentum, energy, and entropy, and preserves kernel normalization even in strong shocks. In our proposal, the estimation of gradients is enhanced using an integral approach. Additionally, we introduce a new family of volume elements which reduce the so-called tensile instability. Both features help to suppress the damping which often prevents the growth of hydrodynamic instabilities in regular SPH codes. Conclusions: On the whole, SPHYNX has passed the verification tests described below. For identical particle setting and initial
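
    The sinc kernels SPHYNX adopts form a one-parameter family whose exponent controls how sharply peaked the kernel is, which is the knob used to suppress the pairing instability. A short sketch of the un-normalized kernel shape (the normalization constant, which depends on the exponent and the dimension, is omitted here):

    ```python
    import numpy as np

    def sinc_kernel(q, n=5.0):
        """Un-normalized sinc kernel S_n(q) = [sinc(pi*q/2)]**n on 0 <= q <= 2.

        np.sinc(x) = sin(pi*x)/(pi*x), so x = q/2 gives compact support at
        q = 2; larger n yields a more sharply peaked interpolator.
        """
        q = np.asarray(q, dtype=float)
        return np.where(q < 2.0, np.sinc(q / 2.0) ** n, 0.0)

    print(sinc_kernel([0.0, 1.0, 2.0]))   # peak at q=0, zero at the support edge
    ```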

  15. rpSPH: a novel smoothed particle hydrodynamics algorithm

    NASA Astrophysics Data System (ADS)

    Abel, Tom

    2011-05-01

    We suggest a novel discretization of the momentum equation for smoothed particle hydrodynamics (SPH) and show that it significantly improves the accuracy of the obtained solutions. Our new formulation which we refer to as relative pressure SPH, rpSPH, evaluates the pressure force with respect to the local pressure. It respects Newton's first law of motion and applies forces to particles only when there is a net force acting upon them. This is in contrast to standard SPH which explicitly uses Newton's third law of motion continuously applying equal but opposite forces between particles. rpSPH does not show the unphysical particle noise, the clumping or banding instability, unphysical surface tension and unphysical scattering of different mass particles found for standard SPH. At the same time, it uses fewer computational operations and only changes a single line in existing SPH codes. We demonstrate its performance on isobaric uniform density distributions, uniform density shearing flows, the Kelvin-Helmholtz and Rayleigh-Taylor instabilities, the Sod shock tube, the Sedov-Taylor blast wave and a cosmological integration of the Santa Barbara galaxy cluster formation test. rpSPH is an improvement in these cases. The improvements come at the cost of giving up exact momentum conservation of the scheme. Consequently, one can also obtain unphysical solutions particularly at low resolutions.
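
    The "single line" change can be illustrated schematically: in the pairwise pressure acceleration, the symmetric standard-SPH term is replaced by one measured relative to the local pressure, so the force vanishes in pressure equilibrium. The exact discretization below is an assumption based on the abstract, not Abel (2011) verbatim.

    ```python
    import numpy as np

    def pressure_accel(m_j, P_i, P_j, rho_i, rho_j, gradW_ij, rp=False):
        """Illustrative pairwise pressure force; rp=True switches to the
        relative-pressure form. Names and the precise rpSPH factor are
        schematic assumptions."""
        if rp:
            fac = (P_j - P_i) / (rho_i * rho_j)    # vanishes for uniform pressure
        else:
            fac = P_i / rho_i**2 + P_j / rho_j**2  # nonzero even in equilibrium
        return -m_j * fac * gradW_ij
    ```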

  16. PARAVT: Parallel Voronoi tessellation code

    NASA Astrophysics Data System (ADS)

    González, R. E.

    2016-10-01

    In this study, we present a new open-source code for massively parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is aimed at astrophysical applications, in which VT densities and neighbor lists are widely used. Several serial Voronoi tessellation codes exist; however, no open-source, parallel implementation is available to handle the large numbers of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI, and the VT is computed using the Qhull library. The domain decomposition takes into account consistent boundary computation between tasks and includes periodic conditions. In addition, the code computes neighbor lists, the Voronoi density, the Voronoi cell volume, the density gradient for each particle, and densities on a regular grid. The code implementation and a user guide are publicly available at https://github.com/regonzar/paravt.
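
    SciPy exposes the same Qhull library that PARAVT parallelizes, so a single-task analogue of the Voronoi density is easy to sketch; PARAVT's MPI domain decomposition and cross-task boundary handling are not shown here, and the variable names are illustrative.

    ```python
    import numpy as np
    from scipy.spatial import Voronoi, ConvexHull

    rng = np.random.default_rng(0)
    pts = rng.random((500, 3))
    vor = Voronoi(pts)            # Qhull under the hood

    density = np.full(len(pts), np.nan)
    for i, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if -1 in region or len(region) == 0:
            continue              # unbounded boundary cell: no finite volume
        volume = ConvexHull(vor.vertices[region]).volume
        density[i] = 1.0 / volume # one particle per cell -> number density

    # boundary cells stay NaN here; PARAVT instead resolves them consistently
    # across MPI tasks and supports periodic conditions (see abstract)
    ```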

  17. National Combustion Code Parallel Performance Enhancements

    NASA Technical Reports Server (NTRS)

    Quealy, Angela; Benyo, Theresa (Technical Monitor)

    2002-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. The unstructured-grid, reacting-flow code uses a distributed-memory, message-passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC code to meet combustor designers' requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This report describes recent parallel-processing modifications to NCC that have improved the parallel scalability of the code, enabling a two-hour turnaround for a 1.3-million-element fully reacting combustion simulation on an SGI Origin 2000.

  18. Code Parallelization with CAPO: A User Manual

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools, developed at the University of Greenwich, to generate OpenMP parallel codes for shared-memory architectures. It is an interactive toolkit that transforms a serial Fortran application code into an equivalent parallel version of the software in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using the in-depth interprocedural analysis. The use of the toolkit on a number of application codes, ranging from benchmarks to real-world applications, is presented. This demonstrates the great potential of using the toolkit to quickly parallelize serial programs, as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphical user interface implemented in the toolkit. Finally, a set of tutorials is included for hands-on experience with this toolkit.

  19. Implicit Incompressible SPH.

    PubMed

    Ihmsen, Markus; Cornelis, Jens; Solenthaler, Barbara; Horvath, Christopher; Teschner, Matthias

    2013-07-25

    We propose a novel formulation of the projection method for Smoothed Particle Hydrodynamics (SPH). We combine a symmetric SPH pressure force and an SPH discretization of the continuity equation to obtain a discretized form of the pressure Poisson equation (PPE). In contrast to previous projection schemes, our system does consider the actual computation of the pressure force. This incorporation improves the convergence rate of the solver. Furthermore, we propose to compute the density deviation based on velocities instead of positions as this formulation improves the robustness of the time-integration scheme. We show that our novel formulation outperforms previous projection schemes and state-of-the-art SPH methods. Large time steps and small density deviations of down to 0.01% can be handled in typical scenarios. The practical relevance of the approach is illustrated by scenarios with up to 40 million SPH particles.
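
    The velocity-based density deviation can be sketched directly from the SPH continuity equation: the PPE source term becomes the density predicted one step ahead rather than the currently accumulated deviation. A schematic per-particle form follows; the full IISPH solver also folds the pressure-force coefficients into the linear system, which this sketch omits.

    ```python
    import numpy as np

    def predicted_density(rho_i, m_j, v_i, v_j, gradW_ij, dt):
        """Velocity-based density prediction used as a PPE source term:

            rho* = rho_i + dt * sum_j m_j (v_i - v_j) . gradW_ij

        rho_i : scalar density of particle i
        m_j   : (n,) neighbor masses;  v_i : (3,);  v_j, gradW_ij : (n, 3)
        """
        div_term = np.sum(m_j * np.einsum('nd,nd->n', v_i - v_j, gradW_ij))
        return rho_i + dt * div_term
    ```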

  20. Transformation of Au144(SCH2CH2Ph)60 to Au133(SPh-tBu)52 Nanomolecules: Theoretical and Experimental Study.

    PubMed

    Nimmala, Praneeth Reddy; Theivendran, Shevanuja; Barcaro, Giovanni; Sementa, Luca; Kumara, Chanaka; Jupally, Vijay Reddy; Apra, Edoardo; Stener, Mauro; Fortunelli, Alessandro; Dass, Amala

    2015-06-04

    Ultrastable gold nanomolecule Au144(SCH2CH2Ph)60 upon etching with excess tert-butylbenzenethiol undergoes a core-size conversion and compositional change to form an entirely new core of Au133(SPh-tBu)52. This conversion was studied using high-resolution electrospray mass spectrometry which shows that the core size conversion is initiated after 22 ligand exchanges, suggesting a relatively high stability of the Au144(SCH2CH2Ph)38(SPh-tBu)22 intermediate. The Au144 → Au133 core size conversion is surprisingly different from the Au144 → Au99 core conversion reported in the case of thiophenol, -SPh. Theoretical analysis and ab initio molecular dynamics simulations show that rigid p-tBu groups play a crucial role by reducing the cluster structural freedom, and protecting the cluster from adsorption of exogenous and reactive species, thus rationalizing the kinetic factors that stabilize the Au133 core size. This 144-atom to 133-atom nanomolecule's compositional change is reflected in optical spectroscopy and electrochemistry.

  2. A challenge to dSph formation models: are the most isolated Local Group dSph galaxies truly old?

    NASA Astrophysics Data System (ADS)

    Monelli, Matteo

    2017-08-01

    What is the origin of the different dwarf galaxy types? The classification into dwarf irregular (dIrr), spheroidal (dSph), and transition (dT) types is based on their present-day properties. However, star formation histories (SFHs) reconstructed from deep color-magnitude diagrams (CMDs) provide details on the early evolution of galaxies of all these types, and indicate only two basic evolutionary paths. One is characterized by a vigorous but brief initial star-forming event, and little or no star formation thereafter (fast evolution), and the other by roughly continuous star formation until (nearly) the present time (slow evolution). These two paths do not map directly onto the dIrr, dT, and dSph types. Thus, the present galaxy properties do not reflect their lifetime evolution. Since there are some indications that slow dwarfs were assembled in lower-density environments than fast dwarfs, Gallart et al. (2015) proposed that the distinction between fast and slow dwarfs reflects the characteristic density of the environment where they formed. This scenario, and more generally scenarios where dSph galaxies formed through interaction with a massive galaxy, are challenged by a small sample of extremely isolated dSph/dT galaxies in the outer fringes of the Local Group. This proposal targets two of these objects (VV124, KKR25) for which we will infer the SFH - through a novel technique that combines the information from their RR Lyrae stars and deep CMDs sampling the intermediate-age population - in order to test these scenarios. This is much less demanding on observing time than classical SFH derivation using full-depth CMDs.

  4. Direct collapse to supermassive black hole seeds: comparing the AMR and SPH approaches

    NASA Astrophysics Data System (ADS)

    Luo, Yang; Nagamine, Kentaro; Shlosman, Isaac

    2016-07-01

    We provide a detailed comparison between the adaptive mesh refinement (AMR) code ENZO-2.4 and the smoothed particle hydrodynamics (SPH)/N-body code GADGET-3 in the context of isolated or cosmological direct baryonic collapse within dark matter (DM) haloes to form supermassive black holes. Gas flow is examined by following the evolution of basic parameters of accretion flows. Both codes show an overall agreement in the general features of the collapse; however, many subtle differences exist. For isolated models, the codes increase their spatial and mass resolutions at different paces, which leads to substantially earlier collapse in SPH than in AMR cases due to the higher gravitational resolution in GADGET-3. In cosmological runs, the AMR develops a slightly higher baryonic resolution than SPH during halo growth via cold accretion permeated by mergers. Still, both codes agree in the build-up of DM and baryonic structures. However, with the onset of collapse, this difference in mass and spatial resolution is amplified, so the evolution of SPH models begins to lag behind. Such a delay can have an effect on the formation/destruction rate of H2 due to the UV background, and on basic properties of host haloes. Finally, isolated non-cosmological models in spinning haloes, with spin parameter λ ∼ 0.01-0.07, show delayed collapse for greater λ, but the pace of this increase is faster for AMR. Within our simulation set-up, GADGET-3 requires significantly larger computational resources than ENZO-2.4 during collapse, and needs similar resources during the pre-collapse, cosmological structure formation phase. Yet it benefits from substantially higher gravitational force and hydrodynamic resolutions, except at the end of collapse.

  5. SPH non-Newtonian Model for Ice Sheet and Ice Shelf Dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tartakovsky, Alexandre M.; Pan, Wenxiao; Monaghan, Joseph J.

    2012-07-07

    We propose a new three-dimensional smoothed particle hydrodynamics (SPH) non-Newtonian model to study coupled ice sheet and ice shelf dynamics. Most existing ice sheet numerical models use a grid-based Eulerian approach and are usually restricted to shallow ice sheet and ice shelf approximations of the momentum conservation equation. SPH, a fully Lagrangian particle method, solves the full momentum conservation equation. The SPH method also allows modeling of free-surface flows, large material deformation, and material fragmentation without employing complex front-tracking schemes, and does not require re-meshing. As a result, SPH codes are highly scalable. The numerical accuracy of the proposed SPH model is first verified by simulating a plane shear flow with a free surface and the propagation of a blob of ice along a horizontal surface. Next, the SPH model is used to investigate the grounding line dynamics of the ice sheet/shelf. The steady position of the grounding line, obtained from our SPH simulations, is in good agreement with laboratory observations for a wide range of bedrock slopes, ice-to-fluid density ratios, and fluxes. We examine the effect of the non-Newtonian behavior of ice on the grounding line dynamics. The non-Newtonian constitutive model is based on Glen's law for a creeping flow of a polycrystalline ice. Finally, we investigate the effect of the bedrock geometry on the steady-state position of the grounding line.
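
    Glen's law enters such a model through an effective viscosity. A minimal sketch follows, with a rate factor typical of temperate ice chosen purely for illustration; the paper's constitutive implementation may differ in its regularization.

    ```python
    import numpy as np

    def glen_effective_viscosity(strain_rate_II, A=2.4e-24, n=3.0, eps=1e-30):
        """Effective viscosity for a power-law (Glen) creeping flow of ice,

            eta_eff = 0.5 * A**(-1/n) * e_II**((1 - n) / n),

        where e_II is the second invariant of the strain-rate tensor [1/s]
        and A is the rate factor [Pa^-n s^-1] (a typical temperate-ice value,
        used only for illustration). The floor eps regularizes the blow-up
        as the strain rate tends to zero.
        """
        e = np.maximum(strain_rate_II, eps)
        return 0.5 * A ** (-1.0 / n) * e ** ((1.0 - n) / n)

    print(glen_effective_viscosity(1e-9))   # Pa s, for a modest strain rate
    ```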

  6. National Combustion Code: Parallel Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

    2000-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.

  7. IMPROVED PERFORMANCES IN SUBSONIC FLOWS OF AN SPH SCHEME WITH GRADIENTS ESTIMATED USING AN INTEGRAL APPROACH

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valdarnini, R., E-mail: valda@sissa.it

    In this paper, we present results from a series of hydrodynamical tests aimed at validating the performance of a smoothed particle hydrodynamics (SPH) formulation in which gradients are derived from an integral approach. We specifically investigate the code behavior with subsonic flows, where it is well known that zeroth-order inconsistencies present in standard SPH make it particularly problematic to correctly model the fluid dynamics. In particular, we consider the Gresho–Chan vortex problem, the growth of Kelvin–Helmholtz instabilities, the statistics of driven subsonic turbulence, and the cold Keplerian disk problem. We compare simulation results for the different tests with those obtained, for the same initial conditions, using standard SPH. We also compare the results with the corresponding ones obtained previously with other numerical methods, such as codes based on a moving-mesh scheme or Godunov-type Lagrangian meshless methods. We quantify code performances by introducing error norms and spectral properties of the particle distribution, in a way similar to what was done in other works. We find that the new SPH formulation exhibits strongly reduced gradient errors and outperforms standard SPH in all of the tests considered. In fact, in terms of accuracy, we find good agreement between the simulation results of the new scheme and those produced using other recently proposed numerical schemes. These findings suggest that the proposed method can be successfully applied to many astrophysical problems in which the presence of subsonic flows previously limited the use of SPH, with the new scheme now being competitive in these regimes with other numerical methods.

  8. Parallelization of Finite Element Analysis Codes Using Heterogeneous Distributed Computing

    NASA Technical Reports Server (NTRS)

    Ozguner, Fusun

    1996-01-01

    Performance gains in computer design are quickly consumed as users seek to analyze larger problems to a higher degree of accuracy. Innovative computational methods, such as parallel and distributed computing, seek to multiply the power of existing hardware technology to satisfy the computational demands of large applications. In the early stages of this project, experiments were performed using two large, coarse-grained applications, CSTEM and METCAN. These applications were parallelized on an Intel iPSC/860 hypercube. It was found that the overall speedup was very low, due to large, inherently sequential code segments present in the applications. The overall execution time T_par of the application is dependent on these sequential segments. If these segments make up a significant fraction of the overall code, the application will have a poor speedup measure.
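
    The observation about sequential segments is Amdahl's law: if a fraction f of the runtime parallelizes perfectly across p processors, the speedup is bounded by 1/((1 − f) + f/p). A two-line check of the numbers (the 20%-sequential figure is an illustrative assumption, not taken from the report):

    ```python
    def amdahl_speedup(f, p):
        """Upper bound on speedup with parallel fraction f on p processors."""
        return 1.0 / ((1.0 - f) + f / p)

    # with 20% sequential code, no processor count can beat 5x
    print(amdahl_speedup(0.8, 16))     # ~4.0
    print(amdahl_speedup(0.8, 1024))   # ~4.98, already near the 5x ceiling
    ```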

  9. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Earth Sciences Division; Zhang, Keni

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one-, two-, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on TOUGH2 Version 1.4 with the EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick-start guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version of the TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical

  10. SEURAT: SPH scheme extended with ultraviolet line radiative transfer

    NASA Astrophysics Data System (ADS)

    Abe, Makito; Suzuki, Hiroyuki; Hasegawa, Kenji; Semelin, Benoit; Yajima, Hidenobu; Umemura, Masayuki

    2018-05-01

    We present a novel Lyman alpha (Lyα) radiative transfer code, SEURAT (SPH scheme Extended with Ultraviolet line RAdiative Transfer), where line scatterings are solved adaptively with the resolution of the smoothed particle hydrodynamics (SPH). The radiative transfer method implemented in SEURAT is based on a Monte Carlo algorithm in which the scattering and absorption by dust are also incorporated. We perform standard test calculations to verify the validity of the code: (i) emergent spectra from a static uniform sphere, (ii) emergent spectra from an expanding uniform sphere, and (iii) the escape fraction from a dusty slab. Thereby, we demonstrate that our code solves the Lyα radiative transfer with sufficient accuracy. We emphasize that SEURAT can treat the transfer of Lyα photons even in highly complex systems that have significantly inhomogeneous density fields. The high adaptivity of SEURAT is desirable for solving the propagation of Lyα photons in the interstellar medium of young star-forming galaxies like Lyα emitters (LAEs). Thus, SEURAT provides a powerful tool for modeling the emergent spectra of Lyα emission, which can be compared to the observations of LAEs.
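
    The Monte Carlo core of such a code is a random walk in optical depth. The toy slab model below shows only the sampling skeleton (exponential path lengths, isotropic re-emission, dust absorption); SEURAT's actual transfer adds Lyα frequency redistribution and the SPH-adaptive resolution described above, and all names here are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def mc_escape_fraction(tau_half, albedo=0.7, n_photons=2000):
        """Toy Monte Carlo walk in a uniform slab of half-depth tau_half."""
        escaped = 0
        for _ in range(n_photons):
            tau, mu = 0.0, rng.uniform(-1.0, 1.0)  # depth and direction cosine
            while True:
                tau += mu * rng.exponential(1.0)   # fly a random optical depth
                if abs(tau) > tau_half:
                    escaped += 1                   # photon left the slab
                    break
                if rng.random() > albedo:
                    break                          # absorbed by dust
                mu = rng.uniform(-1.0, 1.0)        # isotropic re-emission
        return escaped / n_photons

    print(mc_escape_fraction(tau_half=3.0))
    ```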

  11. A Data Parallel Multizone Navier-Stokes Code

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon; Kwak, Dochan (Technical Monitor)

    1995-01-01

    We have developed a data parallel multizone compressible Navier-Stokes code on the Connection Machine CM-5. The code is set up for implicit time-stepping on single or multiple structured grids. For multiple grids and geometrically complex problems, we follow the "chimera" approach, where flow data on one zone is interpolated onto another in the region of overlap. We will describe our design philosophy and give some timing results for the current code. The design choices can be summarized as: 1. finite differences on structured grids; 2. implicit time-stepping with either distributed solves or data motion and local solves; 3. sequential stepping through multiple zones with interzone data transfer via a distributed data structure. We have implemented these ideas on the CM-5 using CMF (Connection Machine Fortran), a data parallel language which combines elements of Fortran 90 and certain extensions, and which bears a strong similarity to High Performance Fortran (HPF). One interesting feature is the issue of turbulence modeling, where the architecture of a parallel machine makes the use of an algebraic turbulence model awkward, whereas models based on transport equations are more natural. We will present some performance figures for the code on the CM-5, and consider the issues involved in transitioning the code to HPF for portability to other parallel platforms.

  12. A shock-capturing SPH scheme based on adaptive kernel estimation

    NASA Astrophysics Data System (ADS)

    Sigalotti, Leonardo Di G.; López, Hender; Donoso, Arnaldo; Sira, Eloy; Klapp, Jaime

    2006-02-01

    Here we report a method that converts standard smoothed particle hydrodynamics (SPH) into a working shock-capturing scheme without relying on solutions to the Riemann problem. Unlike existing adaptive SPH simulations, the present scheme is based on an adaptive kernel estimation of the density, which combines intrinsic features of both the kernel and nearest neighbor approaches in a way that the amount of smoothing required in low-density regions is effectively controlled. Symmetrized SPH representations of the gas dynamic equations along with the usual kernel summation for the density are used to guarantee variational consistency. Implementation of the adaptive kernel estimation involves a very simple procedure and allows for a unique scheme that handles strong shocks and rarefactions the same way. Since it represents a general improvement of the integral interpolation on scattered data, it is also applicable to other fluid-dynamic models. When the method is applied to supersonic compressible flows with sharp discontinuities, as in the classical one-dimensional shock-tube problem and its variants, the accuracy of the results is comparable, and in most cases superior, to that obtained from high quality Godunov-type methods and SPH formulations based on Riemann solutions. The extension of the method to two- and three-space dimensions is straightforward. In particular, for the two-dimensional cylindrical Noh's shock implosion and Sedov point explosion problems the present scheme produces much better results than those obtained with conventional SPH codes.
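
    Adaptive kernel estimation ties the smoothing length to the local density, which in SPH is usually obtained with a self-consistent (h, ρ) iteration. A toy 1D version of that fixed-point loop follows; the paper's estimator adds nearest-neighbor refinements beyond this basic mechanism, and the names are illustrative.

    ```python
    import numpy as np

    def w_cubic(q):
        # 1D cubic spline kernel (normalization 2/3)
        return (2.0 / 3.0) * np.where(
            q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
            np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

    def adaptive_density(x, m, eta=1.2, n_iter=20):
        """Self-consistent (h, rho) iteration: h shrinks where particles are
        dense and grows in rarefied regions, controlling the smoothing in
        low-density zones."""
        n = len(x)
        h = np.full(n, eta * (x.max() - x.min()) / n)  # guess: mean spacing
        for _ in range(n_iter):
            q = np.abs(x[:, None] - x[None, :]) / h[:, None]
            rho = np.sum(m[None, :] * w_cubic(q), axis=1) / h
            h = eta * (m / rho)                        # 1D: h ~ m / rho
        return h, rho

    # clustered particle set: h adapts to the local spacing
    x = np.concatenate([np.linspace(0.0, 1.0, 80), np.linspace(1.0, 1.05, 20)])
    m = np.full(100, 1.0 / 100)
    h, rho = adaptive_density(x, m)
    ```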

  13. National Combustion Code: Parallel Performance

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2001-01-01

    This report discusses the National Combustion Code (NCC). The NCC is an integrated system of codes for the design and analysis of combustion systems. The advanced features of the NCC meet designers' requirements for model accuracy and turn-around time. The fundamental features at the inception of the NCC were parallel processing and unstructured mesh. The design and performance of the NCC are discussed.

  14. A comparison of cosmological hydrodynamic codes

    NASA Technical Reports Server (NTRS)

    Kang, Hyesung; Ostriker, Jeremiah P.; Cen, Renyue; Ryu, Dongsu; Hernquist, Lars; Evrard, August E.; Bryan, Greg L.; Norman, Michael L.

    1994-01-01

    We present a detailed comparison of the simulation results of various hydrodynamic codes. Starting with identical initial conditions based on the cold dark matter scenario for the growth of structure, with parameters h = 0.5, Ω = Ω_b = 1, and σ_8 = 1, we integrate from redshift z = 20 to z = 0 to determine the physical state within a representative volume of size L^3, where L = 64 h^-1 Mpc. Five independent codes are compared: three of them Eulerian mesh-based and two variants of the smoothed particle hydrodynamics (SPH) Lagrangian approach. The Eulerian codes were run at N^3 = 32^3, 64^3, 128^3, and 256^3 cells, the SPH codes at N^3 = 32^3 and 64^3 particles. Results were then rebinned to a 16^3 grid, with the expectation that the rebinned data should converge, by all techniques, to a common and correct result as N approaches infinity. We find that global averages of various physical quantities do, as expected, tend to converge in the rebinned model, but that uncertainties in even primitive quantities such as ⟨T⟩ and ⟨ρ²⟩^(1/2) persist at the 3%-17% level. All codes achieve comparable and satisfactory accuracy for comparable computer time in their treatment of the high-density, high-temperature regions as measured in the rebinned data; the variance among the five codes (at highest resolution) for the mean temperature (as weighted by ρ²) is only 4.5%. Examined at high resolution, we suspect that the density resolution is better in the SPH codes and the thermal accuracy in low-density regions better in the Eulerian codes. In the low-density, low-temperature regions the SPH codes have poor accuracy due to statistical effects, and the Jameson code gives temperatures which are too high, due to overuse of artificial viscosity in these high-Mach-number regions. Overall the comparison allows us to better estimate errors; it points to ways of improving this current generation of hydrodynamic

  15. Comparing AMR and SPH Cosmological Simulations. I. Dark Matter and Adiabatic Simulations

    NASA Astrophysics Data System (ADS)

    O'Shea, Brian W.; Nagamine, Kentaro; Springel, Volker; Hernquist, Lars; Norman, Michael L.

    2005-09-01

    We compare two cosmological hydrodynamic simulation codes in the context of hierarchical galaxy formation: the Lagrangian smoothed particle hydrodynamics (SPH) code GADGET, and the Eulerian adaptive mesh refinement (AMR) code Enzo. Both codes represent dark matter with the N-body method but use different gravity solvers and fundamentally different approaches for baryonic hydrodynamics. The SPH method in GADGET uses a recently developed "entropy conserving" formulation of SPH, while for the mesh-based Enzo two different formulations of Eulerian hydrodynamics are employed: the piecewise parabolic method (PPM) extended with a dual energy formulation for cosmology, and the artificial viscosity-based scheme used in the magnetohydrodynamics code ZEUS. In this paper we focus on a comparison of cosmological simulations that follow either only dark matter, or also a nonradiative ("adiabatic") hydrodynamic gaseous component. We perform multiple simulations using both codes with varying spatial and mass resolution with identical initial conditions. The dark matter-only runs agree generally quite well provided Enzo is run with a comparatively fine root grid and a low overdensity threshold for mesh refinement, otherwise the abundance of low-mass halos is suppressed. This can be readily understood as a consequence of the hierarchical particle-mesh algorithm used by Enzo to compute gravitational forces, which tends to deliver lower force resolution than the tree algorithm of GADGET at early times before any adaptive mesh refinement takes place. At comparable force resolution we find that the latter offers substantially better performance and lower memory consumption than the present gravity solver in Enzo. In simulations that include adiabatic gasdynamics we find general agreement in the distribution functions of temperature, entropy, and density for gas of moderate to high overdensity, as found inside dark matter halos. However, there are also some significant differences in

  16. SPH simulations of high-speed collisions

    NASA Astrophysics Data System (ADS)

    Rozehnal, Jakub; Broz, Miroslav

    2016-10-01

    Our work is devoted to a comparison of: i) asteroid-asteroid collisions occurring at lower velocities (about 5 km/s in the Main Belt), and ii) mutual collisions of asteroids and cometary nuclei, usually occurring at significantly higher relative velocities (>10 km/s). We focus on differences in the propagation of the shock wave, the ejection of fragments, and possible differences in the resulting size-frequency distributions of synthetic asteroid families. We also discuss scaling with respect to the "nominal" target diameter D = 100 km and projectile velocity 3-7 km/s, for which a number of simulations have been done so far (Durda et al. 2007, Benavidez et al. 2012). In the latter case of asteroid-comet collisions, we simulate the impacts of brittle or pre-damaged impactors onto solid monolithic targets at high velocities, ranging from 10 to 15 km/s. The purpose of this numerical experiment is to better understand the impact processes shaping the early Solar System, namely the primordial asteroid belt during the (late) heavy bombardment (as a continuation of Broz et al. 2013). For all hydrodynamical simulations we use a smoothed-particle hydrodynamics (SPH) method, namely the Lagrangian SPH3D code (Benz & Asphaug 1994, 1995). The gravitational interactions between fragments (re-accumulation) are simulated with the Pkdgrav tree code (Richardson et al. 2000).

  17. PENTACLE: Parallelized particle-particle particle-tree code for planet formation

    NASA Astrophysics Data System (ADS)

    Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori

    2017-10-01

    We have newly developed a parallelized particle-particle particle-tree code for planet formation, PENTACLE, which is a parallelized hybrid N-body integrator executed on a CPU-based (super)computer. PENTACLE uses a fourth-order Hermite algorithm to calculate gravitational interactions between particles within a cut-off radius and a Barnes-Hut tree method for gravity from particles beyond. It also implements an open-source library designed for full automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize a Barnes-Hut tree algorithm for a memory-distributed supercomputer. These allow us to handle 1-10 million particles in a high-resolution N-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of the cut-off radius R̃_cut and the time-step Δt. It turns out that the accuracy of a hybrid N-body simulation is controlled through Δt/R̃_cut, and Δt/R̃_cut ∼ 0.1 is necessary to simulate accurately the accretion process of a planet for ≥10^6 yr. For all those interested in large-scale particle simulations, PENTACLE, customized for planet formation, will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT licence.
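
    The particle-particle particle-tree split can be sketched as a bookkeeping exercise: interactions inside the cut-off radius go to the direct (Hermite-integrated) force, everything outside to the tree. In the hedged sketch below, both halves are computed as direct sums purely to show the split; the names and the sharp (rather than smoothly tapered) changeover are assumptions, not PENTACLE's implementation.

    ```python
    import numpy as np

    def split_acceleration(pos, mass, i, r_cut):
        """Return (near, far) acceleration parts for particle i: neighbors
        inside r_cut would go to the Hermite integrator, the remainder to
        the Barnes-Hut tree (here a direct sum for illustration)."""
        dr = pos - pos[i]
        r = np.linalg.norm(dr, axis=1)
        r[i] = np.inf                       # exclude self-interaction
        near = r < r_cut

        def accel(sel):
            return np.sum(mass[sel, None] * dr[sel] / r[sel, None]**3, axis=0)

        return accel(near), accel(~near)    # (PP part, tree part)

    rng = np.random.default_rng(3)
    pos, mass = rng.random((100, 3)), np.full(100, 1e-6)
    a_pp, a_tree = split_acceleration(pos, mass, i=0, r_cut=0.2)
    ```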

  18. A hybrid Lagrangian Voronoi-SPH scheme

    NASA Astrophysics Data System (ADS)

    Fernandez-Gutierrez, D.; Souto-Iglesias, A.; Zohdi, T. I.

    2018-07-01

    A hybrid Lagrangian Voronoi-SPH scheme, with an explicit weakly compressible formulation for both the Voronoi and SPH sub-domains, has been developed. The SPH discretization is substituted by Voronoi elements close to solid boundaries, where SPH consistency and boundary conditions implementation become problematic. A buffer zone to couple the dynamics of both sub-domains is used. This zone is formed by a set of particles where fields are interpolated taking into account SPH particles and Voronoi elements. A particle may move in or out of the buffer zone depending on its proximity to a solid boundary. The accuracy of the coupled scheme is discussed by means of a set of well-known verification benchmarks.

  20. Identification of a Serine Proteinase Homolog (Sp-SPH) Involved in Immune Defense in the Mud Crab Scylla paramamosain

    PubMed Central

    Zhang, Qiu-xia; Liu, Hai-peng; Chen, Rong-yuan; Shen, Kai-li; Wang, Ke-jian

    2013-01-01

    Clip domain serine proteinase homologs are involved in many biological processes, including the immune response. To identify the immune function of a serine proteinase homolog (Sp-SPH), originally isolated from hemocytes of the mud crab Scylla paramamosain, the Sp-SPH was expressed recombinantly and purified for further studies. It was found that the Sp-SPH protein could bind to a number of bacteria (including Aeromonas hydrophila, Escherichia coli, Staphylococcus aureus, Vibrio fluvialis, Vibrio harveyi, and Vibrio parahemolyticus), bacterial cell wall components such as lipopolysaccharide or peptidoglycan (PGN), and β-1,3-glucan of fungus. No direct antibacterial activity of the Sp-SPH protein was shown by using minimum inhibitory concentration or minimum bactericidal concentration assays. Nevertheless, the Sp-SPH protein was found to significantly enhance the crab hemocyte adhesion activity (paired t-test, P<0.05), and to increase phenoloxidase activity if triggered by PGN in vitro (paired t-test, P<0.05). Importantly, pre-incubation with the Sp-SPH protein was demonstrated to promote the survival rate of the animals after challenge with A. hydrophila or V. parahemolyticus, both of which were recognized by the Sp-SPH protein. In contrast, the crabs died much faster when challenged with Vibrio alginolyticus, a pathogenic bacterium not recognized by the Sp-SPH protein, compared to crabs challenged with A. hydrophila or V. parahemolyticus when pre-coated with the Sp-SPH protein. Taken together, these data suggest that the Sp-SPH molecule might play an important role in immune defense against bacterial infection in the mud crab S. paramamosain. PMID:23724001

  2. SPH modeling of fluid-structure interaction

    NASA Astrophysics Data System (ADS)

    Han, Luhui; Hu, Xiangyu

    2018-02-01

    This work concerns the numerical modeling of fluid-structure interaction (FSI) problems in a uniform smoothed particle hydrodynamics (SPH) framework. It combines a transport-velocity SPH scheme, advancing the fluid motion, with a total Lagrangian SPH formulation dealing with the structure deformations. Since both the fluid and solid governing equations are solved in the same SPH framework, the coupling becomes straightforward and the momentum conservation of the FSI system is strictly satisfied. A well-known FSI benchmark test case has been performed to validate the modeling and to demonstrate its potential.
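
    As a point of reference for readers new to SPH, the sketch below shows the generic particle-interaction machinery that any such scheme is built on: a cubic spline smoothing kernel and the standard symmetric pressure acceleration. This is a minimal Python illustration, not the authors' transport-velocity or total Lagrangian formulation; the names and the kernel normalization are common textbook choices.

      import numpy as np

      def grad_w(rij, h):
          """Gradient of the standard 3-D cubic spline kernel W(|r|, h)."""
          r = np.linalg.norm(rij)
          q = r / h
          sigma = 1.0 / (np.pi * h**3)            # 3-D normalization constant
          if q < 1.0:
              dwdq = sigma * (-3.0 * q + 2.25 * q**2)
          elif q < 2.0:
              dwdq = -0.75 * sigma * (2.0 - q)**2
          else:
              return np.zeros_like(rij)
          return (dwdq / (h * max(r, 1e-12))) * rij

      def pressure_accel(m_j, P_i, P_j, rho_i, rho_j, rij, h):
          """Symmetric SPH pressure term: conserves momentum pairwise."""
          return -m_j * (P_i / rho_i**2 + P_j / rho_j**2) * grad_w(rij, h)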

  3. SPH modeling of the Stickney impact at Phobos

    NASA Astrophysics Data System (ADS)

    Bruck Syal, Megan; Rovny, Jared; Owen, J. Michael; Miller, Paul L.

    2016-10-01

    Stickney crater stretches across nearly half the diameter of ~22-km Phobos, the larger of the two martian moons. The Stickney-forming impact would have had global consequences for Phobos, causing extensive damage to the satellite's interior and initiating large-scale resurfacing through ejecta blanket emplacement. Further, much of the ejected material that initially escaped the moon's tiny gravity (escape velocity of ~11 m/s) would have likely reimpacted on subsequent orbits. Modeling of the impact event is necessary to understand the conditions that allowed this "megacrater" to form without disrupting the entire satellite. Impact simulation results also provide a means to test several different hypotheses for how the mysterious families of parallel grooves may have formed at Phobos. We report on adaptive SPH simulations that successfully generate Stickney while avoiding catastrophic fragmentation of Phobos. Inclusion of target porosity and using sufficient numerical resolution in fully 3-D simulations are key for avoiding over-estimation of target damage. Cratering efficiency follows gravity-dominated scaling laws over a wide range of velocities (6-20 km/s) for the appropriate material constants. While the adaptive SPH results are used to constrain crater volume and fracture patterns within the target, additional questions about the fate of ejecta and final crater morphology within an unusual gravity environment can be addressed with complementary numerical methods. Results from the end of the hydrodynamics-controlled phase (tens of seconds after impact) are linked to a Discrete Element Method code, which can explore these processes over longer time scales (see Schwartz et al., this meeting). This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-695442.
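
    The ~11 m/s escape velocity quoted above is easy to verify; a quick check in Python using commonly cited values for the mass and mean radius of Phobos (the two figures below are assumptions, not taken from the abstract):

      import math

      G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
      M = 1.066e16       # approximate mass of Phobos, kg (assumed)
      R = 11.1e3         # approximate mean radius of Phobos, m (assumed)

      v_esc = math.sqrt(2 * G * M / R)
      print(f"{v_esc:.1f} m/s")    # ~11.3 m/s, consistent with the ~11 m/s above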

  4. A Comparison of Grid-based and SPH Binary Mass-transfer and Merger Simulations

    DOE PAGES

    Motl, Patrick M.; Frank, Juhan; Staff, Jan; ...

    2017-03-29

    There is currently a great amount of interest in the outcomes and astrophysical implications of mergers of double degenerate binaries. In a commonly adopted approximation, the components of such binaries are represented by polytropes with an index of n = 3/2. We present detailed comparisons of stellar mass-transfer and merger simulations of polytropic binaries that have been carried out using two very different numerical algorithms—a finite-volume "grid" code and a smoothed-particle hydrodynamics (SPH) code. We find that there is agreement in both the ultimate outcomes of the evolutions and the intermediate stages if the initial conditions for each code are chosen to match as closely as possible. We find that even with closely matching initial setups, the time it takes to reach a concordant evolution differs between the two codes because the initial depth of contact cannot be matched exactly. There is a general tendency for SPH to yield higher mass transfer rates and faster evolution to the final outcome. Here, we also present comparisons of simulations calculated from two different energy equations: in one series, we assume a polytropic equation of state and in the other series an ideal gas equation of state. In the latter series of simulations, an atmosphere forms around the accretor, which can exchange angular momentum and cause a more rapid loss of orbital angular momentum. In the simulations presented here, the effect of the ideal equation of state is to de-stabilize the binary in both SPH and grid simulations, but the effect is more pronounced in the grid code.
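
    For readers unfamiliar with the two energy treatments being compared, the contrast is compact enough to state in code. A minimal sketch (constants illustrative): a polytrope with index n = 3/2 makes pressure a function of density alone, while an ideal gas law couples pressure to a separately evolved specific internal energy, which is what allows the atmosphere around the accretor to form.

      def polytropic_pressure(rho, K, n=1.5):
          # P = K * rho**(1 + 1/n); the entropy constant K is fixed by the
          # initial stellar model, so no energy equation is evolved.
          return K * rho ** (1.0 + 1.0 / n)

      def ideal_gas_pressure(rho, u, gamma=5.0 / 3.0):
          # P = (gamma - 1) * rho * u, with u the specific internal energy
          # evolved alongside the hydrodynamics.
          return (gamma - 1.0) * rho * u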

  5. A Comparison of Grid-based and SPH Binary Mass-transfer and Merger Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Motl, Patrick M.; Frank, Juhan; Clayton, Geoffrey C.

    2017-04-01

    There is currently a great amount of interest in the outcomes and astrophysical implications of mergers of double degenerate binaries. In a commonly adopted approximation, the components of such binaries are represented by polytropes with an index of n = 3/2. We present detailed comparisons of stellar mass-transfer and merger simulations of polytropic binaries that have been carried out using two very different numerical algorithms—a finite-volume “grid” code and a smoothed-particle hydrodynamics (SPH) code. We find that there is agreement in both the ultimate outcomes of the evolutions and the intermediate stages if the initial conditions for each code are chosen to match as closely as possible. We find that even with closely matching initial setups, the time it takes to reach a concordant evolution differs between the two codes because the initial depth of contact cannot be matched exactly. There is a general tendency for SPH to yield higher mass transfer rates and faster evolution to the final outcome. We also present comparisons of simulations calculated from two different energy equations: in one series, we assume a polytropic equation of state and in the other series an ideal gas equation of state. In the latter series of simulations, an atmosphere forms around the accretor, which can exchange angular momentum and cause a more rapid loss of orbital angular momentum. In the simulations presented here, the effect of the ideal equation of state is to de-stabilize the binary in both SPH and grid simulations, but the effect is more pronounced in the grid code.

  6. New Parallel computing framework for radiation transport codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kostin, M.A. (Michigan State U., NSCL); Mokhov, N.V.

    A new parallel computing framework has been developed for use with general-purpose radiation transport codes. The framework was implemented as a C++ module that uses MPI for message passing. The module is largely independent of the radiation transport codes it can be used with, and is connected to the codes by means of a number of interface functions. The framework was integrated with the MARS15 code, and an effort is under way to deploy it in PHITS. Besides the parallel computing functionality, the framework offers a checkpoint facility that allows restarting calculations from a saved checkpoint file. The checkpoint facility can be used in single-process calculations as well as in the parallel regime. Several checkpoint files can be merged into one, thus combining the results of several calculations. The framework also corrects some of the known problems with scheduling and load balancing found in the original implementations of the parallel computing functionality in MARS15 and PHITS. The framework can be used efficiently on homogeneous systems and on networks of workstations, where interference from other users is possible.
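
    The framework itself is a C++/MPI module whose interface is not reproduced in the abstract; the sketch below only illustrates the checkpoint-and-merge pattern it describes, using Python with mpi4py. The file names and the state layout are assumptions made for the illustration.

      from mpi4py import MPI
      import os
      import pickle

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()
      ckpt = f"ckpt_rank{rank}.pkl"        # hypothetical per-rank checkpoint file

      # Restart from a saved checkpoint if one exists, otherwise start fresh.
      if os.path.exists(ckpt):
          with open(ckpt, "rb") as f:
              state = pickle.load(f)
      else:
          state = {"histories": 0, "tally": 0.0}

      for _ in range(1000):                # stand-in for transporting particles
          state["histories"] += 1
          state["tally"] += 1.0

      with open(ckpt, "wb") as f:          # save a checkpoint for a later restart
          pickle.dump(state, f)

      # "Merging" results: combine the per-rank tallies into a single total.
      total = comm.reduce(state["tally"], op=MPI.SUM, root=0)
      if rank == 0:
          print("combined tally:", total)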

  7. Capabilities of Fully Parallelized MHD Stability Code MARS

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2016-10-01

    Results of the full parallelization of the plasma stability code MARS are reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. A parallel version of MARS, named PMARS, has recently been developed at FAR-TECH. Parallelized MARS is an efficient tool for the simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models implemented in MARS. Parallelization of the code included parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse vector iteration algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. The solution of the eigenvalue problem is parallelized by repeating the steps of the MARS algorithm using parallel libraries and procedures. Parallelized MARS is capable of calculating eigenmodes with significantly increased spatial resolution: up to 5,000 adapted radial grid points with up to 500 poloidal harmonics. Such resolution is sufficient for the simulation of kink, tearing and peeling-ballooning instabilities with physically relevant parameters. Work is supported by the U.S. DOE SBIR program.
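
    The inverse vector iteration mentioned above is a standard eigensolver building block; a serial numpy sketch of one common variant is given below (the parallel code distributes the matrix construction and these linear solves across magnetic surfaces; nothing here is taken from MARS itself).

      import numpy as np

      def inverse_iteration(A, shift, tol=1e-10, max_iter=200):
          """Eigenpair of A nearest `shift` via inverse vector iteration."""
          n = A.shape[0]
          x = np.random.default_rng(0).standard_normal(n)
          x /= np.linalg.norm(x)
          lam = shift
          for _ in range(max_iter):
              y = np.linalg.solve(A - shift * np.eye(n), x)  # inverse step
              x = y / np.linalg.norm(y)
              lam_new = x @ A @ x                  # Rayleigh quotient estimate
              if abs(lam_new - lam) < tol:
                  break
              lam = lam_new
          return lam, x

      # e.g. the eigenvalue of diag(1, 2, 5) closest to 1.8 is 2:
      print(inverse_iteration(np.diag([1.0, 2.0, 5.0]), 1.8)[0])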

  8. Dimension reduction method for SPH equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tartakovsky, Alexandre M.; Scheibe, Timothy D.

    2011-08-26

    A Smoothed Particle Hydrodynamics (SPH) model of a complex multiscale process often results in a system of ODEs with an enormous number of unknowns. Furthermore, time integration of the SPH equations usually requires time steps that are smaller than the observation time by many orders of magnitude. A direct solution of these ODEs can be extremely expensive. Here we propose a novel dimension reduction method that gives an approximate solution of the SPH ODEs and provides an accurate prediction of the average behavior of the modeled system. The method consists of two main elements. First, effective equations for the evolution of average variables (e.g., average velocity, concentration and mass of a mineral precipitate) are obtained by averaging the SPH ODEs over the entire computational domain. These effective ODEs contain non-local terms in the form of volume integrals of functions of the SPH variables. Second, a computational closure is used to close the system of effective equations. The computational closure is achieved via short bursts of the SPH model. The dimension reduction model is used to simulate flow and transport with mixing-controlled reactions and mineral precipitation. An SPH model is used to model transport at the pore scale. Good agreement between direct solutions of the SPH equations and solutions obtained with the dimension reduction method for different boundary conditions confirms the accuracy and computational efficiency of the dimension reduction model. The method significantly accelerates SPH simulations, while providing an accurate approximation of the solution and an accurate prediction of the average behavior of the system.
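
    The two-step structure of the method (averaged effective equations, closed by short bursts of the fine model) can be illustrated on a toy problem. The sketch below is only a cartoon of the idea under stated assumptions: a large set of heterogeneously decaying variables stands in for the SPH system, a short burst estimates the effective decay rate of the domain average, and the average is then advanced with a much larger step.

      import numpy as np

      rng = np.random.default_rng(1)
      x = rng.uniform(0.5, 1.5, size=10_000)       # toy "fine" SPH-like state
      rates = rng.uniform(0.9, 1.1, size=x.size)   # heterogeneous decay rates

      dt_fine, dt_coarse = 1e-3, 0.1
      avg, t = x.mean(), 0.0
      while t < 1.0 - 1e-9:
          for _ in range(5):                       # closure burst: a few fine steps
              x += dt_fine * (-rates * x)
          eff_rate = (rates * x).sum() / x.sum()   # effective average decay rate
          avg *= np.exp(-eff_rate * dt_coarse)     # coarse step of the closed ODE
          x *= avg / x.mean()                      # keep fine state consistent
          t += dt_coarse
      print(f"average at t=1: {avg:.3f} (exact would be near exp(-1) = 0.368)")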

  9. Long non-protein coding RNA DANCR functions as a competing endogenous RNA to regulate osteoarthritis progression via miR-577/SphK2 axis.

    PubMed

    Fan, Xiaochen; Yuan, Jishan; Xie, Jun; Pan, Zhanpeng; Yao, Xiang; Sun, Xiangyi; Zhang, Pin; Zhang, Lei

    2018-06-07

    Long noncoding RNAs (lncRNAs) have been known to be involved in multiple diverse diseases, including osteoarthritis (OA). This study aimed to explore the role of differentiation antagonizing non-protein coding RNA (DANCR) in OA and identify the potential molecular mechanisms. The expression of DANCR in cartilage samples from patients with OA was detected using quantitative reverse transcription-polymerase chain reaction. The effects of DANCR on the viability of OA chondrocytes and apoptosis were explored using cell counting kit 8 assay and flow cytometry assay, respectively. Additionally, the interaction among DANCR, miR-577, and SphK2 was explored using dual-luciferase reporter and RIP assays. The present study found that DANCR was significantly upregulated in patients with OA. Functional assays demonstrated that DANCR inhibition suppressed the proliferation of OA chondrocytes and induced cell apoptosis. The study also showed that DANCR acted as a competitive endogenous RNA to sponge miR-577, which targeted the mRNA of SphK2 to regulate the survival of OA chondrocytes. In conclusion, the study revealed that lncRNA DANCR might promote the proliferation of OA chondrocytes and reduce apoptosis through the miR-577/SphK2 axis. Thus, lncRNA DANCR might be considered as a potential therapeutic target for OA treatment. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Crystal Structure of Faradaurate-279: Au279(SPh-tBu)84 Plasmonic Nanocrystal Molecules.

    PubMed

    Sakthivel, Naga Arjun; Theivendran, Shevanuja; Ganeshraj, Vigneshraja; Oliver, Allen G; Dass, Amala

    2017-11-01

    We report the discovery of an unprecedentedly large, 2.2 nm diameter, thiolate-protected gold nanocrystal characterized by single crystal X-ray crystallography (sc-XRD), Au279(SPh-tBu)84, named Faradaurate-279 (F-279) in honor of Michael Faraday's (1857) pioneering work on nanoparticles. The F-279 nanocrystal has a core-shell structure containing a truncated octahedral core with a bulk face-centered cubic-like arrangement, yet it is a nanomolecule with a precise number of metal atoms and thiolate ligands. The Au279S84 geometry was established from a low-temperature 120 K sc-XRD study at 0.90 Å resolution. The atom count in the core-shell structure of Au279 follows the mathematical formula for magic number shells: Au@Au12@Au42@Au92@Au54, which is further protected by a final shell of Au48. The Au249 core is protected by three types of staple motifs, namely 30 bridging, 18 monomeric, and 6 dimeric staple motifs. Despite the presence of such diverse staple motifs, the Au279S84 structure has a chiral pseudo-D3 symmetry. The core-shell structure can be viewed as nested, concentric polyhedra, containing a total of five forms of Archimedean solids. A comparison between the Au279 and Au309 cuboctahedral superatom models in shell-wise growth is illustrated. F-279 can be synthesized and isolated in high purity in milligram quantities using size exclusion chromatography, as evidenced by mass spectrometry. Electrospray ionization mass spectrometry independently verifies the X-ray diffraction based heavy-atom formula, Au279S84, and establishes the molecular formula with the complete ligands, namely Au279(SPh-tBu)84. It is also the smallest gold nanocrystal to exhibit metallic behavior, with a surface plasmon resonance band around 510 nm.
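
    The shell counts quoted above can be checked with simple arithmetic. Assuming the usual staple stoichiometry (a bridging thiolate contributes no gold and one sulfur, a monomeric staple one gold and two sulfurs, a dimeric staple two golds and three sulfurs; this stoichiometry is standard for thiolated gold nanomolecules but is not spelled out in the abstract), the stated motifs reproduce Au279S84 exactly:

      core = 1 + 12 + 42 + 92 + 54         # concentric shells Au@Au12@Au42@Au92@Au54
      core += 48                           # outermost protecting gold shell -> Au249
      staple_au = 30 * 0 + 18 * 1 + 6 * 2  # bridging, monomeric, dimeric motifs
      staple_s  = 30 * 1 + 18 * 2 + 6 * 3
      print(core + staple_au, staple_s)    # -> 279 84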

  11. Zebrafish U6 small nuclear RNA gene promoters contain a SPH element in an unusual location.

    PubMed

    Halbig, Kari M; Lekven, Arne C; Kunkel, Gary R

    2008-09-15

    Promoters for vertebrate small nuclear RNA (snRNA) genes contain a relatively simple array of transcriptional control elements, divided into proximal and distal regions. Most of these genes are transcribed by RNA polymerase II (e.g., U1, U2), whereas the U6 gene is transcribed by RNA polymerase III. Previously identified vertebrate U6 snRNA gene promoters consist of a proximal sequence element (PSE) and TATA element in the proximal region, plus a distal region with octamer (OCT) and SphI postoctamer homology (SPH) elements. We have found that zebrafish U6 snRNA promoters contain the SPH element in a novel proximal position immediately upstream of the TATA element. The zebrafish SPH element is recognized by SPH-binding factor/selenocysteine tRNA gene transcription activating factor/zinc finger protein 143 (SBF/Staf/ZNF143) in vitro. Furthermore, a zebrafish U6 promoter with a defective SPH element is inefficiently transcribed when injected into embryos.

  12. Incompressible SPH (ISPH) with fast Poisson solver on a GPU

    NASA Astrophysics Data System (ADS)

    Chow, Alex D.; Rogers, Benedict D.; Lind, Steven J.; Stansby, Peter K.

    2018-05-01

    This paper presents a fast incompressible SPH (ISPH) solver implemented to run entirely on a graphics processing unit (GPU), capable of simulating several millions of particles in three dimensions on a single GPU. The ISPH algorithm is implemented by converting the highly optimised open-source weakly-compressible SPH (WCSPH) code DualSPHysics to run ISPH on the GPU, combining it with the open-source linear algebra library ViennaCL for fast solutions of the pressure Poisson equation (PPE). Several challenges are addressed with this research: constructing a PPE matrix every timestep on the GPU for moving particles, optimising the limited GPU memory, and exploiting fast matrix solvers. The ISPH pressure projection algorithm is implemented as 4 separate stages, each with a particle sweep, including an algorithm for the population of the PPE matrix suitable for the GPU, and mixed precision storage methods. An accurate and robust ISPH boundary condition ideal for parallel processing is also established by adapting an existing WCSPH boundary condition for ISPH. A variety of validation cases are presented: an impulsively started plate, incompressible flow around a moving square in a box, and dambreaks (2-D and 3-D) which demonstrate the accuracy, flexibility, and speed of the methodology. Fragmentation of the free surface is shown to influence the performance of matrix preconditioners and therefore the PPE matrix solution time. The Jacobi preconditioner demonstrates robustness and reliability in the presence of fragmented flows. For a dambreak simulation, the GPU demonstrates speed-ups of 10-18 times over single-threaded and 1.1-4.5 times over 16-threaded CPU run times.
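
    None of the DualSPHysics or ViennaCL internals appear in the abstract, but the role of the Jacobi preconditioner is easy to demonstrate on a stand-in system. A serial scipy sketch: a small symmetric positive-definite matrix plays the part of the PPE matrix, and the preconditioner is simply multiplication by the inverse of the matrix diagonal.

      import numpy as np
      import scipy.sparse as sp
      from scipy.sparse.linalg import LinearOperator, cg

      n = 100                                      # stand-in for n particles
      A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
      b = np.ones(n)                               # toy right-hand side

      # Jacobi preconditioner: apply the inverse of the diagonal of A.
      inv_diag = 1.0 / A.diagonal()
      M = LinearOperator((n, n), matvec=lambda x: inv_diag * x)

      p, info = cg(A, b, M=M)                      # preconditioned solve
      print(info == 0, np.allclose(A @ p, b, atol=1e-3))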

  13. Parallelization of a Monte Carlo particle transport simulation code

    NASA Astrophysics Data System (ADS)

    Hadjidoukas, P.; Bousis, C.; Emfietzoglou, D.

    2010-05-01

    We have developed a high-performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators were also integrated and studied. The new MC4 version was then parallelized for shared- and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures, including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors, and a 200 dual-processor HP cluster. For large problem sizes, which are limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow the study of higher particle energies with more accurate physical models, and improve statistics, as more particle tracks can be simulated in a given response time.
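
    The abstract's combination of MPI parallelism with per-process random streams is a standard pattern; the sketch below reproduces it in Python with mpi4py, with numpy's spawned seed sequences standing in for SPRNG/DCMT. The toy "transport" estimates the mean of exponentially distributed free paths; everything here is illustrative rather than MC4 code.

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      # Independent, non-overlapping random streams for every rank.
      stream = np.random.SeedSequence(12345).spawn(size)[rank]
      rng = np.random.default_rng(stream)

      n_local = 1_000_000 // size          # split particle histories across ranks
      depths = rng.exponential(scale=1.0, size=n_local)   # toy free paths
      local_sum = depths.sum()

      total = comm.reduce(local_sum, op=MPI.SUM, root=0)
      if rank == 0:
          print("mean depth:", total / (n_local * size))  # ~1.0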

  14. Parallel Scaling Characteristics of Selected NERSC User Project Codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skinner, David; Verdier, Francesca; Anand, Harsh

    This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080-CPU IBM SP and the largest parallel computer at NERSC. The scale, in terms of concurrency and problem size, of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers, we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.

  15. Parallel CARLOS-3D code development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Putnam, J.M.; Kotulski, J.D.

    1996-02-01

    CARLOS-3D is a three-dimensional scattering code which was developed under the sponsorship of the Electromagnetic Code Consortium, and is currently used by over 80 aerospace companies and government agencies. The code has been extensively validated and runs on both serial workstations and parallel supercomputers such as the Intel Paragon. CARLOS-3D is a three-dimensional surface integral equation scattering code based on a Galerkin method-of-moments formulation employing Rao-Wilton-Glisson roof-top basis functions for triangular faceted surfaces. Fully arbitrary 3D geometries composed of multiple conducting and homogeneous bulk dielectric materials can be modeled. This presentation describes some of the extensions to the CARLOS-3D code, and how the operator structure of the code facilitated these improvements. Body-of-revolution (BOR) and two-dimensional geometries were incorporated by simply including new input routines and the appropriate Galerkin matrix operator routines. Some additional modifications were required in the combined field integral equation matrix generation routine due to the symmetric nature of the BOR and 2D operators. Quadrilateral patched surfaces with linear roof-top basis functions were also implemented in the same manner. Quadrilateral facets and triangular facets can be used in combination to model more efficiently geometries with both large smooth surfaces and surfaces with fine detail such as gaps and cracks. Since the parallel implementation in CARLOS-3D is at a high level, these changes were independent of the computer platform being used. This approach minimizes code maintenance, while providing capabilities with little additional effort. Results are presented showing the performance and accuracy of the code for some large scattering problems. Comparisons between triangular faceted and quadrilateral faceted geometry representations will be shown for some complex scatterers.

  16. Multi-phase SPH modelling of violent hydrodynamics on GPUs

    NASA Astrophysics Data System (ADS)

    Mokos, Athanasios; Rogers, Benedict D.; Stansby, Peter K.; Domínguez, José M.

    2015-11-01

    This paper presents the acceleration of multi-phase smoothed particle hydrodynamics (SPH) using a graphics processing unit (GPU) enabling large numbers of particles (10-20 million) to be simulated on just a single GPU card. With novel hardware architectures such as a GPU, the optimum approach to implement a multi-phase scheme presents some new challenges. Many more particles must be included in the calculation and there are very different speeds of sound in each phase with the largest speed of sound determining the time step. This requires efficient computation. To take full advantage of the hardware acceleration provided by a single GPU for a multi-phase simulation, four different algorithms are investigated: conditional statements, binary operators, separate particle lists and an intermediate global function. Runtime results show that the optimum approach needs to employ separate cell and neighbour lists for each phase. The profiler shows that this approach leads to a reduction in both memory transactions and arithmetic operations giving significant runtime gains. The four different algorithms are compared to the efficiency of the optimised single-phase GPU code, DualSPHysics, for 2-D and 3-D simulations which indicate that the multi-phase functionality has a significant computational overhead. A comparison with an optimised CPU code shows a speed up of an order of magnitude over an OpenMP simulation with 8 threads and two orders of magnitude over a single thread simulation. A demonstration of the multi-phase SPH GPU code is provided by a 3-D dam break case impacting an obstacle. This shows better agreement with experimental results than an equivalent single-phase code. The multi-phase GPU code enables a convergence study to be undertaken on a single GPU with a large number of particles that otherwise would have required large high performance computing resources.
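
    The time-step restriction described above (the stiffest phase governs the global step) has a simple form; a one-function sketch with illustrative numbers, assuming the usual acoustic CFL condition for weakly compressible SPH:

      def wcsph_time_step(h, sound_speeds, cfl=0.3):
          """Acoustic CFL: the largest artificial sound speed sets dt."""
          return cfl * h / max(sound_speeds)

      # e.g. water and air phases with assumed artificial sound speeds:
      print(wcsph_time_step(h=0.01, sound_speeds=[40.0, 140.0]))  # air limits dt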

  17. Parallelization Issues and Particle-In-Cell Codes.

    NASA Astrophysics Data System (ADS)

    Elster, Anne Cathrine

    1994-01-01

    "Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field, show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks, have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies, becomes significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache -line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache. A consideration of the input data's effect on

  18. Coding of Class I and II aminoacyl-tRNA synthetases

    PubMed Central

    Carter, Charles W.

    2018-01-01

    The aminoacyl-tRNA synthetases and their cognate transfer RNAs translate the universal genetic code. The twenty canonical amino acids are sufficiently diverse to create a selective advantage for dividing amino acid activation between two distinct, apparently unrelated superfamilies of synthetases, Class I amino acids being generally larger and less polar, Class II amino acids smaller and more polar. Biochemical, bioinformatic, and protein engineering experiments support the hypothesis that the two Classes descended from opposite strands of the same ancestral gene. Parallel experimental deconstructions of Class I and II synthetases reveal parallel losses in catalytic proficiency at two novel modular levels—protozymes and Urzymes—associated with the evolution of catalytic activity. Bi-directional coding supports an important unification of the proteome; affords a genetic relatedness metric—middle base-pairing frequencies in sense/antisense alignments—that probes more deeply into the evolutionary history of translation than do single multiple sequence alignments; and has facilitated the analysis of hitherto unknown coding relationships in tRNA sequences. Reconstruction of native synthetases by modular thermodynamic cycles facilitated by domain engineering emphasizes the subtlety associated with achieving high specificity, shedding new light on allosteric relationships in contemporary synthetases. Synthetase Urzyme structural biology suggests that they are catalytically active molten globules, broadening the potential manifold of polypeptide catalysts accessible to primitive genetic coding and motivating revisions of the origins of catalysis. Finally, bi-directional genetic coding of some of the oldest genes in the proteome places major limitations on the likelihood that any RNA World preceded the origins of coded proteins. PMID:28828732

  19. Modelling multi-phase liquid-sediment scour and resuspension induced by rapid flows using Smoothed Particle Hydrodynamics (SPH) accelerated with a Graphics Processing Unit (GPU)

    NASA Astrophysics Data System (ADS)

    Fourtakas, G.; Rogers, B. D.

    2016-06-01

    A two-phase numerical model using Smoothed Particle Hydrodynamics (SPH) is applied to two-phase liquid-sediment flows. The absence of a mesh in SPH is ideal for interfacial and highly non-linear flows with changing fragmentation of the interface, mixing and resuspension. The rheology of sediment induced under rapid flows passes through several states which are only partially described by previous research in SPH. This paper attempts to bridge the gap between geotechnics, non-Newtonian and Newtonian flows by proposing a model that combines the yielding, shear and suspension layers which are needed to predict accurately the global erosion phenomena from a hydrodynamics perspective. The numerical SPH scheme is based on the explicit treatment of both phases using Newtonian and non-Newtonian Bingham-type Herschel-Bulkley-Papanastasiou constitutive models. This is supplemented by the Drucker-Prager yield criterion to predict the onset of yielding of the sediment surface and by a concentration suspension model. The multi-phase model has been compared with experimental and 2-D reference numerical models for scour following a dry-bed dam break, yielding satisfactory results and improvements over well-known SPH multi-phase models. With 3-D simulations requiring a large number of particles, the code is accelerated with a graphics processing unit (GPU) in the open-source DualSPHysics code. The implementation and optimisation of the code achieved a speed-up of 58× over an optimised single-threaded serial code. A 3-D dam break over a non-cohesive erodible bed simulation with over 4 million particles yields close agreement with experimental scour and water surface profiles.
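
    The Herschel-Bulkley-Papanastasiou constitutive model named above has a compact apparent-viscosity form; a sketch with illustrative parameters (not the paper's calibrated values) shows the behavior the model relies on: a quasi-rigid response below yield and shear-thinning flow above it.

      import math

      def hbp_viscosity(gamma_dot, k, n, tau_y, m):
          """Herschel-Bulkley-Papanastasiou apparent viscosity.
          k, n: consistency and flow indices; tau_y: yield stress;
          m: Papanastasiou regularization parameter."""
          gd = max(gamma_dot, 1e-12)               # guard against division by zero
          return k * gd ** (n - 1.0) + tau_y * (1.0 - math.exp(-m * gd)) / gd

      for gd in (1e-3, 1.0, 100.0):                # low to high shear rate
          print(gd, hbp_viscosity(gd, k=1.0, n=0.6, tau_y=50.0, m=100.0))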

  20. Implementation of a 3D mixing layer code on parallel computers

    NASA Technical Reports Server (NTRS)

    Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.

    1995-01-01

    This paper summarizes our progress and experience in the development of a computational fluid dynamics code on parallel computers to simulate three-dimensional, spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique; we have not yet been able to compile the code with the present version of HPF compilers.

  1. New Bandwidth Efficient Parallel Concatenated Coding Schemes

    NASA Technical Reports Server (NTRS)

    Denedetto, S.; Divsalar, D.; Montorsi, G.; Pollara, F.

    1996-01-01

    We propose a new solution to the parallel concatenation of trellis codes with multilevel amplitude/phase modulations and a suitable iterative decoding structure. Examples are given for a throughput of 2 bits/sec/Hz with 8PSK and 16QAM signal constellations.

  2. Smoothed Particle Hydrodynamic Simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-10-05

    This code is a highly modular framework for developing smoothed particle hydrodynamic (SPH) simulations running on parallel platforms. The compartmentalization of the code allows for rapid development of new SPH applications and modifications of existing algorithms. It also allows changes made in one part of the code that is shared by many applications to become instantly available to all of them.

  3. Parallelization of KENO-Va Monte Carlo code

    NASA Astrophysics Data System (ADS)

    Ramón, Javier; Peña, Jorge

    1995-07-01

    KENO-Va is a code integrated within the SCALE system developed by Oak Ridge that solves the transport equation through the Monte Carlo method. It is being used at the Consejo de Seguridad Nuclear (CSN) to perform criticality calculations for fuel storage pools and shipping casks. Two parallel versions of the code have been generated: one for shared-memory machines and another for distributed-memory systems using the message-passing interface PVM. In both versions the neutrons of each generation are tracked in parallel. In order to preserve the reproducibility of the results in both versions, advanced seeds for random numbers were used. The CONVEX C3440 with four processors and shared memory at CSN was used to implement the shared-memory version. An FDDI network of 6 HP9000/735 machines was employed to implement the message-passing version using proprietary PVM. The speedup obtained was 3.6 in both cases.

  4. Gas stripping in galaxy clusters: a new SPH simulation approach

    NASA Astrophysics Data System (ADS)

    Jáchym, P.; Palouš, J.; Köppen, J.; Combes, F.

    2007-09-01

    Aims: The influence of a time-varying ram pressure on spiral galaxies in clusters is explored with a new simulation method based on the N-body SPH/tree code GADGET. Methods: We have adapted the code to describe the interaction of two different gas phases, the diffuse hot intracluster medium (ICM) and the denser and colder interstellar medium (ISM). Both the ICM and ISM components are introduced as SPH particles. As a galaxy arrives on a highly radial orbit from the outskirts to the cluster center, it crosses the ICM density peak and experiences a time-varying wind. Results: Depending on the duration and intensity of the ISM-ICM interaction, early and late type galaxies in galaxy clusters with either a large or small ICM distribution are found to show different stripping efficiencies, amounts of reaccretion of the extra-planar ISM, and final masses. We compare the numerical results with analytical approximations of different complexity and indicate the limits of the simple Gunn & Gott stripping formula. Conclusions: Our investigations emphasize the role of the galactic orbital history in the amount of stripping. We discuss the contribution of ram pressure stripping to the origin of the ICM and its metallicity. We propose gas accumulations such as tails, filaments, or ripples to be responsible for stripping in regions with low overall ICM occurrence. Appendix A is only available in electronic form at http://www.aanda.org
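
    The "simple stripping formula" referred to above is the Gunn & Gott (1972) criterion: gas is removed where the ram pressure exceeds the disc's gravitational restoring force per unit area. A quick numerical sketch, with all of the input values assumed purely for illustration:

      import math

      G = 6.674e-11                        # m^3 kg^-1 s^-2

      def is_stripped(rho_icm, v, sigma_star, sigma_gas):
          """Gunn & Gott: rho_ICM * v^2 > 2 * pi * G * Sigma_star * Sigma_gas."""
          return rho_icm * v**2 > 2.0 * math.pi * G * sigma_star * sigma_gas

      rho_icm = 1.7e-24     # kg/m^3, ~1e-3 protons per cm^3 (assumed)
      v = 1.0e6             # m/s, ~1000 km/s orbital velocity (assumed)
      sigma_star = 0.10     # kg/m^2, ~50 Msun/pc^2 stellar surface density (assumed)
      sigma_gas = 0.02      # kg/m^2, ~10 Msun/pc^2 gas surface density (assumed)
      print(is_stripped(rho_icm, v, sigma_star, sigma_gas))   # True for these values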

  5. RAM simulation model for SPH/RSV systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schryver, J.C.; Primm, A.H.; Nelson, S.C.

    1995-12-31

    The US Army's Project Manager, Crusader, is sponsoring the development of technologies that apply to the Self-Propelled Howitzer (SPH), formerly the Advanced Field Artillery System (AFAS), and Resupply Vehicle (RSV), formerly the Future Armored Resupply Vehicle (FARV), weapon system. Oak Ridge National Laboratory (ORNL) is currently performing developmental work in support of the SPH/RSV Crusader system. Supportive analyses of reliability, availability, and maintainability (RAM) aspects were also performed for the SPH/RSV effort. During FY 1994 and FY 1995 ORNL conducted a feasibility study to demonstrate the application of simulation modeling for RAM analysis of the Crusader system. Following completion of the feasibility study, a full-scale RAM simulation model of the Crusader system was developed for both the SPH and RSV. This report provides documentation for the simulation model as well as instructions for the proper execution and utilization of the model for the conduct of RAM analyses.

  6. Cloning and characterization of a shrimp clip domain serine protease homolog (c-SPH) as a cell adhesion molecule.

    PubMed

    Lin, Chun-Yu; Hu, Kuang-Yu; Ho, Shih-Hu; Song, Yen-Ling

    2006-01-01

    Clip domain serine protease homologs (c-SPHs) are involved in various innate immune functions in arthropods, such as antimicrobial activity, cell adhesion, pattern recognition, opsonization, and regulation of the prophenoloxidase system. In the present study, we cloned a c-SPH cDNA from tiger shrimp (Penaeus monodon) hemocytes. It is 1337 bp in length with a coding region of 1068 bp encoding a protein of 355 amino acid residues. The deduced protein includes one clip domain and one catalytically inactive serine protease-like (SP-like) domain. Its molecular weight is estimated to be 38 kDa with an isoelectric point of 7.9. The predicted cleavage site of the signal peptide is located between Gly(21) and Gln(22). We aligned 15 single clip domain SPH protein sequences from 12 arthropod species; the identity of the clip domains is low, and that of the SP-like domains ranges from 34% to 46%. The conserved regions are located near the amino acid residues which serve as substrate interaction sites in catalytically active serine proteases. Phylogenetically, the tiger shrimp c-SPH is most similar to a low molecular mass masquerade-like protein of crayfish, but less similar to c-SPHs in Chelicerata and Insecta. Nested reverse transcription polymerase chain reaction (RT-PCR) revealed that c-SPH mRNA is expressed most in tissues with the highest hemocyte abundance. Antimicrobial and opsonization activities of the molecule were not detected. The expression of c-SPH mRNA in hemocytes was up-regulated at 12 days post beta-glucan immersion. Recombinant c-SPH could significantly enhance hemocyte adhesion. These results suggest that the shrimp c-SPH protein plays a role in innate immunity.

  7. Performance Analysis and Optimization on the UCLA Parallel Atmospheric General Circulation Model Code

    NASA Technical Reports Server (NTRS)

    Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos

    1996-01-01

    An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, its interprocessor communication cost, its load balance, and issues affecting single-node code performance are discussed.

  8. VINE-A NUMERICAL CODE FOR SIMULATING ASTROPHYSICAL SYSTEMS USING PARTICLES. I. DESCRIPTION OF THE PHYSICS AND THE NUMERICAL METHODS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wetzstein, M.; Nelson, Andrew F.; Naab, T.

    2009-10-01

    We present a numerical code for simulating the evolution of astrophysical systems using particles to represent the underlying fluid flow. The code is written in Fortran 95 and is designed to be versatile, flexible, and extensible, with modular options that can be selected either at the time the code is compiled or at run time through a text input file. We include a number of general purpose modules describing a variety of physical processes commonly required in the astrophysical community and we expect that the effort required to integrate additional or alternate modules into the code will be small. In its simplest form the code can evolve the dynamical trajectories of a set of particles in two or three dimensions using a module which implements either a Leapfrog or Runge-Kutta-Fehlberg integrator, selected by the user at compile time. The user may choose to allow the integrator to evolve the system using individual time steps for each particle or with a single, global time step for all. Particles may interact gravitationally as N-body particles, and all or any subset may also interact hydrodynamically, using the smoothed particle hydrodynamic (SPH) method by selecting the SPH module. A third particle species can be included with a module to model massive point particles which may accrete nearby SPH or N-body particles. Such particles may be used to model, e.g., stars in a molecular cloud. Free boundary conditions are implemented by default, and a module may be selected to include periodic boundary conditions. We use a binary 'Press' tree to organize particles for rapid access in gravity and SPH calculations. Modules implementing an interface with special purpose 'GRAPE' hardware may also be selected to accelerate the gravity calculations. If available, forces obtained from the GRAPE coprocessors may be transparently substituted for those obtained from the tree, or both tree and GRAPE may be used as a combination GRAPE/tree code. The code may be run without
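
    VINE's integrator modules are written in Fortran 95 and are not reproduced in the abstract; as a minimal stand-in for the Leapfrog option with a single global time step, here is a kick-drift-kick sketch in Python, tested on a circular Kepler orbit (units with GM = 1 are assumed):

      import numpy as np

      def leapfrog(pos, vel, accel_fn, dt, n_steps):
          """Kick-drift-kick leapfrog with one global time step."""
          acc = accel_fn(pos)
          for _ in range(n_steps):
              vel += 0.5 * dt * acc        # half kick
              pos += dt * vel              # drift
              acc = accel_fn(pos)
              vel += 0.5 * dt * acc        # half kick
          return pos, vel

      accel = lambda x: -x / np.linalg.norm(x) ** 3    # Kepler force, GM = 1
      p, v = leapfrog(np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                      accel, dt=1e-3, n_steps=10_000)
      print(np.linalg.norm(p))             # stays near 1 on the circular orbit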

  9. Vine—A Numerical Code for Simulating Astrophysical Systems Using Particles. I. Description of the Physics and the Numerical Methods

    NASA Astrophysics Data System (ADS)

    Wetzstein, M.; Nelson, Andrew F.; Naab, T.; Burkert, A.

    2009-10-01

    We present a numerical code for simulating the evolution of astrophysical systems using particles to represent the underlying fluid flow. The code is written in Fortran 95 and is designed to be versatile, flexible, and extensible, with modular options that can be selected either at the time the code is compiled or at run time through a text input file. We include a number of general purpose modules describing a variety of physical processes commonly required in the astrophysical community and we expect that the effort required to integrate additional or alternate modules into the code will be small. In its simplest form the code can evolve the dynamical trajectories of a set of particles in two or three dimensions using a module which implements either a Leapfrog or Runge-Kutta-Fehlberg integrator, selected by the user at compile time. The user may choose to allow the integrator to evolve the system using individual time steps for each particle or with a single, global time step for all. Particles may interact gravitationally as N-body particles, and all or any subset may also interact hydrodynamically, using the smoothed particle hydrodynamic (SPH) method by selecting the SPH module. A third particle species can be included with a module to model massive point particles which may accrete nearby SPH or N-body particles. Such particles may be used to model, e.g., stars in a molecular cloud. Free boundary conditions are implemented by default, and a module may be selected to include periodic boundary conditions. We use a binary "Press" tree to organize particles for rapid access in gravity and SPH calculations. Modules implementing an interface with special purpose "GRAPE" hardware may also be selected to accelerate the gravity calculations. If available, forces obtained from the GRAPE coprocessors may be transparently substituted for those obtained from the tree, or both tree and GRAPE may be used as a combination GRAPE/tree code. The code may be run without

  10. ANNarchy: a code generation approach to neural simulations on parallel hardware

    PubMed Central

    Vitay, Julien; Dinkelbach, Helge Ü.; Hamker, Fred H.

    2015-01-01

    Many modern neural simulators focus on the simulation of networks of spiking neurons on parallel hardware. Another important framework in computational neuroscience, rate-coded neural networks, is mostly difficult or impossible to implement using these simulators. We present here the ANNarchy (Artificial Neural Networks architect) neural simulator, which allows one to easily define and simulate rate-coded and spiking networks, as well as combinations of both. The interface in Python has been designed to be close to the PyNN interface, while the definition of neuron and synapse models can be specified using an equation-oriented mathematical description similar to the Brian neural simulator. This information is used to generate C++ code that will efficiently perform the simulation on the chosen parallel hardware (multi-core system or graphical processing unit). Several numerical methods are available to transform ordinary differential equations into efficient C++ code. We compare the parallel performance of the simulator to existing solutions. PMID:26283957
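
    ANNarchy's actual generator is far more elaborate, but the core idea, turning an equation string into compiled update code, fits in a toy function. The sketch below is entirely illustrative (neither the function nor the emitted C++ is ANNarchy's): it emits an explicit-Euler update loop for a rate equation.

      def generate_update_kernel(var, rhs, dt):
          """Emit C++ that integrates d{var}/dt = {rhs} with explicit Euler."""
          return f"""
      void update(double* {var}, const double* inp, int n) {{
          const double dt = {dt};
          for (int i = 0; i < n; ++i) {{
              double d{var} = {rhs};        // user-supplied right-hand side
              {var}[i] += dt * d{var};      // explicit Euler step
          }}
      }}"""

      # e.g. a leaky rate neuron dr/dt = -r + input:
      print(generate_update_kernel("r", "-r[i] + inp[i]", 1e-4))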

  11. Performance of a parallel thermal-hydraulics code TEMPEST

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fann, G.I.; Trent, D.S.

    The authors describe the parallelization of the TEMPEST thermal-hydraulics code. The serial version of this code is used for production-quality 3-D thermal-hydraulics simulations. Good speedup was obtained with a parallel diagonally preconditioned BiCGStab non-symmetric linear solver, using a spatial domain decomposition approach for the semi-iterative pressure-based and mass-conserved algorithm. The test case used here to illustrate the performance of the BiCGStab solver is a 3-D natural convection problem modeled using finite volume discretization in cylindrical coordinates. The BiCGStab solver replaced the LSOR-ADI method for solving the pressure equation in TEMPEST. BiCGStab also solves the coupled thermal energy equation. Scaling performance for 3 problem sizes (221220 nodes, 358120 nodes, and 701220 nodes) is presented. These problems were run on 2 different parallel machines: an IBM-SP and an SGI PowerChallenge. The largest problem attains a speedup of 68 on a 128-processor IBM-SP. In real terms, this is over 34 times faster than the fastest serial production time using the LSOR-ADI solver.

  12. SPH simulation of free surface flow over a sharp-crested weir

    NASA Astrophysics Data System (ADS)

    Ferrari, Angela

    2010-03-01

    In this paper the numerical simulation of a free surface flow over a sharp-crested weir is presented. Since in this case the usual shallow water assumptions are not satisfied, we propose to solve the problem using the full weakly compressible Navier-Stokes equations with the Tait equation of state for water. The numerical method used is the new meshless Smoothed Particle Hydrodynamics (SPH) formulation proposed by Ferrari et al. (2009) [8], which accurately tracks the free surface profile and provides monotone pressure fields. Thus, the unsteady evolution of the complex moving material interface (free surface) can be properly resolved. The simulations, involving about half a million fluid particles, have been run in parallel on two of the most powerful High Performance Computing (HPC) facilities in Europe. The validation of the results has been carried out by analysing the pressure field and comparing the free surface profiles obtained with the SPH scheme with experimental measurements available in the literature [18]. A very good quantitative agreement has been obtained.
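
    The Tait equation of state mentioned above has a simple closed form; a sketch with the common stiffness exponent γ = 7 (the reference density and numerical sound speed below are illustrative choices, not the paper's):

      def tait_pressure(rho, rho0=1000.0, c0=50.0, gamma=7.0):
          """Weakly compressible Tait EOS: p = B((rho/rho0)^gamma - 1)."""
          B = rho0 * c0**2 / gamma          # pressure stiffness constant
          return B * ((rho / rho0) ** gamma - 1.0)

      # A 1% density fluctuation already produces a large pressure response,
      # which is what keeps the flow nearly incompressible:
      print(tait_pressure(1010.0))          # ~2.6e4 Pa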

  13. Au133(SPh-tBu)52 nanomolecules: X-ray crystallography, optical, electrochemical, and theoretical analysis.

    PubMed

    Dass, Amala; Theivendran, Shevanuja; Nimmala, Praneeth Reddy; Kumara, Chanaka; Jupally, Vijay Reddy; Fortunelli, Alessandro; Sementa, Luca; Barcaro, Giovanni; Zuo, Xiaobing; Noll, Bruce C

    2015-04-15

    Crystal structure determination has revolutionized modern science in biology, chemistry, and physics. However, the difficulty in obtaining periodic crystal lattices which are needed for X-ray crystal analysis has hindered the determination of atomic structure in nanomaterials, known as the "nanostructure problem". Here, by using rigid and bulky ligands, we have overcome this limitation and successfully solved the X-ray crystallographic structure of the largest reported thiolated gold nanomolecule, Au133S52. The total composition, Au133(SPh-tBu)52, was verified using high resolution electrospray ionization mass spectrometry (ESI-MS). The experimental and simulated optical spectra show an emergent surface plasmon resonance that is more pronounced than in the slightly larger Au144(SCH2CH2Ph)60. Theoretical analysis indicates that the presence of rigid and bulky ligands is the key to the successful crystal formation.

  14. Scalability study of parallel spatial direct numerical simulation code on IBM SP1 parallel supercomputer

    NASA Technical Reports Server (NTRS)

    Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad

    1994-01-01

    The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.

  15. Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew (Editor); Kerr, Christopher L. (Editor)

    1998-01-01

    This report contains the abstracts and technical papers from the Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications, held June 15-18, 1998, in Scottsdale, Arizona. The purpose of the workshop is to bring together software developers in meteorology and oceanography to discuss software engineering and code design issues for parallel architectures, including Massively Parallel Processors (MPP's), Parallel Vector Processors (PVP's), Symmetric Multi-Processors (SMP's), Distributed Shared Memory (DSM) multi-processors, and clusters. Issues to be discussed include: (1) code architectures for current parallel models, including basic data structures, storage allocation, variable naming conventions, coding rules and styles, i/o and pre/post-processing of data; (2) designing modular code; (3) load balancing and domain decomposition; (4) techniques that exploit parallelism efficiently yet hide the machine-related details from the programmer; (5) tools for making the programmer more productive; and (6) the proliferation of programming models (F--, OpenMP, MPI, and HPF).

  16. Composing Data Parallel Code for a SPARQL Graph Engine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste

    Big data analytics processes large amounts of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 shared-memory multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces the memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.

  17. Water Flow Simulation using Smoothed Particle Hydrodynamics (SPH)

    NASA Technical Reports Server (NTRS)

    Vu, Bruce; Berg, Jared; Harris, Michael F.

    2014-01-01

    Simulation of water flow from the rainbird nozzles has been accomplished using Smoothed Particle Hydrodynamics (SPH). The advantage of using SPH is that no meshing is required; thus grid quality is no longer an issue and accuracy can be improved.

  18. VINE-A NUMERICAL CODE FOR SIMULATING ASTROPHYSICAL SYSTEMS USING PARTICLES. II. IMPLEMENTATION AND PERFORMANCE CHARACTERISTICS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, Andrew F.; Wetzstein, M.; Naab, T.

    2009-10-01

    We continue our presentation of VINE. In this paper, we begin with a description of relevant architectural properties of the serial and shared memory parallel computers on which VINE is intended to run, and describe their influences on the design of the code itself. We continue with a detailed description of a number of optimizations made to the layout of the particle data in memory and to our implementation of a binary tree used to access that data for use in gravitational force calculations and searches for smoothed particle hydrodynamics (SPH) neighbor particles. We describe the modifications to the code necessary to obtain forces efficiently from special purpose 'GRAPE' hardware, the interfaces required to allow transparent substitution of those forces in the code instead of those obtained from the tree, and the modifications necessary to use both tree and GRAPE together as a fused GRAPE/tree combination. We conclude with an extensive series of performance tests, which demonstrate that the code can be run efficiently and without modification in serial on small workstations or in parallel using the OpenMP compiler directives on large-scale, shared memory parallel machines. We analyze the effects of the code optimizations and estimate that they improve its overall performance by more than an order of magnitude over that obtained by many other tree codes. Scaled parallel performance of the gravity and SPH calculations, together the most costly components of most simulations, is nearly linear up to at least 120 processors on moderate sized test problems using the Origin 3000 architecture, and to the maximum machine sizes available to us on several other architectures. At similar accuracy, performance of VINE, used in GRAPE-tree mode, is approximately a factor 2 slower than that of VINE, used in host-only mode. Further optimizations of the GRAPE/host communications could improve the speed by as much as a factor of 3, but have not yet been implemented in VINE.

  19. Parallel processing a three-dimensional free-lagrange code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.A.; Trease, H.E.

    1989-01-01

    A three-dimensional, time-dependent free-Lagrange hydrodynamics code has been multitasked and autotasked on a CRAY X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the CRAY multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The three-dimensional algorithm has presented a number of problems that simpler algorithms, such as those for one-dimensional hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a CRAY-1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given.
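
    The theoretical speedups referred to above come from Amdahl's law, which is a one-liner; a sketch (the 5% serial fraction is an illustrative value, not the code's measured one):

      def amdahl_speedup(serial_fraction, n_procs):
          """Amdahl's law: speedup when a fixed fraction must run serially."""
          return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

      # Even 5% serial work caps a 4-processor run well below 4x:
      print(amdahl_speedup(0.05, 4))        # ~3.48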

  20. Au133(SPh-tBu)52 Nanomolecules: X-ray Crystallography, Optical, Electrochemical, and Theoretical Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dass, Amala; Theivendran, Shevanuja; Nimmala, Praneeth Reddy

    2015-04-15

    Crystal structure determination has revolutionized modern science in biology, chemistry, and physics. However, the difficulty in obtaining periodic crystal lattices which are needed for X-ray crystal analysis has hindered the determination of atomic structure in nanomaterials, known as the “nanostructure problem”. Here, by using rigid and bulky ligands, we have overcome this limitation and successfully solved the X-ray crystallographic structure of the largest reported thiolated gold nanomolecule, Au133S52. The total composition, Au133(SPh-tBu)52, was verified using high resolution electrospray ionization mass spectrometry (ESI-MS). The experimental and simulated optical spectra show an emergent surface plasmon resonance that is more pronounced than in the slightly larger Au144(SCH2CH2Ph)60. Theoretical analysis indicates that the presence of rigid and bulky ligands is the key to the successful crystal formation.

  2. Kinetically controlled synthesis of Au102(SPh)44 nanoclusters and catalytic application

    NASA Astrophysics Data System (ADS)

    Chen, Yongdong; Wang, Jin; Liu, Chao; Li, Zhimin; Li, Gao

    2016-05-01

    We here explore a kinetically controlled synthetic protocol for preparing solvent-solvable Au102(SPh)44 nanoclusters which are isolated from polydispersed gold nanoclusters by solvent extraction and size exclusion chromatography (SEC). The as-obtained Au102(SPh)44 nanoclusters are determined by matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) mass spectrometry, in conjunction with UV-vis spectroscopy and thermogravimetric analysis (TGA). However, Au99(SPh)42, instead of Au102(SPh)44, is yielded when the polydispersed gold nanoclusters are etched in the presence of excess thiophenol under thermal conditions (e.g., 80 °C). Interestingly, the Au102(SPh)44 nanoclusters also can convert to Au99(SPh)42 with equivalent thiophenol ligands, evidenced by the analyses of UV-vis and MALDI mass spectrometry. Finally, the TiO2-supported Au102(SPh)44 nanocluster catalyst is investigated in the selective oxidation of sulfides into sulfoxides by the PhIO oxidant and gives rise to high catalytic activity (e.g., 80-99% conversion of R-S-R' sulfides with 96-99% selectivity for R-S(=O)-R' sulfoxides). The Au102(SPh)44/TiO2 catalyst also shows excellent recyclability in the sulfoxidation process.

  3. Characterization of a serine proteinase homologous (SPH) in Chinese mitten crab Eriocheir sinensis.

    PubMed

    Qin, Chuanjie; Chen, Liqiao; Qin, Jian G; Zhao, Daxian; Zhang, Hao; Wu, Ping; Li, Erchao

    2010-01-01

    The serine protease homolog (SPH) is an important cofactor of prophenoloxidase-activating enzyme (PPAE). The SPH gene of the Chinese mitten crab Eriocheir sinensis (EsSPH) was cloned from hemocytes and characterized using reverse transcription polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). The SPH cDNA consisted of 1386 bp with an open reading frame (ORF) encoding a protein of 378 amino acids, a 154 bp 5'-untranslated region, and a 95 bp 3'-untranslated region. Sequence comparisons against the GenBank database showed that the EsSPH deduced amino acid sequence had an overall identity of 41% to 70% with serine protease family genes from 15 invertebrate species. The protein had the structural characteristics of an SPH, including the conserved six cysteine residues in the N-terminal clip domain and the functional residues (His157, Asp209, Gly311) in the C-terminal serine proteinase-like domain. To analyze the role of EsSPH in an acute infection, the temporal expression of the EsSPH gene after an Aeromonas hydrophila challenge was measured by real-time RT-PCR. The EsSPH transcripts in hemocytes significantly increased at 6 h, 12 h and 48 h after the A. hydrophila injection. This expression pattern shows that EsSPH has the potential to defend against invading microorganisms. The mRNA transcripts of EsSPH were detected in all tissues, with the highest level in the hepatopancreas. Interestingly, the mRNA transcripts of EsSPH and proPO were found in ova and expressed in oosperms, suggesting that maternal transfer of EsSPH and proPO may exist in this crab, but this warrants confirmation in further research.

  4. Parallel Grand Canonical Monte Carlo (ParaGrandMC) Simulation Code

    NASA Technical Reports Server (NTRS)

    Yamakov, Vesselin I.

    2016-01-01

    This report provides an overview of the Parallel Grand Canonical Monte Carlo (ParaGrandMC) simulation code. ParaGrandMC is a highly scalable parallel FORTRAN code for simulating the thermodynamic evolution of metal alloy systems at the atomic level, and predicting their thermodynamic state, phase diagram, chemical composition, and mechanical properties. The code is designed to simulate multi-component alloy systems and to predict solid-state phase transformations such as austenite-martensite transformations, precipitate formation, recrystallization, capillary effects at interfaces, surface adsorption, etc., which can aid the design of novel metallic alloys. While the software is mainly tailored for modeling metal alloys, it can also be used for other types of solid-state systems, and to some degree for liquid or gaseous systems, including multiphase systems forming solid-liquid-gas interfaces.
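
    As a rough, hypothetical sketch of the kind of move such a code performs (not ParaGrandMC's actual algorithm; energy_of and all parameters are placeholders), the snippet below implements one semi-grand-canonical Monte Carlo step for a binary alloy on a fixed lattice: a randomly chosen atom is tentatively transmuted to the other species and the move is accepted with the Metropolis probability, which includes the chemical-potential difference between the species.

        import math, random

        def sgc_mc_step(species, energy_of, d_mu, k_T):
            """One semi-grand-canonical step for a binary alloy (species values 0/1).

            energy_of(species) -> total energy of the configuration (user supplied);
            d_mu = mu_1 - mu_0 is the chemical-potential difference."""
            i = random.randrange(len(species))
            e_old = energy_of(species)
            old = species[i]
            species[i] = 1 - old                    # tentative transmutation
            d_e = energy_of(species) - e_old
            d_n1 = 1 if old == 0 else -1            # change in number of species-1 atoms
            # accept with probability min(1, exp(-(dE - d_mu*dN1)/kT)); reject otherwise
            if random.random() >= math.exp(min(0.0, -(d_e - d_mu * d_n1) / k_T)):
                species[i] = old                    # reject: restore old identity
            return species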

  5. A smooth particle hydrodynamics code to model collisions between solid, self-gravitating objects

    NASA Astrophysics Data System (ADS)

    Schäfer, C.; Riecker, S.; Maindl, T. I.; Speith, R.; Scherrer, S.; Kley, W.

    2016-05-01

    Context. Modern graphics processing units (GPUs) lead to a major increase in the performance of the computation of astrophysical simulations. Owing to the different nature of GPU architecture compared to traditional central processing units (CPUs) such as the x86 architecture, existing numerical codes cannot be easily migrated to run on GPUs. Here, we present a new implementation of the numerical method smooth particle hydrodynamics (SPH) using CUDA and the first astrophysical application of the new code: the collision between Ceres-sized objects. Aims: The new code allows for a tremendous increase in the speed of astrophysical simulations with SPH and self-gravity, at low cost for new hardware. Methods: We have implemented the SPH equations to model gases, liquids, and elastic and plastic solid bodies, and added a fragmentation model for brittle materials. Self-gravity may be optionally included in the simulations and is treated by the use of a Barnes-Hut tree. Results: We find an impressive performance gain using NVIDIA consumer devices compared to our existing OpenMP code. The new code, miluphCUDA, is the CUDA port of miluph and is freely available to the community upon request; interested users should write an email to Christoph Schäfer. We do not support the use of the code for military purposes.
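
    The SPH density estimate that underlies codes like this one sums a smoothing kernel over neighbors. Below is a minimal, self-contained Python sketch (not taken from miluphCUDA) of the density summation with the standard cubic spline kernel in 3D; a real code would restrict the sum to tree-found neighbors rather than looping over all pairs.

        import numpy as np

        def cubic_spline_w(r, h):
            """Standard 3D cubic spline kernel with support radius 2h."""
            sigma = 1.0 / (np.pi * h**3)
            q = r / h
            w = np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))
            return sigma * w

        def sph_density(pos, mass, h):
            """rho_i = sum_j m_j W(|r_i - r_j|, h); O(N^2) for clarity only."""
            diff = pos[:, None, :] - pos[None, :, :]
            r = np.linalg.norm(diff, axis=-1)
            return (mass[None, :] * cubic_spline_w(r, h)).sum(axis=1)

        pos = np.random.rand(200, 3)
        rho = sph_density(pos, mass=np.full(200, 1.0 / 200), h=0.1)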

  6. Performance of a parallel code for the Euler equations on hypercube computers

    NASA Technical Reports Server (NTRS)

    Barszcz, Eric; Chan, Tony F.; Jesperson, Dennis C.; Tuminaro, Raymond S.

    1990-01-01

    The performance of hypercubes was evaluated on a computational fluid dynamics problem, and the parallel environment issues that must be addressed were considered, such as algorithm changes, implementation choices, programming effort, and programming environment. The evaluation focuses on a widely used fluid dynamics code, FLO52, which solves the two-dimensional steady Euler equations describing flow around an airfoil. The code development experience is described, including interacting with the operating system, utilizing the message-passing communication system, and code modifications necessary to increase parallel efficiency. Results from two hypercube parallel computers (a 16-node iPSC/2 and a 512-node NCUBE/ten) are discussed and compared. In addition, a mathematical model of the execution time was developed as a function of several machine and algorithm parameters. This model accurately predicts the actual run times obtained and is used to explore the performance of the code in interesting but physically realizable regions of the parameter space. Based on this model, predictions about future hypercubes are made.
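
    As a hedged illustration of such an execution-time model (the functional form, names, and constants here are generic assumptions, not the paper's fitted model), a common ansatz for a 2D grid-based solver on p processors combines per-processor compute with per-step communication:

        def predicted_time(n_cells, n_steps, p, t_flop, flops_per_cell,
                           t_latency, t_per_word, words_per_boundary_cell):
            """T(p) = steps * [compute(N/p) + message latency + bandwidth term]."""
            compute = (n_cells / p) * flops_per_cell * t_flop
            boundary = (n_cells / p) ** 0.5      # 2D partition: halo ~ sqrt(local cells)
            comm = t_latency + boundary * words_per_boundary_cell * t_per_word
            return n_steps * (compute + comm)

        # Example: explore scaling from 1 to 512 processors with made-up constants
        for p in (1, 16, 512):
            print(p, predicted_time(1e6, 100, p, 1e-7, 50, 1e-4, 1e-6, 4))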

  7. Self-perception of health (SPH) in the oldest-old subjects.

    PubMed

    Zikic, L; Jankelic, S; Milosevic, D P; Despotovic, N; Erceg, P; Davidovic, M

    2009-01-01

    SPH is a subjective and objective assessment of personal health. It is important in the evaluation of health status in the elderly, as it has the capacity to predict mortality, functional decline, and health-care demands. Much research has been published about SPH in the elderly, but little is known about SPH in the very old, especially in comparison with the "younger-old" (YO) population. The study aimed to investigate SPH in 240 elderly patients and compare the data between the "oldest-old" (OO) (aged ≥ 90 years; n=52) and the YO (aged 60-74 years; n=188) subjects. Results showed that the OO group of patients had better SPH than their YO counterparts. Our findings suggest that very old persons belong to a special sub-group of the elderly, the "successfully aged", probably due to their genetic stability, distinctive lifestyle, or both.

  8. Implementation of the SPH Procedure Within the MOOSE Finite Element Framework

    NASA Astrophysics Data System (ADS)

    Laurier, Alexandre

    The goal of this thesis was to implement the SPH homogenization procedure within the MOOSE finite element framework at INL. Before this project, INL relied on DRAGON for its SPH homogenization, which was not flexible enough for its needs. The SPH procedure was therefore implemented for the neutron diffusion equation with the traditional, Selengut, and true Selengut normalizations. Another aspect of this research was to derive the SPH-corrected neutron transport equations and implement them in the same framework. Following in the footsteps of other articles, this feature was implemented and tested successfully with both the PN and SN transport calculation schemes. Although the results obtained for the power distribution in PWR assemblies show no advantage over the use of the SPH diffusion equation, we believe the inclusion of this transport correction will allow for better results in cases where either PN or SN is required. An additional aspect of this research was the implementation of a novel way of solving the non-linear SPH problem. Traditionally, this was done through a Picard fixed-point iterative process, whereas the new implementation relies on MOOSE's Preconditioned Jacobian-Free Newton Krylov (PJFNK) method to solve the non-linear problem directly. This novel implementation decreased calculation time by a factor of up to 50 and generated SPH factors that correspond to those obtained through a fixed-point iterative process with a very tight convergence criterion (epsilon < 10^-8). The use of the PJFNK SPH procedure also allows convergence to be reached in problems containing important reflector regions and void boundary conditions, something the traditional SPH method has never been able to achieve. At times when the PJFNK method cannot reach convergence on the SPH problem, a hybrid method is used whereby the traditional SPH iteration forces the initial condition to be within the radius of convergence of the Newton method.
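
    For context, the traditional fixed-point (Picard) SPH iteration the thesis replaces can be sketched as follows. This is a generic, hypothetical skeleton under one common convention for SPH factors (cross sections scaled so the homogeneous flux reproduces the heterogeneous reference); solve_diffusion is a placeholder for the homogenized diffusion solve, not a MOOSE API.

        import numpy as np

        def picard_sph(sigma, phi_ref, solve_diffusion, tol=1e-8, max_iter=200):
            """Fixed-point iteration for SPH factors mu on homogenized cross sections.

            sigma:   homogenized cross sections per region (array)
            phi_ref: reference heterogeneous region-averaged fluxes (array)
            solve_diffusion(sigma_corrected) -> homogeneous region-averaged fluxes
            """
            mu = np.ones_like(phi_ref)
            for _ in range(max_iter):
                phi_hom = solve_diffusion(sigma * mu)    # SPH-corrected cross sections
                mu_new = mu * phi_ref / phi_hom          # push fluxes toward the reference
                if np.max(np.abs(mu_new - mu)) < tol:    # converged: phi_hom == phi_ref
                    return mu_new
                mu = mu_new
            raise RuntimeError("SPH Picard iteration did not converge")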

  9. SPH for impact force and ricochet behavior of water-entry bodies

    NASA Astrophysics Data System (ADS)

    Omidvar, Pourya; Farghadani, Omid; Nikeghbali, Pooyan

    The numerical modeling of fluid interaction with a bouncing body has many applications in science and engineering. In this paper, the problem of the water impact of a body on a free surface is investigated, where the fixed ghost boundary condition is added to the open source code SPHysics2D to rectify the oscillations in pressure distributions obtained with the repulsive boundary condition. First, after introducing the SPH methodology and the boundary condition options, the still water problem is simulated using the two types of boundary conditions. It is shown that the fixed ghost boundary condition gives a better result for the hydrostatic pressure. Then the dam-break problem, which is a benchmark test case in SPH, is simulated and compared with available data. To show the behavior of the hydrostatic forces on bodies, a fixed/floating cylinder is placed on the free surface, looking carefully at the force and heave profiles. Finally, the impact of a body on the free surface is successfully simulated for different impact angles and velocities.

  10. Meshless Lagrangian SPH method applied to isothermal lid-driven cavity flow at low-Re numbers

    NASA Astrophysics Data System (ADS)

    Fraga Filho, C. A. D.; Chacaltana, J. T. A.; Pinto, W. J. N.

    2018-01-01

    SPH is a relatively recent particle method for the study of cavity flows, with few results available in the literature. The lid-driven cavity flow is a classic problem of fluid mechanics, extensively explored in the literature and presenting considerable complexity. The aim of this paper is to present a solution to this problem from the Lagrangian viewpoint. The discretization of the continuum domain is performed using Lagrangian particles, and the physical laws of mass, momentum and energy conservation are expressed by the Navier-Stokes equations. A serial numerical code, written in the Fortran programming language, has been used to perform the numerical simulations. The SPH results are compared with the literature (mesh methods and a meshless collocation method), analysing the positions of the primary vortex centre and the non-dimensional velocity profiles passing through the geometric centre of the cavity. The numerical Lagrangian results showed good agreement with those found in the literature, specifically for Re < 100. Suggestions for improvements to the presented SPH model are listed, in the search for better results for flows with higher Reynolds numbers.

  11. Au38(SPh)24: Au38 Protected with Aromatic Thiolate Ligands.

    PubMed

    Rambukwella, Milan; Burrage, Shayna; Neubrander, Marie; Baseggio, Oscar; Aprà, Edoardo; Stener, Mauro; Fortunelli, Alessandro; Dass, Amala

    2017-04-06

    Au38(SR)24 is one of the most extensively investigated gold nanomolecules, along with Au25(SR)18 and Au144(SR)60. However, so far it has only been prepared using aliphatic-like ligands, where R = -SC6H13, -SC12H25 and -SCH2CH2Ph. Au38(SCH2CH2Ph)24, when reacted with HSPh, undergoes core-size conversion to Au36(SPh)24, and the existing literature suggests that Au38(SPh)24 cannot be synthesized. Here, contrary to prevailing knowledge, we demonstrate that Au38(SPh)24 can be prepared if the ligand exchange conditions are optimized, under delicate conditions, without any formation of Au36(SPh)24. Conclusive evidence is presented in the form of matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) and electrospray ionization mass spectrometry (ESI-MS) characterization, and optical spectra of Au38(SPh)24 in solid glass form showing distinct differences from those of Au38(S-aliphatic)24. Theoretical analysis confirms the experimental assignment of the optical spectrum and shows that the stability of Au38(SPh)24 is not negligible with respect to that of its aliphatic analogues, and contains a significant component of ligand-ligand attractive interactions. Thus, while Au38(SPh)24 is stable at room temperature, it converts to Au36(SPh)24 either on prolonged etching (longer than 2 hours) at room temperature or when etched at 80 °C.

  12. Parallel processing a real code: A case history

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.A.; Trease, H.E.

    1988-01-01

    A three-dimensional, time-dependent Free-Lagrange hydrodynamics code has been multitasked and autotasked on a Cray X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the Cray multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The 3-D algorithm has presented a number of problems that simpler algorithms, such as 1-D hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a Cray-1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given. 8 refs., 13 figs.

  13. Kinetically controlled synthesis of Au102(SPh)44 nanoclusters and catalytic application.

    PubMed

    Chen, Yongdong; Wang, Jin; Liu, Chao; Li, Zhimin; Li, Gao

    2016-05-21

    We here explore a kinetically controlled synthetic protocol for preparing solvent-solvable Au102(SPh)44 nanoclusters which are isolated from polydispersed gold nanoclusters by solvent extraction and size exclusion chromatography (SEC). The as-obtained Au102(SPh)44 nanoclusters are determined by matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) mass spectrometry, in conjunction with UV-vis spectroscopy and thermogravimetric analysis (TGA). However, Au99(SPh)42, instead of Au102(SPh)44, is yielded when the polydispersed gold nanoclusters are etched in the presence of excess thiophenol under thermal conditions (e.g., 80 °C). Interestingly, the Au102(SPh)44 nanoclusters also can convert to Au99(SPh)42 with equivalent thiophenol ligands, evidenced by the analyses of UV-vis and MALDI mass spectrometry. Finally, the TiO2-supported Au102(SPh)44 nanocluster catalyst is investigated in the selective oxidation of sulfides into sulfoxides by the PhIO oxidant and gives rise to high catalytic activity (e.g., 80-99% conversion of R-S-R' sulfides with 96-99% selectivity for R-S(=O)-R' sulfoxides). The Au102(SPh)44/TiO2 catalyst also shows excellent recyclability in the sulfoxidation process.

  14. The implementation of an aeronautical CFD flow code onto distributed memory parallel systems

    NASA Astrophysics Data System (ADS)

    Ierotheou, C. S.; Forsey, C. R.; Leatham, M.

    2000-04-01

    The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations.
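
    The encapsulated message-passing/SPMD approach described here is typically realized as a halo (ghost-cell) exchange between neighboring partitions at each iteration. The sketch below is a generic mpi4py illustration of that pattern (not SAUNA's actual code): each rank holds a 1D slab with one halo cell at each end and swaps edges with its left/right neighbors before updating its interior.

        # run with, e.g.: mpiexec -n 4 python halo_demo.py
        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        u = np.zeros(10 + 2)                # local slab plus two halo cells
        u[1:-1] = rank                      # dummy interior data

        # Exchange halos: send my edge cells, receive my neighbors' edge cells
        comm.Sendrecv(sendbuf=u[1:2], dest=left, recvbuf=u[-1:], source=right)
        comm.Sendrecv(sendbuf=u[-2:-1], dest=right, recvbuf=u[0:1], source=left)

        # A Jacobi-style smoothing update can now read the refreshed halos
        u[1:-1] = 0.5 * (u[:-2] + u[2:])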

  15. Parallelization of ARC3D with Computer-Aided Tools

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang; Hribar, Michelle; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    A series of efforts have been devoted to investigating methods of porting and parallelizing applications quickly and efficiently for new architectures, such as the SGI Origin 2000 and Cray T3E. This report presents the parallelization of a CFD application, ARC3D, using the computer-aided tools CAPTools. The steps of parallelizing this code and the requirements for achieving better performance are discussed. The generated parallel version has achieved reasonably good performance, for example, a speedup of 30 for 36 Cray T3E processors. However, this performance could not be obtained without modification of the original serial code. It is suggested that in many cases improving the serial code and performing necessary code transformations are important parts of the automated parallelization process, although user intervention in many of these parts is still necessary. Nevertheless, the development and improvement of useful software tools, such as CAPTools, can help trim down many tedious parallelization details and improve processing efficiency.

  16. Boltzmann Transport Code Update: Parallelization and Integrated Design Updates

    NASA Technical Reports Server (NTRS)

    Heinbockel, J. H.; Nealy, J. E.; DeAngelis, G.; Feldman, G. A.; Chokshi, S.

    2003-01-01

    The ongoing effort to develop a web site for radiation analysis is expected to result in increased usage of the High Charge and Energy Transport Code HZETRN, so it is desirable to perform the requested calculations quickly and efficiently. The question therefore arose: "Could the implementation of parallel processing speed up the calculations required?" To answer this question, two modifications of the HZETRN computer code were created. The first modification selected the shield materials Al(2219), then polyethylene, and then Al(2219); this modified Fortran code was labeled 1SSTRN.F. The second modification considered the shield materials CO2 and Martian regolith; this modified Fortran code was labeled MARSTRN.F.

  17. Parallel implementation of the particle simulation method with dynamic load balancing: Toward realistic geodynamical simulation

    NASA Astrophysics Data System (ADS)

    Furuichi, M.; Nishiura, D.

    2015-12-01

    Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and the Discrete Element Method (DEM) have been widely used to solve continuum and particle motions in computational geodynamics. These mesh-free methods are suitable for problems with complex geometries and boundaries. In addition, their Lagrangian nature allows non-diffusive advection, useful for tracking history-dependent properties (e.g. rheology) of the material. These potential advantages over mesh-based methods offer effective numerical applications to geophysical flow and tectonic processes, for example, tsunamis with free surfaces and floating bodies, magma intrusion with fracturing of rock, and shear zone pattern generation in granular deformation. In order to investigate such geodynamical problems with particle-based methods, millions to billions of particles are required for realistic simulations. Parallel computing is therefore important for handling such a huge computational cost. An efficient parallel implementation of SPH and DEM methods is, however, known to be difficult, especially for distributed-memory architectures. Lagrangian methods inherently suffer from a workload imbalance problem when parallelized with domains fixed in space, because particles move around and workloads change during the simulation. Dynamic load balancing is therefore a key technique for performing large-scale SPH and DEM simulations. In this work, we present a parallel implementation technique for the SPH and DEM methods utilizing dynamic load balancing algorithms, aimed at high-resolution simulations over large domains on massively parallel supercomputer systems. Our method treats the imbalance in execution time across MPI processes as the nonlinear term of the parallel domain decomposition and minimizes it with a Newton-like iteration method. In order to perform flexible domain decomposition in space, the slice-grid algorithm is used. Numerical tests show that our
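
    As a simplified, hypothetical illustration of slice-grid dynamic load balancing (not the authors' implementation), the sketch below adjusts the boundaries of 1D slices so that each slice carries approximately equal measured work; in a real SPH/DEM code the per-particle costs would come from measured execution times and the procedure would be applied recursively per dimension.

        import numpy as np

        def rebalance_slices(x, work_per_particle, n_slices):
            """Choose slice boundaries so each slice holds ~1/n of the total work.

            x: particle coordinates along the slicing axis
            work_per_particle: per-particle cost estimates (e.g., from timings)
            """
            order = np.argsort(x)
            cum = np.cumsum(work_per_particle[order])
            targets = cum[-1] * np.arange(1, n_slices) / n_slices
            idx = np.searchsorted(cum, targets)   # particles where the cuts fall
            return x[order][idx]                  # interior slice-boundary positions

        x = np.random.rand(10000)
        cost = np.where(x < 0.2, 5.0, 1.0)        # clustered work near one end
        print(rebalance_slices(x, cost, n_slices=4))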

  18. Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A.; Kabel, A.; Lee, L.

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).

  19. Synthesis of Au38(SCH2CH2Ph)24, Au36(SPh-tBu)24, and Au30(S-tBu)18 Nanomolecules from a Common Precursor Mixture.

    PubMed

    Rambukwella, Milan; Dass, Amala

    2017-10-17

    Phenylethanethiol-protected nanomolecules such as Au25, Au38, and Au144 are widely studied by a broad range of scientists in the community, owing primarily to the availability of simple synthetic protocols. However, synthetic methods are not available for other ligands, such as aromatic thiols and bulky ligands, impeding progress. Here we report the facile synthesis of three distinct nanomolecules, Au38(SCH2CH2Ph)24, Au36(SPh-tBu)24, and Au30(S-tBu)18, exclusively, starting from a common Aun(glutathione)m (where n and m are the numbers of gold atoms and glutathiolate ligands) starting material upon reaction with HSCH2CH2Ph, HSPh-tBu, and HStBu, respectively. The systematic synthetic approach involves two steps: (i) synthesis of a kinetically controlled Aun(glutathione)m crude nanocluster mixture with a 1:4 gold to thiol molar ratio and (ii) thermochemical treatment of the purified nanocluster mixture with excess thiols to obtain thermodynamically stable nanomolecules. Thermochemical reactions with physicochemically different ligands formed highly monodispersed, exclusively three different core-size nanomolecules, suggesting a ligand-induced core-size conversion and structural transformation. The purpose of this work is to make available a facile and simple synthetic method for the preparation of Au38(SCH2CH2Ph)24, Au36(SPh-tBu)24, and Au30(S-tBu)18 to nonspecialists and the broader scientific community. The central idea of the simple synthetic method was demonstrated with other ligand systems such as cyclopentanethiol (HSC5H9), cyclohexanethiol (HSC6H11), para-methylbenzenethiol (pMBT), 1-pentanethiol (HSC5H11), and 1-hexanethiol (HSC6H13), where Au36(SC5H9)24, Au36(SC6H11)24, Au36(pMBT)24, Au38(SC5H11)24, and Au38(SC6H13)24 were obtained, respectively.

  20. Data Parallel Line Relaxation (DPLR) Code User Manual: Acadia - Version 4.01.1

    NASA Technical Reports Server (NTRS)

    Wright, Michael J.; White, Todd; Mangini, Nancy

    2009-01-01

    The Data-Parallel Line Relaxation (DPLR) code is a computational fluid dynamics (CFD) solver that was developed at NASA Ames Research Center to help mission support teams generate high-value predictive solutions for hypersonic flow field problems. The DPLR Code Package is an MPI-based, parallel, full three-dimensional Navier-Stokes CFD solver with generalized models for finite-rate reaction kinetics, thermal and chemical non-equilibrium, accurate high-temperature transport coefficients, and ionized flow physics incorporated into the code. DPLR also includes a large selection of generalized realistic surface boundary conditions and links to enable loose coupling with external thermal protection system (TPS) material response and shock layer radiation codes.

  1. Au36(SPh)24 nanomolecules: X-ray crystal structure, optical spectroscopy, electrochemistry, and theoretical analysis.

    PubMed

    Nimmala, Praneeth Reddy; Knoppe, Stefan; Jupally, Vijay Reddy; Delcamp, Jared H; Aikens, Christine M; Dass, Amala

    2014-12-11

    The physicochemical properties of gold:thiolate nanomolecules depend on their crystal structure and the capping ligands. The effects of protecting ligands on the crystal structure of the nanomolecules are of high interest in this area of research. Here we report the crystal structure of an all aromatic thiophenolate-capped Au36(SPh)24 nanomolecule, which has a face-centered cubic (fcc) core similar to other nanomolecules such as Au36(SPh-tBu)24 and Au36(SC5H9)24 with the same number of gold atoms and ligands. The results support the idea that a stable core remains intact even when the capping ligand is varied. We also correct our earlier assignment of "Au36(SPh)23" which was determined based on MALDI mass spectrometry which is more prone to fragmentation than ESI mass spectrometry. We show that ESI mass spectrometry gives the correct assignment of Au36(SPh)24, supporting the X-ray crystal structure. The electronic structure of the title compound was computed at different levels of theory (PBE, LDA, and LB94) using the coordinates extracted from the single crystal X-ray diffraction data. The optical and electrochemical properties were determined from experimental data using UV-vis spectroscopy, cyclic voltammetry, and differential pulse voltammetry. Au36(SPh)24 shows a broad electrochemical gap near 2 V, a desirable optical gap of ∼1.75 eV for dye-sensitized solar cell applications, as well as appropriately positioned electrochemical potentials for many electrocatalytic reactions.

  2. A Parallel Numerical Micromagnetic Code Using FEniCS

    NASA Astrophysics Data System (ADS)

    Nagy, L.; Williams, W.; Mitchell, L.

    2013-12-01

    Many problems in the geosciences depend on understanding the ability of magnetic minerals to provide stable paleomagnetic recordings. Numerical micromagnetic modelling allows us to calculate the domain structures found in naturally occurring magnetic materials. However, the computational cost rises exceedingly quickly with respect to the size and complexity of the geometries that we wish to model. This problem is compounded by the fact that modern processor design no longer focuses on the speed at which calculations are performed, but rather on the number of computational units amongst which we may distribute our calculations. Consequently, to better exploit modern computational resources our micromagnetic simulations must "go parallel". We present a parallel and scalable micromagnetics code written using FEniCS. FEniCS is a multinational collaboration involving several institutions (University of Cambridge, University of Chicago, The Simula Research Laboratory, etc.) that aims to provide a set of tools for writing scientific software; in particular, software that employs the finite element method. The advantages of this approach are the leveraging of pre-existing projects from the world of scientific computing (PETSc, Trilinos, Metis/Parmetis, etc.) and exposing these so that researchers may pose problems in a manner closer to the mathematical language of their domain. Our code provides a scriptable interface (in Python) that allows users to not only run micromagnetic models in parallel, but also to perform pre/post-processing of data.

  3. Highly fault-tolerant parallel computation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Spielman, D.A.

    We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log^O(1) w processors and time t log^O(1) w. The failure probability of the computation will be at most t · exp(-w^(1/4)). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log^O(1) n) sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.

  4. CUBE: Information-optimized parallel cosmological N-body simulation code

    NASA Astrophysics Data System (ADS)

    Yu, Hao-Ran; Pen, Ue-Li; Wang, Xin

    2018-05-01

    CUBE, written in Coarray Fortran, is a particle-mesh-based parallel cosmological N-body simulation code. Its memory usage can approach as little as 6 bytes per particle. A particle-pairwise (PP) force, cosmological neutrinos, and a spherical overdensity (SO) halo finder are included.
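
    CUBE's low memory footprint comes from information-optimized particle storage; as a hedged sketch of the general idea (simplified and hypothetical, not the paper's exact scheme), positions can be stored as small fixed-point integer offsets relative to the mesh cell containing each particle, rather than as full floating-point coordinates:

        import numpy as np

        def compress_positions(pos, n_cells, box, bits=8):
            """Store positions as per-cell integer offsets (bits per dimension)."""
            cell_size = box / n_cells
            cell = np.minimum((pos / cell_size).astype(np.int64), n_cells - 1)
            frac = pos / cell_size - cell               # in [0, 1) within the cell
            offset = np.minimum((frac * (1 << bits)).astype(np.int64),
                                (1 << bits) - 1).astype(np.uint8)
            # a real code makes the cell index implicit by ordering particles by cell
            return cell, offset                         # offset: 1 byte per dimension

        def decompress_positions(cell, offset, n_cells, box, bits=8):
            """Recover positions to within cell_size / 2^bits."""
            cell_size = box / n_cells
            return (cell + (offset.astype(np.float64) + 0.5) / (1 << bits)) * cell_size

        pos = np.random.rand(1000, 3) * 100.0
        cell, off = compress_positions(pos, n_cells=64, box=100.0)
        err = np.abs(decompress_positions(cell, off, 64, 100.0) - pos).max()
        # err is bounded by half a sub-cell: (100/64)/256/2 ~ 0.003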

  5. Characterizing flow in oil reservoir rock using SPH: absolute permeability

    NASA Astrophysics Data System (ADS)

    Holmes, David W.; Williams, John R.; Tilke, Peter; Leonardi, Christopher R.

    2016-04-01

    In this paper, a three-dimensional smooth particle hydrodynamics (SPH) simulator for modeling grain-scale fluid flow in porous rock is presented. The versatility of the SPH method has driven its use in increasingly complex areas of flow analysis, including flows related to permeable rock for both groundwater and petroleum reservoir research. While previous approaches to such problems using SPH have involved the use of idealized pore geometries (cylinder/sphere packs, etc.), in this paper we detail the characterization of flow in models with geometries taken from 3D X-ray microtomographic imaging of actual porous rock; specifically, a 25.12% porosity dolomite. This particular rock type has been well characterized experimentally and described in the literature, thus providing a practical `real world' means of verification of SPH that will be key to its acceptance by industry as a viable alternative to traditional reservoir modeling tools. The true advantages of SPH are realized when adding the complexity of multiple fluid phases; however, the accuracy of SPH for single-phase flow is, as yet, underdeveloped in the literature and will be the primary focus of this paper. Flow in reservoir rock will typically occur in the range of low Reynolds numbers, making the enforcement of no-slip boundary conditions an important factor in simulation. To this end, we detail the development of a new, robust, and numerically efficient method for implementing no-slip boundary conditions in SPH that can handle the degree of complexity of boundary surfaces characteristic of an actual permeable rock sample. A study of the effect of particle density is carried out, and simulation results for absolute permeability are presented and compared to those from experimentation, showing good agreement and validating the method for such applications.
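
    The absolute permeability such simulations report follows from Darcy's law; below is a minimal worked example (the numerical values are illustrative, not from the paper):

        def absolute_permeability(q, mu, length, area, dp):
            """Darcy's law: k = q * mu * L / (A * dp).

            q: volumetric flow rate [m^3/s], mu: dynamic viscosity [Pa s],
            length: sample length [m], area: cross-section [m^2], dp: pressure drop [Pa].
            Returns permeability in m^2 (1 darcy ~ 9.869e-13 m^2)."""
            return q * mu * length / (area * dp)

        k = absolute_permeability(q=1.0e-9, mu=1.0e-3, length=1.0e-3,
                                  area=1.0e-6, dp=1.0e3)
        print(k / 9.869e-13, "darcy")   # ~1 darcy for these illustrative values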

  6. An Expert System for the Development of Efficient Parallel Code

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Chun, Robert; Jin, Hao-Qiang; Labarta, Jesus; Gimenez, Judit

    2004-01-01

    We have built the prototype of an expert system to assist the user in the development of efficient parallel code. The system was integrated into the parallel programming environment that is currently being developed at NASA Ames. The expert system interfaces to tools for automatic parallelization and performance analysis. It uses static program structure information and performance data in order to automatically determine causes of poor performance and to make suggestions for improvements. In this paper we give an overview of our programming environment, describe the prototype implementation of our expert system, and demonstrate its usefulness with several case studies.

  7. Fast l₁-SPIRiT compressed sensing parallel imaging MRI: scalable parallel implementation and clinically feasible runtime.

    PubMed

    Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

    2012-06-01

    We present l₁-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative self-consistent parallel imaging (SPIRiT). Like many iterative magnetic resonance imaging reconstructions, l₁-SPIRiT's image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing l₁-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of l₁-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT spoiled gradient echo (SPGR) sequence with up to 8× acceleration via Poisson-disc undersampling in the two phase-encoded directions.
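
    The reconstruction's core loop, iterative soft-thresholding, is easy to sketch. Below is a generic ISTA iteration for the sparse recovery problem min 0.5||Ax - y||^2 + lam*||x||_1, offered as a simplified stand-in for the wavelet-domain joint-sparsity objective in l₁-SPIRiT (A here is a plain matrix, not the SPIRiT operator):

        import numpy as np

        def soft_threshold(x, t):
            """Proximal operator of the l1 norm."""
            return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

        def ista(A, y, lam, n_iter=200):
            """Iterative soft-thresholding for min 0.5||Ax - y||^2 + lam*||x||_1."""
            step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
            x = np.zeros(A.shape[1])
            for _ in range(n_iter):
                grad = A.T @ (A @ x - y)
                x = soft_threshold(x - step * grad, step * lam)
            return x

        rng = np.random.default_rng(0)
        A = rng.standard_normal((40, 100))
        x_true = np.zeros(100); x_true[[3, 30, 77]] = [1.0, -2.0, 1.5]
        x_hat = ista(A, A @ x_true, lam=0.1)         # recovers a sparse approximation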

  8. Performance and Application of Parallel OVERFLOW Codes on Distributed and Shared Memory Platforms

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Rizk, Yehia M.

    1999-01-01

    The presentation discusses recent studies on the performance of the two parallel versions of the aerodynamics CFD code, OVERFLOW_MPI and _MLP. Developed at NASA Ames, the serial version, OVERFLOW, is a multidimensional Navier-Stokes flow solver based on overset (Chimera) grid technology. The code has recently been parallelized in two ways. One is based on the explicit message-passing interface (MPI) across processors and uses the _MPI communication package. This approach is primarily suited for distributed memory systems and workstation clusters. The second, termed the multi-level parallel (MLP) method, is simple and uses shared memory for all communications. The _MLP code is suitable on distributed-shared memory systems. For both methods, the message passing takes place across the processors or processes at the advancement of each time step. This procedure is, in effect, the Chimera boundary conditions update, which is done in an explicit "Jacobi" style. In contrast, the update in the serial code is done in more of a "Gauss-Seidel" fashion. The programming effort for the _MPI code is greater than for the _MLP code; the former requires modification of the outer and some inner shells of the serial code, whereas the latter focuses only on the outer shell of the code. The _MPI version offers a great deal of flexibility in distributing grid zones across a specified number of processors in order to achieve load balancing. The approach is capable of partitioning zones across multiple processors or sending each zone and/or cluster of several zones into a single processor. The message passing across the processors consists of Chimera boundary and/or an overlap of "halo" boundary points for each partitioned zone. The MLP version is a new coarse-grain parallel concept at the zonal and intra-zonal levels. A grouping strategy is used to distribute zones into several groups forming sub-processes which will run in parallel. The total volume of grid points in each
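
    The distinction the abstract draws between a "Jacobi"-style update (all zones exchange boundary data computed from the previous step, which parallelizes naturally) and a "Gauss-Seidel"-style update (each zone consumes its neighbors' freshest values, as in the serial code) can be illustrated on a simple 1D relaxation. This is a generic sketch, unrelated to OVERFLOW's actual data structures:

        import numpy as np

        def jacobi_sweep(u):
            """All updates use values from the previous iteration (parallel-friendly)."""
            v = u.copy()
            v[1:-1] = 0.5 * (u[:-2] + u[2:])
            return v

        def gauss_seidel_sweep(u):
            """Each update immediately uses the newest neighbor values (serial)."""
            for i in range(1, len(u) - 1):
                u[i] = 0.5 * (u[i - 1] + u[i + 1])
            return u

        u = np.linspace(0.0, 1.0, 11)
        u[5] = 5.0                      # a disturbance to relax away
        print(jacobi_sweep(u)[5], gauss_seidel_sweep(u.copy())[5])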

  9. Development of Parallel Code for the Alaska Tsunami Forecast Model

    NASA Astrophysics Data System (ADS)

    Bahng, B.; Knight, W. R.; Whitmore, P.

    2014-12-01

    The Alaska Tsunami Forecast Model (ATFM) is a numerical model used to forecast propagation and inundation of tsunamis generated by earthquakes and other means in both the Pacific and Atlantic Oceans. At the U.S. National Tsunami Warning Center (NTWC), the model is mainly used in a pre-computed fashion. That is, results for hundreds of hypothetical events are computed before alerts, and are accessed and calibrated with observations during tsunamis to immediately produce forecasts. ATFM uses the non-linear, depth-averaged, shallow-water equations of motion with multiply nested grids in two-way communications between domains of each parent-child pair as waves get closer to coastal waters. Even with the pre-computation the task becomes non-trivial as sub-grid resolution gets finer. Currently, the finest resolution Digital Elevation Models (DEM) used by ATFM are 1/3 arc-seconds. With a serial code, large or multiple areas of very high resolution can produce run-times that are unrealistic even in a pre-computed approach. One way to increase the model performance is code parallelization used in conjunction with a multi-processor computing environment. NTWC developers have undertaken an ATFM code-parallelization effort to streamline the creation of the pre-computed database of results with the long term aim of tsunami forecasts from source to high resolution shoreline grids in real time. Parallelization will also permit timely regeneration of the forecast model database with new DEMs; and, will make possible future inclusion of new physics such as the non-hydrostatic treatment of tsunami propagation. The purpose of our presentation is to elaborate on the parallelization approach and to show the compute speed increase on various multi-processor systems.

  10. Parallelization of PANDA discrete ordinates code using spatial decomposition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humbert, P.

    2006-07-01

    We present the parallel method, based on spatial domain decomposition, implemented in the 2D and 3D versions of the discrete ordinates code PANDA. The spatial mesh is orthogonal and the spatial domain decomposition is Cartesian. For 3D problems, a 3D Cartesian domain topology is created and the parallel method is based on a domain diagonal-plane ordered sweep algorithm. The parallel efficiency of the method is improved by direction and octant pipelining. The implementation of the algorithm is straightforward using MPI blocking point-to-point communications. The efficiency of the method is illustrated by an application to the 3D-Ext C5G7 benchmark of the OECD/NEA.

  11. RR Lyrae in the UMi dSph Galaxy

    NASA Astrophysics Data System (ADS)

    Kuehn, Charles; Kinemuchi, Karen; Jeffery, Elizabeth; Grabowski, Kathleen; Nemec, James; Herrera, Daniel

    2018-01-01

    Over the past two years we have obtained observations of the Ursa Minor dwarf spheroidal galaxy with the goal of completing an updated catalog of the variable stars in the dwarf galaxy. In addition to finding new variable stars, this updated catalog will allow us to look at period changes in the variables and to determine stellar characteristics for the RR Lyrae stars in the dSph. We will compare the RR Lyrae stellar characteristics to those of other RR Lyrae stars found in the Local Group dSph galaxies; these comparisons can give us insights into the near-field cosmology of the Local Group. In this poster we present our updated catalog of RR Lyrae stars in the UMi dSph; the updated catalog includes Fourier decomposition parameters, metallicities, and other physical properties for the RR Lyrae stars.

  12. A density-adaptive SPH method with kernel gradient correction for modeling explosive welding

    NASA Astrophysics Data System (ADS)

    Liu, M. B.; Zhang, Z. L.; Feng, D. L.

    2017-09-01

    Explosive welding involves processes like the detonation of explosives, the impact of metal structures, and strong fluid-structure interaction, yet the whole process of explosive welding has not been well modeled before. In this paper, a novel smoothed particle hydrodynamics (SPH) model is developed to simulate explosive welding. In the SPH model, a kernel gradient correction algorithm is used to achieve better computational accuracy. A density-adapting technique which can effectively treat large density ratios is also proposed. The developed SPH model is first validated by simulating a benchmark problem of one-dimensional TNT detonation and an impact welding problem. The SPH model is then successfully applied to simulate the whole process of explosive welding. It is demonstrated that the presented SPH method can capture the typical physics in explosive welding, including the explosion wave, welding surface morphology, jet flow, and acceleration of the flyer plate. The welding angle obtained from the SPH simulation agrees well with that from a kinematic analysis.
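
    Kernel gradient correction restores first-order consistency of SPH gradients near boundaries and in disordered particle distributions: each particle's raw kernel gradients are premultiplied by the inverse of a locally computed moment matrix L_i = sum_j V_j (grad W_ij) ⊗ (x_j - x_i). The sketch below is a generic 2D illustration of that correction (not the authors' code; a robust implementation would guard against a singular moment matrix):

        import numpy as np

        def corrected_gradients(xi, xj, vol_j, grad_w):
            """Return kernel gradients corrected so linear fields are reproduced exactly.

            xi: (2,) position of particle i; xj: (n, 2) neighbor positions
            vol_j: (n,) neighbor volumes; grad_w: (n, 2) raw kernel gradients."""
            dx = xj - xi                                       # (n, 2)
            L = np.einsum('j,ja,jb->ab', vol_j, grad_w, dx)    # moment matrix
            L_inv = np.linalg.inv(L)                           # correction matrix
            return grad_w @ L_inv.T                            # corrected gradients

    With the corrected gradients, sum_j V_j (f_j - f_i) grad_W_ij recovers the exact gradient of any linear field f, which is the consistency property the abstract's correction algorithm provides.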

  13. Code Optimization and Parallelization on the Origins: Looking from Users' Perspective

    NASA Technical Reports Server (NTRS)

    Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)

    2002-01-01

    Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.

  14. A visual parallel-BCI speller based on the time-frequency coding strategy.

    PubMed

    Xu, Minpeng; Chen, Long; Zhang, Lixin; Qi, Hongzhi; Ma, Lan; Tang, Jiabei; Wan, Baikun; Ming, Dong

    2014-04-01

    Spelling is one of the most important issues in brain-computer interface (BCI) research. This paper is to develop a visual parallel-BCI speller system based on the time-frequency coding strategy in which the sub-speller switching among four simultaneously presented sub-spellers and the character selection are identified in a parallel mode. The parallel-BCI speller was constituted by four independent P300+SSVEP-B (P300 plus SSVEP blocking) spellers with different flicker frequencies, thereby all characters had a specific time-frequency code. To verify its effectiveness, 11 subjects were involved in the offline and online spellings. A classification strategy was designed to recognize the target character through jointly using the canonical correlation analysis and stepwise linear discriminant analysis. Online spellings showed that the proposed parallel-BCI speller had a high performance, reaching the highest information transfer rate of 67.4 bit min-1, with an average of 54.0 bit min-1 and 43.0 bit min-1 in the three rounds and five rounds, respectively. The results indicated that the proposed parallel-BCI could be effectively controlled by users with attention shifting fluently among the sub-spellers, and highly improved the BCI spelling performance.

  15. Structure–Activity Relationship Studies and in Vivo Activity of Guanidine-Based Sphingosine Kinase Inhibitors: Discovery of SphK1- and SphK2-Selective Inhibitors

    PubMed Central

    Kharel, Yugesh; Raje, Mithun R.; Gao, Ming; Tomsig, Jose L.; Lynch, Kevin R.; Santos, Webster L.

    2015-01-01

    Sphingosine 1-phosphate (S1P) is a pleiotropic signaling molecule that acts as a ligand for five G-protein coupled receptors (S1P1–5) whose downstream effects are implicated in a variety of important pathologies including sickle cell disease, cancer, inflammation, and fibrosis. The synthesis of S1P is catalyzed by sphingosine kinase (SphK) isoforms 1 and 2, and hence, inhibitors of this phosphorylation step are pivotal in understanding the physiological functions of SphKs. To date, SphK1 and 2 inhibitors with the potency, selectivity, and in vivo stability necessary to determine the potential of these kinases as therapeutic targets are lacking. Herein, we report the design, synthesis, and structure–activity relationship studies of guanidine-based SphK inhibitors bearing an oxadiazole ring in the scaffold. Our studies demonstrate that SLP120701, a SphK2-selective inhibitor (Ki = 1 μM), decreases S1P levels in histiocytic lymphoma (U937) cells. Surprisingly, homologation with a single methylene unit between the oxadiazole and heterocyclic ring afforded a SphK1-selective inhibitor in SLP7111228 (Ki = 48 nM), which also decreased S1P levels in cultured U937 cells. In vivo application of both compounds, however, resulted in contrasting effects on circulating levels of S1P. Administration of SLP7111228 depressed blood S1P levels while SLP120701 increased levels of S1P. Taken together, these compounds provide an in vivo chemical toolkit to interrogate the effect of increasing or decreasing S1P levels and whether such a maneuver can have implications in disease states. PMID:25643074

  16. Advanced Boundary Electrode Modeling for tES and Parallel tES/EEG.

    PubMed

    Pursiainen, Sampsa; Agsten, Britte; Wagner, Sven; Wolters, Carsten H

    2018-01-01

    This paper explores advanced electrode modeling in the context of separate and parallel transcranial electrical stimulation (tES) and electroencephalography (EEG) measurements. We focus on boundary condition-based approaches that do not necessitate adding auxiliary elements, e.g., sponges, to the computational domain. In particular, we investigate the complete electrode model (CEM), which incorporates a detailed description of the skin-electrode interface including its contact surface, impedance, and normal current distribution. The CEM can be applied to both tES and EEG electrodes, which is advantageous when a parallel system is used. In comparison to the CEM, we test two important reduced approaches: the gap model (GAP) and the point electrode model (PEM). We aim to find out the differences of these approaches for a realistic numerical setting based on the stimulation of the auditory cortex. The results obtained suggest, among other things, that GAP and GAP/PEM are sufficiently accurate for the practical application of tES and parallel tES/EEG, respectively. Differences between CEM and GAP were observed mainly in the skin compartment, where only CEM explains the heating effects characteristic of tES.

  17. Pairwise Force SPH Model for Real-Time Multi-Interaction Applications.

    PubMed

    Yang, Tao; Martin, Ralph R; Lin, Ming C; Chang, Jian; Hu, Shi-Min

    2017-10-01

    In this paper, we present a novel pairwise-force smoothed particle hydrodynamics (PF-SPH) model to enable simulation of various interactions at interfaces in real time. Realistic capture of interactions at interfaces is a challenging problem for SPH-based simulations, especially for scenarios involving multiple interactions at different interfaces. Our PF-SPH model can readily handle multiple types of interactions simultaneously in a single simulation; its basis is to use a larger support radius than that used in standard SPH. We adopt a novel anisotropic filtering term to further improve the performance of interaction forces. The proposed model is stable; furthermore, it avoids the particle clustering problem which commonly occurs at the free surface. We show how our model can be used to capture various interactions. We also consider the close connection between droplets and bubbles, and show how to animate bubbles rising in liquid as well as bubbles in air. Our method is versatile, physically plausible and easy-to-implement. Examples are provided to demonstrate the capabilities and effectiveness of our approach.

  18. A novel SPH method for sedimentation in a turbulent fluid

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwon, Jihoe; Monaghan, J.J., E-mail: joe.monaghan@sci.monash.edu.au

    2015-11-01

    A novel method for simulating sedimentation is described and applied to the sedimentation of dust in a turbulent fluid. We assume the dust grains are sufficiently numerous that they may be treated as a fluid and modelled by SPH particles. A different set of SPH particles describes the fluid. The equations of motion are therefore similar to those of Monaghan and Kocharyan [14], with the exception that the sedimentation of dust onto a solid surface is treated as if the surface mimics a sink for the dust fluid. The continuity equation for the dust then contains a sink term that can be modelled in the SPH formulation by allowing the mass of each SPH dust particle to decrease when it is sufficiently close to the boundary. We apply this method both to sedimentation in a nearly static fluid and to sedimentation in a turbulent fluid. In the latter case we produce the turbulence both by a mechanical stirrer and by a stochastic algorithm. Our results agree very closely with the experiments of Martin and Nokes.
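
    A minimal sketch of the mass-sink idea (with hypothetical parameters; not the authors' discretization): each dust particle within a capture distance of the boundary loses mass at a rate m/tau, mimicking deposition onto the surface while the particle itself continues to move with the flow.

        import numpy as np

        def apply_boundary_sink(mass, dist_to_boundary, dt, capture_dist, tau):
            """Exponential mass decay for dust particles near the settling boundary."""
            near = dist_to_boundary < capture_dist
            mass[near] *= np.exp(-dt / tau)     # dm/dt = -m / tau while near the wall
            return mass

        mass = np.full(1000, 1.0e-6)
        dist = np.random.rand(1000) * 0.1
        mass = apply_boundary_sink(mass, dist, dt=1e-3, capture_dist=0.01, tau=0.05)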

  19. SPH Modelling of Sea-ice Pack Dynamics

    NASA Astrophysics Data System (ADS)

    Staroszczyk, Ryszard

    2017-12-01

    The paper is concerned with the problem of sea-ice pack motion and deformation under the action of wind and water currents. Differential equations describing the dynamics of ice, with its very distinct material responses in converging and diverging flows, express the mass and linear momentum balances on the horizontal plane (the free surface of the ocean). These equations are solved by the fully Lagrangian method of smoothed particle hydrodynamics (SPH). Assuming that the ice behaviour can be approximated by a non-linearly viscous rheology, the proposed SPH model has been used to simulate the evolution of a sea-ice pack driven by wind drag stresses. The results of numerical simulations illustrate the evolution of an ice pack, including variations in ice thickness and ice area fraction in space and time. The effects of different initial ice pack configurations and of different conditions assumed at the coast-ice interface are examined. In particular, the SPH model is applied to a pack flow driven by a vortex wind to demonstrate how well the Lagrangian formulation can capture large deformations and displacements of sea ice.

  20. A visual parallel-BCI speller based on the time-frequency coding strategy

    NASA Astrophysics Data System (ADS)

    Xu, Minpeng; Chen, Long; Zhang, Lixin; Qi, Hongzhi; Ma, Lan; Tang, Jiabei; Wan, Baikun; Ming, Dong

    2014-04-01

    Objective. Spelling is one of the most important issues in brain-computer interface (BCI) research. This paper develops a visual parallel-BCI speller system based on a time-frequency coding strategy, in which the switching among four simultaneously presented sub-spellers and the character selection are identified in parallel. Approach. The parallel-BCI speller consisted of four independent P300+SSVEP-B (P300 plus SSVEP blocking) spellers with different flicker frequencies, so that every character had a specific time-frequency code. To verify its effectiveness, 11 subjects participated in offline and online spelling sessions. A classification strategy was designed to recognize the target character by jointly using canonical correlation analysis and stepwise linear discriminant analysis. Main results. Online spelling showed that the proposed parallel-BCI speller performed well, reaching a peak information transfer rate of 67.4 bit min-1, with averages of 54.0 bit min-1 and 43.0 bit min-1 in the three-round and five-round settings, respectively. Significance. The results indicated that the proposed parallel-BCI could be effectively controlled by users, with attention shifting fluently among the sub-spellers, and substantially improved BCI spelling performance.

  1. Comparison of ALE and SPH Methods for Simulating Mine Blast Effects on Structures

    DTIC Science & Technology

    2010-12-01

    Geneviève Toussaint and Amal Bouamoul, Defence R&D Canada – Valcartier, Technical Report DRDC Valcartier TR 2010-326, December 2010. The report compares the ALE and SPH methods for simulating mine blast effects on structures.

  2. Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian

    The THOR neutral particle transport code enables simulation of complex geometries for various problems, from reactor simulations to nuclear non-proliferation. It is undergoing thorough verification and validation (V&V), which requires computational efficiency. This has motivated various improvements, including angular parallelization, outer iteration acceleration, and development of peripheral tools. To guide future improvements to the code's efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL's Falcon HPC system, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. Evaluation of several possible sources of variability resulted in a communication model and a parallel portion model. The former's accuracy is bounded by the variability of communication on Falcon, while the latter has an error on the order of 1%.
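
    As a rough illustration of what such a model looks like, the C sketch below combines an Amdahl-style parallel-portion term with a latency/bandwidth communication term; the functional forms and constants are assumptions for illustration only and are not the THOR PPM.

      /* Illustrative two-part runtime model: compute portion plus a
         latency/bandwidth communication term; forms and constants assumed. */
      #include <stdio.h>

      /* Modeled wall time for p processes. */
      static double modeled_time(double t_serial, double t_parallel,
                                 double latency, double bytes,
                                 double bandwidth, int p) {
          double compute = t_serial + t_parallel / p;
          double comm = (p > 1) ? latency + bytes / bandwidth : 0.0;
          return compute + comm;
      }

      int main(void) {
          for (int p = 1; p <= 64; p *= 2)   /* assumed 1 MB exchanged per rank */
              printf("p = %2d  T = %7.3f s\n", p,
                     modeled_time(1.0, 100.0, 1e-3, 1e6 * (double)p, 1e9, p));
          return 0;
      }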

  3. Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime

    PubMed Central

    Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

    2012-01-01

    We present ℓ1-SPIRiT, a simple algorithm for autocalibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via Poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
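
    The joint soft-thresholding step can be illustrated compactly. In the C sketch below, coefficients at the same wavelet location are shrunk together across channels by the factor max(0, 1 - λ/||c||_2); real-valued data and all names are simplifying assumptions, whereas the actual reconstruction operates on complex coefficients inside an iterative loop with SPIRiT consistency steps.

      /* Minimal sketch of joint (cross-channel) soft-thresholding; real
         data for brevity, names assumed, not the ℓ1-SPIRiT source code. */
      #include <math.h>
      #include <stdio.h>

      #define NCOEF 4                 /* wavelet locations */
      #define NCHAN 2                 /* parallel-imaging channels */

      /* coef[c][k] is coefficient k of channel c; shrinks in place. */
      static void joint_soft_threshold(double coef[NCHAN][NCOEF], double lambda) {
          for (int k = 0; k < NCOEF; ++k) {
              double norm = 0.0;
              for (int c = 0; c < NCHAN; ++c)
                  norm += coef[c][k] * coef[c][k];
              norm = sqrt(norm);                        /* joint magnitude */
              double scale = (norm > lambda) ? 1.0 - lambda / norm : 0.0;
              for (int c = 0; c < NCHAN; ++c)
                  coef[c][k] *= scale;                  /* shared shrinkage */
          }
      }

      int main(void) {
          double coef[NCHAN][NCOEF] = {{3.0, 0.2, -1.5, 0.05},
                                       {1.0, 0.1,  2.0, 0.02}};
          joint_soft_threshold(coef, 0.5);
          for (int c = 0; c < NCHAN; ++c)
              for (int k = 0; k < NCOEF; ++k)
                  printf("coef[%d][%d] = %+.3f\n", c, k, coef[c][k]);
          return 0;
      }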

  4. Role of sph2 Gene Regulation in Hemolytic and Sphingomyelinase Activities Produced by Leptospira interrogans.

    PubMed

    Narayanavari, Suneel A; Lourdault, Kristel; Sritharan, Manjula; Haake, David A; Matsunaga, James

    2015-01-01

    Pathogenic members of the genus Leptospira are the causative agents of leptospirosis, a neglected disease of public and veterinary health concern. Leptospirosis is a systemic disease that in its severest forms leads to renal insufficiency, hepatic dysfunction, and pulmonary failure. Many strains of Leptospira produce hemolytic and sphingomyelinase activities, and a number of candidate leptospiral hemolysins have been identified based on sequence similarity to well-characterized bacterial hemolysins. Five of the putative hemolysins are sphingomyelinase paralogs. Although recombinant forms of the sphingomyelinase Sph2 and other hemolysins lyse erythrocytes, none have been demonstrated to contribute to the hemolytic activity secreted by leptospiral cells. In this study, we examined the regulation of sph2 and its relationship to the hemolytic and sphingomyelinase activities produced by several L. interrogans strains cultivated under the osmotic conditions found in the mammalian host. The sph2 gene was poorly expressed when the Fiocruz L1-130 (serovar Copenhageni), 56601 (sv. Lai), and L495 (sv. Manilae) strains were cultivated in the standard culture medium EMJH. Raising EMJH osmolarity to physiological levels with sodium chloride enhanced Sph2 production in all three strains. In addition, the Pomona subtype kennewicki strain LC82-25 produced substantially greater amounts of Sph2 during standard EMJH growth than the other strains, and sph2 expression increased further upon the addition of salt. When 10% rat serum was present in EMJH along with the sodium chloride supplement, Sph2 production increased further in all strains. Osmotic regulation and differences in basal Sph2 production in the Manilae L495 and Pomona strains correlated with the levels of secreted hemolysin and sphingomyelinase activities. Finally, a transposon insertion in sph2 dramatically reduced hemolytic and sphingomyelinase activities during incubation of L. interrogans at physiologic osmolarity.

  5. Solar wind interaction with Venus and Mars in a parallel hybrid code

    NASA Astrophysics Data System (ADS)

    Jarvinen, Riku; Sandroos, Arto

    2013-04-01

    We discuss the development and applications of a new parallel hybrid simulation, in which ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction between the solar wind and Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform, also developed at the FMI. The FMI's sequential hybrid model has been used for studies of the plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. In particular, the model has been used to interpret in situ particle and magnetic field observations from the plasma environments of Mars, Venus and Titan. Further, Corsair is an open-source MPI (Message Passing Interface) particle and mesh simulation platform, aimed mainly at simulations of diffusive shock acceleration in the solar corona and interplanetary space, but now also being extended for global planetary hybrid simulations. In this presentation we discuss the challenges and strategies of parallelizing a legacy simulation code, as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.

  6. Au36(SPh)23 nanomolecules.

    PubMed

    Nimmala, Praneeth Reddy; Dass, Amala

    2011-06-22

    A new core size protected completely by an aromatic thiol, Au36(SPh)23, is synthesized and characterized by MALDI-TOF mass spectrometry and UV-visible spectroscopy. The synthesis involving core size changes is studied by MS, and the complete ligand coverage by the aromatic thiol group is shown by NMR.

  7. Coupled SPH-FV method with net vorticity and mass transfer

    NASA Astrophysics Data System (ADS)

    Chiron, L.; Marrone, S.; Di Mascio, A.; Le Touzé, D.

    2018-07-01

    Recently, an algorithm for coupling a Finite Volume (FV) method, which discretizes the Navier-Stokes equations on block-structured Eulerian grids, with the weakly-compressible Lagrangian Smoothed Particle Hydrodynamics (SPH) method was presented in [16]. The algorithm takes advantage of the SPH method to discretize flow regions close to free surfaces and of the FV method to resolve the bulk flow and the wall regions. The continuity between the two solutions is guaranteed by overlapping zones. Here we extend the algorithm by adding the possibility to have: 1) net mass transfer between the SPH and FV sub-domains; 2) a free surface across the overlapping region. In this context, particle generation at common boundaries is required to prevent depletion or clustering of particles. This operation is not trivial, because consistency between the Lagrangian and Eulerian descriptions of the flow must be retained to ensure mass conservation. We propose here a new coupling paradigm that extends the algorithm developed in [16] and renders it suitable for test cases where vorticity and the free surface significantly pass from one domain to the other. On the SPH side, a novel technique for the creation/deletion of particles was developed. On the FV side, the information recovered from the SPH solver is exploited to improve free-surface prediction in a fashion that resembles Particle Level-Set algorithms. The combination of the two new features was tested and validated in a number of test cases where both vorticity and front evolution are important. Convergence and robustness of the algorithm are shown.

  8. A parallel and modular deformable cell Car-Parrinello code

    NASA Astrophysics Data System (ADS)

    Cavazzoni, Carlo; Chiarotti, Guido L.

    1999-12-01

    We have developed a modular parallel code implementing the Car-Parrinello [Phys. Rev. Lett. 55 (1985) 2471] algorithm, including the variable cell dynamics [Europhys. Lett. 36 (1994) 345; J. Phys. Chem. Solids 56 (1995) 510]. Our code is written in Fortran 90 and makes use of programming concepts like encapsulation, data abstraction and data hiding. The code has a multi-layer hierarchical structure with tree-like dependencies among modules. The modules include not only the variables but also the methods acting on them, in an object-oriented fashion. The modular structure allows easier code maintenance, development and debugging, and is suitable for a developer team. The layer structure permits high portability. The code displays an almost linear speed-up over a wide range of processor counts, independently of the architecture. Super-linear speed-up is obtained with a "smart" Fast Fourier Transform (FFT) that uses the available memory on the single node (increasing, for a fixed problem, with the number of processing elements) as a temporary buffer to store wave function transforms. This code has been used to simulate water and ammonia at giant planet conditions for systems as large as 64 molecules for ˜50 ps.

  9. SPH/N-Body simulations of small (D = 10km) asteroidal breakups and improved parametric relations for Monte-Carlo collisional models

    NASA Astrophysics Data System (ADS)

    Ševeček, P.; Brož, M.; Nesvorný, D.; Enke, B.; Durda, D.; Walsh, K.; Richardson, D. C.

    2017-11-01

    We report on our study of asteroidal breakups, i.e. fragmentations of targets, subsequent gravitational reaccumulation and formation of small asteroid families. We focused on parent bodies with diameters Dpb = 10 km. Simulations were performed with a smoothed-particle hydrodynamics (SPH) code combined with an efficient N-body integrator. We assumed various projectile sizes, impact velocities and impact angles (125 runs in total). Resulting size-frequency distributions are significantly different from scaled-down simulations with Dpb = 100 km targets (Durda et al., 2007). We derive new parametric relations describing fragment distributions, suitable for Monte-Carlo collisional models. We also characterize velocity fields and angular distributions of fragments, which can be used as initial conditions for N-body simulations of small asteroid families. Finally, we discuss a number of uncertainties related to SPH simulations.

  10. X-Ray modeling of η Carinae & WR 140 from SPH simulations

    NASA Astrophysics Data System (ADS)

    Russell, Christopher M. P.; Corcoran, Michael F.; Okazaki, Atsuo T.; Madura, Thomas I.; Owocki, Stanley P.

    2011-07-01

    The colliding wind binary (CWB) systems η Carinae and WR140 provide unique laboratories for X-ray astrophysics. Their wind-wind collisions produce hard X-rays that have been monitored extensively by several X-ray telescopes, including RXTE. To interpret these RXTE X-ray light curves, we apply 3D hydrodynamic simulations of the wind-wind collision using smoothed particle hydrodynamics (SPH). We find that adiabatic simulations that account for the absorption of X-rays from an assumed point source of X-ray emission at the apex of the wind-collision shock cone can closely match the RXTE light curves of both η Car and WR140. This point-source model can also explain the early recovery of η Car's X-ray light curve from the 2009.0 minimum by a factor of 2-4 reduction in the mass loss rate of η Car. Our more recent models account for the extended emission and absorption along the full wind-wind interaction shock front. For WR140, the computed X-ray light curves again match the RXTE observations quite well. But for η Car, a hot, post-periastron bubble leads to an emission level that does not match the extended X-ray minimum observed by RXTE. Initial results from incorporating radiative cooling and radiative forces via an anti-gravity approach into the SPH code are also discussed.

  11. SWIFT: SPH With Inter-dependent Fine-grained Tasking

    NASA Astrophysics Data System (ADS)

    Schaller, Matthieu; Gonnet, Pedro; Chalk, Aidan B. G.; Draper, Peter W.

    2018-05-01

    SWIFT runs cosmological simulations on peta-scale machines, solving gravity and SPH. It uses the Fast Multipole Method (FMM) to calculate gravitational forces between nearby particles, combining these with long-range forces provided by a mesh that captures both the periodic nature of the calculation and the expansion of the simulated universe. SWIFT currently uses a single fixed but time-variable softening length for all the particles. Many useful external potentials are also available, such as galaxy haloes or stratified boxes that are used in idealised problems. SWIFT implements a standard ΛCDM cosmological background expansion and solves the equations in a comoving frame; the dark-energy equation of state evolves with the scale factor. The structure of the code allows modified-gravity solvers or self-interacting dark matter schemes to be implemented. Many hydrodynamics schemes are implemented in SWIFT, and the software allows users to add their own.

  12. [Series: Medical Applications of the PHITS Code (2): Acceleration by Parallel Computing].

    PubMed

    Furuta, Takuya; Sato, Tatsuhiko

    2015-01-01

    Time-consuming Monte Carlo dose calculations have become feasible owing to advances in computer technology, in particular the emergence of multi-core high-performance computers. Parallel computing has therefore become key to achieving good performance in software programs. The Monte Carlo simulation code PHITS contains two parallel computing functions: distributed-memory parallelization using the Message Passing Interface (MPI) protocol and shared-memory parallelization using open multi-processing (OpenMP) directives. Users can choose between the two functions according to their needs. This paper explains the two functions, with their advantages and disadvantages. Some test applications are also provided to show their performance on a typical multi-core high-performance workstation.
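
    The two parallelization modes can be combined in a single program, as in the following toy Monte Carlo kernel in C: MPI distributes batches of histories across processes, while OpenMP threads share each node's batch. The kernel (a simple hit-or-miss tally) illustrates the pattern only and is not PHITS source code.

      /* Hybrid MPI + OpenMP sketch of distributed- and shared-memory
         parallelism; the tally is a stand-in for particle transport.
         Typical build: mpicc -fopenmp hybrid_mc.c -o hybrid_mc */
      #include <mpi.h>
      #include <omp.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);
          int rank, nproc;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);

          const long histories = 1000000;           /* per process */
          long hits = 0;

          /* Shared-memory parallelism: each thread samples its share. */
          #pragma omp parallel reduction(+:hits)
          {
              unsigned int seed = 1234u + 17u * rank + omp_get_thread_num();
              #pragma omp for
              for (long i = 0; i < histories; ++i) {
                  double x = rand_r(&seed) / (double)RAND_MAX;
                  double y = rand_r(&seed) / (double)RAND_MAX;
                  if (x * x + y * y < 1.0) ++hits;  /* "interaction" tally */
              }
          }

          /* Distributed-memory reduction across MPI processes. */
          long total = 0;
          MPI_Reduce(&hits, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              printf("pi ~ %.5f from %ld histories\n",
                     4.0 * total / (histories * (double)nproc),
                     histories * (long)nproc);
          MPI_Finalize();
          return 0;
      }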

  13. Development of Parallel Computing Framework to Enhance Radiation Transport Code Capabilities for Rare Isotope Beam Facility Design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kostin, Mikhail; Mokhov, Nikolai; Niita, Koji

    A parallel computing framework has been developed for use with general-purpose radiation transport codes. The framework was implemented as a C++ module that uses MPI for message passing. It is intended to be used with older radiation transport codes implemented in Fortran 77, Fortran 90 or C. The module is largely independent of the radiation transport codes it can be used with, and is connected to the codes by means of a number of interface functions. The framework was developed and tested in conjunction with the MARS15 code. It is possible to use it with other codes, such as PHITS, FLUKA and MCNP, after certain adjustments. Besides the parallel computing functionality, the framework offers a checkpoint facility that allows restarting calculations from a saved checkpoint file. The checkpoint facility can be used in single-process calculations as well as in the parallel regime. The framework corrects some of the known problems with the scheduling and load balancing found in the original implementations of the parallel computing functionality in MARS15 and PHITS. The framework can be used efficiently on homogeneous systems and on networks of workstations, where interference from other users is possible.
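
    A checkpoint facility of this kind can be sketched in a few lines of C. The file layout, names, and checkpoint granularity below are invented for illustration; a production framework would also save, for example, random-number-generator state and version metadata.

      /* Minimal checkpoint/restart sketch; format and names are assumed. */
      #include <stdio.h>

      typedef struct { long histories_done; double tally; } RunState;

      /* Save the running state; returns 0 on success. */
      static int checkpoint_save(const char *path, const RunState *s) {
          FILE *f = fopen(path, "wb");
          if (!f) return -1;
          size_t ok = fwrite(s, sizeof *s, 1, f);
          fclose(f);
          return ok == 1 ? 0 : -1;
      }

      /* Restore state if a checkpoint exists; returns 1 if restored. */
      static int checkpoint_load(const char *path, RunState *s) {
          FILE *f = fopen(path, "rb");
          if (!f) return 0;
          size_t ok = fread(s, sizeof *s, 1, f);
          fclose(f);
          return ok == 1;
      }

      int main(void) {
          RunState st = {0, 0.0};
          if (checkpoint_load("run.ckpt", &st))
              printf("restarting at history %ld\n", st.histories_done);
          for (long i = st.histories_done; i < 1000; ++i) {
              st.tally += 1.0;                  /* stand-in for transport work */
              st.histories_done = i + 1;
              if (st.histories_done % 250 == 0)
                  checkpoint_save("run.ckpt", &st);
          }
          printf("done: tally = %.0f\n", st.tally);
          return 0;
      }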

  14. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  15. Examining the accuracy of astrophysical disk simulations with a generalized hydrodynamical test problem [The role of pressure and viscosity in SPH simulations of astrophysical disks]

    DOE PAGES

    Raskin, Cody; Owen, J. Michael

    2016-10-24

    Here, we discuss a generalization of the classic Keplerian disk test problem allowing for both pressure and rotational support, as a method of testing astrophysical codes incorporating both gravitation and hydrodynamics. We argue for the inclusion of pressure in rotating disk simulations on the grounds that realistic, astrophysical disks exhibit non-negligible pressure support. We then apply this test problem to examine the performance of various smoothed particle hydrodynamics (SPH) methods incorporating a number of improvements proposed over the years to address problems noted in modeling the classical gravitation-only Keplerian disk. We also apply this test to a newly developed extension of SPH based on reproducing kernels, called CRKSPH. Counterintuitively, we find that pressure support worsens the performance of traditional SPH on this problem, causing unphysical collapse away from the steady-state disk solution even more rapidly than the purely gravitational problem, whereas CRKSPH greatly reduces this error.

  16. PIXIE3D: A Parallel, Implicit, eXtended MHD 3D Code.

    NASA Astrophysics Data System (ADS)

    Chacon, L.; Knoll, D. A.

    2004-11-01

    We report on the development of PIXIE3D, a 3D parallel, fully implicit Newton-Krylov extended primitive-variable MHD code in general curvilinear geometry. PIXIE3D employs a second-order, finite-volume-based spatial discretization that satisfies remarkable properties such as being conservative, solenoidal in the magnetic field, non-dissipative, and stable in the absence of physical dissipation [L. Chacón, Comput. Phys. Comm., submitted (2004)]. PIXIE3D employs fully-implicit Newton-Krylov methods for the time advance. Currently, first and second-order implicit schemes are available, although higher-order temporal implicit schemes can be effortlessly implemented within the Newton-Krylov framework. A successful, scalable, multigrid (MG) physics-based preconditioning strategy, similar in concept to previous 2D MHD efforts [L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002); J. Comput. Phys. 188 (2), 573-592 (2003)], has been developed. We are currently in the process of parallelizing the code using the PETSc library, and a Newton-Krylov-Schwarz approach for the parallel treatment of the preconditioner. In this poster, we will report on both the serial and parallel performance of PIXIE3D, focusing primarily on scalability and CPU speedup vs. an explicit approach.

  17. Rooted tRNAomes and evolution of the genetic code

    PubMed Central

    Pak, Daewoo; Du, Nan; Kim, Yunsoo; Sun, Yanni

    2018-01-01

    We advocate for a tRNA- rather than an mRNA-centric model for the evolution of the genetic code. The mechanism for evolution of cloverleaf tRNA provides a root sequence for the radiation of tRNAs and suggests a simplified understanding of code evolution. To analyze code sectoring, rooted tRNAomes were compared for several archaeal and one bacterial species. Rooting of tRNAome trees reveals conserved structures, indicating how the code was shaped during evolution and suggesting a model for evolution of a LUCA tRNAome tree. We propose the polyglycine hypothesis that the initial product of the genetic code may have been short-chain polyglycine to stabilize protocells. In order to describe how anticodons were allotted in evolution, the sectoring-degeneracy hypothesis is proposed. Based on sectoring, a simple stepwise model is developed, in which the code sectors from a 1→4→8→∼16 letter code. At initial stages of code evolution, we posit strong positive selection for wobble base ambiguity, supporting convergence to 4-codon sectors and ∼16 letters. In a later stage, ∼5-6 letters, including stops, were added through innovating at the anticodon wobble position. In archaea and bacteria, tRNA wobble adenine is negatively selected, shrinking the maximum size of the primordial genetic code to 48 anticodons. Because 64 codons are recognized in mRNA, tRNA-mRNA coevolution requires tRNA wobble position ambiguity, leading to degeneracy of the code. PMID:29372672

  18. Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry

    1998-01-01

    This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of the NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non-Uniform Memory Access (ccNUMA) architecture. We report the measurement-based performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.

  19. Two-way coupled SPH and particle level set fluid simulation.

    PubMed

    Losasso, Frank; Talton, Jerry; Kwatra, Nipun; Fedkiw, Ronald

    2008-01-01

    Grid-based methods have difficulty resolving features on or below the scale of the underlying grid. Although adaptive methods (e.g. RLE, octrees) can alleviate this to some degree, separate techniques are still required for simulating small-scale phenomena such as spray and foam, especially since these more diffuse materials typically behave quite differently from their denser counterparts. In this paper, we propose a two-way coupled simulation framework that uses the particle level set method to efficiently model dense liquid volumes and a smoothed particle hydrodynamics (SPH) method to simulate diffuse regions such as sprays. Our novel SPH method allows us to simulate both dense and diffuse water volumes, fully incorporates the particles that are automatically generated by the particle level set method in under-resolved regions, and allows for two-way mixing between dense SPH volumes and grid-based liquid representations.

  20. Parallelization of GeoClaw code for modeling geophysical flows with adaptive mesh refinement on many-core systems

    USGS Publications Warehouse

    Zhang, S.; Yuen, D.A.; Zhu, A.; Song, S.; George, D.L.

    2011-01-01

    We parallelized the GeoClaw code on a one-level grid using OpenMP in March 2011 to meet the urgent need to simulate near-shore tsunami waves from the 2011 Tohoku event, and achieved over 75% of the potential speed-up on an eight-core Dell Precision T7500 workstation [1]. After submitting that work to SC11, the International Conference for High Performance Computing, we obtained an unreleased OpenMP version of GeoClaw from David George, who developed the GeoClaw code as part of his Ph.D. thesis. In this paper, we show the complementary characteristics of the two approaches used in parallelizing GeoClaw and the speed-up obtained by combining the advantages of the two individual approaches with adaptive mesh refinement (AMR), demonstrating the capability of running GeoClaw efficiently on many-core systems. We also show a novel simulation of the 2011 Tohoku tsunami waves inundating the Sendai airport and Fukushima Nuclear Power Plants, in which the finest grid distance of 20 meters is achieved through a 4-level AMR. This simulation yields quite good predictions of the wave heights and travel time of the tsunami waves. © 2011 IEEE.

  1. Integration of SPH Students with Non-Handicapped Peers at Pine Ridge Center.

    ERIC Educational Resources Information Center

    Pegnatore, Linda A.

    A 9-week practicum was designed to integrate severely/profoundly handicapped (SPH) students with third-grade nonhandicapped peer tutors in Broward County, Florida. Additional objectives were to promote greater understanding of handicaps by nonhandicapped peer tutors and to increase awareness by SPH teachers of the importance of interactions…

  2. A serine proteinase homologue, SPH-3, plays a central role in insect immunity.

    PubMed

    Felföldi, Gabriella; Eleftherianos, Ioannis; Ffrench-Constant, Richard H; Venekei, István

    2011-04-15

    Numerous vertebrate and invertebrate genes encode serine proteinase homologues (SPHs) similar to members of the serine proteinase family, but lacking one or more residues of the catalytic triad. These SPH proteins are thought to play a role in immunity, but their precise functions are poorly understood. In this study, we show that SPH-3 (an insect non-clip domain-containing SPH) is of central importance in the immune response of a model lepidopteran, Manduca sexta. We examine M. sexta infection with a virulent, insect-specific, Gram-negative bacterium Photorhabdus luminescens. RNA interference suppression of bacteria-induced SPH-3 synthesis severely compromises the insect's ability to defend itself against infection by preventing the transcription of multiple antimicrobial effector genes, but, surprisingly, not the transcription of immune recognition genes. Upregulation of the gene encoding prophenoloxidase and the activity of the phenoloxidase enzyme are among the antimicrobial responses that are severely attenuated on SPH-3 knockdown. These findings suggest the existence of two largely independent signaling pathways controlling immune recognition by the fat body, one governing effector gene transcription, and the other regulating genes encoding pattern recognition proteins.

  3. GIZMO: Multi-method magneto-hydrodynamics+gravity code

    NASA Astrophysics Data System (ADS)

    Hopkins, Philip F.

    2014-10-01

    GIZMO is a flexible, multi-method magneto-hydrodynamics+gravity code that solves the hydrodynamic equations using a variety of different methods. It introduces new Lagrangian Godunov-type methods that allow solving the fluid equations with a moving particle distribution that is automatically adaptive in resolution and avoids the advection errors, angular momentum conservation errors, and excessive diffusion problems that seriously limit the applicability of “adaptive mesh refinement” (AMR) codes, while simultaneously avoiding the low-order errors inherent to simpler methods like smoothed-particle hydrodynamics (SPH). GIZMO also allows the use of SPH either in “traditional” form or “modern” (more accurate) forms, or use of a mesh. Self-gravity is solved quickly with a BH-Tree (optionally a hybrid PM-Tree for periodic boundaries) and on-the-fly adaptive gravitational softenings. The code is descended from P-GADGET, itself descended from GADGET-2 (ascl:0003.001), and many of the naming conventions remain (for the sake of compatibility with the large library of GADGET work and analysis software).

  4. Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP

    NASA Technical Reports Server (NTRS)

    Long, Lyle N.; Brentner, Kenneth S.

    2000-01-01

    This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly but must often be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but that, due to the number of instances that must be run, become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations, representative of an aerodynamic panel code where one would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can readily be used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams-Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
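
    The essence of such self-scheduling is a master rank that deals out independent job indices on demand, so faster workers automatically take more jobs. The C/MPI sketch below shows the pattern with a stand-in job body; it is not WOPWOP code, and all names are illustrative.

      /* Self-scheduling (master/worker) sketch: rank 0 hands out job ids,
         e.g. observer locations or angles of attack, on request.
         Build: mpicc self_sched.c -o self_sched */
      #include <mpi.h>
      #include <stdio.h>

      #define NJOBS 20
      #define TAG_WORK 1
      #define TAG_STOP 2

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);
          int rank, nproc;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nproc);

          if (rank == 0) {                  /* master: deal out job ids */
              int next = 0, active = nproc - 1, dummy;
              MPI_Status st;
              while (active > 0) {
                  MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE,
                           MPI_ANY_TAG, MPI_COMM_WORLD, &st);  /* work request */
                  if (next < NJOBS) {
                      MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE,
                               TAG_WORK, MPI_COMM_WORLD);
                      ++next;
                  } else {
                      MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE,
                               TAG_STOP, MPI_COMM_WORLD);
                      --active;
                  }
              }
          } else {                          /* worker: ask, run, repeat */
              int job, ask = 0;
              MPI_Status st;
              for (;;) {
                  MPI_Send(&ask, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
                  MPI_Recv(&job, 1, MPI_INT, 0, MPI_ANY_TAG,
                           MPI_COMM_WORLD, &st);
                  if (st.MPI_TAG == TAG_STOP) break;
                  printf("rank %d runs job %d\n", rank, job);  /* serial job */
              }
          }
          MPI_Finalize();
          return 0;
      }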

  5. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase I is complete for the development of a parallel Computational Fluid Dynamics (CFD) code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine-grained environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  6. Polymeric cobalt(ii) thiolato complexes - syntheses, structures and properties of [Co(SMes)2] and [Co(SPh)2NH3].

    PubMed

    Eichhöfer, Andreas; Buth, Gernot

    2016-11-01

    Reactions of [Co(N(SiMe3)2)2thf] with 2.1 equiv. of MesSH (Mes = C6H2-2,4,6-(CH3)3) yield dark brown crystals of the one-dimensional chain compound [Co(SMes)2]. In contrast, reactions of [Co(N(SiMe3)2)2thf] with 2.1 equiv. of PhSH result in the formation of a dark brown, almost X-ray amorphous powder of 'Co(SPh)2'. Addition of aliquots of CH3OH to the latter reaction resulted in the almost quantitative formation of crystalline ammonia thiolato complexes, either [Co(SPh)2(NH3)2] or [Co(SPh)2NH3]. Single-crystal XRD reveals that [Co(SPh)2NH3] forms one-dimensional chains in the crystal via μ2-SPh bridges, whereas [Co(SPh)2(NH3)2] consists, at first glance, of isolated distorted tetrahedral units. Magnetic measurements suggest strong antiferromagnetic coupling for the two chain compounds [Co(SMes)2] (J = -38.6 cm-1) and [Co(SPh)2NH3] (J = -27.1 cm-1). Interestingly, the temperature dependence of the susceptibility of tetrahedral [Co(SPh)2(NH3)2] also shows an antiferromagnetic transition at around 6 K. UV-Vis-NIR spectra display d-d bands in the NIR region between 500 and 2250 nm. Thermal gravimetric analysis of [Co(SPh)2(NH3)2] and [Co(SPh)2NH3] reveals two well-separated cleavage processes for NH3 and SPh2 upon heating, accompanied by the stepwise formation of 'Co(SPh)2' and cobalt sulfide.

  7. CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation

    NASA Astrophysics Data System (ADS)

    Schneider, Evan E.; Robertson, Brant E.

    2015-04-01

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256³) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.

  8. X-ray Modeling of η Carinae & WR140 from SPH Simulations

    NASA Astrophysics Data System (ADS)

    Russell, Christopher M. P.; Corcoran, Michael F.; Okazaki, Atsuo T.; Madura, Thomas I.; Owocki, Stanley P.

    2011-01-01

    The colliding wind binary (CWB) systems η Carinae and WR140 provide unique laboratories for X-ray astrophysics. Their wind-wind collisions produce hard X-rays that have been monitored extensively by several X-ray telescopes, including RXTE. To interpret these RXTE X-ray light curves, we model the wind-wind collision using 3D smoothed particle hydrodynamics (SPH) simulations. Adiabatic simulations that account for the emission and absorption of X-rays from an assumed point source at the apex of the wind-collision shock cone by the distorted winds can closely match the observed 2-10 keV RXTE light curves of both η Car and WR140. This point-source model can also explain the early recovery of η Car's X-ray light curve from the 2009.0 minimum by a factor of 2-4 reduction in the mass loss rate of η Car. Our more recent models relax the point-source approximation and account for the spatially extended emission along the wind-wind interaction shock front. For WR140, the computed X-ray light curve again matches the RXTE observations quite well. But for η Car, a hot, post-periastron bubble leads to an emission level that does not match the extended X-ray minimum observed by RXTE. Initial results from incorporating radiative cooling and radiatively-driven wind acceleration via a new anti-gravity approach into the SPH code are also discussed.

  9. Soluble soy protein peptic hydrolysate stimulates adipocyte differentiation in 3T3-L1 cells.

    PubMed

    Goto, Tsuyoshi; Mori, Ayaka; Nagaoka, Satoshi

    2013-08-01

    The molecular mechanisms underlying the potential health benefit effects of soybean proteins on obesity-associated metabolic disorders have not been fully clarified. In this study, we investigated the effects of soluble soybean protein peptic hydrolysate (SPH) on adipocyte differentiation by using 3T3-L1 murine preadipocytes. The addition of SPH increased lipid accumulation during adipocyte differentiation. SPH increased the mRNA expression levels of an adipogenic marker gene and decreased that of a preadipocyte marker gene, suggesting that SPH promotes adipocyte differentiation. SPH induced antidiabetic and antiatherogenic adiponectin mRNA expression and secretion. Moreover, SPH increased the mRNA expression levels of insulin-responsive glucose transporter 4 and insulin-stimulated glucose uptake. The expression levels of peroxisome proliferator-activated receptor γ (PPARγ), a key regulator of adipocyte differentiation, during adipocyte differentiation were up-regulated in 3T3-L1 cells treated with SPH, and lipid accumulation during adipocyte differentiation induced by SPH was inhibited in the presence of a PPARγ antagonist. However, SPH did not exhibit PPARγ ligand activity. These findings indicate that SPH stimulates adipocyte differentiation, at least in part, via the up-regulation of PPARγ expression levels. These effects of SPH might be important for the health benefit effects of soybean proteins on obesity-associated metabolic disorders.

  10. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    NASA Astrophysics Data System (ADS)

    Moon, Hongsik

    What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research, and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization, and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared on performance using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to the type and utilization of the hardware, such as the CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS, and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the

  11. Molecular definition of red cell Rh haplotypes by tightly linked SphI RFLPs.

    PubMed

    Huang, C H; Reid, M E; Chen, Y; Coghlan, G; Okubo, Y

    1996-01-01

    The Rh blood group system of human red cells contains five major antigens D, C/c, and E/e (the latter four designated "non-D") that are specified by eight gene complexes known as Rh haplotypes. In this paper, we report on the mapping of the RH locus and identification of a set of SphI RFLPs that are tightly linked with the Rh structural genes. Using exon-specific probes, we have localized the SphI cleavage sites resulting in these DNA markers and derived a comprehensive map for the RH locus. It was found that the SphI fragments encompassing exons 4-7 of the Rh genes occur in four banding patterns or frameworks that correspond to the distribution and segregation of the common Rh haplotypes. This linkage disequilibrium allowed a genotype-phenotype correlation and direct determination of Rh zygosity related to the Rh-positive or Rh-negative status (D/D, D/d, and d/d). Studies on the occurrence of SphI RFLPs in a number of rare Rh variants indicated that Rh phenotypic diversity has taken place on different haplotype backgrounds and has arisen by diverse genetic mechanisms. The molecular definition of Rh haplotypes by SphI RFLP frameworks should provide a useful procedure for genetic counseling and prenatal assessment of Rh alloimmunization.

  12. Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

    NASA Astrophysics Data System (ADS)

    Work, Paul R.

    1991-12-01

    This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.

  13. A Multiple Sphere T-Matrix Fortran Code for Use on Parallel Computer Clusters

    NASA Technical Reports Server (NTRS)

    Mackowski, D. W.; Mishchenko, M. I.

    2011-01-01

    A general-purpose Fortran-90 code for calculation of the electromagnetic scattering and absorption properties of multiple sphere clusters is described. The code can calculate the efficiency factors and scattering matrix elements of the cluster for either fixed or random orientation with respect to the incident beam and for plane wave or localized-approximation Gaussian incident fields. In addition, the code can calculate maps of the electric field both interior and exterior to the spheres. The code is written with message passing interface instructions to enable the use on distributed memory compute clusters, and for such platforms the code can make feasible the calculation of absorption, scattering, and general EM characteristics of systems containing several thousand spheres.

  14. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    NASA Technical Reports Server (NTRS)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described, and performance measurements for the single- and multiprocessor implementations are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number Reτ = 180 has shown good agreement with existing data.

  15. A strategy to couple the material point method (MPM) and smoothed particle hydrodynamics (SPH) computational techniques

    NASA Astrophysics Data System (ADS)

    Raymond, Samuel J.; Jones, Bruce; Williams, John R.

    2018-01-01

    A strategy is introduced to allow coupling of the material point method (MPM) and smoothed particle hydrodynamics (SPH) for numerical simulations. This new strategy partitions the domain into SPH and MPM regions; particles carry all state variables, and as such no special treatment is required for the transition between regions. The aim of this work is to derive and validate the coupling methodology between MPM and SPH. Such coupling allows general boundary conditions to be used in an SPH simulation without further augmentation. Additionally, SPH is a purely particle-based method, while MPM combines particles with a mesh, so this coupling also permits a smooth transition from particle methods to mesh methods, where further coupling to mesh methods could in the future provide an effective far-field boundary treatment for the SPH method. The coupling technique is introduced and described alongside a number of simulations in 1D and 2D to validate and contextualize the potential of using these two methods in a single simulation. The strategy shown here is capable of fully coupling the two methods without any complicated algorithms to transform information from one method to another.

  16. An improved weakly compressible SPH method for simulating free surface flows of viscous and viscoelastic fluids

    NASA Astrophysics Data System (ADS)

    Xu, Xiaoyang; Deng, Xiao-Long

    2016-04-01

    In this paper, an improved weakly compressible smoothed particle hydrodynamics (SPH) method is proposed to simulate transient free surface flows of viscous and viscoelastic fluids. The improved SPH algorithm includes the implementation of (i) a mixed symmetric correction of the kernel gradient, to improve the accuracy and stability of the traditional SPH method, and (ii) a Rusanov flux in the continuity equation, to improve the computation of pressure distributions in liquid dynamics. To assess the effectiveness of the improved SPH algorithm, a number of numerical examples, including the stretching of an initially circular water drop, dam-breaking flow against a vertical wall, the impact of viscous and viscoelastic fluid drops on a rigid wall, and the extrudate swell of a viscoelastic fluid, are presented and compared with available numerical and experimental data in the literature. The convergence behavior of the improved SPH algorithm has also been studied by using different numbers of particles. All numerical results demonstrate that the improved SPH algorithm proposed here is capable of modeling free surface flows of viscous and viscoelastic fluids accurately and stably and, even more importantly, of computing an accurate pressure field with little oscillation.
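
    The role of the Rusanov flux can be sketched in one dimension: the continuity equation gains a diffusive term that relaxes a particle's density toward that of its neighbours at a rate set by the sound speed, damping spurious density (hence pressure) oscillations. In the C sketch below, the Gaussian kernel, the coefficients, and the exact form of the diffusive term are assumptions for illustration and may differ from the paper's formulation.

      /* 1D weakly compressible SPH continuity equation with an assumed
         Rusanov-type (local Lax-Friedrichs) density-diffusion term. */
      #include <math.h>
      #include <stdio.h>

      #define N 5
      static const double SQRT_PI = 1.7724538509055160;

      /* d/dx_i of a 1D Gaussian kernel W(x_i - x_j, h). */
      static double grad_w(double dx, double h) {
          double q = dx / h;
          return -2.0 * q * exp(-q * q) / (SQRT_PI * h * h);
      }

      /* drho/dt: velocity-divergence term plus Rusanov-type diffusion. */
      static void continuity_rhs(const double *x, const double *v,
                                 const double *rho, const double *m,
                                 double *rhs, int n, double h, double c) {
          for (int i = 0; i < n; ++i) {
              rhs[i] = 0.0;
              for (int j = 0; j < n; ++j) {
                  if (j == i) continue;
                  double gw = grad_w(x[i] - x[j], h);
                  rhs[i] += m[j] * (v[i] - v[j]) * gw;   /* standard term */
                  /* Diffusive term: -(x_i - x_j) * gw is always positive, so
                     rho_i relaxes toward rho_j at a rate set by the sound
                     speed c, smoothing density noise. */
                  rhs[i] += (m[j] / rho[j]) * c * (rho[j] - rho[i])
                            * (-(x[i] - x[j]) * gw) / h;
              }
          }
      }

      int main(void) {
          double x[N] = {0.0, 0.1, 0.2, 0.3, 0.4}, v[N] = {0};
          double rho[N] = {1000, 1000, 1020, 1000, 1000};  /* density spike */
          double m[N] = {100, 100, 100, 100, 100}, rhs[N];
          continuity_rhs(x, v, rho, m, rhs, N, 0.1, 10.0);
          for (int i = 0; i < N; ++i)
              printf("drho/dt[%d] = %+.3f\n", i, rhs[i]);
          return 0;
      }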

  17. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine-grained environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  18. Molecular definition of red cell Rh haplotypes by tightly linked SphI RFLPs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, C.H.; Reid, M.E.; Chen, Y.

    The Rh blood group system of human red cells contains five major antigens D, C/c, and E/e (the latter four designated "non-D") that are specified by eight gene complexes known as Rh haplotypes. In this paper, we report on the mapping of the RH locus and identification of a set of SphI RFLPs that are tightly linked with the Rh structural genes. Using exon-specific probes, we have localized the SphI cleavage sites resulting in these DNA markers and derived a comprehensive map for the RH locus. It was found that the SphI fragments encompassing exons 4-7 of the Rh genes occur in four banding patterns or frameworks that correspond to the distribution and segregation of the common Rh haplotypes. This linkage disequilibrium allowed a genotype-phenotype correlation and direct determination of Rh zygosity related to the Rh-positive or Rh-negative status (D/D, D/d, and d/d). Studies on the occurrence of SphI RFLPs in a number of rare Rh variants indicated that Rh phenotypic diversity has taken place on different haplotype backgrounds and has arisen by diverse genetic mechanisms. The molecular definition of Rh haplotypes by SphI RFLP frameworks should provide a useful procedure for genetic counseling and prenatal assessment of Rh alloimmunization. 32 refs., 7 figs., 3 tabs.

  19. Quercetin ameliorates pulmonary fibrosis by inhibiting SphK1/S1P signaling.

    PubMed

    Zhang, Xingcai; Cai, Yuli; Zhang, Wei; Chen, Xianhai

    2018-06-25

    Idiopathic pulmonary fibrosis (IPF) is a chronic disorder of unknown etiology with high morbidity and a low survival rate. Quercetin is a flavonoid found in a variety of herbs with an anti-fibrotic function. In this study, bleomycin was employed to induce a pulmonary fibrosis mouse model. Quercetin administration ameliorated bleomycin-induced pulmonary fibrosis, as evidenced by the expression-level changes of hydroxyproline, fibronectin, α-smooth muscle actin, Collagen I and Collagen III. Similar results were observed in transforming growth factor (TGF)-β-treated human embryonic lung fibroblasts (HELF). Bleomycin or TGF-β administration caused an increase of the sphingosine-1-phosphate (S1P) level in pulmonary tissue and HELF cells, as well as of the kinase required for its activation, sphingosine kinase 1 (SphK1), and its degradation enzyme, sphingosine-1-phosphate lyase (S1PL). However, the increase of S1P, SphK1 and S1PL was attenuated by the application of quercetin. In addition, the effect of quercetin on fibrosis was abolished by the ectopic expression of SphK1. The colocalization of SphK1/S1PL and fibroblast specific protein 1 (FSP1) suggested a role for fibroblasts in pulmonary fibrosis. In summary, we demonstrated that quercetin ameliorated pulmonary fibrosis in vivo and in vitro by inhibiting SphK1/S1P signaling.

  20. TOUGH2_MP: A parallel version of TOUGH2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Keni; Wu, Yu-Shu; Ding, Chris

    2003-04-09

    TOUGH2_MP is a massively parallel version of TOUGH2. It was developed for running on distributed-memory parallel computers to handle large simulation problems that cannot be solved by the standard, single-CPU TOUGH2 code. The new code implements an efficient massively parallel scheme while preserving the full capability and flexibility of the original TOUGH2 code. The new software uses the METIS software package for grid partitioning and the AZTEC software package for linear-equation solving. The standard Message Passing Interface is adopted for communication among processors. The numerical performance of the current version has been tested on CRAY-T3E and IBM RS/6000 SP platforms. In addition, the parallel code has been successfully applied to real field problems of multi-million-cell simulations of three-dimensional multiphase and multicomponent fluid and heat flow, as well as solute transport. In this paper, we review the development of TOUGH2_MP and discuss its basic features, modules, and applications.

  1. Coding for parallel execution of hardware-in-the-loop millimeter-wave scene generation models on multicore SIMD processor architectures

    NASA Astrophysics Data System (ADS)

    Olson, Richard F.

    2013-05-01

    Rendering of point-scatterer-based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real-world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity, emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include the use of POSIX threads built on vector library functions and more portable, high-level parallel code based on compiler technology (e.g., OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.
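
    This record does not include source code; as a rough, hypothetical illustration of the second coding style it names (compiler-directed parallelism via OpenMP pragmas and SIMD autovectorization), the C++ sketch below sums the complex returns of point scatterers in a single threaded-and-vectorized reduction. The function and variable names are invented for the example, not taken from the paper.

        // Hypothetical sketch: OpenMP threading plus SIMD reduction over
        // point scatterers, in the compiler-directed style the record
        // contrasts with hand-coded POSIX threads on vector libraries.
        #include <cmath>
        #include <complex>
        #include <vector>

        std::complex<float> scene_return(const std::vector<float>& range,
                                         const std::vector<float>& amp,
                                         float wavenumber) {
            float re = 0.0f, im = 0.0f;
            // Threads split the scatterer loop; 'simd' asks the compiler
            // to vectorize each thread's chunk with AVX-width arithmetic.
            #pragma omp parallel for simd reduction(+:re,im)
            for (std::size_t i = 0; i < range.size(); ++i) {
                const float phase = -2.0f * wavenumber * range[i]; // two-way path
                re += amp[i] * std::cos(phase);
                im += amp[i] * std::sin(phase);
            }
            return {re, im};
        }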

  2. Molecular mechanism for sphingosine-induced Pseudomonas ceramidase expression through the transcriptional regulator SphR

    PubMed Central

    Okino, Nozomu; Ito, Makoto

    2016-01-01

    Pseudomonas aeruginosa, an opportunistic but serious multidrug-resistant pathogen, secretes a ceramidase capable of cleaving the N-acyl linkage of ceramide to generate fatty acids and sphingosine. We previously reported that the secretion of P. aeruginosa ceramidase was induced by host-derived sphingolipids, through which phospholipase C-induced hemolysis was significantly enhanced. We herein investigated the gene(s) regulating sphingolipid-induced ceramidase expression and identified sphR, which encodes a putative AraC family transcriptional regulator. Disruption of the sphR gene in P. aeruginosa markedly decreased the sphingomyelin-induced secretion of ceramidase, reduced hemolytic activity, and resulted in the loss of sphingomyelin-induced ceramidase expression. A microarray analysis confirmed that sphingomyelin significantly induced ceramidase expression in P. aeruginosa. Furthermore, an electrophoretic mobility shift assay revealed that SphR specifically bound free sphingoid bases such as sphingosine, dihydrosphingosine, and phytosphingosine, but not sphingomyelin or ceramide. A β-galactosidase-assisted promoter assay showed that sphingosine activated ceramidase expression through SphR at a concentration of 100 nM. Collectively, these results demonstrated that sphingosine induces the secretion of ceramidase by promoting the mRNA expression of ceramidase through SphR, thereby enhancing hemolytic phospholipase C-induced cytotoxicity. These results facilitate understanding of the physiological role of bacterial ceramidase in host cells. PMID:27941831

  3. Evaluation of superporous hydrogel (SPH) and SPH composite in porcine intestine ex-vivo: assessment of drug transport, morphology effect, and mechanical fixation to intestinal wall.

    PubMed

    Dorkoosh, Farid A; Borchard, Gerrit; Rafiee-Tehrani, Morteza; Verhoef, J Coos; Junginger, Hans E

    2002-03-01

    The objective of this study was to investigate the potential of superporous hydrogel (SPH) and SPH composite (SPHC) polymers to enhance the transport of N-alpha-benzoyl-L-arginine ethylester (BAEE) and fluorescein isothiocyanate-dextran 4400 (FD4) across porcine intestinal epithelium ex-vivo, and to study any possible morphological damage to the epithelium caused by applying these polymers. In addition, the ability of these polymers to attach to the gut wall by mechanical pressure was examined using a specifically designed centrifuge model. The transport of BAEE and FD4 across the intestinal mucosa was enhanced 2- to 3-fold by applying SPHC polymer, compared with the negative control. No significant morphological damage was observed from applying these polymers inside the intestinal lumen. Moreover, the SPH and SPHC polymers were able to attach mechanically to the intestinal wall by swelling, and did not move in the intestinal lumen even when a horizontal force of 13 g m s(-2) was applied. In conclusion, these polymers are appropriate vehicles for enhancing the intestinal absorption of peptide and protein drugs.

  4. Continuum modeling of rate-dependent granular flows in SPH

    DOE PAGES

    Hurley, Ryan C.; Andrade, José E.

    2016-09-13

    In this paper, we discuss a constitutive law for modeling rate-dependent granular flows that has been implemented in smoothed particle hydrodynamics (SPH). We model granular materials using a viscoplastic constitutive law that produces a Drucker–Prager-like yield condition in the limit of vanishing flow. A friction law for non-steady flows, incorporating rate-dependence and dilation, is derived and implemented within the constitutive law. We compare our SPH simulations with experimental data, demonstrating that they can capture both steady and non-steady dynamic flow behavior, notably including transient column collapse profiles. This technique may therefore be attractive for modeling the time-dependent evolution of natural and industrial flows.

  5. A technique to remove the tensile instability in weakly compressible SPH

    NASA Astrophysics Data System (ADS)

    Xu, Xiaoyang; Yu, Peng

    2018-01-01

    When smoothed particle hydrodynamics (SPH) is directly applied to the numerical simulation of transient viscoelastic free surface flows, a numerical problem called tensile instability arises. In this paper, we develop an optimized particle shifting technique to remove the tensile instability in SPH. The basic equations governing free surface flow of an Oldroyd-B fluid are considered, and approximated by an improved SPH scheme. This includes the implementation of a kernel gradient correction and the introduction of Rusanov flux into the continuity equation. To verify the effectiveness of the optimized particle shifting technique in removing the tensile instability, three benchmarks, namely the impacting drop, the injection molding of a C-shaped cavity, and the extrudate swell, are simulated. The numerical results obtained are compared with those simulated by other numerical methods. A comparison among different numerical techniques (e.g., the artificial stress) to remove the tensile instability is further performed. All numerical results agree well with the available data.
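
    The record names particle shifting only at a high level; the C++ sketch below shows one common form of such a correction (a Fickian-type shift down the gradient of the SPH particle concentration, in the spirit of Lind et al.), which may differ in detail from the optimized variant of this paper. The names, the Gaussian kernel choice, and the shifting coefficient are illustrative assumptions.

        #include <cmath>
        #include <vector>

        struct Vec2 { double x, y; };
        const double kPi = 3.14159265358979323846;

        // Gradient of a 2D Gaussian kernel W(r,h) = exp(-(r/h)^2)/(pi h^2),
        // included only to keep the sketch self-contained.
        Vec2 grad_kernel(const Vec2& a, const Vec2& b, double h) {
            const double dx = a.x - b.x, dy = a.y - b.y;
            const double W = std::exp(-(dx*dx + dy*dy) / (h*h)) / (kPi * h*h);
            const double c = -2.0 * W / (h*h);   // (dW/dr)/r
            return {c * dx, c * dy};
        }

        // One shifting step: each particle moves down the concentration
        // gradient sum_j V_j grad W_ij, which is largest where particles
        // cluster, nudging the distribution back toward uniformity.
        void shift_particles(std::vector<Vec2>& pos,
                             const std::vector<std::vector<int>>& neigh,
                             const std::vector<double>& volume,
                             double h, double coeff /* e.g. ~0.5*h*h */) {
            std::vector<Vec2> dr(pos.size(), {0.0, 0.0});
            for (std::size_t i = 0; i < pos.size(); ++i)
                for (int j : neigh[i]) {
                    const Vec2 gW = grad_kernel(pos[i], pos[j], h);
                    dr[i].x -= coeff * volume[j] * gW.x;
                    dr[i].y -= coeff * volume[j] * gW.y;
                }
            for (std::size_t i = 0; i < pos.size(); ++i) {
                pos[i].x += dr[i].x;
                pos[i].y += dr[i].y;
            }
        }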

  6. AX-GADGET: a new code for cosmological simulations of Fuzzy Dark Matter and Axion models

    NASA Astrophysics Data System (ADS)

    Nori, Matteo; Baldi, Marco

    2018-05-01

    We present a new module of the parallel N-Body code P-GADGET3 for cosmological simulations of light bosonic non-thermal dark matter, often referred to as Fuzzy Dark Matter (FDM). The dynamics of FDM features a highly non-linear Quantum Potential (QP) that suppresses the growth of structures at small scales. Most previous attempts at FDM simulations either evolved suppressed initial conditions, completely neglecting the dynamical effects of the QP throughout cosmic evolution, or resorted to numerically challenging full-wave solvers. The new module provides an interesting alternative, following the FDM evolution without impairing the overall performance. This is done by computing the QP acceleration through the Smoothed Particle Hydrodynamics (SPH) routines, with improved schemes to ensure precise and stable derivatives. As an extension of the P-GADGET3 code, it inherits all the additional physics modules implemented to date, opening a wide range of possibilities to constrain FDM models and explore their degeneracies with other physical phenomena. Simulations are compared with analytical predictions and the results of other codes, validating the QP as a crucial player in structure formation at small scales.

  7. Thermally robust Au99(SPh)42 nanoclusters for chemoselective hydrogenation of nitrobenzaldehyde derivatives in water.

    PubMed

    Li, Gao; Zeng, Chenjie; Jin, Rongchao

    2014-03-05

    We report the synthesis and catalytic application of thermally robust gold nanoclusters formulated as Au99(SPh)42. The formula was determined by electrospray ionization and matrix-assisted laser desorption ionization mass spectrometry in conjunction with thermogravimetric analysis. The optical spectrum of Au99(SPh)42 nanoclusters shows absorption peaks at ~920 nm (1.35 eV), 730 nm (1.70 eV), 600 nm (2.07 eV), 490 nm (2.53 eV), and 400 nm (3.1 eV) in contrast to conventional gold nanoparticles, which exhibit a plasmon resonance band at 520 nm (for spherical particles). The ceria-supported Au99(SPh)42 nanoclusters were utilized as a catalyst for chemoselective hydrogenation of nitrobenzaldehyde to nitrobenzyl alcohol in water using H2 gas as the hydrogen source. The selective hydrogenation of the aldehyde group catalyzed by nanoclusters is a surprise because conventional nanogold catalysts instead give rise to the product resulting from reduction of the nitro group. The Au99(SPh)42/CeO2 catalyst gives high catalytic activity for a range of nitrobenzaldehyde derivatives and also shows excellent recyclability due to its thermal robustness. We further tested the size-dependent catalytic performance of Au25(SPh)18 and Au36(SPh)24 nanoclusters, and on the basis of their crystal structures we propose a molecular adsorption site for nitrobenzaldehyde. The nanocluster material is expected to find wide application in catalytic reactions.

  8. FLY MPI-2: a parallel tree code for LSS

    NASA Astrophysics Data System (ADS)

    Becciani, U.; Comparato, M.; Antonuccio-Delogu, V.

    2006-04-01

    New version program summary. Program title: FLY 3.1 Catalogue identifier: ADSC_v2_0 Licensing provisions: yes Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSC_v2_0 Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland No. of lines in distributed program, including test data, etc.: 158 172 No. of bytes in distributed program, including test data, etc.: 4 719 953 Distribution format: tar.gz Programming language: Fortran 90, C Computer: Beowulf cluster, PC, MPP systems Operating system: Linux, Aix RAM: 100M words Catalogue identifier of previous version: ADSC_v1_0 Journal reference of previous version: Comput. Phys. Comm. 155 (2003) 159 Does the new version supersede the previous version?: yes Nature of problem: FLY is a parallel collisionless N-body code for the calculation of the gravitational force. Solution method: FLY is based on the hierarchical oct-tree domain decomposition introduced by Barnes and Hut (1986). Reasons for the new version: The new version of FLY is implemented using the MPI-2 standard: the distributed version 3.1 was developed using the MPICH2 library on a PC Linux cluster. The performance of FLY now places it among the most powerful parallel codes for tree N-body simulations. Another important new feature is the availability of an interface with hydrodynamical Paramesh-based codes. Simulations must follow a box large enough to accurately represent the power spectrum of fluctuations on very large scales so that we may hope to compare them meaningfully with real data. The number of particles then sets the mass resolution of the simulation, which we would like to make as fine as possible. The idea of building an interface between two codes that have different and complementary cosmological tasks allows us to execute complex cosmological simulations with FLY, specialized for DM evolution, and a code specialized for hydrodynamical components that uses a Paramesh block
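
    Several records in this collection build on the Barnes and Hut (1986) oct-tree, so a compact C++ sketch of its core acceptance test may be helpful: a distant cell is absorbed into the force sum as a single monopole whenever its size-to-distance ratio falls below the opening angle theta. The data layout and names below are illustrative assumptions, not FLY source.

        #include <cmath>

        struct Cell {
            double cm[3];      // centre of mass of the cell
            double mass;       // total mass of the cell
            double size;       // side length of the cubic cell (0 for a leaf particle)
            Cell* child[8];    // nullptr for empty octants
        };

        // Accumulate the gravitational acceleration at point p, opening
        // cells that are too close/large and taking distant ones as monopoles.
        void accumulate_accel(const Cell* c, const double p[3],
                              double theta, double eps2, double acc[3]) {
            if (!c) return;
            const double dx = c->cm[0]-p[0], dy = c->cm[1]-p[1], dz = c->cm[2]-p[2];
            const double d2 = dx*dx + dy*dy + dz*dz + eps2;  // Plummer softening
            if (c->size * c->size < theta * theta * d2) {
                const double inv = c->mass / (d2 * std::sqrt(d2));  // m / d^3
                acc[0] += dx*inv; acc[1] += dy*inv; acc[2] += dz*inv;
            } else {
                for (const Cell* ch : c->child)
                    accumulate_accel(ch, p, theta, eps2, acc);
            }
        }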

  9. Identification of a novel SPLIT-HULL (SPH) gene associated with hull splitting in rice (Oryza sativa L.).

    PubMed

    Lee, Gileung; Lee, Kang-Ie; Lee, Yunjoo; Kim, Backki; Lee, Dongryung; Seo, Jeonghwan; Jang, Su; Chin, Joong Hyoun; Koh, Hee-Jong

    2018-07-01

    The split-hull phenotype, caused by reduced lemma width and low lignin content, is under the control of SPH, which encodes a type-2 13-lipoxygenase, and contributes to high dehulling efficiency. Rice hulls consist of two bract-like structures, the lemma and palea. The hull is an important organ that helps to protect seeds from environmental stress, determines seed shape, and ensures grain filling. Achieving optimal hull size and morphology is beneficial for seed development. We characterized the split-hull (sph) mutant in rice, which exhibits hull splitting in the interlocking part between lemma and palea and/or the folded part of the lemma during the grain filling stage. Morphological and chemical analysis revealed that reductions in the width of the lemma and the lignin content of the hull in the sph mutant might be the cause of hull splitting. Genetic analysis indicated that the mutant phenotype was controlled by a single recessive gene, sph (Os04g0447100), which encodes a type-2 13-lipoxygenase. SPH knockout and knockdown transgenic plants displayed the same split-hull phenotype as the mutant. The sph mutant showed significantly higher linoleic and linolenic acid (substrates of lipoxygenase) contents in spikelets compared to the wild type, probably due to the genetic defect of SPH and the subsequent decrease in lipoxygenase activity. In dehulling experiments, the sph mutant showed high dehulling efficiency even under a weak tearing force in a dehulling machine. Collectively, the results provide a basis for understanding the functional role of lipoxygenase in the structure and maintenance of hulls, and will facilitate the breeding of easy-dehulling rice.

  10. Local Group dSph radio survey with ATCA (III): constraints on particle dark matter

    NASA Astrophysics Data System (ADS)

    Regis, Marco; Colafrancesco, Sergio; Profumo, Stefano; de Blok, W. J. G.; Massardi, Marcella; Richter, Laura

    2014-10-01

    We performed a deep search for radio synchrotron emission induced by weakly interacting massive particle (WIMP) annihilation or decay in six dwarf spheroidal (dSph) galaxies of the Local Group. Observations were conducted with the Australia Telescope Compact Array (ATCA) at 16 cm wavelength, with an rms sensitivity better than 0.05 mJy/beam in each field. In this work, we first discuss the uncertainties associated with the modeling of the expected signal, such as the shape of the dark matter (DM) profile and the dSph magnetic properties. We then investigate the possibility that point sources detected in the proximity of the dSph optical center might be due to the emission from a DM cuspy profile. No evidence for an extended emission over a size of a few arcmin (which is the DM halo size) has been detected. We present the associated bounds on the WIMP parameter space for different annihilation/decay final states and for different astrophysical assumptions. If the confinement of electrons and positrons in the dSph is such that the majority of their power is radiated within the dSph region, we obtain constraints on the WIMP annihilation rate which are well below the thermal value for masses up to a few TeV. On the other hand, for conservative assumptions on the dSph magnetic properties, the bounds can be dramatically relaxed. We show, however, that within the next 10 years, and regardless of the astrophysical assumptions, it will be possible to progressively close in on the full parameter space of WIMPs by searching for radio signals in dSphs with SKA and its precursors.

  11. Shrimp serine proteinase homologues PmMasSPH-1 and -2 play a role in the activation of the prophenoloxidase system.

    PubMed

    Jearaphunt, Miti; Amparyup, Piti; Sangsuriya, Pakkakul; Charoensapsri, Walaiporn; Senapin, Saengchan; Tassanakajon, Anchalee

    2015-01-01

    Melanization mediated by the prophenoloxidase (proPO) activating system is a rapid immune response used by invertebrates against intruding pathogens. Several masquerade-like and serine proteinase homologues (SPHs) have been demonstrated to play an essential role in proPO activation in insects and crustaceans. In a previous study, we characterized the masquerade-like SPH, PmMasSPH1, in the black tiger shrimp Penaeus monodon as a multifunctional immune protein based on its recognition and antimicrobial activity against the Gram-negative bacteria Vibrio harveyi. In the present study, we identify a novel SPH, known as PmMasSPH2, composed of an N-terminal clip domain and a C-terminal SP-like domain that share high similarity to those of other insect and crustacean SPHs. We demonstrate that gene silencing of PmMasSPH1 and PmMasSPH2 significantly reduces PO activity, resulting in a high number of V. harveyi in the hemolymph. Interestingly, knockdown of PmMasSPH1 suppressed not only its gene transcript but also other immune-related genes in the proPO system (e.g., PmPPAE2) and antimicrobial peptides (e.g., PenmonPEN3, PenmonPEN5, crustinPm1 and Crus-likePm). The PmMasSPH1 and PmMasSPH2 also show binding activity to peptidoglycan (PGN) of Gram-positive bacteria. Using a yeast two-hybrid analysis and co-immunoprecipitation, we demonstrate that PmMasSPH1 specifically interacted with the final proteinase of the proPO cascade, PmPPAE2. Furthermore, the presence of both PmMasSPH1 and PmPPAE2 enhances PGN-induced PO activity in vitro. Taken together, these results suggest the importance of PmMasSPHs in the activation of the shrimp proPO system.

  12. Shrimp Serine Proteinase Homologues PmMasSPH-1 and -2 Play a Role in the Activation of the Prophenoloxidase System

    PubMed Central

    Jearaphunt, Miti; Amparyup, Piti; Sangsuriya, Pakkakul; Charoensapsri, Walaiporn; Senapin, Saengchan; Tassanakajon, Anchalee

    2015-01-01

    Melanization mediated by the prophenoloxidase (proPO) activating system is a rapid immune response used by invertebrates against intruding pathogens. Several masquerade-like and serine proteinase homologues (SPHs) have been demonstrated to play an essential role in proPO activation in insects and crustaceans. In a previous study, we characterized the masquerade-like SPH, PmMasSPH1, in the black tiger shrimp Penaeus monodon as a multifunctional immune protein based on its recognition and antimicrobial activity against the Gram-negative bacteria Vibrio harveyi. In the present study, we identify a novel SPH, known as PmMasSPH2, composed of an N-terminal clip domain and a C-terminal SP-like domain that share high similarity to those of other insect and crustacean SPHs. We demonstrate that gene silencing of PmMasSPH1 and PmMasSPH2 significantly reduces PO activity, resulting in a high number of V. harveyi in the hemolymph. Interestingly, knockdown of PmMasSPH1 suppressed not only its gene transcript but also other immune-related genes in the proPO system (e.g., PmPPAE2) and antimicrobial peptides (e.g., PenmonPEN3, PenmonPEN5, crustinPm1 and Crus-likePm). The PmMasSPH1 and PmMasSPH2 also show binding activity to peptidoglycan (PGN) of Gram-positive bacteria. Using a yeast two-hybrid analysis and co-immunoprecipitation, we demonstrate that PmMasSPH1 specifically interacted with the final proteinase of the proPO cascade, PmPPAE2. Furthermore, the presence of both PmMasSPH1 and PmPPAE2 enhances PGN-induced PO activity in vitro. Taken together, these results suggest the importance of PmMasSPHs in the activation of the shrimp proPO system. PMID:25803442

  13. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  14. Incompressible SPH Model for Simulating Violent Free-Surface Fluid Flows

    NASA Astrophysics Data System (ADS)

    Staroszczyk, Ryszard

    2014-06-01

    In this paper the problem of transient gravitational wave propagation in a viscous incompressible fluid is considered, with a focus on flows with fast-moving free surfaces. The governing equations of the problem are solved by the smoothed particle hydrodynamics method (SPH). In order to impose the incompressibility constraint on the fluid motion, the so-called projection method is applied in which the discrete SPH equations are integrated in time by using a fractional-step technique. Numerical performance of the proposed model has been assessed by comparing its results with experimental data and with results obtained by a standard (weakly compressible) version of the SPH approach. For this purpose, a plane dam-break flow problem is simulated, in order to investigate the formation and propagation of a wave generated by a sudden collapse of a water column initially contained in a rectangular tank, as well as the impact of such a wave on a rigid vertical wall. The results of simulations show the evolution of the free surface of water, the variation of velocity and pressure fields in the fluid, and the time history of pressures exerted by an impacting wave on a wall.

  15. Local Group dSph radio survey with ATCA - II. Non-thermal diffuse emission

    NASA Astrophysics Data System (ADS)

    Regis, Marco; Richter, Laura; Colafrancesco, Sergio; Profumo, Stefano; de Blok, W. J. G.; Massardi, Marcella

    2015-04-01

    Our closest neighbours, the Local Group dwarf spheroidal (dSph) galaxies, are extremely quiescent and dim objects, in which thermal and non-thermal diffuse emission has so far escaped detection. In order to study the dSph interstellar medium, deep observations are required. They could reveal non-thermal emission associated with the very low level of star formation, or with particle dark matter annihilating or decaying in the dSph halo. In this work, we employ radio observations of six dSphs, conducted with the Australia Telescope Compact Array in the frequency band 1.1-3.1 GHz, to test for the presence of a diffuse component over typical scales of a few arcmin and at an rms sensitivity below 0.05 mJy beam-1. We observed the dSph fields with both a compact array and long baselines. Short spacings led to a synthesized beam of about 1 arcmin and were used for the extended emission search. The high-resolution data mapped background sources, which in turn were subtracted from the short-baseline maps to reduce their confusion limit. We found no significant detection of a diffuse radio continuum component. After a detailed discussion of the modelling of the cosmic ray (CR) electron distribution and of the dSph magnetic properties, we present bounds on several physical quantities related to the dSphs, such as the total radio flux, the angular shape of the radio emissivity, the equipartition magnetic field, and the injection and equilibrium distributions of CR electrons. Finally, we discuss the connection to far-infrared and X-ray observations.

  16. Schnek: A C++ library for the development of parallel simulation codes on regular grids

    NASA Astrophysics Data System (ADS)

    Schmitz, Holger

    2018-05-01

    A large number of algorithms across the field of computational physics are formulated on grids with a regular topology. We present Schnek, a library that enables fast development of parallel simulations on regular grids. Schnek contains a number of easy-to-use modules that greatly reduce the amount of administrative code for large-scale simulation codes. The library provides an interface for reading simulation setup files with a hierarchical structure. The structure of the setup file is translated into a hierarchy of simulation modules that the developer can specify. The reader parses and evaluates mathematical expressions and initialises variables or grid data. This enables developers to write modular and flexible simulation codes with minimal effort. Regular grids of arbitrary dimension are defined as well as mechanisms for defining physical domain sizes, grid staggering, and ghost cells on these grids. Ghost cells can be exchanged between neighbouring processes using MPI with a simple interface. The grid data can easily be written into HDF5 files using serial or parallel I/O.
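
    To make the ghost-cell mechanism concrete, here is a minimal C++/MPI sketch of the kind of exchange Schnek wraps behind its simple interface: each rank swaps one layer of boundary cells with its left and right neighbours in a 1D decomposition. This is an illustrative stand-in, not Schnek's actual API.

        #include <mpi.h>
        #include <vector>

        // u holds one rank's slab of a 1D field, with u[0] and u[n-1]
        // reserved as ghost cells mirroring the neighbours' edge values.
        void exchange_ghosts(std::vector<double>& u, MPI_Comm comm) {
            int rank, nprocs;
            MPI_Comm_rank(comm, &rank);
            MPI_Comm_size(comm, &nprocs);
            const int left  = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
            const int right = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;
            const std::size_t n = u.size();   // requires n >= 2
            // Send rightmost interior cell right, receive left ghost from left.
            MPI_Sendrecv(&u[n-2], 1, MPI_DOUBLE, right, 0,
                         &u[0],   1, MPI_DOUBLE, left,  0,
                         comm, MPI_STATUS_IGNORE);
            // Send leftmost interior cell left, receive right ghost from right.
            MPI_Sendrecv(&u[1],   1, MPI_DOUBLE, left,  1,
                         &u[n-1], 1, MPI_DOUBLE, right, 1,
                         comm, MPI_STATUS_IGNORE);
        }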

  17. Mechanistic studies on the reactions of PhS(-) or [MoS(4)](2)(-) with [M(4)(SPh)(10)](2)(-) (M = Fe or Co).

    PubMed

    Cui, Zhen; Henderson, Richard A

    2002-08-12

    Kinetic studies, using stopped-flow spectrophotometry, on the reactions of [M(4)(SPh)(10)](2)(-) (M = Fe or Co) with PhS(-) to form [M(SPh)(4)](2)(-) are described, as are the reactions between [M(4)(SPh)(10)](2)(-) and [MoS(4)](2)(-) to form [S(2)MoS(2)Fe(SPh)(2)](2)(-) or [S(2)MoS(2)CoS(2)MoS(2)](2)(-). The kinetics of the reactions with PhS(-) are consistent with an initial associative substitution mechanism involving attack of PhS(-) at one of the tetrahedral M sites of [M(4)(SPh)(10)](2)(-) to form [M(4)(SPh)(11)](3)(-). Subsequent or concomitant cleavage of a μ-SPh ligand, at the same M, initiates a cascade of rapid reactions which result ultimately in the complete rupture of the cluster and formation of [M(SPh)(4)](2)(-). The kinetics of the reaction between [M(4)(SPh)(10)](2)(-) and [MoS(4)](2)(-) indicate an initial dissociative substitution mechanism at low concentrations of [MoS(4)](2)(-), in which rate-limiting dissociation of a terminal thiolate from [M(4)(SPh)(10)](2)(-) produces [M(4)(SPh)(9)](-) and the coordinatively unsaturated M site is rapidly attacked by a sulfido group of [MoS(4)](2)(-). It is proposed that subsequent chelation of the MoS(4) ligand results in cleavage of an M-μ-SPh bond, initiating a cascade of reactions which lead to the ultimate break-up of the cluster and formation of the products, [S(2)MoS(2)Fe(SPh)(2)](2)(-) or [S(2)MoS(2)CoS(2)MoS(2)](2)(-). With [Co(4)(SPh)(10)](2)(-), at higher concentrations of [MoS(4)](2)(-), a further substitution pathway is evident which exhibits a second-order dependence on the concentration of [MoS(4)](2)(-). The mechanistic picture of cluster disruption which emerges from these studies rationalizes the "all or nothing" reactivity of [M(4)(SPh)(10)](2)(-).

  18. Multi-resolution Delta-plus-SPH with tensile instability control: Towards high Reynolds number flows

    NASA Astrophysics Data System (ADS)

    Sun, P. N.; Colagrossi, A.; Marrone, S.; Antuono, M.; Zhang, A. M.

    2018-03-01

    It is well known that the use of SPH models in simulating flows at high Reynolds numbers is limited by the inception of tensile instability in fluid regions characterized by high vorticity and negative pressure. In order to overcome this issue, the δ+-SPH scheme is modified by implementing a Tensile Instability Control (TIC), which consists of switching the momentum equation to a non-conservative formulation in the unstable flow regions. The loss of conservation properties is shown to induce small errors, provided that the particle distribution is regular. The latter condition can be ensured thanks to the implementation of a Particle Shifting Technique (PST). The novel variant of the δ+-SPH scheme is proved to be effective in preventing the onset of tensile instability. Several challenging benchmark tests involving flows past bodies at large Reynolds numbers have been used. Among these, a simulation of a deforming foil that resembles a fish-like swimming body is used as a practical application of the δ+-SPH model in biological fluid mechanics.

  19. ALEGRA -- A massively parallel h-adaptive code for solid dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Summers, R.M.; Wong, M.K.; Boucheron, E.A.

    1997-12-31

    ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.

  20. The UPSF code: a metaprogramming-based high-performance automatically parallelized plasma simulation framework

    NASA Astrophysics Data System (ADS)

    Gao, Xiatian; Wang, Xiaogang; Jiang, Binhao

    2017-10-01

    UPSF (Universal Plasma Simulation Framework) is a new plasma simulation code designed for maximum flexibility using cutting-edge techniques supported by the C++17 standard. Through the use of metaprogramming techniques, UPSF provides arbitrary-dimensional data structures and methods to support various kinds of plasma simulation models, such as Vlasov, particle-in-cell (PIC), fluid, and Fokker-Planck models, as well as their variants and hybrid methods. Through C++ metaprogramming, a single code can be applied to systems of arbitrary dimension with no loss of performance. UPSF can also automatically parallelize the distributed data structure and accelerate matrix and tensor operations with BLAS. A three-dimensional particle-in-cell code has been developed based on UPSF. Two test cases, Landau damping and the Weibel instability, for electrostatic and electromagnetic situations respectively, are presented to demonstrate the validity and performance of the UPSF code.
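
    The UPSF source is not reproduced here; as a loose illustration of how C++ templates can make dimensionality a compile-time parameter, the sketch below defines a rank-N field whose flat row-major index is computed from a variadic index list, with the rank checked at compile time. The class and member names are invented for the example.

        #include <array>
        #include <cstddef>
        #include <vector>

        // A rank-N scalar field: N is fixed at compile time, so indexing
        // code specializes per dimension with no run-time rank checks.
        template <std::size_t N>
        class Field {
        public:
            explicit Field(const std::array<std::size_t, N>& shape)
                : shape_(shape) {
                std::size_t total = 1;
                for (std::size_t s : shape_) total *= s;
                data_.assign(total, 0.0);
            }
            template <typename... Idx>
            double& operator()(Idx... idx) {
                static_assert(sizeof...(Idx) == N, "rank mismatch");
                return data_[flatten({static_cast<std::size_t>(idx)...})];
            }
        private:
            std::size_t flatten(const std::array<std::size_t, N>& i) const {
                std::size_t off = 0;
                for (std::size_t d = 0; d < N; ++d)  // row-major layout
                    off = off * shape_[d] + i[d];
                return off;
            }
            std::array<std::size_t, N> shape_;
            std::vector<double> data_;
        };

        // Usage: Field<3> f({64, 64, 64}); f(1, 2, 3) = 0.5;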

  1. Efficient high-quality volume rendering of SPH data.

    PubMed

    Fraedrich, Roland; Auer, Stefan; Westermann, Rüdiger

    2010-01-01

    High quality volume rendering of SPH data requires a complex order-dependent resampling of particle quantities along the view rays. In this paper we present an efficient approach to perform this task using a novel view-space discretization of the simulation domain. Our method draws upon recent work on GPU-based particle voxelization for the efficient resampling of particles into uniform grids. We propose a new technique that leverages a perspective grid to adaptively discretize the view-volume, giving rise to a continuous level-of-detail sampling structure and reducing memory requirements compared to a uniform grid. In combination with a level-of-detail representation of the particle set, the perspective grid allows effectively reducing the amount of primitives to be processed at run-time. We demonstrate the quality and performance of our method for the rendering of fluid and gas dynamics SPH simulations consisting of many millions of particles.

  2. SUPREM-DSMC: A New Scalable, Parallel, Reacting, Multidimensional Direct Simulation Monte Carlo Flow Code

    NASA Technical Reports Server (NTRS)

    Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas

    2000-01-01

    An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.

  3. CRUNCH_PARALLEL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shumaker, Dana E.; Steefel, Carl I.

    The code CRUNCH_PARALLEL is a parallel version of the CRUNCH code. CRUNCH code version 2.0 was previously released by LLNL (UCRL-CODE-200063). CRUNCH is a general-purpose reactive transport code developed by Carl Steefel and Yabusake (Steefel and Yabusake, 1996). The code handles non-isothermal transport and reaction in one, two, and three dimensions. The reaction algorithm is generic in form, handling an arbitrary number of aqueous and surface complexation reactions as well as mineral dissolution/precipitation. A standardized database is used containing thermodynamic and kinetic data. The code includes advective, dispersive, and diffusive transport.

  4. A parallel Monte Carlo code for planar and SPECT imaging: implementation, verification and applications in (131)I SPECT.

    PubMed

    Dewaraja, Yuni K; Ljungberg, Michael; Majumdar, Amitava; Bose, Abhijit; Koral, Kenneth F

    2002-02-01

    This paper reports the implementation of the SIMIND Monte Carlo code on an IBM SP2 distributed memory parallel computer. Basic aspects of running Monte Carlo particle transport calculations on parallel architectures are described. Our parallelization is based on equally partitioning photons among the processors and uses the Message Passing Interface (MPI) library for interprocessor communication and the Scalable Parallel Random Number Generator (SPRNG) to generate uncorrelated random number streams. These parallelization techniques are also applicable to other distributed memory architectures. A linear increase in computing speed with the number of processors is demonstrated for up to 32 processors. This speed-up is especially significant in Single Photon Emission Computed Tomography (SPECT) simulations involving higher energy photon emitters, where explicit modeling of the phantom and collimator is required. For (131)I, the accuracy of the parallel code is demonstrated by comparing simulated and experimental SPECT images from a heart/thorax phantom. Clinically realistic SPECT simulations using the voxel-man phantom are carried out to assess scatter and attenuation correction.
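
    As a schematic of the partitioning strategy this record describes (equal photon splitting across processors, private tallies, a final reduction), the C++/MPI sketch below divides the requested histories across ranks and sums per-rank images with MPI_Reduce. The transport routine itself is only indicated by a comment, and all names are hypothetical rather than SIMIND's.

        #include <mpi.h>
        #include <vector>

        int main(int argc, char** argv) {
            MPI_Init(&argc, &argv);
            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

            const long long n_total = 100000000LL;   // total photon histories
            long long n_local = n_total / nprocs;    // equal partition per rank
            if (rank < n_total % nprocs) ++n_local;  // spread the remainder

            const int npix = 128 * 128;
            std::vector<double> local(npix, 0.0), image;
            for (long long i = 0; i < n_local; ++i) {
                // A track_photon(local, rank) call would transport one history
                // through the phantom and collimator and score detector hits
                // into 'local'; each rank must draw from an independent random
                // stream (the paper uses SPRNG for this).
            }
            if (rank == 0) image.resize(npix);
            MPI_Reduce(local.data(), image.data(), npix, MPI_DOUBLE,
                       MPI_SUM, 0, MPI_COMM_WORLD);   // sum per-rank tallies
            MPI_Finalize();
            return 0;
        }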

  5. The Complexity of Parallel Algorithms,

    DTIC Science & Technology

    1985-11-01

    programs have been written for sequential computers. Many people want compilers that will compile the code for parallel machines, to avoid... between two vertices. We also rely on parallel algorithms for maintaining data structures and manipulating graphs. We do not go into the details of these... paths and maintain connected components. The routine is: ExtendPath(r, Q, V) begin P <- 0; s <- ... while there is a path in V - P from s to a vertex

  6. Insulin-like growth factor binding protein-3 induces angiogenesis through IGF-I- and SphK1-dependent mechanisms.

    PubMed

    Granata, R; Trovato, L; Lupia, E; Sala, G; Settanni, F; Camussi, G; Ghidoni, R; Ghigo, E

    2007-04-01

    Angiogenesis is critical for development and repair, and is a prominent feature of many pathological conditions. Based on evidence that insulin-like growth factor binding protein (IGFBP)-3 enhances cell motility and activates sphingosine kinase (SphK) in human endothelial cells, we have investigated whether IGFBP-3 plays a role in promoting angiogenesis. IGFBP-3 potently induced network formation by human endothelial cells on Matrigel. Moreover, it up-regulated proangiogenic genes, such as vascular endothelial growth factor (VEGF) and matrix metalloproteinases (MMP)-2 and -9. IGFBP-3 also induced membrane-type 1 MMP (MT1-MMP), which regulates MMP-2 activation. Decreasing SphK1 expression by small interfering RNA (siRNA) blocked IGFBP-3-induced network formation and inhibited VEGF and MT1-MMP, but not IGF-I, up-regulation. IGF-I activated SphK, leading to sphingosine-1-phosphate (S1P) formation. The IGF-I effect on SphK activity was blocked by specific inhibitors of IGF-IR, PI3K/Akt and ERK1/2 phosphorylation. The disruption of IGF-I signaling prevented the IGFBP-3 effect on tube formation, SphK activity and VEGF release. Blocking ERK1/2 signaling caused the loss of SphK activation and of VEGF and IGF-I up-regulation. Finally, IGFBP-3 dose-dependently stimulated neovessel formation in subcutaneous implants of Matrigel in vivo. Thus, IGFBP-3 positively regulates angiogenesis through the involvement of IGF-IR signaling and subsequent SphK/S1P activation.

  7. An implementation of a tree code on a SIMD, parallel computer

    NASA Technical Reports Server (NTRS)

    Olson, Kevin M.; Dorband, John E.

    1994-01-01

    We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k-processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,536 particles. We also simulate the formation of structure in an expanding model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) computers can be used for these simulations. The cost/performance ratio of SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.

  8. Two-dimensional free-surface flow under gravity: A new benchmark case for SPH method

    NASA Astrophysics Data System (ADS)

    Wu, J. Z.; Fang, L.

    2018-02-01

    Currently there are few free-surface benchmark cases with analytical results for Smoothed Particle Hydrodynamics (SPH) simulation. In the present contribution we introduce a two-dimensional free-surface flow under gravity, and obtain an analytical expression for the surface height difference and a theoretical estimate of the surface fractal dimension. These are preliminarily validated and supported by SPH calculations.

  9. Analysis of novel sph (spherocytosis) alleles in mice reveals allele-specific loss of band 3 and adducin in α-spectrin–deficient red cells

    PubMed Central

    Robledo, Raymond F.; Lambert, Amy J.; Birkenmeier, Connie S.; Cirlan, Marius V.; Cirlan, Andreea Flavia M.; Campagna, Dean R.; Lux, Samuel E.

    2010-01-01

    Five spontaneous, allelic mutations in the α-spectrin gene, Spna1, have been identified in mice (spherocytosis [sph], sph1J, sph2J, sph2BC, sphDem). All cause severe hemolytic anemia. Here, analysis of 3 new alleles reveals previously unknown consequences of red blood cell (RBC) spectrin deficiency. In sph3J, a missense mutation (H2012Y) in repeat 19 introduces a cryptic splice site resulting in premature termination of translation. In sphIhj, a premature stop codon occurs (Q1853Stop) in repeat 18. Both mutations result in markedly reduced RBC membrane spectrin content, decreased band 3, and absent β-adducin. Reevaluation of available, previously described sph alleles reveals band 3 and adducin deficiency as well. In sph4J, a missense mutation occurs in the C-terminal EF hand domain (C2384Y). Notably, an equally severe hemolytic anemia occurs despite minimally decreased membrane spectrin with normal band 3 levels and present, although reduced, β-adducin. The severity of anemia in sph4J indicates that the highly conserved cysteine residue at the C-terminus of α-spectrin participates in interactions critical to membrane stability. The data reinforce the notion that a membrane bridge in addition to the classic protein 4.1-p55-glycophorin C linkage exists at the RBC junctional complex that involves interactions between spectrin, adducin, and band 3. PMID:20056793

  10. Analysis of novel sph (spherocytosis) alleles in mice reveals allele-specific loss of band 3 and adducin in alpha-spectrin-deficient red cells.

    PubMed

    Robledo, Raymond F; Lambert, Amy J; Birkenmeier, Connie S; Cirlan, Marius V; Cirlan, Andreea Flavia M; Campagna, Dean R; Lux, Samuel E; Peters, Luanne L

    2010-03-04

    Five spontaneous, allelic mutations in the alpha-spectrin gene, Spna1, have been identified in mice (spherocytosis [sph], sph(1J), sph(2J), sph(2BC), sph(Dem)). All cause severe hemolytic anemia. Here, analysis of 3 new alleles reveals previously unknown consequences of red blood cell (RBC) spectrin deficiency. In sph(3J), a missense mutation (H2012Y) in repeat 19 introduces a cryptic splice site resulting in premature termination of translation. In sph(Ihj), a premature stop codon occurs (Q1853Stop) in repeat 18. Both mutations result in markedly reduced RBC membrane spectrin content, decreased band 3, and absent beta-adducin. Reevaluation of available, previously described sph alleles reveals band 3 and adducin deficiency as well. In sph(4J), a missense mutation occurs in the C-terminal EF hand domain (C2384Y). Notably, an equally severe hemolytic anemia occurs despite minimally decreased membrane spectrin with normal band 3 levels and present, although reduced, beta-adducin. The severity of anemia in sph(4J) indicates that the highly conserved cysteine residue at the C-terminus of alpha-spectrin participates in interactions critical to membrane stability. The data reinforce the notion that a membrane bridge in addition to the classic protein 4.1-p55-glycophorin C linkage exists at the RBC junctional complex that involves interactions between spectrin, adducin, and band 3.

  11. Incompressible SPH method for simulating Newtonian and non-Newtonian flows with a free surface

    NASA Astrophysics Data System (ADS)

    Shao, Songdong; Lo, Edmond Y. M.

    An incompressible smoothed particle hydrodynamics (SPH) method is presented to simulate Newtonian and non-Newtonian flows with free surfaces. The basic equations solved are the incompressible mass conservation and Navier-Stokes equations. The method uses prediction-correction fractional steps with the temporal velocity field integrated forward in time without enforcing incompressibility in the prediction step. The resulting deviation of particle density is then implicitly projected onto a divergence-free space to satisfy incompressibility through a pressure Poisson equation derived from an approximate pressure projection. Various SPH formulations are employed in the discretization of the relevant gradient, divergence and Laplacian terms. Free surfaces are identified by the particles whose density is below a set point. Wall boundaries are represented by particles whose positions are fixed. The SPH formulation is also extended to non-Newtonian flows and demonstrated using the Cross rheological model. The incompressible SPH method is tested by typical 2-D dam-break problems in which both water and fluid mud are considered. The computations are in good agreement with available experimental data. The different flow features between Newtonian and non-Newtonian flows after the dam-break are discussed.
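
    The prediction-correction fractional step described above can be summarized in code. The C++ sketch below is an assumed structural outline, not the authors' implementation: the SPH operators are reduced to placeholder stubs (returning zeros) so that the projection logic itself stays visible; a real solver would evaluate kernel sums over neighbours and solve the pressure Poisson equation (PPE).

        #include <vector>
        struct Vec2 { double x, y; };

        // Placeholder SPH operators; real versions sum kernel contributions.
        std::vector<double> sph_density(const std::vector<Vec2>& pos) {
            return std::vector<double>(pos.size(), 0.0);
        }
        std::vector<double> solve_ppe(const std::vector<double>& rho_star,
                                      double rho0, double dt) {
            return std::vector<double>(rho_star.size(), 0.0);
        }
        std::vector<Vec2> grad_p_over_rho(const std::vector<double>& p,
                                          const std::vector<Vec2>& pos) {
            return std::vector<Vec2>(pos.size(), Vec2{0.0, 0.0});
        }

        void isph_step(std::vector<Vec2>& pos, std::vector<Vec2>& vel,
                       double rho0, double dt, Vec2 g) {
            // 1) Prediction: advance with body (and, in full, viscous) forces
            //    only, without enforcing incompressibility.
            for (std::size_t i = 0; i < pos.size(); ++i) {
                vel[i].x += dt * g.x;       vel[i].y += dt * g.y;
                pos[i].x += dt * vel[i].x;  pos[i].y += dt * vel[i].y;
            }
            // 2) The deviation of the intermediate particle density from the
            //    reference rho0 sources the PPE for the pressure field.
            const std::vector<double> p = solve_ppe(sph_density(pos), rho0, dt);
            // 3) Correction: subtract the pressure-gradient acceleration so
            //    the updated velocity field is (discretely) divergence-free.
            const std::vector<Vec2> gp = grad_p_over_rho(p, pos);
            for (std::size_t i = 0; i < pos.size(); ++i) {
                vel[i].x -= dt * gp[i].x;       vel[i].y -= dt * gp[i].y;
                pos[i].x -= dt * dt * gp[i].x;  pos[i].y -= dt * dt * gp[i].y;
            }
        }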

  12. Ray-tracing 3D dust radiative transfer with DART-Ray: code upgrade and public release

    NASA Astrophysics Data System (ADS)

    Natale, Giovanni; Popescu, Cristina C.; Tuffs, Richard J.; Clarke, Adam J.; Debattista, Victor P.; Fischera, Jörg; Pasetto, Stefano; Rushton, Mark; Thirlwall, Jordan J.

    2017-11-01

    We present an extensively updated version of the purely ray-tracing 3D dust radiation transfer code DART-Ray. The new version includes five major upgrades: 1) a series of optimizations for the ray-angular density and the scattered radiation source function; 2) the implementation of several data and task parallelizations using hybrid MPI+OpenMP schemes; 3) the inclusion of dust self-heating; 4) the ability to produce surface brightness maps for observers within the models in HEALPix format; 5) the possibility to set the expected numerical accuracy already at the start of the calculation. We tested the updated code with benchmark models where the dust self-heating is not negligible. Furthermore, we performed a study, using galaxy models, of the extent of the source influence volumes, which are critical in determining the efficiency of the DART-Ray algorithm. The new code is publicly available, documented for both users and developers, and accompanied by several programmes to create input grids for different model geometries and to import the results of N-body and SPH simulations. These programmes can be easily adapted to different input geometries, and to different dust models or stellar emission libraries.

  13. Parallel coding of conjunctions in visual search.

    PubMed

    Found, A

    1998-10-01

    Two experiments investigated whether the conjunctive nature of nontarget items influenced search for a conjunction target. Each experiment consisted of two conditions. In both conditions, the target item was a red bar tilted to the right, among white tilted bars and vertical red bars. As well as color and orientation, display items also differed in terms of size. Size was irrelevant to search in that the size of the target varied randomly from trial to trial. In one condition, the size of items correlated with the other attributes of display items (e.g., all red items were big and all white items were small). In the other condition, the size of items varied randomly (i.e., some red items were small and some were big, and some white items were big and some were small). Search was more efficient in the size-correlated condition, consistent with the parallel coding of conjunctions in visual search.

  14. Performance analysis of parallel gravitational N-body codes on large GPU clusters

    NASA Astrophysics Data System (ADS)

    Huang, Si-Yi; Spurzem, Rainer; Berczik, Peter

    2016-01-01

    We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters, both of which are pioneers in their own fields as well as on certain mutual scales - NBODY6++ and Bonsai. We carry out benchmarks of the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs, as their performance has approached half of the maximum single-precision performance of the underlying GPU cards. With such performance, we predict that a speed-up of 200-300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We present quantitative comparisons of the two codes, finding that in the same cases Bonsai adopts larger time steps and larger relative energy errors than NBODY6++, typically 10-50 times larger, depending on the chosen parameters of the codes. Although the two codes are built for different astrophysical applications, in specified conditions they may overlap in performance at certain physical scales, thus allowing the user to choose either one by fine-tuning parameters accordingly.

  15. PIXIE3D: A Parallel, Implicit, eXtended MHD 3D Code

    NASA Astrophysics Data System (ADS)

    Chacon, Luis

    2006-10-01

    We report on the development of PIXIE3D, a 3D parallel, fully implicit Newton-Krylov extended MHD code in general curvilinear geometry. PIXIE3D employs a second-order, finite-volume-based spatial discretization that satisfies remarkable properties such as being conservative, solenoidal in the magnetic field to machine precision, non-dissipative, and linearly and nonlinearly stable in the absence of physical dissipation. PIXIE3D employs fully implicit Newton-Krylov methods for the time advance. Currently, second-order implicit schemes such as Crank-Nicolson and BDF2 (2nd-order backward differentiation formula) are available. PIXIE3D is fully parallel (it employs PETSc for parallelism) and exhibits excellent parallel scalability. A parallel, scalable, multigrid (MG) preconditioning strategy, based on physics-based preconditioning ideas, has been developed for resistive MHD and is currently being extended to Hall MHD. In this poster, we report on progress in the algorithmic formulation for extended MHD, as well as the serial and parallel performance of PIXIE3D in a variety of problems and geometries. L. Chacón, Comput. Phys. Comm., 163 (3), 143-171 (2004); L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002); J. Comput. Phys., 188 (2), 573-592 (2003); L. Chacón, 32nd EPS Conf. Plasma Physics, Tarragona, Spain, 2005; L. Chacón et al., 33rd EPS Conf. Plasma Physics, Rome, Italy, 2006.
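
    The defining trick of Newton-Krylov solvers like the one this record describes is that the Krylov iteration never needs the Jacobian matrix itself, only Jacobian-vector products, which can be approximated by differencing the nonlinear residual. A generic C++ sketch of that matrix-free product follows; it is not PIXIE3D source, and the perturbation scaling shown is one common heuristic among several.

        #include <cmath>
        #include <functional>
        #include <vector>

        using Vec = std::vector<double>;
        using Residual = std::function<Vec(const Vec&)>;

        // Approximate J(u)*v as (F(u + eps*v) - F(u)) / eps, where F is the
        // nonlinear residual; Fu = F(u) is passed in so it is computed once.
        Vec jacobian_vector_product(const Residual& F, const Vec& u,
                                    const Vec& Fu, const Vec& v) {
            double unorm2 = 0.0, vnorm2 = 0.0;
            for (std::size_t i = 0; i < u.size(); ++i) {
                unorm2 += u[i] * u[i];
                vnorm2 += v[i] * v[i];
            }
            const double vnorm = std::sqrt(vnorm2);
            if (vnorm == 0.0) return Vec(u.size(), 0.0);
            // Scale the perturbation to balance truncation and round-off error.
            const double eps =
                std::sqrt(1.0e-14 * (1.0 + std::sqrt(unorm2))) / vnorm;

            Vec up(u.size());
            for (std::size_t i = 0; i < u.size(); ++i) up[i] = u[i] + eps * v[i];
            Vec Fup = F(up);
            for (std::size_t i = 0; i < u.size(); ++i)
                Fup[i] = (Fup[i] - Fu[i]) / eps;   // finite-difference J*v
            return Fup;
        }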

  16. Dynamic simulations of geologic materials using combined FEM/DEM/SPH analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morris, J P; Johnson, S M

    2008-03-26

    An overview of the Livermore Distinct Element Code (LDEC) is presented, and results from a study investigating the effect of explosive and impact loading on geologic materials using LDEC are detailed. LDEC was initially developed to simulate tunnels and other structures in jointed rock masses using large numbers of polyhedral blocks. Many geophysical applications, such as projectile penetration into rock, concrete targets, and boulder fields, require a combination of continuum and discrete methods in order to predict the formation and interaction of the fragments produced. In an effort to model this class of problems, LDEC now includes implementations of Cosserat point theory and cohesive elements. This approach directly simulates the transition from continuum to discontinuum behavior, thereby allowing for dynamic fracture within a combined finite element/discrete element framework. In addition, there are many applications involving geologic materials where fluid-structure interaction is important. To facilitate the solution of this class of problems, a Smoothed Particle Hydrodynamics (SPH) capability has been incorporated into LDEC to simulate fully coupled systems involving geologic materials and a saturating fluid. We present results from a study of a broad range of geomechanical problems that exercise the various components of LDEC in isolation and in tandem.

  17. TORUS: Radiation transport and hydrodynamics code

    NASA Astrophysics Data System (ADS)

    Harries, Tim

    2014-04-01

    TORUS is a flexible radiation transfer and radiation-hydrodynamics code. The code has a basic infrastructure that includes the AMR mesh scheme that is used by several physics modules including atomic line transfer in a moving medium, molecular line transfer, photoionization, radiation hydrodynamics and radiative equilibrium. TORUS is useful for a variety of problems, including magnetospheric accretion onto T Tauri stars, spiral nebulae around Wolf-Rayet stars, discs around Herbig AeBe stars, structured winds of O supergiants and Raman-scattered line formation in symbiotic binaries, and dust emission and molecular line formation in star forming clusters. The code is written in Fortran 2003 and is compiled using a standard Gnu makefile. The code is parallelized using both MPI and OMP, and can use these parallel sections either separately or in a hybrid mode.

  18. Development of stress boundary conditions in smoothed particle hydrodynamics (SPH) for the modeling of solids deformation

    NASA Astrophysics Data System (ADS)

    Douillet-Grellier, Thomas; Pramanik, Ranjan; Pan, Kai; Albaiz, Abdulaziz; Jones, Bruce D.; Williams, John R.

    2017-10-01

    This paper develops a method for imposing stress boundary conditions in smoothed particle hydrodynamics (SPH) with and without the need for dummy particles. SPH has been used for simulating phenomena in a number of fields, such as astrophysics and fluid mechanics. More recently, the method has gained traction as a technique for simulation of deformation and fracture in solids, where the meshless property of SPH can be leveraged to represent arbitrary crack paths. Despite this interest, application of boundary conditions within the SPH framework is typically limited to imposed velocities or displacements using fictitious dummy particles to compensate for the lack of particles beyond the boundary interface. While this is sufficient for a large variety of problems, especially in the case of fluid flow, for problems in solid mechanics there is a clear need to impose stresses upon boundaries. In addition, the use of dummy particles to impose a boundary condition is not always suitable or even feasible, especially for problems which include internal boundaries. In order to overcome these difficulties, this paper first presents an improved method for applying stress boundary conditions in SPH with dummy particles, followed by a formulation which does not require dummy particles. These techniques are then validated against analytical solutions to two common problems in rock mechanics, the Brazilian test and the penny-shaped crack problem, both in 2D and 3D. This study highlights the fact that SPH offers a good level of accuracy for these problems and that the results are reliable. This validation work serves as a foundation for addressing more complex problems involving plasticity and fracture propagation.

  19. SPH with dynamical smoothing length adjustment based on the local flow kinematics

    NASA Astrophysics Data System (ADS)

    Olejnik, Michał; Szewc, Kamil; Pozorski, Jacek

    2017-11-01

    Due to the Lagrangian nature of Smoothed Particle Hydrodynamics (SPH), adaptive resolution remains a challenging task. In this work, we first analyse the influence of the simulation parameters and the smoothing length on solution accuracy, in particular in high-strain regions. Based on this analysis, we develop a novel approach to dynamically adjust the kernel range for each SPH particle separately, accounting for the local flow kinematics. We use the Okubo-Weiss parameter, which distinguishes the strain-dominated from the vorticity-dominated regions of the flow domain. The proposed development is relatively simple and implies only a moderate computational overhead. We validate the modified SPH algorithm on a selection of two-dimensional test cases: the Taylor-Green flow, the vortex spin-down, the lid-driven cavity and the dam-break flow against a sharp-edged obstacle. The simulation results show good agreement with the reference data and an improvement of the long-term accuracy for unsteady flows. For the lid-driven cavity case, the proposed dynamical adjustment remedies the problem of tensile instability (particle clustering).
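
    The Okubo-Weiss test itself is simple enough to sketch. In 2D it compares squared strain against squared vorticity, Q = s_n^2 + s_s^2 - w^2, with Q > 0 marking strain-dominated regions. The C++ fragment below computes Q from a particle's velocity-gradient tensor and adjusts its smoothing length accordingly; this is an assumed form of the decision step, and the shrink factor and bounds are invented for illustration rather than the paper's calibrated values.

        #include <algorithm>

        struct VelGrad { double dudx, dudy, dvdx, dvdy; };

        // 2D Okubo-Weiss parameter: positive where strain dominates,
        // negative where vorticity dominates.
        double okubo_weiss(const VelGrad& g) {
            const double sn = g.dudx - g.dvdy;   // normal strain
            const double ss = g.dvdx + g.dudy;   // shear strain
            const double w  = g.dvdx - g.dudy;   // vorticity
            return sn*sn + ss*ss - w*w;
        }

        // Shrink the kernel range in strain-dominated regions and relax it
        // back toward the default h0 elsewhere, within fixed bounds.
        double adjust_h(double h, double h0, const VelGrad& g,
                        double shrink = 0.99, double hmin_frac = 0.8) {
            if (okubo_weiss(g) > 0.0)
                return std::max(h * shrink, hmin_frac * h0);
            return std::min(h / shrink, h0);
        }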

  20. Implementation of the DPM Monte Carlo code on a parallel architecture for treatment planning applications.

    PubMed

    Tyagi, Neelam; Bose, Abhijit; Chetty, Indrin J

    2004-09-01

    We have parallelized the Dose Planning Method (DPM), a Monte Carlo code optimized for radiotherapy class problems, on distributed-memory processor architectures using the Message Passing Interface (MPI). Parallelization has been investigated on a variety of parallel computing architectures at the University of Michigan-Center for Advanced Computing, with respect to efficiency and speedup as a function of the number of processors. We have integrated the parallel pseudo random number generator from the Scalable Parallel Pseudo-Random Number Generator (SPRNG) library to run with the parallel DPM. The Intel cluster, consisting of 800 MHz Intel Pentium III processors, shows an almost linear speedup up to 32 processors for simulating 1 x 10^8 or more particles. The speedup results are nearly linear on an Athlon cluster (up to 24 processors, based on availability), which consists of 1.8 GHz+ Advanced Micro Devices (AMD) Athlon processors, on increasing the problem size up to 8 x 10^8 histories. For a smaller number of histories (1 x 10^8), the reduction of efficiency with the Athlon cluster (down to 83.9% with 24 processors) occurs because the processing time required to simulate 1 x 10^8 histories is less than the time associated with interprocessor communication. A similar trend was seen with the Opteron cluster (consisting of 1400 MHz, 64-bit AMD Opteron processors) on increasing the problem size. Because of their 64-bit architecture, Opteron processors can store and process instructions at a faster rate and hence are faster than the 32-bit Athlon processors. We have validated our implementation with an in-phantom dose calculation study using a parallel pencil monoenergetic electron beam of 20 MeV energy. The phantom consists of layers of water, lung, bone, aluminum, and titanium. The agreement in the central axis depth dose curves and profiles at different depths shows that the serial and parallel codes are equivalent in accuracy.
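
    For reference, the speedup and efficiency figures quoted above follow the usual definitions; a minimal sketch (only the 83.9%/24-processor pair comes from the abstract, everything else is a placeholder):

        def speedup(t_serial, t_parallel):
            return t_serial / t_parallel

        def efficiency(t_serial, t_parallel, n_procs):
            return speedup(t_serial, t_parallel) / n_procs

        # An efficiency of 83.9% on 24 processors corresponds to a
        # speedup of about 0.839 * 24 ~= 20.1.
        print(round(0.839 * 24, 1))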

  1. Application of PCDA/SPH/CHO/Lysine vesicles to detect pathogenic bacteria in chicken.

    PubMed

    de Oliveira, Taíla V; Soares, Nilda de F F; de Andrade, Nélio J; Silva, Deusanilde J; Medeiros, Eber Antônio A; Badaró, Amanda T

    2015-04-01

    During the course of infection, Salmonella must successively survive the harsh acid stress of the stomach and multiply in a mildly acidic compartment within macrophages. Inducible amino acid decarboxylases are known to promote adaptation to acidic environments, such as lysine decarboxylation to cadaverine. This defense response of Salmonella can be exploited in systems such as polydiacetylene (PDA) to detect a pathogen so important to the public health system. PDA is also valuable for its unique optical property: it undergoes colorimetric transitions in response to various external stimuli. Therefore, a 10,12-pentacosadiynoic acid (PCDA)/Sphingomyelin(SPH)/Cholesterol(CHO)/Lysine system was tested to determine the colorimetric response induced by Salmonella choleraesuis. PCDA/SPH/CHO/Lysine vesicles showed a colour change even at low S. choleraesuis concentrations, both under laboratory conditions and in chicken meat. Thus, this work demonstrated an application of PCDA/SPH/CHO/Lysine vesicles to simplify routine analyses in the food industry, such as the chicken meat industry. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Development of a two-phase SPH model for sediment laden flows

    NASA Astrophysics Data System (ADS)

    Shi, Huabin; Yu, Xiping; Dalrymple, Robert A.

    2017-12-01

    An SPH model based on a general formulation for solid-fluid two-phase flows is proposed for suspended sediment motion in free surface flows. The water and the sediment are treated as two miscible fluids, and the multi-fluid system is discretized by a single set of SPH particles, which move with the water velocity and carry properties of the two phases. Large eddy simulation (LES) is introduced to deal with the turbulence effect, and the widely used Smagorinsky model is modified to take into account the influence of sediment particles on the turbulence. The drag force is accurately formulated by including the hindered settling effect. In the model, the water is assumed to be weakly compressible while the sediment is incompressible, and a new equation of state is proposed for the pressure in the sediment-water mixture. A dynamic boundary condition is employed to treat wall boundaries, and a new strategy of Shepard filtering is adopted to damp the pressure oscillation. The developed two-phase SPH model is validated by comparing the numerical results with analytical solutions for idealized cases of still water containing both neutrally buoyant and naturally settling sand and for plane Poiseuille flows carrying neutrally buoyant particles, and is then applied to sand dumping from a line source into a water tank, where the settling sand cloud induces a response of the free water surface. It is shown that the numerical results are in good agreement with the experimental data as well as the empirical formulas. The characteristics of the settling sand cloud, the pressure field, and the flow vortices are studied. The motion of the free water surface is also discussed. The proposed two-phase SPH model is proven to be effective for numerical simulation of sand dumping into water.
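
    The weakly compressible treatment of the water phase is conventionally closed with a Tait-type equation of state; the sketch below shows that standard single-phase form for context (the paper's new mixture EOS is not reproduced here).

        def tait_pressure(rho, rho0, c0, gamma=7.0):
            # Standard weakly compressible SPH closure:
            # p = B [ (rho/rho0)^gamma - 1 ],  with  B = rho0 c0^2 / gamma.
            B = rho0 * c0**2 / gamma
            return B * ((rho / rho0)**gamma - 1.0)

        # c0 is typically chosen ~10x the maximum flow speed so that
        # density fluctuations stay near 1%. Values below are illustrative.
        p = tait_pressure(rho=1005.0, rho0=1000.0, c0=20.0)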

  3. On origin of genetic code and tRNA before translation

    PubMed Central

    2011-01-01

    Background Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental veto on "foresight evolution", 2) modular structures of tRNAs and aminoacyl-tRNA synthetases, and 3) the updated library of aa-binding sites in RNA aptamers successfully selected in vitro for eight amino acids. Results The aa-binding sites of arginine, isoleucine and tyrosine contain both of their cognate triplets, anticodons and codons. We have noticed that these cases might be associated with palindrome-dinucleotides. For example, a one-base shift to the left brings arginine codons CGN, with CG at 1-2 positions, to the respective anticodons NCG, with CG at 2-3 positions. Formally, the concomitant presence of codons and anticodons is also expected in the reverse situation, with codons containing palindrome-dinucleotides at their 2-3 positions, and anticodons exhibiting them at 1-2 positions. A closer analysis reveals that, surprisingly, RNA binding sites for Arg, Ile and Tyr "prefer" (exactly as in the actual genetic code) the anticodon(2-3)/codon(1-2) tetramers to their anticodon(1-2)/codon(2-3) counterparts, despite the seemingly perfect symmetry of the latter. However, since in vitro selection of aa-specific RNA aptamers apparently had nothing to do with translation, this striking preference provides strong new support for the notion of the genetic code emerging before translation, in response to catalytic (and possibly other) needs of ancient RNA life. Consistent with the pre-translation origin of the code, we propose here a new model of tRNA origin by the gradual, Fibonacci process-like, elongation of a tRNA molecule from a primordial coding triplet and 5'DCCA3' quadruplet (D is a base-determinator) to the eventual 76 base-long cloverleaf

  4. A 1D-2D coupled SPH-SWE model applied to open channel flow simulations in complicated geometries

    NASA Astrophysics Data System (ADS)

    Chang, Kao-Hua; Sheu, Tony Wen-Hann; Chang, Tsang-Jung

    2018-05-01

    In this study, a one- and two-dimensional (1D-2D) coupled model is developed to solve the shallow water equations (SWEs). The solutions are obtained using a Lagrangian meshless method called smoothed particle hydrodynamics (SPH) to simulate shallow water flows in converging, diverging and curved channels. A buffer zone is introduced to exchange information between the 1D and 2D SPH-SWE models. Interpolated water discharge values and water surface levels at the internal boundaries are prescribed as the inflow/outflow boundary conditions in the two SPH-SWE models. In addition, instead of using the SPH summation operator, we directly solve the continuity equation by introducing a diffusive term to suppress oscillations in the predicted water depth. The performance of the two approaches in calculating the water depth is comprehensively compared through a case study of a straight channel. Additionally, three benchmark cases involving converging, diverging and curved channels are adopted to demonstrate the ability of the proposed 1D and 2D coupled SPH-SWE model through comparisons with measured data and predicted mesh-based numerical results. The proposed model provides satisfactory accuracy and guaranteed convergence.

  5. StarSmasher: Smoothed Particle Hydrodynamics code for smashing stars and planets

    NASA Astrophysics Data System (ADS)

    Gaburov, Evghenii; Lombardi, James C., Jr.; Portegies Zwart, Simon; Rasio, F. A.

    2018-05-01

    Smoothed Particle Hydrodynamics (SPH) is a Lagrangian particle method that approximates a continuous fluid as discrete nodes, each carrying various parameters such as mass, position, velocity, pressure, and temperature. In an SPH simulation the resolution scales with the particle density; StarSmasher is able to handle both equal-mass and equal-number-density particle models. StarSmasher solves for hydro forces by calculating the pressure for each particle as a function of the particle's properties - density, internal energy, and internal properties (e.g. temperature and mean molecular weight). The code implements variational equations of motion and libraries to calculate the gravitational forces between particles using direct summation on NVIDIA graphics cards. Using direct summation instead of a tree-based algorithm for gravity increases the accuracy of the gravity calculations at the cost of speed. The code uses a cubic spline for the smoothing kernel and an artificial viscosity prescription coupled with a Balsara switch to prevent unphysical interparticle penetration. The code can also add an artificial relaxation force to the equations of motion, introducing a drag term into the calculated accelerations during relaxation integrations. Initially called StarCrash, StarSmasher was developed originally by Rasio.
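
    The cubic spline named above is the standard M4 kernel; a minimal 3D implementation for reference (normalization 1/(pi h^3), compact support at r = 2h):

        import numpy as np

        def cubic_spline_kernel(r, h):
            # M4 cubic spline, 3D normalization sigma = 1/(pi h^3).
            q = r / h
            sigma = 1.0 / (np.pi * h**3)
            if q < 1.0:
                return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
            if q < 2.0:
                return sigma * 0.25 * (2.0 - q)**3
            return 0.0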

  6. Implementation and Characterization of Three-Dimensional Particle-in-Cell Codes on Multiple-Instruction-Multiple-Data Massively Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.

    1995-01-01

    A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64^3 particles per processing node (approximately 134 million particles on 512 nodes), which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are functions of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--on other parallel machines once the machine-dependent parameters are known.
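
    A toy version of such a timing expression (the linear form and all parameter names here are illustrative assumptions, not the paper's exact model):

        def pic_step_time(n_particles, t_push, n_crossing, t_comm):
            # Compute time for the push plus the cost of communicating
            # particles that cross subdomain boundaries.
            return n_particles * t_push + n_crossing * t_comm

        # E.g. 64^3 particles per node at 250 ns per particle per step,
        # with a hypothetical 3% of particles communicated at 1 us each:
        n = 64**3
        t = pic_step_time(n, 250e-9, 0.03 * n, 1e-6)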

  7. Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsugane, Keisuke; Boku, Taisuke; Murai, Hitoshi

    Recently, the Partitioned Global Address Space (PGAS) parallel programming model has emerged as a usable distributed memory programming model. XcalableMP (XMP) is a PGAS parallel programming language that extends base languages such as C and Fortran with directives in OpenMP-like style. XMP supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the concept of a coarray is also employed for local-view programming. In this study, we port Gyrokinetic Toroidal Code - Princeton (GTC-P), which is a three-dimensional gyrokinetic PIC code developed at Princeton University to study the microturbulence phenomenon in magnetically confined fusion plasmas, to XMP as an example of hybrid memory model coding with the global-view and local-view programming models. In local-view programming, the coarray notation is simple and intuitive compared with Message Passing Interface (MPI) programming while the performance is comparable to that of the MPI version. Thus, because the global-view programming model is suitable for expressing the data parallelism for a field of grid space data, we implement a hybrid-view version using a global-view programming model to compute the field and a local-view programming model to compute the movement of particles. Finally, the performance is degraded by 20% compared with the original MPI version, but the hybrid-view version facilitates more natural data expression for static grid space data (in the global-view model) and dynamic particle data (in the local-view model), and it also increases the readability of the code for higher productivity.

  8. Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP

    DOE PAGES

    Tsugane, Keisuke; Boku, Taisuke; Murai, Hitoshi; ...

    2016-06-01

    Recently, the Partitioned Global Address Space (PGAS) parallel programming model has emerged as a usable distributed memory programming model. XcalableMP (XMP) is a PGAS parallel programming language that extends base languages such as C and Fortran with directives in OpenMP-like style. XMP supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the concept of a coarray is also employed for local-view programming. In this study, we port Gyrokinetic Toroidal Code - Princeton (GTC-P), which is a three-dimensional gyrokinetic PIC code developed at Princeton University to study the microturbulence phenomenon in magnetically confined fusion plasmas, to XMP as an example of hybrid memory model coding with the global-view and local-view programming models. In local-view programming, the coarray notation is simple and intuitive compared with Message Passing Interface (MPI) programming while the performance is comparable to that of the MPI version. Thus, because the global-view programming model is suitable for expressing the data parallelism for a field of grid space data, we implement a hybrid-view version using a global-view programming model to compute the field and a local-view programming model to compute the movement of particles. Finally, the performance is degraded by 20% compared with the original MPI version, but the hybrid-view version facilitates more natural data expression for static grid space data (in the global-view model) and dynamic particle data (in the local-view model), and it also increases the readability of the code for higher productivity.

  9. Variable stars in Local Group Galaxies - II. Sculptor dSph

    NASA Astrophysics Data System (ADS)

    Martínez-Vázquez, C. E.; Stetson, P. B.; Monelli, M.; Bernard, E. J.; Fiorentino, G.; Gallart, C.; Bono, G.; Cassisi, S.; Dall'Ora, M.; Ferraro, I.; Iannicola, G.; Walker, A. R.

    2016-11-01

    We present the identification of 634 variable stars in the Milky Way dwarf spheroidal (dSph) satellite Sculptor based on archival ground-based optical observations spanning ˜24 yr and covering ˜2.5 deg^2. We employed the same methodologies as the `Homogeneous Photometry' series published by Stetson. In particular, we have identified and characterized one of the largest (536) RR Lyrae samples so far in a Milky Way dSph satellite. We have also detected four Anomalous Cepheids, 23 SX Phoenicis stars, five eclipsing binaries, three field variable stars, and three peculiar variable stars located above the horizontal branch, near the locus of BL Herculis, which we are unable to classify properly. Additionally, we identify 37 long-period variables plus 23 probable variable stars, for which the current data do not allow us to determine the period. We report positions and finding charts for all the variable stars, and basic properties (period, amplitude, mean magnitude) and light curves for 574 of them. We discuss the properties of the RR Lyrae stars in the Bailey diagram, which supports the coexistence of subpopulations with different chemical compositions. We estimate the mean mass of Anomalous Cepheids (˜1.5 M⊙) and SX Phoenicis stars (˜1 M⊙). We discuss in detail the nature of the former. The connections between the properties of the different families of variable stars are discussed in the context of the star formation history of the Sculptor dSph galaxy.

  10. Parallelization of the FLAPW method

    NASA Astrophysics Data System (ADS)

    Canning, A.; Mannstadt, W.; Freeman, A. J.

    2000-08-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method had been limited to systems of fewer than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by dividing the plane-wave components for each state among the processors. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.

  11. Coding for Parallel Links to Maximize the Expected Value of Decodable Messages

    NASA Technical Reports Server (NTRS)

    Klimesh, Matthew A.; Chang, Christopher S.

    2011-01-01

    When multiple parallel communication links are available, it is useful to consider link-utilization strategies that provide tradeoffs between reliability and throughput. Interesting cases arise when there are three or more available links. Under the model considered, the links have known probabilities of being in working order, and each link has a known capacity. The sender has a number of messages to send to the receiver. Each message has a size and a value (i.e., a worth or priority). Messages may be divided into pieces arbitrarily, and the value of each piece is proportional to its size. The goal is to choose combinations of messages to send on the links so that the expected value of the messages decodable by the receiver is maximized. There are three parts to the innovation: (1) Applying coding to parallel links under the model; (2) Linear programming formulation for finding the optimal combinations of messages to send on the links; and (3) Algorithms for assisting in finding feasible combinations of messages, as support for the linear programming formulation. There are similarities between this innovation and methods developed in the field of network coding. However, network coding has generally been concerned with either maximizing throughput in a fixed network, or robust communication of a fixed volume of data. In contrast, under this model, the throughput is expected to vary depending on the state of the network. Examples of error-correcting codes that are useful under this model but which are not needed under previous models have been found. This model can represent either a one-shot communication attempt, or a stream of communications. Under the one-shot model, message sizes and link capacities are quantities of information (e.g., measured in bits), while under the communications stream model, message sizes and link capacities are information rates (e.g., measured in bits/second). This work has the potential to increase the value of data returned from
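
    A toy baseline version of the optimization (without the cross-link coding that is the innovation's focus): split divisible messages across unreliable links to maximize expected decodable value. All numbers below are hypothetical, and the formulation is a simplified stand-in for the paper's linear program.

        import numpy as np
        from scipy.optimize import linprog

        p = np.array([0.9, 0.7, 0.5])   # link working probabilities
        c = np.array([4.0, 3.0, 2.0])   # link capacities
        s = np.array([5.0, 4.0])        # message sizes
        v = np.array([10.0, 2.0])       # message values

        n_links, n_msgs = len(p), len(s)
        # Maximize sum_ij p[i] * (v[j]/s[j]) * x[i,j] -> minimize the negative.
        cost = -(p[:, None] * (v / s)[None, :]).ravel()

        # Capacity constraints: sum_j x[i,j] <= c[i] for each link i.
        A_cap = np.kron(np.eye(n_links), np.ones(n_msgs))
        # Size constraints: sum_i x[i,j] <= s[j] for each message j.
        A_size = np.kron(np.ones(n_links), np.eye(n_msgs))

        res = linprog(cost,
                      A_ub=np.vstack([A_cap, A_size]),
                      b_ub=np.concatenate([c, s]),
                      bounds=[(0, None)] * (n_links * n_msgs))
        x = res.x.reshape(n_links, n_msgs)   # amount of message j on link i
        print("expected decodable value:", -res.fun)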

  12. SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80

    NASA Astrophysics Data System (ADS)

    Kamat, Manohar P.; Watson, Brian C.

    1992-02-01

    The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern-day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special-purpose code, named with the acronym SAPNEW, performs static and eigenvalue analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.

  13. SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80

    NASA Technical Reports Server (NTRS)

    Kamat, Manohar P.; Watson, Brian C.

    1992-01-01

    The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern-day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special-purpose code, named with the acronym SAPNEW, performs static and eigenvalue analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.

  14. Parallel evolution of chordate cis-regulatory code for development.

    PubMed

    Doglio, Laura; Goode, Debbie K; Pelleri, Maria C; Pauls, Stefan; Frabetti, Flavia; Shimeld, Sebastian M; Vavouri, Tanya; Elgar, Greg

    2013-11-01

    Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.

  15. Simulating Free Surface Flows with SPH

    NASA Astrophysics Data System (ADS)

    Monaghan, J. J.

    1994-02-01

    The SPH (smoothed particle hydrodynamics) method is extended to deal with free surface incompressible flows. The method is easy to use, and examples will be given of its application to a breaking dam, a bore, the simulation of a wave maker, and the propagation of waves towards a beach. Arbitrary moving boundaries can be included by modelling the boundaries by particles which repel the fluid particles. The method is explicit, and the time steps are therefore much shorter than required by other less flexible methods, but it is robust and easy to program.
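
    The repelling boundary particles mentioned above are commonly implemented with a Lennard-Jones-type radial force; a sketch in that spirit (the exponents and scaling below are the commonly quoted choices, assumed here rather than taken verbatim from the paper):

        import numpy as np

        def boundary_force(r_vec, r0=0.01, D=5.0, p1=4, p2=2):
            # Repulsive force from a boundary particle on a fluid particle:
            # f = D [ (r0/r)^p1 - (r0/r)^p2 ] r_vec / r^2  for r < r0, else 0.
            # r0 ~ initial particle spacing; D is often scaled with g*H
            # for dam-break-type problems (illustrative values here).
            r = np.linalg.norm(r_vec)
            if r >= r0 or r == 0.0:
                return np.zeros_like(r_vec)
            q = r0 / r
            return D * (q**p1 - q**p2) * r_vec / r**2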

  16. Jet formation and equatorial superrotation in Jupiter's atmosphere: Numerical modelling using a new efficient parallel code

    NASA Astrophysics Data System (ADS)

    Rivier, Leonard Gilles

    Using an efficient parallel code solving the primitive equations of atmospheric dynamics, the jet structure of a Jupiter-like atmosphere is modeled. In the first part of this thesis, a parallel spectral code solving both the shallow water equations and the multi-level primitive equations of atmospheric dynamics is built. This code, called BOB, is implemented so that it runs effectively on an inexpensive cluster of workstations. A one-dimensional decomposition and transposition method ensuring load balancing among processes is used. The Legendre transform is cache-blocked. Computing the Legendre polynomials used in the spectral method on the fly produces a lower memory footprint and enables high-resolution runs on machines with relatively little memory. Performance studies are done using a cluster of workstations located at the National Center for Atmospheric Research (NCAR). BOB's performance is compared to the parallel benchmark code PSTSWM and the dynamical core of NCAR's CCM3.6.6. In both cases, the comparison favors BOB. In the second part of this thesis, the primitive equation version of the code described in part I is used to study the formation of organized zonal jets and equatorial superrotation in a planetary atmosphere where the parameters are chosen to best model the upper atmosphere of Jupiter. Two levels are used in the vertical and only large-scale forcing is present. The model is forced towards a baroclinically unstable flow, so that eddies are generated by baroclinic instability. We consider several types of forcing, acting on either the temperature or the momentum field. We show that only under very specific parametric conditions do zonally elongated structures form and persist, resembling the jet structure observed near the cloud top (1 bar) on Jupiter. We also study the effect of an equatorial heat source, meant to be a crude representation of the effect of the deep convective planetary interior onto the outer atmospheric layer. We

  17. A clip-domain serine proteinase homolog (SPH) in oriental river prawn, Macrobrachium nipponense provides insights into its role in innate immune response.

    PubMed

    Ding, Zhili; Kong, Youqin; Chen, Liqiao; Qin, Jianguang; Sun, Shengming; Li, Ming; Du, Zhenyu; Ye, Jinyun

    2014-08-01

    In this study, a clip-domain serine proteinase homolog designated MnSPH was cloned and characterized from the freshwater prawn Macrobrachium nipponense. The full-length cDNA of MnSPH was 1897 bp and contained a 1701 bp open reading frame (ORF) encoding a protein of 566 amino acids, a 103 bp 5'-untranslated region, and a 93 bp 3'-untranslated region. Sequence comparison showed that the deduced amino acids of MnSPH shared 30-59% identity with sequences reported in other animals. Tissue distribution analysis indicated that MnSPH transcripts were present in all examined tissues, with the highest levels in the hepatopancreas and ovary. The MnSPH mRNA levels in the developing ovary were stable during the first three developmental stages, then increased gradually from stage IV (later vitellogenesis) and reached a maximum at stage VI (paracmasis). Furthermore, the expression of MnSPH mRNA in hemocytes was significantly up-regulated at 1.5 h, 6 h, 12 h and 48 h post Aeromonas hydrophila injection. The increased phenoloxidase activity also demonstrated a clear time-dependent pattern after A. hydrophila challenge. These results suggest that MnSPH participates in resistance to pathogenic microorganisms and plays a pivotal role in host defense against microbial invasion in M. nipponense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Dark matter in the Reticulum II dSph: a radio search

    NASA Astrophysics Data System (ADS)

    Regis, Marco; Richter, Laura; Colafrancesco, Sergio

    2017-07-01

    We present a deep radio search in the Reticulum II dwarf spheroidal (dSph) galaxy performed with the Australia Telescope Compact Array. Observations were conducted at 16 cm wavelength, with an rms sensitivity of 0.01 mJy/beam, and with the goal of searching for synchrotron emission induced by annihilation or decay of weakly interacting massive particles (WIMPs). Data were complemented with observations on large angular scales taken with the KAT-7 telescope. We find no evidence for a diffuse emission from the dSph and we derive competitive bounds on the WIMP properties. In addition, we detect more than 200 new background radio sources. Among them, we show there are two compelling candidates for being the radio counterpart of the possible γ-ray emission reported by other groups using Fermi-LAT data.

  19. Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

    NASA Astrophysics Data System (ADS)

    Marx, Alain; Lütjens, Hinrich

    2017-03-01

    A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than that of the sequential one for low-resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows simulations to be performed at higher resolutions, previously out of reach because of memory limitations.

  20. Magnetosphere simulations with a high-performance 3D AMR MHD Code

    NASA Astrophysics Data System (ADS)

    Gombosi, Tamas; Dezeeuw, Darren; Groth, Clinton; Powell, Kenneth; Song, Paul

    1998-11-01

    BATS-R-US is a high-performance 3D AMR MHD code for space physics applications running on massively parallel supercomputers. In BATS-R-US the electromagnetic and fluid equations are solved with a high-resolution upwind numerical scheme in a tightly coupled manner. The code is very robust and it is capable of spanning a wide range of plasma parameters (such as β, acoustic and Alfvénic Mach numbers). Our code is highly scalable: it achieved a sustained performance of 233 GFLOPS on a Cray T3E-1200 supercomputer with 1024 PEs. This talk reports results from the BATS-R-US code for the GGCM (Geospace General Circulation Model) Phase 1 Standard Model Suite. This model suite contains 10 different steady-state configurations: 5 IMF clock angles (north, south, and three equally spaced angles in between) with 2 IMF field strengths for each angle (5 nT and 10 nT). The other parameters are: solar wind speed = 400 km/sec; solar wind number density = 5 protons/cc; Hall conductance = 0; Pedersen conductance = 5 S; parallel conductivity = ∞.

  1. Steel Fibre Reinforced Concrete Simulation with the SPH Method

    NASA Astrophysics Data System (ADS)

    Hušek, Martin; Kala, Jiří; Král, Petr; Hokeš, Filip

    2017-10-01

    Steel fibre reinforced concrete (SFRC) is very popular in many branches of civil engineering. Thanks to its increased ductility, it is able to resist various types of loading. When designing a structure, the mechanical behaviour of SFRC can be described by currently available material models (with an equivalent material, for example), and therefore no problems arise in numerical simulations. But in many scenarios, e.g. high-speed loading, it would be a mistake to use such an equivalent material. Physical modelling of the steel fibres used in concrete is usually problematic, though. It is necessary to consider the fact that mesh-based methods are very unsuitable for high-speed simulations because of the issues caused by excessive mesh deformation. So-called meshfree methods are much more suitable for this purpose, and the Smoothed Particle Hydrodynamics (SPH) method is currently the best choice among them. However, a numerical defect known as tensile instability may appear when the SPH method is used. It causes the development of numerical (false) cracks, making simulations of ductile types of failure significantly more difficult to perform. The contribution therefore describes a procedure for avoiding this defect and successfully simulating the behaviour of SFRC with the SPH method. The essence of the problem lies in the choice of coordinates and the description of the integration domain derived from them: spatial coordinates (Eulerian kernel) or material coordinates (Lagrangian kernel). The contribution describes the behaviour of both formulations. Conclusions are drawn from the fundamental tasks, and the contribution additionally demonstrates the functionality of SFRC simulations. The random generation of steel fibres and their inclusion in simulations are also discussed. The functionality of the method is supported by the results of pressure test simulations which compare various levels of fibre reinforcement of SFRC

  2. OpenGeoSys-GEMS: Hybrid parallelization of a reactive transport code with MPI and threads

    NASA Astrophysics Data System (ADS)

    Kosakowski, G.; Kulik, D. A.; Shao, H.

    2012-04-01

    OpenGeoSys-GEMS is a generic-purpose reactive transport code based on the operator-splitting approach. The code couples the Finite-Element groundwater flow and multi-species transport modules of the OpenGeoSys (OGS) project (http://www.ufz.de/index.php?en=18345) with the GEM-Selektor research package to model thermodynamic equilibrium of aquatic (geo)chemical systems utilizing the Gibbs Energy Minimization approach (http://gems.web.psi.ch/). The combination of OGS and the GEM-Selektor kernel (GEMS3K) is highly flexible due to the object-oriented modular code structures and the well-defined (memory-based) data exchange modules. Like other reactive transport codes, the practical applicability of OGS-GEMS is often hampered by long calculation times and large memory requirements.
    • For realistic geochemical systems, which might include dozens of mineral phases and several (non-ideal) solid solutions, the time needed to solve the chemical system with GEMS3K may increase exceptionally.
    • The codes are coupled in a sequential non-iterative loop. In order to keep the accuracy, the time step size is restricted. In combination with a fine spatial discretization, the time step size may become very small, which increases calculation times drastically even for small 1D problems.
    • The current version of OGS is not optimized for memory use, and the MPI version of OGS does not distribute data between nodes. Even for moderately small 2D problems, the number of MPI processes that fit into the memory of up-to-date workstations or HPC hardware is limited.
    One strategy to overcome the above-mentioned restrictions of OGS-GEMS is to parallelize the coupled code. For OGS a parallelized version already exists. It is based on a domain decomposition method implemented with MPI and provides a parallel solver for fluid and mass transport processes. In the coupled code, after solving fluid flow and solute transport, geochemical calculations are done in form of a central loop over all finite

  3. One ancestor for two codes viewed from the perspective of two complementary modes of tRNA aminoacylation

    PubMed Central

    Rodin, Andrei S; Szathmáry, Eörs; Rodin, Sergei N

    2009-01-01

    Background The genetic code is brought into action by 20 aminoacyl-tRNA synthetases. These enzymes are evenly divided into two classes (I and II) that recognize tRNAs from the minor and major groove sides of the acceptor stem, respectively. We have reported recently that: (1) ribozymic precursors of the synthetases seem to have used the same two sterically mirror modes of tRNA recognition, (2) having these two modes might have helped in preventing erroneous aminoacylation of ancestral tRNAs with complementary anticodons, yet (3) the risk of confusion for the presumably earliest pairs of complementarily encoded amino acids had little to do with anticodons. Accordingly, in this communication we focus on the acceptor stem. Results Our main result is the emergence of a palindrome structure for the acceptor stem's common ancestor, reconstructed from the phylogenetic trees of Bacteria, Archaea and Eukarya. In parallel, for pairs of ancestral tRNAs with complementary anticodons, we present updated evidence of concerted complementarity of the second bases in the acceptor stems. These two results suggest that the first pairs of "complementary" amino acids that were engaged in primordial coding, such as Gly and Ala, could have avoided erroneous aminoacylation if and only if the acceptor stems of their adaptors were recognized from the same, major groove, side. The class II protein synthetases then inherited this "primary preference" from isofunctional ribozymes. Conclusion Taken together, our results support the hypothesis that the genetic code per se (the one associated with the anticodons) and the operational code of aminoacylation (associated with the acceptor) diverged from a common ancestor that probably began developing before translation. The primordial advantage of linking some amino acids (most likely glycine and alanine) to the ancestral acceptor stem may have been selective retention in a protocell surrounded by a leaky membrane for use in nucleotide and coenzyme

  4. SPH modelling of energy partitioning during impacts on Venus

    NASA Technical Reports Server (NTRS)

    Takata, T.; Ahrens, T. J.

    1993-01-01

    Impact cratering of the Venusian planetary surface by meteorites was investigated numerically using the Smoothed Particle Hydrodynamics (SPH) method. Venus presently has a dense atmosphere, so vigorous transfer of energy between impacting meteorites, the planetary surface, and the atmosphere is expected during impact events. The investigation concentrated on the effects of the atmosphere on energy partitioning and the flow of ejecta and gas. The SPH method is particularly suitable for studying complex motion, especially because of its ability to be extended to three dimensions. In our simulations, particles representing impactors and targets are initially set to a uniform density, and those of the atmosphere are set to be in hydrostatic equilibrium. Target, impactor, and atmosphere are represented by 9800, 80, and 4200 particles, respectively. A Tillotson equation of state for granite is assumed for the target and impactor, and an ideal gas with constant specific heat ratio is used for the atmosphere. Two-dimensional axisymmetric geometry was assumed, and normal impacts of 10 km diameter projectiles with velocities of 5, 10, 20, and 40 km/s, both with and without an atmosphere present, were modeled.

  5. Fortran code for SU(3) lattice gauge theory with and without MPI checkerboard parallelization

    NASA Astrophysics Data System (ADS)

    Berg, Bernd A.; Wu, Hao

    2012-10-01

    We document plain Fortran and Fortran MPI checkerboard code for Markov chain Monte Carlo simulations of pure SU(3) lattice gauge theory with the Wilson action in D dimensions. The Fortran code uses periodic boundary conditions and is suitable for pedagogical purposes and small-scale simulations. For the Fortran MPI code two geometries are covered: the usual torus with periodic boundary conditions and the double-layered torus as defined in the paper. Parallel computing is performed on checkerboards of sublattices, which partition the full lattice in one, two, and so on, up to D directions (depending on the parameters set). For updating, the Cabibbo-Marinari heatbath algorithm is used. We present validations and test runs of the code. Performance is reported for a number of currently used Fortran compilers and, when applicable, MPI versions. For the parallelized code, performance is studied as a function of the number of processors. Program summary Program title: STMC2LSU3MPI Catalogue identifier: AEMJ_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEMJ_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 26666 No. of bytes in distributed program, including test data, etc.: 233126 Distribution format: tar.gz Programming language: Fortran 77 compatible with the use of Fortran 90/95 compilers, in part with MPI extensions. Computer: Any capable of compiling and executing Fortran 77 or Fortran 90/95, when needed with MPI extensions. Operating system: Red Hat Enterprise Linux Server 6.1 with OpenMPI + pgf77 11.8-0, Centos 5.3 with OpenMPI + gfortran 4.1.2, Cray XT4 with MPICH2 + pgf90 11.2-0. Has the code been vectorised or parallelized?: Yes, parallelized using MPI extensions. Number of processors used: 2 to 11664 RAM: 200 megabytes per process. Classification: 11
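
    The checkerboard idea itself is simple to state in code: partition the lattice into even and odd sites by coordinate parity so that each half can be updated concurrently. A minimal sketch (the lattice dimensions are arbitrary illustration, and a real implementation checkerboards sublattices as the abstract describes):

        import numpy as np

        # Even/odd partition of a 4D lattice: sites whose coordinate sum
        # is even can be updated simultaneously, since the update of one
        # site depends only on neighbors of opposite parity.
        shape = (4, 4, 4, 8)                       # x, y, z, t
        parity = np.indices(shape).sum(axis=0) % 2
        even_sites = (parity == 0)                 # update in one sweep
        odd_sites = ~even_sites                    # then in the next

        # In an MPI code each sublattice updates its even sites, exchanges
        # boundary data, then updates its odd sites.
        print(even_sites.sum(), odd_sites.sum())   # half the lattice each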

  6. A Parallel Decoding Algorithm for Short Polar Codes Based on Error Checking and Correcting

    PubMed Central

    Pan, Xiaofei; Pan, Kegang; Ye, Zhan; Gong, Chao

    2014-01-01

    We propose a parallel decoding algorithm based on error checking and correcting to improve the performance of short polar codes. In order to enhance the error-correcting capacity of the decoding algorithm, we first derive the error-checking equations generated on the basis of the frozen nodes, and then we introduce a method to check for errors in the input nodes of the decoder using the solutions of these equations. To further correct those checked errors, we adopt the method of modifying the probability messages of the error nodes with constant values according to the maximization principle. Due to the existence of multiple solutions of the error-checking equations, we formulate a CRC-aided optimization problem of finding the optimal solution with three different target functions, so as to improve the accuracy of error checking. In addition, to increase the throughput of decoding, we use a parallel method based on the decoding tree to calculate the probability messages of all the nodes in the decoder. Numerical results show that the proposed decoding algorithm achieves better performance than some existing decoding algorithms with the same code length. PMID:25540813

  7. Numerical simulation for the air entrainment of aerated flow with an improved multiphase SPH model

    NASA Astrophysics Data System (ADS)

    Wan, Hang; Li, Ran; Pu, Xunchi; Zhang, Hongwei; Feng, Jingjie

    2017-11-01

    Aerated flow is a complex hydraulic phenomenon that exists widely in the field of environmental hydraulics. It is generally characterised by large deformation and violent fragmentation of the free surface. Compared to Eulerian methods (the volume of fluid (VOF) method or the rigid-lid hypothesis method), the existing single-phase Smoothed Particle Hydrodynamics (SPH) method has performed well in solving particle motion. A lack of research on interphase interaction and air concentration, however, has limited the application of the SPH model. In our study, an improved multiphase SPH model is presented to simulate aerated flows. A drag force is included in the momentum equation to ensure an accurate air-particle slip velocity. Furthermore, a calculation method for air concentration is developed to analyse the air entrainment characteristics. Two case studies were used to simulate the hydraulic and air entrainment characteristics, and the simulation results agree well with the experimental results.

  8. Parallel Monte Carlo transport modeling in the context of a time-dependent, three-dimensional multi-physics code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Procassini, R.J.

    1997-12-31

    The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.

  9. Dark matter in the Reticulum II dSph: a radio search

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Regis, Marco; Richter, Laura; Colafrancesco, Sergio, E-mail: regis@to.infn.it, E-mail: llrichter@gmail.com, E-mail: sergio.colafrancesco@wits.ac.za

    2017-07-01

    We present a deep radio search in the Reticulum II dwarf spheroidal (dSph) galaxy performed with the Australia Telescope Compact Array. Observations were conducted at 16 cm wavelength, with an rms sensitivity of 0.01 mJy/beam, and with the goal of searching for synchrotron emission induced by annihilation or decay of weakly interacting massive particles (WIMPs). Data were complemented with observations on large angular scales taken with the KAT-7 telescope. We find no evidence for a diffuse emission from the dSph and we derive competitive bounds on the WIMP properties. In addition, we detect more than 200 new background radio sources. Among them, we show there are two compelling candidates for being the radio counterpart of the possible γ-ray emission reported by other groups using Fermi-LAT data.

  10. Hybrid MPI-OpenMP Parallelism in the ONETEP Linear-Scaling Electronic Structure Code: Application to the Delamination of Cellulose Nanofibrils.

    PubMed

    Wilkinson, Karl A; Hine, Nicholas D M; Skylaris, Chris-Kriton

    2014-11-11

    We present a hybrid MPI-OpenMP implementation of Linear-Scaling Density Functional Theory within the ONETEP code. We illustrate its performance on a range of high performance computing (HPC) platforms comprising shared-memory nodes with fast interconnect. Our work has focused on applying OpenMP parallelism to the routines which dominate the computational load, attempting where possible to parallelize different loops from those already parallelized within MPI. This includes 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations from an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils when undergoing sonication, a process which is controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and as these systems comprise up to 21,276 atoms this would not have been feasible without the developments described here.

  11. ls1 mardyn: The Massively Parallel Molecular Dynamics Code for Large Systems.

    PubMed

    Niethammer, Christoph; Becker, Stefan; Bernreuther, Martin; Buchholz, Martin; Eckhardt, Wolfgang; Heinecke, Alexander; Werth, Stephan; Bungartz, Hans-Joachim; Glass, Colin W; Hasse, Hans; Vrabec, Jadran; Horsch, Martin

    2014-10-14

    The molecular dynamics simulation code ls1 mardyn is presented. It is a highly scalable code, optimized for massively parallel execution on supercomputing architectures and currently holds the world record for the largest molecular simulation with over four trillion particles. It enables the application of pair potentials to length and time scales that were previously out of scope for molecular dynamics simulation. With an efficient dynamic load balancing scheme, it delivers high scalability even for challenging heterogeneous configurations. Presently, multicenter rigid potential models based on Lennard-Jones sites, point charges, and higher-order polarities are supported. Due to its modular design, ls1 mardyn can be extended to new physical models, methods, and algorithms, allowing future users to tailor it to suit their respective needs. Possible applications include scenarios with complex geometries, such as fluids at interfaces, as well as nonequilibrium molecular dynamics simulation of heat and mass transfer.

  12. Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code

    NASA Astrophysics Data System (ADS)

    Longoni, Gianluca; Anderson, Stanwood L.

    2009-08-01

    The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained; Section 3 addresses the parallel performance of the code; and Section 4 concludes this paper with final remarks and future work.

  13. Divergence-Free SPH for Incompressible and Viscous Fluids.

    PubMed

    Bender, Jan; Koschier, Dan

    2017-03-01

    In this paper we present a novel Smoothed Particle Hydrodynamics (SPH) method for the efficient and stable simulation of incompressible fluids. The most efficient SPH-based approaches enforce incompressibility on either the position or the velocity level. However, the continuity equation for incompressible flow demands both a constant density and a divergence-free velocity field. We propose a combination of two novel implicit pressure solvers enforcing both a low volume compression and a divergence-free velocity field. While a compression-free fluid is essential for realistic physical behavior, a divergence-free velocity field drastically reduces the number of required solver iterations and increases the stability of the simulation significantly. Thanks to the improved stability, our method can handle larger time steps than previous approaches. This results in a substantial performance gain since the computationally expensive neighborhood search has to be performed less frequently. Moreover, we introduce a third, optional implicit solver to simulate highly viscous fluids which seamlessly integrates into our solver framework. Our implicit viscosity solver produces realistic results while introducing almost no numerical damping. We demonstrate the efficiency, robustness and scalability of our method in a variety of complex simulations including scenarios with millions of turbulent particles or highly viscous materials.
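
    For context, the quantity such a divergence-free solver drives to zero can be estimated per particle with one common SPH discretization of the velocity divergence; the sketch below assumes a kernel-gradient callback and a simple array layout for illustration, not the paper's solver.

        import numpy as np

        def velocity_divergence(i, positions, velocities, masses, rho, grad_w):
            # One common SPH estimate:
            # (div v)_i = -(1/rho_i) * sum_j m_j (v_i - v_j) . grad W_ij.
            # grad_w(r_vec) is assumed to return the kernel gradient for
            # the pair separation r_vec (hypothetical helper).
            div = 0.0
            for j in range(len(positions)):
                if j == i:
                    continue
                dv = velocities[i] - velocities[j]
                dW = grad_w(positions[i] - positions[j])
                div += masses[j] * np.dot(dv, dW)
            return -div / rho[i]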

  14. Evidence of enrichment by individual SN from elemental abundance ratios in the very metal-poor dSph galaxy Boötes I

    NASA Astrophysics Data System (ADS)

    Feltzing, S.; Eriksson, K.; Kleyna, J.; Wilkinson, M. I.

    2009-12-01

    Aims. We establish the mean metallicity from high-resolution spectroscopy for the recently found dwarf spheroidal galaxy Boötes I and test whether it is a common feature for ultra-faint dwarf spheroidal galaxies to show signs of inhomogeneous chemical evolution (e.g. as found in the Hercules dwarf spheroidal galaxy). Methods: We analyse high-resolution, moderate signal-to-noise spectra for seven red giant stars in the Boötes I dSph galaxy using standard abundance analysis techniques. In particular, we assume local thermodynamic equilibrium and employ spherical model atmospheres and codes that take the sphericity of the star into account when calculating the elemental abundances. Results: We confirm previous determinations of the mean metallicity of the Boötes I dwarf spheroidal galaxy to be -2.3 dex. Whilst five stars are clustered around this metallicity, one is significantly more metal-poor, at -2.9 dex, and one is more metal-rich, at -1.9 dex. Additionally, we find that one of the stars, Boo-127, shows an atypically high [Mg/Ca] ratio, indicative of stochastic enrichment processes within the dSph galaxy. Similar results have previously only been found in the Hercules and Draco dSph galaxies and appear, so far, to be unique to this type of galaxy. The data presented herein were obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W.M. Keck Foundation.

  15. SPH modeling and simulation of spherical particles interacting in a viscoelastic matrix

    NASA Astrophysics Data System (ADS)

    Vázquez-Quesada, A.; Ellero, M.

    2017-12-01

    In this work, we extend the three-dimensional Smoothed Particle Hydrodynamics (SPH) non-colloidal particulate model previously developed for Newtonian suspending media in Vázquez-Quesada and Ellero ["Rheology and microstructure of non-colloidal suspensions under shear studied with smoothed particle hydrodynamics," J. Non-Newtonian Fluid Mech. 233, 37-47 (2016)] to viscoelastic matrices. For the solvent medium, the coarse-grained SPH viscoelastic formulation proposed in Vázquez-Quesada, Ellero, and Español ["Smoothed particle hydrodynamic model for viscoelastic fluids with thermal fluctuations," Phys. Rev. E 79, 056707 (2009)] is adopted. The property of this particular set of equations is that they are entirely derived within the general equation for non-equilibrium reversible-irreversible coupling formalism and therefore enjoy automatically thermodynamic consistency. The viscoelastic model is derived through a physical specification of a conformation-tensor-dependent entropy function for the fluid particles. In the simple case of suspended Hookean dumbbells, this delivers a specific SPH discretization of the Oldroyd-B constitutive equation. We validate the suspended particle model by studying the dynamics of single and mutually interacting "noncolloidal" rigid spheres under shear flow and in the presence of confinement. Numerical results agree well with available numerical and experimental data. It is straightforward to extend the particulate model to Brownian conditions and to more complex viscoelastic solvents.

  16. High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

    ERIC Educational Resources Information Center

    von Davier, Matthias

    2016-01-01

    This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…

  17. Genetic Code Optimization for Cotranslational Protein Folding: Codon Directional Asymmetry Correlates with Antiparallel Betasheets, tRNA Synthetase Classes.

    PubMed

    Seligmann, Hervé; Warthi, Ganesh

    2017-01-01

    A new codon property, codon directional asymmetry in nucleotide content (CDA), reveals a biologically meaningful genetic code dimension: palindromic codons (first and last nucleotides identical, codon structure XZX) are symmetric (CDA = 0), codons with structures ZXX/XXZ are 5'/3' asymmetric (CDA = -1/1; CDA = -0.5/0.5 if Z and X are both purines or both pyrimidines; assigning negative/positive (-/+) signs is an arbitrary convention). Negative/positive CDAs associate with (a) Fujimoto's tetrahedral codon stereo-table; (b) tRNA synthetase class I/II (aminoacylate the 2'/3' hydroxyl group of the tRNA's last ribose, respectively); and (c) high/low antiparallel (not parallel) betasheet conformation parameters. Preliminary results suggest CDA-whole organism associations (body temperature, developmental stability, lifespan). Presumably, CDA impacts spatial kinetics of codon-anticodon interactions, affecting cotranslational protein folding. Some synonymous codons have opposite CDA sign (alanine, leucine, serine, and valine), putatively explaining how synonymous mutations sometimes affect protein function. Correlations between CDA and tRNA synthetase classes are weaker than between CDA and antiparallel betasheet conformation parameters. This effect is stronger for mitochondrial genetic codes, and potentially drives mitochondrial codon-amino acid reassignments. CDA reveals information ruling nucleotide-protein relations embedded in reversed (not reverse-complement) sequences (5'-ZXX-3'/5'-XXZ-3').
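
    The scoring rule quoted above is simple enough to state in code. The following Python sketch implements it as described in the abstract; the handling of codons whose three bases all differ is not specified there, so the function returns None for that case as an assumption.

      # Sketch of the codon directional asymmetry (CDA) rule from the
      # abstract: palindromic XZX codons score 0; ZXX/XXZ codons score -1/+1,
      # halved to -0.5/+0.5 when Z and X share the purine/pyrimidine class.

      PURINES = {"A", "G"}

      def same_class(a, b):
          """True when both nucleotides are purines or both are pyrimidines."""
          return (a in PURINES) == (b in PURINES)

      def cda(codon):
          """Codon directional asymmetry in nucleotide content."""
          a, b, c = codon.upper()
          if a == c:                   # palindromic XZX codon
              return 0.0
          if a == b:                   # XXZ: 3' asymmetric
              return 0.5 if same_class(a, c) else 1.0
          if b == c:                   # ZXX: 5' asymmetric
              return -0.5 if same_class(a, b) else -1.0
          return None                  # all three bases differ: rule not given

      for codon in ("ATA", "GGC", "TGG", "GAA"):
          print(codon, cda(codon))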

  18. Hybrid parallel code acceleration methods in full-core reactor physics calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Courau, T.; Plagne, L.; Ponicot, A.

    2012-07-01

    When dealing with nuclear reactor calculation schemes, the need for three-dimensional (3D) transport-based reference solutions is essential for both validation and optimization purposes. Considering a benchmark problem, this work investigates the potential of discrete ordinates (Sn) transport methods applied to 3D pressurized water reactor (PWR) full-core calculations. First, the benchmark problem is described. It involves a pin-by-pin description of a 3D PWR first core, and uses an 8-group cross-section library prepared with the DRAGON cell code. Then, a convergence analysis is performed using the PENTRAN parallel Sn Cartesian code. It discusses the spatial refinement and the associated angular quadrature required to properly describe the problem physics. It also shows that initializing the Sn solution with the EDF SPN solver COCAGNE reduces the number of iterations required to converge by nearly a factor of 6. Using a best-estimate model, PENTRAN results are then compared to multigroup Monte Carlo results obtained with the MCNP5 code. Good consistency is observed between the two methods (Sn and Monte Carlo), with discrepancies that are less than 25 pcm for the k-eff, and less than 2.1% and 1.6% for the flux at the pin-cell level and for the pin-power distribution, respectively. (authors)

  19. Multiscale modeling of the detonation of aluminized explosives using the SPH-MD-QM method

    NASA Astrophysics Data System (ADS)

    Peng, Qing; Wang, Guangyu; Liu, Gui-Rong; de, Suvranu

    Aluminized explosives have been applied in the military industry for decades. Compared with ideal explosives, aluminized explosives feature both fast detonation and slow metal combustion chemistry, generating a complex multi-phase reactive flow. Here, we introduce a sequential SPH-MD-QM multiscale model to simulate the detonation behavior of aluminized explosives. At the bottom level, first-principles quantum mechanics (QM) calculations are employed to obtain the training sets for fitting the ReaxFF potentials, which are used in turn in the reactive molecular dynamics (MD) simulations at the middle level to obtain the chemical reaction rates and equations of state. At the top level, a smoothed particle hydrodynamics (SPH) method incorporating an ignition-and-growth model and an afterburning model is used to simulate the detonation and combustion of the aluminized explosive. Simulations are compared with experiment and good agreement is observed. The proposed SPH-MD-QM multiscale method could be used to optimize the performance of aluminized explosives. The authors would like to acknowledge the generous financial support from the Defense Threat Reduction Agency (DTRA) Grant No. HDTRA1-13-1-0025 and the Office of Naval Research Grants ONR Award No. N00014-08-1-0462 and No. N00014-12-1-0527.

  20. Multi-phase SPH model for simulation of erosion and scouring by means of the Shields and Drucker-Prager criteria.

    NASA Astrophysics Data System (ADS)

    Zubeldia, Elizabeth H.; Fourtakas, Georgios; Rogers, Benedict D.; Farias, Márcio M.

    2018-07-01

    A two-phase numerical model using Smoothed Particle Hydrodynamics (SPH) is developed to model the scouring of two-phase liquid-sediment flows with large deformation. The rheology of sediment scouring due to flows with slow kinematics and high shear forces presents a challenge in terms of spurious numerical fluctuations. This paper bridges the gap between the non-Newtonian and Newtonian flows by proposing a model that combines the yielding, shear and suspension layer mechanics which are needed to predict accurately the local erosion phenomena. A critical bed-mobility condition based on the Shields criterion is imposed on the particles located at the sediment surface. Thus, the onset of the erosion process is independent of the pressure field, which eliminates the numerical problem of pressure-dependent erosion at the interface. This is combined with the Drucker-Prager yield criterion to predict the onset of yielding of the sediment surface, and with a concentration-based suspension model. The multi-phase model has been implemented in the open-source DualSPHysics code accelerated with a graphics processing unit (GPU). The multi-phase model has been compared with 2-D reference numerical models and with new experimental data for scour, with convergent results. Numerical results for a dry-bed dam break over an erodible bed show improved agreement with experimental scour and water surface profiles compared to well-known SPH multi-phase models.
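
    For readers unfamiliar with the two criteria named above, a minimal Python sketch follows: the Shields number compares bed shear stress with the submerged grain weight, and the Drucker-Prager test bounds the deviatoric stress by a pressure-dependent line. All constants (critical Shields number, alpha, k) are illustrative placeholders, not the paper's calibration.

      # Hedged sketch of the two yield checks named in the abstract.

      def shields_number(tau_b, rho_s, rho_f, d, g=9.81):
          """Dimensionless bed shear stress for a grain of diameter d."""
          return tau_b / ((rho_s - rho_f) * g * d)

      def erodes(tau_b, rho_s, rho_f, d, theta_cr=0.047):
          """Surface grain starts to move once theta exceeds a critical value."""
          return shields_number(tau_b, rho_s, rho_f, d) > theta_cr

      def drucker_prager_yields(sqrt_j2, i1, alpha, k):
          """Yield when sqrt(J2) reaches the pressure-dependent strength."""
          return sqrt_j2 >= alpha * i1 + k

      print(erodes(tau_b=0.8, rho_s=2650.0, rho_f=1000.0, d=0.001))
      print(drucker_prager_yields(sqrt_j2=1.2e3, i1=5.0e3, alpha=0.2, k=100.0))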

  1. Visual analysis of inter-process communication for large-scale parallel computing.

    PubMed

    Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

    2009-01-01

    In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also the communication between the different processes, which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt chart with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.

  2. The Sagittarius Dwarf Galaxy Survey (SDGS): Constraints on the Star Formation History of the Sgr dSph

    NASA Astrophysics Data System (ADS)

    Bellazzini, M.; Ferraro, F. R.; Buonanno, R.

    1999-01-01

    We present the first results of a large photometric survey devoted to the study of the star formation history in the Sagittarius dwarf spheroidal galaxy (Sgr dSph). Three large (size: 9 x 35 arcmin2) and widely spaced fields located nearly along the Sgr dSph major axis [(l,b) = (6.5, -16); (6, -14); (5, -12)] have been observed in the V and I passbands with the ESO-NTT 3.5-m telescope (La Silla, Chile). Well-calibrated photometry has been obtained for ~90000 stars toward Sgr dSph and for ~9000 stars in a 9 x 24 arcmin2 control field, down to a limiting magnitude of V ~ 22. At present this is the largest photometric (CCD) sample of Sgr dSph stars, and the wide spacing between fields provides the first opportunity to study the stellar content of different regions of the galaxy (over a range of ~2 kpc across). Age and metallicity estimates are obtained for the detected stellar populations, and the first evidence is presented for (a) spatial differences in the stellar content and (b) the detection of a very metal-poor population in the field of the Sgr galaxy.

  3. Lighting Up the Thioflavin T by Parallel-Stranded TG(GA)n DNA Homoduplexes.

    PubMed

    Zhu, Jinbo; Yan, Zhiqiang; Zhou, Weijun; Liu, Chuanbo; Wang, Jin; Wang, Erkang

    2018-06-22

    Thioflavin T (ThT) was once regarded as a specific fluorescent probe for the human telomeric G-quadruplex, but in recent years other kinds of DNA have been found that can also bind ThT. Herein, we focus on G-rich parallel-stranded DNA and utilize fluorescence, absorbance, circular dichroism, and surface plasmon resonance spectroscopy to investigate its interaction with ThT. A pyrene label and molecular modeling are applied to unveil the binding mechanism. We find that a new class of non-G-quadruplex, G-rich parallel-stranded (ps) DNA with the sequence TG(GA)n can bind ThT and increase its fluorescence, with an enhancement ability superior to that of the G-quadruplex. The optimal binding specificity for ThT is conferred by two parts. The first part is composed of the two bases TG at the 5' end, which form a critical domain and play an important role in the formation of the binding site for ThT. The second part consists of the remaining alternating d(GA) bases, which form the ps homoduplex and cooperate with the TG bases at the 5' end to bind ThT.

  4. SPH/N-body simulations of small (D = 10 km) monolithic asteroidal breakups and improved parametric relations for Monte-Carlo collisional models

    NASA Astrophysics Data System (ADS)

    Ševecek, Pavel; Broz, Miroslav; Nesvorny, David; Durda, Daniel D.; Asphaug, Erik; Walsh, Kevin J.; Richardson, Derek C.

    2016-10-01

    Detailed models of asteroid collisions can yield important constraints for the evolution of the Main Asteroid Belt, but the respective parameter space is large and often unexplored. We thus performed a new set of simulations of asteroidal breakups, i.e. fragmentations of intact targets, subsequent gravitational reaccumulation and formation of small asteroid families, focusing on parent bodies with diameters D = 10 km. Simulations were performed with a smoothed-particle hydrodynamics (SPH) code (Benz & Asphaug 1994), combined with an efficient N-body integrator (Richardson et al. 2000). We assumed a number of projectile sizes, impact velocities and impact angles. The rheology used in the physical model includes neither friction nor crushing; this allows for a direct comparison to the results of Durda et al. (2007). The resulting size-frequency distributions are significantly different from scaled-down simulations with D = 100 km monolithic targets, although they may be even more different for pre-shattered targets. We derive new parametric relations describing fragment distributions, suitable for Monte-Carlo collisional models. We also characterize velocity fields and angular distributions of fragments, which can be used as initial conditions in N-body simulations of small asteroid families. Finally, we discuss various uncertainties related to SPH simulations.

  5. SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX-80

    NASA Astrophysics Data System (ADS)

    Kamat, Manohar P.; Watson, Brian C.

    1992-11-01

    The finite element method has proven to be an invaluable tool for analysis and design of complex, high performance systems, such as bladed-disk assemblies in aircraft turbofan engines. However, as the problem size increases, the computation time required by conventional computers can be prohibitively high. Parallel processing computers provide the means to overcome these computation time limits. This report summarizes the results of a research activity aimed at providing a finite element capability for analyzing turbomachinery bladed-disk assemblies in a vector/parallel processing environment. A special purpose code, named with the acronym SAPNEW, has been developed to perform static and eigen analysis of multi-degree-of-freedom blade models built up from flat thin shell elements. SAPNEW provides a stand-alone capability for static and eigen analysis on the Alliant FX/80, a parallel processing computer. A preprocessor, named with the acronym NTOS, has been developed to accept NASTRAN input decks and convert them to the SAPNEW format to make SAPNEW more readily usable by researchers at NASA Lewis Research Center.

  6. Temporal parallelization of edge plasma simulations using the parareal algorithm and the SOLPS code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Samaddar, Debasmita; Coster, D. P.; Bonnin, X.

    We show that numerical modelling of edge plasma physics may be successfully parallelized in time. The parareal algorithm has been employed for this purpose and the SOLPS code package coupling the B2.5 finite-volume fluid plasma solver with the kinetic Monte-Carlo neutral code Eirene has been used as a test bed. The complex dynamics of the plasma and neutrals in the scrape-off layer (SOL) region makes this a unique application. It is demonstrated that a significant computational gain (more than an order of magnitude) may be obtained with this technique. The use of the IPS framework for event-based parareal implementation optimizes resource utilization and has been shown to significantly contribute to the computational gain.

  7. Temporal parallelization of edge plasma simulations using the parareal algorithm and the SOLPS code

    DOE PAGES

    Samaddar, Debasmita; Coster, D. P.; Bonnin, X.; ...

    2017-07-31

    We show that numerical modelling of edge plasma physics may be successfully parallelized in time. The parareal algorithm has been employed for this purpose and the SOLPS code package coupling the B2.5 finite-volume fluid plasma solver with the kinetic Monte-Carlo neutral code Eirene has been used as a test bed. The complex dynamics of the plasma and neutrals in the scrape-off layer (SOL) region makes this a unique application. It is demonstrated that a significant computational gain (more than an order of magnitude) may be obtained with this technique. The use of the IPS framework for event-based parareal implementation optimizes resource utilization and has been shown to significantly contribute to the computational gain.
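
    The parareal update itself is compact: each iteration combines a cheap serial coarse propagator G with an accurate fine propagator F that can run in parallel over the time slices, via U[n+1] = G_new(U[n]) + F_old(U[n]) - G_old(U[n]). A minimal Python sketch on a toy ODE (nothing like SOLPS physics) follows; the propagators, step counts and iteration count are arbitrary choices.

      import numpy as np

      # Minimal parareal sketch on the toy ODE u' = lam * u.

      lam, T, slices = -1.0, 2.0, 10
      dt = T / slices

      def G(u, dt):                     # coarse: one explicit Euler step
          return u + dt * lam * u

      def F(u, dt, substeps=100):       # fine: many Euler substeps
          h = dt / substeps
          for _ in range(substeps):
              u = u + h * lam * u
          return u

      U = np.zeros(slices + 1)
      U[0] = 1.0
      for n in range(slices):           # initial serial coarse sweep
          U[n + 1] = G(U[n], dt)

      for k in range(5):                # a few parareal correction sweeps
          F_old = [F(U[n], dt) for n in range(slices)]   # parallelizable part
          G_old = [G(U[n], dt) for n in range(slices)]
          for n in range(slices):       # serial correction sweep
              U[n + 1] = G(U[n], dt) + F_old[n] - G_old[n]

      print("parareal:", U[-1], " exact:", np.exp(lam * T))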

  8. ALARIC: An algorithm for constructing arbitrarily complex initial density distributions with low particle noise for SPH/SPMHD applications

    NASA Astrophysics Data System (ADS)

    Vela Vela, Luis; Sanchez, Raul; Geiger, Joachim

    2018-03-01

    A method is presented to obtain initial conditions for Smoothed Particle Hydrodynamics (SPH) scenarios where arbitrarily complex density distributions and low particle noise are needed. Our method, named ALARIC, tampers with the evolution of the internal variables to obtain a fast and efficient profile evolution towards the desired goal. The result has very low levels of particle noise and constitutes a perfect candidate for studying the equilibrium and stability properties of SPH/SPMHD systems. The method uses the isothermal SPH equations to calculate hydrodynamical forces under the presence of an external fictitious potential and evolves them in time with a second-order symplectic integrator. The proposed method generates tailored initial conditions that perform better in many cases than those based on purely crystalline lattices, since it prevents the appearance of anisotropies.

  9. Fully Parallel MHD Stability Analysis Tool

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2014-10-01

    Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating the steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
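
    For context, inverse iteration is the serial kernel being distributed here: each sweep solves a shifted linear system, and the iterate converges to the eigenvector whose eigenvalue lies closest to the shift. The small dense Python sketch below uses toy values and does not reproduce MARS's distributed solve.

      import numpy as np

      # Serial sketch of inverse iteration: repeatedly solving
      # (A - sigma I) y = x drives x toward the eigenvector whose
      # eigenvalue is nearest the shift sigma.

      def inverse_iteration(A, sigma, iters=50):
          n = A.shape[0]
          M = A - sigma * np.eye(n)
          x = np.random.default_rng(0).standard_normal(n)
          for _ in range(iters):
              y = np.linalg.solve(M, x)    # the linear solve a parallel code distributes
              x = y / np.linalg.norm(y)
          return x @ A @ x, x              # Rayleigh quotient estimate, eigenvector

      A = np.diag([1.0, 3.0, 7.0]) + 0.01  # small symmetric test matrix
      lam, vec = inverse_iteration(A, sigma=2.8)
      print("eigenvalue nearest 2.8:", lam)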

  10. Bilingual parallel programming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Foster, I.; Overbeek, R.

    1990-01-01

    Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seems to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.
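
    A minimal Python illustration of the bilingual idea is sketched below: the application logic stays in the high-level language while a compiled C routine performs the low-level work. Here the "low-level component" is simply the system math library located via ctypes, which assumes a Unix-like system; this is only a sketch of the concept, not the authors' framework.

      import ctypes
      import ctypes.util

      # High-level orchestration in Python; the inner call is compiled C code
      # from libm (assumes a Unix-like system where find_library locates it).
      libm = ctypes.CDLL(ctypes.util.find_library("m"))
      libm.cos.restype = ctypes.c_double
      libm.cos.argtypes = [ctypes.c_double]

      def mean_cos(values):
          """Upper-level logic; the numeric kernel runs in the C library."""
          return sum(libm.cos(v) for v in values) / len(values)

      print(mean_cos([0.0, 0.5, 1.0]))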

  11. Computer-Aided Parallelizer and Optimizer

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  12. Co-assembly of Zn(SPh){sub 2} and organic linkers into helical and zig-zag polymer chains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu Yi; Yu Lingmin; Loo, Say Chye Joachim

    2012-07-15

    Two novel one-dimensional coordination polymers, the single helicate [Zn(SPh)2(TPyTA)(EG)]n (EG = ethylene glycol) (1) and the zig-zag structure [Zn(SPh)2(BPyVB)]n (2), were synthesized under solvothermal conditions at 150 °C or at room temperature by the co-assembly of Zn(SPh)2 and organic linkers such as 2,4,6-tri(4-pyridyl)-1,3,5-triazine (TPyTA) and 1,3-bis(trans-4-pyridylvinyl)benzene (BPyVB). An X-ray crystallography study reveals that both polymers 1 and 2 crystallize in space group P21/c of the monoclinic system. The solid-state UV-vis absorption spectra show that 1 and 2 have maximum absorption onsets at 400 nm and 420 nm, respectively. TGA analysis indicates that 1 and 2 are stable up to 110 °C and 210 °C, respectively. Highlights: • Two novel one-dimensional coordination polymers have been synthesized. • TPyTA results in helical structures in 1 while BPyVB leads to zig-zag chains in 2. • Solid-state UV-vis absorption spectra and TGA analysis of the title polymers were studied.

  13. Improvement of Mishchenko's T-matrix code for absorbing particles.

    PubMed

    Moroz, Alexander

    2005-06-10

    The use of Gaussian elimination with backsubstitution for matrix inversion in scattering theories is discussed. Within the framework of the T-matrix method (the state-of-the-art code by Mishchenko is freely available at http://www.giss.nasa.gov/~crmim), it is shown that the domain of applicability of Mishchenko's FORTRAN 77 (F77) code can be substantially expanded in the direction of strongly absorbing particles, where the current code fails to converge. Such an extension is especially important if the code is to be used in nanoplasmonic or nanophotonic applications involving metallic particles. At the same time, convergence can also be achieved for large nonabsorbing particles, in which case the non-Numerical Algorithms Group option of Mishchenko's code diverges. A computer F77 implementation of Mishchenko's code supplemented with Gaussian elimination with backsubstitution is freely available at http://www.wave-scattering.com.
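
    As a reminder of the building block involved, the Python sketch below implements Gaussian elimination with partial pivoting and backsubstitution for a complex-valued linear system; it illustrates the numerical strategy only and has no connection to Mishchenko's FORTRAN 77 source.

      import numpy as np

      # Plain Gaussian elimination with partial pivoting plus backsubstitution.
      def solve_gauss(A, b):
          A = A.astype(complex)            # T-matrix systems are complex-valued
          b = b.astype(complex)
          n = len(b)
          for k in range(n - 1):           # forward elimination with pivoting
              p = k + np.argmax(np.abs(A[k:, k]))
              A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
              for i in range(k + 1, n):
                  f = A[i, k] / A[k, k]
                  A[i, k:] -= f * A[k, k:]
                  b[i] -= f * b[k]
          x = np.zeros(n, dtype=complex)   # backsubstitution
          for i in range(n - 1, -1, -1):
              x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
          return x

      A = np.array([[2.0, 1.0], [1.0, 3.0 + 1.0j]])
      b = np.array([1.0, 2.0])
      print(solve_gauss(A, b), np.linalg.solve(A, b))   # should agree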

  14. Performance analysis of a parallel Monte Carlo code for simulating solar radiative transfer in cloudy atmospheres using CUDA-enabled NVIDIA GPU

    NASA Astrophysics Data System (ADS)

    Russkova, Tatiana V.

    2017-11-01

    One tool to improve the performance of Monte Carlo methods for numerical simulation of light transport in the Earth's atmosphere is parallel technology. A new algorithm oriented to parallel execution on a CUDA-enabled NVIDIA graphics processor is discussed. The efficiency of parallelization is analyzed on the basis of calculating the upward and downward fluxes of solar radiation in both vertically homogeneous and inhomogeneous models of the atmosphere. The results of testing the new code under various atmospheric conditions, including continuous single-layered and multilayered clouds and selective molecular absorption, are presented. The results of testing the code using video cards with different compute capabilities are analyzed. It is shown that the changeover of computing from conventional PCs to the architecture of graphics processors gives more than a hundredfold increase in performance and fully reveals the capabilities of the technology used.

  15. SphK1 inhibitor II (SKI-II) inhibits acute myelogenous leukemia cell growth in vitro and in vivo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Li; Weng, Wei; Sun, Zhi-Xin

    Previous studies have identified sphingosine kinase 1 (SphK1) as a potential drug target for treatment of acute myeloid leukemia (AML). In the current study, we investigated the potential anti-leukemic activity of a novel and specific SphK1 inhibitor, SKI-II. We demonstrated that SKI-II inhibited growth and survival of human AML cell lines (HL-60 and U937 cells). SKI-II was more efficient than two known SphK1 inhibitors, SK1-I and FTY720, in inhibiting AML cells. Meanwhile, it induced dramatic apoptosis in the above AML cells, and the cytotoxicity of SKI-II was almost reversed by the general caspase inhibitor z-VAD-fmk. SKI-II treatment inhibited SphK1 activation, and concomitantly increased the level of the sphingosine-1-phosphate (S1P) precursor ceramide in AML cells. Conversely, exogenously added S1P protected against SKI-II-induced cytotoxicity, while cell-permeable short-chain ceramide (C6) aggravated SKI-II's lethality against AML cells. Notably, SKI-II induced potent apoptotic death in primary human AML cells, but was generally safe to the human peripheral blood mononuclear cells (PBMCs) isolated from healthy donors. In vivo, SKI-II administration suppressed growth of U937 leukemic xenograft tumors in severe combined immunodeficient (SCID) mice. These results suggest that SKI-II might be further investigated as a promising anti-AML agent. - Highlights: • SKI-II inhibits proliferation and survival of primary and transformed AML cells. • SKI-II induces apoptotic death of AML cells, but is safe to normal PBMCs. • SKI-II is more efficient than two known SphK1 inhibitors in inhibiting AML cells. • SKI-II inhibits SphK1 activity, while increasing ceramide production in AML cells. • SKI-II dose-dependently inhibits U937 xenograft growth in SCID mice.

  16. Legacy Code Modernization

    NASA Technical Reports Server (NTRS)

    Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts. 3. Development of a code generator for performance prediction. 4. Automated partitioning. 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.

  17. A 3-D SPH model for simulating water flooding of a damaged floating structure

    NASA Astrophysics Data System (ADS)

    Guo, Kai; Sun, Peng-nan; Cao, Xue-yan; Huang, Xiao

    2017-10-01

    With the quasi-static analysis method, the terminal floating state of a damaged ship is usually evaluated for risk assessment. But this is not enough, since the ship may lose its stability during the transient flooding process. Therefore, an enhanced smoothed particle hydrodynamics (SPH) model is applied in this paper to investigate the response of a simplified cabin model under the condition of transient water flooding. The enhanced SPH model is presented first, including the governing equations, the diffusive terms and the boundary implementations, and then an algorithm regarding the coupled motions of six degrees of freedom (6-DOF) between the structure and the fluid is described. In the numerical results, a non-damaged cabin floating at rest is simulated. It is shown that a stable floating state can be reached and maintained by using the present SPH scheme. After that, three-dimensional (3-D) test cases of the damaged cabin with a hole at different locations are simulated. A series of model tests are also carried out for validation. Fairly good agreement is achieved between the numerical results and the experimental data. Relevant conclusions are drawn with respect to the mechanism of the responses of the damaged cabin model under water flooding conditions.

  18. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    DOE PAGES

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptops to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction, such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  19. An arbitrary boundary with ghost particles incorporated in coupled FEM-SPH model for FSI problems

    NASA Astrophysics Data System (ADS)

    Long, Ting; Hu, Dean; Wan, Detao; Zhuang, Chen; Yang, Gang

    2017-12-01

    It is important to treat the arbitrary boundary of Fluid-Structure Interaction (FSI) problems in computational mechanics. In order to ensure a complete support condition and restore first-order consistency near the boundary of the Smoothed Particle Hydrodynamics (SPH) method when coupling the Finite Element Method (FEM) with the SPH model, a new ghost particle method is proposed by dividing the interceptive area of the kernel support domain into subareas corresponding to boundary segments of the structure. The ghost particles are produced automatically for every fluid particle at each time step, and the properties of the ghost particles, such as density, mass and velocity, are defined by using the subareas to satisfy the boundary condition. In the coupled FEM-SPH model, the normal and shear forces from a boundary segment of the structure to a fluid particle are calculated through the corresponding ghost particles, and the opposite forces are exerted on the corresponding boundary segment; momentum is thus conserved in the present method, and there are no matching requirements between the size of the elements and the size of the particles. The performance of the present method is discussed and validated by several FSI problems with complex geometry boundaries and moving boundaries.

  20. OSIRIS - an object-oriented parallel 3D PIC code for modeling laser and particle beam-plasma interaction

    NASA Astrophysics Data System (ADS)

    Hemker, Roy

    1999-11-01

    Advances in computational speed now make it possible to do full 3D PIC simulations of laser-plasma and beam-plasma interactions, but at the same time the increased complexity of these problems makes it necessary to apply modern approaches like object-oriented programming to the development of simulation codes. We report here on our progress in developing an object-oriented parallel 3D PIC code using Fortran 90. In its current state the code contains algorithms for 1D, 2D, and 3D simulations in Cartesian coordinates and for 2D cylindrically-symmetric geometry. For all of these algorithms the code allows for a moving simulation window and arbitrary domain decomposition for any number of dimensions. Recent 3D simulation results on the propagation of intense laser and electron beams through plasmas will be presented.

  1. Fully Parallel MHD Stability Analysis Tool

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2015-11-01

    Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating the steps of the present MARS algorithm using parallel libraries and procedures. Results of the MARS parallelization and of the development of a new fixed-boundary equilibrium code adapted for MARS input will be reported. Work is supported by the U.S. DOE SBIR program.

  2. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Hood, Robert; Jost, Gabriele

    2001-01-01

    This viewgraph presentation provides information on support tools available for the automatic parallelization of computer programs. CAPTools, a support tool developed at the University of Greenwich, transforms, with user guidance, existing sequential Fortran code into parallel message passing code. Comparison routines are then run for debugging purposes, in essence ensuring that the code transformation was accurate.

  3. T-cell libraries allow simple parallel generation of multiple peptide-specific human T-cell clones.

    PubMed

    Theaker, Sarah M; Rius, Cristina; Greenshields-Watson, Alexander; Lloyd, Angharad; Trimby, Andrew; Fuller, Anna; Miles, John J; Cole, David K; Peakman, Mark; Sewell, Andrew K; Dolton, Garry

    2016-03-01

    Isolation of peptide-specific T-cell clones is highly desirable for determining the role of T-cells in human disease, as well as for the development of therapies and diagnostics. However, generation of monoclonal T-cells with the required specificity is challenging and time-consuming. Here we describe a library-based strategy for the simple parallel detection and isolation of multiple peptide-specific human T-cell clones from CD8(+) or CD4(+) polyclonal T-cell populations. T-cells were first amplified by CD3/CD28 microbeads in a 96U-well library format, prior to screening for desired peptide recognition. T-cells from peptide-reactive wells were then subjected to cytokine-mediated enrichment followed by single-cell cloning, with the entire process from sample to validated clone taking as little as 6 weeks. Overall, T-cell libraries represent an efficient and relatively rapid tool for the generation of peptide-specific T-cell clones, with applications shown here in infectious disease (Epstein-Barr virus, influenza A, and Ebola virus), autoimmunity (type 1 diabetes) and cancer. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  4. Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing

    NASA Technical Reports Server (NTRS)

    Navaz, Homayun K.

    2002-01-01

    -equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include the multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.

  5. Implementation of a flexible and scalable particle-in-cell method for massively parallel computations in the mantle convection code ASPECT

    NASA Astrophysics Data System (ADS)

    Gassmöller, Rene; Bangerth, Wolfgang

    2016-04-01

    Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. Finally, we show some relevant particle application cases, compare our implementation to a

  6. General Relativistic Smoothed Particle Hydrodynamics code developments: A progress report

    NASA Astrophysics Data System (ADS)

    Faber, Joshua; Silberman, Zachary; Rizzo, Monica

    2017-01-01

    We report on our progress in developing a new general relativistic Smoothed Particle Hydrodynamics (SPH) code, which will be appropriate for studying the properties of accretion disks around black holes as well as compact object binary mergers and their ejecta. We will discuss in turn the relativistic formalisms being used to handle the evolution, our techniques for dealing with conservative and primitive variables, as well as those used to ensure proper conservation of various physical quantities. Code tests and performance metrics will be discussed, as will the prospects for including smoothed particle hydrodynamics codes within other numerical relativity codebases, particularly the publicly available Einstein Toolkit. We acknowledge support from NSF award ACI-1550436 and an internal RIT D-RIG grant.

  7. Incremental Parallelization of Non-Data-Parallel Programs Using the Charon Message-Passing Library

    NASA Technical Reports Server (NTRS)

    VanderWijngaart, Rob F.

    2000-01-01

    Message passing is among the most popular techniques for parallelizing scientific programs on distributed-memory architectures. The reasons for its success are wide availability (MPI), efficiency, and full tuning control provided to the programmer. A major drawback, however, is that incremental parallelization, as offered by compiler directives, is not generally possible, because all data structures have to be changed throughout the program simultaneously. Charon remedies this situation through mappings between distributed and non-distributed data. It allows breaking up the parallelization into small steps, guaranteeing correctness at every stage. Several tools are available to help convert legacy codes into high-performance message-passing programs. They usually target data-parallel applications, whose loops carrying most of the work can be distributed among all processors without much dependency analysis. Others do a full dependency analysis and then convert the code virtually automatically. Even more toolkits are available that aid construction from scratch of message passing programs. None, however, allows piecemeal translation of codes with complex data dependencies (i.e. non-data-parallel programs) into message passing codes. The Charon library (available in both C and Fortran) provides incremental parallelization capabilities by linking legacy code arrays with distributed arrays. During the conversion process, non-distributed and distributed arrays exist side by side, and simple mapping functions allow the programmer to switch between the two in any location in the program. Charon also provides wrapper functions that leave the structure of the legacy code intact, but that allow execution on truly distributed data. Finally, the library provides a rich set of communication functions that support virtually all patterns of remote data demands in realistic structured grid scientific programs, including transposition, nearest-neighbor communication, pipelining

  8. Prescribed Velocity Gradients for Highly Viscous SPH Fluids with Vorticity Diffusion.

    PubMed

    Peer, Andreas; Teschner, Matthias

    2017-12-01

    Working with prescribed velocity gradients is a promising approach to efficiently and robustly simulate highly viscous SPH fluids. Such approaches allow shear rate, spin, and expansion rate to be processed explicitly and independently. This can be used, e.g., to avoid interference between pressure and viscosity solvers. Another interesting aspect is the possibility to explicitly process the vorticity, e.g., to preserve it. In this context, this paper proposes a novel variant of the prescribed-gradient idea that handles vorticity in a physically motivated way. In contrast to the less appropriate vorticity preservation that has been used in a previous approach, vorticity is diffused. The paper illustrates the utility of the vorticity diffusion. Therefore, comparisons of the proposed vorticity diffusion with vorticity preservation and additionally with vorticity damping are presented. The paper further discusses the relation between prescribed velocity gradients and prescribed velocity Laplacians, which improves the intuition behind the prescribed-gradient method for highly viscous SPH fluids. Finally, the paper discusses the relation of the proposed method to a physically correct implicit viscosity formulation.
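
    The decomposition these solvers manipulate is easy to demonstrate: a velocity gradient splits into an expansion (trace) part, a trace-free symmetric shear-rate part, and an antisymmetric spin part carrying the vorticity. A small Python sketch with arbitrary numbers follows; it shows the decomposition only, not the paper's solver.

      import numpy as np

      # Split a velocity gradient G_ij = dv_i/dx_j into expansion, shear
      # rate and spin, the three parts a prescribed-gradient method can
      # rescale or replace independently. Values are arbitrary.
      G = np.array([[0.2, 1.0, 0.0],
                    [-0.4, 0.1, 0.3],
                    [0.0, 0.2, 0.3]])

      expansion = np.trace(G) / 3.0 * np.eye(3)   # volumetric part
      shear = 0.5 * (G + G.T) - expansion         # symmetric, trace-free
      spin = 0.5 * (G - G.T)                      # vorticity-carrying part

      assert np.allclose(G, expansion + shear + spin)
      print("expansion rate:", np.trace(G))
      vorticity = np.array([G[2, 1] - G[1, 2],
                            G[0, 2] - G[2, 0],
                            G[1, 0] - G[0, 1]])
      print("vorticity vector:", vorticity)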

  9. M3Ag17(SPh)12 Nanoparticles and Their Structure Prediction.

    PubMed

    Wickramasinghe, Sameera; Atnagulov, Aydar; Conn, Brian E; Yoon, Bokwon; Barnett, Robert N; Griffith, Wendell P; Landman, Uzi; Bigioni, Terry P

    2015-09-16

    Although silver nanoparticles are of great fundamental and practical interest, only one structure has been determined thus far: M4Ag44(SPh)30, where M is a monocation, and SPh is an aromatic thiolate ligand. This is in part due to the fact that no other molecular silver nanoparticles have been synthesized with aromatic thiolate ligands. Here we report the synthesis of M3Ag17(4-tert-butylbenzene-thiol)12, which has good stability and an unusual optical spectrum. We also present a rational strategy for predicting the structure of this molecule. First-principles calculations support the structural model, predict a HOMO-LUMO energy gap of 1.77 eV, and predict a new "monomer mount" capping motif, Ag(SR)3, for Ag nanoparticles. The calculated optical absorption spectrum is in good correspondence with the measured spectrum. Heteroatom substitution was also used as a structural probe. First-principles calculations based on the structural model predicted a strong preference for a single Au atom substitution in agreement with experiment.

  10. VizieR Online Data Catalog: Mercury-T code (Bolmont+, 2015)

    NASA Astrophysics Data System (ADS)

    Bolmont, E.; Raymond, S. N.; Leconte, J.; Hersant, F.; Correia, A. C. M.

    2015-11-01

    The major addition provided in Mercury-T is the inclusion of tidal forces and torques, but we also added the effects of general relativity and rotation-induced deformation. We explain in the following sections how these effects were incorporated in the code. We also give the planet and star/BD/Jupiter parameters which are implemented in the code. The link to this code and the manual can also be found here: http://www.emelinebolmont.com/research-interests (2 data files).

  11. OpenSWPC: an open-source integrated parallel simulation code for modeling seismic wave propagation in 3D heterogeneous viscoelastic media

    NASA Astrophysics Data System (ADS)

    Maeda, Takuto; Takemura, Shunsuke; Furumura, Takashi

    2017-07-01

    We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method at local to regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer absorbing boundary condition. A hybrid-style programming using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and great portability, allowing excellent performance from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations, such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC), are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.

  12. Parallel filtering in global gyrokinetic simulations

    NASA Astrophysics Data System (ADS)

    Jolliet, S.; McMillan, B. F.; Villard, L.; Vernay, T.; Angelino, P.; Tran, T. M.; Brunner, S.; Bottino, A.; Idomura, Y.

    2012-02-01

    In this work, a Fourier solver [B.F. McMillan, S. Jolliet, A. Bottino, P. Angelino, T.M. Tran, L. Villard, Comp. Phys. Commun. 181 (2010) 715] is implemented in the global Eulerian gyrokinetic code GT5D [Y. Idomura, H. Urano, N. Aiba, S. Tokuda, Nucl. Fusion 49 (2009) 065029] and in the global Particle-In-Cell code ORB5 [S. Jolliet, A. Bottino, P. Angelino, R. Hatzky, T.M. Tran, B.F. McMillan, O. Sauter, K. Appert, Y. Idomura, L. Villard, Comp. Phys. Commun. 177 (2007) 409] in order to reduce the memory of the matrix associated with the field equation. This scheme is verified with linear and nonlinear simulations of turbulence. It is demonstrated that the straight-field-line angle is the coordinate that optimizes the Fourier solver, that both linear and nonlinear turbulent states are unaffected by the parallel filtering, and that the k∥ spectrum is independent of plasma size at fixed normalized poloidal wave number.
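
    The essence of such a parallel filter can be shown in a few lines: transform the field along the field-following coordinate, zero all modes above a cutoff, and transform back. The Python sketch below uses an arbitrary cutoff; in the codes above the retained mode range follows from the physics of the parallel dynamics.

      import numpy as np

      # Toy parallel Fourier filter: keep only the lowest |k| modes along
      # the coordinate that follows the magnetic field line.
      def parallel_filter(field, keep_modes):
          """Zero all Fourier modes above keep_modes along the last axis."""
          spec = np.fft.rfft(field, axis=-1)
          spec[..., keep_modes + 1:] = 0.0
          return np.fft.irfft(spec, n=field.shape[-1], axis=-1)

      theta = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
      field = np.cos(theta) + 0.3 * np.cos(17.0 * theta)   # low + high k content
      smooth = parallel_filter(field, keep_modes=4)
      print("residual high-k amplitude:", np.abs(smooth - np.cos(theta)).max())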

  13. CoCoNuT: General relativistic hydrodynamics code with dynamical space-time evolution

    NASA Astrophysics Data System (ADS)

    Dimmelmeier, Harald; Novak, Jérôme; Cerdá-Durán, Pablo

    2012-02-01

    CoCoNuT is a general relativistic hydrodynamics code with dynamical space-time evolution. The main aim of this numerical code is the study of several astrophysical scenarios in which general relativity can play an important role, namely the collapse of rapidly rotating stellar cores and the evolution of isolated neutron stars. The code has two flavors: CoCoA, the axisymmetric (2D) magnetized version, and CoCoNuT, the 3D non-magnetized version.

  14. RY-Coding and Non-Homogeneous Models Can Ameliorate the Maximum-Likelihood Inferences From Nucleotide Sequence Data with Parallel Compositional Heterogeneity.

    PubMed

    Ishikawa, Sohta A; Inagaki, Yuji; Hashimoto, Tetsuo

    2012-01-01

    In phylogenetic analyses of nucleotide sequences, 'homogeneous' substitution models, which assume the stationarity of base composition across a tree, are widely used, although individual sequences may bear distinctive base frequencies. In the worst-case scenario, a homogeneous model-based analysis can yield an artifactual union of two distantly related sequences that achieved similar base frequencies in parallel. Such potential difficulty can be countered by two approaches, 'RY-coding' and 'non-homogeneous' models. The former approach converts the four bases into purines and pyrimidines to normalize base frequencies across a tree, while the heterogeneity in base frequency is explicitly incorporated in the latter approach. The two approaches have been applied to real-world sequence data; however, their basic properties have not been fully examined by pioneering simulation studies. Here, we assessed the performance of maximum-likelihood analyses incorporating RY-coding and a non-homogeneous model (RY-coding and non-homogeneous analyses) on simulated data with parallel convergence to similar base composition. Both RY-coding and non-homogeneous analyses showed superior performance compared with homogeneous model-based analyses. Curiously, the performance of the RY-coding analysis appeared to be significantly affected by the setting of the substitution process for sequence simulation relative to that of the non-homogeneous analysis. The performance of the non-homogeneous analysis was also validated by analyzing a real-world sequence data set with significant base heterogeneity.
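
    RY-coding itself is a one-line transformation: each base is replaced by its purine/pyrimidine class before inference. A minimal Python sketch follows; the treatment of ambiguity codes (mapped to '?') is an assumption, not part of the paper.

      # Minimal RY-recoding sketch: collapse the four bases to purine (R)
      # and pyrimidine (Y) so base-composition differences between lineages
      # largely cancel before tree inference.
      RY = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}

      def ry_code(seq):
          """Recode a nucleotide sequence into the two-state R/Y alphabet."""
          return "".join(RY.get(base, "?") for base in seq.upper())

      print(ry_code("ATGGCCAAGT"))   # -> RYRRYYRRRY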

  15. Loss of immunological tolerance in Gimap5-deficient mice is associated with loss of Foxo in CD4+ T cells

    PubMed Central

    Aksoylar, H. Ibrahim; Lampe, Kristin; Barnes, Michael J.; Plas, David R.; Hoebe, Kasper

    2011-01-01

    Previously, we reported the abrogation of quiescence and reduced survival in lymphocytes from Gimap5sph/sph mice, an ENU germline mutant with a missense mutation in the GTPase of immunity-associated nucleotide binding protein 5 (Gimap5). These mice showed a progressive loss of peripheral lymphocyte populations and developed spontaneous colitis, resulting in early mortality. Here, we identify the molecular pathways that contribute to the onset of colitis in Gimap5sph/sph mice. We show that CD4+ T cells become Th1/Th17-polarized and are critically important for the development of colitis. Concomitantly, Treg cells become reduced in frequency in the peripheral tissues and their immune-suppressive capacity becomes impaired. Most importantly, these progressive changes in CD4+ T cells are associated with the loss of Foxo1, Foxo3 and Foxo4 expression. Our data establish a novel link between Gimap5 and Foxo expression and provide evidence for a regulatory mechanism that controls Foxo protein expression and may help maintain immunological tolerance. PMID:22106000

  16. First X-ray crystal structure and internal reference diffusion-ordered NMR spectroscopy study of the prototypical Posner reagent, MeCu(SPh)Li(THF)3.

    PubMed

    Bertz, Steven H; Hardin, Richard A; Heavey, Thomas J; Jones, Daniel S; Monroe, T Blake; Murphy, Michael D; Ogle, Craig A; Whaley, Tara N

    2013-07-29

    Grow slow: The usual direct treatment of MeLi and CuSPh did not yield X-ray quality crystals of MeCu(SPh)Li. An indirect method starting from Me2CuLi⋅LiSPh and chalcone afforded the desired crystals by the slow reaction of the intermediate π-complex (see scheme). This strategy produced the first X-ray crystal structure of a Posner cuprate. A complementary NMR study showed that the contact ion pair was also the main species in solution. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Force user's manual: A portable, parallel FORTRAN

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

    1990-01-01

    The use of Force, a parallel, portable FORTRAN, on shared-memory parallel computers is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray Y-MP, Convex 220, Flex/32, Encore, Sequent, and Alliant computers on which it is installed.

  18. Efficient Helicopter Aerodynamic and Aeroacoustic Predictions on Parallel Computers

    NASA Technical Reports Server (NTRS)

    Wissink, Andrew M.; Lyrintzis, Anastasios S.; Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak

    1996-01-01

    This paper presents parallel implementations of two codes used in a combined CFD/Kirchhoff methodology to predict the aerodynamic and aeroacoustic properties of helicopters. The rotorcraft Navier-Stokes code, TURNS, computes the aerodynamic flowfield near the helicopter blades, and the Kirchhoff acoustics code computes the noise in the far field, using the TURNS solution as input. The overall parallel strategy adds MPI message-passing calls to the existing serial codes to allow for communication between processors. As a result, the total code modifications required for parallel execution are relatively small. The biggest bottleneck in running the TURNS code in parallel comes from the LU-SGS algorithm that solves the implicit system of equations. We use a new hybrid domain decomposition implementation of LU-SGS to obtain good parallel performance on the SP-2. TURNS demonstrates excellent parallel speedups for quasi-steady and unsteady three-dimensional calculations of a helicopter blade in forward flight. The execution rate attained by the code on 114 processors is six times faster than the same cases run on one processor of the Cray C-90. The parallel Kirchhoff code also shows excellent parallel speedups and fast execution rates. As a performance demonstration, unsteady acoustic pressures are computed at 1886 far-field observer locations for a sample acoustics problem. The calculation requires over two hundred hours of CPU time on one C-90 processor but takes only a few hours on 80 processors of the SP-2. The resultant far-field acoustic field is analyzed with state-of-the-art audio and video rendering of the propagating acoustic signals.
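
    The strategy of adding message-passing calls to an existing serial solver amounts to exchanging ghost ("halo") data between the subdomains owned by each processor. A hypothetical one-dimensional halo exchange in Python/mpi4py is sketched below (the codes described above are Fortran/MPI; all names here are illustrative):

    ```python
    # Hypothetical 1-D halo exchange illustrating the "add MPI calls to a
    # serial code" strategy (not the actual TURNS source).
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    local = np.full(8, float(rank))    # each rank owns 8 cells of the domain
    padded = np.empty(local.size + 2)  # plus one ghost cell on each side
    padded[1:-1] = local

    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    # Everyone sends its first cell left and receives its right ghost, then
    # the reverse; the filled ghosts feed the stencil of the implicit solver.
    comm.Sendrecv(sendbuf=local[:1], dest=left, recvbuf=padded[-1:], source=right)
    comm.Sendrecv(sendbuf=local[-1:], dest=right, recvbuf=padded[:1], source=left)
    ```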

  19. gpuSPHASE-A shared memory caching implementation for 2D SPH using CUDA

    NASA Astrophysics Data System (ADS)

    Winkler, Daniel; Meister, Michael; Rezavand, Massoud; Rauch, Wolfgang

    2017-04-01

    Smoothed particle hydrodynamics (SPH) is a meshless Lagrangian method that has been successfully applied to computational fluid dynamics (CFD), solid mechanics and many other multi-physics problems. Using the method to solve transport phenomena in process engineering requires the simulation of several days to weeks of physical time. Given the high computational demand of CFD, such simulations in 3D would require years of computation time, so a reduction to a 2D domain is inevitable. In this paper gpuSPHASE, a new open-source 2D SPH solver implementation for graphics devices, is developed. It is optimized for simulations that must be executed at thousands of frames per second to be computed in reasonable time. A novel caching algorithm for Compute Unified Device Architecture (CUDA) shared memory is proposed and implemented. The software is validated and its performance is evaluated for the well-established dam-break test case.

  20. Au99(SPh)42 nanomolecules: aromatic thiolate ligand induced conversion of Au144(SCH2CH2Ph)60.

    PubMed

    Nimmala, Praneeth Reddy; Dass, Amala

    2014-12-10

    A new aromatic thiolate protected gold nanomolecule Au99(SPh)42 has been synthesized by reacting the highly stable Au144(SCH2CH2Ph)60 with thiophenol, HSPh. The ubiquitous Au144(SR)60 is known for its high stability even at elevated temperature and in the presence of excess thiol. This report demonstrates for the first time the reactivity of Au144(SCH2CH2Ph)60 with thiophenol to form a different 99-Au-atom species. The resulting Au99(SPh)42 compound, however, is unreactive and highly stable in the presence of excess aromatic thiol. The molecular formula of the title compound is determined by high resolution electrospray mass spectrometry (ESI-MS) and confirmed by the preparation of the 99-atom nanomolecule using two ligands, namely, Au99(SPh)42 and Au99(SPh-OMe)42. This mass spectrometry study is an unprecedented advance in nanoparticle reaction monitoring, in studying the 144-atom to 99-atom size evolution at such high m/z (∼12k) and resolution. The optical and electrochemical properties of Au99(SPh)42 are reported. Other substituents on the phenyl group, HS-Ph-X, where X = -F, -CH3, -OCH3, also show the Au144 to Au99 core size conversion, suggesting minimal electronic effects for these substituents. Control experiments were conducted by reacting Au144(SCH2CH2Ph)60 with HS-(CH2)n-Ph (where n = 1 and 2), and with bulky ligands like adamantanethiol and cyclohexanethiol. It was observed that conversion of Au144 to Au99 occurs only when the phenyl group is directly attached to the thiol, suggesting that the formation of a 99-atom species is largely influenced by the aromaticity of the ligand and less so by the bulkiness of the ligand.

  1. Isolation, gene cloning and expression profile of a pathogen recognition protein: a serine proteinase homolog (Sp-SPH) involved in the antibacterial response in the crab Scylla paramamosain.

    PubMed

    Liu, Hai-peng; Chen, Rong-yuan; Zhang, Min; Wang, Ke-jian

    2010-07-01

    To identify the frontline defense molecules against microbial infection in the crab Scylla paramamosain, a live crab-pathogenic microbe, Vibrio parahaemolyticus, was used as an affinity matrix to isolate innate immune factors from crab hemocyte lysate. Interestingly, a serine proteinase homolog (Sp-SPH) was obtained together with an antimicrobial peptide, the anti-lipopolysaccharide factor (Sp-ALF). We then determined the full-length cDNA sequence of Sp-SPH, which contained 1298 bp with an open reading frame of 1107 bp encoding 369 amino acid residues. Multiple alignment analysis showed that the deduced amino acid sequence of Sp-SPH shared high overall identity (83.8%) with those of SPH-containing proteins from other crab species. Tissue distribution analysis indicated that Sp-SPH transcripts were present in various tissues including eye stalk, subcuticular epidermis, gill, hemocytes, stomach, thorax ganglion, brain and muscle of S. paramamosain. Sp-SPH was also highly expressed at different developmental stages, including embryo (I, II, III and V), zoea (I), megalopa, and juvenile. Importantly, prophenoloxidase was also present in embryos, zoeae, juveniles and adult crabs, but at relatively lower levels in megalopa than at other stages. Furthermore, Sp-SPH mRNA expression showed a statistically significant increase (P<0.05) in both hemocytes and subcuticular epidermis at 24 h, and in gill at 96 h, after challenge with V. parahaemolyticus, as determined by quantitative real-time PCR. Taken together, the live-bacterial-binding activity and the acute-phase response against bacterial infection suggest that Sp-SPH may function as an innate immune recognition molecule and play a key role in host defense against microbial invasion in the crab S. paramamosain. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  2. Modeling Strain Rate Effect of Heterogeneous Materials Using SPH Method

    NASA Astrophysics Data System (ADS)

    Ma, G. W.; Wang, X. J.; Li, Q. M.

    2010-11-01

    The strain rate effect on the dynamic compressive failure of heterogeneous materials is studied using the smoothed particle hydrodynamics (SPH) method. The SPH method employs a rate-insensitive elasto-plastic damage model incorporating a Weibull distribution law to reflect the mechanical behavior of heterogeneous rock-like materials. A series of simulations is performed for heterogeneous specimens by applying axial velocity conditions, which induce different strain-rate loadings on the specimen. The detailed failure process of the specimens, in terms of microscopic crack activities and the macro-mechanical response, is discussed. Failure mechanisms in the low and high strain-rate cases are compared. The results show that the strain-rate effects on rock strength are mainly caused by the changing internal pressure due to inertial effects as well as material heterogeneity. They also demonstrate that the inertial effect becomes significant only when the induced strain rate exceeds a threshold, below which the dynamic strength enhancement can be attributed to the heterogeneity of the material. The dynamic strength is affected more significantly for a relatively more heterogeneous specimen, which agrees with experimental results showing that poorer-quality specimens exhibit a relatively larger increase in dynamic strength.
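
    The Weibull-based heterogeneity can be sketched in a few lines (my illustration with placeholder parameter values, not the authors' implementation): each particle receives a strength drawn from a Weibull law whose modulus controls the degree of heterogeneity.

    ```python
    # Illustrative only: assign each SPH particle a strength drawn from a
    # Weibull distribution, a standard way to model rock-like heterogeneity.
    import numpy as np

    rng = np.random.default_rng(seed=0)
    n_particles = 10_000
    shape_m = 3.0          # Weibull modulus: smaller m -> more heterogeneous
    mean_strength = 120.0  # nominal compressive strength (placeholder, MPa)

    # numpy's weibull() draws with unit scale; rescale to the desired mean.
    raw = rng.weibull(shape_m, size=n_particles)
    strength = mean_strength * raw / raw.mean()
    ```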

  3. Parallel Event Analysis Under Unix

    NASA Astrophysics Data System (ADS)

    Looney, S.; Nilsson, B. S.; Oest, T.; Pettersson, T.; Ranjard, F.; Thibonnier, J.-P.

    The ALEPH experiment at LEP, the CERN CN division and Digital Equipment Corp. have, in a joint project, developed a parallel event analysis system. The parallel physics code is identical to ALEPH's standard analysis code, ALPHA; only the organisation of input/output is changed. The user may switch between sequential and parallel processing by simply changing one input "card". The initial implementation runs on an 8-node DEC 3000/400 farm, using the PVM software, and exhibits a near-perfect speed-up linearity, reducing the turn-around time by a factor of 8.

  4. PARAMESH: A Parallel Adaptive Mesh Refinement Community Toolkit

    NASA Technical Reports Server (NTRS)

    MacNeice, Peter; Olson, Kevin M.; Mobarry, Clark; deFainchtein, Rosalinda; Packer, Charles

    1999-01-01

    In this paper, we describe a community toolkit which is designed to provide parallel support with adaptive mesh capability for a large and important class of computational models, those using structured, logically Cartesian meshes. The package of Fortran 90 subroutines, called PARAMESH, is designed to provide an application developer with an easy route to extend an existing serial code which uses a logically Cartesian structured mesh into a parallel code with adaptive mesh refinement. Alternatively, in its simplest use, and with minimal effort, it can operate as a domain decomposition tool for users who want to parallelize their serial codes, but who do not wish to use adaptivity. The package can provide them with an incremental evolutionary path for their code, converting it first to uniformly refined parallel code, and then later if they so desire, adding adaptivity.

  5. Evolution of Occupant Survivability Simulation Framework Using FEM-SPH Coupling

    DTIC Science & Technology

    2011-08-01

    Fragmentary excerpt: the analysis works outward in the chain (soldier, seats, ...); approaches include a reduced degree of freedom (DOF) system, evaluation of occupant seating independent of the vehicle environment, or using a substructure approach to ... the ejected material covering the charge imposes most of the loading onto the structure above. The properties of the material in the "soil cap" are ...

  6. SPH-based numerical simulations of flow slides in municipal solid waste landfills.

    PubMed

    Huang, Yu; Dai, Zili; Zhang, Weijie; Huang, Maosong

    2013-03-01

    Most municipal solid waste (MSW) is disposed of in landfills. Over the past few decades, catastrophic flow slides have occurred in MSW landfills around the world, causing substantial economic damage and occasionally resulting in loss of human life. It is therefore important to predict the run-out, velocity and depth of such slides in order to provide adequate mitigation and protection measures. To overcome the limitations of traditional numerical methods for modelling flow slides, a mesh-free particle method known as smoothed particle hydrodynamics (SPH) is introduced in this paper. The Navier-Stokes equations were adopted as the governing equations, and a Bingham model was used to analyse the relationship between material stress rates and particle motion velocities. The accuracy of the model is assessed using a series of verifications, and flow slides that occurred in landfills located in Sarajevo and Bandung are then simulated to extend its applications. The simulated results match the field data well and highlight the capability of the proposed SPH modelling method to simulate such complex phenomena as flow slides in MSW landfills.
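
    As an illustration of the rheological closure, a regularized Bingham law of the kind commonly used in such SPH models can be sketched as follows (a Papanastasiou-type regularization is assumed here; the paper's exact cutoff may differ):

    ```python
    # Regularized Bingham rheology: below the yield stress the effective
    # viscosity grows toward mu + tau_y * m_reg, so the quasi-solid region
    # stays numerically tractable (sketch, not the paper's code).
    import numpy as np

    def bingham_viscosity(shear_rate, tau_y, mu_plastic, m_reg=1000.0):
        shear_rate = np.maximum(shear_rate, 1e-12)  # avoid division by zero
        return mu_plastic + tau_y * (1.0 - np.exp(-m_reg * shear_rate)) / shear_rate

    print(bingham_viscosity(np.array([1e-4, 1e-2, 1.0]), tau_y=50.0, mu_plastic=0.5))
    ```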

  7. A SPH elastic-viscoplastic model for granular flows and bed-load transport

    NASA Astrophysics Data System (ADS)

    Ghaïtanellis, Alex; Violeau, Damien; Ferrand, Martin; Abderrezzak, Kamal El Kadi; Leroy, Agnès; Joly, Antoine

    2018-01-01

    An elastic-viscoplastic model (Ulrich, 2013) is combined with a multi-phase SPH formulation (Hu and Adams, 2006; Ghaitanellis et al., 2015) to model granular flows and non-cohesive sediment transport. The soil is treated as a continuum exhibiting viscoplastic behaviour. Below a critical shear stress (i.e. the yield stress), the soil is assumed to behave as an isotropic linear-elastic solid; when the yield stress is exceeded, the soil flows and behaves as a shear-thinning fluid. A liquid-solid transition threshold based on the granular material properties is proposed, so as to make the model free of numerical parameters. The yield stress is obtained from the Drucker-Prager criterion, which requires an accurate computation of the effective stress in the soil. A novel method is proposed to compute the effective stress in SPH by solving a Laplace equation. The model is applied to a two-dimensional soil collapse (Bui et al., 2008) and a dam break over mobile beds (Spinewine and Zech, 2007). Results are compared with experimental data and good agreement is obtained.
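
    A minimal sketch of the Drucker-Prager yield check referenced above, using the plane-strain matching constants common in SPH soil modelling (parameter names are mine; the paper's exact formulation may differ):

    ```python
    # Drucker-Prager yield criterion: plastic flow where sqrt(J2) exceeds
    # alpha_phi * p + k_c (illustrative sketch).
    import numpy as np

    def drucker_prager_is_plastic(p, sqrt_j2, cohesion, phi):
        """p: mean effective pressure; sqrt_j2: sqrt of the second deviatoric
        stress invariant; phi: internal friction angle in radians."""
        t = np.tan(phi)
        alpha_phi = t / np.sqrt(9.0 + 12.0 * t**2)
        k_c = 3.0 * cohesion / np.sqrt(9.0 + 12.0 * t**2)
        return sqrt_j2 > alpha_phi * p + k_c
    ```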

  8. Spin wave based parallel logic operations for binary data coded with domain walls

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Urazuka, Y.; Oyabu, S.; Chen, H.

    2014-05-07

    We numerically investigate the feasibility of spin wave (SW) based parallel logic operations, where the phase of an SW packet (SWP) is exploited as a state variable and the phase shift caused by the interaction with a domain wall (DW) is utilized as a logic inversion functionality. The designed functional element consists of parallel ferromagnetic nanowires (6 nm thick, 36 nm wide, 5120 nm long, with 200 nm separation) with perpendicular magnetization and sub-μm scale overlaid conductors. The logic outputs for binary data, coded with the existence (“1”) or absence (“0”) of the DW, are inductively read out from the interferometric aspect of the superposed SWPs, one of them propagating through the stored-data area. A practical exclusive-or operation, based on 2π periodicity in the phase logic, is demonstrated for an individual nanowire, with an order-of-magnitude difference in output voltage V_out depending on the logic output for the stored data. The inductive output from the two nanowires exhibits three well-defined signal levels, corresponding to the information distance (Hamming distance) between the 2-bit data stored in the multiple nanowires.

  9. MPI parallelization of Vlasov codes for the simulation of nonlinear laser-plasma interactions

    NASA Astrophysics Data System (ADS)

    Savchenko, V.; Won, K.; Afeyan, B.; Decyk, V.; Albrecht-Marc, M.; Ghizzo, A.; Bertrand, P.

    2003-10-01

    The simulation of optical mixing driven KEEN waves [1] and electron plasma waves [1] in laser-produced plasmas requires nonlinear kinetic models and massive parallelization. We use Message Passing Interface (MPI) libraries and Appleseed [2] to solve the Vlasov-Poisson system of equations on an 8-node dual-processor Mac G4 cluster. We use the semi-Lagrangian time splitting method [3]. It requires only row-column exchanges in the global data redistribution, minimizing the total number of communications between processors. Recurrent communication patterns for 2D FFTs involve global transposition. In the Vlasov-Maxwell case, we use splitting into two 1D spatial advections and a 2D momentum advection [4]. Discretized momentum advection equations have a double-loop structure, with the outer index assigned to different processors. We adhere to a code structure with separate routines for calculations and data management for parallel computations. [1] B. Afeyan et al., IFSA 2003 Conference Proceedings, Monterey, CA [2] V. K. Decyk, Computers in Physics, 7, 418 (1993) [3] Sonnendrucker et al., JCP 149, 201 (1998) [4] Begue et al., JCP 151, 458 (1999)
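
    The semi-Lagrangian step at the heart of the time-splitting scheme can be illustrated with a toy 1-D advection in NumPy (a schematic analogue only; the actual code advects the full Vlasov distribution and distributes the loops with MPI):

    ```python
    # Toy semi-Lagrangian advection: trace characteristics back by v*dt and
    # interpolate the profile at the departure points (periodic grid).
    import numpy as np

    def advect(f, velocity, dt, dx):
        n = f.size
        x = np.arange(n) * dx
        departure = (x - velocity * dt) % (n * dx)  # periodic foot points
        return np.interp(departure, x, f, period=n * dx)

    f = np.exp(-((np.arange(128) - 40.0) ** 2) / 50.0)  # Gaussian profile
    f_new = advect(f, velocity=1.0, dt=0.1, dx=0.5)
    ```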

  10. Parallel community climate model: Description and user's guide

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Drake, J.B.; Flanery, R.E.; Semeraro, B.D.

    This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.

  11. Dynamic particle refinement in SPH: application to free surface flow and non-cohesive soil simulations

    NASA Astrophysics Data System (ADS)

    Reyes López, Yaidel; Roose, Dirk; Recarey Morfa, Carlos

    2013-05-01

    In this paper, we present a dynamic refinement algorithm for the smoothed particle hydrodynamics (SPH) method. An SPH particle is refined by replacing it with smaller daughter particles, whose positions are calculated using a square pattern centered at the position of the refined particle. We determine both the optimal separation and the smoothing distance of the new particles such that the error produced by the refinement in the gradient of the kernel is small and possible numerical instabilities are reduced. We implemented the dynamic refinement procedure in two different models: one for free-surface flows, and one for post-failure flow of non-cohesive soil. The results obtained for the test problems indicate that the dynamic refinement procedure provides a good trade-off between the accuracy and the cost of the simulations.
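
    The square-pattern splitting can be sketched in a few lines (an illustration with placeholder separation and smoothing-length ratios, not the authors' optimized values):

    ```python
    # Replace a parent SPH particle by four daughters on a square pattern
    # centered at the parent; mass is conserved exactly.
    import numpy as np

    def refine(pos, mass, h, eps=0.4, h_ratio=0.6):
        """pos: (2,) parent position; eps and h_ratio are placeholders for
        the optimized separation and smoothing-length ratios."""
        offsets = eps * h * np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
        daughters = pos + offsets           # (4, 2) daughter positions
        return daughters, mass / 4.0, h_ratio * h
    ```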

  12. Breakdown of Spatial Parallel Coding in Children's Drawing

    ERIC Educational Resources Information Center

    De Bruyn, Bart; Davis, Alyson

    2005-01-01

    When drawing real scenes or copying simple geometric figures young children are highly sensitive to parallel cues and use them effectively. However, this sensitivity can break down in surprisingly simple tasks such as copying a single line where robust directional errors occur despite the presence of parallel cues. Before we can conclude that this…

  13. Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers

    NASA Astrophysics Data System (ADS)

    Ogino, T.

    High Performance Fortran (HPF) is one of the modern and common techniques for achieving high-performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code had been fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code were shown to be almost comparable to those of the VPP Fortran version. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops, with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation relative to catalog values. We conclude that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.

  14. Multitasking TORT under UNICOS: Parallel performance models and measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnett, A.; Azmy, Y.Y.

    1999-09-27

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The predictions of the parallel performance models were compared with measurements from applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.

  15. Optical properties of trinuclear metal chalcogenolate complexes - room temperature NIR fluorescence in [Cu2Ti(SPh)6(PPh3)2].

    PubMed

    Kühn, Michael; Lebedkin, Sergei; Weigend, Florian; Eichhöfer, Andreas

    2017-01-31

    The optical properties of four isostructural trinuclear chalcogenolato-bridged metal complexes [Cu2Sn(SPh)6(PPh3)2], [Cu2Sn(SePh)6(PPh3)2], [Ag2Sn(SPh)6(PPh3)2] and [Cu2Ti(SPh)6(PPh3)2] have been investigated by absorption and photoluminescence spectroscopy and time-dependent density functional theory (TDDFT) calculations. All copper-tin compounds demonstrate near-infrared (NIR) phosphorescence at ∼900-1100 nm in the solid state at low temperature, which is nearly absent at ambient temperature. The Stokes shifts of these emissions are unusually large, with values of about 1.5 eV. The copper-titanium complex [Cu2Ti(SPh)6(PPh3)2] also shows luminescence in the NIR at 1090 nm, but with a much faster decay (τ ∼ 10 ns at 150 K) and a much smaller Stokes shift (ca. 0.3 eV). Even at 295 K this fluorescence has a quantum yield as high as 9.5%. The experimental electronic absorption spectra correspond well to the spectra simulated from the calculated singlet transitions. In line with the large Stokes shifts of the emission spectra, the calculations reveal strong structural relaxation of the excited triplet states for the copper-tin complexes, whereas these effects are much smaller in the case of the copper-titanium complex.

  16. A practical approach to portability and performance problems on massively parallel supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beazley, D.M.; Lomdahl, P.S.

    1994-12-08

    We present an overview of the tactics we have used to achieve a high level of performance while improving portability for the large-scale molecular dynamics code SPaSM. SPaSM was originally implemented in ANSI C with message passing for the Connection Machine 5 (CM-5). In 1993, SPaSM was selected as one of the winners in the IEEE Gordon Bell Prize competition for sustaining 50 Gflops on the 1024-node CM-5 at Los Alamos National Laboratory. Achieving this performance on the CM-5 required rewriting critical sections of code in CDPEAC assembler language. In addition, the code made extensive use of CM-5 parallel I/O and the CMMD message passing library. Given this highly specialized implementation, we describe how we have ported the code to the Cray T3D and high performance workstations. In addition, we describe how it has been possible to do this using a single version of source code that runs on all three platforms without sacrificing any performance. Sound too good to be true? We hope to demonstrate that one can realize both code performance and portability without relying on the latest and greatest prepackaged tool or parallelizing compiler.

  17. SANTA BARBARA CLUSTER COMPARISON TEST WITH DISPH

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saitoh, Takayuki R.; Makino, Junichiro, E-mail: saitoh@elsi.jp

    2016-06-01

    The Santa Barbara cluster comparison project revealed that there is a systematic difference between entropy profiles of clusters of galaxies obtained by Eulerian mesh and Lagrangian smoothed particle hydrodynamics (SPH) codes: mesh codes gave a core with a constant entropy, whereas SPH codes did not. One possible reason for this difference is that mesh codes are not Galilean invariant. Another possible reason is the problem of the SPH method, which might give too much “protection” to cold clumps because of the unphysical surface tension induced at contact discontinuities. In this paper, we apply the density-independent formulation of SPH (DISPH), which can handle contact discontinuities accurately, to simulations of a cluster of galaxies and compare the results with those with the standard SPH. We obtained the entropy core when we adopt DISPH. The size of the core is, however, significantly smaller than those obtained with mesh simulations and is comparable to those obtained with quasi-Lagrangian schemes such as “moving mesh” and “mesh free” schemes. We conclude that both the standard SPH without artificial conductivity and Eulerian mesh codes have serious problems even with such an idealized simulation, while DISPH, SPH with artificial conductivity, and quasi-Lagrangian schemes have sufficient capability to deal with it.

  18. Spheno-orbital meningioma en plaque: a case report and review of the literature

    PubMed Central

    Abdellaoui, Meriem; Andaloussi, Idriss Benatiya; Tahri, Hicham

    2015-01-01

    Intraosseous meningioma is a variety of ectopic meningioma in which meningothelial cells invade the bone wall and cause hyperostosis. Meningioma en plaque, a macroscopic variant of intraosseous meningioma, is a rare tumor that frequently arises in the spheno-orbital region, where it can be confused with primary bone tumors. We report the case of a 50-year-old woman who presented with exophthalmos and unilateral left blindness of progressive onset over one year. Examination found axial, painless, non-reducible exophthalmos with limitation of ocular motility in all directions of gaze. Palpation revealed a hard left temporal mass adherent to the bone. Fundus examination showed left papilledema. CT imaging showed an osteocondensing left temporo-spheno-orbital lesion with locoregional invasion. The preoperative diagnosis was a bone tumor, most likely primary malignant or secondary. Histological study revealed a meningothelial meningioma of the en plaque type. The patient underwent resection with surgical reconstruction. No recurrence was noted after 1 year of follow-up. PMID:26327996

  19. Comparison of ALE and SPH Simulations of Vertical Drop Tests of a Composite Fuselage Section into Water

    NASA Technical Reports Server (NTRS)

    Jackson, Karen E.; Fuchs, Yvonne T.

    2008-01-01

    Simulation of multi-terrain impact has been identified as an important research area for improved prediction of rotorcraft crashworthiness within the NASA Subsonic Rotary Wing Aeronautics Program on Rotorcraft Crashworthiness. As part of this effort, two vertical drop tests were conducted of a 5-ft-diameter composite fuselage section into water. For the first test, the fuselage section was impacted in a baseline configuration without energy absorbers. For the second test, the fuselage section was retrofitted with a composite honeycomb energy absorber. Both tests were conducted at a nominal velocity of 25 ft/s. A detailed finite element model was developed to represent each test article, and water impact was simulated using both Arbitrary Lagrangian Eulerian (ALE) and Smooth Particle Hydrodynamics (SPH) approaches in LS-DYNA, a nonlinear, explicit transient dynamic finite element code. Analytical predictions were correlated with experimental data for both test configurations. In addition, studies were performed to evaluate the influence of mesh density on test-analysis correlation.

  20. SPH investigation of the thermal effects on the fluid mixing in a microchannel with rotating stirrers

    NASA Astrophysics Data System (ADS)

    Shamsoddini, Rahim

    2018-04-01

    An incompressible smoothed particle hydrodynamics algorithm is proposed to model and investigate the thermal effect on the mixing rate of an active micromixer in which rotating stirrers enhance the mixing rate. In liquids, mass diffusion increases with increasing temperature while viscosity decreases, so the local Schmidt number decreases considerably with increasing temperature. The present study investigates the effect of wall temperature on the mixing rate with an improved SPH method. The robust SPH method used in the present work is equipped with a shifting algorithm and renormalization tensors. With this algorithm, the mass, momentum, energy, and concentration equations are solved. The results, discussed for different temperature ratios, show that the mixing rate increases significantly with increasing temperature ratio.

  1. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

    PubMed

    Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

    2016-11-01

    Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.

  2. Magneto-Structural Correlations in Pseudotetrahedral Forms of the [Co(SPh)4]2- Complex Probed by Magnetometry, MCD Spectroscopy, Advanced EPR Techniques, and ab Initio Electronic Structure Calculations.

    PubMed

    Suturina, Elizaveta A; Nehrkorn, Joscha; Zadrozny, Joseph M; Liu, Junjie; Atanasov, Mihail; Weyhermüller, Thomas; Maganas, Dimitrios; Hill, Stephen; Schnegg, Alexander; Bill, Eckhard; Long, Jeffrey R; Neese, Frank

    2017-03-06

    The magnetic properties of pseudotetrahedral Co(II) complexes spawned intense interest after (PPh4)2[Co(SPh)4] was shown to be the first mononuclear transition-metal complex displaying slow relaxation of the magnetization in the absence of a direct current magnetic field. However, there are differing reports on its fundamental magnetic spin Hamiltonian (SH) parameters, which arise from inherent experimental challenges in detecting large zero-field splittings. There are also remarkable changes in the SH parameters of [Co(SPh)4]2- upon structural variations, depending on the counterion and crystallization conditions. In this work, four complementary experimental techniques are utilized to unambiguously determine the SH parameters for two different salts of [Co(SPh)4]2-: (PPh4)2[Co(SPh)4] (1) and (NEt4)2[Co(SPh)4] (2). The characterization methods employed include multifield SQUID magnetometry, high-field/high-frequency electron paramagnetic resonance (HF-EPR), variable-field variable-temperature magnetic circular dichroism (VTVH-MCD), and frequency domain Fourier transform THz-EPR (FD-FT THz-EPR). Notably, the paramagnetic Co(II) complex [Co(SPh)4]2- shows strong axial magnetic anisotropy in 1, with D = -55(1) cm^-1 and E/D = 0.00(3), but rhombic anisotropy is seen for 2, with D = +11(1) cm^-1 and E/D = 0.18(3). Multireference ab initio CASSCF/NEVPT2 calculations enable interpretation of the remarkable variation of D and its dependence on the electronic structure and geometry.
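
    For reference, the D and E values quoted above parameterize the standard zero-field-splitting spin Hamiltonian for an S = 3/2 ion (a textbook expression, not specific to this paper):

    ```latex
    \hat{H} = D\left[\hat{S}_z^{2} - \tfrac{1}{3}S(S+1)\right]
            + E\left(\hat{S}_x^{2} - \hat{S}_y^{2}\right)
            + \mu_{\mathrm{B}}\,\vec{B}\cdot\mathbf{g}\cdot\hat{\vec{S}}
    ```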

  3. SPH modeling of fluid-solid interaction for dynamic failure analysis of fluid-filled thin shells

    NASA Astrophysics Data System (ADS)

    Caleyron, F.; Combescure, A.; Faucher, V.; Potapov, S.

    2013-05-01

    This work concerns the prediction of failure of a fluid-filled tank under impact loading, including the resulting fluid leakage. A water-filled steel cylinder associated with a piston is impacted by a mass falling at a prescribed velocity. The cylinder is closed at its base by an aluminum plate whose characteristics are allowed to vary. The impact on the piston creates a pressure wave in the fluid which is responsible for the deformation of the plate and, possibly, the propagation of cracks. The structural part of the problem is modeled using Mindlin-Reissner finite elements (FE) and Smoothed Particle Hydrodynamics (SPH) shells. The modeling of the fluid is also based on an SPH formulation. The problem involves significant fluid-structure interactions (FSI) which are handled through a master-slave-based method and the pinballs method. Numerical results are compared to experimental data.

  4. Utilizing GPUs to Accelerate Turbomachinery CFD Codes

    NASA Technical Reports Server (NTRS)

    MacCalla, Weylin; Kulkarni, Sameer

    2016-01-01

    GPU computing has established itself as a way to accelerate parallel codes in the high performance computing world. This work focuses on speeding up APNASA, a legacy CFD code used at NASA Glenn Research Center, while also drawing conclusions about the nature of GPU computing and the requirements to make GPGPU worthwhile on legacy codes. Rewriting and restructuring of the source code was avoided to limit the introduction of new bugs. The code was profiled and investigated for parallelization potential, then OpenACC directives were used to indicate parallel parts of the code. The use of OpenACC directives was not able to reduce the runtime of APNASA on either the NVIDIA Tesla discrete graphics card or the AMD accelerated processing unit. Additionally, it was found that in order to justify the use of GPGPU, the amount of parallel work being done within a kernel would have to greatly exceed the work being done by any one portion of the APNASA code. It was determined that in order for an application like APNASA to be accelerated on the GPU, it should not be modular in nature, and the parallel portions of the code must contain a large portion of the code's computation time.

  5. Benchmarking the SPHINX and CTH shock physics codes for three problems in ballistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilson, L.T.; Hertel, E.; Schwalbe, L.

    1998-02-01

    The CTH Eulerian hydrocode and the SPHINX smooth particle hydrodynamics (SPH) code were used to model a shock tube, two long-rod penetrations into semi-infinite steel targets, and a long-rod penetration into a spaced plate array. The results were then compared to experimental data. Both SPHINX and CTH modeled the one-dimensional shock tube problem well. Both codes did a reasonable job in modeling the outcome of the axisymmetric rod impact problem. Neither code correctly reproduced the depth of penetration in both experiments. In the 3-D problem, both codes reasonably replicated the penetration of the rod through the first plate. After this, however, the predictions of both codes began to diverge from the results seen in the experiment. In terms of computer resources, the run times are problem dependent and are discussed in the text.

  6. Rubus: A compiler for seamless and extensible parallelism.

    PubMed

    Adnan, Muhammad; Aslam, Faisal; Nawaz, Zubair; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special-purpose processing unit called the Graphics Processing Unit (GPU), originally designed for 2D/3D games, is now available for general-purpose use in computers and mobile devices. However, the traditional programming languages, which were designed to work with machines having single-core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that the programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, parallelizing legacy code can require rewriting a significant portion of it in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open-source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores, whereas for a matrix multiplication benchmark an average execution speedup of 84 times has been achieved.

  7. Rubus: A compiler for seamless and extensible parallelism

    PubMed Central

    Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special-purpose processing unit called the Graphics Processing Unit (GPU), originally designed for 2D/3D games, is now available for general-purpose use in computers and mobile devices. However, the traditional programming languages, which were designed to work with machines having single-core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that the programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, parallelizing legacy code can require rewriting a significant portion of it in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open-source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores, whereas for a matrix multiplication benchmark an average execution speedup of 84 times has been achieved.

  8. Investigating the Et-1/SphK/S1P Pathway as a Novel Approach for the Prevention of Inflammation-Induced Preterm Birth.

    PubMed

    Giusto, Kiersten; Ashby, Charles R

    2018-01-30

    Preterm birth (PTB), defined as birth before 37 completed weeks of gestation, occurs in up to 18 percent of births worldwide and accounts for the majority of perinatal morbidity and mortality. While the single most common cause of PTB has been identified as inflammation, safe and effective pharmacotherapy to prevent PTB has yet to be developed. Our group has used an in vivo model of inflammation-driven PTB, biochemical methods, pharmacological approaches, a novel endothelin receptor antagonist that we synthesized, and RNA knockdown to help establish the role of endothelin-1 (ET-1) in inflammation-associated PTB. Further, we have used our in vivo model to test whether sphingosine kinase, which acts downstream of ET-1, plays a role in PTB. We have shown that levels of endothelin converting enzyme-1 (ECE-1) and ET-1 are increased when PTB is induced in timed pregnant mice with lipopolysaccharide (LPS), and that blocking ET-1 action, pharmacologically or using ECE-1 RNA silencing, rescues LPS-induced mice from PTB. ET-1 activates the sphingosine kinase/sphingosine-1-phosphate (SphK/S1P) pathway. S1P, in turn, is an important signaling molecule in the pro-inflammatory response. Interestingly, we have shown that SphK inhibition also prevents LPS-induced PTB in timed pregnant mice. Further, we showed that SphK inhibition suppresses the ECE-1/ET-1 axis, implicating positive feedback regulation of the SphK/S1P/ECE-1/ET-1 axis. The ET-1/SphK/S1P pathway is a potential pharmacotherapeutic target for the prevention of PTB. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  9. A new way to generate cytolytic tumor-specific T cells: electroporation of RNA coding for a T cell receptor into T lymphocytes.

    PubMed

    Schaft, Niels; Dörrie, Jan; Müller, Ina; Beck, Verena; Baumann, Stefanie; Schunder, Tanja; Kämpgen, Eckhart; Schuler, Gerold

    2006-09-01

    Effective T cell receptor (TCR) transfer until now required stable retroviral transduction. However, retroviral transduction poses the threat of irreversible genetic manipulation of autologous cells. We, therefore, used optimized RNA transfection for transient manipulation. The transfection efficiency, using EGFP RNA, was >90%. The electroporation of primary T cells, isolated from blood, with TCR-coding RNA resulted in functional cytotoxic T lymphocytes (CTLs) (>60% killing at an effector to target ratio of 20:1) with the same HLA-A2/gp100-specificity as the parental CTL clone. The TCR-transfected T cells specifically recognized peptide-pulsed T2 cells, or dendritic cells electroporated with gp100-coding RNA, in an IFNgamma-secretion assay and retained this ability, even after cryopreservation, over 3 days. Most importantly, we show here for the first time that the electroporated T cells also displayed cytotoxicity, and specifically lysed peptide-loaded T2 cells and HLA-A2+/gp100+ melanoma cells over a period of at least 72 h. Peptide-titration studies showed that the lytic efficiency of the RNA-transfected T cells was similar to that of retrovirally transduced T cells, and approximated that of the parental CTL clone. Functional TCR transfer by RNA electroporation is now possible without the disadvantages of retroviral transduction, and forms a new strategy for the immunotherapy of cancer.

  10. Slow magnetic relaxation at zero field in the tetrahedral complex [Co(SPh)4]2-.

    PubMed

    Zadrozny, Joseph M; Long, Jeffrey R

    2011-12-28

    The Ph4P+ salt of the tetrahedral complex [Co(SPh)4]2-, possessing an S = 3/2 ground state with an axial zero-field splitting of D = -70 cm^-1, displays single-molecule magnet behavior in the absence of an applied magnetic field. At very low temperatures, ac magnetic susceptibility data show the magnetic relaxation time, τ, to be temperature-independent, while above 2.5 K thermally activated Arrhenius behavior is apparent with U_eff = 21(1) cm^-1 and τ_0 = 1.0(3) × 10^-7 s. Under an applied field of 1 kOe, τ more closely approximates Arrhenius behavior over the entire temperature range. Upon dilution of the complex within a matrix of the isomorphous compound (Ph4P)2[Zn(SPh)4], ac susceptibility data reveal the molecular nature of the slow magnetic relaxation and indicate that the quantum tunneling pathway observed at low temperatures is likely mediated by intermolecular dipolar interactions. © 2011 American Chemical Society
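
    The thermally activated regime corresponds to the Arrhenius law τ = τ0 exp(U_eff/kB T); a quick numerical check with the fitted parameters (my sketch, using kB ≈ 0.695 cm^-1 per kelvin):

    ```python
    # Arrhenius relaxation time from the fitted barrier and prefactor.
    import numpy as np

    KB_CM1_PER_K = 0.695  # Boltzmann constant in cm^-1 per kelvin

    def tau(T, u_eff=21.0, tau0=1.0e-7):
        """Relaxation time in seconds; u_eff in cm^-1, T in kelvin."""
        return tau0 * np.exp(u_eff / (KB_CM1_PER_K * T))

    print(tau(3.0))  # a few milliseconds near 3 K
    ```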

  11. Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Biswas, Rupak

    1999-01-01

    The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.

  12. tRNA acceptor stem and anticodon bases form independent codes related to protein folding

    PubMed Central

    Carter, Charles W.; Wolfenden, Richard

    2015-01-01

    Aminoacyl-tRNA synthetases recognize tRNA anticodon and 3′ acceptor stem bases. Synthetase Urzymes acylate cognate tRNAs even without anticodon-binding domains, in keeping with the possibility that acceptor stem recognition preceded anticodon recognition. Representing tRNA identity elements with two bits per base, we show that the anticodon encodes the hydrophobicity of each amino acid side-chain as represented by its water-to-cyclohexane distribution coefficient, and this relationship holds true over the entire temperature range of liquid water. The acceptor stem codes preferentially for the surface area or size of each side-chain, as represented by its vapor-to-cyclohexane distribution coefficient. These orthogonal experimental properties are both necessary to account satisfactorily for the exposed surface area of amino acids in folded proteins. Moreover, the acceptor stem codes correctly for β-branched and carboxylic acid side-chains, whereas the anticodon codes for a wider range of such properties, but not for size or β-branching. These and other results suggest that genetic coding of 3D protein structures evolved in distinct stages, based initially on the size of the amino acid and later on its compatibility with globular folding in water. PMID:26034281

  13. Coset Codes Viewed as Terminated Convolutional Codes

    NASA Technical Reports Server (NTRS)

    Fossorier, Marc P. C.; Lin, Shu

    1996-01-01

    In this paper, coset codes are considered as terminated convolutional codes. Based on this approach, three new general results are presented. First, it is shown that the iterative squaring construction can equivalently be defined from a convolutional code whose trellis terminates. This convolutional code determines a simple encoder for the coset code considered, and the state and branch labelings of the associated trellis diagram become straightforward. Also, from the generator matrix of the code in its convolutional code form, much information about the trade-off between the state connectivity and complexity at each section, and the parallel structure of the trellis, is directly available. Based on this generator matrix, it is shown that the parallel branches in the trellis diagram of the convolutional code represent the same coset code C_1, of smaller dimension and shorter length. Utilizing this fact, a two-stage optimum trellis decoding method is devised. The first stage decodes C_1, while the second stage decodes the associated convolutional code, using the branch metrics delivered by stage 1. Finally, a bidirectional decoding of each received block starting at both ends is presented. If about the same number of computations is required, this approach remains very attractive from a practical point of view as it roughly doubles the decoding speed. This fact is particularly interesting whenever the second half of the trellis is the mirror image of the first half, since the same decoder can be implemented for both parts.

  14. Identifying personal microbiomes using metagenomic codes

    PubMed Central

    Franzosa, Eric A.; Huang, Katherine; Meadow, James F.; Gevers, Dirk; Lemon, Katherine P.; Bohannan, Brendan J. M.; Huttenhower, Curtis

    2015-01-01

    Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30–300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability—a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability. PMID:25964341
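
    A greedy toy version of such a code construction (illustrative only; the published hitting-set algorithm at huttenhower.sph.harvard.edu/idability is more elaborate): features of the target are added until no other individual carries all of them.

    ```python
    # Greedy sketch of building a distinguishing "metagenomic code".
    def build_code(name, population):
        """population: dict individual -> set of detected marker features.
        Returns a small feature set unique to `name`, or None if impossible."""
        target = population[name]
        others = [feats for n, feats in population.items() if n != name]
        code = set()
        while any(code <= feats for feats in others):
            candidates = target - code
            if not candidates:   # someone else carries every target feature
                return None
            # Add the feature that eliminates the most remaining matches.
            best = max(candidates,
                       key=lambda f: sum(code <= feats and f not in feats
                                         for feats in others))
            code.add(best)
        return code

    demo = {"A": {"m1", "m2", "m3"}, "B": {"m2", "m3"}, "C": {"m3", "m4"}}
    print(build_code("A", demo))  # {'m1'} uniquely identifies individual A
    ```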

  15. Parallel optical image addition and subtraction in a dynamic photorefractive memory by phase-code multiplexing

    NASA Astrophysics Data System (ADS)

    Denz, Cornelia; Dellwig, Thilo; Lembcke, Jan; Tschudi, Theo

    1996-02-01

    We propose and demonstrate experimentally a method for utilizing a dynamic phase-encoded photorefractive memory to realize parallel optical addition, subtraction, and inversion operations of stored images. The phase-encoded holographic memory is realized in photorefractive BaTiO3, storing eight images using Walsh-Hadamard binary phase codes and an incremental recording procedure. By subsampling the set of reference beams during the recall operation, the selectivity of the phase address is decreased, allowing one to combine images in such a way that different linear combinations of the images can be realized at the output of the memory.
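
    The arithmetic behind recalling linear combinations can be illustrated with the orthogonality of Walsh-Hadamard codes (a numerical sketch, not a model of the optics):

    ```python
    # Rows of a Hadamard matrix are mutually orthogonal +/-1 phase codes, so
    # addressing the memory with the sum of two codes recalls the sum of the
    # two corresponding stored images.
    import numpy as np

    def hadamard(n):
        """Sylvester construction; n must be a power of two."""
        h = np.array([[1]])
        while h.shape[0] < n:
            h = np.block([[h, h], [h, -h]])
        return h

    H = hadamard(8)                             # one phase code per image
    assert np.allclose(H @ H.T, 8 * np.eye(8))  # orthogonality
    weights = H @ (H[0] + H[1]) / 8             # -> [1, 1, 0, 0, 0, 0, 0, 0]
    ```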

  16. IMPETUS: Consistent SPH calculations of 3D spherical Bondi accretion onto a black hole

    NASA Astrophysics Data System (ADS)

    Ramírez-Velasquez, J. M.; Sigalotti, L. Di G.; Gabbasov, R.; Cruz, F.; Klapp, J.

    2018-04-01

    We present three-dimensional calculations of spherically symmetric Bondi accretion onto a stationary supermassive black hole (SMBH) of mass 10^8 M⊙ within a radial range of 0.02 - 10 pc, using a modified version of the smoothed particle hydrodynamics (SPH) GADGET-2 code, which ensures approximate first-order consistency (i.e., second-order accuracy) for the particle approximation. First-order consistency is restored by allowing the number of neighbours, nneigh, and the smoothing length, h, to vary with the total number of particles, N, such that the asymptotic limits nneigh → ∞ and h → 0 hold as N → ∞. The ability of the method to reproduce the isothermal (γ = 1) and adiabatic (γ = 5/3) Bondi accretion is investigated with increased spatial resolution. In particular, for the isothermal models the numerical radial profiles closely match the Bondi solution, except near the accretor, where the density and radial velocity are slightly underestimated. However, as nneigh is increased and h is decreased, the calculations approach first-order consistency and the deviations from the Bondi solution decrease. The density and radial velocity profiles for the adiabatic models are qualitatively similar to those for the isothermal Bondi accretion. Steady-state Bondi accretion is reproduced by the highly resolved consistent models with a percent relative error of ≲ 1% for γ = 1 and ∼9% for γ = 5/3, with the adiabatic accretion taking longer than the isothermal case to reach steady flow. The performance of the method is assessed by comparing the results with those obtained using the standard GADGET-2 and the GIZMO codes.

  17. Parallel Adaptive Mesh Refinement Library

    NASA Technical Reports Server (NTRS)

    Mac-Neice, Peter; Olson, Kevin

    2005-01-01

    Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.

  18. Implementing Shared Memory Parallelism in MCBEND

    NASA Astrophysics Data System (ADS)

    Bird, Adam; Long, David; Dobson, Geoff

    2017-09-01

    MCBEND is a general purpose radiation transport Monte Carlo code from AMEC Foster Wheeler's ANSWERS® Software Service. MCBEND is well established in the UK shielding community for radiation shielding and dosimetry assessments. The existing MCBEND parallel capability effectively involves running the same calculation on many processors. This works very well except when the memory requirements of a model restrict the number of instances of a calculation that will fit on a machine. To more effectively utilise parallel hardware, OpenMP has been used to implement shared memory parallelism in MCBEND. This paper describes the reasoning behind the choice of OpenMP, notes some of the challenges of multi-threading an established code such as MCBEND, and assesses the performance of the parallel method implemented in MCBEND.

  19. Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

    NASA Technical Reports Server (NTRS)

    Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

    2017-01-01

    Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit the potential for improvement of serial code even for so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed, such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize these computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated, but only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated, but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
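
    The task is embarrassingly parallel because each Jacobian column is an independent residual evaluation. A sketch of the same decomposition in Python (multiprocessing standing in for the paper's MATLAB spmd/composite-object approach; F is a toy two-equation system):

      import numpy as np
      from concurrent.futures import ProcessPoolExecutor

      def F(x):
          """Toy nonlinear system; any R^n -> R^n residual works here."""
          return np.array([x[0]**2 + x[1] - 3.0,
                           x[0] - x[1]**3 + 1.0])

      def jac_column(args):
          """Forward-difference column j of the Jacobian -- one independent task."""
          x, f0, j, eps = args
          xp = x.copy()
          xp[j] += eps
          return (F(xp) - f0) / eps

      def parallel_jacobian(x, eps=1e-7, workers=2):
          f0 = F(x)
          tasks = [(x, f0, j, eps) for j in range(x.size)]
          with ProcessPoolExecutor(max_workers=workers) as pool:
              cols = list(pool.map(jac_column, tasks))
          return np.column_stack(cols)

      if __name__ == "__main__":         # required when spawning worker processes
          J = parallel_jacobian(np.array([1.0, 2.0]))
          print(J)                       # analytic answer: [[2, 1], [1, -12]]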

  20. Parallel tempering simulation of the three-dimensional Edwards-Anderson model with compact asynchronous multispin coding on GPU

    NASA Astrophysics Data System (ADS)

    Fang, Ye; Feng, Sheng; Tam, Ka-Ming; Yun, Zhifeng; Moreno, Juana; Ramanujam, J.; Jarrell, Mark

    2014-10-01

    Monte Carlo simulations of the Ising model play an important role in the field of computational statistical physics, and they have revealed many properties of the model over the past few decades. However, the effect of frustration due to random disorder, in particular the possible spin glass phase, remains a crucial but poorly understood problem. One of the obstacles in the Monte Carlo simulation of random frustrated systems is their long relaxation time making an efficient parallel implementation on state-of-the-art computation platforms highly desirable. The Graphics Processing Unit (GPU) is such a platform that provides an opportunity to significantly enhance the computational performance and thus gain new insight into this problem. In this paper, we present optimization and tuning approaches for the CUDA implementation of the spin glass simulation on GPUs. We discuss the integration of various design alternatives, such as GPU kernel construction with minimal communication, memory tiling, and look-up tables. We present a binary data format, Compact Asynchronous Multispin Coding (CAMSC), which provides an additional 28.4% speedup compared with the traditionally used Asynchronous Multispin Coding (AMSC). Our overall design sustains a performance of 33.5 ps per spin flip attempt for simulating the three-dimensional Edwards-Anderson model with parallel tempering, which significantly improves the performance over existing GPU implementations.
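
    Multispin coding packs one spin from each of many replicas into the bits of a single machine word, so one bitwise operation updates every replica at once. The sketch below shows the core idea for a 1D Ising chain in NumPy (64 replicas per uint64 word; a plain multispin Metropolis update, far simpler than the paper's CAMSC layout and CUDA kernels):

      import numpy as np

      rng = np.random.default_rng(1)
      L, beta, NREP = 64, 0.5, 64       # sites, inverse temperature, replicas

      # spins[i] holds site i of all replicas: bit b = spin of replica b (0/1).
      spins = rng.integers(0, 2**63, size=L, dtype=np.uint64)

      def random_bits(p):
          """uint64 whose 64 bits are independently 1 with probability p."""
          return np.uint64(sum(int(b) << k
                               for k, b in enumerate(rng.random(NREP) < p)))

      p4 = np.exp(-4.0 * beta)          # Metropolis factor for dE = +4J

      def sweep(spins):
          for i in range(L):
              x = spins[i] ^ spins[(i - 1) % L]  # 1 where left neighbour differs
              y = spins[i] ^ spins[(i + 1) % L]  # 1 where right neighbour differs
              # dE <= 0 wherever at least one neighbour disagrees (x | y);
              # otherwise dE = +4J, accepted with probability exp(-4*beta).
              accept = (x | y) | random_bits(p4)
              spins[i] ^= accept        # one XOR flips all accepted replicas

      for _ in range(100):
          sweep(spins)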

  1. Development of a cryogenic mixed fluid J-T cooling computer code, 'JTMIX'

    NASA Technical Reports Server (NTRS)

    Jones, Jack A.

    1991-01-01

    An initial study was performed for analyzing and predicting the temperatures and cooling capacities when mixtures of fluids are used in Joule-Thomson coolers and in heat pipes. A computer code, JTMIX, was developed for mixed gas J-T analysis for any fluid combination of neon, nitrogen, various hydrocarbons, argon, oxygen, carbon monoxide, carbon dioxide, and hydrogen sulfide. When used in conjunction with the NIST computer code, DDMIX, it has accurately predicted order-of-magnitude increases in J-T cooling capacities when various hydrocarbons are added to nitrogen, and it predicts nitrogen normal boiling point depressions to as low as 60 K when neon is added.

  2. Parallel loss of nuclear-encoded mitochondrial aminoacyl-tRNA synthetases and mtDNA-encoded tRNAs in Cnidaria.

    PubMed

    Haen, Karri M; Pett, Walker; Lavrov, Dennis V

    2010-10-01

    Unlike most animal mitochondrial (mt) genomes, which encode a set of 22 transfer RNAs (tRNAs) sufficient for mt protein synthesis, those of cnidarians have only retained one or two tRNA genes. Whether the missing cnidarian mt-tRNA genes relocated outside the main mt chromosome or were lost remains unclear. It is also unknown what impact the loss of tRNA genes had on other components of the mt translational machinery. Here, we explored the nuclear genome of the cnidarian Nematostella vectensis for the presence of mt-tRNA genes and their corresponding mt aminoacyl-tRNA synthetases (mt-aaRS). We detected no candidates for mt-tRNA genes and only two mt-aaRS orthologs. At the same time, we found that all but one cytosolic aaRS appear to be targeted to mitochondria. These results indicate that the loss of mt-tRNAs in Cnidaria is genuine and occurred in parallel with the loss of nuclear-encoded mt-aaRS. Our phylogenetic analyses of individual aaRS revealed that although the nearly total loss of mt-aaRS is rare, aaRS gene deletion and replacement have occurred throughout the evolution of Metazoa.

  3. Parallel Higher-order Finite Element Method for Accurate Field Computations in Wakefield and PIC Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A.; Kabel, A.; Lee, L.

    Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.

  4. Three-Dimensional Algebraic Models of the tRNA Code and 12 Graphs for Representing the Amino Acids.

    PubMed

    José, Marco V; Morgado, Eberto R; Guimarães, Romeu Cardoso; Zamudio, Gabriel S; de Farías, Sávio Torres; Bobadilla, Juan R; Sosa, Daniela

    2014-08-11

    Three-dimensional algebraic models, also called Genetic Hotels, are developed to represent the Standard Genetic Code, the Standard tRNA Code (S-tRNA-C), and the Human tRNA code (H-tRNA-C). New algebraic concepts are introduced to be able to describe these models, to wit, the generalization of the 2n-Klein Group and the concept of a subgroup coset with a tail. We found that the H-tRNA-C displayed broken symmetries in regard to the S-tRNA-C, which is highly symmetric. We also show that there are only 12 ways to represent each of the corresponding phenotypic graphs of amino acids. The averages of statistical centrality measures of the 12 graphs for each of the three codes are carried out and they are statistically compared. The phenotypic graphs of the S-tRNA-C display a common triangular prism of amino acids in 10 out of the 12 graphs, whilst the corresponding graphs for the H-tRNA-C display only two triangular prisms. The graphs exhibit disjoint clusters of amino acids when their polar requirement values are used. We contend that the S-tRNA-C is in a frozen-like state, whereas the H-tRNA-C may be in an evolving state.

  5. Synthesis, structure and DFT conformation analysis of CpNiX(NHC) and NiX2(NHC)2 (X = SPh or Br) complexes

    NASA Astrophysics Data System (ADS)

    Malan, Frederick P.; Singleton, Eric; van Rooyen, Petrus H.; Conradie, Jeanet; Landman, Marilé

    2017-11-01

    The synthesis, density functional theory (DFT) conformational study and structure analysis of novel two-legged piano stool Ni N-heterocyclic carbene (NHC) complexes and square planar Ni bis-N-heterocyclic carbene complexes, all containing either bromido- or thiophenolato ligands, are described. [CpNi(SPh)(NHC)] complexes were obtained from the neutral 18-electron [CpNiBr(NHC)] complexes by substitution of a bromido ligand with SPh, using NEt3 as a base to abstract the proton of HSPh. The 16-electron biscarbene complexes [Ni(SPh)2{NHC}2] were isolated when an excess of HSPh was added to the reaction mixture. Biscarbene complexes of the type [NiBr2(NHC)2] were obtained in the reaction of NiCp2 with a slight excess of the specific imidazolium bromide salt. The molecular and electronic structures of the mono- and bis-N-heterocyclic carbene complexes have been analysed using single crystal diffraction and density functional theory (DFT) calculations, to give insight into their structural properties.

  6. Aerodynamic simulation on massively parallel systems

    NASA Technical Reports Server (NTRS)

    Haeuser, Jochem; Simon, Horst D.

    1992-01-01

    This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design for advanced transportation in commercial applications as well as in space flight. The discussion shows that massively parallel systems are the only alternative that is both cost effective and able to provide the TeraFlops needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations, which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary-fitted grids and the fully unstructured tetrahedral grids. A fully structured grid best represents the flow physics, while the unstructured grid gives the best geometrical flexibility. The multiblock grid employed is structured within a block, but completely unstructured on the block level. While a completely unstructured grid is not straightforward to parallelize, the above-mentioned multiblock grid is inherently parallel, in particular for multiple instruction multiple datastream (MIMD) machines. In this paper guidelines are provided for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system. Some preliminary results for an 8K CM2 machine will also be mentioned. The code run is the two-dimensional grid generation module of Grid, which is a general two-dimensional and three-dimensional grid generation code for complex geometries. A system of nonlinear Poisson equations is solved. This code is also a good test case for complex fluid dynamics codes, since the same data structures are used. All systems provided good speedups, but
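
    The grid-generation kernel mentioned here is easy to caricature. The sketch below (NumPy; a linear Laplace system rather than Grid's full nonlinear Poisson system with control functions) fills the interior coordinates of a single block by Jacobi iteration with the block boundary held fixed, the same data access pattern a parallel stencil code must support:

      import numpy as np

      def laplace_grid(xb, yb, iters=2000):
          """Jacobi-iterate Laplace's equation for the physical coordinates
          x(xi, eta), y(xi, eta); only interior points are updated, so the
          boundary point distribution is preserved."""
          x, y = xb.copy(), yb.copy()
          for _ in range(iters):
              x[1:-1, 1:-1] = 0.25 * (x[2:, 1:-1] + x[:-2, 1:-1] +
                                      x[1:-1, 2:] + x[1:-1, :-2])
              y[1:-1, 1:-1] = 0.25 * (y[2:, 1:-1] + y[:-2, 1:-1] +
                                      y[1:-1, 2:] + y[1:-1, :-2])
          return x, y

      # Hypothetical single block: unit square with a bulged lower wall.
      n = 33
      s = np.linspace(0.0, 1.0, n)
      x0, y0 = np.meshgrid(s, s, indexing='ij')
      y0[:, 0] = 0.15 * np.sin(np.pi * s)   # deform one boundary edge
      x, y = laplace_grid(x0, y0)           # smooth interior grid follows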

  7. Application of particle splitting method for both hydrostatic and hydrodynamic cases in SPH

    NASA Astrophysics Data System (ADS)

    Liu, W. T.; Sun, P. N.; Ming, F. R.; Zhang, A. M.

    2018-01-01

    Smoothed particle hydrodynamics (SPH) method with numerical diffusive terms shows satisfactory stability and accuracy in some violent fluid-solid interaction problems. However, in most simulations, uniform particle distributions are used, and multi-resolution, which can obviously improve the local accuracy and the overall computational efficiency, has seldom been applied. In this paper, a dynamic particle splitting method is applied that allows for the simulation of both hydrostatic and hydrodynamic problems. In the splitting algorithm, when a coarse (mother) particle enters the splitting region, it is split into four daughter particles, which inherit the physical parameters of the mother particle. In the particle splitting process, conservation of mass, momentum and energy is ensured. Based on the error analysis, the splitting technique is designed to allow optimal accuracy at the interface between the coarse and refined particles, which is particularly important in the simulation of hydrostatic cases. Finally, the scheme is validated by five basic cases, which demonstrate that the present SPH model with a particle splitting technique is of high accuracy and efficiency and is capable of simulating a wide range of hydrodynamic problems.
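
    The conservation bookkeeping of one split is compact. Below is a schematic 2D version (hypothetical variable names and a typical square placement stencil; the paper's daughter placement and smoothing-length rescaling are tuned for interface accuracy): four daughters share the mother's mass and inherit her velocity, so mass and linear momentum are conserved exactly.

      import numpy as np

      def split_particle(pos, vel, mass, h, eps=0.4):
          """Split a mother particle into 4 daughters on a square stencil of
          half-diagonal eps*h (eps is a tunable spacing parameter)."""
          offsets = (eps * h / np.sqrt(2.0)) * np.array(
              [[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
          d_pos = pos + offsets                 # 4 daughter positions
          d_vel = np.tile(vel, (4, 1))          # inherited velocity
          d_mass = np.full(4, mass / 4.0)       # total mass conserved
          d_h = np.full(4, h / 2.0)             # keep density smooth (2D choice)
          return d_pos, d_vel, d_mass, d_h

      pos, vel = np.zeros(2), np.array([1.0, 0.0])
      dp, dv, dm, dh = split_particle(pos, vel, mass=1.0, h=0.1)
      assert np.isclose(dm.sum(), 1.0)                    # mass
      assert np.allclose((dm[:, None] * dv).sum(0), vel)  # momentum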

  8. Particle In Cell Codes on Highly Parallel Architectures

    NASA Astrophysics Data System (ADS)

    Tableman, Adam

    2014-10-01

    We describe strategies and examples of Particle-In-Cell Codes running on Nvidia GPU and Intel Phi architectures. This includes basic implementations in skeleton codes and full-scale development versions (encompassing 1D, 2D, and 3D codes) in Osiris. Both the similarities and differences between Intel's and Nvidia's hardware will be examined. Work supported by grants NSF ACI 1339893, DOE DE SC 000849, DOE DE SC 0008316, DOE DE NA 0001833, and DOE DE FC02 04ER 54780.

  9. Simulating coupled dynamics of a rigid-flexible multibody system and compressible fluid

    NASA Astrophysics Data System (ADS)

    Hu, Wei; Tian, Qiang; Hu, HaiYan

    2018-04-01

    As a subsequent work of the authors' previous studies, a new parallel computation approach is proposed to simulate the coupled dynamics of a rigid-flexible multibody system and compressible fluid. In this approach, the smoothed particle hydrodynamics (SPH) method is used to model the compressible fluid, while the natural coordinate formulation (NCF) and absolute nodal coordinate formulation (ANCF) are used to model the rigid and flexible bodies, respectively. In order to model the compressible fluid properly and efficiently via the SPH method, three measures are taken. The first is to use the Riemann solver to cope with the fluid compressibility, the second is to define virtual particles of SPH to model the dynamic interaction between the fluid and the multibody system, and the third is to impose the boundary conditions of periodical inflow and outflow to reduce the number of SPH particles involved in the computation process. Afterwards, a parallel computation strategy is proposed based on the graphics processing unit (GPU) to detect the neighboring SPH particles and to solve the dynamic equations of SPH particles, in order to improve the computation efficiency. Meanwhile, the generalized-alpha algorithm is used to solve the dynamic equations of the multibody system. Finally, four case studies are given to validate the proposed parallel computation approach.
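
    The neighbour-detection step is the classic cell-linked-list pattern that maps well to GPUs. A serial Python sketch of the same algorithm (cell edge equal to the interaction radius, non-periodic box; a GPU version builds the same lists with sort-based kernels):

      import numpy as np
      from collections import defaultdict
      from itertools import product

      def neighbour_pairs(pos, r):
          """All particle pairs closer than r: bin particles into cells of
          edge r, then test only the 27 surrounding cells per particle
          instead of all N^2 pairs."""
          keys = np.floor(pos / r).astype(int)
          cells = defaultdict(list)
          for i, k in enumerate(map(tuple, keys)):
              cells[k].append(i)
          pairs = []
          for i, k in enumerate(map(tuple, keys)):
              for off in product((-1, 0, 1), repeat=3):
                  cell = (k[0] + off[0], k[1] + off[1], k[2] + off[2])
                  for j in cells.get(cell, ()):
                      if j > i and np.sum((pos[i] - pos[j])**2) < r * r:
                          pairs.append((i, j))
          return pairs

      pos = np.random.default_rng(0).random((500, 3))
      print(len(neighbour_pairs(pos, 0.1)), "interacting pairs")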

  10. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.

  11. Anion-cation charge-transfer properties and spectral studies of [M(phen)3][Cd4(SPh)10] (M = Ru, Fe, and Ni).

    PubMed

    Jiang, Jian-Bing; Bian, Guo-Qing; Zhang, Ya-Ping; Luo, Wen; Zhu, Qin-Yu; Dai, Jie

    2011-10-07

    Three anion-cation compounds 1-3 with formula [M(phen)3][Cd4(SPh)10]·Sol (M = Ru2+, Fe2+, and Ni2+; Sol = MeCN and H2O) have been synthesized and characterized by single-crystal analysis. Both the cations and anion are well-known ions, but the properties of the co-assembled compounds are interesting. Molecular structures and charge-transfer between the cations and anions in crystal and even in solution are discussed. These compounds are isomorphous, and short inter-ion interactions are found in these crystals, such as π···π stacking and C-H···π contacts. Both spectroscopic and theoretical calculated results indicate that there is anion-cation charge-transfer (ACCT) between the Ru-phen complex dye and the Cd-SPh cluster, which plays an important role in their photophysical properties. The intensity of the fluorescent emission of the [Ru(phen)3]2+ is enhanced when the cation interacts with the [Cd4(SPh)10]2- anion. A mechanism for the enhancement of photoluminescence has been proposed.

  12. Three-Dimensional Algebraic Models of the tRNA Code and 12 Graphs for Representing the Amino Acids

    PubMed Central

    José, Marco V.; Morgado, Eberto R.; Guimarães, Romeu Cardoso; Zamudio, Gabriel S.; de Farías, Sávio Torres; Bobadilla, Juan R.; Sosa, Daniela

    2014-01-01

    Three-dimensional algebraic models, also called Genetic Hotels, are developed to represent the Standard Genetic Code, the Standard tRNA Code (S-tRNA-C), and the Human tRNA code (H-tRNA-C). New algebraic concepts are introduced to be able to describe these models, to wit, the generalization of the 2n-Klein Group and the concept of a subgroup coset with a tail. We found that the H-tRNA-C displayed broken symmetries in regard to the S-tRNA-C, which is highly symmetric. We also show that there are only 12 ways to represent each of the corresponding phenotypic graphs of amino acids. The averages of statistical centrality measures of the 12 graphs for each of the three codes are carried out and they are statistically compared. The phenotypic graphs of the S-tRNA-C display a common triangular prism of amino acids in 10 out of the 12 graphs, whilst the corresponding graphs for the H-tRNA-C display only two triangular prisms. The graphs exhibit disjoint clusters of amino acids when their polar requirement values are used. We contend that the S-tRNA-C is in a frozen-like state, whereas the H-tRNA-C may be in an evolving state. PMID:25370377

  13. Mechanism of single metal exchange in the reactions of [M4(SPh)10]2- (M = Zn or Fe) with CoX2 (X = Cl or NO3) or FeCl2.

    PubMed

    Autissier, Valerie; Henderson, Richard A

    2008-07-21

    The kinetics of the reactions between [Zn4(SPh)10]2- and an excess of MX2 (M = Co, X = NO3 or Cl; M = Fe, X = Cl), in which a Zn(II) is replaced by M(II), have been studied in MeCN at 25.0 °C. 1H NMR spectroscopy shows that the ultimate product of the reactions is an equilibrium mixture of clusters of composition [ZnnM4-n(SPh)10]2-, and this is reflected in the multiphasic absorbance-time curves observed over protracted times (several minutes) using stopped-flow spectrophotometry to study the reactions. The kinetics of only the first phase have been determined, corresponding to the equilibrium formation of [Zn3M(SPh)10]2-. The effects of varying the concentrations of cluster, MX2, and ZnCl2 on the kinetics have been investigated. The rate law is consistent with the equilibrium nature of the metal exchange process and indicates a mechanism for the formation of [Zn3M(SPh)10]2- involving two coupled equilibria. In the initial step, binding of MX2 to a bridging thiolate in [Zn4(SPh)10]2- results in breaking of a Zn-bridging thiolate bond. In the second step, replacement of the cluster Zn involves transfer of the bridging thiolates from the Zn to M, with breaking of a Zn-bridged thiolate bond being rate-limiting. The kinetics for the reaction of ZnCl2 with [Zn3M(SPh)10]2- (M = Fe or Co) depend on the identity of M. This behavior indicates attack of ZnCl2 at a M-μ-SPh-Zn bridged thiolate. Similar studies on the analogous reactions between [Fe4(SPh)10]2- and an excess of CoX2 (X = NO3 or Cl) in MeCN exhibit simpler kinetics, but these are also consistent with the same mechanism.

  14. Parallelization of the FLAPW method and comparison with the PPW method

    NASA Astrophysics Data System (ADS)

    Canning, Andrew; Mannstadt, Wolfgang; Freeman, Arthur

    2000-03-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. In the past the FLAPW method has been limited to systems of about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell running on up to 512 processors on a Cray T3E parallel supercomputer. Some results will also be presented on a comparison of the plane-wave pseudopotential method and the FLAPW method on large systems.

  15. TOMO3D: 3-D joint refraction and reflection traveltime tomography parallel code for active-source seismic data—synthetic test

    NASA Astrophysics Data System (ADS)

    Meléndez, A.; Korenaga, J.; Sallarès, V.; Miniussi, A.; Ranero, C. R.

    2015-10-01

    We present a new 3-D traveltime tomography code (TOMO3D) for the modelling of active-source seismic data that uses the arrival times of both refracted and reflected seismic phases to derive the velocity distribution and the geometry of reflecting boundaries in the subsurface. This code is based on its popular 2-D version TOMO2D from which it inherited the methods to solve the forward and inverse problems. The traveltime calculations are done using a hybrid ray-tracing technique combining the graph and bending methods. The LSQR algorithm is used to perform the iterative regularized inversion to improve the initial velocity and depth models. In order to cope with an increased computational demand due to the incorporation of the third dimension, the forward problem solver, which takes most of the run time (˜90 per cent in the test presented here), has been parallelized with a combination of multi-processing and message passing interface standards. This parallelization distributes the ray-tracing and traveltime calculations among available computational resources. The code's performance is illustrated with a realistic synthetic example, including a checkerboard anomaly and two reflectors, which simulates the geometry of a subduction zone. The code is designed to invert for a single reflector at a time. A data-driven layer-stripping strategy is proposed for cases involving multiple reflectors, and it is tested for the successive inversion of the two reflectors. Layers are bound by consecutive reflectors, and an initial velocity model for each inversion step incorporates the results from previous steps. This strategy poses simpler inversion problems at each step, allowing the recovery of strong velocity discontinuities that would otherwise be smoothened.
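
    The decomposition is natural because the forward problem is independent per source. A schematic stand-in (Python multiprocessing over shots with a placeholder straight-ray traveltime routine; TOMO3D itself combines multiprocessing with MPI and a graph/bending ray tracer):

      import numpy as np
      from concurrent.futures import ProcessPoolExecutor

      def forward_one_source(args):
          """Placeholder forward solve for one shot: straight rays through a
          constant-velocity model, purely to show the task decomposition."""
          src, receivers, v = args
          return np.linalg.norm(receivers - src, axis=1) / v

      def forward_all(sources, receivers, v=6.0, workers=4):
          tasks = [(s, receivers, v) for s in sources]
          with ProcessPoolExecutor(max_workers=workers) as pool:
              return np.vstack(list(pool.map(forward_one_source, tasks)))

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          sources = rng.random((16, 3)) * 10.0    # shot positions (km)
          receivers = rng.random((64, 3)) * 10.0  # receiver positions (km)
          tt = forward_all(sources, receivers)    # (16, 64) traveltimes (s)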

  16. A compositional reservoir simulator on distributed memory parallel computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rame, M.; Delshad, M.

    1995-12-31

    This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
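
    The "extended subdomain" the abstract describes is the standard halo (ghost-cell) exchange. A minimal 1D sketch with mpi4py (illustrative only, not UTCHEM's routines): each rank owns a slab plus one ghost cell per side, and a single exchange makes the stencil update purely local.

      # Run with e.g.: mpiexec -n 4 python halo.py
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      n_local = 100
      u = np.zeros(n_local + 2)         # interior plus 2 ghost cells
      u[1:-1] = rank                    # fill interior with rank id

      left = rank - 1 if rank > 0 else MPI.PROC_NULL
      right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

      # Exchange ghost layers with both neighbours (PROC_NULL is a no-op).
      comm.Sendrecv(sendbuf=u[1:2], dest=left, recvbuf=u[-1:], source=right)
      comm.Sendrecv(sendbuf=u[-2:-1], dest=right, recvbuf=u[0:1], source=left)

      # Ghost cells now hold neighbour data, so the stencil is local:
      u[1:-1] = 0.5 * (u[:-2] + u[2:])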

  17. Modeling Spectra of Icy Satellites and Cometary Icy Particles Using Multi-Sphere T-Matrix Code

    NASA Astrophysics Data System (ADS)

    Kolokolova, Ludmilla; Mackowski, Daniel; Pitman, Karly M.; Joseph, Emily C. S.; Buratti, Bonnie J.; Protopapa, Silvia; Kelley, Michael S.

    2016-10-01

    The Multi-Sphere T-matrix code (MSTM) allows rigorous computations of characteristics of the light scattered by a cluster of spherical particles. It was introduced to the scientific community in 1996 (Mackowski & Mishchenko, 1996, JOSA A, 13, 2266). Later it was put online and became one of the most popular codes to study photopolarimetric properties of aggregated particles. Later versions of this code, especially its parallelized version MSTM3 (Mackowski & Mishchenko, 2011, JQSRT, 112, 2182), were used to compute angular and wavelength dependence of the intensity and polarization of light scattered by aggregates of up to 4000 constituent particles (Kolokolova & Mackowski, 2012, JQSRT, 113, 2567). The version MSTM4 considers large thick slabs of spheres (Mackowski, 2014, Proc. of the Workshop "Scattering by aggregates", Bremen, Germany, March 2014, Th. Wriedt & Yu. Eremin, Eds., 6) and is significantly different from the earlier versions. It adopts a Discrete Fourier Convolution, implemented using a Fast Fourier Transform, for evaluation of the exciting field. MSTM4 is able to treat tens of thousands of spheres and is about 100 times faster than the MSTM3 code. This allows us not only to compute the light scattering properties of a large number of electromagnetically interacting constituent particles, but also to perform multi-wavelength and multi-angular computations using computer resources with rather reasonable CPU time and memory. We used MSTM4 to model near-infrared spectra of icy satellites of Saturn (Rhea, Dione, and Tethys data from Cassini VIMS), and of icy particles observed in the coma of comet 103P/Hartley 2 (data from EPOXI/DI HRII). Results of our modeling show that in the case of icy satellites the best fit to the observed spectra is provided by regolith made of spheres of radius ~1 micron with a porosity in the range 85%-95%, which varies slightly for the different satellites. Fitting the spectra of the cometary icy particles requires icy

  18. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

    2001-01-01

    This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.

  19. Efficient operator splitting algorithm for joint sparsity-regularized SPIRiT-based parallel MR imaging reconstruction.

    PubMed

    Duan, Jizhong; Liu, Yu; Jing, Peiguang

    2018-02-01

    Self-consistent parallel imaging (SPIRiT) is an auto-calibrating model for the reconstruction of parallel magnetic resonance imaging, which can be formulated as a regularized SPIRiT problem. The Projection Over Convex Sets (POCS) method was used to solve the formulated regularized SPIRiT problem, but the quality of the reconstructed image still needs to be improved. Although methods such as NonLinear Conjugate Gradients (NLCG) can achieve higher spatial resolution, they demand complex computation and converge slowly. In this paper, we propose a new algorithm to solve the formulated Cartesian SPIRiT problem with the JTV and JL1 regularization terms. The proposed algorithm uses the operator splitting (OS) technique to decompose the problem into a gradient problem and a denoising problem with two regularization terms, the latter solved by our proposed split-Bregman-based denoising algorithm, and adopts the Barzilai and Borwein method to update the step size. Simulation experiments on two in vivo data sets demonstrate that the proposed algorithm is 1.3 times faster than ADMM for datasets with 8 channels, and 2 times faster than ADMM for the dataset with 32 channels.
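
    The Barzilai-Borwein update is a one-line rule: with s = x_k - x_{k-1} and y = ∇f(x_k) - ∇f(x_{k-1}), the next step size is α = (sᵀs)/(sᵀy). A generic sketch on a toy quadratic (not the SPIRiT objective, whose gradient involves the calibration kernels and regularizers):

      import numpy as np

      def bb_gradient_descent(grad, x0, iters=200, alpha0=1e-2):
          """Gradient descent with Barzilai-Borwein (BB1) step sizes."""
          x = x0.copy()
          g = grad(x)
          alpha = alpha0
          for _ in range(iters):
              x_new = x - alpha * g
              g_new = grad(x_new)
              s, y = x_new - x, g_new - g
              sy = s @ y
              alpha = (s @ s) / sy if sy > 1e-12 else alpha0
              x, g = x_new, g_new
          return x

      # Toy strongly convex quadratic f(x) = 0.5 x^T A x - b^T x.
      A = np.diag([1.0, 10.0, 100.0])
      b = np.array([1.0, 2.0, 3.0])
      x = bb_gradient_descent(lambda x: A @ x - b, np.zeros(3))
      print("error:", np.linalg.norm(x - np.linalg.solve(A, b)))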

  20. NAS Parallel Benchmark. Results 11-96: Performance Comparison of HPF and MPI Based NAS Parallel Benchmarks. 1.0

    NASA Technical Reports Server (NTRS)

    Saini, Subash; Bailey, David; Chancellor, Marisa K. (Technical Monitor)

    1997-01-01

    High Performance Fortran (HPF), the high-level language for parallel Fortran programming, is based on Fortran 90. HPF was defined by an informal standards committee known as the High Performance Fortran Forum (HPFF) in 1993, and modeled on TMC's CM Fortran language. Several HPF features have since been incorporated into the draft ANSI/ISO Fortran 95, the next formal revision of the Fortran standard. HPF allows users to write a single parallel program that can execute on a serial machine, a shared-memory parallel machine, or a distributed-memory parallel machine. HPF eliminates the complex, error-prone task of explicitly specifying how, where, and when to pass messages between processors on distributed-memory machines, or when to synchronize processors on shared-memory machines. HPF is designed in a way that allows the programmer to code an application at a high level, and then selectively optimize portions of the code by dropping into message-passing or calling tuned library routines as 'extrinsics'. Compilers supporting High Performance Fortran features first appeared in late 1994 and early 1995 from Applied Parallel Research (APR), Digital Equipment Corporation, and The Portland Group (PGI). IBM introduced an HPF compiler for the IBM RS/6000 SP/2 in April of 1996. Over the past two years, these implementations have shown steady improvement in terms of both features and performance. The performance of various hardware/programming-model (HPF and MPI (message passing interface)) combinations will be compared, based on the latest NAS (NASA Advanced Supercomputing) Parallel Benchmark (NPB) results, thus providing a cross-machine and cross-model comparison. Specifically, HPF-based NPB results will be compared with MPI-based NPB results to provide perspective on performance currently obtainable using HPF versus MPI or versus hand-tuned implementations such as those supplied by the hardware vendors. In addition we also present NPB (Version 1.0) performance results for

  1. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.

  2. Pyrrolysyl-tRNA Synthetase, an Aminoacyl-tRNA Synthetase for Genetic Code Expansion

    DOE PAGES

    Crnkovic, Ana; Suzuki, Tateki; Soll, Dieter; ...

    2016-06-14

    Genetic code expansion (GCE) has become a central topic of synthetic biology. GCE relies on engineered aminoacyl-tRNA synthetases (aaRSs) and a cognate tRNA species to allow codon reassignment by co-translational insertion of non-canonical amino acids (ncAAs) into proteins. Introduction of such amino acids increases the chemical diversity of recombinant proteins, endowing them with novel properties. Such proteins serve in sophisticated biochemical and biophysical studies both in vitro and in vivo, they may become unique biomaterials or therapeutic agents, and they afford metabolic dependence of genetically modified organisms for biocontainment purposes. In the Methanosarcinaceae the incorporation of the 22nd genetically encoded amino acid, pyrrolysine (Pyl), is facilitated by pyrrolysyl-tRNA synthetase (PylRS) and the cognate UAG-recognizing tRNAPyl. This unique aaRS·tRNA pair functions as an orthogonal translation system (OTS) in most model organisms. The facile directed evolution of the large PylRS active site to accommodate many ncAAs, and the enzyme's anticodon-blind specific recognition of the cognate tRNAPyl, make this system highly amenable for GCE purposes. The remarkable polyspecificity of PylRS has been exploited to incorporate >100 different ncAAs into proteins. Here we review the Pyl-OT system and selected GCE applications to examine the properties of an effective OTS.

  3. A NEW LOW MASS FOR THE HERCULES dSph: THE END OF A COMMON MASS SCALE FOR THE DWARFS?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aden, D.; Feltzing, S.; Lundstroem, I.

    2009-11-20

    We present a new mass estimate for the Hercules dwarf spheroidal (dSph) galaxy, based on the revised velocity dispersion obtained by Aden et al. The removal of a significant foreground contamination using newly acquired Stroemgren photometry has resulted in a reduced velocity dispersion. Using this new velocity dispersion of 3.72 ± 0.91 km s^-1, we find a mass of M_300 = 1.9 (+1.1/-0.8) × 10^6 M⊙ within the central 300 pc, which is also the half-light radius, and a mass of M_433 = 3.7 (+2.2/-1.6) × 10^6 M⊙ within the reach of our data to 433 pc, significantly lower than previous estimates. We derive an overall mass-to-light ratio of M_433/L = 103 (+83/-48) M⊙/L⊙. Our mass estimate calls into question recent claims of a common mass scale for dSph galaxies. Additionally, we find tentative evidence for a velocity gradient in our kinematic data of 16 ± 3 km s^-1 kpc^-1, and evidence of an asymmetric extension in the light distribution at ≈0.5 kpc. We explore the possibility that these features are due to tidal interactions with the Milky Way. We show that there is a self-consistent model in which Hercules has an assumed tidal radius of r_t = 485 pc, an orbital pericenter of r_p = 18.5 ± 5 kpc, and a mass within r_t of M_tid = 5.2 (+2.7/-2.7) × 10^6 M⊙. Proper motions are required to test this model. Although we cannot exclude models in which Hercules contains no dark matter, we argue that Hercules is more likely to be a dark-matter-dominated system that is currently experiencing some tidal disturbance of its outer parts.

  4. A splitting integration scheme for the SPH simulation of concentrated particle suspensions

    NASA Astrophysics Data System (ADS)

    Bian, Xin; Ellero, Marco

    2014-01-01

    Simulating nearly contacting solid particles in suspension is a challenging task due to the diverging behavior of short-range lubrication forces, which pose a serious time-step limitation for explicit integration schemes. This general difficulty severely limits the total duration of simulations of concentrated suspensions. Inspired by the ideas developed in [S. Litvinov, M. Ellero, X.Y. Hu, N.A. Adams, J. Comput. Phys. 229 (2010) 5457-5464] for the simulation of highly dissipative fluids, we propose in this work a splitting integration scheme for the direct simulation of solid particles suspended in a Newtonian liquid. The scheme separates the contributions of different forces acting on the solid particles. In particular, intermediate- and long-range multi-body hydrodynamic forces, which are computed from the discretization of the Navier-Stokes equations using the smoothed particle hydrodynamics (SPH) method, are taken into account using an explicit integration; for short-range lubrication forces, velocities of pairwise interacting solid particles are updated implicitly by sweeping over all the neighboring pairs iteratively, until convergence in the solution is obtained. By using the splitting integration, simulations can be run stably and efficiently up to very large solid particle concentrations. Moreover, the proposed scheme is not limited to the SPH method presented here, but can be easily applied to other simulation techniques employed for particulate suspensions.
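
    The implicit half of such a scheme can be pictured as pairwise velocity sweeps: each close pair's dissipative interaction is integrated exactly (damping the relative velocity while conserving pair momentum), and the sweeps repeat until the velocities stop changing. A schematic version with equal masses and a hypothetical scalar damping factor in place of the true lubrication law:

      import numpy as np

      def implicit_pair_sweeps(v, pairs, gamma_dt, tol=1e-10, max_sweeps=100):
          """Gauss-Seidel-like sweeps over interacting pairs: each update is
          an exact (hence unconditionally stable) step for a linear drag on
          the pair's relative velocity."""
          v = v.copy()
          decay = np.exp(-2.0 * gamma_dt)   # relative-velocity decay per step
          for _ in range(max_sweeps):
              v_old = v.copy()
              for i, j in pairs:
                  v_cm = 0.5 * (v[i] + v[j])    # conserved (equal masses)
                  v_rel = 0.5 * (v[i] - v[j])
                  v[i] = v_cm + decay * v_rel
                  v[j] = v_cm - decay * v_rel
              if np.max(np.abs(v - v_old)) < tol:
                  break
          return v

      v = np.array([[1.0, 0.0], [-1.0, 0.0], [0.5, 0.2]])
      pairs = [(0, 1), (1, 2)]              # overlapping pairs need sweeps
      v_new = implicit_pair_sweeps(v, pairs, gamma_dt=0.3)
      assert np.allclose(v.sum(0), v_new.sum(0))  # momentum conserved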

  5. New optimal asymmetric quantum codes constructed from constacyclic codes

    NASA Astrophysics Data System (ADS)

    Xu, Gen; Li, Ruihu; Guo, Luobin; Lü, Liangdong

    2017-02-01

    In this paper, we propose the construction of asymmetric quantum codes from two families of constacyclic codes over the finite field 𝔽_q² of code length n, where for the first family, q is an odd prime power of the form 4t + 1 (t ≥ 1 an integer) or 4t - 1 (t ≥ 2 an integer) and n_1 = (q² + 1)/2; for the second family, q is an odd prime power of the form 10t + 3 or 10t + 7 (t ≥ 0 an integer) and n_2 = (q² + 1)/5. As a result, families of new asymmetric quantum codes [[n,k,dz/dx

  6. Parallel algorithms for modeling flow in permeable media. Annual report, February 15, 1995 - February 14, 1996

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    G.A. Pope; K. Sephernoori; D.C. McKinney

    1996-03-15

    This report describes the application of distributed-memory parallel programming techniques to a compositional simulator called UTCHEM. The University of Texas Chemical Flooding reservoir simulator (UTCHEM) is a general-purpose vectorized chemical flooding simulator that models the transport of chemical species in three-dimensional, multiphase flow through permeable media. The parallel version of UTCHEM addresses solving large-scale problems by reducing the amount of time required to obtain the solution, as well as providing a flexible and portable programming environment. In this work, the original parallel version of UTCHEM was modified and ported to CRAY T3D and CRAY T3E distributed-memory multiprocessor computers using CRAY-PVM as the interprocessor communication library. Also, the data communication routines were modified such that the portability of the original code across different computer architectures was made possible.

  7. Parallel computing techniques for rotorcraft aerodynamics

    NASA Astrophysics Data System (ADS)

    Ekici, Kivanc

    The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, modifications to the implicit operator, Lower-Upper Symmetric Gauss-Seidel (LU-SGS), originally used in TURNS, are performed. Second, an inexact Newton method coupled with a Krylov subspace iterative method (Newton-Krylov method) is applied. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods in the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).
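
    The Newton-Krylov combination, an inexact Newton iteration whose linear solves use a Krylov method driven only by matrix-vector products, is available off the shelf in SciPy. A toy usage sketch (a small 1D convection-diffusion residual standing in for the discretized Navier-Stokes equations):

      import numpy as np
      from scipy.optimize import newton_krylov

      def residual(u):
          """Steady viscous-Burgers-like residual -u'' + u*u' - 1 = 0 on a
          uniform grid with zero Dirichlet boundaries (a placeholder for
          the residual a flow solver would supply)."""
          n = u.size
          h = 1.0 / (n + 1)
          up = np.concatenate(([0.0], u, [0.0]))
          diff = (up[:-2] - 2.0 * up[1:-1] + up[2:]) / h**2
          conv = up[1:-1] * (up[2:] - up[:-2]) / (2.0 * h)
          return -diff + conv - 1.0

      # Jacobian-free: each Newton step solves J du = -r with LGMRES using
      # only finite-difference products J*v -- the Newton-Krylov idea.
      u = newton_krylov(residual, np.zeros(100), method='lgmres', f_tol=1e-8)
      print("max residual:", np.max(np.abs(residual(u))))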

  8. Collisional disruptions of rotating targets

    NASA Astrophysics Data System (ADS)

    Ševeček, Pavel; Broz, Miroslav

    2017-10-01

    Collisions are key processes in the evolution of the Main Asteroid Belt, and impact events - i.e. target fragmentation and gravitational reaccumulation - are commonly studied by numerical simulations, namely by SPH and N-body methods. In our work, we extend the previous studies by assuming rotating targets, and we study the dependence of the resulting size distributions on the pre-impact rotation of the target. To obtain stable initial conditions, it is also necessary to include the self-gravity already in the fragmentation phase, which was previously neglected. To tackle this problem, we developed an SPH code, accelerated by SSE/AVX instruction sets and parallelized. The code solves the standard set of hydrodynamic equations, using the Tillotson equation of state, the von Mises criterion for plastic yielding, and the scalar Grady-Kipp model for fragmentation. We further modified the velocity gradient by a correction tensor (Schäfer et al. 2007) to ensure first-order conservation of the total angular momentum. As the intact target is a spherical body, its gravity can be approximated by the potential of a homogeneous sphere, making it easy to set up initial conditions. This is, however, infeasible for later stages of the disruption; to this end, we included the Barnes-Hut algorithm to compute the gravitational accelerations, using a multipole expansion of distant particles up to hexadecapole order. We tested the code carefully, comparing the results to our previous computations obtained with the SPH5 code (Benz and Asphaug 1994). Finally, we ran a set of simulations, and we discuss the difference between the synthetic families created by rotating and static targets.

  9. SPH numerical investigation of the characteristics of an oscillating hydraulic jump at an abrupt drop

    NASA Astrophysics Data System (ADS)

    De Padova, Diana; Mossa, Michele; Sibilla, Stefano

    2018-02-01

    This paper shows the results of smoothed particle hydrodynamics (SPH) modelling of the hydraulic jump at an abrupt drop, where the transition from supercritical to subcritical flow is characterised by several flow patterns depending upon the inflow and tailwater conditions. SPH simulations are obtained by a pseudo-compressible XSPH scheme with pressure smoothing; turbulent stresses are represented either by an algebraic mixing-length model, or by a two-equation k-ε model. The numerical model is applied to analyse the occurrence of oscillatory flow conditions between two different jump types characterised by quasi-periodic oscillation, and the results are compared with experiments performed at the hydraulics laboratory of Bari Technical University. The purpose of this paper is to obtain a deeper understanding of the physical features of a flow which is in general difficult to reproduce numerically, owing to its unstable character: in particular, vorticity and turbulent kinetic energy fields, velocity, water depth and pressure spectra downstream of the jump, and velocity and pressure cross-correlations can be computed and analysed.

  10. Time-Resolved 3D Quantitative Flow MRI of the Major Intracranial Vessels: Initial Experience and Comparative Evaluation at 1.5T and 3.0T in Combination With Parallel Imaging

    PubMed Central

    Bammer, Roland; Hope, Thomas A.; Aksoy, Murat; Alley, Marcus T.

    2012-01-01

    Exact knowledge of blood flow characteristics in the major cerebral vessels is of great relevance for diagnosing cerebrovascular abnormalities. This involves the assessment of hemodynamically critical areas as well as the derivation of biomechanical parameters such as wall shear stress and pressure gradients. A time-resolved, 3D phase-contrast (PC) MRI method using parallel imaging was implemented to measure blood flow in three dimensions at multiple instances over the cardiac cycle. The 4D velocity data obtained from 14 healthy volunteers were used to investigate dynamic blood flow with the use of multiplanar reformatting, 3D streamlines, and 4D particle tracing. In addition, the effects of magnetic field strength, parallel imaging, and temporal resolution on the data were investigated in a comparative evaluation at 1.5T and 3T using three different parallel imaging reduction factors and three different temporal resolutions in eight of the 14 subjects. Studies were consistently performed faster at 3T than at 1.5T because of better parallel imaging performance. A high temporal resolution (65 ms) was required to follow dynamic processes in the intracranial vessels. The 4D flow measurements provided a high degree of vascular conspicuity. Time-resolved streamline analysis provided features that have not been reported previously for the intracranial vasculature. PMID:17195166

  11. Collisionless stellar hydrodynamics as an efficient alternative to N-body methods

    NASA Astrophysics Data System (ADS)

    Mitchell, Nigel L.; Vorobyov, Eduard I.; Hensler, Gerhard

    2013-01-01

    The dominant constituents of the Universe's matter are believed to be collisionless in nature and thus their modelling in any self-consistent simulation is extremely important. For simulations that deal only with dark matter or stellar systems, the conventional N-body technique is fast, memory efficient and relatively simple to implement. However when extending simulations to include the effects of gas physics, mesh codes are at a distinct disadvantage compared to Smooth Particle Hydrodynamics (SPH) codes. Whereas implementing the N-body approach into SPH codes is fairly trivial, the particle-mesh technique used in mesh codes to couple collisionless stars and dark matter to the gas on the mesh has a series of significant scientific and technical limitations. These include spurious entropy generation resulting from discreteness effects, poor load balancing and increased communication overhead which spoil the excellent scaling in massively parallel grid codes. In this paper we propose the use of the collisionless Boltzmann moment equations as a means to model the collisionless material as a fluid on the mesh, implementing it into the massively parallel FLASH Adaptive Mesh Refinement (AMR) code. This approach which we term `collisionless stellar hydrodynamics' enables us to do away with the particle-mesh approach and since the parallelization scheme is identical to that used for the hydrodynamics, it preserves the excellent scaling of the FLASH code already demonstrated on peta-flop machines. We find that the classic hydrodynamic equations and the Boltzmann moment equations can be reconciled under specific conditions, allowing us to generate analytic solutions for collisionless systems using conventional test problems. We confirm the validity of our approach using a suite of demanding test problems, including the use of a modified Sod shock test. By deriving the relevant eigenvalues and eigenvectors of the Boltzmann moment equations, we are able to use high order

  12. Xyce parallel electronic simulator : users' guide.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a

  13. Negative regulation of prophenoloxidase (proPO) activation by a clip-domain serine proteinase homolog (SPH) from endoparasitoid venom.

    PubMed

    Zhang, Guangmei; Lu, Zhi-Qiang; Jiang, Haobo; Asgari, Sassan

    2004-05-01

    Most parasitic wasps inject maternal factors into the host hemocoel to suppress the host immune system and ensure successful development of their progeny. Melanization is one of the insect defence mechanisms against intruding pathogens or parasites. We previously isolated from the venom of Cotesia rubecula a 50 kDa protein that blocked melanization in the hemolymph of its host, Pieris rapae [Insect Biochem. Mol. Biol. 33 (2003) 1017]. This protein, designated Vn50, is a serine proteinase homolog (SPH) containing an amino-terminal clip domain. In this work, we demonstrated that recombinant Vn50 bound P. rapae hemolymph components that were recognized by antisera to Tenebrio molitor prophenoloxidase (proPO) and Manduca sexta proPO-activating proteinase (PAP). Vn50 is stable in the host hemolymph; it remained intact for at least 72 h after parasitization. Using M. sexta as a model system, we found that Vn50 efficiently down-regulated proPO activation mediated by M. sexta PAP-1, SPH-1, and SPH-2. Vn50 did not inhibit active phenoloxidase (PO) or PAP-1, but it significantly reduced the proteolysis of proPO. If recombinant Vn50 binds P. rapae proPO and PAP (as suggested by the antibody reactions), it is likely that the molecular interactions among M. sexta proPO, PAP-1, and SPHs were impaired by this venom protein. A similar strategy might be employed by C. rubecula to negatively impact the proPO activation reaction in its natural host.

  14. User's Guide for ENSAERO_FE Parallel Finite Element Solver

    NASA Technical Reports Server (NTRS)

    Eldred, Lloyd B.; Guruswamy, Guru P.

    1999-01-01

    A high-fidelity parallel static structural analysis capability is created and interfaced to the multidisciplinary analysis package ENSAERO-MPI of Ames Research Center. This new module replaces ENSAERO's lower-fidelity simple finite element and modal modules. Full aircraft structures may be more accurately modeled using the new finite element capability. Parallel computation is performed by breaking the full structure into multiple substructures. This approach is conceptually similar to ENSAERO's multizonal fluid analysis capability. The new substructure code is used to solve the structural finite element equations for each substructure in parallel. NASTRAN/COSMIC is utilized as a front end for this code. Its full library of elements can be used to create an accurate and realistic aircraft model. It is used to create the stiffness matrices for each substructure. The new parallel code then uses an iterative preconditioned conjugate gradient method to solve the global structural equations for the substructure boundary nodes.
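
    The interface solve can be sketched with an iterative preconditioned conjugate gradient in SciPy (a tridiagonal SPD stand-in for the assembled boundary-node equations, with a simple Jacobi preconditioner; the production code's preconditioner would be more elaborate):

      import numpy as np
      from scipy.sparse import diags
      from scipy.sparse.linalg import cg, LinearOperator

      n = 200
      K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
      f = np.ones(n)                       # boundary-node load vector

      # Jacobi preconditioner: approximate K^-1 by its inverse diagonal.
      d_inv = 1.0 / K.diagonal()
      M = LinearOperator((n, n), matvec=lambda x: d_inv * x)

      u, info = cg(K, f, M=M)
      assert info == 0                     # 0 means converged
      print("residual norm:", np.linalg.norm(K @ u - f))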

  15. Implementation of parallel moment equations in NIMROD

    NASA Astrophysics Data System (ADS)

    Lee, Hankyu Q.; Held, Eric D.; Ji, Jeong-Young

    2017-10-01

    As collisionality is low (the Knudsen number is large) in many plasma applications, kinetic effects become important, particularly in parallel dynamics for magnetized plasmas. Fluid models can capture some kinetic effects when integral parallel closures are adopted. The adiabatic and linear approximations are used in solving general moment equations to obtain the integral closures. In this work, we present an effort to incorporate non-adiabatic (time-dependent) and nonlinear effects into parallel closures. Instead of analytically solving the approximate moment system, we implement exact parallel moment equations in the NIMROD fluid code. The moment code is expected to provide a natural convergence scheme by increasing the number of moments. Work in collaboration with the PSI Center and supported by the U.S. DOE under Grant Nos. DE-SC0014033, DE-SC0016256, and DE-FG02-04ER54746.

  16. Parallel processing approach to transform-based image coding

    NASA Astrophysics Data System (ADS)

    Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.

    1991-06-01

    This paper describes a flexible parallel processing architecture designed for use in real-time video processing. The system consists of floating-point DSP processors connected to each other via fast serial links; each processor has access to a globally shared memory. A multiple-bus architecture in combination with a dual-ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform-based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed, and results are presented with image statistics and data rates. Finally, techniques for accelerating the system throughput are analyzed, and results from the application of one such modification are described.
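
    Transform coding decomposes naturally: the image is tiled into blocks, and each block is transformed and quantized independently, so blocks can be farmed out to processors. A schematic Python version with an 8x8 DCT (standing in for the paper's DSP pipeline; the quantization step q is a hypothetical choice):

      import numpy as np
      from scipy.fft import dctn, idctn
      from concurrent.futures import ProcessPoolExecutor

      B = 8  # block size

      def code_block(block, q=20.0):
          """Code one block: forward DCT, uniform quantization, inverse DCT."""
          c = dctn(block, norm='ortho')
          c = np.round(c / q) * q          # the lossy step
          return idctn(c, norm='ortho')

      def code_image(img, workers=4):
          h, w = img.shape                 # assumed divisible by B
          tiles = [img[i:i + B, j:j + B] for i in range(0, h, B)
                                         for j in range(0, w, B)]
          with ProcessPoolExecutor(max_workers=workers) as pool:
              coded = list(pool.map(code_block, tiles))
          out = np.empty_like(img)
          k = 0
          for i in range(0, h, B):
              for j in range(0, w, B):
                  out[i:i + B, j:j + B] = coded[k]
                  k += 1
          return out

      if __name__ == "__main__":
          img = np.random.default_rng(0).random((64, 64)) * 255.0
          rec = code_image(img)            # parallel block-coded image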

  17. Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

    NASA Technical Reports Server (NTRS)

    Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

    1991-01-01

    Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.

  18. Characterization of Cu(II) and Cd(II) resistance mechanisms in Sphingobium sp. PHE-SPH and Ochrobactrum sp. PHE-OCH and their potential application in the bioremediation of heavy metal-phenanthrene co-contaminated sites.

    PubMed

    Chen, Chen; Lei, Wenrui; Lu, Min; Zhang, Jianan; Zhang, Zhou; Luo, Chunling; Chen, Yahua; Hong, Qing; Shen, Zhenguo

    2016-04-01

    Soil that is co-contaminated with heavy metals (HMs) and polycyclic aromatic hydrocarbons (PAHs) is difficult to bioremediate due to the ability of toxic metals to inhibit PAH degradation by bacteria. We demonstrated the resistance mechanisms to Cu(II) and Cd(II) of two newly isolated strains, Sphingobium sp. PHE-SPH and Ochrobactrum sp. PHE-OCH, and further tested their potential application in the bioremediation of HM-phenanthrene (PhA) co-contaminated sites. The PHE-SPH and PHE-OCH strains tolerated 4.63 and 4.34 mM Cu(II) and also showed tolerance to 0.48 and 1.52 mM Cd(II), respectively. Diverse resistance patterns were detected between the two strains. In PHE-OCH cells, the maximum accumulation of Cu(II) occurred in the cell wall, while the maximum accumulation was in the cytoplasm of PHE-SPH cells. This resulted in a sudden suppression of growth in PHE-OCH and a gradual inhibition in PHE-SPH as the concentration of Cu(II) increased. Organic acid production was markedly higher in PHE-OCH than in PHE-SPH, which may also have a role in the resistance mechanisms, and contributes to the higher Cd(II) tolerance of PHE-OCH. The factors involved in the absorption of Cu(II) or Cd(II) in PHE-SPH and PHE-OCH were identified as proteins and carbohydrates by Fourier transform infrared (FT-IR) spectroscopy. Furthermore, both strains showed the ability to efficiently degrade PhA and maintained this high degradation efficiency under HM stress. The high tolerance to HMs and the PhA degradation capacity make Sphingobium sp. PHE-SPH and Ochrobactrum sp. PHE-OCH excellent candidate organisms for the bioremediation of HM-PhA co-contaminated sites.

  19. Use of Hilbert Curves in Parallelized CUDA code: Interaction of Interstellar Atoms with the Heliosphere

    NASA Astrophysics Data System (ADS)

    Destefano, Anthony; Heerikhuisen, Jacob

    2015-04-01

    Fully 3D particle simulations can be a computationally and memory expensive task, especially when high resolution grid cells are required. The problem becomes further complicated when parallelization is needed. In this work we focus on computational methods to solve these difficulties. Hilbert curves are used to map the 3D particle space to the 1D contiguous memory space. This method of organization allows for minimized cache misses on the GPU as well as a sorted structure that is equivalent to an octal tree data structure. This type of sorted structure is attractive for use in adaptive mesh implementations due to the logarithmic search time. Implementations using the Message Passing Interface (MPI) library and NVIDIA's parallel computing platform CUDA will be compared, as MPI is commonly used on server nodes with many CPUs. We will also compare static grid structures with those of adaptive mesh structures. The physical test bed will simulate heavy interstellar atoms interacting with a background plasma, the heliosphere, simulated with a fully consistent coupled MHD/kinetic particle code. It is known that charge exchange is an important factor in space plasmas; specifically, it modifies the structure of the heliosphere itself. We would like to thank the Alabama Supercomputer Authority for the use of their computational resources.
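    The mapping itself is a bit-manipulation recurrence. A minimal 2D version (the 3D case used for particle data is analogous, with more rotation cases) is the classic rotate-and-fold algorithm:

      def hilbert_index(order, x, y):
          """Map cell (x, y) on a 2^order x 2^order grid to its 1D position
          along the Hilbert curve (classic rotate-and-fold algorithm)."""
          d = 0
          s = 2 ** (order - 1)
          while s > 0:
              rx = 1 if x & s else 0
              ry = 1 if y & s else 0
              d += s * s * ((3 * rx) ^ ry)
              if ry == 0:            # rotate the quadrant so the curve stays contiguous
                  if rx == 1:
                      x, y = s - 1 - x, s - 1 - y
                  x, y = y, x
              s //= 2
          return d

      # The four cells of a 2x2 grid come out in Hilbert order:
      print(sorted(((x, y) for x in range(2) for y in range(2)),
                   key=lambda c: hilbert_index(1, *c)))
      # -> [(0, 0), (0, 1), (1, 1), (1, 0)]

    Sorting particles by this index places spatial neighbours close together in memory, which is what reduces cache misses in the GPU kernels.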

  20. Overview of SPH-ALE applications for hydraulic turbines in ANDRITZ Hydro

    NASA Astrophysics Data System (ADS)

    Rentschler, M.; Marongiu, J. C.; Neuhauser, M.; Parkinson, E.

    2018-02-01

    Over the past 13 years, ANDRITZ Hydro has developed an in-house tool based on the SPH-ALE method for applications in flow simulations in hydraulic turbines. The initial motivation is related to the challenging simulation of free surface flows in Pelton turbines, where highly dynamic water jets interact with rotating buckets, creating thin water jets traveling inside the housing and possibly causing disturbances on the runner. The present paper proposes an overview of industrial applications allowed by the developed tool, including design evaluation of Pelton runners and casings, transient operation of Pelton units and free surface flows in hydraulic structures.

  1. Parallel Coding of First- and Second-Order Stimulus Attributes by Midbrain Electrosensory Neurons

    PubMed Central

    McGillivray, Patrick; Vonderschen, Katrin; Fortune, Eric S.; Chacron, Maurice J.

    2015-01-01

    Natural stimuli often have time-varying first-order (i.e., mean) and second-order (i.e., variance) attributes that each carry critical information for perception and can vary independently over orders of magnitude. Experiments have shown that sensory systems continuously adapt their responses based on changes in each of these attributes. This adaptation creates ambiguity in the neural code as multiple stimuli may elicit the same neural response. While parallel processing of first- and second-order attributes by separate neural pathways is sufficient to remove this ambiguity, the existence of such pathways and the neural circuits that mediate their emergence have not been uncovered to date. We recorded the responses of midbrain electrosensory neurons in the weakly electric fish Apteronotus leptorhynchus to stimuli with first- and second-order attributes that varied independently in time. We found three distinct groups of midbrain neurons: the first group responded to both first- and second-order attributes, the second group responded selectively to first-order attributes, and the last group responded selectively to second-order attributes. In contrast, all afferent hindbrain neurons responded to both first- and second-order attributes. Using computational analyses, we show how inputs from a heterogeneous population of ON- and OFF-type afferent neurons are combined to give rise to response selectivity to either first- or second-order stimulus attributes in midbrain neurons. Our study thus uncovers, for the first time, generic and widely applicable mechanisms by which parallel processing of first- and second-order stimulus attributes emerges in the brain. PMID:22514313

  2. Parallel computing on Unix workstation arrays

    NASA Astrophysics Data System (ADS)

    Reale, F.; Bocchino, F.; Sciortino, S.

    1994-12-01

    We have tested arrays of general-purpose Unix workstations used as MIMD systems for massive parallel computations. In particular we have solved numerically a demanding test problem with a 2D hydrodynamic code, generally developed to study astrophysical flows, by executing it on arrays either of DECstations 5000/200 on Ethernet LAN, or of DECstations 3000/400, equipped with powerful Alpha processors, on FDDI LAN. The code is appropriate for data-domain decomposition, and we have used a library for parallelization previously developed in our Institute, and easily extended to work on Unix workstation arrays by using the PVM software toolset. We have compared the parallel efficiencies obtained on arrays of several processors to those obtained on a dedicated MIMD parallel system, namely a Meiko Computing Surface (CS-1), equipped with Intel i860 processors. We discuss the feasibility of using non-dedicated parallel systems and conclude that the convenience depends essentially on the size of the computational domain as compared to the relative processor power and network bandwidth. We point out that for future perspectives a parallel development of processor and network technology is important, and that the software still offers great opportunities of improvement, especially in terms of latency times in the message-passing protocols. In conditions of significant gain in terms of speedup, such workstation arrays represent a cost-effective approach to massive parallel computations.
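    The convenience criterion discussed above is usually quantified with the standard speedup and efficiency metrics; in LaTeX notation (generic definitions, not the paper's exact model):

      S(P) = \frac{T_1}{T_P}, \qquad
      E(P) = \frac{S(P)}{P}, \qquad
      T_P \approx \frac{T_1}{P} + T_{\mathrm{comm}}(P)

    For a 2D domain decomposition the compute term scales with the area of each subdomain while T_comm scales with its perimeter and the network latency and bandwidth, so efficiency improves as the computational domain grows relative to processor power and network speed, which is the trade-off the authors observe.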

  3. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA AMES Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  4. Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

    NASA Astrophysics Data System (ADS)

    de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl

    2002-03-01

    We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 10^5-10^6) in reasonable CPU times. The performances of our implementation of the parallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.

  5. Low Density Parity Check Codes: Bandwidth Efficient Channel Coding

    NASA Technical Reports Server (NTRS)

    Fong, Wai; Lin, Shu; Maki, Gary; Yeh, Pen-Shu

    2003-01-01

    Low Density Parity Check (LDPC) Codes provide near-Shannon-capacity performance for NASA missions. These codes have high coding rates R = 0.82 and 0.875 with moderate code lengths, n = 4096 and 8176. Their decoders have inherently parallel structures which allow for high-speed implementation. Two codes based on Euclidean Geometry (EG) were selected for flight ASIC implementation. These codes are cyclic and quasi-cyclic in nature and therefore have a simple encoder structure. This results in power and size benefits. These codes also have a large minimum distance, as much as d_min = 65, giving them powerful error-correcting capabilities and very low BER error floors. This paper will present development of the LDPC flight encoder and decoder, its applications and status.
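    The parity-check structure behind such codes can be shown on a toy example. The sketch below uses a small dense H (in fact the (7,4) Hamming code) purely for illustration; flight LDPC codes use very large, sparse, quasi-cyclic H matrices, which is what reduces encoders to shift registers and makes decoders parallel:

      import numpy as np

      # Toy parity-check matrix H over GF(2); each row is one parity check.
      H = np.array([[1, 1, 0, 1, 1, 0, 0],
                    [1, 0, 1, 1, 0, 1, 0],
                    [0, 1, 1, 1, 0, 0, 1]])

      def syndrome(word):
          # A word is a valid codeword iff every parity check is satisfied.
          return H @ word % 2

      codeword = np.array([1, 0, 1, 0, 1, 0, 1])
      received = codeword.copy()
      received[2] ^= 1                   # flip one bit to mimic channel noise
      print(syndrome(codeword))          # [0 0 0]: all checks satisfied
      print(syndrome(received))          # [0 1 1]: failed checks flag the error

    Iterative LDPC decoding passes messages back and forth between such checks and the bits they touch; because each check involves only a few bits, the updates can run concurrently, which is the inherently parallel structure the flight decoder exploits.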

  6. SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

    NASA Technical Reports Server (NTRS)

    Cooke, Daniel; Rushton, Nelson

    2013-01-01

    With the introduction of new parallel architectures like the cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less

  7. Parallel computation with the force

    NASA Technical Reports Server (NTRS)

    Jordan, H. F.

    1985-01-01

    A methodology, called the force, supports the construction of programs to be executed in parallel by a force of processes. The number of processes in the force is unspecified, but potentially very large. The force idea is embodied in a set of macros which produce multiprocessor FORTRAN code and has been studied on two shared memory multiprocessors of fairly different character. The method has simplified the writing of highly parallel programs within a limited class of parallel algorithms and is being extended to cover a broader class. The individual parallel constructs which comprise the force methodology are discussed. Of central concern are their semantics, implementation on different architectures and performance implications.

  8. Synthesizing parallel imaging applications using the CAP (computer-aided parallelization) tool

    NASA Astrophysics Data System (ADS)

    Gennart, Benoit A.; Mazzariol, Marc; Messerli, Vincent; Hersch, Roger D.

    1997-12-01

    Imaging applications such as filtering, image transforms and compression/decompression require vast amounts of computing power when applied to large data sets. These applications would potentially benefit from the use of parallel processing. However, dedicated parallel computers are expensive and their processing power per node lags behind that of the most recent commodity components. Furthermore, developing parallel applications remains a difficult task: writing and debugging the application is difficult (deadlocks), programs may not be portable from one parallel architecture to the other, and performance often falls short of expectations. In order to facilitate the development of parallel applications, we propose the CAP computer-aided parallelization tool which enables application programmers to specify at a high level of abstraction the flow of data between pipelined-parallel operations. In addition, the CAP tool supports the programmer in developing parallel imaging and storage operations. CAP makes it possible to combine efficiently parallel storage access routines with sequential image-processing operations. This paper shows how processing and I/O intensive imaging applications must be implemented to take advantage of parallelism and pipelining between data access and processing. This paper's contribution is (1) to show how such implementations can be compactly specified in CAP, and (2) to demonstrate that CAP-specified applications achieve the performance of custom parallel code. The paper analyzes theoretically the performance of CAP-specified applications and demonstrates the accuracy of the theoretical analysis through experimental measurements.

  9. Xyce parallel electronic simulator users guide, version 6.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message-passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  10. Xyce Parallel Electronic Simulator Users' Guide Version 6.8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message-passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  11. Xyce parallel electronic simulator users guide, version 6.0.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message-passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  12. The Particle Accelerator Simulation Code PyORBIT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gorlov, Timofey V; Holmes, Jeffrey A; Cousineau, Sarah M

    2015-01-01

    The particle accelerator simulation code PyORBIT is presented. The structure, implementation, history, parallel and simulation capabilities, and future development of the code are discussed. The PyORBIT code is a new implementation and extension of algorithms of the original ORBIT code that was developed for the Spallation Neutron Source accelerator at the Oak Ridge National Laboratory. The PyORBIT code has a two-level structure. The upper level uses the Python programming language to control the flow of intensive calculations performed by the lower level code implemented in the C++ language. The parallel capabilities are based on MPI communications. PyORBIT is an open-source code accessible to the public through the Google Open Source Projects Hosting service.

  13. Osmotic regulation of expression of two extracellular matrix-binding proteins and a haemolysin of Leptospira interrogans: differential effects on LigA and Sph2 extracellular release.

    PubMed

    Matsunaga, James; Medeiros, Marco A; Sanchez, Yolanda; Werneid, Kristian F; Ko, Albert I

    2007-10-01

    The life cycle of the pathogen Leptospira interrogans involves stages outside and inside the host. Entry of L. interrogans from moist environments into the host is likely to be accompanied by the induction of genes encoding virulence determinants and the concomitant repression of genes encoding products required for survival outside of the host. The expression of the adhesin LigA, the haemolysin Sph2 (Lk73.5) and the outer-membrane lipoprotein LipL36 of pathogenic Leptospira species have been reported to be regulated by mammalian host signals. A previous study demonstrated that raising the osmolarity of the leptospiral growth medium to physiological levels encountered in the host by addition of various salts enhanced the levels of cell-associated LigA and LigB and extracellular LigA. In this study, we systematically examined the effects of osmotic upshift with ionic and non-ionic solutes on expression of the known mammalian host-regulated leptospiral genes. The levels of cell-associated LigA, LigB and Sph2 increased at physiological osmolarity, whereas LipL36 levels decreased, corresponding to changes in specific transcript levels. These changes in expression occurred irrespective of whether sodium chloride or sucrose was used as the solute. The increase of cellular LigA, LigB and Sph2 protein levels occurred within hours of adding sodium chloride. Extracellular Sph2 levels increased when either sodium chloride or sucrose was added to achieve physiological osmolarity. In contrast, enhanced levels of extracellular LigA were observed only with an increase in ionic strength. These results indicate that the mechanisms for release of LigA and Sph2 differ during host infection. Thus, osmolarity not only affects leptospiral gene expression by affecting transcript levels of putative virulence determinants but also affects the release of such proteins into the surroundings.

  14. Transferring ecosystem simulation codes to supercomputers

    NASA Technical Reports Server (NTRS)

    Skiles, J. W.; Schulbach, C. H.

    1995-01-01

    Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectoring and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.

  15. A micro-macro coupling approach of MD-SPH method for reactive energetic materials

    NASA Astrophysics Data System (ADS)

    Liu, Gui Rong; Wang, Guang Yu; Peng, Qing; De, Suvranu

    2017-01-01

    The simulation of reactive energetic materials has long been the interest of researchers because of the extensive applications of explosives. Much research has been done on the subject at the macro scale in the past, and research at the micro scale has been initiated recently. The equation of state (EoS) is the relation between physical quantities (pressure, temperature, energy and volume) describing thermodynamic states of materials under a given set of conditions. It plays a significant role in determining the characteristics of energetic materials, including the Chapman-Jouguet point and the detonation velocity. Furthermore, the EoS is the key to connecting microscopic and macroscopic phenomena when simulating the macro effects of an explosion. For instance, an ignition and growth model for high explosives uses two JWL EoSs, one for the solid explosive and the other for the gaseous products, which are often obtained from experiments that can be quite expensive and hazardous. Therefore, it is desirable to calculate the EoS of energetic materials through computational means. In this paper, the EoSs for both the solid and the gaseous products of β-HMX are calculated using molecular dynamics simulation with ReaxFF-d3, a reactive force field obtained from quantum mechanics. The microscopic simulation results are then compared with experiments and the continuum ignition and growth model. Good agreement is observed. Then, the EoSs obtained through micro-scale simulation are applied in a smoothed particle hydrodynamics (SPH) code to simulate the macro effects of explosions. Simulation results are compared with experiments.
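    For reference, the JWL form mentioned above relates the pressure p to the relative volume V and the internal energy E through five fitted constants A, B, R_1, R_2 and ω (this is the standard textbook form, not this paper's specific fits):

      p = A\left(1 - \frac{\omega}{R_1 V}\right) e^{-R_1 V}
        + B\left(1 - \frac{\omega}{R_2 V}\right) e^{-R_2 V}
        + \frac{\omega E}{V}

    An ignition-and-growth model carries one such parameter set for the unreacted solid and another for the gaseous products, blended through the reacted mass fraction; calibrating these constants from MD-derived data is what removes the need for expensive and hazardous experiments.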

  16. Efficient Parallel Formulations of Hierarchical Methods and Their Applications

    NASA Astrophysics Data System (ADS)

    Grama, Ananth Y.

    1996-01-01

    Hierarchical methods such as the Fast Multipole Method (FMM) and Barnes-Hut (BH) are used for rapid evaluation of potential (gravitational, electrostatic) fields in particle systems. They are also used for solving integral equations using boundary element methods. The linear systems arising from these methods are dense and are solved iteratively. Hierarchical methods reduce the complexity of the core matrix-vector product from O(n^2) to O(n log n) and the memory requirement from O(n^2) to O(n). We have developed highly scalable parallel formulations of a hybrid FMM/BH method that are capable of handling arbitrarily irregular distributions. We apply these formulations to astrophysical simulations of Plummer and Gaussian galaxies. We have used our parallel formulations to solve the integral form of the Laplace equation. We show that our parallel hierarchical mat-vecs yield high efficiency and overall performance even on relatively small problems. A problem containing approximately 200K nodes takes under a second to compute on 256 processors and yet yields over 85% efficiency. The efficiency and raw performance are expected to increase for bigger problems. For the 200K node problem, our code delivers about 5 GFLOPS of performance on a 256 processor T3D. This is impressive considering the fact that the problem has floating point divides and roots, and very little locality, resulting in poor cache performance. A dense matrix-vector product of the same dimensions would require about 0.5 TeraBytes of memory and about 770 TeraFLOPS of computing speed. Clearly, if the loss in accuracy resulting from the use of hierarchical methods is acceptable, our code yields significant savings in time and memory. We also study the convergence of a GMRES solver built around this mat-vec. We accelerate the convergence of the solver using three preconditioning techniques: diagonal scaling, block-diagonal preconditioning, and inner-outer preconditioning. We study the performance and parallel
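    The core of a Barnes-Hut style evaluation is the multipole acceptance test: a cell of size s at distance d is treated as a single point mass whenever s/d falls below an opening angle. A minimal sketch with the monopole term only (the Node layout and THETA value are illustrative assumptions; FMM additionally carries higher multipole expansions):

      import numpy as np
      from dataclasses import dataclass, field

      THETA = 0.5   # opening angle; the s/d < THETA test cuts O(n^2) to O(n log n)

      @dataclass
      class Node:
          com: np.ndarray            # centre of mass of the cell
          mass: float
          size: float                # side length s of the cell
          children: list = field(default_factory=list)

      def accel(node, pos, eps=1e-3):
          d = node.com - pos
          r = float(np.linalg.norm(d)) + eps       # softened distance
          if not node.children or node.size / r < THETA:
              return node.mass * d / r**3          # far field: one point mass
          return sum(accel(c, pos, eps) for c in node.children)  # open the cell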

  17. A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

    1998-01-01

    Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools, an interactive computer-aided parallelization tool that generates message passing code; 2) the Portland Group's HPF compiler; and 3) compiler directives with the native FORTRAN77 compiler on the SGI Origin2000.

  18. Improve load balancing and coding efficiency of tiles in high efficiency video coding by adaptive tile boundary

    NASA Astrophysics Data System (ADS)

    Chan, Chia-Hsin; Tu, Chun-Chuan; Tsai, Wen-Jiin

    2017-01-01

    High efficiency video coding (HEVC) not only improves the coding efficiency drastically compared to the well-known H.264/AVC but also introduces coding tools for parallel processing, one of which is tiles. Tile partitioning is allowed to be arbitrary in HEVC, but how to decide tile boundaries remains an open issue. An adaptive tile boundary (ATB) method is proposed to select a better tile partitioning to improve load balancing (ATB-LoadB) and coding efficiency (ATB-Gain) with a unified scheme. Experimental results show that, compared to ordinary uniform-space partitioning, the proposed ATB can save up to 17.65% of encoding times in parallel encoding scenarios and can reduce up to 0.8% of total bit rates for coding efficiency.
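    A simple way to see the load-balancing half of the idea: given per-CTU-column cost estimates (e.g. encoding times from the previous frame), choose tile boundaries so each tile carries roughly equal cost rather than equal area. The greedy sketch below illustrates that principle only; it is not the paper's actual ATB algorithm, and the function name and cost source are assumptions:

      def balance_boundaries(ctu_cost, n_tiles):
          """Place tile-column boundaries so each tile gets roughly
          the same estimated encoding cost (greedy illustration)."""
          target = sum(ctu_cost) / n_tiles
          boundaries, acc = [], 0.0
          for i, c in enumerate(ctu_cost):
              acc += c
              if acc >= target and len(boundaries) < n_tiles - 1:
                  boundaries.append(i + 1)   # boundary placed after column i
                  acc = 0.0
          return boundaries

      # Costs skewed toward the right half of the frame: the balanced
      # boundary lands at column 6 instead of the uniform split at 4.
      print(balance_boundaries([1, 1, 1, 2, 4, 4, 5, 6], 2))  # -> [6]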

  19. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated

  20. tRNA-Derived Small RNA: A Novel Regulatory Small Non-Coding RNA.

    PubMed

    Li, Siqi; Xu, Zhengping; Sheng, Jinghao

    2018-05-10

    Deep analysis of next-generation sequencing data unveils numerous small non-coding RNAs with distinct functions. Recently, fragments derived from tRNA, termed tRNA-derived small RNAs (tsRNAs), have attracted broad attention. There are two main types of tsRNAs, tRNA-derived stress-induced RNA (tiRNA) and tRNA-derived fragment (tRF), which differ in the cleavage position of the precursor or mature tRNA transcript. Emerging evidence has shown that tsRNAs are not merely tRNA degradation debris but have been recognized to play regulatory roles in many specific physiological and pathological processes. In this review, we summarize the biogeneses of various tsRNAs, present the emerging concepts regarding functions and mechanisms of action of tsRNAs, highlight the potential application of tsRNAs in human diseases, and put forward the current problems and future research directions.

  1. Spatio-Temporal Process Simulation of Dam-Break Flood Based on SPH

    NASA Astrophysics Data System (ADS)

    Wang, H.; Ye, F.; Ouyang, S.; Li, Z.

    2018-04-01

    On the basis of introducing the SPH (Smoothed Particle Hydrodynamics) simulation method, this paper presents solutions to the key research problems, namely the spatial and temporal scales suited to GIS (Geographical Information System) applications, the boundary condition equations combined with the underlying surface, and the kernel function and parameters applicable to dam-break flood simulation. On this basis, a calculation method for spatio-temporal process emulation of dam-break floods with elaborate particles was proposed, and the spatio-temporal process was dynamically simulated using GIS modelling and visualization. The results show that the method captures more information and reflects real situations more objectively.
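    The kernel function referred to above is typically the cubic spline used throughout the SPH literature. A minimal sketch (2D normalisation shown; the paper's tuned kernel and parameters may differ):

      import numpy as np

      def cubic_spline_W(r, h):
          """Standard cubic-spline SPH smoothing kernel W(r, h) in 2D."""
          sigma = 10.0 / (7.0 * np.pi * h**2)     # 2D normalisation constant
          q = r / h
          if q < 1.0:
              return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
          if q < 2.0:
              return sigma * 0.25 * (2.0 - q)**3
          return 0.0                               # compact support: W = 0 beyond 2h

      # A field quantity at particle i is a kernel-weighted sum over neighbours:
      #   A_i = sum_j (m_j / rho_j) * A_j * W(|r_i - r_j|, h)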

  2. Transformation of dinitrosyl iron complexes [(NO)₂Fe(SR)₂]⁻ (R = Et, Ph) into [4Fe-4S] Clusters [Fe₄S₄(SPh)₄]²⁻: relevance to the repair of the nitric oxide-modified ferredoxin [4Fe-4S] clusters.

    PubMed

    Tsou, Chih-Chin; Lin, Zong-Sian; Lu, Tsai-Te; Liaw, Wen-Feng

    2008-12-17

    Transformation of dinitrosyl iron complexes (DNICs) [(NO)₂Fe(SR)₂]⁻ (R = Et, Ph) into [4Fe-4S] clusters [Fe₄S₄(SPh)₄]²⁻ in the presence of [Fe(SPh)₄]²⁻/¹⁻ and the S-donor species S₈ via the reassembling process ([(NO)₂Fe(SR)₂]⁻ → [Fe₄S₃(NO)₇]⁻ (1)/[Fe₄S₃(NO)₇]²⁻ (2) → [Fe₄S₄(NO)₄]²⁻ (3) → [Fe₄S₄(SPh)₄]²⁻ (5)) was demonstrated. Reaction of [(NO)₂Fe(SR)₂]⁻ (R = Et, Ph) with S₈ in THF, followed by the addition of HBF₄ into the mixture solution, yielded complex [Fe₄S₃(NO)₇]⁻ (1). Complex [Fe₄S₃(NO)₇]²⁻ (2), obtained from reduction of complex 1 by [Na][biphenyl], was converted into complex [Fe₄S₄(NO)₄]²⁻ (3) along with the byproduct [(NO)₂Fe(SR)₂]⁻ via the proposed [Fe₄S₃(SPh)(NO)₄]²⁻ intermediate upon treating complex 2 with 1.5 equiv of [Fe(SPh)₄]²⁻ and the subsequent addition of 1/8 equiv of S₈ in CH₃CN at ambient temperature. Complex 3 was characterized by IR, UV-vis, and single-crystal X-ray diffraction. Upon addition of complex 3 to the CH₃CN solution of [Fe(SPh)₄]⁻ in a 1:2 molar ratio at ambient temperature, the rapid NO radical-thiyl radical exchange reaction between complex 3 and the biomimetic oxidized form of rubredoxin, [Fe(SPh)₄]⁻, occurred, leading to the simultaneous formation of the [4Fe-4S] cluster [Fe₄S₄(SPh)₄]²⁻ (5) and the DNIC [(NO)₂Fe(SPh)₂]⁻. This result demonstrates a successful biomimetic reassembly of the [4Fe-4S] cluster [Fe₄S₄(SPh)₄]²⁻ from NO-modified [Fe-S] clusters, relevant to the repair of DNICs derived from nitrosylation of the [4Fe-4S] clusters of endonuclease III back to [4Fe-4S] clusters upon addition of ferrous ion, cysteine, and IscS.

  3. FLiT: a field line trace code for magnetic confinement devices

    NASA Astrophysics Data System (ADS)

    Innocente, P.; Lorenzini, R.; Terranova, D.; Zanca, P.

    2017-04-01

    This paper presents a field line tracing code (FLiT) developed to study particle and energy transport as well as other phenomena related to magnetic topology in reversed-field pinch (RFP) and tokamak experiments. The code computes magnetic field lines in toroidal geometry using curvilinear coordinates (r, ϑ, ϕ) and calculates the intersections of these field lines with specified planes. The code also computes the magnetic and thermal diffusivity due to a stochastic magnetic field in the collisionless limit. Compared to Hamiltonian codes, there are no constraints on the functional form of the magnetic field, which allows the integration of whatever magnetic field is required. The code uses the magnetic field computed by solving the zeroth-order axisymmetric equilibrium and the Newcomb equation for the first-order helical perturbation matching the edge magnetic field measurements in toroidal geometry. Two algorithms are developed to integrate the field lines: one is a dedicated implementation of a first-order semi-implicit volume-preserving integration method, and the other is based on the Adams-Moulton predictor-corrector method. As expected, the volume-preserving algorithm is accurate in conserving divergence but slow, because the low integration order requires small steps. The second algorithm proves to be quite fast and is able to integrate the field lines accurately in many partially and fully stochastic configurations. The code has already been used to study the core and edge magnetic topology of the RFX-mod device in both the reversed-field pinch and tokamak magnetic configurations.
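    Field line tracing amounts to integrating dx/ds = B(x)/|B(x)| along the arc length s. As a generic sketch only (FLiT's actual integrators are the volume-preserving scheme and the Adams-Moulton method described above, not RK4):

      import numpy as np

      def trace_field_line(B, x0, ds=1e-3, n_steps=10000):
          """Integrate dx/ds = B(x)/|B(x)| with classic fourth-order Runge-Kutta.
          B  : callable returning the magnetic field vector at a point
          x0 : starting point of the field line"""
          def bhat(x):
              b = B(x)
              return b / np.linalg.norm(b)          # unit vector along the field
          path = [np.asarray(x0, dtype=float)]
          for _ in range(n_steps):
              x = path[-1]
              k1 = bhat(x)
              k2 = bhat(x + 0.5 * ds * k1)
              k3 = bhat(x + 0.5 * ds * k2)
              k4 = bhat(x + ds * k3)
              path.append(x + ds / 6.0 * (k1 + 2*k2 + 2*k3 + k4))
          return np.array(path)

    Puncture plots are then obtained by recording where consecutive path points straddle a chosen plane, e.g. a fixed toroidal angle.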

  4. A New Low Mass for the Hercules dSph: The End of a Common Mass Scale for the Dwarfs?

    NASA Astrophysics Data System (ADS)

    Adén, D.; Wilkinson, M. I.; Read, J. I.; Feltzing, S.; Koch, A.; Gilmore, G. F.; Grebel, E. K.; Lundström, I.

    2009-11-01

    We present a new mass estimate for the Hercules dwarf spheroidal (dSph) galaxy, based on the revised velocity dispersion obtained by Adén et al. The removal of a significant foreground contamination using newly acquired Strömgren photometry has resulted in a reduced velocity dispersion. Using this new velocity dispersion of 3.72 ± 0.91 km s^{-1}, we find a mass of M_{300} = 1.9^{+1.1}_{-0.8} × 10^6 M_⊙ within the central 300 pc, which is also the half-light radius, and a mass of M_{433} = 3.7^{+2.2}_{-1.6} × 10^6 M_⊙ within the reach of our data to 433 pc, significantly lower than previous estimates. We derive an overall mass-to-light ratio of M_{433}/L = 103^{+83}_{-48} [M_⊙/L_⊙]. Our mass estimate calls into question recent claims of a common mass scale for dSph galaxies. Additionally, we find tentative evidence for a velocity gradient in our kinematic data of 16 ± 3 km s^{-1} kpc^{-1}, and evidence of an asymmetric extension in the light distribution at ~0.5 kpc. We explore the possibility that these features are due to tidal interactions with the Milky Way. We show that there is a self-consistent model in which Hercules has an assumed tidal radius of r_t = 485 pc, an orbital pericenter of r_p = 18.5 ± 5 kpc, and a mass within r_t of M_{tid,r_t} = 5.2^{+2.7}_{-2.7} × 10^6 M_⊙. Proper motions are required to test this model. Although we cannot exclude models in which Hercules contains no dark matter, we argue that Hercules is more likely to be a dark-matter-dominated system that is currently experiencing some tidal disturbance of its outer parts.

  5. Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message-passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.

  6. Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver

    NASA Technical Reports Server (NTRS)

    Baggag, Abdelkader; Atkins, Harold; Oezturan, Can; Keyes, David

    1999-01-01

    A computational aeroacoustics code based on the discontinuous Galerkin method is ported to several parallel platforms using MPI. The discontinuous Galerkin method is a compact high-order method that retains its accuracy and robustness on non-smooth unstructured meshes. In its semi-discrete form, the discontinuous Galerkin method can be combined with explicit time marching methods, making it well suited to time accurate computations. The compact nature of the discontinuous Galerkin method also makes it well suited for distributed memory parallel platforms. The original serial code was written using an object-oriented approach and was previously optimized for cache-based machines. The port to parallel platforms was achieved simply by treating partition boundaries as a type of boundary condition. Code modifications were minimal because boundary conditions were abstractions in the original program. Scalability results are presented for the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. Slightly superlinear speedup is achieved on a fixed-size problem on the Origin, due to cache effects.

  7. A high-speed linear algebra library with automatic parallelism

    NASA Technical Reports Server (NTRS)

    Boucher, Michael L.

    1994-01-01

    Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely small even though there are numerous computationally demanding programs that would significantly benefit from application of parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.

  8. 3D-radiative transfer in terrestrial atmosphere: An efficient parallel numerical procedure

    NASA Astrophysics Data System (ADS)

    Bass, L. P.; Germogenova, T. A.; Nikolaeva, O. V.; Kokhanovsky, A. A.; Kuznetsov, V. S.

    2003-04-01

    , V. V., 1972: Light scattering in planetary atmosphere. M.: Nauka.
    [2] Evans, K. F., 1998: The spherical harmonic discrete ordinate method for three-dimensional atmospheric radiative transfer. J. Atmos. Sci., 55, 429-446.
    [3] Bass, L. P., Germogenova, T. A., Kuznetsov, V. S., Nikolaeva, O. V.: RADUGA 5.1 and RADUGA 5.1(P) codes for stationary transport equation solution in 2D and 3D geometries on one- and multiprocessor computers. Report at the seminar "Algorithms and Codes for Neutron-Physical Calculations of Nuclear Reactors" (Neutronica 2001), Obninsk, Russia, 30 October - 2 November 2001.
    [4] Germogenova, T. A., Bass, L. P., Kuznetsov, V. S., Nikolaeva, O. V.: Mathematical modeling on parallel computers of solar and laser radiation transport in a 3D atmosphere. Report at the International Symposium of the CIS countries "Atmospheric Radiation", 18-21 June 2002, St. Petersburg, Russia, p. 15-16.
    [5] Bass, L. P., Germogenova, T. A., Nikolaeva, O. V., Kuznetsov, V. S.: Radiative Transfer Universal 2D-3D Code RADUGA 5.1(P) for Multiprocessor Computers. Abstract, poster report at this meeting.
    [6] Bass, L. P., Nikolaeva, O. V.: Correct calculation of Angular Flux Distribution in Strongly Heterogeneous Media and Voids. Proc. of the Joint International Conference on Mathematical Methods and Supercomputing for Nuclear Applications, Saratoga Springs, New York, October 5-9, 1997, p. 995-1004.
    [7] http://www/jscc.ru

  9. Interfacing Computer Aided Parallelization and Performance Analysis

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)

    2003-01-01

    When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example of how the interface helps within the program development cycle of a benchmark code.

  10. Xyce parallel electronic simulator users' guide, Version 6.0.1.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message-passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  11. T-cell receptor transfer into human T cells with ecotropic retroviral vectors.

    PubMed

    Koste, L; Beissert, T; Hoff, H; Pretsch, L; Türeci, Ö; Sahin, U

    2014-05-01

    Adoptive T-cell transfer for cancer immunotherapy requires genetic modification of T cells with recombinant T-cell receptors (TCRs). Amphotropic retroviral vectors (RVs) used for TCR transduction for this purpose are considered safe in principle. Despite this, TCR-coding and packaging vectors could theoretically recombine to produce replication competent vectors (RCVs), and transduced T-cell preparations must be proven free of RCV. To eliminate the need for RCV testing, we transduced human T cells with ecotropic RVs so potential RCV would be non-infectious for human cells. We show that transfection of synthetic messenger RNA encoding murine cationic amino-acid transporter 1 (mCAT-1), the receptor for murine retroviruses, enables efficient transient ecotropic transduction of human T cells. mCAT-1-dependent transduction was more efficient than amphotropic transduction performed in parallel, and preferentially targeted naive T cells. Moreover, we demonstrate that ecotropic TCR transduction results in antigen-specific restimulation of primary human T cells. Thus, ecotropic RVs represent a versatile, safe and potent tool to prepare T cells for adoptive transfer.

  12. Protein modeling and molecular dynamics simulation of the two novel surfactant proteins SP-G and SP-H.

    PubMed

    Rausch, Felix; Schicht, Martin; Bräuer, Lars; Paulsen, Friedrich; Brandt, Wolfgang

    2014-11-01

    Surfactant proteins are well known from the human lung where they are responsible for the stability and flexibility of the pulmonary surfactant system. They are able to influence the surface tension of the gas-liquid interface specifically by directly interacting with single lipids. This work describes the generation of reliable protein structure models to support the experimental characterization of two novel putative surfactant proteins called SP-G and SP-H. The obtained protein models were complemented by predicted posttranslational modifications and placed in a lipid model system mimicking the pulmonary surface. Molecular dynamics simulations of these protein-lipid systems showed the stability of the protein models and the formation of interactions between protein surface and lipid head groups on an atomic scale. Thereby, interaction interface and strength seem to be dependent on orientation and posttranslational modification of the protein. The here presented modeling was fundamental for experimental localization studies and the simulations showed that SP-G and SP-H are theoretically able to interact with lipid systems and thus are members of the surfactant protein family.

  13. Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azmy, Y.Y.; Barnett, D.A.

    1999-09-27

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were then compared with measurements from applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.

  14. Java application for the superposition T-matrix code to study the optical properties of cosmic dust aggregates

    NASA Astrophysics Data System (ADS)

    Halder, P.; Chakraborty, A.; Deb Roy, P.; Das, H. S.

    2014-09-01

    In this paper, we report the development of a java application for the Superposition T-matrix code, JaSTA (Java Superposition T-matrix App), to study the light scattering properties of aggregate structures. It has been developed using Netbeans 7.1.2, which is a java integrated development environment (IDE). JaSTA uses the double-precision superposition codes for multi-sphere clusters in random orientation developed by Mackowski and Mishchenko (1996). It consists of a graphical user interface (GUI) at the front end and a database of related data at the back end. Both the interactive GUI and the database package enable a user to set the relevant input parameters (namely, wavelength, complex refractive indices, grain size, etc.) and study the related optical properties of cosmic dust (namely, extinction, polarization, etc.) instantly, i.e., with essentially zero computational time. This increases the efficiency of the user. The database of JaSTA currently covers a few sets of input parameters, with a plan to create a large database in the future. This application also has an option whereby users can compile and run the scattering code directly in the GUI environment. JaSTA aims to provide convenient and quicker data analysis of the optical properties which can be used in different fields like planetary science, atmospheric science, nano science, etc. The current version of this software is developed for the Linux and Windows platforms to study the light scattering properties of small aggregates, and will be extended to larger aggregates using parallel codes in the future. Catalogue identifier: AETB_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AETB_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 571570 No. of bytes in distributed program

  15. Development and Application of a Parallel LCAO Cluster Method

    NASA Astrophysics Data System (ADS)

    Patton, David C.

    1997-08-01

    CPU-intensive steps in the SCF electronic structure calculations of clusters and molecules with a first-principles LCAO method have been fully parallelized via a message passing paradigm. Identification of the parts of the code that are composed of many independent compute-intensive steps is discussed in detail, as they are the most readily parallelized. Most of the parallelization involves spatially decomposing numerical operations on a mesh. One exception is the solution of Poisson's equation, which relies on distribution of the charge density and multipole methods. The method we use to parallelize this part of the calculation is quite novel and is covered in detail. We present a general method for dynamically load-balancing a parallel calculation and discuss how we use this method in our code. The results of benchmark calculations of the IR and Raman spectra of PAH molecules such as anthracene (C_14H_10) and tetracene (C_18H_12) are presented. These benchmark calculations were performed on an IBM SP2 and a SUN Ultra HPC server with both MPI and PVM. Scalability and speedup for these calculations are analyzed to determine the efficiency of the code. In addition, performance and usage issues for MPI and PVM are presented.
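
    Since the record highlights dynamic load balancing over many independent compute-intensive steps, a minimal shared-memory sketch of that general pattern follows; the atomic work counter, chunk count, and work kernel are illustrative assumptions, not the paper's actual method.

```cpp
#include <algorithm>
#include <atomic>
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

constexpr int NUM_CHUNKS = 1000;        // independent compute-intensive steps

double process_chunk(int id) {
    // Stand-in for a numerical mesh operation with uneven cost per chunk.
    double s = 0.0;
    for (int i = 0; i < 20000 * (1 + id % 7); ++i) s += std::sin(i * 1e-3);
    return s;
}

int main() {
    std::atomic<int> next{0};           // shared work counter
    std::atomic<long> done{0};
    unsigned n_threads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n_threads; ++t) {
        pool.emplace_back([&] {
            // Each thread claims the next unprocessed chunk, so faster
            // threads naturally absorb more of the work.
            for (int id = next++; id < NUM_CHUNKS; id = next++) {
                volatile double r = process_chunk(id);
                (void)r;
                ++done;
            }
        });
    }
    for (auto& th : pool) th.join();
    printf("processed %ld chunks on %u threads\n", done.load(), n_threads);
}
```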

  16. Piecemeal Buildup of the Genetic Code, Ribosomes, and Genomes from Primordial tRNA Building Blocks.

    PubMed

    Caetano-Anollés, Derek; Caetano-Anollés, Gustavo

    2016-12-02

    The origin of biomolecular machinery likely centered around an ancient and central molecule capable of interacting with emergent macromolecular complexity. tRNA is the oldest and most central nucleic acid molecule of the cell. Its co-evolutionary interactions with aminoacyl-tRNA synthetase protein enzymes define the specificities of the genetic code and those with the ribosome their accurate biosynthetic interpretation. Phylogenetic approaches that focus on molecular structure allow reconstruction of evolutionary timelines that describe the history of RNA and protein structural domains. Here we review phylogenomic analyses that reconstruct the early history of the synthetase enzymes and the ribosome, their interactions with RNA, and the inception of amino acid charging and codon specificities in tRNA that are responsible for the genetic code. We also trace the age of domains and tRNA onto ancient tRNA homologies that were recently identified in rRNA. Our findings reveal a timeline of recruitment of tRNA building blocks for the formation of a functional ribosome, which holds both the biocatalytic functions of protein biosynthesis and the ability to store genetic memory in primordial RNA genomic templates.

  18. Soft-output decoding algorithms in iterative decoding of turbo codes

    NASA Technical Reports Server (NTRS)

    Benedetto, S.; Montorsi, G.; Divsalar, D.; Pollara, F.

    1996-01-01

    In this article, we present two versions of a simplified maximum a posteriori decoding algorithm. The algorithms work in a sliding-window form, like the Viterbi algorithm, and can thus be used to decode continuously transmitted sequences obtained by parallel concatenated codes, without requiring code trellis termination. A heuristic explanation is also given of how to embed the maximum a posteriori algorithms into the iterative decoding of parallel concatenated codes (turbo codes). The performance of the two algorithms is compared on the basis of a powerful rate-1/3 parallel concatenated code. Basic circuits to implement the simplified a posteriori decoding algorithm using lookup tables are proposed, together with two further approximations (linear and threshold) that eliminate the need for lookup tables at a very small performance penalty.
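
    The lookup-table, linear, and threshold simplifications mentioned above all target the log-MAP "max-star" correction term, max*(a,b) = max(a,b) + ln(1 + e^-|a-b|). A minimal sketch follows; the table granularity and the linear/threshold constants are illustrative choices, not the article's exact parameters.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

double max_star_exact(double a, double b) {        // full log-MAP
    return std::max(a, b) + std::log1p(std::exp(-std::fabs(a - b)));
}

double max_star_table(double a, double b) {        // 8-entry lookup table
    static const double table[8] = {0.6931, 0.4741, 0.3133, 0.2014,
                                    0.1269, 0.0790, 0.0488, 0.0300};
    double d = std::fabs(a - b);
    int idx = static_cast<int>(d / 0.5);           // 0.5-wide bins
    return std::max(a, b) + (idx < 8 ? table[idx] : 0.0);
}

double max_star_linear(double a, double b) {       // linear approximation
    double d = std::fabs(a - b);
    return std::max(a, b) + std::max(0.0, 0.6931 - 0.25 * d);
}

double max_star_threshold(double a, double b) {    // threshold approximation
    double d = std::fabs(a - b);
    return std::max(a, b) + (d < 1.5 ? 0.375 : 0.0);
}

int main() {
    for (double d : {0.1, 0.5, 1.0, 2.0, 4.0})
        printf("delta=%.1f exact=%.4f table=%.4f linear=%.4f thresh=%.4f\n",
               d, max_star_exact(0, d), max_star_table(0, d),
               max_star_linear(0, d), max_star_threshold(0, d));
}
```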

  19. The Design and Evaluation of "CAPTools"--A Computer Aided Parallelization Toolkit

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Frumkin, Michael; Hribar, Michelle; Jin, Haoqiang; Waheed, Abdul; Johnson, Steve; Cross, Mark; Evans, Emyr; Ierotheou, Constantinos; Leggett, Pete

    1998-01-01

    Writing applications for high performance computers is a challenging task. Although writing code by hand still offers the best performance, it is extremely costly and often not very portable. The Computer Aided Parallelization Tools (CAPTools) are a toolkit designed to help automate the mapping of sequential FORTRAN scientific applications onto multiprocessors. CAPTools consists of the following major components: an inter-procedural dependence analysis module that incorporates user knowledge; a 'self-propagating' data partitioning module driven via user guidance; an execution control mask generation and optimization module for the user to fine tune parallel processing of individual partitions; a program transformation/restructuring facility for source code clean up and optimization; a set of browsers through which the user interacts with CAPTools at each stage of the parallelization process; and a code generator supporting multiple programming paradigms on various multiprocessors. Besides describing the rationale behind the architecture of CAPTools, the parallelization process is illustrated via case studies involving structured and unstructured meshes. The programming process and the performance of the generated parallel programs are compared against other programming alternatives based on the NAS Parallel Benchmarks, ARC3D and other scientific applications. Based on these results, a discussion on the feasibility of constructing architectural independent parallel applications is presented.

  20. Parallel Semi-Implicit Spectral Element Atmospheric Model

    NASA Astrophysics Data System (ADS)

    Fournier, A.; Thomas, S.; Loft, R.

    2001-05-01

    The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1°). The code derivation involves weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicolson schemes for the nonlinear and linear terms, respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor architectures: derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) at T42 resolution. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI
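
    To make the cache-blocking remark concrete, here is a minimal sketch of an element-local derivative evaluated as a small dense matrix-vector product; the 8x8 operator below is a placeholder, not the actual Legendre derivative matrix.

```cpp
#include <array>
#include <cstdio>
#include <vector>

constexpr int NP = 8;                              // points per element edge
using Elem = std::array<double, NP>;
using Mat  = std::array<std::array<double, NP>, NP>;

// Derivative on each element as a small dense matrix-vector product;
// the 8x8 operator stays cache-resident across the element loop.
void derivative(const Mat& D, const std::vector<Elem>& u, std::vector<Elem>& du) {
    for (std::size_t e = 0; e < u.size(); ++e) {
        for (int i = 0; i < NP; ++i) {
            double s = 0.0;
            for (int j = 0; j < NP; ++j) s += D[i][j] * u[e][j];
            du[e][i] = s;
        }
    }
}

int main() {
    Mat D{};                                       // placeholder operator, not
    for (int i = 0; i < NP; ++i) {                 // the Legendre derivative matrix
        if (i > 0)      D[i][i - 1] = -0.5;
        if (i < NP - 1) D[i][i + 1] =  0.5;
    }
    std::vector<Elem> u(1000), du(1000);
    for (auto& e : u) for (int i = 0; i < NP; ++i) e[i] = i * i;
    derivative(D, u, du);
    printf("du[0][1] = %g\n", du[0][1]);           // central difference of i*i at i=1
}
```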

  1. Parallel auto-correlative statistics with VTK.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10], which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by means of C++ code snippets, and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the auto-correlative statistics engine.
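
    For reference, a stand-alone sketch of the lag-l autocorrelation such an engine computes is shown below; the report's own implementation is a VTK statistics class, so this illustrates only the underlying formula.

```cpp
#include <cstdio>
#include <vector>

// Lag-l autocorrelation of a series: covariance at lag l over variance.
double autocorrelation(const std::vector<double>& x, int lag) {
    const int n = static_cast<int>(x.size());
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= n;
    double num = 0.0, den = 0.0;
    for (int i = 0; i < n; ++i) {
        den += (x[i] - mean) * (x[i] - mean);
        if (i + lag < n) num += (x[i] - mean) * (x[i + lag] - mean);
    }
    return num / den;
}

int main() {
    std::vector<double> x;
    for (int i = 0; i < 256; ++i) x.push_back((i % 8) * 1.0);  // periodic toy data
    printf("r(1)=%.3f r(8)=%.3f\n", autocorrelation(x, 1), autocorrelation(x, 8));
}
```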

  2. A Massively Parallel Code for Polarization Calculations

    NASA Astrophysics Data System (ADS)

    Akiyama, Shizuka; Höflich, Peter

    2001-03-01

    We present an implementation of our Monte-Carlo radiation transport method for rapidly expanding, NLTE atmospheres for massively parallel computers which utilizes both the distributed and shared memory models. This allows us to take full advantage of the fast communication and low latency inherent to nodes with multiple CPUs, and to stretch the limits of scalability with the number of nodes compared to a version which is based on the shared memory model. Test calculations on a local 20-node Beowulf cluster with dual CPUs showed an improved scalability by about 40%.

  3. OFF, Open source Finite volume Fluid dynamics code: A free, high-order solver based on parallel, modular, object-oriented Fortran API

    NASA Astrophysics Data System (ADS)

    Zaghi, S.

    2014-07-01

    OFF, an open source (free software) code for performing fluid dynamics simulations, is presented. The aim of OFF is to solve, numerically, the unsteady (and steady) compressible Navier-Stokes equations of fluid dynamics by means of finite volume techniques: the research background is mainly focused on high-order (WENO) schemes for multi-fluid, multi-phase flows over complex geometries. To this purpose a highly modular, object-oriented application program interface (API) has been developed. In particular, the concepts of data encapsulation and inheritance available within the Fortran language (from standard 2003) have been stressed in order to represent each fluid dynamics "entity" (e.g. the conservative variables of a finite volume, its geometry, etc…) by a single object, so that a large variety of computational libraries can be easily (and efficiently) developed upon these objects. The main features of OFF can be summarized as follows. Programming language: OFF is written in standard-compliant Fortran 2003; its design is highly modular in order to enhance simplicity of use and maintenance without compromising efficiency. Parallel frameworks supported: the development of OFF has also been targeted at maximizing computational efficiency; the code is designed to run on shared-memory multi-core workstations and on distributed-memory clusters of shared-memory nodes (supercomputers), with parallelization based on the Open Multiprocessing (OpenMP) and Message Passing Interface (MPI) paradigms. Usability, maintenance and enhancement: the documentation has also been carefully taken into account; it is built upon comprehensive comments placed directly in the source files (no external documentation files are needed), which are parsed by the doxygen free software to produce high-quality HTML and LaTeX documentation pages; the distributed versioning system referred to as git

  4. SKIRT: Hybrid parallelization of radiative transfer simulations

    NASA Astrophysics Data System (ADS)

    Verstocken, S.; Van De Putte, D.; Camps, P.; Baes, M.

    2017-07-01

    We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modelling the continuum radiation of dusty astrophysical systems including late-type galaxies and dusty tori. The hybrid scheme combines distributed memory parallelization, using the standard Message Passing Interface (MPI) to communicate between processes, and shared memory parallelization, providing multiple execution threads within each process to avoid duplication of data structures. The synchronization between multiple threads is accomplished through atomic operations without high-level locking (also called lock-free programming). This improves the scaling behaviour of the code and substantially simplifies the implementation of the hybrid scheme. The result is an extremely flexible solution that adjusts to the number of available nodes, processors and memory, and consequently performs well on a wide variety of computing architectures.
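
    The lock-free synchronization described above can be illustrated with a minimal stand-alone sketch in which worker threads tally Monte Carlo events into a shared grid via atomic adds; the grid and the random-walk stand-in are illustrative, not SKIRT's actual data structures.

```cpp
#include <atomic>
#include <cstdio>
#include <random>
#include <thread>
#include <vector>

int main() {
    constexpr int NCELLS = 64;
    constexpr long PHOTONS_PER_THREAD = 100000;
    // One atomic counter per grid cell; fetch_add avoids high-level locks.
    std::vector<std::atomic<long>> grid(NCELLS);
    for (auto& c : grid) c.store(0);

    auto worker = [&](unsigned seed) {
        std::mt19937 rng(seed);
        std::uniform_int_distribution<int> cell(0, NCELLS - 1);
        for (long p = 0; p < PHOTONS_PER_THREAD; ++p)   // toy "photon" events
            grid[cell(rng)].fetch_add(1, std::memory_order_relaxed);
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < 4; ++t) pool.emplace_back(worker, t + 1);
    for (auto& th : pool) th.join();

    long total = 0;
    for (auto& c : grid) total += c.load();
    printf("tallied %ld photon events\n", total);       // 4 * 100000
}
```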

  5. Tuning iteration space slicing based tiled multi-core code implementing Nussinov's RNA folding.

    PubMed

    Palkowski, Marek; Bielecki, Wlodzimierz

    2018-01-15

    RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithm is one of the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov's recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov's RNA folding. Such techniques fall within the iteration space slicing framework: transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem in generating parallel tiled code is choosing a proper tile size and tile dimension, which affect the degree of parallelism and code locality. To choose the best tile size and tile dimension, we first construct parallel parametric tiled code (the parameters are variables defining the tile size). For this purpose, we generate two nonparametric tiled codes with different fixed tile sizes but the same code structure, and then derive a general affine model describing all integer factors appearing in the expressions of those codes. Using this model and the known integer factors present in those expressions (they define the left-hand side of the model), we determine the unknown integers of the model for each integer factor at the same position in the fixed tiled code, and replace expressions involving integer factors with ones involving parameters. We then use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover, within a given search space, the best tile size and tile dimension maximizing target code performance. For a given search space, the presented approach allows us to choose the best tile size and tile dimension in
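
    For orientation, an untiled reference implementation of Nussinov's recurrence is sketched below; the paper's contribution is the polyhedral tiled and parallel version of this kernel, not this baseline, and the minimum-hairpin-length refinement is omitted.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

bool pairs(char a, char b) {                 // Watson-Crick plus wobble pairs
    auto is = [&](char x, char y) { return (a == x && b == y) || (a == y && b == x); };
    return is('A', 'U') || is('C', 'G') || is('G', 'U');
}

int nussinov(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<std::vector<int>> N(n, std::vector<int>(n, 0));
    for (int len = 2; len <= n; ++len) {             // subsequence length
        for (int i = 0; i + len - 1 < n; ++i) {
            int j = i + len - 1;
            int best = std::max(N[i + 1][j], N[i][j - 1]);
            best = std::max(best, N[i + 1][j - 1] + (pairs(s[i], s[j]) ? 1 : 0));
            for (int k = i + 1; k < j; ++k)          // bifurcation term
                best = std::max(best, N[i][k] + N[k + 1][j]);
            N[i][j] = best;
        }
    }
    return N[0][n - 1];                              // max number of base pairs
}

int main() {
    printf("max pairs: %d\n", nussinov("GGGAAAUCC"));
}
```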

  6. Nyx: Adaptive mesh, massively-parallel, cosmological simulation code

    NASA Astrophysics Data System (ADS)

    Almgren, Ann; Beckner, Vince; Friesen, Brian; Lukic, Zarija; Zhang, Weiqun

    2017-12-01

    The Nyx code solves the equations of compressible hydrodynamics on an adaptive grid hierarchy, coupled with an N-body treatment of dark matter. The gas dynamics in Nyx use a finite-volume methodology on an adaptive set of 3-D Eulerian grids; dark matter is represented as discrete particles moving under the influence of gravity. Particles are evolved via a particle-mesh method, using a Cloud-in-Cell deposition/interpolation scheme. Both baryonic and dark matter contribute to the gravitational field. In addition, Nyx includes physics for accurately modeling the intergalactic medium; in the optically thin limit and assuming ionization equilibrium, the code calculates heating and cooling processes of the primordial-composition gas in an ionizing ultraviolet background radiation field.
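
    As a concrete illustration of the Cloud-in-Cell deposition named above, here is a minimal one-dimensional sketch with toy grid and particle values; Nyx itself deposits onto 3-D adaptive grids.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    constexpr int NG = 16;                     // grid cells on [0,1), periodic
    const double dx = 1.0 / NG;
    std::vector<double> rho(NG, 0.0);
    const double pos[] = {0.131, 0.5, 0.77};   // toy particle positions
    const double mass = 1.0;

    for (double x : pos) {
        double xg = x / dx - 0.5;              // position in cell-center units
        int i = static_cast<int>(std::floor(xg));
        double w = xg - i;                     // linear weight for right cell
        rho[(i + NG) % NG]     += mass * (1.0 - w) / dx;   // periodic wrap
        rho[(i + 1 + NG) % NG] += mass * w / dx;
    }

    double total = 0.0;                        // CIC conserves mass exactly
    for (int i = 0; i < NG; ++i) total += rho[i] * dx;
    printf("deposited mass = %g\n", total);    // equals 3 * mass
}
```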

  7. Parallel Visualization of Large-Scale Aerodynamics Calculations: A Case Study on the Cray T3E

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu; Crockett, Thomas W.

    1999-01-01

    This paper reports the performance of a parallel volume rendering algorithm for visualizing a large-scale, unstructured-grid dataset produced by a three-dimensional aerodynamics simulation. This dataset, containing over 18 million tetrahedra, allows us to extend our performance results to a problem which is more than 30 times larger than the one we examined previously. This high resolution dataset also allows us to see fine, three-dimensional features in the flow field. All our tests were performed on the Silicon Graphics Inc. (SGI)/Cray T3E operated by NASA's Goddard Space Flight Center. Using 511 processors, a rendering rate of almost 9 million tetrahedra/second was achieved with a parallel overhead of 26%.

  8. Parallel k-means++

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
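
    For reference, the D^2-weighted seeding that all three ports implement can be sketched sequentially as follows; the data and names are illustrative, and the inner distance-update loop is the step the record's CUDA/Thrust, OpenMP, and XMT versions distribute.

```cpp
#include <algorithm>
#include <cstdio>
#include <limits>
#include <random>
#include <vector>

using Point = std::vector<double>;

double dist2(const Point& a, const Point& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) s += (a[d] - b[d]) * (a[d] - b[d]);
    return s;
}

std::vector<Point> kmeanspp_seeds(const std::vector<Point>& data, int k,
                                  std::mt19937& rng) {
    std::vector<Point> seeds;
    std::uniform_int_distribution<std::size_t> uni(0, data.size() - 1);
    seeds.push_back(data[uni(rng)]);                   // first seed: uniform
    std::vector<double> d2(data.size(), std::numeric_limits<double>::max());
    while (static_cast<int>(seeds.size()) < k) {
        for (std::size_t i = 0; i < data.size(); ++i)  // the parallelizable step
            d2[i] = std::min(d2[i], dist2(data[i], seeds.back()));
        std::discrete_distribution<std::size_t> pick(d2.begin(), d2.end());
        seeds.push_back(data[pick(rng)]);              // D^2-weighted draw
    }
    return seeds;
}

int main() {
    std::mt19937 rng(42);
    std::vector<Point> data;
    for (int i = 0; i < 100; ++i) data.push_back({i * 0.1, (i % 10) * 1.0});
    auto seeds = kmeanspp_seeds(data, 3, rng);
    printf("chose %zu seeds; first = (%g, %g)\n",
           seeds.size(), seeds[0][0], seeds[0][1]);
}
```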

  9. Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud.

    PubMed

    Di Tommaso, Paolo; Orobitg, Miquel; Guirado, Fernando; Cores, Fernando; Espinosa, Toni; Notredame, Cedric

    2010-08-01

    We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. T-Coffee is a freeware open source package available from http://www.tcoffee.org/homepage.html

  10. Graphical Representation of Parallel Algorithmic Processes

    DTIC Science & Technology

    1990-12-01

    The goal of this study is to develop an algorithm animation facility for parallel processes executing on different architectures. The facility interfaces with the AAARF main process; the source code for the AAARF class-common library is in the common subdirectory. (AFIT/GCE/ENG/90D-07; approved for public release, distribution unlimited.)

  11. Parallel imaging of knee cartilage at 3 Tesla.

    PubMed

    Zuo, Jin; Li, Xiaojuan; Banerjee, Suchandrima; Han, Eric; Majumdar, Sharmila

    2007-10-01

    To evaluate the feasibility and reproducibility of quantitative cartilage imaging with parallel imaging at 3T, and to determine the impact of the acceleration factor (AF) on morphological and relaxation measurements, an eight-channel phased-array knee coil was employed for conventional and parallel imaging on a 3T scanner. The imaging protocol consisted of a T2-weighted fast spin echo (FSE), a 3D-spoiled gradient echo (SPGR), a custom 3D-SPGR T1rho, and a 3D-SPGR T2 sequence. Parallel imaging was performed with an array spatial sensitivity technique (ASSET). The left knees of six healthy volunteers were scanned with both conventional and parallel imaging (AF = 2). Morphological parameters and relaxation maps from the parallel imaging method (AF = 2) showed results comparable with the conventional method. The intraclass correlation coefficients (ICC) of the two methods for cartilage volume, mean cartilage thickness, T1rho, and T2 were 0.999, 0.977, 0.964, and 0.969, respectively, while demonstrating excellent reproducibility. No significant measurement differences were found when the AF reached 3, despite the lower signal-to-noise ratio (SNR). The study demonstrated that parallel imaging can be applied to knee cartilage quantification at AF = 2 without degrading measurement accuracy, with good reproducibility, while effectively reducing scan time. Shorter imaging times can be achieved with higher AF at the cost of SNR. (c) 2007 Wiley-Liss, Inc.

  12. An integrated runtime and compile-time approach for parallelizing structured and block structured applications

    NASA Technical Reports Server (NTRS)

    Agrawal, Gagan; Sussman, Alan; Saltz, Joel

    1993-01-01

    Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). A combined runtime and compile-time approach for parallelizing these applications on distributed-memory parallel machines in an efficient and machine-independent fashion was described. A runtime library which can be used to port these applications to distributed-memory machines was designed and implemented. The library is currently implemented on several different systems. To further ease the task of application programmers, methods were developed for integrating this runtime library with compilers for HPF-like parallel programming languages. How this runtime library was integrated with the Fortran 90D compiler being developed at Syracuse University is discussed. Experimental results to demonstrate the efficacy of our approach are presented, using a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler-parallelized codes perform within 20 percent of the code parallelized by manually inserting calls to the runtime library.

  13. An Energy-Efficient Compressive Image Coding for Green Internet of Things (IoT).

    PubMed

    Li, Ran; Duan, Xiaomeng; Li, Xu; He, Wei; Li, Yanling

    2018-04-17

    Aimed at the low energy consumption required for a Green Internet of Things (IoT), this paper presents an energy-efficient compressive image coding scheme, which provides a compressive encoder and a real-time decoder according to Compressive Sensing (CS) theory. The compressive encoder adaptively measures each image block based on the block-based gradient field, which models the distribution of block sparsity, and the real-time decoder linearly reconstructs each image block through a projection matrix learned by the Minimum Mean Square Error (MMSE) criterion. Both the encoder and decoder have low computational complexity, so they consume only a small amount of energy. Experimental results show that the proposed scheme not only has low encoding and decoding complexity compared with traditional methods, but also provides good objective and subjective reconstruction quality. In particular, it presents better time-distortion performance than JPEG. The proposed compressive image coding is therefore a potential energy-efficient scheme for the Green IoT.
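
    A minimal sketch of the encoder side of such block-based compressive measurement follows, with a fixed random Gaussian matrix standing in for the scheme's adaptive, gradient-driven measurement; the MMSE-learned reconstruction matrix is an offline product and is not reproduced here.

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    constexpr int B = 8, N = B * B;       // block size and flattened length
    constexpr int M = 16;                 // measurements per block (M << N)
    std::mt19937 rng(1);
    std::normal_distribution<double> g(0.0, 1.0);

    std::vector<double> phi(M * N);       // random Gaussian measurement matrix
    for (double& v : phi) v = g(rng) / std::sqrt(M);

    std::vector<double> block(N);         // toy image block (a ramp)
    for (int i = 0; i < N; ++i) block[i] = i / double(N);

    std::vector<double> y(M, 0.0);        // compressive measurements y = Phi * x
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) y[m] += phi[m * N + n] * block[n];

    printf("block of %d pixels encoded as %d measurements; y[0]=%g\n", N, M, y[0]);
}
```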

  14. Retargeting of existing FORTRAN program and development of parallel compilers

    NASA Technical Reports Server (NTRS)

    Agrawal, Dharma P.

    1988-01-01

    The software models used in implementing the parallelizing compiler for the B-HIVE multiprocessor system are described. The various models and strategies used in the compiler development are: the flexible granularity model, which allows a compromise between two extreme granularity models; the communication model, which is capable of precisely describing interprocessor communication timings and patterns; the loop type detection strategy, which identifies different types of loops; the critical path with coloring scheme, which is a versatile scheduling strategy for any multicomputer with some associated communication costs; and the loop allocation strategy, which realizes optimum overlapped operations between computation and communication of the system. Using these models, several sample routines of the AIR3D package are examined and tested. It may be noted that the automatically generated codes are highly parallelized to provide the maximum degree of parallelism, obtaining speedup on systems of up to 28 to 32 processors. A comparison of parallel codes for both the existing and proposed communication models is performed and the corresponding expected speedup factors are obtained. The experimentation shows that the B-HIVE compiler produces more efficient codes than existing techniques. Work is progressing well in completing the final phase of the compiler. Numerous enhancements are needed to improve the capabilities of the parallelizing compiler.

  15. Portability and Cross-Platform Performance of an MPI-Based Parallel Polygon Renderer

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1999-01-01

    Visualizing the results of computations performed on large-scale parallel computers is a challenging problem, due to the size of the datasets involved. One approach is to perform the visualization and graphics operations in place, exploiting the available parallelism to obtain the necessary rendering performance. Over the past several years, we have been developing algorithms and software to support visualization applications on NASA's parallel supercomputers. Our results have been incorporated into a parallel polygon rendering system called PGL. PGL was initially developed on tightly-coupled distributed-memory message-passing systems, including Intel's iPSC/860 and Paragon, and IBM's SP2. Over the past year, we have ported it to a variety of additional platforms, including the HP Exemplar, SGI Origin2000, Cray T3E, and clusters of Sun workstations. In implementing PGL, we have had two primary goals: cross-platform portability and high performance. Portability is important because (1) our manpower resources are limited, making it difficult to develop and maintain multiple versions of the code, and (2) NASA's complement of parallel computing platforms is diverse and subject to frequent change. Performance is important in delivering adequate rendering rates for complex scenes and ensuring that parallel computing resources are used effectively. Unfortunately, these two goals are often at odds. In this paper we report on our experiences with portability and performance of the PGL polygon renderer across a range of parallel computing platforms.

  16. High Performance Input/Output for Parallel Computer Systems

    NASA Technical Reports Server (NTRS)

    Ligon, W. B.

    1996-01-01

    The goal of our project is to study the I/O characteristics of parallel applications used in Earth Science data processing systems such as Regional Data Centers (RDCs) or EOSDIS. Our approach is to study the runtime behavior of typical programs and the effect of key parameters of the I/O subsystem both under simulation and with direct experimentation on parallel systems. Our three-year activity has focused on two items: developing a test bed that facilitates experimentation with parallel I/O, and studying representative programs from the Earth science data processing application domain. The Parallel Virtual File System (PVFS) has been developed for use on a number of platforms including the Tiger Parallel Architecture Workbench (TPAW) simulator, the Intel Paragon, a cluster of DEC Alpha workstations, and the Beowulf system (at CESDIS). PVFS provides considerable flexibility in configuring I/O in a UNIX-like environment. Access to key performance parameters facilitates experimentation. We have studied several key applications from levels 1, 2 and 3 of the typical RDC processing scenario, including instrument calibration and navigation, image classification, and numerical modeling codes. We have also considered large-scale scientific database codes used to organize image data.

  17. Parallel software for lattice N = 4 supersymmetric Yang-Mills theory

    NASA Astrophysics Data System (ADS)

    Schaich, David; DeGrand, Thomas

    2015-05-01

    We present new parallel software, SUSY LATTICE, for lattice studies of four-dimensional N = 4 supersymmetric Yang-Mills theory with gauge group SU(N). The lattice action is constructed to exactly preserve a single supersymmetry charge at non-zero lattice spacing, up to additional potential terms included to stabilize numerical simulations. The software evolved from the MILC code for lattice QCD, and retains a similar large-scale framework despite the different target theory. Many routines are adapted from an existing serial code (Catterall and Joseph, 2012), which SUSY LATTICE supersedes. This paper provides an overview of the new parallel software, summarizing the lattice system, describing the applications that are currently provided and explaining their basic workflow for non-experts in lattice gauge theory. We discuss the parallel performance of the code, and highlight some notable aspects of the documentation for those interested in contributing to its future development.

  18. Parallelization of sequential Gaussian, indicator and direct simulation algorithms

    NASA Astrophysics Data System (ADS)

    Nunes, Ruben; Almeida, José A.

    2010-08-01

    Improving the performance and robustness of algorithms on new high-performance parallel computing architectures is a key issue in efficiently performing 2D and 3D studies with large amount of data. In geostatistics, sequential simulation algorithms are good candidates for parallelization. When compared with other computational applications in geosciences (such as fluid flow simulators), sequential simulation software is not extremely computationally intensive, but parallelization can make it more efficient and creates alternatives for its integration in inverse modelling approaches. This paper describes the implementation and benchmarking of a parallel version of the three classic sequential simulation algorithms: direct sequential simulation (DSS), sequential indicator simulation (SIS) and sequential Gaussian simulation (SGS). For this purpose, the source used was GSLIB, but the entire code was extensively modified to take into account the parallelization approach and was also rewritten in the C programming language. The paper also explains in detail the parallelization strategy and the main modifications. Regarding the integration of secondary information, the DSS algorithm is able to perform simple kriging with local means, kriging with an external drift and collocated cokriging with both local and global correlations. SIS includes a local correction of probabilities. Finally, a brief comparison is presented of simulation results using one, two and four processors. All performance tests were carried out on 2D soil data samples. The source code is completely open source and easy to read. It should be noted that the code is only fully compatible with Microsoft Visual C and should be adapted for other systems/compilers.

  19. Parallel DSMC Solution of Three-Dimensional Flow Over a Finite Flat Plate

    NASA Technical Reports Server (NTRS)

    Nance, Robert P.; Wilmoth, Richard G.; Moon, Bongki; Hassan, H. A.; Saltz, Joel

    1994-01-01

    This paper describes a parallel implementation of the direct simulation Monte Carlo (DSMC) method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a good load balance. Performance tests are conducted using the code to evaluate various remapping and remapping-interval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen flow over a finite-thickness flat plate. It is shown that the parallel algorithm produces results which compare well with experimental data. Moreover, it yields significantly faster execution times than the scalar code, as well as very good load-balance characteristics.

  20. Parallel programming of industrial applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heroux, M; Koniges, A; Simon, H

    1998-07-21

    In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real applications codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from these applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN 1-55860-54).

  1. Parallel-In-Time For Moving Meshes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Falgout, R. D.; Manteuffel, T. A.; Southworth, B.

    2016-02-04

    With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High-performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
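
    XBraid's multigrid-in-time algorithm is not reproduced here, but the basic idea of parallelism in the temporal dimension can be illustrated with a generic parareal-style iteration on a toy ODE; the coarse/fine propagators and the problem are illustrative assumptions.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

const double lambda = -1.0, T = 2.0;           // dy/dt = lambda * y on [0, T]
const int NSLICE = 10;                         // time slices (parallel in a real run)

double propagate(double y, double dt, int nsub) {   // backward Euler substeps
    double h = dt / nsub;
    for (int s = 0; s < nsub; ++s) y /= (1.0 - lambda * h);
    return y;
}
double coarse(double y, double dt) { return propagate(y, dt, 1); }
double fine(double y, double dt)   { return propagate(y, dt, 100); }

int main() {
    double dt = T / NSLICE;
    std::vector<double> u(NSLICE + 1, 0.0);
    u[0] = 1.0;
    for (int n = 0; n < NSLICE; ++n) u[n + 1] = coarse(u[n], dt);  // initial guess

    for (int k = 0; k < 5; ++k) {              // parareal correction sweeps
        std::vector<double> f(NSLICE);         // fine solves are independent,
        for (int n = 0; n < NSLICE; ++n)       // hence parallel across slices
            f[n] = fine(u[n], dt);
        std::vector<double> unew(u);
        for (int n = 0; n < NSLICE; ++n)       // cheap sequential correction
            unew[n + 1] = coarse(unew[n], dt) + f[n] - coarse(u[n], dt);
        u = unew;
        printf("iter %d: u(T)=%.6f exact=%.6f\n",
               k + 1, u[NSLICE], std::exp(lambda * T));
    }
}
```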

  2. Parallel computing for probabilistic fatigue analysis

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.

    1993-01-01

    This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism, to keep a large number of processors busy, and to treat problems with the large memory requirements encountered in practice. We also conclude that distributed-memory architecture is preferable to shared memory for achieving large-scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.

  3. Collisional tests and an extension of the TEMPEST continuum gyrokinetic code

    NASA Astrophysics Data System (ADS)

    Cohen, R. H.; Dorr, M.; Hittinger, J.; Kerbel, G.; Nevins, W. M.; Rognlien, T.; Xiong, Z.; Xu, X. Q.

    2006-04-01

    An important requirement of a kinetic code for edge plasmas is the ability to accurately treat the effect of collisions over a broad range of collisionalities. To test the interaction of collisions and parallel streaming, TEMPEST has been compared with published analytic and numerical (Monte Carlo, bounce-averaged Fokker-Planck) results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential and mirror ratio, and the required velocity-space resolution is modest. We also describe progress toward extension of (4-dimensional) TEMPEST into a ``kinetic edge transport code'' (a kinetic counterpart of UEDGE). The extension includes averaging of the gyrokinetic equations over fast timescales and approximating the averaged quadratic terms by diffusion terms which respect the boundaries of inaccessible regions in phase space. References: F. Najmabadi, R.W. Conn and R.H. Cohen, Nucl. Fusion 24, 75 (1984); T.D. Rognlien and T.A. Cutler, Nucl. Fusion 20, 1003 (1980).

  4. T cells are influenced by a long non-coding RNA in the autoimmune associated PTPN2 locus.

    PubMed

    Houtman, Miranda; Shchetynsky, Klementy; Chemin, Karine; Hensvold, Aase Haj; Ramsköld, Daniel; Tandre, Karolina; Eloranta, Maija-Leena; Rönnblom, Lars; Uebe, Steffen; Catrina, Anca Irinel; Malmström, Vivianne; Padyukov, Leonid

    2018-06-01

    Non-coding SNPs in the protein tyrosine phosphatase non-receptor type 2 (PTPN2) locus have been linked with several autoimmune diseases, including rheumatoid arthritis, type I diabetes, and inflammatory bowel disease. However, the functional consequences of these SNPs are poorly characterized. Herein, we show in blood cells that SNPs in the PTPN2 locus are highly correlated with DNA methylation levels at four CpG sites downstream of PTPN2 and expression levels of the long non-coding RNA (lncRNA) LINC01882 downstream of these CpG sites. We observed that LINC01882 is mainly expressed in T cells and that anti-CD3/CD28 activated naïve CD4+ T cells downregulate the expression of LINC01882. RNA sequencing analysis of LINC01882 knockdown in Jurkat T cells, using a combination of antisense oligonucleotides and RNA interference, revealed the upregulation of the transcription factor ZEB1 and kinase MAP2K4, both involved in IL-2 regulation. Overall, our data suggest the involvement of LINC01882 in T cell activation and hint towards an auxiliary role of these non-coding SNPs in autoimmunity associated with the PTPN2 locus. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. Integrated Task And Data Parallel Programming: Language Design

    NASA Technical Reports Server (NTRS)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object-oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated

  6. 1D and 3D Polymeric Manganese(II) Thiolato Complexes: Synthesis, Structure, and Properties of ∞3[Mn4(SPh)8] and ∞1[Mn(SMes)2].

    PubMed

    Eichhöfer, Andreas; Lebedkin, Sergei

    2018-01-16

    Reactions of [Mn{N(SiMe3)2}2]2 with 2.1 equiv of RSH, R = Ph or Mes = C6H2-2,4,6-(CH3)3, yield compounds of the formal composition "Mn(SR)2". Single-crystal X-ray diffraction reveals that ∞1[Mn(SMes)2] forms one-dimensional chains in the crystal via μ2-SMes bridges, whereas ∞3[Mn4(SPh)8] comprises a three-dimensional network in which adamantanoid cages composed of four Mn atoms and six μ2-bridging SPh ligands are connected in three dimensions by doubly bridging SPh ligands. Thermogravimetric analysis and powder diffractometry indicate a reversible uptake of solvent molecules (tetrahydrofuran) into the channels of ∞1[Mn(SMes)2]. Magnetic measurements reveal antiferromagnetic coupling for both compounds, with J = -8.2 cm⁻¹ for ∞1[Mn(SMes)2] and -10.0 cm⁻¹ for ∞3[Mn4(SPh)8]. Their optical absorption and photoluminescence (PL) excitation spectra display characteristic d-d bands of Mn²⁺ ions in the visible spectral region. Both compounds emit bright phosphorescence at ∼800 nm at low temperatures (<100 K). However, only ∞1[Mn(SMes)2] retains a moderately intense emission at ambient temperature (with a quantum yield of 1.2%). Similar PL properties are also found for the related selenolate complexes ∞1[Mn(SeR)2] (R = Ph, Mes).

  7. The crystal structures of the psychrophilic subtilisin S41 and the mesophilic subtilisin Sph reveal the same calcium-loaded state.

    PubMed

    Almog, Orna; González, Ana; Godin, Noa; de Leeuw, Marina; Mekel, Marlene J; Klein, Daniela; Braun, Sergei; Shoham, Gil; Walter, Richard L

    2009-02-01

    We determine and compare the crystal structures of two proteases belonging to the subtilisin superfamily: S41, a cold-adapted serine protease produced by Antarctic bacilli, at 1.4 Å resolution, and Sph, a mesophilic serine protease produced by Bacillus sphaericus, at 0.8 Å resolution. The purpose of this comparison was to find out whether multiple calcium ion binding is a molecular factor responsible for the adaptation of S41 to extremely low temperatures. We find that these two subtilisins share the same subtilisin fold, with a root-mean-square deviation between the two structures of 0.54 Å. The final models for S41 and Sph each include a calcium-loaded state of five ions bound to the subtilisin molecule. None of these calcium-binding sites correlate with the known high-affinity binding site (site A) found for other subtilisins. Structural analysis of the five calcium-binding sites found in these two crystal structures indicates that three of the binding sites have two side chains of an acidic residue coordinating the calcium ion, whereas the other two binding sites have either a main-chain carbonyl or only one acidic residue side chain coordinating the calcium ion. Thus, we conclude that three of the sites have high affinity toward calcium ions, whereas the other two have low affinity. Because Sph is a mesophilic subtilisin and S41 is a psychrophilic subtilisin, yet both crystal structures were found to bind five calcium ions, we suggest that multiple calcium ion binding is not responsible for the adaptation of S41 to low temperatures. Copyright 2008 Wiley-Liss, Inc.

  8. Neural representation of objects in space: a dual coding account.

    PubMed Central

    Humphreys, G W

    1998-01-01

    I present evidence on the nature of object coding in the brain and discuss the implications of this coding for models of visual selective attention. Neuropsychological studies of task-based constraints on: (i) visual neglect; and (ii) reading and counting, reveal the existence of parallel forms of spatial representation for objects: within-object representations, where elements are coded as parts of objects, and between-object representations, where elements are coded as independent objects. Aside from these spatial codes for objects, however, the coding of visual space is limited. We are extremely poor at remembering small spatial displacements across eye movements, indicating (at best) impoverished coding of spatial position per se. Also, effects of element separation on spatial extinction can be eliminated by filling the space with an occluding object, indicating that spatial effects on visual selection are moderated by object coding. Overall, there are separate limits on visual processing reflecting: (i) the competition to code parts within objects; (ii) the small number of independent objects that can be coded in parallel; and (iii) task-based selection of whether within- or between-object codes determine behaviour. Between-object coding may be linked to the dorsal visual system while parallel coding of parts within objects takes place in the ventral system, although there may additionally be some dorsal involvement either when attention must be shifted within objects or when explicit spatial coding of parts is necessary for object identification. PMID:9770227

  9. Local Group dSph radio survey with ATCA (I): observations and background sources

    NASA Astrophysics Data System (ADS)

    Regis, Marco; Richter, Laura; Colafrancesco, Sergio; Massardi, Marcella; de Blok, W. J. G.; Profumo, Stefano; Orford, Nicola

    2015-04-01

    Dwarf spheroidal (dSph) galaxies are key objects in near-field cosmology, especially in connection to the study of galaxy formation and evolution at small scales. In addition, dSphs are optimal targets to investigate the nature of dark matter. However, while we begin to have deep optical photometric observations of the stellar population in these objects, little is known so far about their diffuse emission at any observing frequency, and hence on thermal and non-thermal plasma possibly residing within dSphs. In this paper, we present deep radio observations of six local dSphs performed with the Australia Telescope Compact Array (ATCA) at 16 cm wavelength. We mosaicked a region of radius of about 1 deg around three `classical' dSphs, Carina, Fornax, and Sculptor, and of about half of degree around three `ultrafaint' dSphs, BootesII, Segue2, and Hercules. The rms noise level is below 0.05 mJy for all the maps. The restoring beams full width at half-maximum ranged from 4.2 arcsec × 2.5 arcsec to 30.0 arcsec × 2.1 arcsec in the most elongated case. A catalogue including the 1392 sources detected in the six dSph fields is reported. The main properties of the background sources are discussed, with positions and fluxes of brightest objects compared with the FIRST, NVSS, and SUMSS observations of the same fields. The observed population of radio emitters in these fields is dominated by synchrotron sources. We compute the associated source number counts at 2 GHz down to fluxes of 0.25 mJy, which prove to be in agreement with AGN count models.

  10. The chemical abundances of the stellar populations in the Leo I and II dSph galaxies

    NASA Astrophysics Data System (ADS)

    Bosler, Tammy L.; Smecker-Hane, Tammy A.; Stetson, Peter B.

    2007-06-01

    We have obtained calcium abundances and radial velocities for 102 red giant branch (RGB) stars in the Leo I dwarf spheroidal galaxy (dSph) and 74 RGB stars in the Leo II dSph using the low-resolution spectrograph (LRIS) on the Keck I 10-m telescope. We report on the calcium abundances [Ca/H] derived from the strengths of the CaII triplet absorption lines at 8498, 8542 and 8662 Å in the stellar spectra, using a new empirical CaII triplet calibration to [Ca/H]. The two galaxies have different average [Ca/H] values: -1.34 +/- 0.02 for Leo I and -1.65 +/- 0.02 for Leo II, with intrinsic abundance dispersions of 1.2 and 1.0 dex, respectively. The typical random and total errors in the derived abundances are 0.10 and 0.17 dex per star. For comparison with the existing literature, we also converted our CaII measurements to [Fe/H] on the scale of Carretta and Gratton (1997), though we discuss why this may not be the best determinant of metallicity; Leo I has a mean [Fe/H] = -1.34 and Leo II has a mean [Fe/H] = -1.59. The metallicity distribution function of Leo I is approximately Gaussian in shape with an excess at the metal-rich end, while that of Leo II shows an abrupt cut-off at the metal-rich end. The lower mean metallicity of Leo II is consistent with the fact that it has a lower luminosity, and hence a lower total mass, than Leo I; thus, the evolution of Leo II may have been affected more by mass lost in galactic winds. Our direct and independent measurement of the metallicity distributions in these dSphs will allow more accurate star-formation histories to be derived from future analyses of their colour-magnitude diagrams (CMDs). Data presented herein were obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation.

  11. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.

  12. Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarje, Abhinav; Jacobsen, Douglas W.; Williams, Samuel W.

    The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it in a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.

  13. Error Control Coding Techniques for Space and Satellite Communications

    NASA Technical Reports Server (NTRS)

    Costello, Daniel J., Jr.; Takeshita, Oscar Y.; Cabral, Hermano A.

    1998-01-01

    It is well known that the BER performance of a parallel concatenated turbo-code improves roughly as 1/N, where N is the information block length. However, it has been observed by Benedetto and Montorsi that for most parallel concatenated turbo-codes, the FER performance does not improve monotonically with N. In this report, we study the FER of turbo-codes, and the effects of their concatenation with an outer code. Two methods of concatenation are investigated: across several frames and within each frame. Some asymmetric codes are shown to have excellent FER performance with an information block length of 16384. We also show that the proposed outer coding schemes can improve the BER performance as well by eliminating pathological frames generated by the iterative MAP decoding process.

  14. Parallel programming with Easy Java Simulations

    NASA Astrophysics Data System (ADS)

    Esquembre, F.; Christian, W.; Belloni, M.

    2018-01-01

    Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.

  15. Design of convolutional tornado code

    NASA Astrophysics Data System (ADS)

    Zhou, Hui; Yang, Yao; Gao, Hongmin; Tan, Lu

    2017-09-01

    As a linear block code, the traditional tornado (tTN) code is inefficient in a burst-erasure environment, and its multi-level structure may lead to high encoding/decoding complexity. This paper presents a convolutional tornado (cTN) code which is able to improve the burst-erasure protection capability by applying the convolution property to the tTN code, and to reduce computational complexity by abrogating the multi-level structure. The simulation results show that the cTN code can provide better packet-loss protection performance with lower computational complexity than the tTN code.

  16. Numerical simulation of wave-current interaction using the SPH method

    NASA Astrophysics Data System (ADS)

    He, Ming; Gao, Xi-feng; Xu, Wan-hai

    2018-05-01

    In this paper, the smoothed particle hydrodynamics (SPH) method is used to build a numerical wave-current tank (NWCT). The wave is generated by using a piston-type wave generator and is absorbed by using a sponge layer. The uniform current field is generated by simultaneously imposing the directional velocity and hydrostatic pressure in both inflow and outflow regions set below the NWCT. Particle cyclic boundaries are also implemented for recycling the Lagrangian fluid particles. Furthermore, to shorten the time to reach a steady state, a temporary rigid-lid treatment for the water surface is proposed. It turns out to be very effective for weakening the undesired oscillatory flow at the beginning stage of the current generation. The calculated water surface elevation and horizontal-velocity profile are validated against the available experimental data. Satisfactory agreements are obtained, demonstrating the good capability of the NWCT.
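
    Sponge-layer absorption of the kind used here is usually implemented as a position-dependent damping of particle velocities. The sketch below is a hedged illustration of one common form (the quadratic ramp and coefficient names are assumptions, not necessarily the paper's exact formulation):

        /* Sketch of a sponge-layer wave absorber for an SPH wave tank.
           The quadratic ramp is one common choice; names and form are
           illustrative, not the paper's exact scheme. */
        #include <math.h>

        /* Damp a particle's velocity inside the sponge region [x0, x0+L]. */
        void apply_sponge(double *vx, double *vy, double x,
                          double x0, double L, double alpha, double dt) {
            if (x < x0) return;               /* outside the sponge      */
            double s = (x - x0) / L;          /* 0 at entry, 1 at wall   */
            double damp = exp(-alpha * s * s * dt);
            *vx *= damp;                      /* stronger damping deeper */
            *vy *= damp;                      /* into the layer          */
        }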

  17. Parallel Implementation of the Discontinuous Galerkin Method

    NASA Technical Reports Server (NTRS)

    Baggag, Abdalkader; Atkins, Harold; Keyes, David

    1999-01-01

    This paper describes a parallel implementation of the discontinuous Galerkin method. Discontinuous Galerkin is a spatially compact method that retains its accuracy and robustness on non-smooth unstructured grids and is well suited for time dependent simulations. Several parallelization approaches are studied and evaluated. The most natural and symmetric of the approaches has been implemented in an object-oriented code used to simulate aeroacoustic scattering. The parallel implementation is MPI-based and has been tested on various parallel platforms such as the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. The scalability results presented for the SGI Origin show slightly superlinear speedup on a fixed-size problem due to cache effects.

  18. Modeling Cooperative Threads to Project GPU Performance for Adaptive Parallelism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meng, Jiayuan; Uram, Thomas; Morozov, Vitali A.

    Most accelerators, such as graphics processing units (GPUs) and vector processors, are particularly suitable for accelerating massively parallel workloads. On the other hand, conventional workloads are developed for multi-core parallelism, which often scale to only a few dozen OpenMP threads. When hardware threads significantly outnumber the degree of parallelism in the outer loop, programmers are challenged with efficient hardware utilization. A common solution is to further exploit the parallelism hidden deep in the code structure. Such parallelism is less structured: parallel and sequential loops may be imperfectly nested within each other, neighboring inner loops may exhibit different concurrency patterns (e.g., Reduction vs. Forall), yet have to be parallelized in the same parallel section. Many input-dependent transformations have to be explored. A programmer often employs a larger group of hardware threads to cooperatively walk through a smaller outer loop partition and adaptively exploit any encountered parallelism. This process is time-consuming and error-prone, yet the risk of gaining little or no performance remains high for such workloads. To reduce risk and guide implementation, we propose a technique to model workloads with limited parallelism that can automatically explore and evaluate transformations involving cooperative threads. Eventually, our framework projects the best achievable performance and the most promising transformations without implementing GPU code or using physical hardware. We envision our technique being integrated into future compilers or optimization frameworks for autotuning.
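
    As a minimal illustration of the code structure involved (an OpenMP analogue on the CPU, not the authors' GPU framework; all names are hypothetical), a small outer loop may contain neighboring inner loops with Forall and Reduction patterns that the thread team handles within one parallel section:

        /* Illustrative sketch: a thread team cooperatively walks a small
           outer loop whose inner loops mix Forall and Reduction patterns. */
        void process(double **a, double *sum, int n_outer, int n_inner) {
            for (int i = 0; i < n_outer; i++) {   /* few outer iterations */
                double s = 0.0;
                #pragma omp parallel
                {
                    /* Forall inner loop: iterations split across the team. */
                    #pragma omp for
                    for (int j = 0; j < n_inner; j++)
                        a[i][j] = 2.0 * a[i][j];

                    /* Reduction inner loop in the same parallel section. */
                    #pragma omp for reduction(+:s)
                    for (int j = 0; j < n_inner; j++)
                        s += a[i][j];
                }
                sum[i] = s;
            }
        }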

  19. CXSFIT Code Application to Process Charge-Exchange Recombination Spectroscopy Data at the T-10 Tokamak

    NASA Astrophysics Data System (ADS)

    Serov, S. V.; Tugarinov, S. N.; Klyuchnikov, L. A.; Krupin, V. A.; von Hellermann, M.

    2017-12-01

    The applicability of the CXSFIT code to process experimental data from Charge-eXchange Recombination Spectroscopy (CXRS) diagnostics at the T-10 tokamak is studied with a view to its further use for processing experimental data at the ITER facility. The design and operating principle of the CXRS diagnostics are described. The main methods for processing the CXRS spectra of the 5291-Å line of C^5+ ions at the T-10 tokamak (with and without subtraction of parasitic emission from the edge plasma) are analyzed. The method of averaging the CXRS spectra over several shots, which is used at the T-10 tokamak to increase the signal-to-noise ratio, is described. The approximation of the spectrum by a set of Gaussian components is used to identify the active CXRS line in the measured spectrum. Using the CXSFIT code, the ion temperature in ohmic discharges and discharges with auxiliary electron cyclotron resonance heating (ECRH) at the T-10 tokamak is calculated from the CXRS spectra of the 5291-Å line. The time behavior of the ion temperature profile in different ohmic heating modes is studied. The temperature profile dependence on the ECRH power is measured, and the dynamics of ECR removal of carbon nuclei from the T-10 plasma is described. Experimental data from the CXRS diagnostics at T-10 substantially contribute to the implementation of physical programs of studies on heat and particle transport in tokamak plasmas and investigation of geodesic acoustic mode properties.

  20. Investigating mixing and emptying for aqueous liquid content from the stomach using a coupled biomechanical-SPH model.

    PubMed

    Harrison, Simon M; Cleary, Paul W; Sinnott, Matthew D

    2018-05-18

    The stomach is a critical organ for food digestion but it is not well understood how it operates, either when healthy or when dysfunction occurs. Stomach function depends on the timing and amplitude of wall contractions, the fill level and the type of gastric content. Using a coupled biomechanical-Smoothed Particle Hydrodynamics (B-SPH) model, we investigate how gastric discharge is affected by the contraction behaviour of the stomach wall and the viscosity of the content. The results of the model provide new insights into how the content viscosity and the number of compression waves down the length of the stomach affect the mixing within and the discharge rate of the content exiting from the stomach to the duodenum. This investigation shows that the B-SPH model is capable of simulating complicated stomach behaviour. The rate of gastric emptying is found to increase with a smaller period in between contractile waves and to have a nonlinear relationship with content viscosity. Increased resistance to flow into the duodenum is also shown to reduce the rate of emptying. The degree of gastric mixing is found to be insensitive to changes in the period between contractile waves for fluid with a viscosity of water but to be substantially affected by the viscosity of the gastric content.

  1. Stepwise assembly of a semiconducting coordination polymer [Cd8S(SPh)14(DMF)(bpy)]n and its photodegradation of organic dyes.

    PubMed

    Xu, Chao; Hedin, Niklas; Shi, Hua-Tian; Xin, ZhiFeng; Zhang, Qian-Feng

    2015-04-14

    Chalcogenolate clusters can be interlinked with organic linkers into semiconducting coordination polymers with photocatalytic properties. Here, discrete clusters of Cd8S(SPh)14(DMF)3 were interlinked with 4,4'-bipyridine into a one-dimensional coordination polymer of [Cd8S(SPh)14(DMF)(bpy)]n with helical chains. A stepwise mechanism for the assembly of the coordination polymer in DMF was revealed by an ex situ dynamic light scattering study. The cluster was electrostatically neutral and showed a penta-supertetrahedral structure. During the assembly each cluster was interlinked with two 4,4'-bipyridine molecules, which replaced the two terminal DMF molecules of the clusters. In their solid-state forms, the cluster and the coordination polymer were semiconductors with wide band gaps of 3.08 and 2.80 eV. They photocatalytically degraded rhodamine B and methylene blue in aqueous solutions. The moderate conditions used for the synthesis could allow for further in situ studies of the reaction-assembly of related clusters and coordination polymers.

  2. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Some performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray-YMP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code matches or outperforms the equivalent serial FORTRAN code for some cases.

  3. Xyce Parallel Electronic Simulator Users Guide Version 6.2.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message-passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory, and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  4. Xyce Parallel Electronic Simulator Users Guide Version 6.4

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message-passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory, and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  5. Xyce Parallel Electronic Simulator : users' guide, version 2.0.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont

    2004-06-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices; a client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI); and object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase -- a message-passing parallel implementation -- which allows it to run efficiently on the widest possible range of computing platforms, including serial, shared-memory, and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code.

  6. Performance of an Optimized Eta Model Code on the Cray T3E and a Network of PCs

    NASA Technical Reports Server (NTRS)

    Kouatchou, Jules; Rancic, Miodrag; Geiger, Jim

    2000-01-01

    In the year 2001, NASA will launch the satellite TRIANA, which will be the first Earth observing mission to provide a continuous, full-disk view of the sunlit Earth. As a part of the HPCC Program at NASA GSFC, we have started a project whose objectives are to develop and implement a 3D cloud data assimilation system, by combining TRIANA measurements with model simulation, and to produce accurate statistics of global cloud coverage as an important element of the Earth's climate. For simulation of the atmosphere within this project we are using the NCEP/NOAA operational Eta model. In order to compare TRIANA and the Eta model data on approximately the same grid without significant downscaling, the Eta model will be integrated at a resolution of about 15 km. The integration domain (from -70 to +70 deg in latitude and 150 deg in longitude) will cover most of the sunlit Earth disc and will continuously rotate around the globe following TRIANA. The cloud data assimilation is supposed to run and produce 3D clouds on a near real-time basis. Such a numerical setup and integration design is very ambitious and computationally demanding. Thus, though the Eta model code has been very carefully developed and its computational efficiency has been systematically polished during the years of operational implementation at NCEP, the current MPI version may still have problems with memory and efficiency for the TRIANA simulations. Within this work, we optimize a parallel version of the Eta model code on a Cray T3E and a network of PCs (the HIVE) in order to improve its overall efficiency. Our optimization procedure consists of introducing dynamically allocated arrays to reduce the size of static memory, and of optimizing on a single processor by splitting loops to limit the number of memory streams. All the presented results are derived using an integration domain centered at the equator, with a size of 60 x 60 deg, and with horizontal resolutions of 1/2 and 1/3 deg, respectively.
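
    The loop-splitting optimization mentioned above is easiest to see in a small example. The sketch below is a generic C rendering (the Eta model itself is Fortran, and the array names are hypothetical): one loop that drives six memory streams is split into two loops that drive three each.

        /* Generic illustration of loop splitting (fission) to limit the
           number of concurrent memory streams; not Eta model source. */

        /* Before: one loop reads a, b, c, d and writes x, y (6 streams). */
        void fused(const double *a, const double *b, const double *c,
                   const double *d, double *x, double *y, int n) {
            for (int i = 0; i < n; i++) {
                x[i] = a[i] + b[i];
                y[i] = c[i] * d[i];
            }
        }

        /* After: each loop touches at most 3 streams at a time. */
        void split(const double *a, const double *b, const double *c,
                   const double *d, double *x, double *y, int n) {
            for (int i = 0; i < n; i++) x[i] = a[i] + b[i];
            for (int i = 0; i < n; i++) y[i] = c[i] * d[i];
        }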

  7. Hybrid Parallel Contour Trees, Version 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sewell, Christopher; Fasel, Patricia; Carr, Hamish

    A common operation in scientific visualization is to compute and render a contour of a data set. Given a function of the form f : R^d -> R, a level set is defined as an inverse image f^-1(h) for an isovalue h, and a contour is a single connected component of a level set. The Reeb graph can then be defined to be the result of contracting each contour to a single point, and is well defined for Euclidean spaces or for general manifolds. For simple domains, the graph is guaranteed to be a tree, and is called the contour tree. Analysis can then be performed on the contour tree in order to identify isovalues of particular interest, based on various metrics, and render the corresponding contours, without having to know such isovalues a priori. This code is intended to be the first data-parallel algorithm for computing contour trees. Our implementation will use the portable data-parallel primitives provided by Nvidia's Thrust library, allowing us to compile our same code for both GPUs and multi-core CPUs. Native OpenMP and purely serial versions of the code will likely also be included. It will also be extended to provide a hybrid data-parallel / distributed algorithm, allowing scaling beyond a single GPU or CPU.

  8. Optimisation of a parallel ocean general circulation model

    NASA Astrophysics Data System (ADS)

    Beare, M. I.; Stevens, D. P.

    1997-10-01

    This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.

  9. Nuclide Depletion Capabilities in the Shift Monte Carlo Code

    DOE PAGES

    Davidson, Gregory G.; Pandya, Tara M.; Johnson, Seth R.; ...

    2017-12-21

    A new depletion capability has been developed in the Exnihilo radiation transport code suite. This capability enables massively parallel domain-decomposed coupling between the Shift continuous-energy Monte Carlo solver and the nuclide depletion solvers in ORIGEN to perform high-performance Monte Carlo depletion calculations. This paper describes this new depletion capability and discusses its various features, including a multi-level parallel decomposition, high-order transport-depletion coupling, and energy-integrated power renormalization. Several test problems are presented to validate the new capability against other Monte Carlo depletion codes, and the parallel performance of the new capability is analyzed.

  10. A code for optically thick and hot photoionized media

    NASA Astrophysics Data System (ADS)

    Dumont, A.-M.; Abrassart, A.; Collin, S.

    2000-05-01

    We describe a code designed for hot media (T >= a few 10^4 K) that are optically thick to Compton scattering. It computes the structure of a plane-parallel slab of gas in thermal and ionization equilibrium, illuminated on one or on both sides by a given spectrum. Contrary to other photoionization codes, it solves the transfer of the continuum and of the lines in a two-stream approximation, without using the local escape probability formalism to approximate the line transfer. We stress the importance of taking into account the returning flux even for small column densities (10^22 cm^-2), and we show that the escape probability approximation can lead to strong errors in the thermal and ionization structure, as well as in the emitted spectrum, for a Thomson thickness larger than a few tenths. The transfer code is coupled with a Monte Carlo code which makes it possible to take Compton and inverse Compton scattering into account and to compute the spectrum emitted up to MeV energies, in any geometry. Comparisons with CLOUDY show that it gives similar results for small column densities. Several applications are mentioned.
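
    Schematically, a two-stream treatment replaces the angle-dependent transfer equation by a coupled pair for an outgoing and an ingoing intensity. In a standard textbook form (the notation here is assumed, not necessarily the authors'):

        \frac{dI^{+}(\tau)}{d\tau} = I^{+}(\tau) - S(\tau), \qquad
        -\frac{dI^{-}(\tau)}{d\tau} = I^{-}(\tau) - S(\tau)

    where tau is the optical depth and S the source function. Solving this coupled pair directly, rather than invoking a local escape probability, is what lets the code capture the returning flux I^- from the far side of the slab.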

  11. Performance of a plasma fluid code on the Intel parallel computers

    NASA Technical Reports Server (NTRS)

    Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.

    1992-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchstone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchstone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel Sigma machine gives an improvement factor close to 64 over the single-processor CRAY-2.

  12. Parallel performance of TORT on the CRAY J90: Model and measurement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnett, A.; Azmy, Y.Y.

    1997-10-01

    A limitation on the parallel performance of TORT on the CRAY J90 is the amount of extra work introduced by the multitasking algorithm itself. The extra work beyond that of the serial version of the code, called overhead, arises from the synchronization of the parallel tasks and the accumulation of results by the master task. The goal of recent updates to TORT was to reduce the time consumed by these activities. To help understand which components of the multitasking algorithm contribute significantly to the overhead, a parallel performance model was constructed and compared to measurements of actual timings of the code.
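
    A generic model of this kind (written here as an illustration; the report's actual model may differ) expresses the speedup over p tasks with the two overhead sources the abstract names as explicit terms:

        S(p) = \frac{T_1}{T_1/p + T_{\mathrm{sync}}(p) + T_{\mathrm{acc}}(p)}

    where T_1 is the serial runtime, T_sync(p) the cost of synchronizing the parallel tasks, and T_acc(p) the cost of accumulating results in the master task. Comparing measured S(p) against this curve indicates which overhead term dominates as p grows.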

  13. Hamming and Accumulator Codes Concatenated with MPSK or QAM

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush; Dolinar, Samuel

    2009-01-01

    In a proposed coding-and-modulation scheme, a high-rate binary data stream would be processed as follows: 1. The input bit stream would be demultiplexed into multiple bit streams. 2. The multiple bit streams would be processed simultaneously into a high-rate outer Hamming code that would comprise multiple short constituent Hamming codes - a distinct constituent Hamming code for each stream. 3. The streams would be interleaved. The interleaver would have a block structure that would facilitate parallelization for high-speed decoding. 4. The interleaved streams would be further processed simultaneously into an inner two-state, rate-1 accumulator code that would comprise multiple constituent accumulator codes - a distinct accumulator code for each stream. 5. The resulting bit streams would be mapped into symbols to be transmitted by use of a higher-order modulation - for example, M-ary phase-shift keying (MPSK) or quadrature amplitude modulation (QAM). The novelty of the scheme lies in the concatenation of the multiple-constituent Hamming and accumulator codes and the corresponding parallel architectures of the encoder and decoder circuitry needed to process the multiple bit streams simultaneously. As in the cases of other parallel-processing schemes, one advantage of this scheme is that the overall data rate could be much greater than the data rate of each encoder and decoder stream and, hence, the encoder and decoder could handle data at an overall rate beyond the capability of the individual encoder and decoder circuits.
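
    Steps 1 and 4 are simple enough to sketch. A two-state, rate-1 accumulator is just a running XOR of the input bits, and the demultiplexer deals bits round-robin into the parallel streams. The C below is illustrative only (stream count and framing are hypothetical):

        /* Step 4: one constituent rate-1, two-state accumulator encoder.
           The single state bit makes it "two-state": y[i] = y[i-1] ^ x[i]. */
        void accumulate(const unsigned char *in, unsigned char *out, int n) {
            unsigned char state = 0;           /* the encoder's one state bit */
            for (int i = 0; i < n; i++) {
                state ^= in[i] & 1;            /* running XOR of the inputs   */
                out[i] = state;
            }
        }

        /* Step 1: demultiplex one high-rate stream into k parallel streams,
           each of which feeds its own constituent encoder. */
        void demux(const unsigned char *in, unsigned char **streams,
                   int n, int k) {
            for (int i = 0; i < n; i++)
                streams[i % k][i / k] = in[i];
        }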

  14. Anisotropic Effects on Constitutive Model Parameters of Aluminum Alloys

    DTIC Science & Technology

    2012-01-01

    Constitutive model constants for aluminum alloys were determined at different temperatures. These model constants are required input to computer codes (such as LS-DYNA, DYNA3D, or SPH) to accurately simulate fragment impact on structural components made of high-strength aluminum alloys.

  15. Geopotential Error Analysis from Satellite Gradiometer and Global Positioning System Observables on Parallel Architecture

    NASA Technical Reports Server (NTRS)

    Schutz, Bob E.; Baker, Gregory A.

    1997-01-01

    The recovery of a high resolution geopotential from satellite gradiometer observations motivates the examination of high performance computational techniques. The primary subject matter addresses specifically the use of satellite gradiometer and GPS observations to form and invert the normal matrix associated with a large degree and order geopotential solution. Memory resident and out-of-core parallel linear algebra techniques along with data parallel batch algorithms form the foundation of the least squares application structure. A secondary topic includes the adoption of object oriented programming techniques to enhance modularity and reusability of code. Applications implementing the parallel and object oriented methods successfully calculate the degree variance for a degree and order 110 geopotential solution on 32 processors of the Cray T3E. The memory resident gradiometer application exhibits an overall application performance of 5.4 Gflops, and the out-of-core linear solver exhibits an overall performance of 2.4 Gflops. The combination solution derived from a sun-synchronous gradiometer orbit produces average geoid height variances of 17 millimeters.

  16. Geopotential error analysis from satellite gradiometer and global positioning system observables on parallel architectures

    NASA Astrophysics Data System (ADS)

    Baker, Gregory Allen

    The recovery of a high resolution geopotential from satellite gradiometer observations motivates the examination of high performance computational techniques. The primary subject matter addresses specifically the use of satellite gradiometer and GPS observations to form and invert the normal matrix associated with a large degree and order geopotential solution. Memory resident and out-of-core parallel linear algebra techniques along with data parallel batch algorithms form the foundation of the least squares application structure. A secondary topic includes the adoption of object oriented programming techniques to enhance modularity and reusability of code. Applications implementing the parallel and object oriented methods successfully calculate the degree variance for a degree and order 110 geopotential solution on 32 processors of the Cray T3E. The memory resident gradiometer application exhibits an overall application performance of 5.4 Gflops, and the out-of-core linear solver exhibits an overall performance of 2.4 Gflops. The combination solution derived from a sun-synchronous gradiometer orbit produces average geoid height variances of 17 millimeters.

  17. A CellML simulation compiler and code generator using ODE solving schemes

    PubMed Central

    2012-01-01

    Models written in description languages such as CellML are becoming a popular solution to the handling of complex cellular physiological models in biological function simulations. However, in order to fully simulate a model, boundary conditions and ordinary differential equation (ODE) solving schemes have to be combined with it. Though boundary conditions can be described in CellML, it is difficult to explicitly specify ODE solving schemes using existing tools. In this study, we define an ODE solving scheme description language based on XML and propose a code generation system for biological function simulations. In the proposed system, biological simulation programs using various ODE solving schemes can be easily generated. We designed a two-stage approach where the system generates the equation set associating the physiological model variable values at a certain time t with values at t + Δt in the first stage. The second stage generates the simulation code for the model. This approach enables the flexible construction of code generation modules that can support complex sets of formulas. We evaluate the relationship between models and their calculation accuracies by simulating complex biological models using various ODE solving schemes. Using the FHN model simulation, results showed good qualitative and quantitative correspondence with the theoretical predictions. Results for the Luo-Rudy 1991 model showed that only first-order precision was achieved. In addition, running the generated code in parallel on a GPU made it possible to speed up the calculation time by a factor of 50. The CellML Compiler source code is available for download at http://sourceforge.net/projects/cellmlcompiler. PMID:23083065
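
    As a concrete picture of the first stage, an update for the FHN (FitzHugh-Nagumo) model under the simplest scheme, forward Euler, maps the state at t to the state at t + Δt. The sketch below only illustrates the kind of code such a generator might emit (it is not CellML Compiler output, and the parameter values are the common textbook ones, assumed here):

        /* Forward-Euler step for the FitzHugh-Nagumo model, illustrating a
           generated mapping from state at t to state at t + dt. Parameter
           values are the usual textbook choices, assumed for this sketch. */
        void fhn_step(double *v, double *w, double I_ext, double dt) {
            const double a = 0.7, b = 0.8, eps = 0.08;
            double dv = *v - (*v) * (*v) * (*v) / 3.0 - *w + I_ext;
            double dw = eps * (*v + a - b * (*w));
            *v += dt * dv;   /* v(t+dt) = v(t) + dt * dv/dt */
            *w += dt * dw;   /* w(t+dt) = w(t) + dt * dw/dt */
        }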

  18. Constructions for finite-state codes

    NASA Technical Reports Server (NTRS)

    Pollara, F.; Mceliece, R. J.; Abdel-Ghaffar, K.

    1987-01-01

    A class of codes called finite-state (FS) codes is defined and investigated. These codes, which generalize both block and convolutional codes, are defined by their encoders, which are finite-state machines with parallel inputs and outputs. A family of upper bounds on the free distance of a given FS code is derived from known upper bounds on the minimum distance of block codes. A general construction for FS codes is then given, based on the idea of partitioning a given linear block code into cosets of one of its subcodes, and it is shown that in many cases the FS codes constructed in this way have a free distance d_free which is as large as possible. These codes are found without the need for lengthy computer searches, and have potential applications for future deep-space coding systems. The issue of catastrophic error propagation (CEP) for FS codes is also investigated.

  19. Massively parallel multicanonical simulations

    NASA Astrophysics Data System (ADS)

    Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

    2018-03-01

    Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are computationally inherently serial. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with on the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as a starting point and reference for practitioners in the field.

  20. Parallelization of Lower-Upper Symmetric Gauss-Seidel Method for Chemically Reacting Flow

    NASA Technical Reports Server (NTRS)

    Yoon, Seokkwan; Jost, Gabriele; Chang, Sherry

    2005-01-01

    Development of technologies for exploration of the solar system has revived an interest in computational simulation of chemically reacting flows, since planetary probe vehicles exhibit non-equilibrium phenomena during the atmospheric entry of a planet or a moon as well as the reentry to the Earth. Stability in combustion is essential for new propulsion systems. Numerical solution of real-gas flows often increases computational work by an order of magnitude compared to perfect gas flow, partly because of the increased complexity of the equations to solve. Recently, as part of Project Columbia, NASA has integrated a cluster of interconnected SGI Altix systems to provide a ten-fold increase over its current supercomputing capacity, which includes an SGI Origin system. Both the new and existing machines are based on cache-coherent non-uniform memory access architecture. The Lower-Upper Symmetric Gauss-Seidel (LU-SGS) relaxation method has been implemented into both perfect and real gas flow codes, including the Real-Gas Aerodynamic Simulator (RGAS). However, the vectorized RGAS code runs inefficiently on cache-based shared-memory machines such as the SGI systems. Parallelization of a Gauss-Seidel method is nontrivial due to its sequential nature. The LU-SGS method has been vectorized on an oblique plane in the INS3D-LU code, which has been one of the base codes for the NAS Parallel Benchmarks. The oblique plane has been called a hyperplane by computer scientists. It is straightforward to parallelize a Gauss-Seidel method by partitioning the hyperplanes once they are formed. Another way of parallelization is to schedule processors like a pipeline using software. Both hyperplane and pipeline methods have been implemented using OpenMP directives. The present paper reports the performance of the parallelized RGAS code on SGI Origin and Altix systems.
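
    The hyperplane idea can be sketched in a few lines: in a forward Gauss-Seidel sweep over a 2-D grid, every point on an anti-diagonal i + j = const depends only on points of neighboring diagonals, so each diagonal can be swept as a parallel loop. The C/OpenMP sketch below illustrates this for a Poisson-like update (it is not RGAS or INS3D-LU source, and the stencil is a stand-in):

        /* Hyperplane (wavefront) parallelization of a Gauss-Seidel sweep on
           an n x n grid: points with i + j = d are mutually independent, so
           each hyperplane d is a parallel loop. Illustrative stencil only. */
        void gs_hyperplane(double **u, double **f, int n) {
            for (int d = 2; d <= 2 * n - 4; d++) {   /* sweep hyperplanes */
                int lo = d - (n - 2) > 1     ? d - (n - 2) : 1;
                int hi = d - 1     < n - 2   ? d - 1       : n - 2;
                #pragma omp parallel for
                for (int i = lo; i <= hi; i++) {
                    int j = d - i;
                    /* (i-1,j),(i,j-1) lie on diagonal d-1: already updated;
                       (i+1,j),(i,j+1) lie on diagonal d+1: still old values,
                       exactly as in the serial forward sweep. */
                    u[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                    + u[i][j-1] + u[i][j+1] - f[i][j]);
                }
            }
        }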

  1. Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

    PubMed

    Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

    2016-04-01

    Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes due to systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop workstation; however, due to multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, new computer software, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
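
    The overall pattern is a classic MPI task farm. The single-threaded C sketch below illustrates the scheduling idea only; the real tool adds a second thread per node and failure handling, and the tags and integer task encoding here are hypothetical:

        /* Simplified master/worker task farm: rank 0 deals subtasks to
           workers, which would each launch a non-parallel program. */
        #include <mpi.h>

        #define TAG_TASK 1
        #define TAG_DONE 2
        #define TAG_STOP 3

        int main(int argc, char **argv) {
            int rank, size, ntasks = 100;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            if (rank == 0) {                  /* master: schedules subtasks */
                int sent = 0, done = 0, result;
                MPI_Status st;
                for (int w = 1; w < size && sent < ntasks; w++) {
                    MPI_Send(&sent, 1, MPI_INT, w, TAG_TASK, MPI_COMM_WORLD);
                    sent++;
                }
                while (done < sent) {
                    MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                             MPI_COMM_WORLD, &st);
                    done++;
                    if (sent < ntasks) {      /* refill the idle worker */
                        MPI_Send(&sent, 1, MPI_INT, st.MPI_SOURCE, TAG_TASK,
                                 MPI_COMM_WORLD);
                        sent++;
                    }
                }
                for (int w = 1; w < size; w++)
                    MPI_Send(&rank, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            } else {                          /* worker: executes subtasks */
                int task;
                MPI_Status st;
                for (;;) {
                    MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG,
                             MPI_COMM_WORLD, &st);
                    if (st.MPI_TAG == TAG_STOP) break;
                    /* launch the serial program for this task, e.g. via
                       system(), then report completion to the master */
                    MPI_Send(&task, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
                }
            }
            MPI_Finalize();
            return 0;
        }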

  2. GRay: A Massively Parallel GPU-based Code for Ray Tracing in Relativistic Spacetimes

    NASA Astrophysics Data System (ADS)

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  3. GRay: A MASSIVELY PARALLEL GPU-BASED CODE FOR RAY TRACING IN RELATIVISTIC SPACETIMES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  4. Methods of parallel computation applied on granular simulations

    NASA Astrophysics Data System (ADS)

    Martins, Gustavo H. B.; Atman, Allbens P. F.

    2017-06-01

    Every year, parallel computing becomes cheaper and more accessible, and as a consequence its applications are spreading over all research areas. Granular materials are a promising area for parallel computing. To support this statement we study the impact of parallel computing on simulations of the BNE (Brazil Nut Effect). This effect is the remarkable rise of an intruder confined in a granular medium when it is vertically shaken against gravity. By means of DEM (Discrete Element Method) simulations, we study the code performance, testing different methods to improve wall-clock time. A comparison between serial and parallel algorithms, using OpenMP®, is also shown. The best improvement was obtained by optimizing the function that finds contacts using Verlet cells.
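
    A minimal sketch of that winning optimization, in its usual linked-cell formulation (illustrative data layout and names, not the authors' code): space is binned into cells of roughly one interaction diameter, so contact candidates for a particle are sought only in its own and adjacent cells.

        /* Linked-cell ("Verlet cell") binning for DEM contact search.
           head[c] is the first particle in cell c; next[i] chains the
           remaining particles of that cell (one singly linked list/cell). */
        #define EMPTY (-1)

        void build_cells(const double *x, const double *y, int n, double cell,
                         int ncx, int ncy, int *head, int *next) {
            for (int c = 0; c < ncx * ncy; c++) head[c] = EMPTY;
            for (int i = 0; i < n; i++) {
                int cx = (int)(x[i] / cell);    /* cell column of particle i */
                int cy = (int)(y[i] / cell);    /* cell row of particle i    */
                next[i] = head[cy * ncx + cx];  /* push i onto its cell list */
                head[cy * ncx + cx] = i;
            }
        }

    The contact loop then scans each cell against itself and its eight neighbors (in 2-D), turning the naive O(N^2) pair test into an O(N) pass that also parallelizes naturally over cells with OpenMP.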

  5. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.

  6. Parallel and Portable Monte Carlo Particle Transport

    NASA Astrophysics Data System (ADS)

    Lee, S. R.; Cummings, J. C.; Nolen, S. D.; Keen, N. D.

    1997-08-01

    We have developed a multi-group, Monte Carlo neutron transport code in C++ using object-oriented methods and the Parallel Object-Oriented Methods and Applications (POOMA) class library. This transport code, called MC++, currently computes k and α eigenvalues of the neutron transport equation on a rectilinear computational mesh. It is portable to and runs in parallel on a wide variety of platforms, including MPPs, clustered SMPs, and individual workstations. It contains appropriate classes and abstractions for particle transport and, through the use of POOMA, for portable parallelism. Current capabilities are discussed, along with physics and performance results for several test problems on a variety of hardware, including all three Accelerated Strategic Computing Initiative (ASCI) platforms. Current parallel performance indicates the ability to compute α-eigenvalues in seconds or minutes rather than days or weeks. Current and future work on the implementation of a general transport physics framework (TPF) is also described. This TPF employs modern C++ programming techniques to provide simplified user interfaces, generic STL-style programming, and compile-time performance optimization. Physics capabilities of the TPF will be extended to include continuous energy treatments, implicit Monte Carlo algorithms, and a variety of convergence acceleration techniques such as importance combing.

  7. A fully coupled 3D transport model in SPH for multi-species reaction-diffusion systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adami, Stefan; Hu, X. Y.; Adams, N. A.

    2011-08-23

    In this paper we present a fully generalized transport model for multiple species in complex two- and three-dimensional geometries. Based on previous work [1] we have extended our interfacial reaction-diffusion model to handle arbitrary numbers of species, allowing for coupled reaction models. Each species is tracked independently, and we consider different physics of a species with respect to the bulk phases in contact. We use our SPH model to simulate the reaction-diffusion problem on a pore-scale level of a solid oxide fuel cell (SOFC) with special emphasis on the effect of surface diffusion.

  8. Coding Instructions, Worksheets, and Keypunch Sheets for M.E.T.R.O.-APEX Simulation.

    ERIC Educational Resources Information Center

    Michigan Univ., Ann Arbor. Environmental Simulation Lab.

    Compiled in this resource are coding instructions, worksheets, and keypunch sheets for use in the M.E.T.R.O.-APEX simulation, described in detail in documents ED 064 530 through ED 064 550. Air Pollution Exercise (APEX) is a computerized college and professional level "real world" simulation of a community with urban and rural problems, industrial…

  9. High-resolution whole-brain diffusion MRI at 7T using radiofrequency parallel transmission.

    PubMed

    Wu, Xiaoping; Auerbach, Edward J; Vu, An T; Moeller, Steen; Lenglet, Christophe; Schmitter, Sebastian; Van de Moortele, Pierre-François; Yacoub, Essa; Uğurbil, Kâmil

    2018-03-30

    We investigate the utility of RF parallel transmission (pTx) for Human Connectome Project (HCP)-style whole-brain diffusion MRI (dMRI) data at 7 Tesla (7T). Healthy subjects were scanned in pTx and single-transmit (1Tx) modes. Multiband (MB), single-spoke pTx pulses were designed to image sagittal slices. HCP-style dMRI data (i.e., 1.05-mm resolution, MB2, b-values = 1000/2000 s/mm^2, 286 images, and a 40-min scan) and data with higher accelerations (MB3 and MB4) were acquired with pTx. pTx significantly improved flip-angle detected signal uniformity across the brain, yielding ∼19% increase in temporal SNR (tSNR) averaged over the brain relative to 1Tx. This allowed significantly enhanced estimation of multiple fiber orientations (with ∼21% decrease in dispersion) in HCP-style 7T dMRI datasets. Additionally, pTx pulses achieved substantially lower power deposition, permitting higher accelerations, enabling collection of the same data in 2/3 and 1/2 the scan time or of more data in the same scan time. pTx provides a solution to two major limitations for slice-accelerated high-resolution whole-brain dMRI at 7T; it improves flip-angle uniformity, and enables higher slice acceleration relative to current state-of-the-art. As such, pTx provides significant advantages for rapid acquisition of high-quality, high-resolution truly whole-brain dMRI data. © 2018 International Society for Magnetic Resonance in Medicine.

  10. Parallelization of the TRIGRS model for rainfall-induced landslides using the message passing interface

    USGS Publications Warehouse

    Alvioli, M.; Baum, R.L.

    2016-01-01

    We describe a parallel implementation of TRIGRS, the Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Model for the timing and distribution of rainfall-induced shallow landslides. We have parallelized the four time-demanding execution modes of TRIGRS, namely both the saturated and unsaturated model with finite and infinite soil depth options, within the Message Passing Interface framework. In addition to new features of the code, we outline details of the parallel implementation and show the performance gain with respect to the serial code. Results are obtained both on commercial hardware and on a high-performance multi-node machine, showing the different limits of applicability of the new code. We also discuss the implications for the application of the model on large-scale areas and as a tool for real-time landslide hazard monitoring.

  11. Toward an automated parallel computing environment for geosciences

    NASA Astrophysics Data System (ADS)

    Zhang, Huai; Liu, Mian; Shi, Yaolin; Yuen, David A.; Yan, Zhenzhen; Liang, Guoping

    2007-08-01

    Software for geodynamic modeling has not kept up with the fast growing computing hardware and network resources. In the past decade supercomputing power has become available to most researchers in the form of affordable Beowulf clusters and other parallel computer platforms. However, to take full advantage of such computing power requires developing parallel algorithms and associated software, a task that is often too daunting for geoscience modelers whose main expertise is in geosciences. We introduce here an automated parallel computing environment built on open-source algorithms and libraries. Users interact with this computing environment by specifying the partial differential equations, solvers, and model-specific properties using an English-like modeling language in the input files. The system then automatically generates the finite element codes that can be run on distributed or shared memory parallel machines. This system is dynamic and flexible, allowing users to address different problems in geosciences. It is capable of providing web-based services, enabling users to generate source codes online. This unique feature will facilitate high-performance computing to be integrated with distributed data grids in the emerging cyber-infrastructures for geosciences. In this paper we discuss the principles of this automated modeling environment and provide examples to demonstrate its versatility.

  12. Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN

    PubMed Central

    Hammond, G E; Lichtner, P C; Mills, R T

    2014-01-01

    To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted. PMID:25506097

  13. Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN.

    PubMed

    Hammond, G E; Lichtner, P C; Mills, R T

    2014-01-01

    To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted.

  14. Parallel evolutionary computation in bioinformatics applications.

    PubMed

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle its inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of its efficient execution on a wide range of parallel architectures. The proposed approach focuses on ease of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows the user to easily configure its environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  15. Support for Debugging Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

    2001-01-01

    We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.

  16. Relative Debugging of Automatically Parallelized Programs

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

    2002-01-01

    We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular, the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.

  17. Parallel Subspace Subcodes of Reed-Solomon Codes for Magnetic Recording Channels

    ERIC Educational Resources Information Center

    Wang, Han

    2010-01-01

    Read channel architectures based on a single low-density parity-check (LDPC) code are being considered for the next generation of hard disk drives. However, LDPC-only solutions suffer from the error floor problem, which may compromise reliability if not handled properly. Concatenated architectures using an LDPC code plus a Reed-Solomon (RS) code…

  18. Parallel ALLSPD-3D: Speeding Up Combustor Analysis Via Parallel Processing

    NASA Technical Reports Server (NTRS)

    Fricker, David M.

    1997-01-01

    The ALLSPD-3D Computational Fluid Dynamics code for reacting flow simulation was run on a set of benchmark test cases to determine its parallel efficiency. These test cases included non-reacting and reacting flow simulations with varying numbers of processors. The tests also explored the effects of scaling the simulation with the number of processors in addition to distributing a constant-size problem over an increasing number of processors. The test cases were run on a cluster of IBM RS/6000 Model 590 workstations with Ethernet and ATM networking, plus a shared memory SGI Power Challenge L workstation. The results indicate that the network capabilities significantly influence the parallel efficiency, i.e., a shared memory machine is fastest and ATM networking provides acceptable performance. The limitations of Ethernet greatly hamper the rapid calculation of flows using ALLSPD-3D.
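
    For readers unfamiliar with the metric, parallel efficiency is the measured speedup divided by the processor count: S(p) = T(1)/T(p) and E(p) = S(p)/p. A minimal C sketch with hypothetical timings (not the benchmark data from this study):

      #include <stdio.h>

      int main(void)
      {
          /* Hypothetical wall-clock times in seconds, for illustration only. */
          double t1   = 1000.0;                  /* serial time T(1) */
          double tp[] = {520.0, 280.0, 160.0};   /* times on 2, 4, 8 processors */
          int    p[]  = {2, 4, 8};

          for (int i = 0; i < 3; i++) {
              double speedup    = t1 / tp[i];    /* S(p) = T(1)/T(p) */
              double efficiency = speedup / p[i];/* E(p) = S(p)/p    */
              printf("p=%d  speedup=%.2f  efficiency=%.2f\n",
                     p[i], speedup, efficiency);
          }
          return 0;
      }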

  19. Runtime Detection of C-Style Errors in UPC Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pirkelbauer, P; Liao, C; Panas, T

    2011-09-29

    Unified Parallel C (UPC) extends the C programming language (ISO C99) with explicit parallel programming support for the partitioned global address space (PGAS), which provides a global memory space with localized partitions for each thread. Like its ancestor C, UPC is a low-level language that emphasizes code efficiency over safety. The absence of dynamic (and static) safety checks allows programmer oversights and software flaws that can be hard to spot. In this paper, we present an extension of a dynamic analysis tool, ROSE-Code Instrumentation and Runtime Monitor (ROSE-CIRM), for UPC to help programmers find C-style errors involving the global address space. Built on top of the ROSE source-to-source compiler infrastructure, the tool instruments source files with code that monitors operations and keeps track of changes to the system state. The resulting code is linked to a runtime monitor that observes the program execution and finds software defects. We describe the extensions to ROSE-CIRM that were necessary to support UPC. We discuss complications that arise from parallel code and our solutions. We test ROSE-CIRM against a runtime error detection test suite, and present performance results obtained from running error-free codes. ROSE-CIRM is released as part of the ROSE compiler under a BSD-style open source license.
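
    The UPC fragment below shows the kind of C-style defect such a monitor targets: an off-by-one read past the end of a shared (PGAS) array, which neither the compiler nor the UPC runtime flags. It is an invented example, not taken from the ROSE-CIRM test suite.

      #include <upc.h>

      #define N 16
      shared int v[N];   /* cyclically distributed across all threads */

      int main(void)
      {
          /* Each thread initializes the elements with affinity to it. */
          upc_forall (int i = 0; i < N; i++; &v[i])
              v[i] = MYTHREAD;
          upc_barrier;

          /* C-style defect: off-by-one read past the end of the shared
           * array.  A dynamic monitor such as ROSE-CIRM is designed to
           * catch exactly this kind of oversight. */
          if (MYTHREAD == 0) {
              int x = v[N];   /* out-of-bounds access into the PGAS */
              (void)x;
          }
          return 0;
      }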

  20. Full Wave Parallel Code for Modeling RF Fields in Hot Plasmas

    NASA Astrophysics Data System (ADS)

    Spencer, Joseph; Svidzinski, Vladimir; Evstatiev, Evstati; Galkin, Sergei; Kim, Jin-Soo

    2015-11-01

    FAR-TECH, Inc. is developing a suite of full wave RF codes for hot plasmas. It is based on a formulation in configuration space with grid adaptation capability. The conductivity kernel (which includes a nonlocal dielectric response) is calculated by integrating the linearized Vlasov equation along unperturbed test particle orbits. For tokamak applications, a 2-D version of the code is being developed. Progress on this work will be reported. This suite of codes has the following advantages over existing spectral codes: 1) it utilizes the localized nature of the plasma dielectric response to the RF field and calculates this response numerically without approximations; 2) it uses an adaptive grid to better resolve resonances in the plasma and antenna structures; 3) it uses an efficient sparse matrix solver to solve the formulated linear equations. The linear wave equation is formulated using two approaches: for cold plasmas the local cold plasma dielectric tensor is used (resolving resonances by particle collisions), while for hot plasmas the conductivity kernel is calculated. Work is supported by the U.S. DOE SBIR program.

  1. Tutorial: Parallel Computing of Simulation Models for Risk Analysis.

    PubMed

    Reilly, Allison C; Staid, Andrea; Gao, Michael; Guikema, Seth D

    2016-10-01

    Simulation models are widely used in risk analysis to study the effects of uncertainties on outcomes of interest in complex problems. Often, these models are computationally complex and time consuming to run. This latter point may be at odds with time-sensitive evaluations or may limit the number of parameters that are considered. In this article, we give an introductory tutorial focused on parallelizing simulation code to better leverage modern computing hardware, enabling risk analysts to better utilize simulation-based methods for quantifying uncertainty in practice. This article is aimed primarily at risk analysts who use simulation methods but do not yet utilize parallelization to decrease the computational burden of these models. The discussion is focused on conceptual aspects of embarrassingly parallel computer code and software considerations. Two complementary examples are shown using the languages MATLAB and R. A brief discussion of hardware considerations is located in the Appendix. © 2016 Society for Risk Analysis.
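
    The article's worked examples are in MATLAB and R; the same embarrassingly parallel pattern looks like this in C with OpenMP, here estimating π by Monte Carlo with an independent random stream per thread. This is a sketch in the spirit of the tutorial, not the article's code.

      #include <omp.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          const long trials = 10000000;
          long hits = 0;

          /* Each replication is independent, so the loop parallelizes with
           * no communication -- the "embarrassingly parallel" case the
           * tutorial targets.  Each thread gets its own RNG seed. */
          #pragma omp parallel reduction(+:hits)
          {
              unsigned seed = 12345u + 977u * (unsigned)omp_get_thread_num();
              #pragma omp for
              for (long i = 0; i < trials; i++) {
                  double x = (double)rand_r(&seed) / RAND_MAX;
                  double y = (double)rand_r(&seed) / RAND_MAX;
                  if (x * x + y * y <= 1.0)
                      hits++;
              }
          }
          printf("pi ~ %f\n", 4.0 * (double)hits / (double)trials);
          return 0;
      }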

  2. A Parallel Numerical Algorithm To Solve Linear Systems Of Equations Emerging From 3D Radiative Transfer

    NASA Astrophysics Data System (ADS)

    Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.

    2016-10-01

    Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is considerable room for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ*, which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm that takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
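
    The abstract does not spell out the iteration, but the approximate Λ-operator splitting it builds on is conventionally written as

      J_(n+1) = Λ*[S_(n+1)] + (Λ − Λ*)[S_(n)],

    so each step only requires solving against the narrow-banded Λ*, while the full Λ acts on the already-known source function from the previous iteration. Exploiting the band structure of Λ* is therefore the natural target for a tailored parallel algorithm.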

  3. Use Computer-Aided Tools to Parallelize Large CFD Applications

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Yan, J.

    2000-01-01

    Porting applications to high performance parallel computers is always a challenging task. It is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem becomes even more severe. Today, scalability and high performance mostly rely on handwritten parallel programs using message-passing libraries (e.g., MPI). However, this process is very difficult and often error-prone. The recent reemergence of shared memory parallel (SMP) architectures, such as the cache coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space. The user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs for SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C and C++ to express shared memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives may not necessarily enhance performance; in the worst cases, they can create erroneous results. While vendors have provided tools to perform error-checking and profiling, automation in directive insertion is very limited and often fails on large programs, primarily due to the lack of a thorough enough data dependence analysis. To overcome this deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization. CAPO is aimed at taking advantage of the detailed inter-procedural dependence analysis provided by CAPTools, developed by the University of
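
    For readers unfamiliar with the directive style, the sketch below shows the kind of annotation a tool such as CAPO inserts once dependence analysis has proven a loop's iterations independent. The array names are invented, and the example is in C rather than the Fortran that CAPO targets.

      #include <stdio.h>

      #define N 1000000

      int main(void)
      {
          static double a[N], b[N], c[N];
          double sum = 0.0;

          for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }

          /* The directive an automatic tool would insert after proving the
           * iterations independent: i is private to each thread, and sum
           * is combined safely via a reduction clause. */
          #pragma omp parallel for reduction(+:sum)
          for (int i = 0; i < N; i++) {
              a[i] = b[i] + c[i];
              sum += a[i];
          }

          printf("sum = %g\n", sum);
          return 0;
      }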

  4. PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Joshi, Abhijit S; Jain, Prashant K; Mudrich, Jaime A

    2012-01-01

    At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed in FORTRAN-90 and parallelized using the Message Passing Interface (MPI) library. The Silo library is used to compact and write the data files, and the VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multi relaxation time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted, allowing large scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.
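
    The local update described in the last two sentences can be sketched compactly. Assuming lattice units (Δx = Δt = 1) and a strain-rate magnitude |S| already recovered from the non-equilibrium moments, the standard Smagorinsky-LBM relation is ν_e = ν₀ + (C_s Δ)²|S| with BGK relaxation time τ = 3ν_e + 1/2. A minimal C sketch of this relation (not PRATHAM source, which is FORTRAN-90):

      #include <stdio.h>

      /* LES-adjusted BGK relaxation time in lattice units (dx = dt = 1):
       *   nu_e = nu0 + (Cs * Delta)^2 * |S|   (Smagorinsky eddy viscosity)
       *   tau  = 3 * nu_e + 0.5               (BGK viscosity relation)
       * |S| is the local strain-rate magnitude, in practice recovered from
       * the non-equilibrium part of the distribution functions. */
      static double les_relaxation_time(double nu0, double cs, double delta,
                                        double strain_mag)
      {
          double nu_e = nu0 + (cs * delta) * (cs * delta) * strain_mag;
          return 3.0 * nu_e + 0.5;
      }

      int main(void)
      {
          /* illustrative numbers only */
          printf("tau = %f\n", les_relaxation_time(0.01, 0.1, 1.0, 0.05));
          return 0;
      }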

  5. Constructing Neuronal Network Models in Massively Parallel Environments.

    PubMed

    Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.
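
    The allocator effect is easy to picture: if each thread carves its objects out of a private arena, the hot construction loop never touches a lock on the global heap. A minimal C/OpenMP sketch of the idea (invented for illustration, not the simulator's allocator):

      #include <omp.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* A private bump allocator per thread: one malloc up front, then
       * lock-free carving.  This is the essence of why thread-optimized
       * allocators recover scaling during network construction --
       * concurrent malloc calls otherwise contend on the global heap. */
      typedef struct { char *base; size_t used, cap; } arena_t;

      static void *arena_alloc(arena_t *a, size_t n)
      {
          if (a->used + n > a->cap) return NULL;   /* sketch: no growth */
          void *p = a->base + a->used;
          a->used += n;
          return p;
      }

      int main(void)
      {
          #pragma omp parallel
          {
              arena_t a = { malloc(1 << 20), 0, 1 << 20 };
              long count = 0;
              for (int i = 0; i < 10000; i++)   /* e.g., one "synapse" each */
                  if (arena_alloc(&a, 64)) count++;
              #pragma omp critical
              printf("thread %d allocated %ld objects\n",
                     omp_get_thread_num(), count);
              free(a.base);
          }
          return 0;
      }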

  6. Constructing Neuronal Network Models in Massively Parallel Environments

    PubMed Central

    Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808

  7. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.

  8. Parallel design of JPEG-LS encoder on graphics processing units

    NASA Astrophysics Data System (ADS)

    Duan, Hao; Fang, Yong; Huang, Bormin

    2012-01-01

    With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth, and many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels, and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the Compute Unified Device Architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
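
    The block-parallel strategy can be sketched without any GPU code: blocks are mutually independent, while pixels inside a block must stay in raster order because each prediction depends on already-coded neighbors. The C/OpenMP sketch below uses the real JPEG-LS median (MED) predictor but a stand-in for the per-block bitstream; it illustrates the mapping, not the authors' CUDA kernel.

      #include <omp.h>
      #include <stdio.h>
      #include <stdlib.h>

      #define W 512
      #define H 512
      #define B 64

      /* JPEG-LS median (MED) predictor -- the dependency on the left,
       * upper, and upper-left neighbors is what forces sequential coding
       * inside a block. */
      static int med(int a, int b, int c)
      {
          int mn = a < b ? a : b, mx = a < b ? b : a;
          if (c >= mx) return mn;
          if (c <= mn) return mx;
          return a + b - c;
      }

      int main(void)
      {
          unsigned char *img = malloc(W * H);
          for (long i = 0; i < W * H; i++) img[i] = (unsigned char)(i * 31);

          long total = 0;
          /* Blocks are independent, so they spread across threads (or GPU
           * thread blocks); pixels inside a block stay raster-ordered. */
          #pragma omp parallel for collapse(2) reduction(+:total)
          for (int by = 0; by < H; by += B)
              for (int bx = 0; bx < W; bx += B) {
                  long resid = 0;
                  for (int y = by; y < by + B; y++)
                      for (int x = bx; x < bx + B; x++) {
                          int a = x > bx ? img[y * W + x - 1] : 0;
                          int b = y > by ? img[(y - 1) * W + x] : 0;
                          int c = (x > bx && y > by) ? img[(y - 1) * W + x - 1] : 0;
                          resid += abs((int)img[y * W + x] - med(a, b, c));
                      }
                  total += resid;   /* stand-in for the per-block bitstream */
              }
          printf("sum of prediction residuals: %ld\n", total);
          free(img);
          return 0;
      }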

  9. SphK1 mediates hepatic inflammation in a mouse model of NASH induced by high saturated fat feeding and initiates proinflammatory signaling in hepatocytes

    PubMed Central

    Geng, Tuoyu; Sutter, Alton; Harland, Michael D.; Law, Brittany A.; Ross, Jessica S.; Lewin, David; Palanisamy, Arun; Russo, Sarah B.; Chavin, Kenneth D.; Cowart, L. Ashley

    2015-01-01

    Steatohepatitis occurs in up to 20% of patients with fatty liver disease and leads to its primary disease outcomes, including fibrosis, cirrhosis, and increased risk of hepatocellular carcinoma. Mechanisms that mediate this inflammation are of major interest. We previously showed that overload of saturated fatty acids, such as that which occurs with metabolic syndrome, induced sphingosine kinase 1 (SphK1), an enzyme that generates sphingosine-1-phosphate (S1P). While data suggest beneficial roles for S1P in some contexts, we hypothesized that it may promote hepatic inflammation in the context of obesity. Consistent with this, we observed 2-fold elevation of this enzyme in livers from humans with nonalcoholic fatty liver disease and also in mice with high saturated fat feeding, which recapitulated the human disease. Mice exhibited activation of NFκB, elevated cytokine production, and immune cell infiltration. Importantly, SphK1-null mice were protected from these outcomes. Studies in cultured cells demonstrated saturated fatty acid induction of SphK1 message, protein, and activity, and also a requirement of the enzyme for NFκB signaling and increased mRNA encoding TNFα and MCP1. Moreover, saturated fat-induced NFκB signaling and elevation of TNFα and MCP1 mRNA in HepG2 cells was blocked by targeted knockdown of S1P receptor 1, supporting a role for this lipid signaling pathway in inflammation in nonalcoholic fatty liver disease. PMID:26482537

  10. Increasing processor utilization during parallel computation rundown

    NASA Technical Reports Server (NTRS)

    Jones, W. H.

    1986-01-01

    Some parallel processing environments provide for asynchronous execution and completion of general purpose parallel computations from a single computational phase. When all the computations from such a phase are complete, a new parallel computational phase is begun. Depending upon the granularity of the parallel computations to be performed, there may be a shortage of available work as a particular computational phase draws to a close (computational rundown). This can result in the waste of computing resources and the delay of the overall problem. In many practical instances, strict sequential ordering of phases of parallel computation is not totally required. In such cases, the beginning of one phase can be correctly computed before the end of a previous phase is completed. This allows additional work to be generated somewhat earlier to keep computing resources busy during each computational rundown. The conditions under which this can occur are identified and the frequency of occurrence of such overlapping in an actual parallel Navier-Stokes code is reported. A language construct is suggested and possible control strategies for the management of such computational phase overlapping are discussed.
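
    Modern tasking constructs express exactly this kind of phase overlap. The C/OpenMP sketch below (a present-day analogue; task dependences did not exist when this report was written) lets a chunk of the next phase start as soon as the matching chunk of the current phase finishes, so processors stay busy during the rundown of phase 1:

      #include <stdio.h>

      /* Two "phases" over four chunks.  Chunk i of phase 2 depends only on
       * chunk i of phase 1, so phase-2 work can start while other phase-1
       * chunks are still running. */
      int main(void)
      {
          int data[4] = {0, 0, 0, 0};
          #pragma omp parallel
          #pragma omp single
          for (int i = 0; i < 4; i++) {
              #pragma omp task depend(out: data[i]) firstprivate(i)
              data[i] = i + 1;                 /* phase 1, chunk i */
              #pragma omp task depend(in: data[i]) firstprivate(i)
              printf("phase 2 consumed chunk %d -> %d\n", i, data[i] * 10);
          }
          return 0;
      }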

  11. Application of a Scalable, Parallel, Unstructured-Grid-Based Navier-Stokes Solver

    NASA Technical Reports Server (NTRS)

    Parikh, Paresh

    2001-01-01

    A parallel version of an unstructured-grid based Navier-Stokes solver, USM3Dns, previously developed for efficient operation on a variety of parallel computers, has been enhanced to incorporate upgrades made to the serial version. The resultant parallel code has been extensively tested on a variety of problems of aerospace interest and on two sets of parallel computers to understand and document its characteristics. An innovative grid renumbering construct and use of non-blocking communication are shown to produce superlinear computing performance. Preliminary results from parallelization of a recently introduced "porous surface" boundary condition are also presented.
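
    The non-blocking pattern credited above looks like this in outline: post the exchange, compute on interior data while messages are in flight, then wait. A minimal C/MPI sketch (illustrative, not USM3Dns code):

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          double send = (double)rank, recv = -1.0;
          int right = (rank + 1) % size, left = (rank + size - 1) % size;

          /* Post the halo exchange without blocking. */
          MPI_Request req[2];
          MPI_Irecv(&recv, 1, MPI_DOUBLE, left, 0, MPI_COMM_WORLD, &req[0]);
          MPI_Isend(&send, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[1]);

          /* ... interior work proceeds here, overlapped with communication ... */

          MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
          printf("rank %d received halo value %g from rank %d\n", rank, recv, left);

          MPI_Finalize();
          return 0;
      }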

  12. MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

    NASA Astrophysics Data System (ADS)

    Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

    2018-02-01

    We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computing increases.
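
    The underlying pattern is a simple task farm: each MPI rank claims every size-th run from a list of independent invocations. A minimal C sketch with a placeholder command line (MPI_XSTAR itself is C++ and its actual invocation differs):

      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* Rank-strided dispatch of independent runs -- the basic pattern
       * behind parallelizing "multiple runs" of a serial code.  The
       * command string below is a hypothetical placeholder. */
      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int nruns = 16;   /* e.g., a grid of ionization parameters */
          for (int job = rank; job < nruns; job += size) {
              char cmd[256];
              snprintf(cmd, sizeof cmd,
                       "echo xstar run %d  # placeholder for the real command", job);
              if (system(cmd) != 0)
                  fprintf(stderr, "rank %d: job %d failed\n", rank, job);
          }

          MPI_Finalize();
          return 0;
      }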

  13. Multiple Independent File Parallel I/O with HDF5

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, M. C.

    2016-07-13

    The HDF5 library has supported the I/O requirements of HPC codes at Lawrence Livermore National Labs (LLNL) since the late 90's. In particular, HDF5 used in the Multiple Independent File (MIF) parallel I/O paradigm has supported LLNL codes' scalable I/O requirements and has recently been gainfully used at scales as large as O(10^6) parallel tasks.
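
    In outline, MIF groups N tasks onto M files and serializes access to each file with a passed baton, so every HDF5 call is plain independent I/O with no collective coordination. A minimal C sketch of the pattern (file and group names invented; error checking omitted):

      #include <hdf5.h>
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int NFILES = 2;            /* tunable: M files for N tasks */
          int file_id = rank % NFILES;     /* which file this rank shares  */
          int token = 0;

          char fname[64];
          snprintf(fname, sizeof fname, "mif_%03d.h5", file_id);

          /* Wait for the baton from the previous rank writing this file. */
          if (rank >= NFILES)
              MPI_Recv(&token, 1, MPI_INT, rank - NFILES, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);

          /* First rank in each group creates the file; later ranks append. */
          hid_t file = (rank < NFILES)
              ? H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)
              : H5Fopen(fname, H5F_ACC_RDWR, H5P_DEFAULT);

          char gname[32];
          snprintf(gname, sizeof gname, "rank_%05d", rank);
          hid_t grp = H5Gcreate2(file, gname, H5P_DEFAULT, H5P_DEFAULT,
                                 H5P_DEFAULT);
          /* ... write this rank's datasets under grp ... */
          H5Gclose(grp);
          H5Fclose(file);

          /* Pass the baton to the next rank in this file's group. */
          if (rank + NFILES < size)
              MPI_Send(&token, 1, MPI_INT, rank + NFILES, 0, MPI_COMM_WORLD);

          MPI_Finalize();
          return 0;
      }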

  14. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    PubMed

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment, involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speedup factor of 46.66 compared with the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that parallel computing provides the high computational power needed to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real time.

  15. Exploiting Symmetry on Parallel Architectures.

    NASA Astrophysics Data System (ADS)

    Stiller, Lewis Benjamin

    1995-01-01

    This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs and discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.

  16. Parallel Iterative Methods: Applications in Neutronics and Fluid Mechanics

    NASA Astrophysics Data System (ADS)

    Qaddouri, Abdessamad

    In this thesis, parallel computing is applied successively to neutronics and to fluid mechanics. In each of these two applications, iterative methods are used to solve the system of algebraic equations resulting from the discretization of the equations of the physical problem. In the neutronics problem, the computation of the collision probability (CP) matrices as well as a multigroup iterative scheme using an inverse power method are parallelized. In the fluid mechanics problem, a finite element code using a preconditioned GMRES-type iterative algorithm is parallelized. This thesis is presented as six articles followed by a conclusion. The first five articles deal with the neutronics applications and trace the evolution of our work in this field. This evolution begins with a parallel computation of the CP matrices and a parallel multigroup algorithm tested on a one-dimensional problem (article 1), followed by two parallel algorithms, one multiregion and the other multigroup, tested on two-dimensional problems (articles 2-3). These first two steps are followed by the application of two acceleration techniques, neutron rebalancing and residual minimization, to the two parallel algorithms (article 4). Finally, the multigroup algorithm and the parallel computation of the CP matrices were implemented in the production code DRAGON, where the tests are more realistic and can be three-dimensional (article 5). The sixth article (article 6), devoted to the fluid mechanics application, deals with the parallelization of a finite element code FES in which the graph partitioner METIS and the PSPARSLIB library are used.

  17. An object-oriented approach to nested data parallelism

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.; Chatterjee, Siddhartha

    1994-01-01

    This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are also collections, then there is the possibility for 'nested data parallelism.' Few current programming languages support nested data parallelism, however. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.
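
    The flattening transformation is easiest to see on data. A nested collection stored flat, as one element array plus segment offsets, turns a nested foreach into a single parallel loop over all elements, which balances load even when inner collections differ wildly in length. A C/OpenMP sketch of the flattened form (the paper's own system instead emits vector code from C++ templates):

      #include <stdio.h>

      /* A "vector of vectors" stored flat: all elements in one array plus
       * segment offsets.  A nested foreach (over rows, then over row
       * elements) flattens to ONE parallel loop over the element array. */
      int main(void)
      {
          double elems[] = {1, 2, 3,   4, 5,   6, 7, 8, 9};   /* 3 segments */
          int    offs[]  = {0, 3, 5, 9};                      /* segment starts */
          int    nseg = 3, total = offs[3];
          double segsum[3] = {0, 0, 0};

          /* Flat, load-balanced loop: no idle processors even when the
           * inner "rows" have very different lengths. */
          #pragma omp parallel for
          for (int i = 0; i < total; i++) {
              int s = 0;                     /* find owning segment; real */
              while (offs[s + 1] <= i) s++;  /* systems precompute this   */
              #pragma omp atomic
              segsum[s] += elems[i] * 2.0;   /* elementwise op + per-row sum */
          }

          for (int s = 0; s < nseg; s++)
              printf("segment %d sum: %g\n", s, segsum[s]);
          return 0;
      }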

  18. A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

    1992-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
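
    For orientation, the static FETI system that this transient method extends couples subdomain equilibrium to interface continuity through Lagrange multipliers λ (written here in common notation, which may differ from the paper's):

      K_s u_s = f_s + B_sᵀ λ      (equilibrium of subdomain s)
      Σ_s B_s u_s = 0             (continuity across subdomain interfaces)

    Eliminating the subdomain unknowns u_s leaves a much smaller dual problem in λ alone, which is solved iteratively; only subdomain-sized matrices are ever factored, which is where the storage advantage over a global factorization comes from.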

  19. Experience in highly parallel processing using DAP

    NASA Technical Reports Server (NTRS)

    Parkinson, D.

    1987-01-01

    Distributed Array Processors (DAP) have been in day-to-day use for ten years, and a large amount of user experience has been gained. The profile of user applications is similar to that of the Massively Parallel Processor (MPP) working group. Experience has shown that, contrary to expectations, highly parallel systems provide excellent performance on so-called dirty problems, such as the physics part of meteorological codes. The reasons for this observation are discussed. The arguments against replacing bit processors with floating-point processors are also discussed.

  20. Line-drawing algorithms for parallel machines

    NASA Technical Reports Server (NTRS)

    Pang, Alex T.

    1990-01-01

    The fact that conventional line-drawing algorithms, when applied directly on parallel machines, can lead to very inefficient codes is addressed. It is suggested that instead of modifying an existing algorithm for a parallel machine, a more efficient implementation can be produced by going back to the invariants in the definition. Popular line-drawing algorithms are compared with two alternatives: distance to a line (a point is on the line if it is sufficiently close to it) and intersection with a line (a point is on the line if it is an intersection point). For massively parallel single-instruction-multiple-data (SIMD) machines (with thousands of processors and up), the alternatives provide viable line-drawing algorithms. Because of the pixel-per-processor mapping, their performance is independent of the line length and orientation.
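
    The distance-to-line alternative reduces to one predicate evaluated at every pixel, so on a SIMD array the cost is a single step regardless of the line's length or slope. A minimal C sketch, with serial loops standing in for the processor grid:

      #include <math.h>
      #include <stdio.h>

      #define W 16
      #define H 8

      /* Pixel-per-processor formulation: every pixel evaluates the same
       * predicate -- distance from the pixel center to the line
       * ax + by + c = 0 below half a pixel -- so on a SIMD array the cost
       * is independent of line length and orientation. */
      int main(void)
      {
          double a = 1.0, b = -2.0, c = 0.0;   /* line x - 2y = 0 */
          double norm = sqrt(a * a + b * b);

          for (int y = H - 1; y >= 0; y--) {
              for (int x = 0; x < W; x++) {
                  double d = fabs(a * x + b * y + c) / norm;
                  putchar(d < 0.5 ? '#' : '.');
              }
              putchar('\n');
          }
          return 0;
      }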