Sample records for three-dimensional massively parallel

  1. Massively parallel GPU-accelerated minimization of classical density functional theory

    NASA Astrophysics Data System (ADS)

    Stopper, Daniel; Roth, Roland

    2017-08-01

    In this paper, we discuss the ability to numerically minimize the grand potential of hard disks in two-dimensional and of hard spheres in three-dimensional space within the framework of classical density functional and fundamental measure theory on modern graphics cards. Our main finding is that a massively parallel minimization leads to an enormous performance gain in comparison to standard sequential minimization schemes. Furthermore, the results indicate that in complex multi-dimensional situations, a heavy parallel minimization of the grand potential seems to be mandatory in order to reach a reasonable balance between accuracy and computational cost.
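    The sequential baseline such minimizations start from is typically a damped Picard (fixed-point) iteration on the Euler-Lagrange equation of the grand potential, with every grid point updated independently (which is what makes the GPU parallelization natural). A minimal single-threaded sketch on a hypothetical 1D toy functional (ideal gas plus a local quadratic excess term; all parameters are illustrative, not from the paper):

```python
import numpy as np

# Toy model: ideal gas + local quadratic excess term + harmonic external well.
x = np.linspace(-5.0, 5.0, 201)
beta_vext = 0.5 * x**2          # beta * V_ext(x)
rho_b = 0.1                     # bulk density
a = 1.0                         # strength of the (made-up) local excess term

rho = np.full_like(x, rho_b)    # initial guess
alpha = 0.1                     # mixing (damping) parameter
for _ in range(2000):
    # Euler-Lagrange condition: rho = rho_b * exp(-beta_vext - beta*mu_exc(rho))
    rho_new = rho_b * np.exp(-beta_vext - 2.0 * a * rho)
    if np.max(np.abs(rho_new - rho)) < 1e-12:
        rho = rho_new
        break
    rho = (1.0 - alpha) * rho + alpha * rho_new  # damped Picard step
```

Each update touches every grid point independently, so the step maps directly onto one GPU thread per point; the paper's gain comes from exactly that data parallelism.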

  2. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using the numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray Y-MP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray Y-MP with a single processor. The comparison indicates that the parallel CM-FORTRAN code matches or outperforms the equivalent serial FORTRAN code in some cases.

  3. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.
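    A rough sketch of the d-edge idea in plain Python (the field names and the triangle helper are our own, not the patent's): each record pairs one directed edge with exactly one incident face and carries the vertex data it needs, so a processor holding a subset of records can operate on them without consulting its neighbors.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DEdge:
    """Directed edge: one edge of the solid tied to exactly one incident face."""
    tail: tuple      # (x, y, z) of the edge's start vertex
    head: tuple      # (x, y, z) of the edge's end vertex
    face_id: int     # the single face this directed edge bounds

def d_edges_of_triangle(face_id, v0, v1, v2):
    """Emit the three d-edges that bound one triangular face."""
    return [DEdge(v0, v1, face_id), DEdge(v1, v2, face_id), DEdge(v2, v0, face_id)]

# One face of a tetrahedron; in the patented scheme each uniquely labeled
# processor would generate records like these for its own share of the model.
edges = d_edges_of_triangle(0, (0, 0, 0), (1, 0, 0), (0, 1, 0))
```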

  4. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.

  5. Implementation and Characterization of Three-Dimensional Particle-in-Cell Codes on Multiple-Instruction-Multiple-Data Massively Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.

    1995-01-01

    A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message-passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes"), and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64^3 particles per processing node (approximately 134 million particles on 512 nodes), which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code as a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--on other parallel machines once the machine-dependent parameters are known.
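    The slab/rod/cube trade-off can be illustrated with a back-of-the-envelope surface-to-volume count: cutting along more dimensions shrinks each subdomain's ghost surface, and hence the particle and field communication. A sketch under the assumption that the processor count has an integer dims-th root (this is not the paper's timing model):

```python
import numpy as np

def ghost_surface(N, P, dims):
    """Grid points on subdomain faces exchanged per processor.

    N: grid points per dimension of the N^3 domain; P: processor count;
    dims: how many dimensions the domain is cut along
          (1 = "slabs", 2 = "rods", 3 = "cubes").
    """
    cuts = round(P ** (1.0 / dims))              # processors per cut dimension
    sub = [N // cuts] * dims + [N] * (3 - dims)  # subdomain extents
    # two ghost faces per cut dimension
    return sum(2 * np.prod(sub) // sub[d] for d in range(dims))

N, P = 256, 64
surfaces = {d: ghost_surface(N, P, d) for d in (1, 2, 3)}
```

For this configuration the per-processor ghost surface shrinks monotonically from slabs to rods to cubes, which is the qualitative reason higher-dimensional decompositions scale further.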

  6. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

    NASA Astrophysics Data System (ADS)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
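    The kernel such codes parallelize is an independent random walk per photon, which is why one-thread-per-photon GPU and OpenCL mappings work well. A minimal single-threaded sketch for an infinite homogeneous medium (coefficients and tally are illustrative, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_a, mu_s = 0.1, 10.0            # absorption / scattering coefficients (1/mm)
mu_t = mu_a + mu_s
n_photons = 500
absorbed_depth = np.empty(n_photons)

for p in range(n_photons):
    pos = np.zeros(3)
    while True:
        step = -np.log(1.0 - rng.random()) / mu_t   # exponential free path
        # sample an isotropic scattering direction
        cos_t = 2.0 * rng.random() - 1.0
        phi = 2.0 * np.pi * rng.random()
        sin_t = np.sqrt(1.0 - cos_t**2)
        pos += step * np.array([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t])
        if rng.random() < mu_a / mu_t:              # photon absorbed here
            absorbed_depth[p] = np.linalg.norm(pos)
            break
```

Because the photons never interact, the outer loop is embarrassingly parallel; the load-balancing work in the paper is about distributing these independent walks across heterogeneous devices.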

  7. Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

    PubMed

    Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio

    2014-07-05

    A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines, combined with a volumetric 3D fast Fourier transform (3D-FFT), was developed and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program limits the degree of parallelization because of the limitations of the slab-type 3D-FFT; the volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on a large, fine calculation cell (2048^3 grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent parallel scalability on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of a chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems. Copyright © 2014 Wiley Periodicals, Inc.
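    The slab-type limitation mentioned here is easy to quantify: a slab decomposition assigns whole planes of the n^3 grid to ranks, so at most n ranks can participate, while 2D (pencil) and 3D (volumetric) decompositions raise that bound. An illustrative sketch of the upper bounds only (real FFT libraries impose further constraints):

```python
def max_ranks(n_grid, decomposition):
    """Upper bound on usable ranks for an n^3 FFT grid under each scheme."""
    return {"slab": n_grid,            # one or more whole planes per rank
            "pencil": n_grid**2,       # one or more lines per rank
            "volumetric": n_grid**3}[decomposition]

# For the 2048^3 cell above, the slab limit (2048 ranks) falls far short of
# the 16,384 nodes used, which is what the volumetric 3D-FFT addresses.
```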

  8. Advanced miniature processing hardware for ATR applications

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin (Inventor); Daud, Taher (Inventor); Thakoor, Anikumar (Inventor)

    2003-01-01

    A Hybrid Optoelectronic Neural Object Recognition System (HONORS) is disclosed, comprising two major building blocks: (1) an advanced grayscale optical correlator (OC) and (2) a massively parallel three-dimensional neural processor. The optical correlator, with its inherent advantages in parallel processing and shift invariance, is used for target-of-interest (TOI) detection and segmentation. The three-dimensional neural processor, with its robust neural learning capability, is used for target classification and identification. The hybrid optoelectronic neural object recognition system, with its powerful combination of optical processing and neural networks, enables real-time, large-frame automatic target recognition (ATR).

  9. An efficient three-dimensional Poisson solver for SIMD high-performance-computing architectures

    NASA Technical Reports Server (NTRS)

    Cohl, H.

    1994-01-01

    We present an algorithm that solves the three-dimensional Poisson equation on a cylindrical grid. The technique uses a finite-difference scheme with operator splitting. This splitting maps the banded structure of the operator matrix into a two-dimensional set of tridiagonal matrices, which are then solved in parallel. Our algorithm couples FFT techniques with the well-known ADI (Alternating Direction Implicit) method for solving elliptic PDEs, and the implementation is extremely well suited for a massively parallel environment like the SIMD architecture of the MasPar MP-1. Due to the highly recursive nature of our problem, we believe that our method is highly efficient, as it avoids excessive interprocessor communication.
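    After the FFT/operator-splitting step, the remaining work is many independent tridiagonal systems, one per Fourier mode and grid line, each solvable with the O(n) Thomas algorithm and requiring no interprocessor communication. A single-system sketch on a 1D Poisson test problem (our own example, not the paper's cylindrical grid):

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system (sub a, diag b, super c, rhs d) in O(n)."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                  # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# 1D discrete Poisson test: -u'' = pi^2 sin(pi x), u(0) = u(1) = 0
n, h = 64, 1.0 / 65
a = np.full(n, -1.0); b = np.full(n, 2.0); c = np.full(n, -1.0)
xs = (np.arange(n) + 1) * h
u = thomas(a, b, c, np.pi**2 * np.sin(np.pi * xs) * h**2)
```

In the SIMD setting described above, thousands of such systems are solved simultaneously, one per processing element.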

  10. Aerodynamic simulation on massively parallel systems

    NASA Technical Reports Server (NTRS)

    Haeuser, Jochem; Simon, Horst D.

    1992-01-01

    This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design for advanced commercial transportation as well as space flight. The discussion shows that massively parallel systems are the only alternative that is both cost effective and capable of providing the TeraFlops needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations, which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary-fitted grids and fully unstructured tetrahedral grids. A fully structured grid best represents the flow physics, while an unstructured grid gives the best geometrical flexibility. The multiblock grid employed is structured within a block but completely unstructured at the block level. While a completely unstructured grid is not straightforward to parallelize, the multiblock grid is inherently parallel, in particular for multiple-instruction multiple-data (MIMD) machines. This paper provides guidelines for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system; some preliminary results for an 8K CM2 machine are also mentioned. The code run is the two-dimensional grid generation module of Grid, a general two- and three-dimensional grid generation code for complex geometries, in which a system of nonlinear Poisson equations is solved. This code is also a good test case for complex fluid dynamics codes, since the same data structures are used. All systems provided good speedups, but message-passing MIMD systems seem to be best suited for large multiblock applications.

  11. Parallel Tensor Compression for Large-Scale Scientific Data.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan

    As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed-memory parallel implementation of the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
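    On a single node, the mode-wise computations underlying a Tucker decomposition can be sketched with a truncated higher-order SVD (HOSVD); the paper's contribution is distributing exactly these matricized SVDs and tensor-times-matrix products. A small sketch on a synthetic tensor of exact multilinear rank (2, 2, 2):

```python
import numpy as np

def ttm(X, U, mode):
    """Multiply tensor X by matrix U along the given mode."""
    return np.moveaxis(np.tensordot(U, np.moveaxis(X, mode, 0), axes=1), 0, mode)

def hosvd(X, ranks):
    """Truncated HOSVD: a simple sequential Tucker sketch."""
    factors = []
    for mode, r in enumerate(ranks):
        # matricize along `mode`, keep the top-r left singular vectors
        M = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        factors.append(U[:, :r])
    core = X
    for mode, U in enumerate(factors):
        core = ttm(core, U.T, mode)
    return core, factors

rng = np.random.default_rng(0)
G = rng.standard_normal((2, 2, 2))                 # small core tensor
Us = [np.linalg.qr(rng.standard_normal((8, 2)))[0] for _ in range(3)]
X = G
for mode, U in enumerate(Us):                      # expand to an 8x8x8 tensor
    X = ttm(X, U, mode)

core, facs = hosvd(X, (2, 2, 2))
Xr = core                                          # reconstruct from the factors
for mode, U in enumerate(facs):
    Xr = ttm(Xr, U, mode)
```

For a tensor of exact multilinear rank matching the target ranks, the truncated HOSVD reconstruction is exact; on real data the truncation is where the compression comes from.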

  12. SediFoam: A general-purpose, open-source CFD-DEM solver for particle-laden flow with emphasis on sediment transport

    NASA Astrophysics Data System (ADS)

    Sun, Rui; Xiao, Heng

    2016-04-01

    With the growth of available computational resources, CFD-DEM (computational fluid dynamics-discrete element method) becomes an increasingly promising and feasible approach for the study of sediment transport. Several existing CFD-DEM solvers are applied in the chemical engineering and mining industries. However, a robust CFD-DEM solver for the simulation of sediment transport is still desirable. In this work, the development of a three-dimensional, massively parallel, and open-source CFD-DEM solver, SediFoam, is detailed. This solver is built on the open-source codes OpenFOAM and LAMMPS: OpenFOAM is a CFD toolbox that can perform three-dimensional fluid flow simulations on unstructured meshes; LAMMPS is a massively parallel DEM solver for molecular dynamics. Several validation tests of SediFoam are performed using cases spanning a wide range of complexity. The results obtained in the present simulations are consistent with those in the literature, which demonstrates the capability of SediFoam for sediment transport applications. In addition to the validation tests, the parallel efficiency of SediFoam is studied to assess the performance of the code for large-scale and complex simulations. The parallel efficiency tests show that the scalability of SediFoam is satisfactory in simulations using up to O(10^7) particles.

  13. A Novel Implementation of Massively Parallel Three Dimensional Monte Carlo Radiation Transport

    NASA Astrophysics Data System (ADS)

    Robinson, P. B.; Peterson, J. D. L.

    2005-12-01

    The goal of our summer project was to implement the difference formulation for radiation transport into Cosmos++, a multidimensional, massively parallel magnetohydrodynamics code for astrophysical applications (Peter Anninos - AX). The difference formulation is a new method for Symbolic Implicit Monte Carlo thermal transport (Brooks and Szöke - PAT). Formerly, simultaneous implementation of fully implicit Monte Carlo radiation transport in multiple dimensions on multiple processors had not been convincingly demonstrated. We found that a combination of the difference formulation and the inherent structure of Cosmos++ makes such an implementation both accurate and straightforward. We developed a "nearly nearest neighbor physics" technique to allow each processor to work independently, even with a fully implicit code. This technique, coupled with the increased accuracy of an implicit Monte Carlo solution and the efficiency of parallel computing systems, allows us to demonstrate the possibility of massively parallel thermal transport. This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory, under contract No. W-7405-Eng-48.

  14. Tough2_MP: A parallel version of TOUGH2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Keni; Wu, Yu-Shu; Ding, Chris

    2003-04-09

    TOUGH2_MP is a massively parallel version of TOUGH2. It was developed for running on distributed-memory parallel computers to simulate large problems that may not be solvable by the standard, single-CPU TOUGH2 code. The new code implements an efficient massively parallel scheme while preserving the full capacity and flexibility of the original TOUGH2 code. The new software uses the METIS software package for grid partitioning and the AZTEC software package for linear-equation solving. The standard message-passing interface is adopted for communication among processors. Numerical performance of the current version of the code has been tested on Cray T3E and IBM RS/6000 SP platforms. In addition, the parallel code has been successfully applied to real field problems of multi-million-cell simulations for three-dimensional multiphase and multicomponent fluid and heat flow, as well as solute transport. In this paper, we review the development of TOUGH2_MP and discuss its basic features, modules, and applications.

  15. Three-Dimensional Nanobiocomputing Architectures With Neuronal Hypercells

    DTIC Science & Technology

    2007-06-01

    Neumann architectures, and CMOS fabrication. Novel solutions of massive parallel distributed computing and processing (pipelined due to systolic... and processing platforms utilizing molecular hardware within an enabling organization and architecture. The design technology is based on utilizing a...Microsystems and Nanotechnologies investigated a novel 3D3 (Hardware Software Nanotechnology) technology to design super-high performance computing

  16. Parallel computing of a climate model on the dawn 1000 by domain decomposition method

    NASA Astrophysics Data System (ADS)

    Bi, Xunqiang

    1997-12-01

    In this paper, the parallel computing of a grid-point, nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massively parallel computer made by the National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. Potential ways to increase the speed-up ratio and to exploit the resources of future massively parallel supercomputers are also discussed.
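    A two-dimensional domain decomposition of a grid-point model amounts to mapping each processor rank to a rectangle of longitude and latitude indices. A sketch of the index arithmetic (the helper and the process-grid layout are our own, not the IAP model's):

```python
def tile_bounds(rank, px, py, nlon, nlat):
    """Map a rank to its (lon, lat) index ranges in a px-by-py decomposition."""
    i, j = rank % px, rank // px          # position in the process grid
    lon0, lon1 = i * nlon // px, (i + 1) * nlon // px
    lat0, lat1 = j * nlat // py, (j + 1) * nlat // py
    return (lon0, lon1), (lat0, lat1)

# 8x4 process grid on a 144x72 grid-point mesh: 32 tiles of 18x18 points each
bounds = [tile_bounds(r, 8, 4, 144, 72) for r in range(32)]
```

The integer-division form guarantees the tiles cover the mesh exactly even when the extents do not divide evenly, which keeps the load balance as even as possible.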

  17. Tiger LDRD final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steich, D J; Brugger, S T; Kallman, J S

    2000-02-01

    This final report describes our efforts on the Three-Dimensional Massively Parallel CEM Technologies LDRD project (97-ERD-009). Significant need exists for more advanced time-domain computational electromagnetics modeling. Bookkeeping details and modifying inflexible software constitute the vast majority of the effort required to address such needs, and the required effort escalates rapidly as problem complexity increases--for example, hybrid meshes requiring hybrid numerics on massively parallel platforms (MPPs). This project attempts to alleviate the above limitations by investigating flexible abstractions for these numerical algorithms on MPPs using object-oriented methods, providing a programming environment that insulates the physics from the bookkeeping. The three major design iterations during the project, known as TIGER-I to TIGER-III, are discussed. Each version of TIGER is briefly described along with lessons learned during development and implementation. An Application Programming Interface (API) of the object-oriented interface for TIGER-III is included in three appendices, which contain the Utilities, Entity-Attribute, and Mesh libraries developed during the project. The API libraries represent a snapshot of our latest attempt at insulating the physics from the bookkeeping.

  18. Simulation of dilute polymeric fluids in a three-dimensional contraction using a multiscale FENE model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griebel, M., E-mail: griebel@ins.uni-bonn.de, E-mail: ruettgers@ins.uni-bonn.de; Rüttgers, A., E-mail: griebel@ins.uni-bonn.de, E-mail: ruettgers@ins.uni-bonn.de

    The multiscale FENE model is applied to a 3D square-square contraction flow problem. For this purpose, the stochastic Brownian configuration field (BCF) method has been coupled with our fully parallelized three-dimensional Navier-Stokes solver NaSt3DGPF. The robustness of the BCF method enables the numerical simulation of high-Deborah-number flows for which most macroscopic methods suffer from stability issues. The results of our simulations are compared with experimental measurements from the literature and show very good agreement. In particular, flow phenomena such as strong vortex enhancement, streamline divergence, and a flow inversion for highly elastic flows are reproduced. Due to their computational complexity, our simulations require massively parallel computations. Using a domain decomposition approach with MPI, the implementation achieves excellent scale-up results for up to 128 processors.

  19. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    NASA Technical Reports Server (NTRS)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of scalable high performance computing covers three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of high-performance time-domain computational electromagnetics covers five objectives: enhance an electromagnetics code (CHARGE) to effectively model antenna problems; apply lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics work; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  20. dfnWorks: A discrete fracture network framework for modeling subsurface flow and transport

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hyman, Jeffrey D.; Karra, Satish; Makedonska, Nataliia

    DFNWORKS is a parallelized computational suite to generate three-dimensional discrete fracture networks (DFN) and simulate flow and transport. Developed at Los Alamos National Laboratory over the past five years, it has been used to study flow and transport in fractured media at scales ranging from millimeters to kilometers. The networks are created and meshed using DFNGEN, which combines FRAM (the feature rejection algorithm for meshing) methodology to stochastically generate three-dimensional DFNs with the LaGriT meshing toolbox to create a high-quality computational mesh representation. The representation produces a conforming Delaunay triangulation suitable for high performance computing finite volume solvers in an intrinsically parallel fashion. Flow through the network is simulated in dfnFlow, which utilizes the massively parallel subsurface flow and reactive transport finite volume code PFLOTRAN. A Lagrangian approach to simulating transport through the DFN is adopted within DFNTRANS to determine pathlines and solute transport through the DFN. Example applications of this suite in the areas of nuclear waste repository science, hydraulic fracturing and CO2 sequestration are also included.

  1. dfnWorks: A discrete fracture network framework for modeling subsurface flow and transport

    DOE PAGES

    Hyman, Jeffrey D.; Karra, Satish; Makedonska, Nataliia; ...

    2015-11-01

    DFNWORKS is a parallelized computational suite to generate three-dimensional discrete fracture networks (DFN) and simulate flow and transport. Developed at Los Alamos National Laboratory over the past five years, it has been used to study flow and transport in fractured media at scales ranging from millimeters to kilometers. The networks are created and meshed using DFNGEN, which combines FRAM (the feature rejection algorithm for meshing) methodology to stochastically generate three-dimensional DFNs with the LaGriT meshing toolbox to create a high-quality computational mesh representation. The representation produces a conforming Delaunay triangulation suitable for high performance computing finite volume solvers in an intrinsically parallel fashion. Flow through the network is simulated in dfnFlow, which utilizes the massively parallel subsurface flow and reactive transport finite volume code PFLOTRAN. A Lagrangian approach to simulating transport through the DFN is adopted within DFNTRANS to determine pathlines and solute transport through the DFN. Example applications of this suite in the areas of nuclear waste repository science, hydraulic fracturing and CO2 sequestration are also included.

  2. High performance computing applications in neurobiological research

    NASA Technical Reports Server (NTRS)

    Ross, Muriel D.; Cheng, Rei; Doshay, David G.; Linton, Samuel W.; Montgomery, Kevin; Parnas, Bruce R.

    1994-01-01

    The human nervous system is a massively parallel processor of information. The vast numbers of neurons, synapses and circuits are daunting to those seeking to understand the neural basis of consciousness and intellect. Pervasive obstacles are the lack of knowledge of the detailed, three-dimensional (3-D) organization of even a simple neural system and the paucity of large-scale, biologically relevant computer simulations. We use high performance graphics workstations and supercomputers to study the 3-D organization of gravity sensors as a prototype architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation, and scaled-up, three-dimensional versions run on the Cray Y-MP and CM5 supercomputers.

  3. Proxy-equation paradigm: A strategy for massively parallel asynchronous computations

    NASA Astrophysics Data System (ADS)

    Mittal, Ankita; Girimaji, Sharath

    2017-09-01

    Massively parallel simulations of transport equation systems call for a paradigm change in algorithm development to achieve efficient scalability. Traditional approaches require time synchronization of processing elements (PEs), which severely restricts scalability. Relaxing the synchronization requirement introduces error and slows down convergence. In this paper, we propose and develop a novel "proxy equation" concept for a general transport equation that (i) tolerates asynchrony with minimal added error, (ii) preserves convergence order, and thus (iii) is expected to scale efficiently on massively parallel machines. The central idea is to modify a priori the transport equation at the PE boundaries to offset asynchrony errors. Proof-of-concept computations are performed using a one-dimensional advection (convection) diffusion equation. The results demonstrate the promise and advantages of the present strategy.
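    The proof-of-concept model problem named above, 1D advection-diffusion, can be written down in a few lines; the proxy-equation idea then modifies the discrete coefficients near PE boundaries so that stale (asynchronous) ghost values induce less error. A minimal synchronous reference version (parameters are our own; the proxy modification itself is not reproduced):

```python
import numpy as np

# u_t + c u_x = nu u_xx on a periodic domain, explicit central differences.
c, nu = 1.0, 0.01
n, L = 200, 1.0
dx = L / n
dt = 0.2 * dx**2 / nu            # stable for both terms at these parameters
x = np.arange(n) * dx
u = np.sin(2 * np.pi * x)

for _ in range(500):
    up = np.roll(u, -1)          # u[i+1]; periodic wrap plays the ghost exchange
    um = np.roll(u, 1)           # u[i-1]
    u = (u
         - dt * c * (up - um) / (2 * dx)           # advection term
         + dt * nu * (up - 2 * u + um) / dx**2)    # diffusion term
```

In a distributed run, `np.roll` becomes a ghost-cell exchange between PEs; the synchronization cost of that exchange at every step is exactly what the proxy-equation strategy relaxes.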

  4. Steady and unsteady three-dimensional transonic flow computations by integral equation method

    NASA Technical Reports Server (NTRS)

    Hu, Hong

    1994-01-01

    This is the final technical report of the research performed under grant NAG1-1170 from the National Aeronautics and Space Administration. The report consists of three parts. The first part presents the work on unsteady flows around a zero-thickness wing. The second presents the work on steady flows around non-zero-thickness wings. The third presents the massively parallel processing implementation and a performance analysis of the integral equation computations. Publications resulting from this grant are listed and attached at the end of the report.

  5. SIAM Conference on Parallel Processing for Scientific Computing, 4th, Chicago, IL, Dec. 11-13, 1989, Proceedings

    NASA Technical Reports Server (NTRS)

    Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)

    1990-01-01

    Attention is given to such topics as an evaluation of block algorithm variants in LAPACK, a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data-parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercubes vs. two-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.

  6. An efficient 3-dim FFT for plane wave electronic structure calculations on massively parallel machines composed of multiprocessor nodes

    NASA Astrophysics Data System (ADS)

    Goedecker, Stefan; Boulet, Mireille; Deutsch, Thierry

    2003-08-01

    Three-dimensional Fast Fourier Transforms (FFTs) are the main computational task in plane wave electronic structure calculations. Obtaining high performance on large numbers of processors is non-trivial on the latest generation of parallel computers, whose nodes are made up of shared-memory multiprocessors. A non-dogmatic method for obtaining high performance for such 3-dim FFTs in a combined MPI/OpenMP programming paradigm is presented. Exploiting the peculiarities of plane wave electronic structure calculations, speedups of up to 160 and speeds of up to 130 Gflops were obtained on 256 processors.

  7. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    NASA Astrophysics Data System (ADS)

    Newman, Gregory A.; Commer, Michael

    2009-07-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover, when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies that exploit computational parallelism and optimal meshing are required. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24-hour period we were able to image a large-scale field data set that had previously required over four months of processing time on distributed Intel- or AMD-based clusters utilizing 1,024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and is consistent with the timeframes required for practical exploration problems.

  8. Numerical Prediction of Non-Reacting and Reacting Flow in a Model Gas Turbine Combustor

    NASA Technical Reports Server (NTRS)

    Davoudzadeh, Farhad; Liu, Nan-Suey

    2005-01-01

    The three-dimensional, viscous, turbulent, reacting and non-reacting flow characteristics of a model gas turbine combustor operating on air/methane are simulated via an unstructured and massively parallel Reynolds-Averaged Navier-Stokes (RANS) code. This serves to demonstrate the capabilities of the code for design and analysis of real combustor engines. The effects of some design features of combustors are examined. In addition, the computed results are validated against experimental data.

  9. The design and implementation of a parallel unstructured Euler solver using software primitives

    NASA Technical Reports Server (NTRS)

    Das, R.; Mavriplis, D. J.; Saltz, J.; Gupta, S.; Ponnusamy, R.

    1992-01-01

    This paper is concerned with the implementation of a three-dimensional unstructured grid Euler solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel communication rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effect of the various optimizations is demonstrated, and we show that their combined effect leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY-YMP vector supercomputer.
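An edge-based data structure, as mentioned in the abstract above, lets a solver accumulate fluxes in a single pass over the edges. The following is a hypothetical minimal sketch (not the authors' code; the function name, edge tuple layout, and the scalar model flux are illustrative assumptions) showing why the approach conserves the transported quantity: each edge adds equal-and-opposite contributions to its two endpoint nodes.

```python
# Hypothetical sketch of an edge-based residual loop for an unstructured
# solver: each edge stores its two endpoint nodes and a face-normal weight;
# one pass over the edges accumulates equal-and-opposite flux contributions
# into the nodal residuals.

def edge_residuals(num_nodes, edges, u):
    """edges: list of (i, j, weight); u: nodal values of a scalar model equation."""
    res = [0.0] * num_nodes
    for i, j, w in edges:
        # Central flux across the edge (scalar stand-in for the Euler fluxes).
        flux = 0.5 * w * (u[i] + u[j])
        res[i] -= flux   # leaves node i
        res[j] += flux   # enters node j
    return res

if __name__ == "__main__":
    edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]
    r = edge_residuals(3, edges, [1.0, 2.0, 3.0])
    # Equal-and-opposite accumulation conserves the total: residuals sum to zero.
    print(abs(sum(r)) < 1e-12)  # True
```

On a distributed-memory machine, edges cut by the partition boundary are the ones that force communication, which is why the optimizations described above target the edge loop.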

  10. Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue

    NASA Astrophysics Data System (ADS)

    Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.

    2017-11-01

    The flow driven by a rotating impeller inside an open, fixed cylindrical cavity is simulated using code Blue, a solver for massively parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination, all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow, for both single- and two-phase flows. We use a moving-frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed to handle complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).

  11. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large-scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  12. Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.

    1998-01-01

    A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.

  13. Toward three-dimensional microelectronic systems: directed self-assembly of silicon microcubes via DNA surface functionalization.

    PubMed

    Lämmerhardt, Nico; Merzsch, Stephan; Ledig, Johannes; Bora, Achyut; Waag, Andreas; Tornow, Marc; Mischnick, Petra

    2013-07-02

    The huge and intelligent processing power of three-dimensional (3D) biological "processors" such as the human brain, with clock speeds of only 0.1 kHz, is an extremely fascinating property, which is based on a massively parallel interconnect strategy. Artificial silicon microprocessors are 7 orders of magnitude faster. Nevertheless, they do not show any indication of intelligent processing power, mostly due to their very limited interconnectivity. Massively parallel interconnectivity can only be realized in three dimensions. Three-dimensional artificial processors would therefore be at the root of fabricating artificially intelligent systems. A first step in this direction would be the self-assembly of silicon-based building blocks into 3D structures. We report on the self-assembly of such building blocks by molecular recognition, and on the electrical characterization of the formed assemblies. First, planar silicon substrates were functionalized with self-assembling monolayers of 3-aminopropyltrimethoxysilane for coupling of oligonucleotides (single-stranded DNA) with glutaric aldehyde. The oligonucleotide immobilization was confirmed and quantified by hybridization with fluorescence-labeled complementary oligonucleotides. After the individual processing steps, the samples were analyzed by contact angle measurements, ellipsometry, atomic force microscopy, and fluorescence microscopy. Patterned DNA-functionalized layers were fabricated by microcontact printing (μCP) and photolithography. Silicon microcubes of 3 μm edge length, as model objects for first 3D self-assembly experiments, were fabricated out of silicon-on-insulator (SOI) wafers by a combination of reactive ion etching (RIE) and selective wet etching. The microcubes were then surface-functionalized using the same protocol as on planar substrates, and their self-assembly was demonstrated both on patterned silicon surfaces (88% correctly placed cubes) and into cube aggregates via complementary DNA functionalization and hybridization. The yield of formed aggregates was found to be about 44%, with a relative fraction of dimers of some 30%. Finally, the electrical properties of the formed dimers were characterized using probe tips inside a scanning electron microscope.

  14. Implementation of a Message Passing Interface into a Cloud-Resolving Model for Massively Parallel Computing

    NASA Technical Reports Server (NTRS)

    Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve

    2004-01-01

    The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for the MPP with MPI maintains a similar code structure between the whole domain and the portions after decomposition; hence the model follows the same integration for single and multiple tasks (CPUs). It also requires minimal changes to the original code, so it is easily modified and/or managed by model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime overlaid on the partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions achieve parallel efficiencies of about 99% for up to 256 tasks, but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computation than the compressible version; 3) equal or approximately equal numbers of slices in the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation during computation.
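The halo-regime idea described above can be illustrated with a toy one-dimensional decomposition. This is a hedged serial sketch, not the GCE model's code: the function names are invented, Python lists stand in for task-local arrays, and direct list assignment stands in for the MPI exchange (e.g. an MPI_Sendrecv pair in a real implementation).

```python
# Hedged sketch of a 1-D domain decomposition with a one-cell halo.
# Each "task" holds its slice plus ghost cells; update_halos plays the
# role of the MPI exchange that must complete before the next
# computational stage, after which stencil_step touches only local data.

def decompose(field, ntasks, halo=1):
    n = len(field) // ntasks
    return [[0.0] * halo + field[t * n:(t + 1) * n] + [0.0] * halo
            for t in range(ntasks)]

def update_halos(subdomains):
    # Neighbour-to-neighbour exchange (MPI_Sendrecv in a real code).
    for t in range(len(subdomains) - 1):
        left, right = subdomains[t], subdomains[t + 1]
        right[0] = left[-2]    # my last interior cell -> neighbour's ghost
        left[-1] = right[1]    # neighbour's first interior cell -> my ghost

def stencil_step(sub):
    # Purely local 3-point average: no off-task data fetched here.
    return [(sub[i - 1] + sub[i] + sub[i + 1]) / 3.0
            for i in range(1, len(sub) - 1)]

if __name__ == "__main__":
    subs = decompose([float(i) for i in range(8)], 2)
    update_halos(subs)
    print(subs[0][-1], subs[1][0])  # ghosts now mirror the neighbour: 4.0 3.0
```

The trade-off reported in results 3) and 4) above corresponds to the number and size of such exchanges changing with the slicing direction.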

  15. Visualization of unsteady computational fluid dynamics

    NASA Astrophysics Data System (ADS)

    Haimes, Robert

    1994-11-01

    A brief summary is presented of the computer environment used for calculating three-dimensional unsteady Computational Fluid Dynamics (CFD) results. This environment requires a supercomputer as well as massively parallel processors (MPPs); clusters of workstations acting as a single MPP (by concurrently working on the same task) also provide the required computational bandwidth for CFD calculations of transient problems. The cluster of reduced instruction set computer (RISC) workstations is a recent development made possible by the low cost and high performance that workstation vendors provide. The cluster, with the proper software, can act as a multiple instruction/multiple data (MIMD) machine. A new set of software tools is being designed specifically to address visualizing 3D unsteady CFD results in these environments. Three user's manuals for the parallel version of Visual3, pV3, revision 1.00 make up the bulk of this report.

  16. Visualization of unsteady computational fluid dynamics

    NASA Technical Reports Server (NTRS)

    Haimes, Robert

    1994-01-01

    A brief summary is presented of the computer environment used for calculating three-dimensional unsteady Computational Fluid Dynamics (CFD) results. This environment requires a supercomputer as well as massively parallel processors (MPPs); clusters of workstations acting as a single MPP (by concurrently working on the same task) also provide the required computational bandwidth for CFD calculations of transient problems. The cluster of reduced instruction set computer (RISC) workstations is a recent development made possible by the low cost and high performance that workstation vendors provide. The cluster, with the proper software, can act as a multiple instruction/multiple data (MIMD) machine. A new set of software tools is being designed specifically to address visualizing 3D unsteady CFD results in these environments. Three user's manuals for the parallel version of Visual3, pV3, revision 1.00 make up the bulk of this report.

  17. A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS

    PubMed Central

    Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.

    2014-01-01

    Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
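The active-list idea behind fast iterative methods can be shown on a much smaller problem than the paper's. The sketch below is an illustrative assumption on my part, not the authors' algorithm: it solves |∇u| = 1 on a regular 2-D grid with a first-order Godunov upwind update (the paper targets unstructured tetrahedral meshes, anisotropic speeds, and SIMD hardware), keeping only the "re-solve nodes whose upwind data changed" structure.

```python
# FIM-style active-list solver for the eikonal equation |grad u| = 1
# on a regular 2-D grid with unit speed and spacing h (toy sketch).
import math

def eikonal_fim(nx, ny, source, h=1.0, eps=1e-9):
    INF = float("inf")
    u = {(i, j): INF for i in range(nx) for j in range(ny)}
    u[source] = 0.0

    def nbrs(i, j):
        return [(p, q) for p, q in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                if 0 <= p < nx and 0 <= q < ny]

    def local_update(i, j):
        # Godunov upwind update from the smallest neighbour in each axis.
        a = min(u[n] for n in nbrs(i, j) if n[1] == j)  # x-direction
        b = min(u[n] for n in nbrs(i, j) if n[0] == i)  # y-direction
        a, b = min(a, b), max(a, b)
        if b - a >= h:
            return a + h
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    # Active list: nodes whose upwind data changed and must be re-solved.
    active = set(nbrs(*source))
    while active:
        changed = set()
        for node in active:
            new = local_update(*node)
            if u[node] - new > eps:   # value improved -> wake the neighbours
                u[node] = new
                changed.update(nbrs(*node))
        active = changed - {source}
    return u
```

In the parallel setting described above, all nodes on the active list are updated simultaneously by SIMD lanes, which is why the method tolerates asynchronous, non-optimal update orders.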

  18. 50 GFlops molecular dynamics on the Connection Machine 5

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lomdahl, P.S.; Tamayo, P.; Groenbech-Jensen, N.

    1993-12-31

    The authors present timings and performance numbers for a new short-range three-dimensional (3D) molecular dynamics (MD) code, SPaSM, on the Connection Machine-5 (CM-5). They demonstrate that runs with more than 10^8 particles are now possible on massively parallel MIMD computers. To the best of their knowledge this is at least an order of magnitude more particles than what has previously been reported. Typical production runs show sustained performance (including communication) in the range of 47-50 GFlops on a 1024-node CM-5 with vector units (VUs). The speed of the code scales linearly with the number of processors and with the number of particles, and shows 95% parallel efficiency in the speedup.

  19. Planned development of a 3D computer based on free-space optical interconnects

    NASA Astrophysics Data System (ADS)

    Neff, John A.; Guarino, David R.

    1994-05-01

    Free-space optical interconnection has the potential to provide upwards of a million data channels between planes of electronic circuits. This may result in the planar board and backplane structures of today giving way to 3-D stacks of wafers or multi-chip modules interconnected via channels running perpendicular to the processor planes, thereby eliminating much of the packaging overhead. Three-dimensional packaging is very appealing for tightly coupled fine-grained parallel computing, where the need for massive numbers of interconnections is severely taxing the capabilities of planar structures. This paper describes a coordinated effort by four research organizations to demonstrate an operational fine-grained parallel computer that achieves global connectivity through the use of free-space optical interconnects.

  20. Efficient implementation of parallel three-dimensional FFT on clusters of PCs

    NASA Astrophysics Data System (ADS)

    Takahashi, Daisuke

    2003-05-01

    In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm on clusters of PCs. The three-dimensional FFT algorithm can be altered into a block three-dimensional FFT algorithm to reduce the number of cache misses. We show that the block three-dimensional FFT algorithm improves performance by utilizing the cache memory effectively. We use the block three-dimensional FFT algorithm to implement the parallel three-dimensional FFT algorithm. We succeeded in obtaining performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
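The row-column structure that the block algorithm above exploits can be sketched in miniature. The code below is an illustrative assumption, not Takahashi's implementation: it decomposes a 3-D FFT into 1-D FFTs along each axis; the pencil loops are exactly the places where a real block algorithm would process a cache-sized group of pencils at a time to cut cache misses.

```python
# Toy 3-D FFT via successive 1-D FFTs (radix-2, n a power of two).
import cmath

def fft1d(a):
    n = len(a)
    if n == 1:
        return a
    even, odd = fft1d(a[0::2]), fft1d(a[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [even[k] + tw[k] for k in range(n // 2)] + \
           [even[k] - tw[k] for k in range(n // 2)]

def fft3d(cube):
    n = len(cube)  # n x n x n array as nested lists
    for j in range(n):            # x-axis pencils (strided access:
        for k in range(n):        # the cache-unfriendly direction)
            pencil = fft1d([cube[i][j][k] for i in range(n)])
            for i in range(n):
                cube[i][j][k] = pencil[i]
    for i in range(n):            # y-axis pencils
        for k in range(n):
            pencil = fft1d([cube[i][j][k] for j in range(n)])
            for j in range(n):
                cube[i][j][k] = pencil[j]
    for i in range(n):            # z-axis pencils (contiguous in memory)
        for j in range(n):
            cube[i][j] = fft1d(cube[i][j])
    return cube
```

A blocked version would gather several x- or y-pencils into a contiguous buffer before transforming them, so each cache line fetched is fully used; on a cluster the same pencil grouping determines the all-to-all communication pattern.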

  1. Three-dimensional wideband electromagnetic modeling on massively parallel computers

    NASA Astrophysics Data System (ADS)

    Alumbaugh, David L.; Newman, Gregory A.; Prevost, Lydie; Shadid, John N.

    1996-01-01

    A method is presented for modeling the wideband, frequency-domain electromagnetic (EM) response of a three-dimensional (3-D) earth to dipole sources operating at frequencies where EM diffusion dominates the response (less than 100 kHz) up into the range where propagation dominates (greater than 10 MHz). The scheme employs the modified form of the vector Helmholtz equation for the scattered electric fields to model variations in electrical conductivity, dielectric permittivity, and magnetic permeability. The use of the modified form of the Helmholtz equation allows perfectly matched layer (PML) absorbing boundary conditions to be employed through the use of complex grid stretching. Applying the finite-difference operator to the modified Helmholtz equation produces a linear system of equations whose matrix is sparse and complex symmetric. The solution is obtained using either the biconjugate gradient (BICG) or quasi-minimal residual (QMR) method with preconditioning; in general we employ the QMR method with Jacobi scaling preconditioning because of its stability. In order to simulate larger, more realistic models than has previously been possible, the scheme has been modified to run on massively parallel (MP) computer architectures. Execution on the 1840-processor Intel Paragon has indicated a maximum model size of 280 × 260 × 200 cells with a maximum flop rate of 14.7 Gflops. Three different geologic models are simulated to demonstrate the use of the code for frequencies ranging from 100 Hz to 30 MHz and for different source types and polarizations. The simulations show that the scheme correctly models the air-earth interface and the jump in the electric and magnetic fields normal to discontinuities. For frequencies greater than 10 MHz, complex grid stretching must be employed to incorporate absorbing boundaries, while below this frequency normal (real) grid stretching can be employed.

  2. Three-dimensional massive gravity and the bigravity black hole

    NASA Astrophysics Data System (ADS)

    Bañados, Máximo; Theisen, Stefan

    2009-11-01

    We study three-dimensional massive gravity formulated as a theory with two dynamical metrics, like the f-g theories of Isham, Salam, and Strathdee. The action is parity preserving and has no higher-derivative terms. The spectrum contains a single massive graviton. This theory has several features discussed recently in the contexts of TMG and NMG. We find warped black holes, a critical point, and generalized Brown-Henneaux boundary conditions.

  3. DSMC Studies of the Richtmyer-Meshkov Instability

    NASA Astrophysics Data System (ADS)

    Gallis, M. A.; Koehler, T. P.; Torczynski, J. R.

    2014-11-01

    A new exascale-capable Direct Simulation Monte Carlo (DSMC) code, SPARTA, developed to be highly efficient on massively parallel computers, has extended the applicability of DSMC to challenging, transient three-dimensional problems in the continuum regime. Because DSMC inherently accounts for compressibility, viscosity, and diffusivity, it has the potential to improve the understanding of the mechanisms responsible for hydrodynamic instabilities. Here, the Richtmyer-Meshkov instability at the interface between two gases was studied parametrically using SPARTA. Simulations performed on Sequoia, an IBM Blue Gene/Q supercomputer at Lawrence Livermore National Laboratory, are used to investigate various Atwood numbers (0.33-0.94) and Mach numbers (1.2-12.0) for two-dimensional and three-dimensional perturbations. Comparisons with theoretical predictions demonstrate that DSMC accurately predicts the early-time growth of the instability. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

  4. Massively parallel multicanonical simulations

    NASA Astrophysics Data System (ADS)

    Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

    2018-03-01

    Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach, applied to the paradigmatic example of the two-dimensional Ising model, as a starting point and reference for practitioners in the field.
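The "independent walkers, merged histogram" scheme described above can be caricatured on a toy state space. This is a hedged sketch under strong simplifying assumptions (serial loop standing in for GPU threads, a trivial flat landscape, invented function names), not the authors' published code: walkers sample with a shared weight array W(E), their histograms are merged, and the merged histogram drives the multicanonical update W(E) ← W(E) − ln H(E).

```python
# Toy parallel-multicanonical iteration: independent seeded walkers,
# merged histogram, one weight update.
import math
import random

def mc_walker(weights, nsteps, rng, nstates):
    e = rng.randrange(nstates)           # state index doubles as "energy" bin
    hist = [0] * nstates
    for _ in range(nsteps):
        trial = (e + rng.choice((-1, 1))) % nstates
        # Metropolis acceptance in the multicanonical ensemble.
        if rng.random() < math.exp(min(0.0, weights[trial] - weights[e])):
            e = trial
        hist[e] += 1
    return hist

def parallel_muca_iteration(weights, nwalkers, nsteps, seed=0):
    nstates = len(weights)
    merged = [0] * nstates
    for w in range(nwalkers):            # independent: one walker per thread
        h = mc_walker(weights, nsteps, random.Random(seed + w), nstates)
        merged = [m + x for m, x in zip(merged, h)]
    # Weight update from the merged histogram (unvisited bins unchanged).
    return [wt - math.log(h) if h > 0 else wt
            for wt, h in zip(weights, merged)]
```

The point of the merge is statistical: 10^4 short, independent runs contribute to one histogram, so the weights converge with far less wall-clock time per iteration than a single long chain.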

  5. Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA

    NASA Astrophysics Data System (ADS)

    Bazow, Dennis; Heinz, Ulrich; Strickland, Michael

    2018-04-01

    Relativistic fluid dynamics is a major component in dynamical simulations of the quark-gluon plasma created in relativistic heavy-ion collisions. Simulations of the full three-dimensional dissipative dynamics of the quark-gluon plasma with fluctuating initial conditions are computationally expensive and typically require some degree of parallelization. In this paper, we present a GPU implementation of the Kurganov-Tadmor algorithm which solves the 3 + 1d relativistic viscous hydrodynamics equations including the effects of both bulk and shear viscosities. We demonstrate that the resulting CUDA-based GPU code is approximately two orders of magnitude faster than the corresponding serial implementation of the Kurganov-Tadmor algorithm. We validate the code using (semi-)analytic tests such as the relativistic shock-tube and Gubser flow.
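A heavily simplified illustration of the central-scheme building block named above: the sketch below is my own one-dimensional, first-order stand-in for Burgers' equation (the paper's code solves the full 3+1d viscous hydrodynamics equations with higher-order reconstruction); the function name and periodic setup are assumptions for illustration.

```python
# First-order central-upwind step for u_t + (u^2/2)_x = 0 on a
# periodic 1-D grid: the kind of interface-flux evaluation that maps
# naturally onto one GPU thread per cell.

def kt_step(u, dt, dx):
    n = len(u)
    f = [0.5 * v * v for v in u]
    def flux(j):
        # Local wave speed and numerical flux at interface j+1/2.
        jp = (j + 1) % n                     # periodic wrap-around
        a = max(abs(u[j]), abs(u[jp]))
        return 0.5 * (f[j] + f[jp]) - 0.5 * a * (u[jp] - u[j])
    return [u[j] - dt / dx * (flux(j) - flux(j - 1)) for j in range(n)]
```

Because each cell update reads only its two interface fluxes, the loop is embarrassingly parallel, which is the structural reason the GPU port reported above gains roughly two orders of magnitude over the serial code.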

  6. The AIS-5000 parallel processor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schmitt, L.A.; Wilson, S.S.

    1988-05-01

    The AIS-5000 is a commercially available massively parallel processor which has been designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. This architecture has superior cost/performance characteristics compared with two-dimensional mesh-connected systems. The design of the processing elements and their interconnections, as well as the software used to program the system, allows a wide variety of algorithms and applications to be implemented. In this paper, the overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O pathways, and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is presented. This model is supported by the AIS-5000 hardware and software and allows the system to be treated as a full-image-size, two-dimensional, mesh-connected parallel processor. Performance benchmarks are given for certain simple and complex functions.

  7. Estimating water flow through a hillslope using the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Devaney, Judy E.; Camillo, P. J.; Gurney, R. J.

    1988-01-01

    A new two-dimensional model of water flow in a hillslope has been implemented on the Massively Parallel Processor at the Goddard Space Flight Center. Flow in the soil both in the saturated and unsaturated zones, evaporation and overland flow are all modelled, and the rainfall rates are allowed to vary spatially. Previous models of this type had always been very limited computationally. This model takes less than a minute to model all the components of the hillslope water flow for a day. The model can now be used in sensitivity studies to specify which measurements should be taken and how accurate they should be to describe such flows for environmental studies.

  8. Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

    1987-01-01

    The capability to rapidly produce visual representations of large, complex, multi-dimensional space and earth sciences data sets was developed via the implementation of computer graphics modeling techniques on the Massively Parallel Processor (MPP), employing techniques recently developed for typically non-scientific applications. Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with such capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS), a data-independent environment for computer graphics data display, to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.

  9. Massively Parallel Simulations of Diffusion in Dense Polymeric Structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faulon, Jean-Loup; Wilcox, R.T.

    1997-11-01

    An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.

  10. Programming a hillslope water movement model on the MPP

    NASA Technical Reports Server (NTRS)

    Devaney, J. E.; Irving, A. R.; Camillo, P. J.; Gurney, R. J.

    1987-01-01

    A physically based numerical model was developed of heat and moisture flow within a hillslope on a parallel-architecture computer, as a precursor to a model of a complete catchment. Moisture flow within a catchment includes evaporation, overland flow, flow in unsaturated soil, and flow in saturated soil. Because of the empirical evidence that moisture flow in unsaturated soil is mainly in the vertical direction, flow in the unsaturated zone can be modeled as a series of one-dimensional columns. This initial version of the hillslope model includes evaporation and a single column of one-dimensional unsaturated zone flow. This case has already been solved on an IBM 3081 computer and is now being ported to the Massively Parallel Processor architecture, both to make the extension to the two-dimensional case easier and to assess the problems and benefits of using a parallel-architecture machine.

  11. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

    DOE PAGES

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

    2016-09-18

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  12. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  13. Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

    PubMed Central

    Cazzaniga, Paolo; Nobile, Marco S.; Besozzi, Daniela; Bellini, Matteo; Mauri, Giancarlo

    2014-01-01

    The introduction of general-purpose Graphics Processing Units (GPUs) is boosting scientific applications in Bioinformatics, Systems Biology, and Computational Biology. In these fields, the use of high-performance computing is motivated by the need to perform large numbers of in silico analyses of the behavior of biological systems under different conditions, which demands computing power that usually exceeds the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results of a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analyses, and show that GPU-accelerated parallel simulations of this model can achieve up to a 181× speedup over the corresponding sequential simulations. PMID:25025072
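
    The parameter sweep analysis (PSA) pattern that coagSODA parallelizes can be sketched with a toy one-parameter model. The decay reaction, the RK4 integrator, and the parameter grid below are illustrative assumptions standing in for the BCC model and LSODA; only the sweep structure matches the paper.

```python
import math

# Each swept parameter value defines a fully independent ODE integration --
# the embarrassingly parallel structure a GPU maps onto one thread per value.
# Toy model: dy/dt = -k*y, swept over k (the real tool sweeps BCC parameters).
def rk4(f, y0, t0, t1, n=1000):
    """Classical fourth-order Runge-Kutta integration of y' = f(t, y)."""
    h, y, t = (t1 - t0) / n, y0, t0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h/2, y + h/2 * k1)
        k3 = f(t + h/2, y + h/2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2*k2 + 2*k3 + k4)
        t += h
    return y

ks = [0.1 * i for i in range(1, 11)]              # one-dimensional sweep grid
finals = [rk4(lambda t, y, k=k: -k * y, 1.0, 0.0, 1.0) for k in ks]
# finals[i] approximates exp(-ks[i]) for this linear model
```

    The list comprehension over `ks` is the serial stand-in for the GPU kernel launch: every entry can be computed by a different thread with no communication.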

  14. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.

    1999-08-24

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.
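
    A toy hop-count model illustrates why the wavelength-separated express channel removes the bucket-brigade effect. The station stride of 8 and the node numbering are hypothetical, not taken from the patent.

```python
# Local traffic hops node to node; express traffic on its own wavelength
# reflects past intermediate nodes and stops only at express stations
# (assumed here to sit at every 8th node of a linear chain).
def local_hops(src, dst):
    return abs(dst - src)

def express_hops(src, dst, stride=8):
    a, b = min(src, dst), max(src, dst)
    first = -(-a // stride) * stride        # next express station at or after a
    last = (b // stride) * stride           # last express station at or before b
    if first >= last:
        return b - a                        # too close: stay on the local channel
    # local to the first station, express between stations, local to the target
    return (first - a) + (last - first) // stride + (b - last)

print(local_hops(0, 63), express_hops(0, 63))   # 63 vs. 14 hops
```

    The express carrier's "reflecting off of selective mirrors" is what collapses each run of `stride` local hops into a single express hop in this model.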

  15. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.

    1999-01-01

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.

  16. Massive parallelization of serial inference algorithms for a complex generalized linear model

    PubMed Central

    Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David

    2014-01-01

    Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units--relatively inexpensive, highly parallel computing devices--can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics, and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
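
    The cyclic coordinate descent pattern at the heart of the paper can be sketched on a small ridge-penalized least-squares problem. The data, penalty, and sweep count below are illustrative; the authors' actual target is a far larger Bayesian conditioned GLM, but the cycle-one-coordinate-at-a-time structure is the same.

```python
# Cyclic coordinate descent: repeatedly minimize the penalized least-squares
# objective ||y - X*beta||^2 + lam*||beta||^2 over one coordinate at a time.
def ccd_ridge(X, y, lam=0.1, sweeps=200):
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):                      # cycle through coordinates
            resid = [yi - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                     for i, yi in enumerate(y)]
            num = sum(X[i][j] * resid[i] for i in range(len(y)))
            den = sum(X[i][j] ** 2 for i in range(len(y))) + lam
            beta[j] = num / den                 # exact one-dimensional minimizer
    return beta

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # tiny illustrative design matrix
y = [1.0, 2.0, 3.0]
beta = ccd_ridge(X, y)                          # solves (X'X + lam*I) beta = X'y
```

    The inner sums over observations are where the paper's GPU parallelization pays off: with tens of millions of rows, each coordinate update is itself a massive reduction.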

  17. Hyper-Systolic Processing on APE100/QUADRICS: n²-Loop Computations

    NASA Astrophysics Data System (ADS)

    Lippert, Thomas; Ritzenhöfer, Gero; Glaessner, Uwe; Hoeber, Henning; Seyfried, Armin; Schilling, Klaus

    We investigate the performance gains from hyper-systolic implementations of n²-loop problems on the massively parallel computer Quadrics, exploiting its three-dimensional interprocessor connectivity. For illustration we study the communication aspects of an exact molecular dynamics simulation of n particles with Coulomb (or gravitational) interactions. We compare the interprocessor communication costs of the standard-systolic and the hyper-systolic approaches for various granularities. We predict gain factors as large as three on the Q4 and eight on the QH4 and measure actual performances on these machine configurations. We conclude that it appears feasible to investigate the thermodynamics of a full gravitating n-body problem with O(16,000) particles using the new method on a QH4 system.
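
    The standard-systolic baseline the paper improves on can be emulated in a few lines: a travelling copy of the particle array is rotated around a ring, so that n-1 shifts expose every pairing. The 1/r "potential" and the four sample coordinates are illustrative; the hyper-systolic variant then cuts the communication cost by storing roughly O(√n) shifted bases instead of shifting n-1 times.

```python
# Standard-systolic n^2-loop: rotate a travelling copy of the data so each of
# the n-1 shifts pairs every particle with exactly one new partner.
def systolic_potential(x):
    """Total 1/r pair interaction accumulated via ring shifts."""
    n = len(x)
    travel = x[:]
    total = [0.0] * n
    for _ in range(n - 1):
        travel = travel[1:] + travel[:1]    # one systolic shift around the ring
        for i in range(n):                  # on a real machine: one processor each
            total[i] += 1.0 / abs(x[i] - travel[i])
    return sum(total) / 2                   # every pair was counted twice

pairs_energy = systolic_potential([0.0, 1.0, 3.0, 6.0])
```

    Each shift is one nearest-neighbor communication on the machine's torus network, which is why the shift count, not the arithmetic, dominates the cost comparison in the paper.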

  18. Decoupling Principle Analysis and Development of a Parallel Three-Dimensional Force Sensor

    PubMed Central

    Zhao, Yanzhi; Jiao, Leihao; Weng, Dacheng; Zhang, Dan; Zheng, Rencheng

    2016-01-01

    In the development of multi-dimensional force sensors, dimension coupling is the ubiquitous factor restricting improvement of measurement accuracy. To effectively reduce the influence of dimension coupling on the parallel multi-dimensional force sensor, a novel parallel three-dimensional force sensor is proposed based on a mechanical decoupling principle, in which the influence of friction on dimension coupling is effectively reduced by replacing sliding friction with rolling friction. In this paper, the mathematical model is established from the structural model of the parallel three-dimensional force sensor, and the modeling and analysis of the mechanical decoupling are carried out. The coupling degree (ε) of the designed sensor is defined and calculated, and the results show that the mechanically decoupled parallel structure of the sensor possesses good decoupling performance. A prototype of the parallel three-dimensional force sensor was developed and FEM analysis was carried out. A load-calibration and data-acquisition experiment system was built, and calibration experiments were performed. According to these experiments, the measurement error is less than 2.86% and the coupling error is less than 3.02%. The experimental results show that the sensor system possesses high measuring accuracy, which provides a basis for applied research on parallel multi-dimensional force sensors. PMID:27649194

  19. Increasing the reach of forensic genetics with massively parallel sequencing.

    PubMed

    Budowle, Bruce; Schmedes, Sarah E; Wendt, Frank R

    2017-09-01

    The field of forensic genetics has made great strides in the analysis of biological evidence related to criminal and civil matters. More so, the discipline has set a standard of performance and quality in the forensic sciences. The advent of massively parallel sequencing will allow the field to expand its capabilities substantially. This review describes the salient features of massively parallel sequencing and how it can impact forensic genetics. The technology offers an increased number and variety of genetic markers that can be analyzed, higher sample throughput, and the capability of targeting different organisms, all within one unifying methodology. While there are many applications, three are described where massively parallel sequencing will have immediate impact: molecular autopsy, microbial forensics, and differentiation of monozygotic twins. The intent of this review is to expose the forensic science community to the potential enhancements that have arrived or soon will, and to demonstrate the continued expansion of the field of forensic genetics and its service in the investigation of legal matters.

  20. Balancing Newtonian gravity and spin to create localized structures

    NASA Astrophysics Data System (ADS)

    Bush, Michael; Lindner, John

    2015-03-01

    Using geometry and Newtonian physics, we design localized structures that do not require electromagnetic or other forces to resist implosion or explosion. In two-dimensional Euclidean space, we find an equilibrium configuration of a rotating ring of massive dust whose inward gravity is the centripetal force that spins it. We find similar solutions in three-dimensional Euclidean and hyperbolic spaces, but only in the limit of vanishing mass. Finally, in three-dimensional Euclidean space, we generalize the two-dimensional result by finding an equilibrium configuration of a spherical shell of massive dust that supports itself against gravitational collapse by spinning isoclinically in four dimensions so its three-dimensional acceleration is everywhere inward. These Newtonian ``atoms'' illuminate classical physics and geometry.
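
    The two-dimensional ring equilibrium can be checked numerically for N discrete point masses: summing the inward radial pull of all partners and equating it to the centripetal acceleration ω²R gives the spin rate. The units (G = m = R = 1) are arbitrary, and the discrete ring is a stand-in for the paper's continuous ring of dust.

```python
import math

# For N equal masses on a ring of radius R, the partner at angular offset
# 2*pi*k/N sits a chord 2*R*sin(pi*k/N) away, and the inward radial component
# of its pull works out to G*m / (4*R**2*sin(pi*k/N)). Setting the summed
# pull equal to omega**2 * R yields the equilibrium spin rate.
def ring_spin_rate(N, G=1.0, m=1.0, R=1.0):
    inward = sum(G * m / (4 * R**2 * math.sin(math.pi * k / N))
                 for k in range(1, N))
    return math.sqrt(inward / R)

# Sanity check: for N = 2 the pull is G*m/(2R)**2, so omega = sqrt(G*m/(4*R**3)).
omega_pair = ring_spin_rate(2)              # 0.5 in these units
```

    The spin rate grows with N because the nearest partners dominate the pull, hinting at why the thin-ring (continuum) limit needs the careful treatment the paper gives it.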

  1. Three-dimensional nanoscale imaging by plasmonic Brownian microscopy

    NASA Astrophysics Data System (ADS)

    Labno, Anna; Gladden, Christopher; Kim, Jeongmin; Lu, Dylan; Yin, Xiaobo; Wang, Yuan; Liu, Zhaowei; Zhang, Xiang

    2017-12-01

    Three-dimensional (3D) imaging at the nanoscale is key to understanding nanomaterials and complex systems. While scanning probe microscopy (SPM) has been the workhorse of nanoscale metrology, the slow scanning speed of a single probe tip can limit the application of SPM to wide-field imaging of 3D complex nanostructures. Both electron microscopy and optical tomography allow 3D imaging, but the former is limited to vacuum environments by electron scattering and the latter to micron-scale optical resolution. Here we demonstrate plasmonic Brownian microscopy (PBM) as a way to improve the imaging speed of SPM. Unlike photonic force microscopy, where a single trapped particle performs a serial scan, PBM utilizes a massive number of plasmonic nanoparticles (NPs) undergoing Brownian diffusion in solution to scan in parallel around the unlabeled sample object. The motion of the NPs under an evanescent field is localized in three dimensions to reconstruct the super-resolution topology of 3D dielectric objects. Our method allows high-throughput imaging of complex 3D structures over a large field of view, even of internal structures such as cavities that cannot be accessed by conventional mechanical tips in SPM.

  2. Digital and optical shape representation and pattern recognition; Proceedings of the Meeting, Orlando, FL, Apr. 4-6, 1988

    NASA Technical Reports Server (NTRS)

    Juday, Richard D. (Editor)

    1988-01-01

    The present conference discusses topics in pattern-recognition correlator architectures, digital stereo systems, geometric image transformations and their applications, topics in pattern recognition, filter algorithms, object detection and classification, shape representation techniques, and model-based object recognition methods. Attention is given to edge-enhancement preprocessing using liquid crystal TVs, massively-parallel optical data base management, three-dimensional sensing with polar exponential sensor arrays, the optical processing of imaging spectrometer data, hybrid associative memories and metric data models, the representation of shape primitives in neural networks, and the Monte Carlo estimation of moment invariants for pattern recognition.

  3. Parallel Computation and Visualization of Three-dimensional, Time-dependent, Thermal Convective Flows

    NASA Technical Reports Server (NTRS)

    Wang, P.; Li, P.

    1998-01-01

    A high-resolution numerical study on parallel systems is reported on three-dimensional, time-dependent, thermal convective flows. A parallel implementation of the finite volume method with a multigrid scheme is discussed, and a parallel visualization system is developed on distributed systems for visualizing the flow.

  4. Killing vector fields in three dimensions: a method to solve massive gravity field equations

    NASA Astrophysics Data System (ADS)

    Gürses, Metin

    2010-10-01

    Killing vector fields in three dimensions play an important role in the construction of the related spacetime geometry. In this work we show that when a three-dimensional geometry admits a Killing vector field then the Ricci tensor of the geometry is determined in terms of the Killing vector field and its scalars. In this way we can generate all products and covariant derivatives at any order of the Ricci tensor. Using this property we give ways to solve the field equations of topologically massive gravity (TMG) and new massive gravity (NMG) introduced recently. In particular when the scalars of the Killing vector field (timelike, spacelike and null cases) are constants then all three-dimensional symmetric tensors of the geometry, the Ricci and Einstein tensors, their covariant derivatives at all orders, and their products of all orders are completely determined by the Killing vector field and the metric. Hence, the corresponding three-dimensional metrics are strong candidates for solving all higher derivative gravitational field equations in three dimensions.

  5. Photochemical numerics for global-scale modeling: Fidelity and GCM testing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Elliott, S.; Jim Kao, Chih-Yue; Zhao, X.

    1995-03-01

    Atmospheric photochemistry lies at the heart of global-scale pollution problems, but it is a nonlinear system embedded in nonlinear transport and so must be modeled in three dimensions. Total earth grids are massive and kinetics require dozens of interacting tracers, taxing supercomputers to their limits in global calculations. A matrix-free and noniterative family scheme is described that permits chemical step sizes an order of magnitude or more larger than time constants for molecular groupings, in the 1-h range used for transport. Families are partitioned through linearized implicit integrations that produce stabilizing species concentrations for a mass-conserving forward solver. The kinetics are also parallelized by moving geographic loops innermost and changes in the continuity equations are automated through list reading. The combination of speed, parallelization and automation renders the programs naturally modular. Accuracy lies within 1% for all species in week-long fidelity tests. A 50-species, 150-reaction stratospheric module tested in a spectral GCM benchmarks at 10 min CPU time per day and agrees with lower-dimensionality simulations. Tropospheric nonmethane hydrocarbon chemistry will soon be added, and inherently three-dimensional phenomena will be investigated both decoupled from dynamics and in a complete chemical GCM. 225 refs., 11 figs., 2 tabs.
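
    The linearized implicit integration underlying the family scheme can be sketched on a single stiff reaction. The rate constant and step size below are illustrative, chosen only so that the chemical step far exceeds the species time constant, mirroring the 1-h transport-sized steps described above.

```python
# One Newton linearization per step: approximate the implicit update
# y_new = y + dt*f(y_new) by expanding f about the current state, giving
# y_new = y + dt*f(y) / (1 - dt*f'(y)). For this linear decay the update is
# unconditionally stable, while explicit Euler at the same dt would blow up.
def linearized_implicit_step(y, f, dfdy, dt):
    return y + dt * f(y) / (1.0 - dt * dfdy(y))

k = 100.0                       # stiff loss rate; species time constant 1/k = 0.01 h
y = 1.0
for _ in range(24):             # 1-hour steps, 100x the species time constant
    y = linearized_implicit_step(y, lambda v: -k * v, lambda v: -k, 1.0)
# y decays smoothly toward zero and never goes negative
```

    In the family scheme this stabilized estimate is what feeds the mass-conserving forward solver, so step sizes track transport rather than the fastest chemistry.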

  6. Parameters analysis of a porous medium model for treatment with hyperthermia using OpenMP

    NASA Astrophysics Data System (ADS)

    Freitas Reis, Ruy; dos Santos Loureiro, Felipe; Lobosco, Marcelo

    2015-09-01

    Cancer is the second leading cause of death in the world, so treatments have been developed to address this global health problem. Hyperthermia is not a new technique, but its use in cancer treatment is still at an early stage of development. The treatment is based on overheating the target area to a threshold temperature that causes cancerous cell necrosis and apoptosis. To simulate this phenomenon for an under-skin cancer treatment using magnetic nanoparticles, a three-dimensional porous medium model was adopted. This study presents a sensitivity analysis of the model parameters, such as porosity and blood velocity. To ensure a second-order solution, a 7-point centered finite difference method was used for space discretization, while a predictor-corrector method was used for time evolution. Due to the massive computations required to solve a three-dimensional model, this paper also presents a first attempt to improve performance using OpenMP, a parallel programming API.
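
    A minimal version of the 7-point centered stencil can be written directly. The thermal diffusivity, grid, and temperatures below are illustrative (the paper's model adds porous-medium and perfusion terms), and the outer triple loop is the natural target for an OpenMP `parallel for`.

```python
# 7-point centered finite-difference step for 3D heat diffusion,
# dT/dt = alpha * laplacian(T), on a cube with fixed-temperature boundaries.
def diffuse_step(T, alpha, dx, dt):
    n = len(T)
    new = [[[T[i][j][k] for k in range(n)] for j in range(n)] for i in range(n)]
    c = alpha * dt / dx**2                    # must stay below 1/6 for stability
    for i in range(1, n-1):                   # in C/OpenMP: #pragma omp parallel for
        for j in range(1, n-1):
            for k in range(1, n-1):
                lap = (T[i+1][j][k] + T[i-1][j][k] + T[i][j+1][k] +
                       T[i][j-1][k] + T[i][j][k+1] + T[i][j][k-1] -
                       6 * T[i][j][k])        # the 7-point stencil
                new[i][j][k] = T[i][j][k] + c * lap
    return new

n = 9
T = [[[37.0]*n for _ in range(n)] for _ in range(n)]   # body temperature, deg C
T[4][4][4] = 45.0                             # heated nanoparticle site (toy value)
for _ in range(10):
    T = diffuse_step(T, alpha=1.4e-7, dx=1e-3, dt=1.0)
```

    Every cell update reads only its six neighbors, which is why the stencil parallelizes cleanly across threads once the predictor-corrector stages are kept in lockstep.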

  7. Chiral symmetry breaking in quenched massive strong-coupling four-dimensional QED

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hawes, F.T.; Williams, A.G.

    1995-03-15

    We present results from a study of subtractive renormalization of the fermion propagator Dyson-Schwinger equation (DSE) in massive strong-coupling quenched four-dimensional QED. The results are compared for three different fermion-photon proper vertex Ansätze: bare γ^μ, …

  8. TDat: An Efficient Platform for Processing Petabyte-Scale Whole-Brain Volumetric Images.

    PubMed

    Li, Yuxin; Gong, Hui; Yang, Xiaoquan; Yuan, Jing; Jiang, Tao; Li, Xiangning; Sun, Qingtao; Zhu, Dan; Wang, Zhenyu; Luo, Qingming; Li, Anan

    2017-01-01

    Three-dimensional imaging of whole mammalian brains at single-neuron resolution has generated terabyte (TB)- and even petabyte (PB)-sized datasets. Due to their size, processing these massive image datasets can be hindered by the computer hardware and software typically found in biological laboratories. To fill this gap, we have developed an efficient platform named TDat, which adopts a novel data-reformatting strategy that reads cuboid data and employs parallel computing. In data reformatting, TDat is more efficient than other available software. In data access, parallelization is adopted to fully exploit the data-transmission capability of the computer. We applied TDat to large-volume rigid registration and to neuron tracing in whole-brain data at single-neuron resolution, which has not been demonstrated in other studies. We also showed its compatibility with various computing platforms, image processing software and imaging systems.

  9. Topical perspective on massive threading and parallelism.

    PubMed

    Farber, Robert M

    2011-09-01

    Unquestionably, computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive threading and massive parallelism, as in the 1.3-million-thread Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world--is how to capitalize on these new massively threaded computational architectures, especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism, with some insight into the future. Published by Elsevier Inc.

  10. Phase space simulation of collisionless stellar systems on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    White, Richard L.

    1987-01-01

    A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two dimensional phase space grid (with one space and one velocity dimension). Some results from calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray-XMP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem.

  11. Development and characterization of hollow microprobe array as a potential tool for versatile and massively parallel manipulation of single cells.

    PubMed

    Nagai, Moeto; Oohara, Kiyotaka; Kato, Keita; Kawashima, Takahiro; Shibata, Takayuki

    2015-04-01

    Parallel manipulation of single cells is important for reconstructing in vivo cellular microenvironments and studying cell functions. To manipulate single cells and reconstruct their environments, development of a versatile manipulation tool is necessary. In this study, we developed an array of hollow probes using microelectromechanical systems fabrication technology and demonstrated the manipulation of single cells. We conducted a cell aspiration experiment with a glass pipette and modeled a cell using a standard linear solid model, which provided information for designing hollow stepped probes for minimally invasive single-cell manipulation. We etched a silicon wafer on both sides and formed through holes with stepped structures. The inner diameters of the holes were reduced by SiO2 deposition using plasma-enhanced chemical vapor deposition to trap cells on the tips. This fabrication process makes it possible to control the wall thickness, inner diameter, and outer diameter of the probes. With the fabricated probes, single cells were manipulated and placed in microwells at a single-cell level in a parallel manner. We studied the capture, release, and survival rates of cells at different suction and release pressures and found that the cell trapping rate was directly proportional to the suction pressure, whereas the release rate and viability decreased with increasing suction pressure. The proposed manipulation system makes it possible to place cells in a well array and observe their adherence, spreading, culture, and death. This system has potential as a tool for massively parallel manipulation and for three-dimensional heterocellular assays.
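
    The standard linear solid fit mentioned above has a compact closed form. The parameter values below are hypothetical, and the parallel-spring-plus-Maxwell-arm arrangement is one common convention for the model, not necessarily the exact configuration the authors fitted.

```python
import math

# Standard linear solid (Zener) model: a spring E_inf in parallel with a
# Maxwell arm (spring E1 in series with a dashpot eta). Under a step strain
# the stress relaxes from (E_inf + E1) down to E_inf with time constant eta/E1.
def sls_relaxation_modulus(t, E_inf, E1, eta):
    tau = eta / E1                          # Maxwell-arm relaxation time
    return E_inf + E1 * math.exp(-t / tau)

# Hypothetical cell-scale parameters (kPa, kPa, kPa*s): instantaneous stiffness
# 0.5 kPa relaxing toward 0.1 kPa -- the kind of curve that informs how much
# suction a probe tip can apply before a cell is damaged.
stiff_now = sls_relaxation_modulus(0.0, 0.1, 0.4, 2.0)
stiff_late = sls_relaxation_modulus(100.0, 0.1, 0.4, 2.0)
```

    The gap between the instantaneous and long-time moduli is what makes suction timing matter: the same pressure deforms a cell far more if held past the relaxation time.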

  12. On the dimensionally correct kinetic theory of turbulence for parallel propagation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaelzer, R.; Ziebell, L. F.; Yoon, P. H.

    2015-03-15

    Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes the turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, terms associated with the spontaneous fluctuations in the particle and wave kinetic equations of their theory have proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.

  13. Hawking radiation and propagation of massive charged scalar field on a three-dimensional Gödel black hole

    NASA Astrophysics Data System (ADS)

    González, P. A.; Övgün, Ali; Saavedra, Joel; Vásquez, Yerko

    2018-06-01

    In this paper we consider the three-dimensional Gödel black hole as a background, and we study vector particle tunneling from this background in order to obtain the Hawking temperature. Then, we study the propagation of a massive charged scalar field and find the quasinormal modes analytically, which turn out to be unstable as a consequence of the existence of closed time-like curves. Also, we consider the flux at the horizon and at infinity, and we compute the reflection and transmission coefficients as well as the absorption cross section. In particular, we show that massive charged scalar waves can be superradiantly amplified by the three-dimensional Gödel black hole and that the coefficients have an oscillatory behavior. Moreover, the absorption cross section vanishes in the high-frequency limit and for certain values of the frequency.

  14. The architecture of tomorrow's massively parallel computer

    NASA Technical Reports Server (NTRS)

    Batcher, Ken

    1987-01-01

    Goodyear Aerospace delivered the Massively Parallel Processor (MPP) to NASA/Goddard in May 1983, over three years ago. Ever since then, Goodyear has tried to look in a forward direction. There is always some debate as to which way is forward when it comes to supercomputer architecture. Improvements to the MPP's massively parallel architecture are discussed in the areas of data I/O, memory capacity, connectivity, and indirect (or local) addressing. In I/O, transfer rates up to 640 megabytes per second can be achieved. There are devices that can supply the data and accept it at this rate. The memory capacity can be increased up to 128 megabytes in the ARU and over a gigabyte in the staging memory. For connectivity, there are several different kinds of multistage networks that should be considered.

  15. High-performance parallel analysis of coupled problems for aircraft propulsion

    NASA Technical Reports Server (NTRS)

    Felippa, C. A.; Farhat, C.; Chen, P.-S.; Gumaste, U.; Leoinne, M.; Stern, P.

    1995-01-01

    This research program deals with the application of high-performance computing methods to the numerical simulation of complete jet engines. The program was initiated in 1993 by applying two-dimensional parallel aeroelastic codes to the interior gas flow problem of a by-pass jet engine. The fluid mesh generation, domain decomposition and solution capabilities were successfully tested. Attention was then focused on methodology for the partitioned analysis of the interaction of the gas flow with a flexible structure and with the fluid mesh motion driven by these structural displacements. The latter is treated by an ALE technique that models the fluid mesh motion as that of a fictitious mechanical network laid along the edges of near-field fluid elements. New partitioned analysis procedures to treat this coupled 3-component problem were developed in 1994. These procedures involved delayed corrections and subcycling, and have been successfully tested on several massively parallel computers. For the global steady-state axisymmetric analysis of a complete engine we have decided to use the NASA-sponsored ENG10 program, which uses a regular FV-multiblock-grid discretization in conjunction with circumferential averaging to include effects of blade forces, loss, combustor heat addition, blockage, bleeds and convective mixing. A load-balancing preprocessor for parallel versions of ENG10 has been developed. It is planned to use the steady-state global solution provided by ENG10 as input to a localized three-dimensional FSI analysis for engine regions where aeroelastic effects may be important.

  16. CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation

    NASA Astrophysics Data System (ADS)

    Schneider, Evan E.; Robertson, Brant E.

    2015-04-01

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256³) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.

  17. ALEGRA -- A massively parallel h-adaptive code for solid dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Summers, R.M.; Wong, M.K.; Boucheron, E.A.

    1997-12-31

    ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.

  18. Particle simulation of plasmas on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Gledhill, I. M. A.; Storey, L. R. O.

    1987-01-01

    Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
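The FFT-based field solve described above can be sketched for a periodic electrostatic case (a deliberate simplification of the paper's fully electromagnetic solver): the Poisson equation is solved in Fourier space by dividing each mode by k².

```python
import numpy as np

def poisson_fft(rho, box=1.0):
    """Solve laplacian(phi) = -rho on a periodic square grid via FFT,
    leaving the mean (k = 0) mode at zero."""
    n = rho.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    rho_hat = np.fft.fft2(rho)
    phi_hat = np.zeros_like(rho_hat)
    nz = k2 > 0
    phi_hat[nz] = rho_hat[nz] / k2[nz]   # -k^2 phi_hat = -rho_hat
    return np.real(np.fft.ifft2(phi_hat))

# single-mode check on a 128 x 128 grid, matching the paper's grid size:
# rho = sin(2*pi*x) should give phi = sin(2*pi*x) / (2*pi)^2
n = 128
x = np.linspace(0.0, 1.0, n, endpoint=False)
rho = np.sin(2.0 * np.pi * x)[:, None] * np.ones(n)[None, :]
phi = poisson_fft(rho)
```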

  19. Seafloor Topographic Analysis in Staged Ocean Resource Exploration

    NASA Astrophysics Data System (ADS)

    Ikeda, M.; Okawa, M.; Osawa, K.; Kadoshima, K.; Asakawa, E.; Sumi, T.

    2017-12-01

    J-MARES (Research and Development Partnership for Next Generation Technology of Marine Resources Survey, JAPAN) has been designing a low-cost, high-efficiency exploration system for seafloor hydrothermal massive sulfide deposits under the "Cross-ministerial Strategic Innovation Promotion Program (SIP)" granted by the Cabinet Office, Government of Japan, since 2014. We designed a method to narrow down prospective mineral deposit areas in multiple stages (regional survey, semi-detailed survey, and detailed survey) by extracting topographic features of well-known seafloor massive sulfide deposits from seafloor topographic analysis of bathymetric survey data. We applied this procedure to an area of interest of more than 100 km x 100 km over the Okinawa Trough, including some known seafloor massive sulfide deposits. In addition, we created a three-dimensional model of seafloor topography with the SfM (Structure from Motion) technique, using multiple images of chimneys distributed around a well-known seafloor massive sulfide deposit, taken with a Hi-Vision camera mounted on an ROV during the detailed survey stage alongside geophysical exploration. Topographic features of the chimneys were extracted by measuring the resulting three-dimensional model. As a result, it was possible to estimate the shape of seafloor sulfide bodies to be mined, such as chimneys, from the three-dimensional model created from the ROV camera imagery. In this presentation, we discuss multi-stage focusing of prospective mineral deposit areas through seafloor topographic analysis within an exploration system for seafloor massive sulfide deposits, and the three-dimensional model of seafloor topography created from seafloor images taken with an ROV.

  20. Multi-Scale Human Respiratory System Simulations to Study Health Effects of Aging, Disease, and Inhaled Substances

    NASA Astrophysics Data System (ADS)

    Kunz, Robert; Haworth, Daniel; Dogan, Gulkiz; Kriete, Andres

    2006-11-01

    Three-dimensional, unsteady simulations of multiphase flow, gas exchange, and particle/aerosol deposition in the human lung are reported. Surface data for human tracheo-bronchial trees are derived from CT scans, and are used to generate three-dimensional CFD meshes for the first several generations of branching. One-dimensional meshes for the remaining generations down to the respiratory units are generated using branching algorithms based on those that have been proposed in the literature, and a zero-dimensional respiratory unit (pulmonary acinus) model is attached at the end of each terminal bronchiole. The process is automated to facilitate rapid model generation. The model is exercised through multiple breathing cycles to compute the spatial and temporal variations in flow, gas exchange, and particle/aerosol deposition. The depth of the 3D/1D transition (at branching generation n) is a key parameter, and can be varied. High-fidelity models (large n) are run on massively parallel distributed-memory clusters, and are used to generate physical insight and to calibrate/validate the 1D and 0D models. Suitably validated lower-order models (small n) can be run on single-processor PCs with run times that allow model-based clinical intervention for individual patients.

  1. Conformal anomaly of some 2-d Z (n) models

    NASA Astrophysics Data System (ADS)

    William, Peter

    1991-01-01

    We describe a numerical calculation of the conformal anomaly in the case of some two-dimensional statistical models undergoing a second-order phase transition, utilizing a recently developed method to compute the partition function exactly. This computation is carried out on a massively parallel CM2 machine, using the finite size scaling behaviour of the free energy.

  2. Nonlinear three-dimensional verification of the SPECYL and PIXIE3D magnetohydrodynamics codes for fusion plasmas

    NASA Astrophysics Data System (ADS)

    Bonfiglio, D.; Chacón, L.; Cappello, S.

    2010-08-01

    With the increasing impact of scientific discovery via advanced computation, there is presently a strong emphasis on ensuring the mathematical correctness of computational simulation tools. Such endeavor, termed verification, is now at the center of most serious code development efforts. In this study, we address a cross-benchmark nonlinear verification study between two three-dimensional magnetohydrodynamics (3D MHD) codes for fluid modeling of fusion plasmas, SPECYL [S. Cappello and D. Biskamp, Nucl. Fusion 36, 571 (1996)] and PIXIE3D [L. Chacón, Phys. Plasmas 15, 056103 (2008)], in their common limit of application: the simple viscoresistive cylindrical approximation. SPECYL is a serial code in cylindrical geometry that features a spectral formulation in space and a semi-implicit temporal advance, and has been used extensively to date for reversed-field pinch studies. PIXIE3D is a massively parallel code in arbitrary curvilinear geometry that features a conservative, solenoidal finite-volume discretization in space, and a fully implicit temporal advance. The present study is, in our view, a first mandatory step in assessing the potential of any numerical 3D MHD code for fluid modeling of fusion plasmas. Excellent agreement is demonstrated over a wide range of parameters for several fusion-relevant cases in both two- and three-dimensional geometries.

  3. Nonlinear three-dimensional verification of the SPECYL and PIXIE3D magnetohydrodynamics codes for fusion plasmas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bonfiglio, Daniele; Chacon, Luis; Cappello, Susanna

    2010-01-01

    With the increasing impact of scientific discovery via advanced computation, there is presently a strong emphasis on ensuring the mathematical correctness of computational simulation tools. Such endeavor, termed verification, is now at the center of most serious code development efforts. In this study, we address a cross-benchmark nonlinear verification study between two three-dimensional magnetohydrodynamics (3D MHD) codes for fluid modeling of fusion plasmas, SPECYL [S. Cappello and D. Biskamp, Nucl. Fusion 36, 571 (1996)] and PIXIE3D [L. Chacon, Phys. Plasmas 15, 056103 (2008)], in their common limit of application: the simple viscoresistive cylindrical approximation. SPECYL is a serial code in cylindrical geometry that features a spectral formulation in space and a semi-implicit temporal advance, and has been used extensively to date for reversed-field pinch studies. PIXIE3D is a massively parallel code in arbitrary curvilinear geometry that features a conservative, solenoidal finite-volume discretization in space, and a fully implicit temporal advance. The present study is, in our view, a first mandatory step in assessing the potential of any numerical 3D MHD code for fluid modeling of fusion plasmas. Excellent agreement is demonstrated over a wide range of parameters for several fusion-relevant cases in both two- and three-dimensional geometries.

  4. Design and Implementation of a Parallel Multivariate Ensemble Kalman Filter for the Poseidon Ocean General Circulation Model

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele M.; Koblinsky, Chester (Technical Monitor)

    2001-01-01

    A multivariate ensemble Kalman filter (MvEnKF) has been implemented on a massively parallel computer architecture for the Poseidon ocean circulation model and tested with a Pacific Basin model configuration. There are about two million prognostic state-vector variables. Parallelism for the data assimilation step is achieved by regionalization of the background-error covariances that are calculated from the phase-space distribution of the ensemble. Each processing element (PE) collects elements of a matrix measurement functional from nearby PEs. To avoid the introduction of spurious long-range covariances associated with finite ensemble sizes, the background-error covariances are given compact support by means of a Hadamard (element by element) product with a three-dimensional canonical correlation function. The methodology and the MvEnKF configuration are discussed. It is shown that the regionalization of the background covariances has a negligible impact on the quality of the analyses. The parallel algorithm is very efficient for large numbers of observations but does not scale well beyond 100 PEs at the current model resolution. On a platform with distributed memory, memory rather than speed is the limiting factor.
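The compact-support Hadamard localization can be sketched in a few lines. Here a simple triangular taper stands in for the canonical correlation function used in the paper, and all names are our own:

```python
import numpy as np

def localized_covariance(ensemble, coords, cutoff):
    """Ensemble sample covariance, tapered by a Hadamard (element-wise)
    product with a compact-support correlation so that covariances
    beyond the cutoff distance are exactly zero."""
    anomalies = ensemble - ensemble.mean(axis=0)
    p = anomalies.T @ anomalies / (ensemble.shape[0] - 1)
    dist = np.abs(coords[:, None] - coords[None, :])
    taper = np.clip(1.0 - dist / cutoff, 0.0, 1.0)   # triangular taper
    return p * taper

rng = np.random.default_rng(0)
coords = np.linspace(0.0, 10.0, 20)        # 1-D state-variable positions
ens = rng.normal(size=(50, 20))            # 50 ensemble members
p_loc = localized_covariance(ens, coords, cutoff=3.0)
```

Spurious long-range covariances from the finite ensemble are suppressed exactly beyond the cutoff, while the matrix stays symmetric.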

  5. Large-scale three-dimensional phase-field simulations for phase coarsening at ultrahigh volume fraction on high-performance architectures

    NASA Astrophysics Data System (ADS)

    Yan, Hui; Wang, K. G.; Jones, Jim E.

    2016-06-01

    A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics in phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computer power available from high-performance architectures. The parallelized code enables three-dimensional simulation system sizes up to a 512³ grid cube. Through the parallelized code, practical runtime can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high-resolution parallel simulations is greatly improved over that obtainable from serial simulations. A detailed performance analysis on speed-up and scalability is presented, showing good scalability which improves with increasing problem size. In addition, a model for prediction of runtime is developed, which shows good agreement with actual run times from numerical tests.

  6. 3D brain tumor localization and parameter estimation using thermographic approach on GPU.

    PubMed

    Bousselham, Abdelmajid; Bouattane, Omar; Youssfi, Mohamed; Raihani, Abdelhadi

    2018-01-01

    The aim of this paper is to present a GPU parallel algorithm for brain tumor detection that estimates tumor size and location from the surface temperature distribution obtained by thermography. The normal brain tissue is modeled as a rectangular cube containing a spherical tumor. The temperature distribution is calculated with the forward three-dimensional Pennes bioheat transfer equation, which is solved using a massively parallel Finite Difference Method (FDM) implemented on a Graphics Processing Unit (GPU). A Genetic Algorithm (GA) is used to solve the inverse problem and estimate the tumor size and location by minimizing an objective function comparing measured surface temperatures to those obtained by numerical simulation. The parallel implementation of the Finite Difference Method significantly reduces the bioheat transfer computation time and greatly accelerates the inverse identification of the brain tumor's thermophysical and geometrical properties. Experimental results show significant gains in computational speed on the GPU, achieving a speedup of around 41 compared to the CPU. An analysis of estimation performance as a function of tumor size inside the brain tissue is also presented.
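A minimal explicit finite-difference step for a 1D Pennes bioheat equation might look as follows. The parameter values are illustrative placeholders, not those of the paper, and the actual work is three-dimensional and GPU-parallel:

```python
import numpy as np

def pennes_step(temp, dt, dx, k=0.5, rho_c=3.7e6, perf=2400.0,
                t_a=37.0, q_m=700.0):
    """One explicit FDM step of the 1D Pennes bioheat equation
    rho*c dT/dt = k d2T/dx2 + perf*(T_a - T) + q_m, with fixed
    (Dirichlet) boundary temperatures. Parameter values are
    illustrative assumptions only."""
    lap = (temp[2:] - 2.0 * temp[1:-1] + temp[:-2]) / dx**2
    out = temp.copy()
    out[1:-1] += dt / rho_c * (k * lap + perf * (t_a - temp[1:-1]) + q_m)
    return out

n = 101
temp = np.full(n, 37.0)
temp[40:60] = 40.0                 # hot region standing in for a tumour
dt, dx = 0.05, 1e-3                # satisfies the explicit stability limit
for _ in range(200):
    temp = pennes_step(temp, dt, dx)
```

Diffusion and perfusion gradually pull the hot region back toward the arterial temperature, which is the forward behavior the GA inverts against.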

  7. Erratum: The Effects of Thermal Energetics on Three-dimensional Hydrodynamic Instabilities in Massive Protostellar Disks. II. High-Resolution and Adiabatic Evolutions

    NASA Astrophysics Data System (ADS)

    Pickett, Brian K.; Cassen, Patrick; Durisen, Richard H.; Link, Robert

    2000-02-01

    In the paper ``The Effects of Thermal Energetics on Three-dimensional Hydrodynamic Instabilities in Massive Protostellar Disks. II. High-Resolution and Adiabatic Evolutions'' by Brian K. Pickett, Patrick Cassen, Richard H. Durisen, and Robert Link (ApJ, 529, 1034 [2000]), the wrong version of Figure 10 was published as a result of an error at the Press. The correct version of Figure 10 appears below. The Press sincerely regrets this error.

  8. Implementation of ADI schemes on MIMD parallel computers

    NASA Technical Reports Server (NTRS)

    Vanderwijngaart, Rob F.

    1993-01-01

    In order to simulate the effects of the impingement of hot exhaust jets of High Performance Aircraft on landing surfaces a multi-disciplinary computation coupling flow dynamics to heat conduction in the runway needs to be carried out. Such simulations, which are essentially unsteady, require very large computational power in order to be completed within a reasonable time frame of the order of an hour. Such power can be furnished by the latest generation of massively parallel computers. These remove the bottleneck of ever more congested data paths to one or a few highly specialized central processing units (CPUs) by having many off-the-shelf CPUs work independently on their own data, and exchange information only when needed. During the past year the first phase of this project was completed, in which the optimal strategy for mapping an ADI-algorithm for the three dimensional unsteady heat equation to a MIMD parallel computer was identified. This was done by implementing and comparing three different domain decomposition techniques that define the tasks for the CPUs in the parallel machine. These implementations were done for a Cartesian grid and Dirichlet boundary conditions. The most promising technique was then used to implement the heat equation solver on a general curvilinear grid with a suite of nontrivial boundary conditions. Finally, this technique was also used to implement the Scalar Penta-diagonal (SP) benchmark, which was taken from the NAS Parallel Benchmarks report. All implementations were done in the programming language C on the Intel iPSC/860 computer.
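A serial Peaceman-Rachford ADI step for the 2D heat equation with Dirichlet boundaries, the kind of kernel being decomposed here, can be sketched as follows (our own sketch in Python for brevity, not the project's C code):

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal)."""
    n = len(d)
    cp, dp = np.zeros(n), np.zeros(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def adi_step(u, r):
    """One Peaceman-Rachford ADI step for u_t = u_xx + u_yy on a square
    grid with homogeneous Dirichlet boundaries; r = alpha*dt/(2*dx^2).
    Each half-step is implicit in one direction, explicit in the other."""
    n = u.shape[0]
    a = np.full(n - 2, -r); b = np.full(n - 2, 1 + 2 * r); c = np.full(n - 2, -r)
    half = np.zeros_like(u)
    for j in range(1, n - 1):          # implicit in x, explicit in y
        rhs = u[1:-1, j] + r * (u[1:-1, j + 1] - 2 * u[1:-1, j] + u[1:-1, j - 1])
        half[1:-1, j] = thomas(a, b, c, rhs)
    out = np.zeros_like(u)
    for i in range(1, n - 1):          # implicit in y, explicit in x
        rhs = half[i, 1:-1] + r * (half[i + 1, 1:-1] - 2 * half[i, 1:-1] + half[i - 1, 1:-1])
        out[i, 1:-1] = thomas(a, b, c, rhs)
    return out

# eigenmode check: u = sin(pi x) sin(pi y) is damped by a known factor g
n, r = 33, 0.4
x = np.arange(n) / (n - 1)
u0 = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
mu = -4.0 * np.sin(np.pi / (2 * (n - 1))) ** 2
g = ((1 + r * mu) / (1 - r * mu)) ** 2
u1 = adi_step(u0, r)
```

The batches of independent tridiagonal solves along each direction are exactly what the domain decomposition techniques in the study distribute across processors.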

  9. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Earth Sciences Division; Zhang, Keni

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4.
This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.

  10. Analysis of Massively Separated Flows of Aircraft Using Detached Eddy Simulation

    NASA Astrophysics Data System (ADS)

    Morton, Scott

    2002-08-01

    An important class of turbulent flows of aerodynamic interest are those characterized by massive separation, e.g., the flow around an aircraft at high angle of attack. Numerical simulation is an important tool for analysis, though traditional models used in the solution of the Reynolds-averaged Navier-Stokes (RANS) equations appear unable to accurately account for the time-dependent and three-dimensional motions governing flows with massive separation. Large-eddy simulation (LES) is able to resolve these unsteady three-dimensional motions, yet is cost prohibitive for high Reynolds number wall-bounded flows due to the need to resolve the small scale motions in the boundary layer. Spalart et al. proposed a hybrid technique, Detached-Eddy Simulation (DES), which takes advantage of the often adequate performance of RANS turbulence models in the "thin," typically attached regions of the flow. In the separated regions of the flow the technique becomes a Large Eddy Simulation, directly resolving the time-dependent and unsteady features that dominate regions of massive separation. The current work applies DES to a 70 degree sweep delta wing at 27 degrees angle of attack, a geometrically simple yet challenging flowfield that exhibits the unsteady three-dimensional massively separated phenomena of vortex breakdown. After detailed examination of this basic flowfield, the method is demonstrated on three full aircraft of interest characterized by massive separation: the F-16 at 45 degrees angle of attack, the F-15 at 65 degrees angle of attack (with comparison to flight test), and the C-130 in a parachute drop condition at near stall speed with cargo doors open.

  11. Suppressing correlations in massively parallel simulations of lattice models

    NASA Astrophysics Data System (ADS)

    Kelling, Jeffrey; Ódor, Géza; Gemming, Sibylle

    2017-11-01

    For lattice Monte Carlo simulations parallelization is crucial to make studies of large systems and long simulation time feasible, while sequential simulations remain the gold-standard for correlation-free dynamics. Here, various domain decomposition schemes are compared, concluding with one which delivers virtually correlation-free simulations on GPUs. Extensive simulations of the octahedron model for 2 + 1 dimensional Kardar-Parisi-Zhang surface growth, which is very sensitive to correlation in the site-selection dynamics, were performed to show self-consistency of the parallel runs and agreement with the sequential algorithm. We present a GPU implementation providing a speedup of about 30 × over a parallel CPU implementation on a single socket and at least 180 × with respect to the sequential reference.
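The basic principle exploited by such decompositions is that sites of one sublattice colour share no neighbours and can therefore be updated concurrently without conflicts. The following two-colour (checkerboard) Metropolis sweep sketches the idea; we use the 2D Ising model as a stand-in for the octahedron surface-growth model studied in the paper:

```python
import numpy as np

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a periodic 2D Ising lattice with a
    two-colour (checkerboard) decomposition: all sites of one colour
    are updated simultaneously, since none of them are neighbours."""
    ii, jj = np.indices(spins.shape)
    for colour in (0, 1):
        mask = (ii + jj) % 2 == colour
        nbr = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0)
               + np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
        d_e = 2.0 * spins * nbr              # energy change if flipped
        accept = rng.random(spins.shape) < np.exp(-beta * np.clip(d_e, 0.0, None))
        spins = np.where(mask & accept, -spins, spins)
    return spins

def energy(s):
    """Nearest-neighbour Ising energy on a periodic lattice."""
    return -(s * (np.roll(s, 1, 0) + np.roll(s, 1, 1))).sum()

rng = np.random.default_rng(1)
spins = np.where(rng.random((32, 32)) < 0.5, 1, -1)
e0 = energy(spins)
for _ in range(100):
    spins = checkerboard_sweep(spins, beta=1.0, rng=rng)
e1 = energy(spins)
```

As the paper shows, this deterministic colour alternation is exactly the kind of site-selection constraint that can introduce subtle correlations relative to fully random sequential updates, which is why the authors compare several decomposition schemes.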

  12. RISC-type microprocessors may revolutionize aerospace simulation

    NASA Astrophysics Data System (ADS)

    Jackson, Albert S.

    The author explores the application of RISC (reduced instruction set computer) processors in massively parallel computer (MPC) designs for aerospace simulation. The MPC approach is shown to be well adapted to the needs of aerospace simulation. It is shown that any of the three common types of interconnection schemes used with MPCs are effective for general-purpose simulation, although the bus- or switch-oriented machines are somewhat easier to use. For partial differential equation models, the hypercube approach at first glance appears more efficient because the nearest-neighbor connections required for three-dimensional models are hardwired in a hypercube machine. However, the data broadcast ability of a bus system, combined with the fact that data can be transmitted over a bus as soon as it has been updated, makes the bus approach very competitive with the hypercube approach even for these types of models.

  13. Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units

    NASA Astrophysics Data System (ADS)

    Corral, Roque; Gisbert, Fernando; Pueblas, Jesus

    2017-02-01

    The implementation of an edge-based three-dimensional Reynolds-averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processing unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.

  14. Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software

    DTIC Science & Technology

    2015-08-01

    Weingarten, N Scott (Weapons and Materials Research Directorate, ARL); Larentzos, James P (Engility). Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software.

  15. Colloidal assembly directed by virtual magnetic moulds

    NASA Astrophysics Data System (ADS)

    Demirörs, Ahmet F.; Pillai, Pramod P.; Kowalczyk, Bartlomiej; Grzybowski, Bartosz A.

    2013-11-01

    Interest in assemblies of colloidal particles has long been motivated by their applications in photonics, electronics, sensors and microlenses. Existing assembly schemes can position colloids of one type relatively flexibly into a range of desired structures, but it remains challenging to produce multicomponent lattices, clusters with precisely controlled symmetries and three-dimensional assemblies. A few schemes can efficiently produce complex colloidal structures, but they require system-specific procedures. Here we show that magnetic field microgradients established in a paramagnetic fluid can serve as 'virtual moulds' to act as templates for the assembly of large numbers (~10⁸) of both non-magnetic and magnetic colloidal particles with micrometre precision and typical yields of 80 to 90 per cent. We illustrate the versatility of this approach by producing single-component and multicomponent colloidal arrays, complex three-dimensional structures and a variety of colloidal molecules from polymeric particles, silica particles and live bacteria and by showing that all of these structures can be made permanent. In addition, although our magnetic moulds currently resemble optical traps in that they are limited to the manipulation of micrometre-sized objects, they are massively parallel and can manipulate non-magnetic and magnetic objects simultaneously in two and three dimensions.

  16. Real time animation of space plasma phenomena

    NASA Technical Reports Server (NTRS)

    Jordan, K. F.; Greenstadt, E. W.

    1987-01-01

    In pursuit of real time animation of computer simulated space plasma phenomena, the code was rewritten for the Massively Parallel Processor (MPP). The program creates a dynamic representation of the global bowshock which is based on actual spacecraft data and designed for three dimensional graphic output. This output consists of time slice sequences which make up the frames of the animation. With the MPP, 16384, 512 or 4 frames can be calculated simultaneously depending upon which characteristic is being computed. The run time was greatly reduced which promotes the rapid sequence of images and makes real time animation a foreseeable goal. The addition of more complex phenomenology in the constructed computer images is now possible and work proceeds to generate these images.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Newman, G.A.; Commer, M.

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required exploiting computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24 hour period we were able to image a large scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  18. Three-dimensional Bayesian geostatistical aquifer characterization at the Hanford 300 Area using tracer test data

    NASA Astrophysics Data System (ADS)

    Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.; Hammond, Glenn E.; Rockhold, Mark L.; Zachara, John M.; Rubin, Yoram

    2012-06-01

    Tracer tests performed under natural or forced gradient flow conditions can provide useful information for characterizing subsurface properties, through monitoring, modeling, and interpretation of the tracer plume migration in an aquifer. Nonreactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter tests. A Bayesian data assimilation technique, the method of anchored distributions (MAD) (Rubin et al., 2010), was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using constant-rate injection and borehole flowmeter test data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field tracer data at the Hanford 300 Area demonstrates that inverting for spatial heterogeneity of hydraulic conductivity under transient flow conditions is challenging and more work is needed.
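The temporal moments of a breakthrough curve, the quantities used above to condition the conductivity field, can be computed in a few lines (a generic sketch, not MAD or PFLOTRAN code): the zeroth moment is the recovered tracer mass and the normalized first moment is the mean arrival time.

```python
import numpy as np

def temporal_moments(t, c):
    """Zeroth moment and normalized first moment (mean arrival time)
    of a tracer breakthrough curve C(t), by the trapezoidal rule."""
    def trapz(y):
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))
    m0 = trapz(c)
    m1 = trapz(t * c) / m0
    return m0, m1

# synthetic Gaussian breakthrough centred at t = 5 with unit spread
t = np.linspace(0.0, 20.0, 2001)
c = np.exp(-0.5 * (t - 5.0) ** 2)
m0, m1 = temporal_moments(t, c)
```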

  19. Computational strategies for three-dimensional flow simulations on distributed computer systems. Ph.D. Thesis Semiannual Status Report, 15 Aug. 1993 - 15 Feb. 1994

    NASA Technical Reports Server (NTRS)

    Weed, Richard Allen; Sankar, L. N.

    1994-01-01

    An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance to price ratio of engineering workstations has led to research to development procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate code performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.

  20. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    NASA Technical Reports Server (NTRS)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  1. Parallel computation of three-dimensional aeroelastic fluid-structure interaction

    NASA Astrophysics Data System (ADS)

    Sadeghi, Mani

    This dissertation presents a numerical method for the parallel computation of aeroelasticity (ParCAE). A flow solver is coupled to a structural solver by use of a fluid-structure interface method. The integration of the three-dimensional unsteady Navier-Stokes equations is performed in the time domain, simultaneously with the integration of a modal three-dimensional structural model. The flow solution is accelerated by using a multigrid method and a parallel multiblock approach. Fluid-structure coupling is achieved by subiteration. A grid-deformation algorithm is developed to interpolate the deformation of the structural boundaries onto the flow grid. The code is formulated to allow application to general, three-dimensional, complex configurations with multiple independent structures. Computational results are presented for various configurations, such as turbomachinery blade rows and aircraft wings. Investigations are performed on vortex-induced vibrations, effects of cascade mistuning on flutter, and cases of nonlinear cascade and wing flutter.

  2. A three-dimensional spectral algorithm for simulations of transition and turbulence

    NASA Technical Reports Server (NTRS)

    Zang, T. A.; Hussaini, M. Y.

    1985-01-01

    A spectral algorithm for simulating three dimensional, incompressible, parallel shear flows is described. It applies to the channel, to the parallel boundary layer, and to other shear flows with one wall-bounded and two periodic directions. Representative applications to the channel and to the heated boundary layer are presented.
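
    In the two periodic directions of such flows, a spectral method evaluates derivatives exactly in Fourier space. A minimal one-dimensional sketch (illustrative periodic sample, not the authors' solver):

```python
import numpy as np

def fourier_derivative(u, length):
    """Spectral derivative of a periodic sample u on [0, length):
    transform, multiply by i*k, transform back."""
    n = u.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)  # wavenumbers
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))

# d/dx of sin(x) on [0, 2*pi) should recover cos(x) to machine precision.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
du = fourier_derivative(np.sin(x), 2.0 * np.pi)
```

    The spectral (exponential) accuracy in the periodic directions is what makes such algorithms attractive for transition and turbulence simulations.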

  3. Critical gravity in four dimensions.

    PubMed

    Lü, H; Pope, C N

    2011-05-06

    We study four-dimensional gravity theories that are rendered renormalizable by the inclusion of curvature-squared terms to the usual Einstein action with a cosmological constant. By choosing the parameters appropriately, the massive scalar mode can be eliminated and the massive spin-2 mode can become massless. This "critical" theory may be viewed as a four-dimensional analogue of chiral topologically massive gravity, or of critical "new massive gravity" with a cosmological constant, in three dimensions. We find that the on-shell energy for the remaining massless gravitons vanishes. There are also logarithmic spin-2 modes, which have positive energy. The mass and entropy of standard Schwarzschild-type black holes vanish. The critical theory might provide a consistent toy model for quantum gravity in four dimensions.

  4. Speeding up parallel processing

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    In 1967 Amdahl expressed doubts about the ultimate utility of multiprocessors. The formulation, now called Amdahl's law, became part of the computing folklore and has inspired much skepticism about the ability of the current generation of massively parallel processors to efficiently deliver all their computing power to programs. The widely publicized recent results of a group at Sandia National Laboratory, which showed speedup on a 1024-node hypercube of over 500 for three fixed-size problems and over 1000 for three scalable problems, have convincingly challenged this bit of folklore and have given new impetus to parallel scientific computing.
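
    The folklore in question is easy to state quantitatively. A sketch of Amdahl's fixed-size bound alongside the scaled-speedup (Gustafson) view that the Sandia results exemplify (the serial fractions below are illustrative, not Sandia's measured values):

```python
def amdahl_speedup(serial_fraction, p):
    """Amdahl's law: speedup on p processors for a fixed-size problem
    whose inherently serial fraction cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

def gustafson_speedup(serial_fraction, p):
    """Scaled speedup (Gustafson): the problem size grows with p, the
    regime in which speedups beyond 1000 on 1024 nodes become possible."""
    return p - serial_fraction * (p - 1)
```

    With a 1% serial fraction, the fixed-size bound on 1024 processors is below 100, while the scaled-problem speedup stays above 1000, which is why the distinction matters for interpreting the Sandia results.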

  5. LETTER TO THE EDITOR: A theorem on topologically massive gravity

    NASA Astrophysics Data System (ADS)

    Aliev, A. N.; Nutku, Y.

    1996-03-01

    We show that for three dimensional spacetimes admitting a hypersurface orthogonal Killing vector field, Deser, Jackiw and Templeton's vacuum field equations of topologically massive gravity allow only the trivial flat spacetime solution. Thus spin is necessary to support topological mass.

  6. DGDFT: A massively parallel method for large scale density functional theory calculations.

    PubMed

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT-based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  7. gWEGA: GPU-accelerated WEGA for molecular superposition and shape comparison.

    PubMed

    Yan, Xin; Li, Jiabo; Gu, Qiong; Xu, Jun

    2014-06-05

    Virtual screening of a large chemical library for drug lead identification requires searching/superimposing a large number of three-dimensional (3D) chemical structures. This article reports a graphics processing unit (GPU)-accelerated weighted Gaussian algorithm (gWEGA) that expedites shape or shape-feature similarity score-based virtual screening. With 86 GPU nodes (each node has one GPU card), gWEGA can screen 110 million conformations derived from an entire ZINC drug-like database with diverse antidiabetic agents as query structures within 2 s (i.e., screening more than 55 million conformations per second). The rapid screening speed was accomplished through the massive parallelization on multiple GPU nodes and rapid prescreening of 3D structures (based on their shape descriptors and pharmacophore feature compositions). Copyright © 2014 Wiley Periodicals, Inc.
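
    The Gaussian shape-overlap idea behind such weighted Gaussian methods can be sketched in first order, treating each atom as an isotropic Gaussian; the `alpha` and `p` values below are illustrative choices, not gWEGA's fitted per-atom weights, and real codes also optimize the relative pose:

```python
import math

def gaussian_overlap(coords_a, coords_b, alpha=0.81, p=2.7):
    """First-order Gaussian volume overlap of two atom sets, each atom
    modeled as p * exp(-alpha * r^2). The pairwise integral of two such
    Gaussians separated by distance d is
    p^2 * (pi / (2*alpha))^(3/2) * exp(-alpha * d^2 / 2)."""
    pref = p * p * (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for ax, ay, az in coords_a:
        for bx, by, bz in coords_b:
            d2 = (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            total += pref * math.exp(-0.5 * alpha * d2)
    return total

def shape_tanimoto(va, vb, vab):
    """Shape Tanimoto similarity from self- and cross-overlap volumes."""
    return vab / (va + vb - vab)

# A molecule compared with itself scores exactly 1.0.
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
v_self = gaussian_overlap(mol, mol)
score = shape_tanimoto(v_self, v_self, v_self)
```

    The double loop over atom pairs is exactly the pattern that maps well onto GPUs: each pairwise overlap term is independent of the others.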

  8. High-performance image reconstruction in fluorescence tomography on desktop computers and graphics hardware.

    PubMed

    Freiberger, Manuel; Egger, Herbert; Liebmann, Manfred; Scharfetter, Hermann

    2011-11-01

    Image reconstruction in fluorescence optical tomography is a three-dimensional nonlinear ill-posed problem governed by a system of partial differential equations. In this paper we demonstrate that a combination of state-of-the-art numerical algorithms and a careful hardware-optimized implementation makes it possible to solve this large-scale inverse problem in a few seconds on standard desktop PCs with modern graphics hardware. In particular, we present methods to solve not only the forward but also the non-linear inverse problem by massively parallel programming on graphics processors. A comparison of optimized CPU and GPU implementations shows that the reconstruction can be accelerated by factors of about 15 through the use of the graphics hardware without compromising the accuracy in the reconstructed images.

  9. The BlueGene/L supercomputer

    NASA Astrophysics Data System (ADS)

    Bhanota, Gyan; Chen, Dong; Gara, Alan; Vranas, Pavlos

    2003-05-01

    The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 such nodes are connected into a 3-d torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
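
    Node addressing on such a torus wraps around in each dimension, so every node has exactly six nearest neighbors. A minimal sketch of neighbor computation with modular arithmetic (an illustration of wraparound addressing only, not IBM's routing logic):

```python
def torus_neighbors(x, y, z, dims=(32, 32, 64)):
    """The six nearest neighbors of node (x, y, z) on a 3-D torus with
    wraparound links, matching BlueGene/L's 32 x 32 x 64 geometry."""
    nx, ny, nz = dims
    return [((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
            (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
            (x, y, (z + 1) % nz), (x, y, (z - 1) % nz)]
```

    A corner node such as (0, 0, 0) is thus directly linked to (31, 0, 0) and (0, 0, 63), which keeps the network diameter low without long dedicated wires.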

  10. Bistatic scattering from a three-dimensional object above a two-dimensional randomly rough surface modeled with the parallel FDTD approach.

    PubMed

    Guo, L-X; Li, J; Zeng, H

    2009-11-01

    We present an investigation of the electromagnetic scattering from a three-dimensional (3-D) object above a two-dimensional (2-D) randomly rough surface. A Message Passing Interface-based parallel finite-difference time-domain (FDTD) approach is used, and the uniaxial perfectly matched layer (UPML) medium is adopted for truncation of the FDTD lattices, in which the finite-difference equations can be used for the total computation domain by properly choosing the uniaxial parameters. This makes the parallel FDTD algorithm easier to implement. The parallel performance with different numbers of processors is illustrated for one rough surface realization and shows that the computation time of our parallel FDTD algorithm is dramatically reduced relative to a single-processor implementation. Finally, the composite scattering coefficients versus scattering and azimuthal angles are presented and analyzed for different conditions, including the surface roughness, the dielectric constants, the polarization, and the size of the 3-D object.

  11. plasmaFoam: An OpenFOAM framework for computational plasma physics and chemistry

    NASA Astrophysics Data System (ADS)

    Venkattraman, Ayyaswamy; Verma, Abhishek Kumar

    2016-09-01

    As emphasized in the 2012 Roadmap for low temperature plasmas (LTP), scientific computing has emerged as an essential tool for the investigation and prediction of the fundamental physical and chemical processes associated with these systems. While several in-house and commercial codes exist, each having its own advantages and disadvantages, a common framework that can be developed by researchers from all over the world will likely accelerate the impact of computational studies on advances in low-temperature plasma physics and chemistry. In this regard, we present a finite volume computational toolbox to perform high-fidelity simulations of LTP systems. This framework, primarily based on the OpenFOAM solver suite, allows us to enhance our understanding of multiscale plasma phenomena by performing massively parallel, three-dimensional simulations on unstructured meshes using well-established high performance computing tools that are widely used in the computational fluid dynamics community. In this talk, we will present preliminary results obtained using the OpenFOAM-based solver suite with benchmark three-dimensional simulations of microplasma devices including both dielectric and plasma regions. We will also discuss the future outlook for the solver suite.

  12. Two-dimensional Schrödinger symmetry and three-dimensional breathers and Kelvin-ripple complexes as quasi-massive-Nambu-Goldstone modes

    NASA Astrophysics Data System (ADS)

    Takahashi, Daisuke A.; Ohashi, Keisuke; Fujimori, Toshiaki; Nitta, Muneto

    2017-08-01

    Bose-Einstein condensates (BECs) confined in a two-dimensional (2D) harmonic trap are known to possess a hidden 2D Schrödinger symmetry, that is, the Schrödinger symmetry modified by a trapping potential. Spontaneous breaking of this symmetry gives rise to a breathing motion of the BEC, whose oscillation frequency is robustly determined by the strength of the harmonic trap. In this paper, we demonstrate that the concept of the 2D Schrödinger symmetry can be applied to predict the nature of three-dimensional (3D) collective modes propagating along a condensate confined in an elongated trap. We find three kinds of collective modes whose existence is robustly ensured by the Schrödinger symmetry, which are physically interpreted as one breather mode and two Kelvin-ripple complex modes, i.e., composite modes in which the vortex core and the condensate surface oscillate interactively. We provide analytical expressions for the dispersion relations (energy-momentum relation) of these modes using the Bogoliubov theory [D. A. Takahashi and M. Nitta, Ann. Phys. 354, 101 (2015), 10.1016/j.aop.2014.12.009]. Furthermore, we point out that these modes can be interpreted as "quasi-massive-Nambu-Goldstone (NG) modes", that is, they have the properties of both quasi-NG and massive NG modes: quasi-NG modes appear when a symmetry of a part of a Lagrangian, which is not a symmetry of a full Lagrangian, is spontaneously broken, while massive NG modes appear when a modified symmetry is spontaneously broken.

  13. Three dimensional magnetic solutions in massive gravity with (non)linear field

    NASA Astrophysics Data System (ADS)

    Hendi, S. H.; Eslam Panah, B.; Panahiyan, S.; Momennia, M.

    2017-12-01

    The Nobel Prize in Physics 2016 motivates the study of different aspects of topological properties and of topological defects as related objects. Considering the significant role of topological defects (especially magnetic strings) in cosmology, here we investigate three-dimensional horizonless magnetic solutions in the presence of two generalizations: massive gravity and a nonlinear electromagnetic field. The effects of these two generalizations on the properties of the solutions and their geometrical structure are investigated. The differences between de Sitter and anti-de Sitter solutions are highlighted, and conditions for the existence of a phase transition in the geometrical structure of the solutions are studied.

  14. Three-Dimensional Bayesian Geostatistical Aquifer Characterization at the Hanford 300 Area using Tracer Test Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.

    2012-06-01

    Tracer testing under natural or forced gradient flow holds the potential to provide useful information for characterizing subsurface properties, through monitoring, modeling and interpretation of the tracer plume migration in an aquifer. Non-reactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling. A Bayesian data assimilation technique, the method of anchored distributions (MAD) [Rubin et al., 2010], was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field data shows that the hydrogeological model, when conditioned on the tracer test data, can reproduce the tracer transport behavior better than the field characterized without the tracer test data. This study successfully demonstrates that MAD can sequentially assimilate multi-scale multi-type field data through a consistent Bayesian framework.

  15. New cellular automaton model for magnetohydrodynamics

    NASA Technical Reports Server (NTRS)

    Chen, Hudong; Matthaeus, William H.

    1987-01-01

    A new type of two-dimensional cellular automaton method is introduced for computation of magnetohydrodynamic fluid systems. Particle population is described by a 36-component tensor referred to a hexagonal lattice. By appropriate choice of the coefficients that control the modified streaming algorithm and the definition of the macroscopic fields, it is possible to compute both Lorentz-force and magnetic-induction effects. The method is local in the microscopic space and therefore suited to massively parallel computations.

  16. Using algebra for massively parallel processor design and utilization

    NASA Technical Reports Server (NTRS)

    Campbell, Lowell; Fellows, Michael R.

    1990-01-01

    This paper summarizes the authors' advances in the design of dense processor networks. It reports a collection of recent constructions of dense symmetric networks that provide the largest known values for the number of nodes that can be placed in a network of a given degree and diameter. The constructions are in the range of current potential engineering significance and are based on groups of automorphisms of finite-dimensional vector spaces.

  17. Massively parallel information processing systems for space applications

    NASA Technical Reports Server (NTRS)

    Schaefer, D. H.

    1979-01-01

    NASA is developing massively parallel systems for ultra high speed processing of digital image data collected by satellite-borne instrumentation. Such systems contain thousands of processing elements. Work is underway on the design and fabrication of the 'Massively Parallel Processor', a ground computer containing 16,384 processing elements arranged in a 128 x 128 array. This computer uses existing technology. Advanced work includes the development of semiconductor chips containing thousands of feedthrough paths. Massively parallel image analog-to-digital conversion technology is also being developed. The goal is to provide compact computers suitable for real-time onboard processing of images.

  18. A 3-D Finite-Volume Non-hydrostatic Icosahedral Model (NIM)

    NASA Astrophysics Data System (ADS)

    Lee, Jin

    2014-05-01

    The Nonhydrostatic Icosahedral Model (NIM) formulates the latest numerical innovations of the three-dimensional finite-volume approach on a quasi-uniform icosahedral grid suitable for ultra-high-resolution simulations. NIM's modeling goal is to improve numerical accuracy for weather and climate simulations and to exploit state-of-the-art computing architectures, such as massively parallel CPUs and GPUs, to deliver routine high-resolution forecasts in a timely manner. NIM dynamical-core innovations include: * a local coordinate system that remaps the spherical surface to a plane for numerical accuracy (Lee and MacDonald, 2009), * grid points in a table-driven horizontal loop that allow any horizontal point sequence (A. E. MacDonald et al., 2010), * flux-corrected transport formulated on finite-volume operators to maintain conservative, positive-definite transport (J.-L. Lee et al., 2010), * icosahedral grid optimization (Wang and Lee, 2011), * all differentials evaluated as three-dimensional finite-volume integrals around the control volume. The three-dimensional finite-volume solver in NIM is designed to improve pressure-gradient calculation and orographic precipitation over complex terrain. The NIM dynamical core has been successfully verified with various non-hydrostatic benchmark test cases, such as internal gravity waves and mountain waves, in the Dynamical Core Model Intercomparison Project (DCMIP). Physical parameterizations suitable for NWP have been incorporated into the NIM dynamical core and successfully tested with multi-month aqua-planet simulations. Recently, NIM has begun real-data simulations using GFS initial conditions. Results from the idealized tests as well as real-data simulations will be shown at the conference.

  19. Comparison of sampling techniques for Bayesian parameter estimation

    NASA Astrophysics Data System (ADS)

    Allison, Rupert; Dunkley, Joanna

    2014-02-01

    The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hastings sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
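
    Of the three schemes compared, random-walk Metropolis-Hastings is the simplest to write down. A toy one-dimensional sketch against a standard-normal target (illustrative step size and chain length, not the paper's tuned settings):

```python
import math
import random

def metropolis_hastings(logpdf, x0, n, step=1.0, seed=0):
    """Minimal random-walk Metropolis-Hastings sampler for a 1-D target
    specified by its log-density."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n):
        prop = x + rng.gauss(0.0, step)
        # Accept with probability min(1, pi(prop) / pi(x)).
        if math.log(rng.random()) < logpdf(prop) - logpdf(x):
            x = prop
        samples.append(x)
    return samples

# Standard normal target; the sample moments should approach (0, 1).
draws = metropolis_hastings(lambda v: -0.5 * v * v, 0.0, 20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

    The diagnostics discussed in the paper (convergence time, tuning of `step`, dimensional scaling) are precisely the costs this simple scheme incurs and that nested or ensemble samplers aim to reduce.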

  20. Parallel computing techniques for rotorcraft aerodynamics

    NASA Astrophysics Data System (ADS)

    Ekici, Kivanc

    The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, modifications to the implicit operator, the Lower-Upper Symmetric Gauss-Seidel (LU-SGS) scheme originally used in TURNS, are performed. Second, an inexact Newton method coupled with a Krylov subspace iterative method (a Newton-Krylov method) is applied. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods to the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).

  1. Three-dimensional magnetic bubble memory system

    NASA Technical Reports Server (NTRS)

    Stadler, Henry L. (Inventor); Katti, Romney R. (Inventor); Wu, Jiin-Chuan (Inventor)

    1994-01-01

    A compact memory uses magnetic bubble technology for providing data storage. A three-dimensional arrangement, in the form of stacks of magnetic bubble layers, is used to achieve high volumetric storage density. Output tracks are used within each layer to allow data to be accessed uniquely and unambiguously. Storage can be achieved using either current access or field access magnetic bubble technology. Optical sensing via the Faraday effect is used to detect data. Optical sensing facilitates the accessing of data from within the three-dimensional package and lends itself to parallel operation for supporting high data rates and vector and parallel processing.

  2. A parallel finite-difference method for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.

    1989-01-01

    A finite-difference scheme for solving complex three-dimensional aerodynamic flow on parallel-processing supercomputers is presented. The method consists of a basic flow solver with multigrid convergence acceleration, embedded grid refinements, and a zonal equation scheme. Multitasking and vectorization have been incorporated into the algorithm. Results obtained include multiprocessed flow simulations from the Cray X-MP and Cray-2. Speedups as high as 3.3 for the two-dimensional case and 3.5 for segments of the three-dimensional case have been achieved on the Cray-2. The entire solver attained a factor of 2.7 improvement over its unitasked version on the Cray-2. The performance of the parallel algorithm on each machine is analyzed.

  3. Implementation of a 3D mixing layer code on parallel computers

    NASA Technical Reports Server (NTRS)

    Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.

    1995-01-01

    This paper summarizes our progress and experience in the development of a Computational-Fluid-Dynamics code on parallel computers to simulate three-dimensional spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique; however, we have not yet been able to compile the code with the present version of HPF compilers.

  4. Analysis of multigrid methods on massively parallel computers: Architectural implications

    NASA Technical Reports Server (NTRS)

    Matheson, Lesley R.; Tarjan, Robert E.

    1993-01-01

    We study the potential performance of multigrid algorithms running on massively parallel computers with the intent of discovering whether presently envisioned machines will provide an efficient platform for such algorithms. We consider the domain parallel version of the standard V cycle algorithm on model problems, discretized using finite difference techniques in two and three dimensions on block structured grids of size 10^6 and 10^9, respectively. Our models of parallel computation were developed to reflect the computing characteristics of the current generation of massively parallel multicomputers. These models are based on an interconnection network of 256 to 16,384 message passing, 'workstation size' processors executing in an SPMD mode. The first model accomplishes interprocessor communications through a multistage permutation network. The communication cost is a logarithmic function which is similar to the costs in a variety of different topologies. The second model allows single stage communication costs only. Both models were designed with information provided by machine developers and utilize implementation derived parameters. With the medium-grain parallelism of the current generation and the high fixed cost of an interprocessor communication, our analysis suggests an efficient implementation requires the machine to support the efficient transmission of long messages (up to 1000 words), or the high initiation cost of a communication must be significantly reduced through an alternative optimization technique. Furthermore, with variable length message capability, our analysis suggests the low diameter multistage networks provide little or no advantage over a simple single stage communications network.
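
    The trade-off between message startup cost and message length that drives this conclusion can be seen in a simple linear ("postal") cost model; the parameter values below are illustrative, not the paper's machine-derived constants:

```python
def message_time(n_words, startup, per_word):
    """Postal model for one message: a fixed startup latency plus a
    per-word transfer cost."""
    return startup + n_words * per_word

def transfer_time(total_words, msg_len, startup, per_word):
    """Time to move total_words split into messages of msg_len words."""
    n_msgs = -(-total_words // msg_len)  # ceiling division
    return n_msgs * message_time(msg_len, startup, per_word)
```

    With a startup cost of 1000 word-times, moving 10,000 words in 1000-word messages costs 20,000 time units, while 10-word messages cost over a million: long messages amortize the fixed initiation cost, which is the paper's central architectural requirement.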

  5. Quasilocal energy for three-dimensional massive gravity solutions with chiral deformations of AdS{sub 3} boundary conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garbarz, Alan, E-mail: alan-at@df.uba.ar; Giribet, Gaston, E-mail: gaston-at@df.uba.ar, E-mail: af.goya-at@df.uba.ar; Goya, Andrés, E-mail: gaston-at@df.uba.ar, E-mail: af.goya-at@df.uba.ar

    2015-03-26

    We consider critical gravity in three dimensions; that is, the New Massive Gravity theory formulated about Anti-de Sitter (AdS) space with the specific value of the graviton mass for which it results dual to a two-dimensional conformal field theory with vanishing central charge. As it happens with Kerr black holes in four-dimensional critical gravity, in three-dimensional critical gravity the Bañados-Teitelboim-Zanelli black holes have vanishing mass and vanishing angular momentum. However, provided suitable asymptotic conditions are chosen, the theory may also admit solutions carrying non-vanishing charges. Here, we give simple examples of exact solutions that exhibit falling-off conditions that are even weaker than those of the so-called Log-gravity. For such solutions, we define the quasilocal stress-tensor and use it to compute conserved charges. Despite the drastic deformation of AdS{sub 3} asymptotics, these solutions have finite mass and angular momentum, which are shown to be non-zero.

  6. Mn-silicide nanostructures aligned on massively parallel silicon nano-ribbons

    NASA Astrophysics Data System (ADS)

    De Padova, Paola; Ottaviani, Carlo; Ronci, Fabio; Colonna, Stefano; Olivieri, Bruno; Quaresima, Claudio; Cricenti, Antonio; Dávila, Maria E.; Hennies, Franz; Pietzsch, Annette; Shariati, Nina; Le Lay, Guy

    2013-01-01

    The growth of Mn nanostructures on a 1D grating of silicon nano-ribbons is investigated at the atomic scale by means of scanning tunneling microscopy, low energy electron diffraction and core level photoelectron spectroscopy. The grating of silicon nano-ribbons represents an atomic scale template that can be used in a surface-driven route to control the combination of Si with Mn in the development of novel materials for spintronics devices. The Mn atoms show a preferential adsorption site on silicon atoms, forming one-dimensional nanostructures. These are oriented parallel to the surface Si array, which probably predetermines the diffusion pathways of the Mn atoms during nanostructure formation.

  7. Mn-silicide nanostructures aligned on massively parallel silicon nano-ribbons.

    PubMed

    De Padova, Paola; Ottaviani, Carlo; Ronci, Fabio; Colonna, Stefano; Olivieri, Bruno; Quaresima, Claudio; Cricenti, Antonio; Dávila, Maria E; Hennies, Franz; Pietzsch, Annette; Shariati, Nina; Le Lay, Guy

    2013-01-09

    The growth of Mn nanostructures on a 1D grating of silicon nano-ribbons is investigated at the atomic scale by means of scanning tunneling microscopy, low energy electron diffraction and core level photoelectron spectroscopy. The grating of silicon nano-ribbons represents an atomic scale template that can be used in a surface-driven route to control the combination of Si with Mn in the development of novel materials for spintronics devices. The Mn atoms show a preferential adsorption site on silicon atoms, forming one-dimensional nanostructures. These are oriented parallel to the surface Si array, which probably predetermines the diffusion pathways of the Mn atoms during nanostructure formation.

  8. Instantons and Massless Fermions in Two Dimensions

    DOE R&D Accomplishments Database

    Callan, C. G. Jr.; Dashen, R.; Gross, D. J.

    1977-05-01

    The role of instantons in the breakdown of chiral U(N) symmetry is studied in a two dimensional model. Chiral U(1) is always destroyed by the axial vector anomaly. For N = 2 chiral SU(N) is also spontaneously broken yielding massive fermions and three (decoupled) Goldstone bosons. For N greater than or equal to 3 the fermions remain massless. Realistic four dimensional theories are believed to behave in a similar way but the critical N above which the fermions cease to be massive is not known in four dimensions.

  9. Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

    2000-01-01

    HiMAP is a three-level parallel middleware that can be interfaced to a large-scale global design environment for code-independent, multidisciplinary analysis using high-fidelity equations. Aerospace technology needs are rapidly changing, and computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computational tools are inadequate for modern aerospace design needs; advanced, modular tools are required, such as those that incorporate the technology of massively parallel processors (MPP).

  10. Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

    NASA Technical Reports Server (NTRS)

    Dorney, Daniel J.

    1998-01-01

    A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications have included the implementation of parallel processing software, incorporation of new physical models and generalization of the multiblock capability. The final report contains details of code modifications, numerical results for several nozzle and turbopump geometries, and the implementation of the parallelization software.

  11. String Theory Origin of Dyonic N=8 Supergravity and Its Chern-Simons Duals.

    PubMed

    Guarino, Adolfo; Jafferis, Daniel L; Varela, Oscar

    2015-08-28

    We clarify the higher-dimensional origin of a class of dyonic gaugings of D=4  N=8 supergravity recently discovered, when the gauge group is chosen to be ISO(7). This dyonically gauged maximal supergravity arises from consistent truncation of massive IIA supergravity on S^6, and its magnetic coupling constant descends directly from the Romans mass. The critical points of the supergravity uplift to new four-dimensional anti-de Sitter space (AdS4) massive type IIA vacua. We identify the corresponding three-dimensional conformal field theory (CFT3) duals as super-Chern-Simons-matter theories with simple gauge group SU(N) and level k given by the Romans mass. In particular, we find a critical point that uplifts to the first explicit N=2 AdS4 massive IIA background. We compute its free energy and that of the candidate dual Chern-Simons theory by localization to a solvable matrix model, and find perfect agreement. This provides the first AdS4/CFT3 precision match in massive type IIA string theory.

  12. Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fijany, A.; Milman, M.; Redding, D.

    1994-12-31

    In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. The authors have already shown that the computation of a near-optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling-rate requirement, the implementation of this control algorithm poses a computationally challenging problem, since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other fast Poisson solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
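
    The abstract reduces the control computation to a discrete Poisson equation on a regular domain. The Fast Invariant Imbedding algorithm itself is not given there, but the source of the parallelism can be seen in even the simplest Poisson iteration, where every grid point updates independently. A hedged sketch using a plain Jacobi sweep (an illustrative stand-in, not the authors' method):

```python
# Illustrative sketch only: a Jacobi sweep for the 2-D discrete Poisson
# equation -laplace(u) = f on the unit square with zero boundary values.
# Every interior point updates independently of the others, which is the
# property that makes such solvers attractive on massively parallel
# hardware.  This is NOT the paper's Fast Invariant Imbedding algorithm.

def jacobi_poisson(f, n, sweeps=2000):
    """Solve -laplace(u) = f on an n x n interior grid with h = 1/(n+1)."""
    h2 = (1.0 / (n + 1)) ** 2
    u = [[0.0] * (n + 2) for _ in range(n + 2)]      # includes boundary ring
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n + 1):
            for j in range(1, n + 1):                # independent updates
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                                    u[i][j - 1] + u[i][j + 1] + h2 * f[i][j])
        u = new
    return u
```

    Each interior point reads only its four neighbors from the previous iterate, so all updates of a sweep can proceed in parallel; fast solvers such as the one in the paper trade this simplicity for far better convergence rates.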

  13. High-performance parallel approaches for three-dimensional light detection and ranging point clouds gridding

    NASA Astrophysics Data System (ADS)

    Rizki, Permata Nur Miftahur; Lee, Heezin; Lee, Minsu; Oh, Sangyoon

    2017-01-01

    With the rapid advance of remote sensing technology, the amount of three-dimensional point-cloud data has increased extraordinarily, requiring faster processing in the construction of digital elevation models. There have been several attempts to accelerate the computation using parallel methods; however, little attention has been given to investigating different approaches for selecting the parallel programming model best suited to a given computing environment. We present our findings and insights identified by implementing three popular high-performance parallel approaches (message passing interface, MapReduce, and GPGPU) on time-demanding but accurate kriging interpolation. The performances of the approaches are compared by varying the size of the grid and input data. In our empirical experiment, we demonstrate the significant acceleration by all three approaches compared to a C-implemented sequential-processing method. In addition, we discuss the pros and cons of each method in terms of usability, complexity, infrastructure, and platform limitations to give readers a better understanding of how to utilize these parallel approaches for gridding purposes.
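
    The decomposition the authors evaluate can be illustrated independently of kriging: partition the output grid among workers and let each interpolate its own rows. A minimal sketch, assuming inverse-distance weighting as a short stand-in for the paper's kriging, and Python threads as a stand-in for MPI/MapReduce/GPGPU workers; all names are illustrative:

```python
# Row-wise decomposition of a gridding job across parallel workers.
# IDW replaces kriging here only to keep the example short; the
# decomposition pattern is the same.
from concurrent.futures import ThreadPoolExecutor

# Tiny illustrative point cloud: (x, y, value) samples.
POINTS = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0), (1.0, 1.0, 4.0)]

def idw(x, y, points=POINTS, eps=1e-12):
    """Inverse-distance-weighted estimate at (x, y) from scattered points."""
    num = den = 0.0
    for px, py, pv in points:
        w = 1.0 / ((x - px) ** 2 + (y - py) ** 2 + eps)
        num += w * pv
        den += w
    return num / den

def grid_row(j, nx):
    """One worker's task: interpolate a single row of the output grid."""
    return [idw(i / (nx - 1), j / (nx - 1)) for i in range(nx)]

def grid_parallel(nx=5, workers=2):
    """Interpolate an nx-by-nx grid, one output row per parallel task."""
    with ThreadPoolExecutor(workers) as ex:
        return list(ex.map(lambda j: grid_row(j, nx), range(nx)))
```

    Rows are independent, so the same partitioning maps onto MPI ranks, MapReduce map tasks, or GPU thread blocks.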

  14. Three-dimensional aspects of radiative transfer in remote sensing of precipitation: Application to the 1986 COHMEX storm

    NASA Technical Reports Server (NTRS)

    Haferman, J. L.; Krajewski, W. F.; Smith, T. F.

    1994-01-01

    Several multifrequency techniques for passive microwave estimation of precipitation based on the absorption and scattering properties of hydrometeors have been proposed in the literature. In the present study, plane-parallel limitations are overcome by using a model based on the discrete-ordinates method to solve the radiative transfer equation in three-dimensional rectangular domains. This effectively accounts for the complexity and variety of radiation problems encountered in the atmosphere. This investigation presents results for plane-parallel and three-dimensional radiative transfer for a precipitating system, discusses differences between these results, and suggests possible explanations for these differences. Microphysical properties were obtained from the Colorado State University Regional Atmospheric Modeling System and represent a hailstorm observed during the 1986 Cooperative Huntsville Meteorological Experiment. These properties are used as input to a three-dimensional radiative transfer model in order to simulate satellite observation of the storm. The model output consists of upwelling brightness temperatures at several of the frequencies on the Special Sensor Microwave/Imager. The radiative transfer model accounts for scattering and emission of atmospheric gases and hydrometeors in liquid and ice phases. Brightness temperatures obtained from the three-dimensional model of this investigation indicate that horizontal inhomogeneities give rise to brightness temperature fields that can be quite different from fields obtained using plane-parallel radiative transfer theory. These differences are examined for various resolutions of the satellite sensor field of view. In addition, the issue of boundary conditions for three-dimensional atmospheric radiative transfer is addressed.

  15. RAMA: A file system for massively parallel computers

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.; Katz, Randy H.

    1993-01-01

    This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated into the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
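
    Because RAMA places data deterministically rather than consulting central metadata, any node can compute where a block lives. A minimal sketch of that idea, assuming a hash-based layout; the hash choice and the (node, disk) encoding are illustrative, not RAMA's actual design:

```python
# Hash-based block placement: every node can locate any (file, block)
# pair without a central metadata server or inter-node synchronization.
# Illustrative sketch only; not RAMA's actual hash or layout.
import hashlib

def block_location(file_id: int, block_no: int, n_nodes: int, disks_per_node: int):
    """Map a (file, block) pair deterministically to a (node, disk) pair."""
    digest = hashlib.sha256(f"{file_id}:{block_no}".encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % (n_nodes * disks_per_node)
    return divmod(slot, disks_per_node)   # -> (node index, disk index)
```

    Consecutive blocks of one file scatter across the machine, spreading load over many disks, which is the effect the abstract credits for removing the I/O bottleneck.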

  16. Three-dimensional computational aerodynamics in the 1980's

    NASA Technical Reports Server (NTRS)

    Lomax, H.

    1978-01-01

    The future requirements for constructing codes that can be used to compute three-dimensional flows about aerodynamic shapes should be assessed in light of the constraints imposed by future computer architectures and the reality of usable algorithms that can provide practical three-dimensional simulations. On the hardware side, vector processing is inevitable in order to meet the CPU speeds required. To cope with three-dimensional geometries, massive data bases with fetch/store conflicts and transposition problems are inevitable. On the software side, codes must be prepared that: (1) can be adapted to complex geometries, (2) can (at the very least) predict the location of laminar and turbulent boundary layer separation, and (3) will converge rapidly to sufficiently accurate solutions.

  17. Construction and comparison of parallel implicit kinetic solvers in three spatial dimensions

    NASA Astrophysics Data System (ADS)

    Titarev, Vladimir; Dumbser, Michael; Utyuzhnikov, Sergey

    2014-01-01

    The paper is devoted to the further development and systematic performance evaluation of a recent deterministic framework Nesvetay-3D for modelling three-dimensional rarefied gas flows. Firstly, a review of the existing discretization and parallelization strategies for solving numerically the Boltzmann kinetic equation with various model collision integrals is carried out. Secondly, a new parallelization strategy for the implicit time evolution method is implemented which improves scaling on large CPU clusters. Accuracy and scalability of the methods are demonstrated on a pressure-driven rarefied gas flow through a finite-length circular pipe as well as an external supersonic flow over a three-dimensional re-entry geometry of complicated aerodynamic shape.

  18. Development of Modern Performance Assessment Tools and Capabilities for Underground Disposal of Transuranic Waste at WIPP

    NASA Astrophysics Data System (ADS)

    Zeitler, T.; Kirchner, T. B.; Hammond, G. E.; Park, H.

    2014-12-01

    The Waste Isolation Pilot Plant (WIPP) has been developed by the U.S. Department of Energy (DOE) for the geologic (deep underground) disposal of transuranic (TRU) waste. Containment of TRU waste at the WIPP is regulated by the U.S. Environmental Protection Agency (EPA). The DOE demonstrates compliance with the containment requirements by means of performance assessment (PA) calculations. WIPP PA calculations estimate the probability and consequence of potential radionuclide releases from the repository to the accessible environment for a regulatory period of 10,000 years after facility closure. The long-term performance of the repository is assessed using a suite of sophisticated computational codes. In a broad modernization effort, the DOE has overseen the transfer of these codes to modern hardware and software platforms. Additionally, there is a current effort to establish new performance assessment capabilities through the further development of the PFLOTRAN software, a state-of-the-art massively parallel subsurface flow and reactive transport code. Improvements to the current computational environment will result in greater detail in the final models due to the parallelization afforded by the modern code. Parallelization will allow for relatively faster calculations, as well as a move from a two-dimensional calculation grid to a three-dimensional grid. The result of the modernization effort will be a state-of-the-art subsurface flow and transport capability that will serve WIPP PA into the future. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. This research is funded by WIPP programs administered by the Office of Environmental Management (EM) of the U.S. Department of Energy.

  19. FORCE2: A state-of-the-art two-phase code for hydrodynamic calculations

    NASA Astrophysics Data System (ADS)

    Ding, Jianmin; Lyczkowski, R. W.; Burge, S. W.

    1993-02-01

    A three-dimensional computer code for two-phase flow named FORCE2 has been developed by Babcock and Wilcox (B & W) in close collaboration with Argonne National Laboratory (ANL). FORCE2 is capable of both transient and steady-state simulations. This Cartesian-coordinates computer program is a finite control volume, industrial grade and quality embodiment of the pilot-scale FLUFIX/MOD2 code and contains features such as three-dimensional blockages, volume and surface porosities to account for various obstructions in the flow field, and distributed resistance modeling to account for pressure drops caused by baffles, distributor plates and large tube banks. Recently computed results demonstrated the significance of and necessity for three-dimensional models of hydrodynamics and erosion. This paper describes the process whereby ANL's pilot-scale FLUFIX/MOD2 models and numerics were implemented into FORCE2. A description of the quality control to assess the accuracy of the new code and the validation using some of the measured data from the Illinois Institute of Technology (IIT) and the University of Illinois at Urbana-Champaign (UIUC) are given. It is envisioned that one day FORCE2, with additional modules such as radiation heat transfer, combustion kinetics and multi-solids, together with user-friendly pre- and post-processor software, and tailored for massively parallel multiprocessor shared-memory computational platforms, will be used by industry and researchers to assist in reducing and/or eliminating the environmental and economic barriers which limit full consideration of coal, shale and biomass as energy sources, to retain energy security, and to remediate waste and ecological problems.

  20. Discrete Adjoint-Based Design for Unsteady Turbulent Flows On Dynamic Overset Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Nielsen, Eric J.; Diskin, Boris

    2012-01-01

    A discrete adjoint-based design methodology for unsteady turbulent flows on three-dimensional dynamic overset unstructured grids is formulated, implemented, and verified. The methodology supports both compressible and incompressible flows and is amenable to massively parallel computing environments. The approach provides a general framework for performing highly efficient and discretely consistent sensitivity analysis for problems involving arbitrary combinations of overset unstructured grids which may be static, undergoing rigid or deforming motions, or any combination thereof. General parent-child motions are also accommodated, and the accuracy of the implementation is established using an independent verification based on a complex-variable approach. The methodology is used to demonstrate aerodynamic optimizations of a wind turbine geometry, a biologically-inspired flapping wing, and a complex helicopter configuration subject to trimming constraints. The objective function for each problem is successfully reduced and all specified constraints are satisfied.

  1. 3D simulation of floral oil storage in the scopa of South American insects

    NASA Astrophysics Data System (ADS)

    Ruettgers, Alexander; Griebel, Michael; Pastrik, Lars; Schmied, Heiko; Wittmann, Dieter; Scherrieble, Andreas; Dinkelmann, Albrecht; Stegmaier, Thomas; Institute for Numerical Simulation Team; Institute of Crop Science; Resource Conservation Team; Institute of Textile Technology; Process Engineering Team

    2014-11-01

    Several species of bees in South America possess structures to store and transport floral oils. By using closely spaced hairs on their hind legs, the so-called scopa, these bees can absorb and release oil droplets without loss. The high efficiency of this process is a matter of ongoing research. Based on recent X-ray microtomography scans of the scopa of these bees performed at the Institute of Textile Technology and Process Engineering Denkendorf, we build a three-dimensional computer model. Using NaSt3DGPF, a two-phase flow solver developed at the Institute for Numerical Simulation of the University of Bonn, we perform massively parallel flow simulations with the complex micro-CT data. In this talk, we discuss the results of our simulations and the transfer of the X-ray measurement into a computer model. This research was funded under GR 1144/18-1 by the Deutsche Forschungsgemeinschaft (DFG).

  2. Development of Fast Algorithms Using Recursion, Nesting and Iterations for Computational Electromagnetics

    NASA Technical Reports Server (NTRS)

    Chew, W. C.; Song, J. M.; Lu, C. C.; Weedon, W. H.

    1995-01-01

    In the first phase of our work, we have concentrated on laying the foundation to develop fast algorithms, including the use of recursive structure like the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We have also investigated the use of curvilinear patches to build a basic method of moments code where these acceleration techniques can be used later. In the second phase, which is mainly reported on here, we have concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and have been able to obtain some 3D scattering results. In order to understand the parallelization of codes on the Connection Machine, we have also studied the parallelization of 3D finite-difference time-domain (FDTD) code with PML material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC can be parallelized very well allowing us to solve within a minute a problem of over a million nodes. In addition, we have studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution to integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle, but are slower than LU decomposition when many incident angles are needed as in the monostatic RCS calculations.
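
    The observation that FDTD parallelizes well follows from the locality of its update stencil: each field value depends only on immediate neighbors, so the domain can be split across processors with only thin halo exchanges. A minimal 1-D vacuum FDTD sketch (Yee leapfrog, normalized units, Courant number 0.5; the paper's PML absorbing boundary is omitted, and all names are illustrative):

```python
# 1-D vacuum FDTD: interleaved E and H updates, each touching only
# nearest neighbors -- the locality that makes domain decomposition easy.
import math

def fdtd_1d(steps=140, n=200, src=50):
    """Leapfrog E/H update with a soft Gaussian source at cell `src`."""
    ez = [0.0] * n
    hy = [0.0] * n
    for t in range(steps):
        for i in range(n - 1):                   # H update touches i, i+1 only
            hy[i] += 0.5 * (ez[i + 1] - ez[i])
        for i in range(1, n):                    # E update touches i-1, i only
            ez[i] += 0.5 * (hy[i] - hy[i - 1])
        ez[src] += math.exp(-((t - 30.0) / 10.0) ** 2)   # soft Gaussian source
    return ez
```

    Splitting the index range among processors requires exchanging only one boundary cell per neighbor per half-step, which is why even simple codes like this scale well on machines such as the CM-5.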

  3. Nonlinear viscoplasticity in ASPECT: benchmarking and applications to subduction

    NASA Astrophysics Data System (ADS)

    Glerum, Anne; Thieulot, Cedric; Fraters, Menno; Blom, Constantijn; Spakman, Wim

    2018-03-01

    ASPECT (Advanced Solver for Problems in Earth's ConvecTion) is a massively parallel finite element code originally designed for modeling thermal convection in the mantle with a Newtonian rheology. The code is characterized by modern numerical methods, high-performance parallelism and extensibility. This last characteristic is illustrated in this work: we have extended the use of ASPECT from global thermal convection modeling to upper-mantle-scale applications of subduction.

    Subduction modeling generally requires the tracking of multiple materials with different properties and with nonlinear viscous and viscoplastic rheologies. To this end, we implemented a frictional plasticity criterion that is combined with a viscous diffusion and dislocation creep rheology. Because ASPECT uses compositional fields to represent different materials, all material parameters are made dependent on a user-specified number of fields.

    The goal of this paper is primarily to describe and verify our implementations of complex, multi-material rheology by reproducing the results of four well-known two-dimensional benchmarks: the indentor benchmark, the brick experiment, the sandbox experiment and the slab detachment benchmark. Furthermore, we aim to provide hands-on examples for prospective users by demonstrating the use of multi-material viscoplasticity with three-dimensional, thermomechanical models of oceanic subduction, putting ASPECT on the map as a community code for high-resolution, nonlinear rheology subduction modeling.

  4. Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

    NASA Technical Reports Server (NTRS)

    Dorney, Daniel J.

    1998-01-01

    A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications will include the implementation of parallel processing software, incorporating new physical models and generalizing the multi-block capability to allow the simultaneous simulation of nozzle and turbopump configurations. The current report contains details of code modifications, numerical results of several flow simulations and the status of the parallelization effort.

  5. Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

    NASA Astrophysics Data System (ADS)

    Qiu, Shi; Liu, Kuang; Eliasson, Veronica

    2016-10-01

    Geometrical shock dynamics (GSD) theory is an appealing method to predict shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, for solving and optimizing large-scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, only one has been implemented on parallel computers, with the purpose of analyzing detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
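
    The tridiagonal system mentioned above is a classic target for specialized parallel solvers (e.g. cyclic reduction) because the standard serial method, the Thomas algorithm, is O(n) but inherently sequential. For reference, a sketch of that serial baseline; variable names are illustrative, not taken from the paper:

```python
# Thomas algorithm: O(n) forward elimination plus back substitution for a
# tridiagonal system.  Each step depends on the previous one, which is
# why massively parallel machines use restructured variants instead.

def thomas(a, b, c, d):
    """Solve a tridiagonal system; a = sub- (a[0] unused), b = main-,
    c = super-diagonal (c[-1] unused), d = right-hand side."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                        # forward sweep (sequential)
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):               # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```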

  6. How to cluster in parallel with neural networks

    NASA Technical Reports Server (NTRS)

    Kamgar-Parsi, Behzad; Gualtieri, J. A.; Devaney, Judy E.; Kamgar-Parsi, Behrooz

    1988-01-01

    Partitioning a set of N patterns in a d-dimensional metric space into K clusters - in a way that those in a given cluster are more similar to each other than the rest - is a problem of interest in astrophysics, image analysis and other fields. As there are approximately K^N/K! possible ways of partitioning the patterns among K clusters, finding the best solution is beyond exhaustive search when N is large. Researchers show that this problem can be formulated as an optimization problem for which very good, but not necessarily optimal, solutions can be found by using a neural network. To do this the network must start from many randomly selected initial states. The network is simulated on the MPP (a 128 x 128 SIMD array machine), where researchers use the massive parallelism not only in solving the differential equations that govern the evolution of the network, but also by starting the network from many initial states at once, thus obtaining many solutions in one run. Researchers obtain speedups of two to three orders of magnitude over serial implementations and, through analog VLSI implementations, the promise of speedups commensurate with human perceptual abilities.
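
    The multi-start strategy described above, running many independent optimizations at once and keeping the best result, carries over directly to simpler clustering methods. A sketch using Lloyd's k-means in one dimension as a stand-in for the paper's neural network (function names are illustrative; the independent runs are what a parallel machine would execute simultaneously):

```python
# Multi-start clustering: many independent runs from random initial
# states, keep the lowest-cost solution.  Lloyd's k-means stands in for
# the paper's neural network; the multi-start idea is the same.
import random

def kmeans_cost(pts, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in pts)

def lloyd(pts, k, rng, iters=20):
    """One k-means run from a random initial state."""
    centers = rng.sample(pts, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in pts:
            groups[min(range(k), key=lambda j: (x - centers[j]) ** 2)].append(x)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def multistart(pts, k, starts=16, seed=0):
    """Independent restarts (parallelizable); return the best solution."""
    rng = random.Random(seed)
    runs = [lloyd(pts, k, rng) for _ in range(starts)]
    return min(runs, key=lambda c: kmeans_cost(pts, c))
```

    On a SIMD array each restart (or each pattern's assignment step) maps to its own processing element, which is how the abstract's "many solutions in one run" is obtained.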

  7. Utilizing High-Performance Computing to Investigate Parameter Sensitivity of an Inversion Model for Vadose Zone Flow and Transport

    NASA Astrophysics Data System (ADS)

    Fang, Z.; Ward, A. L.; Fang, Y.; Yabusaki, S.

    2011-12-01

    High-resolution geologic models have proven effective in improving the accuracy of subsurface flow and transport predictions. However, many of the parameters in subsurface flow and transport models cannot be determined directly at the scale of interest and must be estimated through inverse modeling. A major challenge, particularly in vadose zone flow and transport, is the inversion of the highly nonlinear, high-dimensional problem, as current methods are not readily scalable for large-scale, multi-process models. In this paper we describe the implementation of a fully automated approach for addressing complex parameter optimization and sensitivity issues on massively parallel multi- and many-core systems. The approach is based on the integration of PNNL's extreme-scale Subsurface Transport Over Multiple Phases (eSTOMP) simulator, which uses the Global Array toolkit, with the Beowulf-cluster-inspired parallel nonlinear parameter estimation software BeoPEST in MPI mode. In the eSTOMP/BeoPEST implementation, a pre-processor generates all of the PEST input files based on the eSTOMP input file. Simulation results for comparison with observations are extracted automatically at each time step, eliminating the need for post-process data extraction. The inversion framework was tested with three different experimental data sets: one-dimensional water flow at the Hanford Grass Site; an irrigation and infiltration experiment at the Andelfingen Site; and a three-dimensional injection experiment at Hanford's Sisson and Lu Site. Good agreement is achieved in all three applications between observations and simulations, in both parameter estimates and the reproduction of water dynamics. Results show that the eSTOMP/BeoPEST approach is highly scalable and can be run efficiently with hundreds or thousands of processors. BeoPEST is fault tolerant, and nodes can be dynamically added and removed. A major advantage of this approach is the ability to use high-resolution geologic models to preserve the spatial structure in the inverse model, which leads to better parameter estimates and improved predictions when using the inverse-conditioned realizations of parameter fields.

  8. On the Transition from Two-Dimensional to Three-Dimensional MHD Turbulence

    NASA Technical Reports Server (NTRS)

    Thess, A.; Zikanov, Oleg

    2004-01-01

    We report a theoretical investigation of the robustness of two-dimensional inviscid MHD flows at low magnetic Reynolds numbers with respect to three-dimensional perturbations. We analyze three model problems, namely flow in the interior of a triaxial ellipsoid, an unbounded vortex with elliptical streamlines, and a vortex sheet parallel to the magnetic field. We demonstrate that motion perpendicular to the magnetic field with elliptical streamlines becomes unstable with respect to the elliptical instability once the velocity has reached a critical magnitude whose value tends to zero as the eccentricity of the streamlines becomes large. Furthermore, vortex sheets parallel to the magnetic field, which are unstable for any velocity and any magnetic field, are found to emit eddies with vorticity perpendicular to the magnetic field and with an aspect ratio proportional to N^(1/2). The results suggest that purely two-dimensional motion without Joule energy dissipation is a singular type of flow which does not represent the asymptotic behaviour of three-dimensional MHD turbulence in the limit of infinitely strong magnetic fields.

  9. Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

    NASA Astrophysics Data System (ADS)

    de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl

    2002-03-01

    We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 10^5-10^6) in reasonable CPU times. The performances of our implementation of the parallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.

  10. [Three-dimensional parallel collagen scaffold promotes tendon extracellular matrix formation].

    PubMed

    Zheng, Zefeng; Shen, Weiliang; Le, Huihui; Dai, Xuesong; Ouyang, Hongwei; Chen, Weishan

    2016-03-01

    To investigate the effects of a three-dimensional parallel collagen scaffold on the cell shape, arrangement and extracellular matrix formation of tendon stem cells. The parallel collagen scaffold was fabricated by a unidirectional freezing technique, while the random collagen scaffold was fabricated by a freeze-drying technique. The effects of the two scaffolds on cell shape and extracellular matrix formation were investigated in vitro by seeding tendon stem/progenitor cells and in vivo by ectopic implantation. Parallel and random collagen scaffolds were produced successfully. The parallel collagen scaffold was more akin to tendon than the random collagen scaffold. Tendon stem/progenitor cells were spindle-shaped and uniformly oriented in the parallel collagen scaffold, while cells on the random collagen scaffold had disordered orientations. Two weeks after ectopic implantation, cells had nearly the same orientation as the collagen substance. In the parallel collagen scaffold, cells had a parallel arrangement, and more spindly cells were observed. By contrast, cells in the random collagen scaffold were disordered. The parallel collagen scaffold can induce cells into a spindly, parallel arrangement and promote parallel extracellular matrix formation, while the random collagen scaffold induces cells into a random arrangement. The results indicate that the parallel collagen scaffold is an ideal structure to promote tendon repair.

  11. The three-dimensional flow past a rapidly rotating circular cylinder

    NASA Technical Reports Server (NTRS)

    Denier, James P.; Duck, Peter W.

    1993-01-01

    The high Reynolds number (Re) flow past a rapidly rotating circular cylinder is investigated. The rotation rate of the cylinder is allowed to vary (slightly) along the axis of the cylinder, thereby provoking three-dimensional flow disturbances, which are shown to involve relatively massive (O(Re)) velocity perturbations to the flow away from the cylinder surface. Additionally, three integral conditions, analogous to the single condition determined in two dimensions by Batchelor, are derived, based on the condition of periodicity in the azimuthal direction.

  12. Parallel Logic Programming and Parallel Systems Software and Hardware

    DTIC Science & Technology

    1989-07-29

Conference, Dallas TX. January 1985. (55) [Rous75] Roussel, P., "PROLOG: Manuel de Reference et d'Utilisation", Groupe d'Intelligence Artificielle, Universite d...completed. Tools were provided for software development using artificial intelligence techniques. AI software for massively parallel architectures was...using artificial intelligence techniques. AI software for massively parallel architectures was started. 1. Introduction We describe research conducted

  13. FAST TRACK COMMUNICATION: Born-Infeld extension of new massive gravity

    NASA Astrophysics Data System (ADS)

    Güllü, İbrahim; Çaǧri Şişman, Tahsin; Tekin, Bayram

    2010-08-01

    We present a three-dimensional gravitational Born-Infeld theory which reduces to the recently found new massive gravity (NMG) at the quadratic level in the small curvature expansion and at the cubic order reproduces the deformation of NMG obtained from AdS/CFT. Our action provides a remarkable extension of NMG to all orders in the curvature and might define a consistent quantum gravity.
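For orientation, Born-Infeld-type gravity actions of this kind are built from the determinant of the metric shifted by the Einstein tensor; a commonly quoted form of such an extension is sketched below (the normalization, the sign σ = ±1, and the cosmological parameter λ are assumptions here and should be checked against the paper's conventions):

```latex
\mathcal{L}_{\mathrm{BI\text{-}NMG}}
  = -\frac{4m^{2}}{\kappa^{2}}\left[
      \sqrt{-\det\!\left(g_{\mu\nu}+\frac{\sigma}{m^{2}}G_{\mu\nu}\right)}
      -\Bigl(1-\frac{\lambda}{2}\Bigr)\sqrt{-\det g_{\mu\nu}}
    \right]
```

Expanding the determinant to second order in small curvature is what reproduces the quadratic NMG Lagrangian mentioned in the abstract.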

  14. Design of a massively parallel computer using bit serial processing elements

    NASA Technical Reports Server (NTRS)

    Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

    1995-01-01

    A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
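The bit-serial SIMD scheme described above can be illustrated in software: each processing element handles one bit of its operands per cycle, but all elements advance together. The sketch below (illustrative only; names and the word width are not from the paper) vectorizes a ripple-carry full adder over an array of lanes, one bit-plane per step.

```python
import numpy as np

def simd_bit_serial_add(a, b, m=16):
    """Bit-serial addition: m sequential full-adder steps, with every
    SIMD lane (array element) updated in parallel at each step."""
    a = np.asarray(a, dtype=np.uint32)
    b = np.asarray(b, dtype=np.uint32)
    result = np.zeros_like(a)
    carry = np.zeros_like(a)
    for k in range(m):                  # one bit-plane per cycle
        abit = (a >> k) & 1
        bbit = (b >> k) & 1
        s = abit ^ bbit ^ carry         # full-adder sum bit
        carry = (abit & bbit) | (carry & (abit ^ bbit))
        result |= s << k                # deposit the sum bit-plane
    return result

lanes_a = np.array([3, 100, 65535, 12345], dtype=np.uint32)
lanes_b = np.array([5, 28, 1, 54321], dtype=np.uint32)
print(simd_bit_serial_add(lanes_a, lanes_b))  # elementwise sums modulo 2**16: 8, 128, 0, 1130
```

Note that the loop over bits is the only serial part; the work per bit is a fixed set of logic operations applied to every lane at once, which is the SIMD property the design exploits.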

  15. Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1

    PubMed Central

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi

    2014-01-01

Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in 16 of them (94.1%), each of whom carried at least one mutation. Seven patients had MYO7A mutations (41.2%), the most common type in Japanese patients. Most of the mutations were detected only by massively parallel DNA sequencing. We report here four patients who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and of the frequency of Usher syndrome type 1 genes in Japanese patients. Mutation screening using this technique has the power to quickly identify mutations in many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850

  17. Accelerating population balance-Monte Carlo simulation for coagulation dynamics from the Markov jump model, stochastic algorithm and GPU parallel computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Zuwei; Zhao, Haibo, E-mail: klinsmannzhb@163.com; Zheng, Chuguang

    2015-01-15

This paper proposes a comprehensive framework for accelerating population balance-Monte Carlo (PBMC) simulation of particle coagulation dynamics. By combining a Markov jump model, a weighted majorant kernel and GPU (graphics processing unit) parallel computing, a significant gain in computational efficiency is achieved. The Markov jump model constructs a coagulation-rule matrix of differentially-weighted simulation particles, so as to capture the time evolution of the particle size distribution with low statistical noise over the full size range and to reduce the number of time loops as far as possible. Three coagulation rules are highlighted, and it is found that constructing an appropriate coagulation rule provides a route to a compromise between the accuracy and cost of PBMC methods. Further, in order to avoid a double loop over all simulation particles when considering two-particle events (typically, particle coagulation), the weighted majorant kernel is introduced to estimate the maximum coagulation rates used in the acceptance-rejection process with a single loop over all particles; meanwhile, the mean time step between coagulation events is estimated by summing the coagulation kernels of rejected and accepted particle pairs. The computational load of these fast differentially-weighted PBMC simulations (based on the Markov jump model) is greatly reduced, becoming proportional to the number of simulation particles in a zero-dimensional system (single cell). Finally, for a spatially inhomogeneous multi-dimensional (multi-cell) simulation, the proposed fast PBMC is performed in each cell, and multiple cells are processed in parallel by the many cores of a GPU, which can execute massively threaded data-parallel tasks to obtain a remarkable speedup (compared with CPU computation, the speedup of GPU parallel computing is as high as 200 in a case of 100 cells with 10 000 simulation particles per cell). These accelerating approaches to PBMC are demonstrated in a physically realistic Brownian coagulation case. The computational accuracy is validated against a benchmark solution of the discrete-sectional method. The simulation results show that the comprehensive approach attains a very favorable improvement in cost without sacrificing computational accuracy.
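The acceptance-rejection step built on the majorant kernel can be sketched as follows: a candidate particle pair is drawn at random and accepted with probability equal to the ratio of its true coagulation kernel to the majorant bound, so the kernel never has to be evaluated for all pairs. This is a minimal single-cell sketch with an illustrative constant kernel, not the paper's differentially-weighted Brownian implementation.

```python
import random

def coagulation_step(volumes, kernel, k_max, rng=random):
    """One acceptance-rejection coagulation event: propose a random
    particle pair, accept it with probability kernel(vi, vj) / k_max,
    where k_max majorizes the kernel over all pairs."""
    n = len(volumes)
    while True:
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j:
            continue
        if rng.random() < kernel(volumes[i], volumes[j]) / k_max:
            volumes[i] += volumes[j]     # merge particle j into i
            volumes.pop(j)
            return volumes

# Illustrative constant-rate kernel (not the Brownian kernel of the paper)
constant_kernel = lambda v, w: 1.0
random.seed(0)
pop = [1.0] * 10
coagulation_step(pop, constant_kernel, k_max=1.0)
print(len(pop), sum(pop))  # 9 10.0 -- one fewer particle, total volume conserved
```

The tighter the majorant bound k_max, the fewer proposals are rejected, which is why the choice of the weighted majorant kernel matters for efficiency.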

  18. Butterfly effect in 3D gravity

    NASA Astrophysics Data System (ADS)

    Qaemmaqami, Mohammad M.

    2017-11-01

We study the butterfly effect by considering shock wave solutions near the horizon of the anti-de Sitter black hole in several three-dimensional gravity models, including 3D Einstein gravity, minimal massive 3D gravity, new massive gravity, generalized massive gravity, Born-Infeld 3D gravity, and new bigravity. We calculate the butterfly velocities of these models, and we also consider the critical points and different limits in some of them. By studying the butterfly effect in generalized massive gravity, we observe a correspondence between the butterfly velocities and the right- and left-moving degrees of freedom, or the central charges, of the dual 2D conformal field theories.

  19. B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.

    PubMed

    Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang

    2016-03-01

Sequence alignment, in which raw sequencing data are mapped to a reference genome, is the central step of sequence analysis. The amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools; consequently, sequence alignment becomes the bottleneck of sequence analysis, and intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. Tianhe-2, currently the world's fastest supercomputer, is equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Exploiting this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner, B-MIC. B-MIC contains three levels of parallelization: first, parallelization of data IO and read alignment by a three-stage parallel pipeline; second, parallelization enabled by MIC coprocessor technology; third, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of these techniques using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on the Intel MIC, and it achieves more than fivefold speedup over the original BWA while maintaining alignment precision.
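The first level of parallelism, a pipeline over independent reads, can be sketched with a work queue. This is a toy stand-in for the real system: the `align` callable and worker count are illustrative, not B-MIC's implementation.

```python
import queue
import threading

def pipeline(reads, align, n_workers=3):
    """Toy three-stage pipeline: (1) enqueue reads, (2) align them on
    concurrent workers -- valid because reads are independent --
    and (3) gather the results."""
    todo, done = queue.Queue(), queue.Queue()
    for r in reads:                      # stage 1: data input
        todo.put(r)

    def worker():
        while True:
            try:
                r = todo.get_nowait()
            except queue.Empty:
                return
            done.put(align(r))           # stage 2: alignment

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(done.queue)            # stage 3: gather results

# Toy "alignment" that just reports read length (stand-in for a BWA call)
results = pipeline(["ACGT", "AC", "ACGTACGT"], align=len)
print(results)  # [2, 4, 8]
```

Because reads never interact, the only coordination needed is the two queues; this is the property that also makes the MIC-offload and MPI levels of the real aligner possible.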

  20. Massive quiver matrix models for massive charged particles in AdS

    DOE PAGES

    Asplund, Curtis T.; Denef, Frederik; Dzienkowski, Eric

    2016-01-11

Here, we present a new class of N = 4 supersymmetric quiver matrix models and argue that it describes the stringy low-energy dynamics of internally wrapped D-branes in four-dimensional anti-de Sitter (AdS) flux compactifications. The Lagrangians of these models differ from previously studied quiver matrix models by the presence of mass terms, associated with the AdS gravitational potential, as well as additional terms dictated by supersymmetry. These give rise to dynamical phenomena typically associated with the presence of fluxes, such as fuzzy membranes, internal cyclotron motion and the appearance of confining strings. We also show how these models can be obtained by dimensional reduction of four-dimensional supersymmetric quiver gauge theories on a three-sphere.

  1. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    PubMed

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-05

The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions to perform large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program; a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc.
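The quantity being parallelized in the Fermi-level step is a global constraint: the chemical potential must make the Fermi-Dirac occupations of all subsystem levels sum to the total electron count. The paper uses a novel interpolation-based algorithm; the sketch below solves the same constraint by plain bisection instead (the level energies and the smearing parameter are illustrative).

```python
import math

def fermi_level(eigenvalues, n_electrons, beta=200.0, tol=1e-10):
    """Bisect for the chemical potential mu at which the spin-summed
    Fermi-Dirac occupations of all levels add up to n_electrons."""
    def occ_one(x):                       # overflow-safe 2/(1+e^x)
        return 2.0 if x < -40 else (0.0 if x > 40 else 2.0 / (1.0 + math.exp(x)))
    occ = lambda mu: sum(occ_one(beta * (e - mu)) for e in eigenvalues)
    lo, hi = min(eigenvalues) - 10.0, max(eigenvalues) + 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if occ(mid) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [-0.5, -0.3, 0.1, 0.4]   # illustrative subsystem eigenvalues
mu = fermi_level(levels, n_electrons=4)
print(-0.3 < mu < 0.1)  # True: mu falls in the HOMO-LUMO gap
```

In the DC setting the occupation sum runs over the eigenvalues of every subsystem, so each evaluation of the constraint is itself parallel; the interpolation scheme of the paper reduces how many such evaluations are needed.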

  2. Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.

    1999-10-14

Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Also discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to distribute computational work equally among the processors of an SPMD computer, and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large scale systems. Parallel computations are demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three-dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.

  3. Exact solutions of massive gravity in three dimensions

    NASA Astrophysics Data System (ADS)

    Chakhad, Mohamed

    In recent years, there has been an upsurge in interest in three-dimensional theories of gravity. In particular, two theories of massive gravity in three dimensions hold strong promise in the search for fully consistent theories of quantum gravity, an understanding of which will shed light on the problems of quantum gravity in four dimensions. One of these theories is the "old" third-order theory of topologically massive gravity (TMG) and the other one is a "new" fourth-order theory of massive gravity (NMG). Despite this increase in research activity, the problem of finding and classifying solutions of TMG and NMG remains a wide open area of research. In this thesis, we provide explicit new solutions of massive gravity in three dimensions and suggest future directions of research. These solutions belong to the Kundt class of spacetimes. A systematic analysis of the Kundt solutions with constant scalar polynomial curvature invariants provides a glimpse of the structure of the spaces of solutions of the two theories of massive gravity. We also find explicit solutions of topologically massive gravity whose scalar polynomial curvature invariants are not all constant, and these are the first such solutions. A number of properties of Kundt solutions of TMG and NMG, such as an identification of solutions which lie at the intersection of the full nonlinear and linearized theories, are also derived.

  4. GPU computing in medical physics: a review.

    PubMed

    Pratx, Guillem; Xing, Lei

    2011-05-01

    The graphics processing unit (GPU) has emerged as a competitive platform for computing massively parallel problems. Many computing applications in medical physics can be formulated as data-parallel tasks that exploit the capabilities of the GPU for reducing processing times. The authors review the basic principles of GPU computing as well as the main performance optimization techniques, and survey existing applications in three areas of medical physics, namely image reconstruction, dose calculation and treatment plan optimization, and image processing.

  5. Logarithmic Superdiffusion in Two Dimensional Driven Lattice Gases

    NASA Astrophysics Data System (ADS)

    Krug, J.; Neiss, R. A.; Schadschneider, A.; Schmidt, J.

    2018-03-01

The spreading of density fluctuations in two-dimensional driven diffusive systems is marginally anomalous. Mode coupling theory predicts that the diffusivity in the direction of the drive diverges with time as (ln t)^{2/3}, with a prefactor depending on the macroscopic current-density relation and the diffusion tensor of the fluctuating hydrodynamic field equation. Here we present the first numerical verification of this behavior for a particular version of the two-dimensional asymmetric exclusion process. Particles jump strictly asymmetrically along one of the lattice directions and symmetrically along the other, and an anisotropy parameter p governs the ratio between the two rates. Using a novel massively parallel coupling algorithm that strongly reduces the fluctuations in the numerical estimate of the two-point correlation function, we are able to accurately determine the exponent of the logarithmic correction. In addition, the variation of the prefactor with p provides a stringent test of mode coupling theory.
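The model itself is simple to state in code: particles on a periodic square lattice jump only forward along the drive and symmetrically in the transverse direction, subject to exclusion, with an anisotropy parameter p setting the rate ratio. A minimal Monte Carlo sketch (not the paper's massively parallel coupling algorithm):

```python
import random

def sweep(lattice, L, p, rng=random):
    """One Monte Carlo sweep of a 2D driven lattice gas: totally
    asymmetric jumps (+x) along the drive, symmetric jumps in y,
    hard-core exclusion; p sets the transverse/driven rate ratio."""
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        if not lattice[x][y]:
            continue                        # no particle to move
        if rng.random() < 1.0 / (1.0 + p):
            nx, ny = (x + 1) % L, y         # driven direction
        else:
            nx, ny = x, (y + rng.choice((-1, 1))) % L
        if not lattice[nx][ny]:             # exclusion rule
            lattice[x][y], lattice[nx][ny] = False, True

random.seed(1)
L = 16
lattice = [[random.random() < 0.5 for _ in range(L)] for _ in range(L)]
n0 = sum(map(sum, lattice))
for _ in range(10):
    sweep(lattice, L, p=1.0)
print(sum(map(sum, lattice)) == n0)  # True: particle number is conserved
```

Measuring the (ln t)^{2/3} correction requires far larger systems and variance-reduction machinery such as the coupling algorithm of the paper; this sketch only fixes the dynamics being simulated.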

  6. Functional Parallel Factor Analysis for Functions of One- and Two-dimensional Arguments.

    PubMed

    Choi, Ji Yeh; Hwang, Heungsun; Timmerman, Marieke E

    2018-03-01

    Parallel factor analysis (PARAFAC) is a useful multivariate method for decomposing three-way data that consist of three different types of entities simultaneously. This method estimates trilinear components, each of which is a low-dimensional representation of a set of entities, often called a mode, to explain the maximum variance of the data. Functional PARAFAC permits the entities in different modes to be smooth functions or curves, varying over a continuum, rather than a collection of unconnected responses. The existing functional PARAFAC methods handle functions of a one-dimensional argument (e.g., time) only. In this paper, we propose a new extension of functional PARAFAC for handling three-way data whose responses are sequenced along both a two-dimensional domain (e.g., a plane with x- and y-axis coordinates) and a one-dimensional argument. Technically, the proposed method combines PARAFAC with basis function expansion approximations, using a set of piecewise quadratic finite element basis functions for estimating two-dimensional smooth functions and a set of one-dimensional basis functions for estimating one-dimensional smooth functions. In a simulation study, the proposed method appeared to outperform the conventional PARAFAC. We apply the method to EEG data to demonstrate its empirical usefulness.
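For reference, conventional (non-functional) PARAFAC fits the trilinear model X[i,j,k] ≈ Σ_r A[i,r]B[j,r]C[k,r], usually by alternating least squares. A minimal NumPy sketch of that baseline follows (the basis-function-expansion machinery of the proposed functional method is not included):

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Khatri-Rao product: rows indexed by (u, v) pairs."""
    return np.einsum('ur,vr->uvr', U, V).reshape(-1, U.shape[1])

def parafac_als(X, rank, n_iter=500, seed=0):
    """Rank-R PARAFAC by alternating least squares: each factor matrix
    is refit in turn against a matricized view of the 3-way array."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        A = np.linalg.lstsq(khatri_rao(B, C),
                            X.reshape(I, -1).T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C),
                            X.transpose(1, 0, 2).reshape(J, -1).T,
                            rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B),
                            X.transpose(2, 0, 1).reshape(K, -1).T,
                            rcond=None)[0].T
    return A, B, C

# Recover a synthetic rank-2 tensor from its entries
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = parafac_als(X, rank=2)
Xhat = np.einsum('ir,jr,kr->ijk', A, B, C)
err = np.linalg.norm(X - Xhat) / np.linalg.norm(X)
print(err < 1e-3)  # True when ALS has converged (typical for this setup)
```

The functional extension of the paper replaces the free entries of the factor matrices with coefficients of smooth basis expansions (piecewise quadratic finite elements for the two-dimensional mode), but the alternating refitting structure is the same.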

  7. High-Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

    NASA Technical Reports Server (NTRS)

    Felippa, C. A.; Farhat, C.; Park, K. C.; Gumaste, U.; Chen, P.-S.; Lesoinne, M.; Stern, P.

    1996-01-01

This research program dealt with the application of high-performance computing methods to the numerical simulation of complete jet engines. The program was initiated in January 1993 by applying two-dimensional parallel aeroelastic codes to the interior gas flow problem of a bypass jet engine. The fluid mesh generation, domain decomposition and solution capabilities were successfully tested. Attention was then focused on methodology for the partitioned analysis of the interaction of the gas flow with a flexible structure and with the fluid mesh motion driven by these structural displacements. The latter is treated by an ALE technique that models the fluid mesh motion as that of a fictitious mechanical network laid along the edges of near-field fluid elements. New partitioned analysis procedures to treat this coupled three-component problem were developed during 1994 and 1995. These procedures involved delayed corrections and subcycling, and were successfully tested on several massively parallel computers, including the iPSC-860, Paragon XP/S and the IBM SP2. For the global steady-state axisymmetric analysis of a complete engine we decided to use the NASA-sponsored ENG10 program, which uses a regular FV-multiblock-grid discretization in conjunction with circumferential averaging to include effects of blade forces, loss, combustor heat addition, blockage, bleeds and convective mixing. A load-balancing preprocessor for parallel versions of ENG10 was developed. During 1995 and 1996 we developed the capability for the first full 3D aeroelastic simulation of a multirow engine stage. This capability was tested on the IBM SP2 parallel supercomputer at NASA Ames. Benchmark results were presented at the 1996 Computational Aerosciences meeting.

  8. cellGPU: Massively parallel simulations of dynamic vertex models

    NASA Astrophysics Data System (ADS)

    Sussman, Daniel M.

    2017-10-01

    Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations. Program Files doi:http://dx.doi.org/10.17632/6j2cj29t3r.1 Licensing provisions: MIT Programming language: CUDA/C++ Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate. Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU. Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation
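The force laws in such two-dimensional vertex models typically derive from a per-cell energy that is quadratic in the deviations of cell area and perimeter from target values. A minimal sketch of that energy for one polygonal cell (the parameter values are illustrative):

```python
import math

def cell_energy(vertices, KA=1.0, A0=1.0, KP=1.0, P0=3.8):
    """Quadratic vertex-model energy of one polygonal cell:
    KA*(area - A0)**2 + KP*(perimeter - P0)**2."""
    n = len(vertices)
    area, perim = 0.0, 0.0
    for k in range(n):
        (x1, y1), (x2, y2) = vertices[k], vertices[(k + 1) % n]
        area += 0.5 * (x1 * y2 - x2 * y1)      # shoelace formula
        perim += math.hypot(x2 - x1, y2 - y1)
    return KA * (area - A0) ** 2 + KP * (perim - P0) ** 2

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(cell_energy(unit_square))  # area term vanishes; (4 - 3.8)**2 = 0.04
```

The computational difficulty cellGPU addresses is not this per-cell arithmetic but the topological bookkeeping: when edges shrink and neighbors swap, the connectivity of the tiling changes, and doing that concurrently on a GPU is the hard part.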

  9. Box schemes and their implementation on the iPSC/860

    NASA Technical Reports Server (NTRS)

    Chattot, J. J.; Merriam, M. L.

    1991-01-01

Research on algorithms for efficiently solving fluid flow problems on massively parallel computers is continued in the present paper. Attention is given to the implementation of a box scheme on the iPSC/860, a massively parallel computer with a peak speed of 10 Gflops and a memory of 128 Mwords. A domain decomposition approach to parallelism is used.

  10. Aspects of warped AdS3/CFT2 correspondence

    NASA Astrophysics Data System (ADS)

    Chen, Bin; Zhang, Jia-Ju; Zhang, Jian-Dong; Zhong, De-Liang

    2013-04-01

In this paper we apply the thermodynamics method to investigate the holographic pictures for the BTZ black hole, and the spacelike and null warped black holes, in three-dimensional topologically massive gravity (TMG) and new massive gravity (NMG). Even though there are higher derivative terms in these theories, the thermodynamics method is still effective. It gives results consistent with the ones obtained by using asymptotic symmetry group (ASG) analysis. In doing the ASG analysis we develop a brute-force realization of the Barnich-Brandt-Compere formalism with Mathematica code, which also allows us to calculate the masses and the angular momenta of the black holes. In particular, we propose the warped AdS3/CFT2 correspondence in the new massive gravity, which states that quantum gravity in the warped spacetime could be holographically dual to a two-dimensional CFT with central charges c_R = c_L = 24/(Gmβ²√(2(21−4β²))).

  11. Scan line graphics generation on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1988-01-01

    Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.
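The core of any Z-buffer approach, on the MPP or elsewhere, is the per-pixel depth test: a fragment is written only if it is closer than what the buffer already holds. A deliberately simplified scan-line sketch (axis-aligned constant-depth rectangles instead of general polygons, serial rather than SIMD):

```python
def rasterize(quads, width, height):
    """Minimal scan-line Z-buffer: each candidate fragment is kept
    only if its depth is nearer than the stored depth -- the test the
    MPP applies to large groups of pixels at once."""
    INF = float("inf")
    zbuf = [[INF] * width for _ in range(height)]
    frame = [[0] * width for _ in range(height)]
    for (x0, x1, y0, y1, z, color) in quads:   # axis-aligned quads
        for y in range(y0, y1):                # one scan line at a time
            for x in range(x0, x1):
                if z < zbuf[y][x]:             # the depth test
                    zbuf[y][x] = z
                    frame[y][x] = color
    return frame

# Two overlapping rectangles; the nearer one (smaller z) wins the overlap
frame = rasterize([(0, 3, 0, 3, 5.0, 1), (1, 4, 1, 4, 2.0, 2)], 4, 4)
print(frame[2][2], frame[0][0])  # 2 1
```

On a SIMD machine the inner loops collapse: all pixels of a scan-line group perform the comparison and conditional store simultaneously, which is why routing fragments to the right processors (the sort computation mentioned above) becomes the central problem.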

  12. A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

    NASA Technical Reports Server (NTRS)

    Shia, David; Mcmanus, Hugh L.

    1995-01-01

    A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.
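The key property of a fully explicit formulation is that every node's update depends only on the previous time level, so all nodes can be advanced independently, which is what maps naturally onto a massively parallel machine. A one-dimensional heat-conduction sketch of that structure (not the paper's coupled thermoelastic/porous-flow equations):

```python
def explicit_heat_step(T, alpha, dx, dt):
    """One fully explicit (FTCS) update of the 1D heat equation with
    fixed-temperature ends: every interior node is updated from the
    previous time level only, hence trivially parallel."""
    r = alpha * dt / dx**2
    assert r <= 0.5, "explicit stability limit violated"
    return [T[0]] + [
        T[i] + r * (T[i + 1] - 2 * T[i] + T[i - 1])
        for i in range(1, len(T) - 1)
    ] + [T[-1]]

# Hot spot in the middle of a cold rod with fixed cold ends
T = [0.0] * 5 + [1.0] + [0.0] * 5
for _ in range(50):
    T = explicit_heat_step(T, alpha=1.0, dx=1.0, dt=0.25)
print(max(T) < 1.0 and T[5] > T[1])  # True: the peak decays and heat spreads
```

The price of explicitness is the stability restriction on the time step (r ≤ 1/2 here), which is one reason such schemes favor machines where very many cheap steps can be taken in parallel.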

  13. Coupling a three-dimensional subsurface flow model with a land surface model to simulate stream-aquifer-land interactions

    NASA Astrophysics Data System (ADS)

    Huang, M.; Bisht, G.; Zhou, T.; Chen, X.; Dai, H.; Hammond, G. E.; Riley, W. J.; Downs, J.; Liu, Y.; Zachara, J. M.

    2016-12-01

A fully coupled three-dimensional surface and subsurface land model is developed and applied to a site along the Columbia River to simulate three-way interactions among river water, groundwater, and land surface processes. The model features the coupling of the Community Land Model version 4.5 (CLM4.5) and a massively parallel multi-physics reactive transport model (PFLOTRAN). The coupled model (CLM-PFLOTRAN) is applied to a 400 m × 400 m study domain instrumented with groundwater monitoring wells in the Hanford 300 Area along the Columbia River. CLM-PFLOTRAN simulations are performed at three different spatial resolutions over the period 2011-2015 to evaluate the impact of spatial resolution on simulated variables. To demonstrate the difference in model simulations with and without lateral subsurface flow, a vertical-only CLM-PFLOTRAN simulation is also conducted for comparison. Results show that the coupled model is skillful in simulating stream-aquifer interactions, and that land-surface energy partitioning can be strongly modulated by groundwater-river water interactions in high water years due to increased soil moisture availability caused by an elevated groundwater table. In addition, spatial resolution does not seem to impact the land surface energy flux simulations, although it is a key factor for accurately estimating the mass exchange rates at the boundaries and the associated biogeochemical reactions in the aquifer. The coupled model developed in this study establishes a solid foundation for understanding the co-evolution of hydrology and biogeochemistry along river corridors under historical and future hydro-climate changes.

  14. Bit-parallel arithmetic in a massively-parallel associative processor

    NASA Technical Reports Server (NTRS)

    Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.

    1992-01-01

A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m²) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.

  15. Hamiltonian structure of three-dimensional gravity in Vielbein formalism

    NASA Astrophysics Data System (ADS)

    Hajihashemi, Mahdi; Shirzad, Ahmad

    2018-01-01

Considering Chern-Simons-like gravity theories in three dimensions as first order systems, we analyze the Hamiltonian structure of three theories: topologically massive gravity, new massive gravity, and Zwei-Dreibein gravity. We show that these systems demonstrate a new feature of constrained systems, in which a new kind of constraint emerges due to factorization of the determinant of the matrix of Poisson brackets of constraints. We find the desired number of degrees of freedom as well as the generating functional of local Lorentz transformations and diffeomorphisms through the canonical structure of the system. We also compare the Hamiltonian structure of the linearized versions of the considered models with the original ones.

  16. Parallel Visualization of Large-Scale Aerodynamics Calculations: A Case Study on the Cray T3E

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu; Crockett, Thomas W.

    1999-01-01

    This paper reports the performance of a parallel volume rendering algorithm for visualizing a large-scale, unstructured-grid dataset produced by a three-dimensional aerodynamics simulation. This dataset, containing over 18 million tetrahedra, allows us to extend our performance results to a problem which is more than 30 times larger than the one we examined previously. This high resolution dataset also allows us to see fine, three-dimensional features in the flow field. All our tests were performed on the Silicon Graphics Inc. (SGI)/Cray T3E operated by NASA's Goddard Space Flight Center. Using 511 processors, a rendering rate of almost 9 million tetrahedra/second was achieved with a parallel overhead of 26%.

  17. MUSEing about the SHAPE of eta Car's outer ejecta

    NASA Astrophysics Data System (ADS)

    Mehner, A.; Steffen, W.; Groh, J.; Vogt, F. P. A.; Baade, D.; Boffin, H. M. J.; de Wit, W. J.; Oudmaijer, R. D.; Rivinius, T.; Selman, F.

    2017-11-01

    The role of episodic mass loss in evolved massive stars is one of the outstanding questions in stellar evolution theory. Integral field spectroscopy of nebulae around massive stars provides information on their recent mass-loss history. η Car is one of the most massive evolved stars and is surrounded by a complex circumstellar environment. We have conducted a three-dimensional morpho-kinematic analysis of η Car's ejecta outside its famous Homunculus nebula. SHAPE modelling of VLT MUSE data establishes unequivocally the spatial cohesion of the outer ejecta and the correlation of the ejecta with the soft X-ray emission.

  18. Precision Parameter Estimation and Machine Learning

    NASA Astrophysics Data System (ADS)

    Wandelt, Benjamin D.

    2008-12-01

    I discuss the strategy of ``Acceleration by Parallel Precomputation and Learning'' (APPLe), which can vastly accelerate parameter estimation in high-dimensional parameter spaces with costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques to explore a likelihood function, posterior distribution or χ²-surface efficiently. It is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.
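    A minimal sketch of the precompute-then-learn idea (a hypothetical toy likelihood, not PICo or RICO): evaluate an expensive likelihood on an embarrassingly parallel design of points, fit a cheap surrogate to those samples, and let the inherently sequential Metropolis walk query only the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

def costly_loglike(theta):
    """Stand-in for an expensive likelihood: a Gaussian centred at 2.0."""
    return -0.5 * ((theta - 2.0) / 0.5) ** 2

# 1. Trivially parallel precomputation: evaluate on a design of points
#    (each evaluation is independent, so it can be farmed out to workers).
design = np.linspace(0.0, 4.0, 41)
values = np.array([costly_loglike(t) for t in design])

# 2. "Learning" step: fit a cheap quadratic surrogate to the samples.
surrogate = np.poly1d(np.polyfit(design, values, deg=2))

# 3. Sequential Metropolis exploration now touches only the cheap surrogate.
chain, theta = [], 0.0
for _ in range(5000):
    prop = theta + rng.normal(scale=0.3)
    if np.log(rng.uniform()) < surrogate(prop) - surrogate(theta):
        theta = prop
    chain.append(theta)

posterior_mean = np.mean(chain[1000:])   # should sit near the true peak at 2.0
```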

  19. Novel Scalable 3-D MT Inverse Solver

    NASA Astrophysics Data System (ADS)

    Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

    2016-12-01

    We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine, the highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits an adjoint-source approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem setup. To parameterize an inverse domain, a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments, carried out on platforms ranging from modern laptops to high-performance clusters, demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
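    The mask parameterization can be sketched in a few lines (a hypothetical 1-D illustration, not the extrEMe code): an integer mask maps many forward-modelling cells onto one inversion parameter, and by the chain rule the misfit gradient for a merged parameter is the sum of the adjoint-computed gradients of its member cells.

```python
import numpy as np

# Hypothetical example: 12 forward-modelling cells merged into 4 inversion
# parameters via an integer mask (cells sharing a mask value always carry
# the same conductivity value).
mask = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3])
log_sigma = np.array([-2.0, -1.0, -3.0, -2.5])      # one value per parameter

cell_model = log_sigma[mask]                        # expand params -> cells

# Adjoint machinery returns a misfit gradient per *cell*; the gradient with
# respect to each merged parameter is the sum over its member cells
# (chain rule for the many-to-one mapping above).
grad_cells = np.arange(12, dtype=float)             # stand-in adjoint output
grad_params = np.bincount(mask, weights=grad_cells, minlength=log_sigma.size)
# parameter gradients: [3., 7., 26., 30.]
```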

  20. Scalable, High-performance 3D Imaging Software Platform: System Architecture and Application to Virtual Colonoscopy

    PubMed Central

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli; Brett, Bevin

    2013-01-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth, due to the massive amount of data that needs to be processed. In this work, we have developed a software platform designed to support high-performance 3D medical image processing for a wide range of applications using increasingly available and affordable commodity computing systems: multi-core, cluster, and cloud computing systems. To achieve scalable, high-performance computing, our platform (1) employs size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D image processing algorithms; (2) supports task scheduling for efficient load distribution and balancing; and (3) consists of layered parallel software libraries that allow a wide range of medical applications to share the same functionalities. We evaluated the performance of our platform by applying it to an electronic cleansing system in virtual colonoscopy, with initial experimental results showing a 10-fold performance improvement on an 8-core workstation over the original sequential implementation of the system. PMID:23366803
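    A minimal sketch of the block-volume idea (hypothetical kernel and fixed block size; the actual platform also does size-adaptive blocking and task scheduling): the volume is split into independent blocks that a worker pool processes concurrently before the results are stitched back together.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def smooth_block(block):
    """Per-block kernel: a simple zero-mean normalization as a stand-in."""
    return block - block.mean()

def process_volume(volume, block_shape=(4, 4, 4), workers=4):
    """Split a 3-D volume into distributable blocks, process them in
    parallel, and stitch the results back into an output volume."""
    bz, by, bx = block_shape
    out = np.empty(volume.shape, dtype=float)
    origins = [(z, y, x)
               for z in range(0, volume.shape[0], bz)
               for y in range(0, volume.shape[1], by)
               for x in range(0, volume.shape[2], bx)]

    def work(origin):
        z, y, x = origin
        sl = (slice(z, z + bz), slice(y, y + by), slice(x, x + bx))
        out[sl] = smooth_block(volume[sl].astype(float))   # blocks are disjoint

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, origins))       # independent tasks, load-balanced
    return out

vol = np.arange(8 * 8 * 8, dtype=float).reshape(8, 8, 8)
res = process_volume(vol)
assert res.shape == vol.shape
```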

  1. Magnetic Reconfiguration in Explosive Solar Activity

    NASA Technical Reports Server (NTRS)

    Antiochos, Spiro K.

    2008-01-01

    A fundamental property of the Sun's corona is that it is violently dynamic. The most spectacular and most energetic manifestations of this activity are the giant disruptions that give rise to coronal mass ejections (CMEs) and eruptive flares. These major events are of critical importance, because they drive the most destructive forms of space weather at Earth and in the solar system, and they provide a unique opportunity to study, in revealing detail, the interaction of magnetic field and matter, in particular magnetohydrodynamic instability and nonequilibrium -- processes that are at the heart of laboratory and astrophysical plasma physics. Recent observations by a number of NASA space missions have given us new insights into the physical mechanisms that underlie coronal explosions. Furthermore, massively-parallel computations have now allowed us to calculate fully three-dimensional models for solar activity. In this talk I will present some of the latest observations of the Sun, including those from the just-launched Hinode and STEREO missions, and discuss recent advances in the theory and modeling of explosive solar activity.

  2. Global Ocean Prediction with the HYbrid Coordinate Ocean Model, HYCOM

    NASA Astrophysics Data System (ADS)

    Chassignet, E.

    A broad partnership of institutions is collaborating in developing and demonstrating the performance and application of eddy-resolving, real-time global and Atlantic ocean prediction systems using the HYbrid Coordinate Ocean Model (HYCOM). These systems will be transitioned for operational use by the U.S. Navy at the Naval Oceanographic Office (NAVOCEANO), Stennis Space Center, MS, and the Fleet Numerical Meteorology and Oceanography Center (FNMOC), Monterey, CA, and by NOAA at the National Centers for Environmental Prediction (NCEP), Washington, D.C. These systems will run efficiently on a variety of massively parallel computers and will include sophisticated data assimilation techniques for assimilation of satellite altimeter sea surface height and sea surface temperature as well as in situ temperature, salinity, and float displacement. The Partnership addresses the Global Ocean Data Assimilation Experiment (GODAE) goals of three-dimensional (3D) depiction of the ocean state at fine resolution in real-time and provision of boundary conditions for coastal and regional models. An overview of the effort will be presented.

  3. 2010 Award for Outstanding Doctoral Thesis Research in Biological Physics Talk: How the Genome Folds

    NASA Astrophysics Data System (ADS)

    Lieberman-Aiden, Erez

    2011-03-01

    I describe Hi-C, a novel technology for probing the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. Working with collaborators at the Broad Institute and UMass Medical School, we used Hi-C to construct spatial proximity maps of the human genome at a resolution of 1Mb. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.

  4. The Dynamic Architectural and Epigenetic Nuclear Landscape: Developing the Genomic Almanac of Biology and Disease

    PubMed Central

    Tai, Phillip W. L.; Zaidi, Sayyed K.; Wu, Hai; Grandy, Rodrigo A.; Montecino, Martin M.; van Wijnen, André J.; Lian, Jane B.; Stein, Gary S.; Stein, Janet L.

    2014-01-01

    Compaction of the eukaryotic genome into the confined space of the cell nucleus must occur faithfully throughout each cell cycle to retain gene expression fidelity. For decades, experimental limitations to study the structural organization of the interphase nucleus restricted our understanding of its contributions towards gene regulation and disease. However, within the past few years, our capability to visualize chromosomes in vivo with sophisticated fluorescence microscopy, and to characterize chromosomal regulatory environments via massively-parallel sequencing methodologies, has drastically changed how we currently understand epigenetic gene control within the context of three-dimensional nuclear structure. The rapid rate at which information on nuclear structure is unfolding brings challenges to compare and contrast recent observations with historic findings. In this review, we discuss experimental breakthroughs that have influenced how we understand and explore the dynamic structure and function of the nucleus, and how we can incorporate historical perspectives with insights acquired from the ever-evolving advances in molecular biology and pathology. PMID:24242872

  5. Driven-dissipative quantum Monte Carlo method for open quantum systems

    NASA Astrophysics Data System (ADS)

    Nagy, Alexandra; Savona, Vincenzo

    2018-05-01

    We develop a real-time full configuration-interaction quantum Monte Carlo approach to model driven-dissipative open quantum systems with Markovian system-bath coupling. The method enables stochastic sampling of the Liouville-von Neumann time evolution of the density matrix thanks to a massively parallel algorithm, thus providing estimates of observables on the nonequilibrium steady state. We present the underlying theory and introduce an initiator technique and importance sampling to reduce the statistical error. Finally, we demonstrate the efficiency of our approach by applying it to the driven-dissipative two-dimensional XYZ spin-1/2 model on a lattice.

  6. Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

    NASA Astrophysics Data System (ADS)

    Houba, Tomas

    Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.

  7. Novel 16-channel receive coil array for accelerated upper airway MRI at 3 Tesla.

    PubMed

    Kim, Yoon-Chul; Hayes, Cecil E; Narayanan, Shrikanth S; Nayak, Krishna S

    2011-06-01

    Upper airway MRI can provide a noninvasive assessment of speech and swallowing disorders and sleep apnea. Recent work has demonstrated the value of high-resolution three-dimensional imaging and dynamic two-dimensional imaging and the importance of further improvements in spatio-temporal resolution. The purpose of the study was to describe a novel 16-channel 3 Tesla receive coil that is highly sensitive to the human upper airway and investigate the performance of accelerated upper airway MRI with the coil. In three-dimensional imaging of the upper airway during static posture, 6-fold acceleration is demonstrated using parallel imaging, potentially leading to capturing a whole three-dimensional vocal tract with 1.25 mm isotropic resolution within 9 sec of sustained sound production. Midsagittal spiral parallel imaging of vocal tract dynamics during natural speech production is demonstrated with 2 × 2 mm² in-plane spatial and 84 ms temporal resolution. Copyright © 2010 Wiley-Liss, Inc.

  8. Parallel processing a three-dimensional free-lagrange code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandell, D.A.; Trease, H.E.

    1989-01-01

    A three-dimensional, time-dependent free-Lagrange hydrodynamics code has been multitasked and autotasked on a CRAY X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the CRAY multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The three-dimensional algorithm has presented a number of problems that simpler algorithms, such as those for one-dimensional hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a CRAY-1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given.

  9. Vectorized and multitasked solution of the few-group neutron diffusion equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zee, S.K.; Turinsky, P.J.; Shayer, Z.

    1989-03-01

    A numerical algorithm with parallelism was used to solve the two-group, multidimensional neutron diffusion equations on computers characterized by shared memory, vector pipeline, and multi-CPU architecture features. Specifically, solutions were obtained on the Cray X/MP-48, the IBM-3090 with vector facilities, and the FPS-164. The material-centered mesh finite difference method approximation and outer-inner iteration method were employed. Parallelism was introduced in the inner iterations using the cyclic line successive overrelaxation iterative method and solving in parallel across lines. The outer iterations were completed using the Chebyshev semi-iterative method that allows parallelism to be introduced in both space and energy groups. For the three-dimensional model, power, soluble boron, and transient fission product feedbacks were included. Concentrating on the pressurized water reactor (PWR), the thermal-hydraulic calculation of moderator density assumed single-phase flow and a closed flow channel, allowing parallelism to be introduced in the solution across the radial plane. Using a pinwise-detail, quarter-core model of a typical PWR in cycle 1, for the two-dimensional model without feedback the measured million floating point operations per second (MFLOPS)/vector speedups were 83/11.7, 18/2.2, and 2.4/5.6 on the Cray, IBM, and FPS without multitasking, respectively. Lower performance was observed with a coarser mesh, i.e., shorter vector length, due to vector pipeline start-up. For an 18 x 18 x 30 (x-y-z) three-dimensional model with feedback of the same core, MFLOPS/vector speedups of ~61/6.7 and an execution time of 0.8 CPU seconds on the Cray without multitasking were measured. Finally, using two CPUs and the vector pipelines of the Cray, a multitasking efficiency of 81% was noted for the three-dimensional model.
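    The "parallelism across lines" idea can be sketched for the model Poisson problem (an illustration, not the reactor code): within one line-Jacobi sweep, every grid line reduces to an independent tridiagonal solve, so all lines can be relaxed concurrently.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal)."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def line_jacobi_sweep(u, f, h):
    """One line-relaxation sweep for the 5-point stencil of -∇²u = f.
    Each interior line is an independent tridiagonal solve against the
    previous iterate, so all lines of a sweep can run in parallel."""
    n = u.shape[0]
    new = u.copy()
    for j in range(1, n - 1):              # lines are mutually independent
        m = n - 2
        a = np.full(m, -1.0); b = np.full(m, 4.0); c = np.full(m, -1.0)
        a[0] = c[-1] = 0.0
        d = h * h * f[1:-1, j] + u[1:-1, j - 1] + u[1:-1, j + 1]
        d[0] += u[0, j]; d[-1] += u[-1, j]
        new[1:-1, j] = thomas(a, b, c, d)
    return new

# -∇²u = 1 on the unit square with zero boundary values.
n, h = 18, 1.0 / 17
u, f = np.zeros((n, n)), np.ones((n, n))
for _ in range(200):
    u = line_jacobi_sweep(u, f, h)
# u.max() converges toward ~0.074, the known peak of the continuous solution
```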

  10. A sweep algorithm for massively parallel simulation of circuit-switched networks

    NASA Technical Reports Server (NTRS)

    Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.

    1992-01-01

    A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.

  11. Multi-threading: A new dimension to massively parallel scientific computation

    NASA Astrophysics Data System (ADS)

    Nielsen, Ida M. B.; Janssen, Curtis L.

    2000-06-01

    Multi-threading is becoming widely available for Unix-like operating systems, and the application of multi-threading opens new ways for performing parallel computations with greater efficiency. We here briefly discuss the principles of multi-threading and illustrate the application of multi-threading for a massively parallel direct four-index transformation of electron repulsion integrals. Finally, other potential applications of multi-threading in scientific computing are outlined.

  12. Neural Parallel Engine: A toolbox for massively parallel neural signal processing.

    PubMed

    Tam, Wing-Kin; Yang, Zhi

    2018-05-01

    Large-scale neural recordings provide detailed information on neuronal activities and can help elicit the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.
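    A rough sketch of data-parallel peak detection (illustrative only, not the EC-PC algorithm or the NPE API): every sample is tested against its neighbours independently, which maps naturally onto one GPU thread per sample, and the surviving indices are then compacted into a dense list.

```python
import numpy as np

def detect_peaks(signal, threshold):
    """Data-parallel peak detection: the per-sample neighbour test is
    evaluated for all samples simultaneously (one GPU thread per sample
    in a CUDA setting), then surviving indices are stream-compacted."""
    s = np.asarray(signal, dtype=float)
    is_peak = np.zeros(s.size, dtype=bool)
    is_peak[1:-1] = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:]) & (s[1:-1] > threshold)
    return np.flatnonzero(is_peak)          # the compaction step

sig = np.array([0.0, 0.2, 3.1, 0.5, 0.1, 2.4, 2.4, 0.3, 0.0])
peaks = detect_peaks(sig, threshold=1.0)    # indices 2 and 5
```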

  13. Massive-Star Magnetospheres: Now in 3-D!

    NASA Astrophysics Data System (ADS)

    Townsend, Richard

    Magnetic fields are unexpected in massive stars, due to the absence of a dynamo convection zone beneath their surface layers. Nevertheless, kilogauss-strength, ordered fields were detected in a small subset of these stars over three decades ago, and the intervening years have witnessed the steady expansion of this subset. A distinctive feature of magnetic massive stars is that they harbor magnetospheres --- circumstellar environments where the magnetic field interacts strongly with the star's radiation-driven wind, confining it and channelling it into energetic shocks. A wide range of observational signatures are associated with these magnetospheres, in diagnostics ranging from X-rays all the way through to radio emission. Moreover, these magnetospheres can play an important role in massive-star evolution, by amplifying angular momentum loss in the wind. Recent progress in understanding massive-star magnetospheres has largely been driven by magnetohydrodynamical (MHD) simulations. However, these have been restricted to two-dimensional axisymmetric configurations, with three-dimensional configurations possible only in certain special cases. These restrictions are limiting further progress; we therefore propose to develop completely general three-dimensional models for the magnetospheres of massive stars, on the one hand to understand their observational properties and exploit them as plasma-physics laboratories, and on the other to gain a comprehensive understanding of how they influence the evolution of their host star. For weak- and intermediate-field stars, the models will be based on 3-D MHD simulations using a modified version of the ZEUS-MP code. For strong-field stars, we will extend our existing Rigid Field Hydrodynamics (RFHD) code to handle completely arbitrary field topologies.
To explore a putative 'photoionization-moderated mass loss' mechanism for massive-star magnetospheres, we will also further develop a photoionization code we have recently prototyped. Simulation data from these codes will be used to synthesize observables, suitable for comparison with datasets from ground- and space-based facilities. Project results will be disseminated in the form of journal papers, presentations, data and visualizations, to facilitate the broad communication of our results. In addition, we will release the project codes under an open-source license, to encourage other groups' involvement in modeling massive-star magnetospheres. Through furthering our insights into these magnetospheres, the project is congruous with NASA's Strategic Goal 2, 'Expand scientific understanding of the Earth and the universe in which we live'. By making testable predictions of X-ray emission and UV line profiles, it is naturally synergistic with observational studies of magnetic massive stars using NASA's ROSAT, Chandra, IUE and FUSE missions. By exploring magnetic braking, it will have a direct impact on theoretical predictions of collapsar yields, and thereby help drive forward the analysis and interpretation of gamma-ray burst observations by NASA's Swift and Fermi missions. And, through its general contribution toward understanding the lifecycle of massive stars, the project will complement the past, present and future investments in studying these stars using NASA's other space-based observatories.

  14. On 3D minimal massive gravity

    NASA Astrophysics Data System (ADS)

    Alishahiha, Mohsen; Qaemmaqami, Mohammad M.; Naseh, Ali; Shirzad, Ahmad

    2014-12-01

    We study the linearized equations of motion of the newly proposed three-dimensional gravity, known as minimal massive gravity, using its metric formulation. By making use of a redefinition of the parameters of the model, we observe that the resulting linearized equations are exactly the same as those of TMG. In particular, the model admits logarithmic modes at critical points. We also study several vacuum solutions of the model, especially at a certain limit where the contribution of the Chern-Simons term vanishes.

  15. GPU-accelerated Tersoff potentials for massively parallel Molecular Dynamics simulations

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Dac

    2017-03-01

    The Tersoff potential is one of the empirical many-body potentials that has been widely used in simulation studies at atomic scales. Unlike pair-wise potentials, the Tersoff potential involves three-body terms, which require much more arithmetic operations and data dependency. In this contribution, we have implemented the GPU-accelerated version of several variants of the Tersoff potential for LAMMPS, an open-source massively parallel Molecular Dynamics code. Compared to the existing MPI implementation in LAMMPS, the GPU implementation exhibits a better scalability and offers a speedup of 2.2X when run on 1000 compute nodes on the Titan supercomputer. On a single node, the speedup ranges from 2.0 to 8.0 times, depending on the number of atoms per GPU and hardware configurations. The most notable features of our GPU-accelerated version include its design for MPI/accelerator heterogeneous parallelism, its compatibility with other functionalities in LAMMPS, its ability to give deterministic results and to support both NVIDIA CUDA- and OpenCL-enabled accelerators. Our implementation is now part of the GPU package in LAMMPS and accessible for public use.

  16. Simulation of Hypervelocity Impact on Aluminum-Nextel-Kevlar Orbital Debris Shields

    NASA Technical Reports Server (NTRS)

    Fahrenthold, Eric P.

    2000-01-01

    An improved hybrid particle-finite element method has been developed for hypervelocity impact simulation. The method combines the general contact-impact capabilities of particle codes with the true Lagrangian kinematics of large strain finite element formulations. Unlike some alternative schemes which couple Lagrangian finite element models with smooth particle hydrodynamics, the present formulation makes no use of slidelines or penalty forces. The method has been implemented in a parallel, three-dimensional computer code. Simulations of three-dimensional orbital debris impact problems using this parallel hybrid particle-finite element code show good agreement with experiment and good speedup in parallel computation. The simulations included single- and multi-plate shields as well as aluminum and composite shielding materials, at an impact velocity of eleven kilometers per second.

  17. Multitasking the three-dimensional transport code TORT on CRAY platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azmy, Y.Y.; Barnett, D.A.; Burre, C.A.

    1996-04-01

    The multitasking options in the three-dimensional neutral particle transport code TORT, originally implemented for Cray's CTSS operating system, are revived and extended to run on Cray Y/MP and C90 computers using the UNICOS operating system. These include two coarse-grained domain decompositions: across octants, and across directions within an octant, termed Octant Parallel (OP) and Direction Parallel (DP), respectively. Parallel performance of the DP is significantly enhanced by increasing the task grain size and reducing load imbalance via dynamic scheduling of the discrete angles among the participating tasks. Substantial wall-clock speedup factors, approaching 4.5 using 8 tasks, have been measured in a time-sharing environment, and generally depend on the test problem specifications, number of tasks, and machine loading during execution.
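    The benefit of dynamically scheduling the discrete angles can be sketched with a toy load-balancing model (illustrative costs, not TORT itself): round-robin (static) assignment leaves some tasks overloaded when a few work items are expensive, while handing each item to the least-loaded task shrinks the makespan.

```python
import heapq

def makespan_static(costs, n_tasks):
    """Round-robin (static) assignment of work items to tasks."""
    loads = [0.0] * n_tasks
    for i, c in enumerate(costs):
        loads[i % n_tasks] += c
    return max(loads)

def makespan_dynamic(costs, n_tasks):
    """Dynamic scheduling: each item goes to the currently least-loaded
    task, mimicking tasks grabbing the next angle as they go idle."""
    heap = [0.0] * n_tasks
    heapq.heapify(heap)
    for c in costs:
        heapq.heappush(heap, heapq.heappop(heap) + c)
    return max(heap)

# Imbalanced work: a few expensive "angles" dominate the cost list.
costs = [10.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0] * 4
static = makespan_static(costs, 8)     # the 10.0 items pile onto one task
dynamic = makespan_dynamic(costs, 8)
assert dynamic < static                # dynamic scheduling reduces imbalance
```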

  18. Three-dimensional vectorial multifocal arrays created by pseudo-period encoding

    NASA Astrophysics Data System (ADS)

    Zeng, Tingting; Chang, Chenliang; Chen, Zhaozhong; Wang, Hui-Tian; Ding, Jianping

    2018-06-01

    Multifocal arrays have been attracting considerable attention recently owing to their potential applications in parallel optical tweezers, parallel single-molecule orientation determination, parallel recording and multifocal multiphoton microscopy. However, the generation of vectorial multifocal arrays with a tailorable structure and polarization state remains a great challenge, and reports on multifocal arrays have hitherto been restricted either to scalar focal spots without polarization versatility or to regular arrays with fixed spacing. In this work, we propose a specific pseudo-period encoding technique to create three-dimensional (3D) vectorial multifocal arrays with the ability to manipulate the position, polarization state and intensity of each focal spot. We experimentally validated the flexibility of our approach in the generation of 3D vectorial multiple spots with polarization multiplicity and position tunability.

  19. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  20. Eruptive Massive Vector Particles of 5-Dimensional Kerr-Gödel Spacetime

    NASA Astrophysics Data System (ADS)

    Övgün, A.; Sakalli, I.

    2018-02-01

In this paper, we investigate Hawking radiation of massive spin-1 particles from 5-dimensional Kerr-Gödel spacetime. By applying the WKB approximation and the Hamilton-Jacobi ansatz to the relativistic Proca equation, we obtain the quantum tunneling rate of the massive vector particles. Using the obtained tunneling rate, we show how the Hawking temperature of the 5-dimensional Kerr-Gödel spacetime can be precisely computed.
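The tunneling argument the abstract relies on can be summarized by one standard, model-independent relation (schematic only; E is the particle energy, and the specific imaginary part of the action for the Kerr-Gödel metric is what the paper computes):

```latex
\Gamma \;\propto\; \exp\!\left(-\frac{2}{\hbar}\,\mathrm{Im}\,S\right)
\;\equiv\; \exp\!\left(-\frac{E}{T_H}\right)
\quad\Longrightarrow\quad
T_H \;=\; \frac{\hbar\,E}{2\,\mathrm{Im}\,S}
```

Matching the WKB tunneling rate against the Boltzmann factor in this way is how the Hawking temperature is read off once Im S is known.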

  1. Computer-generated 3D ultrasound images of the carotid artery

    NASA Technical Reports Server (NTRS)

    Selzer, Robert H.; Lee, Paul L.; Lai, June Y.; Frieden, Howard J.; Blankenhorn, David H.

    1989-01-01

    A method is under development to measure carotid artery lesions from a computer-generated three-dimensional ultrasound image. For each image, the position of the transducer in six coordinates (x, y, z, azimuth, elevation, and roll) is recorded and used to position each B-mode picture element in its proper spatial position in a three-dimensional memory array. After all B-mode images have been assembled in the memory, the three-dimensional image is filtered and resampled to produce a new series of parallel-plane two-dimensional images from which arterial boundaries are determined using edge tracking methods.
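The assembly step described above (placing each B-mode picture element by the transducer's six recorded coordinates) can be sketched as a rotation plus translation into a voxel array. The rotation order, in-plane axis convention, and nearest-voxel assignment below are assumptions for illustration, not the paper's actual geometry pipeline.

```python
import numpy as np

def rot(az, el, roll):
    """Rotation from transducer frame to world frame (Z-Y-X order, an assumption)."""
    ca, sa = np.cos(az), np.sin(az)
    ce, se = np.cos(el), np.sin(el)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[ca, -sa, 0], [sa, ca, 0], [0, 0, 1]])
    Ry = np.array([[ce, 0, se], [0, 1, 0], [-se, 0, ce]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def insert_bmode(volume, image, pose, pixel_mm, voxel_mm):
    """Place each B-mode pixel into its spatial position in the 3D memory array."""
    x0, y0, z0, az, el, roll = pose
    R = rot(az, el, roll)
    h, w = image.shape
    # pixel coordinates in the image plane (y = lateral, z = depth) -- an assumption
    jj, ii = np.meshgrid(np.arange(w), np.arange(h))
    pts = np.stack([np.zeros(ii.size), jj.ravel() * pixel_mm, ii.ravel() * pixel_mm])
    world = R @ pts + np.array([[x0], [y0], [z0]])
    idx = np.round(world / voxel_mm).astype(int)          # nearest-voxel assignment
    ok = np.all((idx >= 0) & (idx < np.array(volume.shape)[:, None]), axis=0)
    volume[idx[0, ok], idx[1, ok], idx[2, ok]] = image.ravel()[ok]
    return volume
```

After all images are accumulated this way, the filled volume can be filtered and resampled into parallel-plane slices as the abstract describes.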

  2. Computer-generated 3D ultrasound images of the carotid artery

    NASA Astrophysics Data System (ADS)

    Selzer, Robert H.; Lee, Paul L.; Lai, June Y.; Frieden, Howard J.; Blankenhorn, David H.

    A method is under development to measure carotid artery lesions from a computer-generated three-dimensional ultrasound image. For each image, the position of the transducer in six coordinates (x, y, z, azimuth, elevation, and roll) is recorded and used to position each B-mode picture element in its proper spatial position in a three-dimensional memory array. After all B-mode images have been assembled in the memory, the three-dimensional image is filtered and resampled to produce a new series of parallel-plane two-dimensional images from which arterial boundaries are determined using edge tracking methods.

  3. SERODS optical data storage with parallel signal transfer

    DOEpatents

    Vo-Dinh, Tuan

    2003-09-02

    Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.

  4. SERODS optical data storage with parallel signal transfer

    DOEpatents

    Vo-Dinh, Tuan

    2003-06-24

    Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.

  5. The factorization of large composite numbers on the MPP

    NASA Technical Reports Server (NTRS)

    Mckurdy, Kathy J.; Wunderlich, Marvin C.

    1987-01-01

The continued fraction method for factoring large integers (CFRAC) was an ideal algorithm to implement on a massively parallel computer such as the Massively Parallel Processor (MPP). After much effort, the first 60-digit number was factored on the MPP using about 6 1/2 hours of array time. Although this result added about 10 digits to the size of number that could be factored using CFRAC on a serial machine, it was already badly beaten by the implementation of Davis and Holdridge on the CRAY-1 using the quadratic sieve, an algorithm which is clearly superior to CFRAC for large numbers. An algorithm is illustrated which is ideally suited to the single instruction multiple data (SIMD) massively parallel architecture, and some of the modifications needed to make the parallel implementation effective and efficient are described.

  6. High Performance Programming Using Explicit Shared Memory Model on the Cray T3D

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Simon, Horst D.; Lasinski, T. A. (Technical Monitor)

    1994-01-01

The Cray T3D is the first-phase system in Cray Research Inc.'s (CRI) three-phase massively parallel processing program. In this report we describe the architecture of the T3D as well as the CRAFT (Cray Research Adaptive Fortran) programming model, and contrast it with PVM, which is also supported on the T3D. We present some performance data based on the NAS Parallel Benchmarks to illustrate both architectural and software features of the T3D.

  7. The high performance parallel algorithm for Unified Gas-Kinetic Scheme

    NASA Astrophysics Data System (ADS)

    Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu

    2016-11-01

A high performance parallel algorithm for UGKS is developed to simulate three-dimensional internal and external flows on arbitrary grid systems. The physical domain and velocity domain are divided into different blocks and distributed according to a two-dimensional Cartesian topology, with intra-communicators in the physical domain for data exchange and other intra-communicators in the velocity domain for sum reduction to moment integrals. Numerical results of three-dimensional cavity flow and flow past a sphere agree well with the results from existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested on both small (1-16) and large (729-5832) processor counts. The tested speed-up ratio is nearly linear and thus the efficiency is around 1, which reveals the good scalability of the present algorithm.
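The two-level decomposition described above (physical blocks times velocity blocks, with row communicators exchanging halo data and column communicators reducing moment integrals) can be sketched as a rank-to-subdomain map. The row-major layout and near-even contiguous splits below are assumptions for illustration:

```python
def cart2d(rank, phys_blocks, vel_blocks, n_cells, n_velocities):
    """Map an MPI-style rank to its (physical, velocity) sub-domain index ranges."""
    assert 0 <= rank < phys_blocks * vel_blocks
    pi, vi = divmod(rank, vel_blocks)   # row = physical block, col = velocity block

    def span(i, blocks, n):             # near-even contiguous split of n items
        return i * n // blocks, (i + 1) * n // blocks

    return span(pi, phys_blocks, n_cells), span(vi, vel_blocks, n_velocities)
```

Ranks sharing the same physical block form one communicator for the velocity-space sum reduction; ranks sharing the same velocity block form another for physical-space data exchange.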

  8. Parallel Simulation of Three-Dimensional Free-Surface Fluid Flow Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

BAER, THOMAS A.; SUBIA, SAMUEL R.; SACKINGER, PHILIP A.

    2000-01-18

We describe parallel simulations of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of problem unknowns. Issues concerning the proper constraints along the solid-fluid dynamic contact line in three dimensions are discussed. Parallel computations are carried out for an example taken from the coating flow industry, flow in the vicinity of a slot coater edge. This is a three-dimensional free-surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another part of the flow domain. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.

  9. Searching for a C-function on the three-dimensional sphere

    NASA Astrophysics Data System (ADS)

    Beneventano, C. G.; Cavero-Peláez, I.; D'Ascanio, D.; Santangelo, E. M.

    2017-11-01

    We present a detailed analytic study on the three-dimensional sphere of the most popular candidates for C-functions, both for Dirac and scalar free massive fields. We discuss to which extent the effective action, the Rényi entanglement entropy and the renormalized entanglement entropy fulfill the conditions expected from C-functions. In view of the absence of a good candidate in the case of the scalar field, we introduce a new candidate, which we call the modified effective action, and analyze its pros and cons.

  10. Generalized three-dimensional lattice Boltzmann color-gradient method for immiscible two-phase pore-scale imbibition and drainage in porous media

    NASA Astrophysics Data System (ADS)

    Leclaire, Sébastien; Parmigiani, Andrea; Malaspinas, Orestis; Chopard, Bastien; Latt, Jonas

    2017-03-01

    This article presents a three-dimensional numerical framework for the simulation of fluid-fluid immiscible compounds in complex geometries, based on the multiple-relaxation-time lattice Boltzmann method to model the fluid dynamics and the color-gradient approach to model multicomponent flow interaction. New lattice weights for the lattices D3Q15, D3Q19, and D3Q27 that improve the Galilean invariance of the color-gradient model as well as for modeling the interfacial tension are derived and provided in the Appendix. The presented method proposes in particular an approach to model the interaction between the fluid compound and the solid, and to maintain a precise contact angle between the two-component interface and the wall. Contrarily to previous approaches proposed in the literature, this method yields accurate solutions even in complex geometries and does not suffer from numerical artifacts like nonphysical mass transfer along the solid wall, which is crucial for modeling imbibition-type problems. The article also proposes an approach to model inflow and outflow boundaries with the color-gradient method by generalizing the regularized boundary conditions. The numerical framework is first validated for three-dimensional (3D) stationary state (Jurin's law) and time-dependent (Washburn's law and capillary waves) problems. Then, the usefulness of the method for practical problems of pore-scale flow imbibition and drainage in porous media is demonstrated. Through the simulation of nonwetting displacement in two-dimensional random porous media networks, we show that the model properly reproduces three main invasion regimes (stable displacement, capillary fingering, and viscous fingering) as well as the saturating zone transition between these regimes. 
Finally, the ability to simulate immiscible two-component flow imbibition and drainage is validated, with excellent results, by numerical simulations in a Berea sandstone, a benchmark case frequently used in this field, using a complex geometry that originates from a 3D scan of a porous sandstone. The methods presented in this article were implemented in the open-source PALABOS library, a general C++ matrix-based library well adapted for massively parallel fluid flow computation.

  11. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Treesearch

    Matthew Parks; Richard Cronn; Aaron Liston

    2009-01-01

We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. We found that 30/33 ingroup nodes resolved with > 95-percent bootstrap support; this is a substantial improvement relative...

  12. Fast, Massively Parallel Data Processors

    NASA Technical Reports Server (NTRS)

    Heaton, Robert A.; Blevins, Donald W.; Davis, ED

    1994-01-01

Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on an "X" interconnection grid and with external memory via a high-capacity input/output bus. This approach to conditional operation nearly doubles the speed of various arithmetic operations.

  13. Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures

    NASA Technical Reports Server (NTRS)

    Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.

    1998-01-01

In this paper a new algorithm, designated the Fast Invariant Imbedding algorithm, for the solution of the Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.
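The Fast Invariant Imbedding algorithm itself is not given in the abstract; as a hedged stand-in, a vectorized Jacobi sweep for the discrete Poisson equation shows the grid-parallel structure that fast Poisson solvers expose: every interior point updates independently of the others. The grid size and manufactured right-hand side below are assumptions for the sketch.

```python
import numpy as np

# Model problem: -u_xx - u_yy = -f with exact solution u = sin(pi x) sin(pi y).
n = 33
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
f = -2.0 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)   # right-hand side

u = np.zeros((n, n))              # zero Dirichlet boundary values
for _ in range(8000):
    # every interior point updates from the OLD iterate simultaneously:
    # the classic data-parallel kernel of grid-based Poisson solvers
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                            + u[1:-1, :-2] + u[1:-1, 2:]
                            - h**2 * f[1:-1, 1:-1])
```

Jacobi is far slower than the fast solvers the abstract compares, but its sweep has the same per-point independence that makes vector and MIMD implementations natural.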

  14. Development of high-resolution x-ray CT system using parallel beam geometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoneyama, Akio, E-mail: akio.yoneyama.bu@hitachi.com; Baba, Rika; Hyodo, Kazuyuki

    2016-01-28

    For fine three-dimensional observations of large biomedical and organic material samples, we developed a high-resolution X-ray CT system. The system consists of a sample positioner, a 5-μm scintillator, microscopy lenses, and a water-cooled sCMOS detector. Parallel beam geometry was adopted to attain a field of view of a few mm square. A fine three-dimensional image of birch branch was obtained using a 9-keV X-ray at BL16XU of SPring-8 in Japan. The spatial resolution estimated from the line profile of a sectional image was about 3 μm.

  15. A programmable computational image sensor for high-speed vision

    NASA Astrophysics Data System (ADS)

    Yang, Jie; Shi, Cong; Long, Xitian; Wu, Nanjian

    2013-08-01

In this paper we present a programmable computational image sensor for high-speed vision. This computational image sensor contains four main blocks: an image pixel array, a massively parallel processing element (PE) array, a row processor (RP) array and a RISC core. The pixel-parallel PE array is responsible for transferring, storing and processing raw image data in a SIMD fashion with its own programming language. The RPs form a one-dimensional array of simplified RISC cores that can carry out complex arithmetic and logic operations. The PE array and RP array can finish a great amount of computation within few instruction cycles and therefore satisfy low- and middle-level high-speed image processing requirements. The RISC core controls the whole system operation and finishes some high-level image processing algorithms. We utilize a simplified AHB bus as the system bus to connect our major components. A programming language and corresponding tool chain for this computational image sensor are also developed.

  16. Parallel k-means++ for Multiple Shared-Memory Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mackey, Patrick S.; Lewis, Robert R.

    2016-09-22

In recent years k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
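The exact sequential k-means++ seeding being parallelized is short; within it, only the D² distance pass over all points is data-parallel, which is the part the three platforms accelerate. A minimal numpy sketch (vectorization standing in for the paper's threads):

```python
import numpy as np

def kmeans_pp_seed(points, k, rng):
    """Exact k-means++ seeding; the D^2 pass over all points is the parallel hot spot."""
    n = points.shape[0]
    centers = [points[rng.integers(n)]]              # first center: uniform choice
    d2 = np.sum((points - centers[0])**2, axis=1)    # squared distance to nearest center
    for _ in range(k - 1):
        probs = d2 / d2.sum()                        # D^2-weighted sampling
        centers.append(points[rng.choice(n, p=probs)])
        # data-parallel: update every point's distance against the new center
        d2 = np.minimum(d2, np.sum((points - centers[-1])**2, axis=1))
    return np.array(centers)
```

The sampling step itself is inherently sequential (each center depends on the previous distances), which is why exact parallelizations focus on the distance update rather than approximating the sampling.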

  17. Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

    PubMed

    Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

    2013-12-01

Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphics-processing-unit parallel framework named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.

  18. On a model of three-dimensional bursting and its parallel implementation

    NASA Astrophysics Data System (ADS)

    Tabik, S.; Romero, L. F.; Garzón, E. M.; Ramos, J. I.

    2008-04-01

A mathematical model for the simulation of three-dimensional bursting phenomena and its parallel implementation are presented. The model consists of four nonlinearly coupled partial differential equations that include fast and slow variables, and exhibits bursting in the absence of diffusion. The differential equations have been discretized by means of a linearly-implicit finite difference method, second-order accurate in both space and time, on equally-spaced grids. The resulting system of linear algebraic equations at each time level has been solved by means of the Preconditioned Conjugate Gradient (PCG) method. Three different parallel implementations of the proposed mathematical model have been developed; two of these implementations, i.e., the MPI and the PETSc codes, are based on a message passing paradigm, while the third one, i.e., the OpenMP code, is based on a shared address space paradigm. These three implementations are evaluated on two current high performance parallel architectures, i.e., a dual-processor cluster and a Shared Distributed Memory (SDM) system. A novel representation of the results that emphasizes the most relevant factors affecting the performance of the parallel implementations is proposed. The comparative analysis of the computational results shows that the MPI and the OpenMP implementations are about twice as efficient as the PETSc code on the SDM system. It is also shown that, for the conditions reported here, the nonlinear dynamics of the three-dimensional bursting phenomena exhibits three stages characterized by asynchronous, synchronous and then asynchronous oscillations, before a quiescent state is reached. It is also shown that the fast system reaches steady state in much less time than the slow variables.
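The PCG solve performed at each time level can be sketched generically. The diagonal (Jacobi) preconditioner below is a placeholder assumption, since the abstract does not name the preconditioner actually used:

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=500):
    """Conjugate Gradient with a diagonal preconditioner (an assumed choice)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv_diag * r                 # apply preconditioner M^{-1}
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p      # new search direction
        rz = rz_new
    return x
```

The matrix-vector products and the elementwise preconditioner application dominate the cost, and both parallelize naturally, which is what the MPI, PETSc and OpenMP variants distribute.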

  19. Topologically massive gravity and Ricci-Cotton flow

    NASA Astrophysics Data System (ADS)

    Lashkari, Nima; Maloney, Alexander

    2011-05-01

    We consider topologically massive gravity (TMG), which is three-dimensional general relativity with a cosmological constant and a gravitational Chern-Simons term. When the cosmological constant is negative the theory has two potential vacuum solutions: anti-de Sitter space and warped anti-de Sitter space. The theory also contains a massive graviton state which renders these solutions unstable for certain values of the parameters and boundary conditions. We study the decay of these solutions due to the condensation of the massive graviton mode using Ricci-Cotton flow, which is the appropriate generalization of Ricci flow to TMG. When the Chern-Simons coupling is small the AdS solution flows to warped AdS by the condensation of the massive graviton mode. When the coupling is large the situation is reversed, and warped AdS flows to AdS. Minisuperspace models are constructed where these flows are studied explicitly.

  20. Holographically viable extensions of topologically massive and minimal massive gravity?

    NASA Astrophysics Data System (ADS)

    Altas, Emel; Tekin, Bayram

    2016-01-01

    Recently [E. Bergshoeff et al., Classical Quantum Gravity 31, 145008 (2014)], an extension of the topologically massive gravity (TMG) in 2 +1 dimensions, dubbed as minimal massive gravity (MMG), which is free of the bulk-boundary unitarity clash that inflicts the former theory and all the other known three-dimensional theories, was found. Field equations of MMG differ from those of TMG at quadratic terms in the curvature that do not come from the variation of an action depending on the metric alone. Here we show that MMG is a unique theory and there does not exist a deformation of TMG or MMG at the cubic and quartic order (and beyond) in the curvature that is consistent at the level of the field equations. The only extension of TMG with the desired bulk and boundary properties having a single massive degree of freedom is MMG.

  1. 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape

    PubMed Central

    2013-01-01

Background The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequence scaffolding, mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond conventional, two-dimensional visualization tools. Results We developed a 3D Cytoscape Client/Server (3DScapeCS) plugin, which adopts Cytoscape for interpreting different types of data and UbiGraph for three-dimensional visualization. The extra dimension is useful in accommodating, visualizing, and distinguishing large-scale networks with multiple crossed connections in five case studies. Conclusions Evaluation on several experimental datasets using 3DScapeCS and its special features, including multilevel graph layout, time-course data animation, and parallel visualization, has proven its usefulness in visualizing complex data and helping to draw insightful conclusions. PMID:24225050

  2. 3D-PDR: Three-dimensional photodissociation region code

    NASA Astrophysics Data System (ADS)

    Bisbas, T. G.; Bell, T. A.; Viti, S.; Yates, J.; Barlow, M. J.

    2018-03-01

    3D-PDR is a three-dimensional photodissociation region code written in Fortran. It uses the Sundials package (written in C) to solve the set of ordinary differential equations and it is the successor of the one-dimensional PDR code UCL_PDR (ascl:1303.004). Using the HEALpix ray-tracing scheme (ascl:1107.018), 3D-PDR solves a three-dimensional escape probability routine and evaluates the attenuation of the far-ultraviolet radiation in the PDR and the propagation of FIR/submm emission lines out of the PDR. The code is parallelized (OpenMP) and can be applied to 1D and 3D problems.

  3. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™.

    PubMed

Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C

    2016-07-01

    The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  4. Supersymmetric solutions of N =(1 ,1 ) general massive supergravity

    NASA Astrophysics Data System (ADS)

    Deger, N. S.; Nazari, Z.; Sarıoǧlu, Ö.

    2018-05-01

    We construct supersymmetric solutions of three-dimensional N =(1 ,1 ) general massive supergravity (GMG). Solutions with a null Killing vector are, in general, pp-waves. We identify those that appear at critical points of the model, some of which do not exist in N =(1 ,1 ) new massive supergravity (NMG). In the timelike case, we find that many solutions are common with NMG, but there is a new class that is genuine to GMG, two members of which are stationary Lifshitz and timelike squashed AdS spacetimes. We also show that in addition to the fully supersymmetric AdS vacuum, there is a second AdS background with a nonzero vector field that preserves 1 /4 supersymmetry.

  5. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    PubMed

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
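The abstract does not describe paraBTM's load-balancing strategy beyond calling it low-cost and effective; one common scheme with that flavor is greedy longest-processing-time (LPT) assignment, sketched here purely as an illustration, not as paraBTM's actual algorithm:

```python
import heapq

def lpt_assign(task_costs, n_workers):
    """Greedy longest-processing-time: give each task to the least-loaded worker."""
    heap = [(0.0, w) for w in range(n_workers)]        # (current load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    # visit tasks from most to least expensive (stable for ties)
    for task, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        load, w = heapq.heappop(heap)                  # least-loaded worker
        assignment[w].append(task)
        heapq.heappush(heap, (load + cost, w))
    return assignment
```

With per-document cost estimates (e.g. text length) as weights, such a scheme keeps the maximum per-node load close to the average, which is the property a parallel text-mining run needs.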

  6. Research in computer science

    NASA Technical Reports Server (NTRS)

    Ortega, J. M.

    1986-01-01

Various graduate research activities in the field of computer science are reported. Among the topics discussed are: (1) failure probabilities in multi-version software; (2) Gaussian elimination on parallel computers; (3) three-dimensional Poisson solvers on parallel/vector computers; (4) automated task decomposition for multiple robot arms; (5) multi-color incomplete Cholesky conjugate gradient methods on the Cyber 205; and (6) parallel implementation of iterative methods for solving linear equations.

  7. Parallel DSMC Solution of Three-Dimensional Flow Over a Finite Flat Plate

    NASA Technical Reports Server (NTRS)

    Nance, Robert P.; Wilmoth, Richard G.; Moon, Bongki; Hassan, H. A.; Saltz, Joel

    1994-01-01

    This paper describes a parallel implementation of the direct simulation Monte Carlo (DSMC) method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a good load balance. Performance tests are conducted using the code to evaluate various remapping and remapping-interval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen flow over a finite-thickness flat plate. It is shown that the parallel algorithm produces results which compare well with experimental data. Moreover, it yields significantly faster execution times than the scalar code, as well as very good load-balance characteristics.
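One-dimensional chain partitioning keeps each processor's cells contiguous, which preserves neighbor locality for the DSMC cells. A simple prefix-sum splitter, shown as a hedged sketch of the idea rather than the paper's actual runtime-library code:

```python
def chain_partition(weights, n_parts):
    """Split a 1D chain of cell weights into contiguous blocks of near-equal load."""
    total = sum(weights)
    target = total / n_parts
    bounds, acc, cut = [0], 0.0, 1
    for i, w in enumerate(weights):
        acc += w
        # close the current block once it reaches its share of the total load
        if acc >= cut * target and len(bounds) < n_parts:
            bounds.append(i + 1)
            cut += 1
    bounds.append(len(weights))
    return [(bounds[j], bounds[j + 1]) for j in range(n_parts)]
```

Re-running such a splitter on updated per-cell particle counts at each remapping interval is one way dynamic domain decomposition can maintain the load balance the abstract reports.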

  8. Three-dimensional mapping of the local interstellar medium with composite data

    NASA Astrophysics Data System (ADS)

    Capitanio, L.; Lallement, R.; Vergely, J. L.; Elyajouri, M.; Monreal-Ibero, A.

    2017-10-01

    Context. Three-dimensional maps of the Galactic interstellar medium are general astrophysical tools. Reddening maps may be based on the inversion of color excess measurements for individual target stars or on statistical methods using stellar surveys. Three-dimensional maps based on diffuse interstellar bands (DIBs) have also been produced. All methods benefit from the advent of massive surveys and may benefit from Gaia data. Aims: All of the various methods and databases have their own advantages and limitations. Here we present a first attempt to combine different datasets and methods to improve the local maps. Methods: We first updated our previous local dust maps based on a regularized Bayesian inversion of individual color excess data by replacing Hipparcos or photometric distances with Gaia Data Release 1 values when available. Secondly, we complemented this database with a series of ≃5000 color excess values estimated from the strength of the λ15273 DIB toward stars possessing a Gaia parallax. The DIB strengths were extracted from SDSS/APOGEE spectra. Third, we computed a low-resolution map based on a grid of Pan-STARRS reddening measurements by means of a new hierarchical technique and used this map as the prior distribution during the inversion of the two other datasets. Results: The use of Gaia parallaxes introduces significant changes in some areas and globally increases the compactness of the structures. Additional DIB-based data make it possible to assign distances to clouds located behind closer opaque structures and do not introduce contradictory information for the close structures. A more realistic prior distribution instead of a plane-parallel homogeneous distribution helps better define the structures. We validated the results through comparisons with other maps and with soft X-ray data. Conclusions: Our study demonstrates that the combination of various tracers is a potential tool for more accurate maps. 
An online tool for retrieving maps and reddening estimates is available at http://stilism.obspm.fr

  9. Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

    NASA Astrophysics Data System (ADS)

    Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects, with a particular emphasis on parallel efficiency. The performance evaluation is analyzed with the help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest of clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speedup on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.

  10. Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Simulations of the Molecular Crystal alphaRDX

    DTIC Science & Technology

    2013-08-01

    This work models dislocations in the energetic molecular crystal RDX using the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) molecular dynamics code. Dispersion and electrostatic interactions are described by the SB potential for HMX/RDX, whose constants are given in table 1 of the report.

  11. Massively parallel sequencing-enabled mixture analysis of mitochondrial DNA samples.

    PubMed

    Churchill, Jennifer D; Stoljarova, Monika; King, Jonathan L; Budowle, Bruce

    2018-02-22

    The mitochondrial genome has a number of characteristics that provide useful information to forensic investigations. Massively parallel sequencing (MPS) technologies offer improvements to the quantitative analysis of the mitochondrial genome, specifically the interpretation of mixed mitochondrial samples. Two-person mixtures with nuclear DNA ratios of 1:1, 5:1, 10:1, and 20:1 of individuals from different and similar phylogenetic backgrounds and three-person mixtures with nuclear DNA ratios of 1:1:1 and 5:1:1 were prepared using the Precision ID mtDNA Whole Genome Panel and Ion Chef, and sequenced on the Ion PGM or Ion S5 sequencer (Thermo Fisher Scientific, Waltham, MA, USA). These data were used to evaluate whether and to what degree MPS mixtures could be deconvolved. Analysis was effective in identifying the major contributor in each instance, while SNPs from the minor contributor's haplotype only were identified in the 1:1, 5:1, and 10:1 two-person mixtures. While the major contributor was identified from the 5:1:1 mixture, analysis of the three-person mixtures was more complex, and the mixed haplotypes could not be completely parsed. These results indicate that mixed mitochondrial DNA samples may be interpreted with the use of MPS technologies.

  12. In-memory integration of existing software components for parallel adaptive unstructured mesh workflows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, Cameron W.; Granzow, Brian; Diamond, Gerrett

    Unstructured mesh methods, like finite elements and finite volumes, support the effective analysis of complex physical behaviors modeled by partial differential equations over general three-dimensional domains. The most reliable and efficient methods apply adaptive procedures with a-posteriori error estimators that indicate where and how the mesh is to be modified. Although adaptive meshes can have two to three orders of magnitude fewer elements than a more uniform mesh for the same level of accuracy, there are many complex simulations where the meshes required are so large that they can only be solved on massively parallel systems.
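The indicator-driven adaptation idea can be illustrated with a minimal 1-D sketch (ours, not the authors' software; the curvature proxy and the 50% marking threshold are assumptions): cells whose local error indicator exceeds a fraction of the maximum are bisected.

```python
def refine(cells, indicator, frac=0.5):
    """Bisect every cell whose indicator exceeds frac * (max indicator)."""
    threshold = frac * max(indicator(a, b) for a, b in cells)
    out = []
    for a, b in cells:
        if indicator(a, b) > threshold:
            m = 0.5 * (a + b)
            out.extend([(a, m), (m, b)])   # mark-and-bisect
        else:
            out.append((a, b))
    return out

# Example: resolve the steep feature of f(x) = 1/(x^2 + 0.01) near x = 0,
# using the second-difference |f(a) - 2 f(m) + f(b)| as a curvature proxy.
def f(x):
    return 1.0 / (x * x + 0.01)

def indicator(a, b):
    m = 0.5 * (a + b)
    return abs(f(a) - 2.0 * f(m) + f(b))

cells = [(-1.0 + 0.25 * i, -1.0 + 0.25 * (i + 1)) for i in range(8)]
for _ in range(3):
    cells = refine(cells, indicator)
print(len(cells))  # cells concentrate near x = 0; the smooth ends stay coarse
```

Real a-posteriori estimators are derived from the discretized PDE rather than from raw curvature, but the mark-and-bisect control flow is the same.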

  13. In-memory integration of existing software components for parallel adaptive unstructured mesh workflows

    DOE PAGES

    Smith, Cameron W.; Granzow, Brian; Diamond, Gerrett; ...

    2017-01-01

    Unstructured mesh methods, like finite elements and finite volumes, support the effective analysis of complex physical behaviors modeled by partial differential equations over general three-dimensional domains. The most reliable and efficient methods apply adaptive procedures with a-posteriori error estimators that indicate where and how the mesh is to be modified. Although adaptive meshes can have two to three orders of magnitude fewer elements than a more uniform mesh for the same level of accuracy, there are many complex simulations where the meshes required are so large that they can only be solved on massively parallel systems.

  14. Parallelized seeded region growing using CUDA.

    PubMed

    Park, Seongjin; Lee, Jeongjin; Lee, Hyunna; Shin, Juneseuk; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung

    2014-01-01

    This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with the intention of overcoming the main theoretical weakness of SRG: its computation time is directly proportional to the size of the segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, indicating that it can substantially assist segmentation during massive CT screening tests.
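For reference, the sequential SRG baseline that such a parallelization targets can be sketched as follows (a minimal version with our own variable names; the fixed intensity tolerance `tol` against the running region mean is an assumed acceptance criterion, one of several used in SRG variants):

```python
from collections import deque

def seeded_region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed` on a 2-D list of intensities.
    A neighbor joins while its intensity stays within `tol` of the region mean."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = float(image[seed[0]][seed[1]])
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        mean = total / len(region)          # running mean of accepted pixels
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - mean) <= tol:
                    region.add((nr, nc))
                    total += image[nr][nc]
                    queue.append((nr, nc))
    return region

img = [[10, 11, 50],
       [10, 12, 52],
       [ 9, 11, 51]]
# The dim left columns join the seed's region; the bright right column does not.
print(sorted(seeded_region_grow(img, (0, 0), tol=5)))
```

The frontier queue makes the sequential cost grow with region size, which is exactly the property the paper's per-pixel-parallel CUDA formulation attacks.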

  15. Algorithms and programming tools for image processing on the MPP

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.

    1985-01-01

    Topics addressed include: data mapping and rotational algorithms for the Massively Parallel Processor (MPP); Parallel Pascal language; documentation for the Parallel Pascal Development system; and a description of the Parallel Pascal language used on the MPP.

  16. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways to efficiently predict the performance of massively parallel codes. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution in which one directly executes the application code but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
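The direct-execution idea can be illustrated with a toy model (our own sketch, not LAPSE): application "work" advances each process's virtual clock directly, while message delivery goes through a discrete-event engine with an assumed fixed network latency.

```python
import heapq

NET_LATENCY = 5.0  # modeled per-message network delay (arbitrary units; assumed)

def simulate(num_procs, work_per_phase, phases):
    """Toy ring program: each phase, every rank 'computes' then sends a
    token to rank+1; receives block until the modeled delivery time."""
    clock = [0.0] * num_procs          # per-process virtual time
    events = []                        # min-heap of (delivery_time, dest_rank)
    for _ in range(phases):
        for rank in range(num_procs):
            clock[rank] += work_per_phase                    # direct execution stand-in
            dest = (rank + 1) % num_procs
            heapq.heappush(events, (clock[rank] + NET_LATENCY, dest))
        while events:                                        # deliver this phase's messages
            t, dest = heapq.heappop(events)
            clock[dest] = max(clock[dest], t)                # receiver waits for arrival
    return max(clock)

print(simulate(num_procs=4, work_per_phase=10.0, phases=3))
```

In a real system the compute term would be the measured time of the natively executed code rather than a constant, and the event engine itself would be parallelized, which is the paper's focus.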

  17. The singular behavior of massive QCD amplitudes

    NASA Astrophysics Data System (ADS)

    Mitov, Alexander; Moch, Sven-Olaf

    2007-05-01

    We discuss the structure of infrared singularities in on-shell QCD amplitudes with massive partons and present a general factorization formula in the limit of small parton masses. The factorization formula gives rise to an all-order exponentiation of both the soft poles in dimensional regularization and the large collinear logarithms of the parton masses. Moreover, it provides a universal relation between any on-shell amplitude with massive external partons and its corresponding massless amplitude. For the form factor of a heavy quark we present explicit results including the fixed-order expansion up to three loops in the small mass limit. For general scattering processes we show how our constructive method applies to the computation of all singularities as well as the constant (mass-independent) terms of a generic massive n-parton QCD amplitude up to the next-to-next-to-leading order corrections.
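The universal relation described above can be written schematically (our paraphrase of the general structure, with $Z^{(m|0)}_{[i]}$ a process-independent factor attached to each massive external parton $i$):

```latex
\mathcal{M}^{(m)}
  \;=\;
  \prod_{i}
  \left( Z^{(m|0)}_{[i]}\!\left(\tfrac{m}{\mu},\alpha_s,\epsilon\right) \right)^{1/2}
  \mathcal{M}^{(m=0)}
  \;+\; \mathcal{O}(m),
```

so in the small-mass limit the massive amplitude is obtained from the massless one, up to power-suppressed terms, once the per-leg factors are known.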

  18. HPCC Methodologies for Structural Design and Analysis on Parallel and Distributed Computing Platforms

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1998-01-01

    In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.

  19. ORCA Project: Research on high-performance parallel computer programming environments. Final report, 1 Apr-31 Mar 90

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Snyder, L.; Notkin, D.; Adams, L.

    1990-03-31

    This task relates to research on programming massively parallel computers. Previous work on the Ensemble concept of programming was extended, and investigation into nonshared-memory models of parallel computation was undertaken. The Ensemble concept defines a set of programming abstractions and organizes the programming task into three distinct levels: composition of machine instructions, composition of processes, and composition of phases. It was previously applied to shared-memory models of computation. During the present research period, these concepts were extended to nonshared-memory models, and one Ph.D. thesis was completed; one book chapter and six conference proceedings were published.

  20. Holographic heat engine within the framework of massive gravity

    NASA Astrophysics Data System (ADS)

    Mo, Jie-Xiong; Li, Gu-Qiang

    2018-05-01

    Heat engine models are constructed within the framework of massive gravity in this paper. For four-dimensional charged black holes in massive gravity, it is shown that the existence of graviton mass improves the heat engine efficiency significantly. The situation is more complicated for five-dimensional neutral black holes, since the constant corresponding to the third massive potential also contributes to the efficiency. It is again shown that the existence of graviton mass can improve the heat engine efficiency. Moreover, we probe how massive gravity influences the behavior of the heat engine efficiency as it approaches the Carnot efficiency.
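For reference, the two efficiencies being compared are the standard thermodynamic definitions (not specific to the massive-gravity construction):

```latex
\eta \;=\; \frac{W}{Q_H},
\qquad
\eta_C \;=\; 1 - \frac{T_C}{T_H},
```

where $W$ is the net work per cycle, $Q_H$ the heat absorbed from the hot reservoir, and $T_C$, $T_H$ the cold and hot reservoir temperatures; the second law guarantees $\eta \le \eta_C$ for any cycle operating between those temperatures.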

  1. Accuracy of impressions with different impression materials in angulated implants.

    PubMed

    Reddy, S; Prasad, K; Vakil, H; Jain, A; Chowdhary, R

    2013-01-01

    To evaluate the dimensional accuracy of the resultant (duplicate) casts made from two different impression materials (polyvinyl siloxane and polyether) with parallel and angulated implants. Three definitive master casts (control groups) were fabricated in dental stone, each with three implants placed equidistantly. In the first (control) group, all three implants were placed parallel to each other and perpendicular to the plane of the cast. In the second and third groups, all three implants were placed at 10° and 15° angulation, respectively, to the long axis of the cast, tilting towards the centre. Impressions were made with polyvinyl siloxane and polyether impression materials in a special tray, using an open-tray impression technique, from the master casts. These impressions were poured to obtain test casts. Three reference distances were evaluated on each test cast using a profile projector and compared with the control groups to determine the effect of the combined interaction of implant angulation and impression material on the accuracy of the resultant implant cast. Statistical analysis revealed no significant difference in the dimensional accuracy of the resultant casts made from the two impression materials with the open-tray impression technique for parallel or angulated implants. On the basis of these results, both impression materials, i.e., polyether and polyvinyl siloxane, are recommended for impression making with parallel as well as angulated implants.

  2. Parallel computational fluid dynamics '91; Conference Proceedings, Stuttgart, Germany, Jun. 10-12, 1991

    NASA Technical Reports Server (NTRS)

    Reinsch, K. G. (Editor); Schmidt, W. (Editor); Ecer, A. (Editor); Haeuser, Jochem (Editor); Periaux, J. (Editor)

    1992-01-01

    A conference was held on parallel computational fluid dynamics, and related papers were produced. Topics discussed in these papers include: parallel implicit and explicit solvers for compressible flow, parallel computational techniques for the Euler and Navier-Stokes equations, grid generation techniques for parallel computers, and aerodynamic simulation on massively parallel systems.

  3. Massively parallel and linear-scaling algorithm for second-order Møller-Plesset perturbation theory applied to the study of supramolecular wires

    NASA Astrophysics Data System (ADS)

    Kjærgaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawłowski, Filip; Vose, Aaron; Wang, Yang Min; Jørgensen, Poul

    2017-03-01

    We present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide-Expand-Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide-Expand-Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller-Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  4. Overcoming rule-based rigidity and connectionist limitations through massively-parallel case-based reasoning

    NASA Technical Reports Server (NTRS)

    Barnden, John; Srinivas, Kankanahalli

    1990-01-01

    Symbol manipulation as used in traditional Artificial Intelligence has been criticized by neural net researchers for being excessively inflexible and sequential. On the other hand, the application of neural net techniques to the types of high-level cognitive processing studied in traditional artificial intelligence presents major problems as well. A promising way out of this impasse is to build neural net models that accomplish massively parallel case-based reasoning. Case-based reasoning, which has received much attention recently, is essentially the same as analogy-based reasoning, and avoids many of the problems leveled at traditional artificial intelligence. Further problems are avoided by doing many strands of case-based reasoning in parallel, and by implementing the whole system as a neural net. In addition, such a system provides an approach to some aspects of the problems of noise, uncertainty and novelty in reasoning systems. The current neural net system (Conposit), which performs standard rule-based reasoning, is being modified into a massively parallel case-based reasoning version.

  5. Recurrent hotspot mutations in HRAS Q61 and PI3K-AKT pathway genes as drivers of breast adenomyoepitheliomas.

    PubMed

    Geyer, Felipe C; Li, Anqi; Papanastasiou, Anastasios D; Smith, Alison; Selenica, Pier; Burke, Kathleen A; Edelweiss, Marcia; Wen, Huei-Chi; Piscuoglio, Salvatore; Schultheis, Anne M; Martelotto, Luciano G; Pareja, Fresia; Kumar, Rahul; Brandes, Alissa; Fan, Dan; Basili, Thais; Da Cruz Paula, Arnaud; Lozada, John R; Blecua, Pedro; Muenst, Simone; Jungbluth, Achim A; Foschini, Maria P; Wen, Hannah Y; Brogi, Edi; Palazzo, Juan; Rubin, Brian P; Ng, Charlotte K Y; Norton, Larry; Varga, Zsuzsanna; Ellis, Ian O; Rakha, Emad A; Chandarlapaty, Sarat; Weigelt, Britta; Reis-Filho, Jorge S

    2018-05-08

    Adenomyoepithelioma of the breast is a rare tumor characterized by epithelial-myoepithelial differentiation, whose genetic underpinning is largely unknown. Here we show through whole-exome and targeted massively parallel sequencing analysis that whilst estrogen receptor (ER)-positive adenomyoepitheliomas display PIK3CA or AKT1 activating mutations, ER-negative adenomyoepitheliomas harbor highly recurrent codon Q61 HRAS hotspot mutations, which co-occur with PIK3CA or PIK3R1 mutations. In two- and three-dimensional cell culture models, forced expression of HRAS Q61R in non-malignant ER-negative breast epithelial cells with or without a PIK3CA H1047R somatic knock-in results in transformation and the acquisition of the cardinal features of adenomyoepitheliomas, including the expression of myoepithelial markers, a reduction in E-cadherin expression, and an increase in AKT signaling. Our results demonstrate that adenomyoepitheliomas are genetically heterogeneous, and qualify mutations in HRAS, a gene whose mutations are vanishingly rare in common-type breast cancers, as likely drivers of ER-negative adenomyoepitheliomas.

  6. A non-volatile organic electrochemical device as a low-voltage artificial synapse for neuromorphic computing

    NASA Astrophysics Data System (ADS)

    van de Burgt, Yoeri; Lubberman, Ewout; Fuller, Elliot J.; Keene, Scott T.; Faria, Grégorio C.; Agarwal, Sapan; Marinella, Matthew J.; Alec Talin, A.; Salleo, Alberto

    2017-04-01

    The brain is capable of massively parallel information processing while consuming only ~1-100 fJ per synaptic event. Inspired by the efficiency of the brain, CMOS-based neural architectures and memristors are being developed for pattern recognition and machine learning. However, the volatility, design complexity and high supply voltages for CMOS architectures, and the stochastic and energy-costly switching of memristors complicate the path to achieve the interconnectivity, information density, and energy efficiency of the brain using either approach. Here we describe an electrochemical neuromorphic organic device (ENODe) operating with a fundamentally different mechanism from existing memristors. ENODe switches at low voltage and energy (<10 pJ for 10³ μm² devices), displays >500 distinct, non-volatile conductance states within a ~1 V range, and achieves high classification accuracy when implemented in neural network simulations. Plastic ENODes are also fabricated on flexible substrates enabling the integration of neuromorphic functionality in stretchable electronic systems. Mechanical flexibility makes ENODes compatible with three-dimensional architectures, opening a path towards extreme interconnectivity comparable to the human brain.

  7. A non-volatile organic electrochemical device as a low-voltage artificial synapse for neuromorphic computing.

    PubMed

    van de Burgt, Yoeri; Lubberman, Ewout; Fuller, Elliot J; Keene, Scott T; Faria, Grégorio C; Agarwal, Sapan; Marinella, Matthew J; Alec Talin, A; Salleo, Alberto

    2017-04-01

    The brain is capable of massively parallel information processing while consuming only ∼1-100 fJ per synaptic event. Inspired by the efficiency of the brain, CMOS-based neural architectures and memristors are being developed for pattern recognition and machine learning. However, the volatility, design complexity and high supply voltages for CMOS architectures, and the stochastic and energy-costly switching of memristors complicate the path to achieve the interconnectivity, information density, and energy efficiency of the brain using either approach. Here we describe an electrochemical neuromorphic organic device (ENODe) operating with a fundamentally different mechanism from existing memristors. ENODe switches at low voltage and energy (<10 pJ for 10³ μm² devices), displays >500 distinct, non-volatile conductance states within a ∼1 V range, and achieves high classification accuracy when implemented in neural network simulations. Plastic ENODes are also fabricated on flexible substrates enabling the integration of neuromorphic functionality in stretchable electronic systems. Mechanical flexibility makes ENODes compatible with three-dimensional architectures, opening a path towards extreme interconnectivity comparable to the human brain.

  8. Homogeneous, anisotropic three-manifolds of topologically massive gravity

    NASA Astrophysics Data System (ADS)

    Nutku, Y.; Baekler, P.

    1989-10-01

    We present a new class of exact solutions of Deser, Jackiw, and Templeton's theory (DJT) of topologically massive gravity which consists of homogeneous, anisotropic manifolds. In these solutions the coframe is given by the left-invariant 1-forms of 3-dimensional Lie algebras up to constant scale factors. These factors are fixed in terms of the DJT coupling constant μ which is the constant of proportionality between the Einstein and Cotton tensors in 3-dimensions. Differences between the scale factors result in anisotropy which is a common feature of topologically massive 3-manifolds. We have found that only Bianchi Types VI, VIII, and IX lead to nontrivial solutions. Among these, a Bianchi Type IX, squashed 3-sphere solution of the Euclideanized DJT theory has finite action. Bianchi Type VIII, IX solutions can variously be embedded in the de Sitter/anti-de Sitter space. That is, some DJT 3-manifolds that we shall present here can be regarded as the basic constituent of anti-de Sitter space which is the ground state solution in higher dimensional generalization of Einstein's general relativity.

  9. Homogeneous, anisotropic three-manifolds of topologically massive gravity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nutku, Y.; Baekler, P.

    1989-10-01

    We present a new class of exact solutions of Deser, Jackiw, and Templeton's theory (DJT) of topologically massive gravity which consists of homogeneous, anisotropic manifolds. In these solutions the coframe is given by the left-invariant 1-forms of 3-dimensional Lie algebras up to constant scale factors. These factors are fixed in terms of the DJT coupling constant μ, which is the constant of proportionality between the Einstein and Cotton tensors in 3 dimensions. Differences between the scale factors result in anisotropy, which is a common feature of topologically massive 3-manifolds. We have found that only Bianchi Types VI, VIII, and IX lead to nontrivial solutions. Among these, a Bianchi Type IX, squashed 3-sphere solution of the Euclideanized DJT theory has finite action. Bianchi Type VIII and IX solutions can variously be embedded in the de Sitter/anti-de Sitter space. That is, some DJT 3-manifolds that we present here can be regarded as the basic constituent of anti-de Sitter space, which is the ground-state solution in higher-dimensional generalizations of Einstein's general relativity. © 1989 Academic Press, Inc.

  10. Externally Calibrated Parallel Imaging for 3D Multispectral Imaging Near Metallic Implants Using Broadband Ultrashort Echo Time Imaging

    PubMed Central

    Wiens, Curtis N.; Artz, Nathan S.; Jang, Hyungseok; McMillan, Alan B.; Reeder, Scott B.

    2017-01-01

    Purpose To develop an externally calibrated parallel imaging technique for three-dimensional multispectral imaging (3D-MSI) in the presence of metallic implants. Theory and Methods A fast, ultrashort echo time (UTE) calibration acquisition is proposed to enable externally calibrated parallel imaging techniques near metallic implants. The proposed calibration acquisition uses a broadband radiofrequency (RF) pulse to excite the off-resonance induced by the metallic implant, fully phase-encoded imaging to prevent in-plane distortions, and UTE to capture rapidly decaying signal. The performance of the externally calibrated parallel imaging reconstructions was assessed using phantoms and in vivo examples. Results Phantom and in vivo comparisons to self-calibrated parallel imaging acquisitions show that significant reductions in acquisition times can be achieved using externally calibrated parallel imaging with comparable image quality. Acquisition time reductions are particularly large for fully phase-encoded methods such as spectrally resolved fully phase-encoded three-dimensional (3D) fast spin-echo (SR-FPE), in which scan time reductions of up to 8 min were obtained. Conclusion A fully phase-encoded acquisition with broadband excitation and UTE enabled externally calibrated parallel imaging for 3D-MSI, eliminating the need for repeated calibration regions at each frequency offset. Significant reductions in acquisition time can be achieved, particularly for fully phase-encoded methods like SR-FPE. PMID:27403613

  11. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  12. Research in Parallel Algorithms and Software for Computational Aerosciences

    NASA Technical Reports Server (NTRS)

    Domel, Neal D.

    1996-01-01

    Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  13. Performance of the Heavy Flavor Tracker (HFT) detector in star experiment at RHIC

    NASA Astrophysics Data System (ADS)

    Alruwaili, Manal

    With growing technology, the number of processors is becoming massive, and current supercomputer processing will be available on desktops in the next decade. For mass-scale application software development on such massively parallel desktop systems, existing popular languages with large libraries must be augmented with new constructs and paradigms that exploit massively parallel computing and distributed memory models while retaining user-friendliness. Currently available object-oriented languages for massively parallel computing such as Chapel, X10, and UPC++ exploit distributed computing, data-parallel computing, and thread-level parallelism at the process level in the PGAS (Partitioned Global Address Space) memory model. However, they do not incorporate: 1) any extension for object distribution to exploit the PGAS model; 2) the flexibility of migrating or cloning an object between places to exploit load balancing; and 3) the programming paradigms that result from integrating data- and thread-level parallelism with object distribution. In the proposed thesis, I compare different languages in the PGAS model; propose new constructs that extend C++ with object distribution, object migration, and object cloning; and integrate PGAS-based process constructs with these extensions on distributed objects. A new paradigm, MIDD (Multiple Invocation Distributed Data), is also presented, in which different copies of the same class can be invoked to work on different elements of distributed data concurrently using remote method invocations. I present the new constructs, their grammar, and their behavior, and explain them using simple programs.
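The proposed place/migrate/clone semantics can be modeled in miniature (a purely illustrative, single-process Python sketch of the intended behavior; `Place`, `migrate`, and `clone` are our names, not Chapel/X10/UPC++ constructs):

```python
import copy

class Place:
    """A model of a PGAS 'place': an address space holding resident objects."""
    def __init__(self, pid):
        self.pid = pid
        self.objects = {}            # name -> object resident at this place

def migrate(name, src, dst):
    """Move an object between places: ownership transfers, one copy exists."""
    dst.objects[name] = src.objects.pop(name)

def clone(name, src, dst):
    """Replicate an object at another place; the copies diverge afterwards."""
    dst.objects[name] = copy.deepcopy(src.objects[name])

p0, p1 = Place(0), Place(1)
p0.objects["grid"] = [1, 2, 3]
clone("grid", p0, p1)        # both places now hold independent copies
migrate("grid", p0, p1)      # p0 gives up ownership; p1 holds the original
print("grid" in p0.objects, p1.objects["grid"])   # False [1, 2, 3]
```

In a real PGAS runtime these operations would move data between distributed address spaces and update remote references; the sketch only captures the ownership semantics that the proposed constructs add on top of existing PGAS languages.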

  14. Diffraction mode terahertz tomography

    DOEpatents

    Ferguson, Bradley; Wang, Shaohong; Zhang, Xi-Cheng

    2006-10-31

    A method of obtaining a series of images of a three-dimensional object. The method includes the steps of transmitting pulsed terahertz (THz) radiation through the entire object from a plurality of angles, optically detecting changes in the transmitted THz radiation using pulsed laser radiation, and constructing a plurality of imaged slices of the three-dimensional object using the detected changes in the transmitted THz radiation. The THz radiation is transmitted through the object as a two-dimensional array of parallel rays. The optical detection is an array of detectors such as a CCD sensor.

  15. Macroscopic Lagrangian description of warm plasmas. II Nonlinear wave interactions

    NASA Technical Reports Server (NTRS)

    Kim, H.; Crawford, F. W.

    1983-01-01

    A macroscopic Lagrangian is simplified to the adiabatic limit and expanded about equilibrium, to third order in perturbation, for three illustrative cases: one-dimensional compression parallel to the static magnetic field, two-dimensional compression perpendicular to the static magnetic field, and three-dimensional compression. As examples of the averaged-Lagrangian method applied to nonlinear wave interactions, coupling coefficients are derived for interactions between two electron plasma waves and an ion acoustic wave, and between an ordinary wave, an electron plasma wave, and an ion acoustic wave.

  16. Two- to three-dimensional crossover in a dense electron liquid in silicon

    NASA Astrophysics Data System (ADS)

    Matmon, Guy; Ginossar, Eran; Villis, Byron J.; Kölker, Alex; Lim, Tingbin; Solanki, Hari; Schofield, Steven R.; Curson, Neil J.; Li, Juerong; Murdin, Ben N.; Fisher, Andrew J.; Aeppli, Gabriel

    2018-04-01

    Doping of silicon via phosphine exposures alternating with molecular beam epitaxy overgrowth is a path to Si:P substrates for conventional microelectronics and quantum information technologies. The technique also provides a well-controlled material for systematic studies of two-dimensional lattices with a half-filled band. We show here that for a dense (ns=2.8 ×1014 cm-2) disordered two-dimensional array of P atoms, the full field magnitude and angle-dependent magnetotransport is remarkably well described by classic weak localization theory with no corrections due to interaction. The two- to three-dimensional crossover seen upon warming can also be interpreted using scaling concepts developed for anisotropic three-dimensional materials, which work remarkably well except when the applied fields are nearly parallel to the conducting planes.

  17. Segmentation of Unstructured Datasets

    NASA Technical Reports Server (NTRS)

    Bhat, Smitha

    1996-01-01

    Datasets generated by computer simulations and experiments in Computational Fluid Dynamics tend to be extremely large and complex. It is difficult to visualize these datasets using standard techniques like Volume Rendering and Ray Casting. Object Segmentation provides a technique to extract and quantify regions of interest within these massive datasets. This thesis explores basic algorithms to extract coherent amorphous regions from two-dimensional and three-dimensional scalar unstructured grids. The techniques are applied to datasets from Computational Fluid Dynamics and from Finite Element Analysis.

  18. Role of APOE Isoforms in the Pathogenesis of TBI induced Alzheimer’s Disease

    DTIC Science & Technology

    2016-10-01

    deletion, APOE targeted replacement, complex breeding, CCI model optimization, mRNA library generation, high throughput massive parallel sequencing...demonstrate that the lack of Abca1 increases amyloid plaques and decreased APOE protein levels in AD-model mice. In this proposal we will test the hypothesis...injury, inflammatory reaction, transcriptome, high throughput massive parallel sequencing, mRNA-seq., behavioral testing, memory impairment, recovery 3

  19. High-Resolution Functional Mapping of the Venezuelan Equine Encephalitis Virus Genome by Insertional Mutagenesis and Massively Parallel Sequencing

    DTIC Science & Technology

    2010-10-14

    High-Resolution Functional Mapping of the Venezuelan Equine Encephalitis Virus Genome by Insertional Mutagenesis and Massively Parallel Sequencing...Venezuelan equine encephalitis virus (VEEV) genome. We initially used a capillary electrophoresis method to gain insight into the role of the VEEV...Smith JM, Schmaljohn CS (2010) High-Resolution Functional Mapping of the Venezuelan Equine Encephalitis Virus Genome by Insertional Mutagenesis and

  20. Efficient, massively parallel eigenvalue computation

    NASA Technical Reports Server (NTRS)

    Huo, Yan; Schreiber, Robert

    1993-01-01

    In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
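    The classic choice for diagonalizing real symmetric matrices on SIMD machines is the Jacobi rotation method, because independent rotations can be applied concurrently. As a hedged illustration (not the MasPar code, whose details the abstract does not give), here is a plain sequential Jacobi eigenvalue sweep in Python:

```python
import math

def jacobi_eigenvalues(a, sweeps=50, tol=1e-12):
    """Approximate eigenvalues of a real symmetric matrix (list of lists)
    by repeatedly applying plane rotations that zero off-diagonal entries."""
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    for _ in range(sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n)
                            for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < tol:
                    continue
                # Rotation angle that zeroes a[p][q] in the (p,q) plane
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):  # columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
    return sorted(a[i][i] for i in range(n))
```

    On a SIMD machine, n/2 non-conflicting rotations can be applied in parallel per step; the sequential loop above shows only the numerical core.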

  1. Exploring the Ability of a Coarse-grained Potential to Describe the Stress-strain Response of Glassy Polystyrene

    DTIC Science & Technology

    2012-10-01

    using the open-source code Large-scale Atomic/Molecular Massively Parallel Simulator ( LAMMPS ) (http://lammps.sandia.gov) (23). The commercial...parameters are proprietary and cannot be ported to the LAMMPS 4 simulation code. In our molecular dynamics simulations at the atomistic resolution, we...IBI iterative Boltzmann inversion LAMMPS Large-scale Atomic/Molecular Massively Parallel Simulator MAPS Materials Processes and Simulations MS

  2. GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models

    PubMed Central

    Mukherjee, Chiranjit; Rodriguez, Abel

    2016-01-01

    Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogeneous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm in moderate dimensional data examples. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing times and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful. PMID:28626348

  3. GPU-powered Shotgun Stochastic Search for Dirichlet process mixtures of Gaussian Graphical Models.

    PubMed

    Mukherjee, Chiranjit; Rodriguez, Abel

    2016-01-01

    Gaussian graphical models are popular for modeling high-dimensional multivariate data with sparse conditional dependencies. A mixture of Gaussian graphical models extends this model to the more realistic scenario where observations come from a heterogeneous population composed of a small number of homogeneous sub-groups. In this paper we present a novel stochastic search algorithm for finding the posterior mode of high-dimensional Dirichlet process mixtures of decomposable Gaussian graphical models. Further, we investigate how to harness the massive thread-parallelization capabilities of graphical processing units to accelerate computation. The computational advantages of our algorithms are demonstrated with various simulated data examples in which we compare our stochastic search with a Markov chain Monte Carlo algorithm in moderate dimensional data examples. These experiments show that our stochastic search largely outperforms the Markov chain Monte Carlo algorithm in terms of computing times and in terms of the quality of the posterior mode discovered. Finally, we analyze a gene expression dataset in which Markov chain Monte Carlo algorithms are too slow to be practically useful.

  4. A mixed finite difference/Galerkin method for three-dimensional Rayleigh-Benard convection

    NASA Technical Reports Server (NTRS)

    Buell, Jeffrey C.

    1988-01-01

    A fast and accurate numerical method, for nonlinear conservation equation systems whose solutions are periodic in two of the three spatial dimensions, is presently implemented for the case of Rayleigh-Benard convection between two rigid parallel plates in the parameter region where steady, three-dimensional convection is known to be stable. High-order streamfunctions secure the reduction of the system of five partial differential equations to a system of only three. Numerical experiments are presented which verify both the expected convergence rates and the absolute accuracy of the method.

  5. Three-dimensional instabilities of mantle convection with multiple phase transitions

    NASA Technical Reports Server (NTRS)

    Honda, S.; Yuen, D. A.; Balachandar, S.; Reuteler, D.

    1993-01-01

    The effects of multiple phase transitions on mantle convection are investigated by numerical simulations that are based on three-dimensional models. These simulations show that cold sheets of mantle material collide at junctions, merge, and form a strong downflow that is stopped temporarily by the transition zone. The accumulated cold material gives rise to a strong gravitational instability that causes the cold mass to sink rapidly into the lower mantle. This process promotes a massive exchange between the lower and upper mantles and triggers a global instability in the adjacent plume system. This mechanism may be cyclic in nature and may be linked to the generation of superplumes.

  6. A new parallel-vector finite element analysis software on distributed-memory computers

    NASA Technical Reports Server (NTRS)

    Qin, Jiangning; Nguyen, Duc T.

    1993-01-01

    A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. A block-skyline storage scheme along with vector-unrolling techniques is used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.

  7. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
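    The template-based scheme above can be sketched in a few lines: each node's checkpoint is split into blocks, block checksums are compared against the stored template, and only differing blocks are compressed and saved. Block size, the checksum function and the compression codec below are illustrative assumptions, not the patent's specifics.

```python
import hashlib
import zlib

BLOCK = 64  # bytes per block (illustrative)

def blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def make_delta(template, current):
    """Return {block index: compressed block} for blocks differing from the template."""
    tsums = [hashlib.md5(b).digest() for b in blocks(template)]
    delta = {}
    for i, b in enumerate(blocks(current)):
        if i >= len(tsums) or hashlib.md5(b).digest() != tsums[i]:
            delta[i] = zlib.compress(b)
    return delta

def apply_delta(template, delta, size):
    """Rebuild the current checkpoint from the template plus the saved delta."""
    out = blocks(template)
    for i, comp in delta.items():
        while len(out) <= i:
            out.append(b"")
        out[i] = zlib.decompress(comp)
    return b"".join(out)[:size]
```

    As in rsync, the payload that must be transmitted and stored shrinks to the blocks that actually changed since the template was produced.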

  8. Parallel computing works

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    An account of the Caltech Concurrent Computation Program (C^3P), a five-year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C^3P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high-performance computing facility based exclusively on parallel computers. While the initial focus of C^3P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  9. Dynamic file-access characteristics of a production parallel scientific workload

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1994-01-01

    Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three-week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.
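    The core of a trace-driven caching simulation like the one mentioned above is simple: replay the recorded sequence of block requests through a cache model and measure the hit rate. A minimal sketch with an LRU policy (the trace, block identifiers and cache size below are made up for illustration):

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay a request trace through an LRU cache and return the hit rate."""
    cache, hits = OrderedDict(), 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace)
```

    Running the same trace through caches of different sizes and policies is what lets designers of parallel file systems size their client and server caches.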

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reed, D.A.; Grunwald, D.C.

    The spectrum of parallel processor designs can be divided into three sections according to the number and complexity of the processors. At one end there are simple, bit-serial processors. Any one of these processors is of little value, but when it is coupled with many others, the aggregate computing power can be large. This approach to parallel processing can be likened to a colony of termites devouring a log. The most notable examples of this approach are the NASA/Goodyear Massively Parallel Processor, which has 16K one-bit processors, and the Thinking Machines Connection Machine, which has 64K one-bit processors. At the other end of the spectrum, a small number of processors, each built using the fastest available technology and the most sophisticated architecture, are combined. An example of this approach is the Cray X-MP. This type of parallel processing is akin to four woodmen attacking the log with chainsaws.

  11. Analysis of bacterial and archaeal diversity in coastal microbial mats using massive parallel 16S rRNA gene tag sequencing.

    PubMed

    Bolhuis, Henk; Stal, Lucas J

    2011-11-01

    Coastal microbial mats are small-scale and largely closed ecosystems in which a plethora of different functional groups of microorganisms are responsible for the biogeochemical cycling of the elements. Coastal microbial mats play an important role in coastal protection and morphodynamics through stabilization of the sediments and by initiating the development of salt-marshes. Little is known about the bacterial and especially archaeal diversity and how it contributes to the ecological functioning of coastal microbial mats. Here, we analyzed three different types of coastal microbial mats that are located along a tidal gradient and can be characterized as marine (ST2), brackish (ST3) and freshwater (ST3) systems. The mats were sampled during three different seasons and subjected to massive parallel tag sequencing of the V6 region of the 16S rRNA genes of Bacteria and Archaea. Sequence analysis revealed that the mats are among the most diverse marine ecosystems studied so far and consist of several novel taxonomic levels ranging from classes to species. The diversity between the different mat types was far more pronounced than the changes between the different seasons at one location. The archaeal communities of these mats had not been studied before and revealed a strong reaction to a short period of drought during summer, resulting in a massive increase in halobacterial sequences, whereas the bacterial community was barely affected. We concluded that the community composition and the microbial diversity were intrinsic to the mat type and depend on the location along the tidal gradient, indicating a relation with salinity.

  12. Reconstructing evolutionary trees in parallel for massive sequences.

    PubMed

    Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

    2017-12-14

    Building evolutionary trees for massive numbers of unaligned DNA sequences is crucial but challenging: reconstructing an evolutionary tree for ultra-large sequence sets is hard, and massive multiple sequence alignment is likewise time- and space-consuming. Hadoop and Spark, developed recently, shed new light on these classical computational biology problems. In this paper, we address multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, the tool developed in this paper, can process big DNA sequence files quickly. It works well on files larger than 1 GB and achieves better performance than other evolutionary reconstruction tools. Users can run HPTree to reconstruct evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree can support research on population evolution and metagenomics analysis. We employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel, and a neighbour-joining model is employed for building the evolutionary tree. Our software and source code are available at http://lab.malab.cn/soft/HPtree/ .
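    The tree-building step named above, neighbour joining, can be sketched sequentially in a few dozen lines: repeatedly pick the pair of nodes minimising the Q-criterion, join them into a new internal node, and shrink the distance matrix. This is a minimal illustration on a toy distance matrix; none of HPTree's Hadoop/Spark machinery, clustering or alignment stages is reproduced.

```python
def neighbour_joining(d, names):
    """d: symmetric distance matrix (list of lists); returns a nested-tuple tree."""
    d = [row[:] for row in d]
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        r = [sum(row) for row in d]
        # Find the pair (i, j) minimising Q(i,j) = (n-2)*d[i][j] - r[i] - r[j]
        best, bi, bj = None, 0, 1
        for i in range(n):
            for j in range(i + 1, n):
                q = (n - 2) * d[i][j] - r[i] - r[j]
                if best is None or q < best:
                    best, bi, bj = q, i, j
        # Join i and j into a new internal node and update distances to it
        new = (nodes[bi], nodes[bj])
        dist = [0.5 * (d[bi][k] + d[bj][k] - d[bi][bj])
                for k in range(n) if k not in (bi, bj)]
        keep = [k for k in range(n) if k not in (bi, bj)]
        d = [[d[a][b] for b in keep] for a in keep]
        for row, x in zip(d, dist):
            row.append(x)
        d.append(dist + [0.0])
        nodes = [nodes[k] for k in keep] + [new]
    return (nodes[0], nodes[1])
```

    The O(n^2) pair search inside each join is the part a Spark implementation would distribute across the cluster.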

  13. Efficient implementation of a 3-dimensional ADI method on the iPSC/860

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Van der Wijngaart, R.F.

    1993-12-31

    A comparison is made between several domain decomposition strategies for the solution of three-dimensional partial differential equations on a MIMD distributed memory parallel computer. The grids used are structured, and the numerical algorithm is ADI. Important implementation issues regarding load balancing, storage requirements, network latency, and overlap of computations and communications are discussed. Results of the solution of the three-dimensional heat equation on the Intel iPSC/860 are presented for the three most viable methods. It is found that the Bruno-Cappello decomposition delivers optimal computational speed through an almost complete elimination of processor idle time, while providing good memory efficiency.
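    What makes ADI attractive on distributed-memory machines is that each sweep reduces the 3-D problem to many independent tridiagonal line solves, one per grid line in the sweep direction. A hedged sketch of the Thomas algorithm used for those line solves (the decomposition strategies compared in the paper are not reproduced):

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system Ax = d, where a is the sub-diagonal,
    b the main diagonal and c the super-diagonal (a[0] and c[-1] unused)."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

    In a 3-D ADI step for the heat equation, this solver is applied to every grid line in x, then y, then z; distributing those independent solves across processors is exactly the load-balancing problem the decompositions address.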

  14. Linear stability of three-dimensional boundary layers - Effects of curvature and non-parallelism

    NASA Technical Reports Server (NTRS)

    Malik, M. R.; Balakumar, P.

    1993-01-01

    In this paper we study the effect of in-plane (wavefront) curvature on the stability of three-dimensional boundary layers. It is found that this effect is stabilizing or destabilizing depending upon the sign of the crossflow velocity profile. We also investigate the effects of surface curvature and nonparallelism on crossflow instability. Computations performed for an infinite-swept cylinder show that while convex curvature stabilizes the three-dimensional boundary layer, nonparallelism is, in general, destabilizing and the net effect of the two depends upon meanflow and disturbance parameters. It is also found that concave surface curvature further destabilizes the crossflow instability.

  15. THE THREE-DIMENSIONAL EVOLUTION TO CORE COLLAPSE OF A MASSIVE STAR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Couch, Sean M.; Chatzopoulos, Emmanouil; Arnett, W. David

    2015-07-20

    We present the first three-dimensional (3D) simulation of the final minutes of iron core growth in a massive star, up to and including the point of core gravitational instability and collapse. We capture the development of strong convection driven by violent Si burning in the shell surrounding the iron core. This convective burning builds the iron core to its critical mass and collapse ensues, driven by electron capture and photodisintegration. The non-spherical structure and motion generated by 3D convection are substantial at the point of collapse, with convective speeds of several hundred km s^-1. We examine the impact of such physically realistic 3D initial conditions on the core-collapse supernova mechanism using 3D simulations including multispecies neutrino leakage and find that the enhanced post-shock turbulence resulting from 3D progenitor structure aids successful explosions. We conclude that non-spherical progenitor structure should not be ignored, and should have a significant and favorable impact on the likelihood for neutrino-driven explosions. In order to make simulating the 3D collapse of an iron core feasible, we were forced to make approximations to the nuclear network, making this effort only a first step toward accurate, self-consistent 3D stellar evolution models of the end states of massive stars.

  16. A large-scale dynamo and magnetoturbulence in rapidly rotating core-collapse supernovae.

    PubMed

    Mösta, Philipp; Ott, Christian D; Radice, David; Roberts, Luke F; Schnetter, Erik; Haas, Roland

    2015-12-17

    Magnetohydrodynamic turbulence is important in many high-energy astrophysical systems, where instabilities can amplify the local magnetic field over very short timescales. Specifically, the magnetorotational instability and dynamo action have been suggested as a mechanism for the growth of magnetar-strength magnetic fields (of 10^15 gauss and above) and for powering the explosion of a rotating massive star. Such stars are candidate progenitors of type Ic-bl hypernovae, which make up all supernovae that are connected to long γ-ray bursts. The magnetorotational instability has been studied with local high-resolution shearing-box simulations in three dimensions, and with global two-dimensional simulations, but it is not known whether turbulence driven by this instability can result in the creation of a large-scale, ordered and dynamically relevant field. Here we report results from global, three-dimensional, general-relativistic magnetohydrodynamic turbulence simulations. We show that hydromagnetic turbulence in rapidly rotating protoneutron stars produces an inverse cascade of energy. We find a large-scale, ordered toroidal field that is consistent with the formation of bipolar magnetorotationally driven outflows. Our results demonstrate that rapidly rotating massive stars are plausible progenitors for both type Ic-bl supernovae and long γ-ray bursts, and provide a viable mechanism for the formation of magnetars. Moreover, our findings suggest that rapidly rotating massive stars might lie behind potentially magnetar-powered superluminous supernovae.

  17. Parallelized Seeded Region Growing Using CUDA

    PubMed Central

    Park, Seongjin; Lee, Hyunna; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung

    2014-01-01

    This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with the intention of overcoming the theoretical weakness of the SRG algorithm, whose computation time is directly proportional to the size of the segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist the segmentation during massive CT screening tests. PMID:25309619
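    The sequential baseline that the CUDA variant is compared against is a breadth-first flood fill from the seed: a pixel joins the region if it is adjacent to the region and its intensity is close enough to the seed's. A minimal 2-D sketch (the threshold criterion and 4-connectivity are illustrative assumptions, since the paper's exact homogeneity test is not given in the abstract):

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a 4-connected region from seed while |pixel - seed value| <= tol."""
    h, w = len(image), len(image[0])
    sy, sx = seed
    base = image[sy][sx]
    region = {(sy, sx)}
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in region
                    and abs(image[ny][nx] - base) <= tol):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region
```

    The sequential frontier expansion is what makes runtime proportional to region size; a GPU version processes the whole frontier (or image) in parallel each iteration instead of one pixel at a time.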

  18. Rasdaman for Big Spatial Raster Data

    NASA Astrophysics Data System (ADS)

    Hu, F.; Huang, Q.; Scheele, C. J.; Yang, C. P.; Yu, M.; Liu, K.

    2015-12-01

    Spatial raster data have grown exponentially over the past decade. Recent advancements in data acquisition technology, such as remote sensing, have allowed us to collect massive observation data of various spatial resolutions and domain coverage. The volume, velocity, and variety of such spatial data, along with the computationally intensive nature of spatial queries, pose grand challenges to storage technologies for effective big data management. While high performance computing platforms (e.g., cloud computing) can be used to solve the computing-intensive issues in big data analysis, data has to be managed in a way that is suitable for distributed parallel processing. Recently, rasdaman (raster data manager) has emerged as a scalable and cost-effective database solution to store and retrieve massive multi-dimensional arrays, such as sensor, image, and statistics data. Within this paper, the pros and cons of using rasdaman to manage and query spatial raster data will be examined and compared with other common approaches, including file-based systems, relational databases (e.g., PostgreSQL/PostGIS), and NoSQL databases (e.g., MongoDB and Hive). Earth Observing System (EOS) data collected from NASA's Atmospheric Scientific Data Center (ASDC) will be used and stored in these selected database systems, and a set of spatial and non-spatial queries will be designed to benchmark their performance on retrieving large-scale, multi-dimensional arrays of EOS data. Lessons learnt from using rasdaman will be discussed as well.

  19. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  20. An adaptive front tracking technique for three-dimensional transient flows

    NASA Astrophysics Data System (ADS)

    Galaktionov, O. S.; Anderson, P. D.; Peters, G. W. M.; van de Vosse, F. N.

    2000-01-01

    An adaptive technique, based on both surface stretching and surface curvature analysis for tracking strongly deforming fluid volumes in three-dimensional flows is presented. The efficiency and accuracy of the technique are demonstrated for two- and three-dimensional flow simulations. For the two-dimensional test example, the results are compared with results obtained using a different tracking approach based on the advection of a passive scalar. Although for both techniques roughly the same structures are found, the resolution for the front tracking technique is much higher. In the three-dimensional test example, a spherical blob is tracked in a chaotic mixing flow. For this problem, the accuracy of the adaptive tracking is demonstrated by the volume conservation for the advected blob. Adaptive front tracking is suitable for simulation of the initial stages of fluid mixing, where the interfacial area can grow exponentially with time. The efficiency of the algorithm significantly benefits from parallelization of the code.
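    The stretching half of the adaptive criterion can be shown in miniature: after each advection step, new markers are inserted wherever a segment of the tracked front has stretched beyond a maximum length. The 2-D polyline, midpoint insertion rule and threshold below are illustrative assumptions; the paper's method also refines on curvature and works on 3-D surfaces.

```python
import math

def refine(points, max_len):
    """Insert midpoints until no segment of the polyline exceeds max_len."""
    out = [points[0]]
    for p, q in zip(points, points[1:]):
        seg = [p, q]
        while True:
            longest = max(math.dist(u, v) for u, v in zip(seg, seg[1:]))
            if longest <= max_len:
                break
            refined = [seg[0]]
            for u, v in zip(seg, seg[1:]):
                if math.dist(u, v) > max_len:
                    # Stretched segment: insert a marker at its midpoint
                    refined.append(((u[0] + v[0]) / 2, (u[1] + v[1]) / 2))
                refined.append(v)
            seg = refined
        out.extend(seg[1:])
    return out
```

    Because refinement of each segment is independent, this step parallelizes naturally, which is why the algorithm benefits from a parallel implementation as the interfacial area grows.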

  1. The Multi-dimensional Character of Core-collapse Supernovae

    DOE PAGES

    Hix, W. R.; Lentz, E. J.; Bruenn, S. W.; ...

    2016-03-01

    Core-collapse supernovae, the culmination of massive stellar evolution, are spectacular astronomical events and the principal actors in the story of our elemental origins. Our understanding of these events, while still incomplete, centers around a neutrino-driven central engine that is highly hydrodynamically unstable. Increasingly sophisticated simulations reveal a shock that stalls for hundreds of milliseconds before reviving. Though brought back to life by neutrino heating, the development of the supernova explosion is inextricably linked to multi-dimensional fluid flows. In this paper, the outcomes of three-dimensional simulations that include sophisticated nuclear physics and spectral neutrino transport are juxtaposed to learn about the nature of the three-dimensional fluid flow that shapes the explosion. Comparison is also made between the results of simulations in spherical symmetry from several groups, to give ourselves confidence in the understanding derived from this juxtaposition.

  2. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  3. Terminal Area Simulation System User's Guide - Version 10.0

    NASA Technical Reports Server (NTRS)

    Switzer, George F.; Proctor, Fred H.

    2014-01-01

    The Terminal Area Simulation System (TASS) is a three-dimensional, time-dependent, large eddy simulation model that has been developed for studies of wake vortex and weather hazards to aviation, along with other atmospheric turbulence, and cloud-scale weather phenomenology. This document describes the source code for TASS version 10.0 and provides users with the documentation needed to run the model. The source code is programmed in the Fortran language and is formulated to take advantage of vectorization and efficient multi-processor scaling for execution on massively parallel supercomputer clusters. The code contains different initialization modules allowing the study of aircraft wake vortex interaction with the atmosphere and ground, atmospheric turbulence, atmospheric boundary layers, precipitating convective clouds, hail storms, gust fronts, microburst windshear, supercell and mesoscale convective systems, tornadic storms, and ring vortices. The model is able to operate in either two or three dimensions with equations numerically formulated on a Cartesian grid. The primary output from TASS is time-dependent domain fields generated by the prognostic equations and diagnosed variables. This document will enable a user to understand the general logic of TASS, and will show how to configure and initialize the model domain. Also described are the formats of the input and output files, as well as the parameters that control the input and output.

  4. Datacube Services in Action, Using Open Source and Open Standards

    NASA Astrophysics Data System (ADS)

    Baumann, P.; Misev, D.

    2016-12-01

    Array Databases comprise novel, promising technology for massive spatio-temporal datacubes, extending the SQL paradigm of "any query, anytime" to n-D arrays. On the server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. The rasdaman ("raster data manager") system, which has pioneered Array Databases, is available in open source on www.rasdaman.org. Its declarative query language extends SQL with array operators that are optimized and parallelized on the server side. The rasdaman engine, which is part of OSGeo Live, is mature and in operational use on databases individually holding dozens of Terabytes. Further, the rasdaman concepts have strongly impacted international Big Data standards in the field, including the forthcoming MDA ("Multi-Dimensional Array") extension to ISO SQL, the OGC Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS) standards, and the forthcoming INSPIRE WCS/WCPS; in both OGC and INSPIRE, rasdaman serves as the WCS Core Reference Implementation. In our talk we present concepts, architecture, operational services, and standardization impact of open-source rasdaman, as well as the experience gained.
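    The core idea, evaluating one declarative query tile-by-tile over partitioned array storage so the tiles can be processed in parallel, can be sketched in plain Python. This is an illustration of the concept only, not rasdaman's query language or API; the tiling scheme and the example query ("mean over the time axis") are assumptions for the sketch.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Conceptual sketch (not the rasdaman API): a 3-D datacube (time, y, x) is
# stored as spatial tiles; a query is evaluated per tile in parallel and the
# partial results are assembled into the answer array.

def make_tiles(cube, tile_size):
    """Tile the two spatial axes; the time axis stays whole per tile."""
    tiles = {}
    for i in range(0, cube.shape[1], tile_size):
        for j in range(0, cube.shape[2], tile_size):
            tiles[(i, j)] = cube[:, i:i + tile_size, j:j + tile_size]
    return tiles

def query_mean_over_time(tiles, shape):
    """Evaluate 'mean over time' on each tile in parallel, then assemble."""
    out = np.empty(shape[1:])
    def run(item):
        (i, j), tile = item
        return (i, j), tile.mean(axis=0)   # the per-tile piece of the query
    with ThreadPoolExecutor() as ex:
        for (i, j), part in ex.map(run, tiles.items()):
            out[i:i + part.shape[0], j:j + part.shape[1]] = part
    return out

cube = np.arange(4 * 8 * 8, dtype=float).reshape(4, 8, 8)
result = query_mean_over_time(make_tiles(cube, 4), cube.shape)
assert np.allclose(result, cube.mean(axis=0))
```

    Because each tile's partial result is independent, the same decomposition that enables parallelism also enables distribution across server nodes.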

  5. Real-Time Compressive Sensing MRI Reconstruction Using GPU Computing and Split Bregman Methods

    PubMed Central

    Smith, David S.; Gore, John C.; Yankeelov, Thomas E.; Welch, E. Brian

    2012-01-01

    Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Being an iterative reconstruction technique, CS MRI reconstructions can be more time-consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with a graphics processing unit (GPU) computing platform. The increases in speed we find are similar to those we measure for matrix multiplication on this platform, suggesting that the split Bregman methods parallelize efficiently. We demonstrate that the combination of the rapid convergence of the split Bregman algorithm and the massively parallel strategy of GPU computing can enable real-time CS reconstruction of even acquisition data matrices of dimension 4096² or more, depending on available GPU VRAM. Reconstruction of two-dimensional data matrices of dimension 1024² and smaller took ~0.3 s or less, showing that this platform also provides very fast iterative reconstruction for small-to-moderate size images. PMID:22481908
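    To make the split Bregman idea concrete, here is a minimal 1-D total-variation denoising sketch in the Goldstein-Osher style. It is not the authors' GPU MRI code; the parameters and the periodic-boundary FFT solve are assumptions for the sketch. Each step is an elementwise or FFT operation, which is one reason the method maps so well onto massively parallel GPUs.

```python
import numpy as np

# Split Bregman sketch for 1-D TV denoising: min_u |grad u|_1 + (mu/2)||u - f||^2.
# Periodic finite differences let the u-update be solved exactly with the FFT.

def grad(u):                 # forward difference, periodic
    return np.roll(u, -1) - u

def div(d):                  # backward difference, periodic (grad^T = -div)
    return d - np.roll(d, 1)

def shrink(x, t):            # soft thresholding
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def split_bregman_tv(f, mu=10.0, lam=2.0, iters=100):
    n = len(f)
    # eigenvalues of the periodic negative Laplacian grad^T grad
    lap = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)
    u, d, b = f.copy(), np.zeros(n), np.zeros(n)
    for _ in range(iters):
        # u-update: (mu*I + lam*grad^T grad) u = mu*f - lam*div(d - b)
        rhs = mu * f - lam * div(d - b)
        u = np.real(np.fft.ifft(np.fft.fft(rhs) / (mu + lam * lap)))
        gu = grad(u)
        d = shrink(gu + b, 1.0 / lam)   # d-update: soft threshold
        b = b + gu - d                  # Bregman variable update
    return u

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0, 0.3, 0.8], 64)    # piecewise-constant signal
noisy = clean + 0.1 * rng.standard_normal(clean.size)
denoised = split_bregman_tv(noisy)
assert np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2)
```

    In the MRI setting the quadratic data term couples to the undersampled Fourier measurements rather than to the signal directly, but the alternation between a linear solve and an elementwise shrinkage is the same.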

  6. Acceleration of Particles Near Earth's Bow Shock

    NASA Astrophysics Data System (ADS)

    Sandroos, A.

    2012-12-01

    Collisionless shock waves, for example, near planetary bodies or driven by coronal mass ejections, are a key source of energetic particles in the heliosphere. When the solar wind hits Earth's bow shock, some of the incident particles get reflected back towards the Sun and are accelerated in the process. Reflected ions are responsible for the creation of a turbulent foreshock in quasi-parallel regions of Earth's bow shock. We present first results of foreshock macroscopic structure and of particle distributions upstream of Earth's bow shock, obtained with a new 2.5-dimensional self-consistent diffusive shock acceleration model. In the model particles' pitch angle scattering rates are calculated from Alfvén wave power spectra using quasilinear theory. Wave power spectra in turn are modified by particles' energy changes due to the scatterings. The new model has been implemented on the massively parallel simulation platform Corsair. We have used an earlier version of the model to study ion acceleration in a shock-shock interaction event (Hietala, Sandroos, and Vainio, 2012).

  7. Real-Time Compressive Sensing MRI Reconstruction Using GPU Computing and Split Bregman Methods.

    PubMed

    Smith, David S; Gore, John C; Yankeelov, Thomas E; Welch, E Brian

    2012-01-01

    Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Being an iterative reconstruction technique, CS MRI reconstructions can be more time-consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with a graphics processing unit (GPU) computing platform. The increases in speed we find are similar to those we measure for matrix multiplication on this platform, suggesting that the split Bregman methods parallelize efficiently. We demonstrate that the combination of the rapid convergence of the split Bregman algorithm and the massively parallel strategy of GPU computing can enable real-time CS reconstruction of even acquisition data matrices of dimension 4096² or more, depending on available GPU VRAM. Reconstruction of two-dimensional data matrices of dimension 1024² and smaller took ~0.3 s or less, showing that this platform also provides very fast iterative reconstruction for small-to-moderate size images.

  8. Micro-Macro Simulation of Viscoelastic Fluids in Three Dimensions

    NASA Astrophysics Data System (ADS)

    Rüttgers, Alexander; Griebel, Michael

    2012-11-01

    The development of the chemical industry resulted in various complex fluids that cannot be correctly described by classical fluid mechanics. For instance, this includes paint, engine oils with polymeric additives and toothpaste. We currently perform multiscale viscoelastic flow simulations for which we have coupled our three-dimensional Navier-Stokes solver NaSt3dGPF with the stochastic Brownian configuration field method on the micro-scale. In this method, we represent a viscoelastic fluid as a dumbbell system immersed in a three-dimensional Newtonian liquid which leads to a six-dimensional problem in space. The approach requires large computational resources and therefore depends on an efficient parallelisation strategy. Our flow solver is parallelised with a domain decomposition approach using MPI. It shows excellent scale-up results for up to 128 processors. In this talk, we present simulation results for viscoelastic fluids in square-square contractions due to their relevance for many engineering applications such as extrusion. Another aspect of the talk is the parallel implementation in NaSt3dGPF and the parallel scale-up and speed-up behaviour.

  9. Load balancing for massively-parallel soft-real-time systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hailperin, M.

    1988-09-01

    Global load balancing, if practical, would allow the effective use of massively-parallel ensemble architectures for large soft-real-time problems. The challenge is to replace quick global communication, which is impractical in a massively-parallel system, with statistical techniques. In this vein, the author proposes a novel approach to decentralized load balancing based on statistical time-series analysis. Each site estimates the system-wide average load using information about past loads of individual sites and attempts to equal that average. This estimation process is practical because the soft-real-time systems of interest naturally exhibit loads that are periodic, in a statistical sense akin to seasonality in econometrics. It is shown how this load-characterization technique can be the foundation for a load-balancing system in an architecture employing cut-through routing and an efficient multicast protocol.
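    A tiny sketch of the seasonal-estimation idea follows. The details (the synthetic load model, the delay handling) are my assumptions, not the paper's: each site only has *stale* load reports from its peers, but because loads are statistically periodic, it can predict a peer's current load from the most recent report taken at the same phase of the cycle and average those predictions instead of requiring quick global communication.

```python
import math

PERIOD = 8   # length of the statistical "season", in scheduling steps

def load_of(site, t):
    """Synthetic, exactly periodic load for site `site` at time `t`."""
    return 5.0 + 2.0 * math.sin(2.0 * math.pi * ((t + site) % PERIOD) / PERIOD)

def estimate_global_average(n_sites, now, delays):
    """Estimate the system-wide average load at time `now` using only
    reports delayed by `delays[site]` steps (no global synchronization)."""
    total = 0.0
    for site in range(n_sites):
        report_time = now - delays[site]
        # step back from the stale report to the most recent time whose
        # phase in the cycle matches `now`
        aligned = report_time - ((report_time - now) % PERIOD)
        total += load_of(site, aligned)   # stand-in for the stored report
    return total / n_sites

true_avg = sum(load_of(s, 100) for s in range(4)) / 4
est = estimate_global_average(4, 100, delays=[1, 3, 5, 7])
assert abs(est - true_avg) < 1e-9
```

    With a perfectly periodic load the phase-aligned estimate is exact; for the statistically periodic loads the paper targets, one would average several past cycles instead of reading a single aligned sample.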

  10. Evaluation of massively parallel sequencing for forensic DNA methylation profiling.

    PubMed

    Richards, Rebecca; Patel, Jayshree; Stevenson, Kate; Harbison, SallyAnn

    2018-05-11

    Epigenetics is an emerging area of interest in forensic science. DNA methylation, a type of epigenetic modification, can be applied to chronological age estimation, identical twin differentiation and body fluid identification. However, there is not yet an agreed, established methodology for targeted detection and analysis of DNA methylation markers in forensic research. Recently a massively parallel sequencing-based approach has been suggested. The use of massively parallel sequencing is well established in clinical epigenetics and is emerging as a new technology in the forensic field. This review investigates the potential benefits, limitations and considerations of this technique for the analysis of DNA methylation in a forensic context. The importance of a robust protocol, regardless of the methodology used, that minimises potential sources of bias is highlighted. This article is protected by copyright. All rights reserved.

  11. Supercomputing on massively parallel bit-serial architectures

    NASA Technical Reports Server (NTRS)

    Iobst, Ken

    1985-01-01

    Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.

  12. Classification by Using Multispectral Point Cloud Data

    NASA Astrophysics Data System (ADS)

    Liao, C. T.; Huang, H. H.

    2012-07-01

    Remote sensing images are generally recorded in two-dimensional format containing multispectral information. The semantic information is also clearly visualized, so ground features can be readily recognized and classified via supervised or unsupervised classification methods. Nevertheless, multispectral images depend strongly on light conditions, and their classification results lack three-dimensional semantic information. On the other hand, LiDAR has become a main technology for acquiring high-accuracy point cloud data. The advantages of LiDAR are a high data acquisition rate, independence from light conditions, and direct production of three-dimensional coordinates. However, compared with multispectral images, its disadvantage is the lack of multispectral information, which remains a challenge for ground feature classification from massive point cloud data. Consequently, by combining the advantages of both LiDAR and multispectral images, point cloud data with three-dimensional coordinates and multispectral information can provide an integrated solution for point cloud classification. Therefore, this research acquires visible light and near-infrared images via close range photogrammetry, matching the images automatically through a free online service to generate a multispectral point cloud. A three-dimensional affine coordinate transformation is then used to compare the data increment. Finally, height and color information are used as thresholds for classification.
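    The threshold classification described above can be sketched as follows. The class names, thresholds, and the NDVI-style vegetation index are illustrative assumptions, not the paper's actual rules: each 3-D point carries red and near-infrared values, and a height threshold plus a spectral index threshold split the points into classes.

```python
import numpy as np

# Illustrative sketch: classify multispectral 3-D points by height and an
# NDVI-style index computed from red and near-infrared channels.
# (Thresholds and class labels are assumptions for the example.)

def classify(points, red, nir, height_thresh=2.0, ndvi_thresh=0.3):
    """points: (N, 3) array of x, y, z; returns one label per point."""
    ndvi = (nir - red) / (nir + red + 1e-12)   # vegetation index
    labels = np.empty(len(points), dtype=object)
    high = points[:, 2] >= height_thresh       # height threshold
    green = ndvi >= ndvi_thresh                # spectral threshold
    labels[~high & ~green] = "ground"
    labels[~high & green] = "low vegetation"
    labels[high & ~green] = "building"
    labels[high & green] = "tree"
    return labels

pts = np.array([[0, 0, 0.2], [1, 0, 0.3], [0, 1, 5.0], [1, 1, 6.0]], float)
red = np.array([0.4, 0.1, 0.5, 0.1])
nir = np.array([0.4, 0.6, 0.5, 0.7])
labels = classify(pts, red, nir)
assert list(labels) == ["ground", "low vegetation", "building", "tree"]
```

    The point of fusing the two modalities is visible in the example: height alone cannot separate trees from buildings, and spectra alone cannot separate low vegetation from tree canopy.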

  13. Shift-and-invert parallel spectral transformation eigensolver: Massively parallel performance for density-functional based tight-binding

    DOE PAGES

    Zhang, Hong; Zapol, Peter; Dixon, David A.; ...

    2015-11-17

    The Shift-and-invert parallel spectral transformations (SIPs), a computational approach to solve sparse eigenvalue problems, is developed for massively parallel architectures with exceptional parallel scalability and robustness. The capabilities of SIPs are demonstrated by diagonalization of density-functional based tight-binding (DFTB) Hamiltonian and overlap matrices for single-wall metallic carbon nanotubes, diamond nanowires, and bulk diamond crystals. The largest (smallest) example studied is a 128,000 (2000) atom nanotube for which ~330,000 (~5600) eigenvalues and eigenfunctions are obtained in ~190 (~5) seconds when parallelized over 266,144 (16,384) Blue Gene/Q cores. Weak scaling and strong scaling of SIPs are analyzed and the performance of SIPs is compared with other novel methods. Different matrix ordering methods are investigated to reduce the cost of the factorization step, which dominates the time-to-solution at the strong scaling limit. As a result, a parallel implementation of assembling the density matrix from the distributed eigenvectors is demonstrated.
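    The shift-and-invert mechanism behind SIPs can be shown in a few lines. This is a conceptual sketch, not the SIPs implementation: inverse iteration with shift sigma converges to the eigenpair closest to sigma, so disjoint shifts ("spectral slices") can be solved independently, and hence in parallel, each slice paying one factorization.

```python
import numpy as np

# Shift-and-invert sketch: inverse iteration with shift sigma finds the
# eigenvalue of a symmetric matrix A nearest to sigma.

def shift_invert_eig(A, sigma, iters=200):
    n = A.shape[0]
    # one solve operator per slice; a real code would factor, not invert
    inv = np.linalg.inv(A - sigma * np.eye(n))
    v = np.ones(n) / np.sqrt(n)
    for _ in range(iters):
        v = inv @ v
        v /= np.linalg.norm(v)
    return v @ A @ v                    # Rayleigh quotient

# Symmetric test matrix with known spectrum: A = Q diag(vals) Q^T.
vals = np.array([-3.0, -1.0, 0.5, 1.2, 2.5, 4.0])
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag(vals) @ Q.T

for sigma in (-2.8, 1.1, 3.5):          # one "slice" per shift
    nearest = vals[np.argmin(np.abs(vals - sigma))]
    assert abs(shift_invert_eig(A, sigma) - nearest) < 1e-6
```

    In SIPs itself the slices are chosen to partition the wanted spectrum, each slice is handled by a subgroup of processes with a sparse factorization (hence the emphasis on matrix ordering), and the slice results are merged.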

  14. Shift-and-invert parallel spectral transformation eigensolver: Massively parallel performance for density-functional based tight-binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Hong; Zapol, Peter; Dixon, David A.

    The Shift-and-invert parallel spectral transformations (SIPs), a computational approach to solve sparse eigenvalue problems, is developed for massively parallel architectures with exceptional parallel scalability and robustness. The capabilities of SIPs are demonstrated by diagonalization of density-functional based tight-binding (DFTB) Hamiltonian and overlap matrices for single-wall metallic carbon nanotubes, diamond nanowires, and bulk diamond crystals. The largest (smallest) example studied is a 128,000 (2000) atom nanotube for which ~330,000 (~5600) eigenvalues and eigenfunctions are obtained in ~190 (~5) seconds when parallelized over 266,144 (16,384) Blue Gene/Q cores. Weak scaling and strong scaling of SIPs are analyzed and the performance of SIPs is compared with other novel methods. Different matrix ordering methods are investigated to reduce the cost of the factorization step, which dominates the time-to-solution at the strong scaling limit. As a result, a parallel implementation of assembling the density matrix from the distributed eigenvectors is demonstrated.

  15. Holography for Schrödinger backgrounds

    NASA Astrophysics Data System (ADS)

    Guica, Monica; Skenderis, Kostas; Taylor, Marika; van Rees, Balt C.

    2011-02-01

    We discuss holography for Schrödinger solutions of both topologically massive gravity in three dimensions and massive vector theories in ( d + 1) dimensions. In both cases the dual field theory can be viewed as a d-dimensional conformal field theory (two dimensional in the case of TMG) deformed by certain operators that respect the Schrödinger symmetry. These operators are irrelevant from the viewpoint of the relativistic conformal group but they are exactly marginal with respect to the non-relativistic conformal group. The spectrum of linear fluctuations around the background solutions corresponds to operators that are labeled by their scaling dimension and the lightcone momentum k v . We set up the holographic dictionary and compute 2-point functions of these operators both holographically and in field theory using conformal perturbation theory and find agreement. The counterterms needed for holographic renormalization are non-local in the v lightcone direction.

  16. Charged BTZ black holes in the context of massive gravity's rainbow

    NASA Astrophysics Data System (ADS)

    Hendi, S. H.; Panahiyan, S.; Upadhyay, S.; Eslam Panah, B.

    2017-04-01

    Banados, Teitelboim, and Zanelli (BTZ) black holes are excellent laboratories for studying black hole thermodynamics, which is a bridge between classical general relativity and the quantum nature of gravitation. In addition, three-dimensional gravity could have equipped us for exploring some of the ideas behind the two-dimensional conformal field theory based on the AdS3/CFT2 . Considering the significant interest in these regards, we examine charged BTZ black holes. We consider the system contains massive gravity with energy dependent spacetime to enrich the results. In order to make high curvature (energy) BTZ black holes more realistic, we modify the theory by energy dependent constants. We investigate thermodynamic properties of the solutions by calculating heat capacity and free energy. We also analyze thermal stability and study the possibility of the Hawking-Page phase transition. At last, we study the geometrical thermodynamics of these black holes and compare the results of various approaches.

  17. Dynamic grid refinement for partial differential equations on parallel computers

    NASA Technical Reports Server (NTRS)

    Mccormick, S.; Quinlan, D.

    1989-01-01

    The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids to provide adaptive resolution and fast solution of PDEs. An asynchronous version of FAC, called AFAC, that completely eliminates the bottleneck to parallelism is presented. This paper describes the advantage that this algorithm has in adaptive refinement for moving singularities on multiprocessor computers. This work is applicable to the parallel solution of two- and three-dimensional shock tracking problems.

  18. Massively Parallel Nanostructure Assembly Strategies for Sensing and Information Technology. Phase 2

    DTIC Science & Technology

    2013-05-25

    field. This work has focused on the synthesis of new functional materials and the development of high-throughput, facile methods to assemble...Hong (Seoul National University, Korea). Specifically, gapped nanowires (GNW) were identified as candidate materials for synthesis and assembly as...Throughout the course of this grant, we reported major accomplishments both in the synthesis and assembly of such structures. Synthetically, we report three

  19. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing

    PubMed Central

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-ya; Usami, Shin-ichi

    2016-01-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype–phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3–2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358
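    The 1.3-2.2% range follows directly from the counts in the abstract; spelling out the division is my reading, since the abstract does not show it explicitly: 5 of 227 screened children carried recessive mutations in USH1 genes, of whom 3 (the CDH23 and PCDH15 cases plus one MYO7A case) were judged clinically most likely to have USH1.

```python
# Reproducing the reported USH1 frequency range from the abstract's counts.
screened = 227        # unrelated non-syndromic deaf children screened
detected = 5          # patients with recessive mutations in USH1 genes
likely_ush1 = 3       # subset judged most likely to have USH1 clinically

upper = round(100 * detected / screened, 1)     # 2.2%
lower = round(100 * likely_ush1 / screened, 1)  # 1.3%
assert (lower, upper) == (1.3, 2.2)
```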

  20. Noninvasive prenatal screening for fetal common sex chromosome aneuploidies from maternal blood.

    PubMed

    Zhang, Bin; Lu, Bei-Yi; Yu, Bin; Zheng, Fang-Xiu; Zhou, Qin; Chen, Ying-Ping; Zhang, Xiao-Qing

    2017-04-01

    Objective: To explore the feasibility of high-throughput massively parallel genomic DNA sequencing technology for the noninvasive prenatal detection of fetal sex chromosome aneuploidies (SCAs). Methods: The study enrolled pregnant women who were prepared to undergo noninvasive prenatal testing (NIPT) in the second trimester. Cell-free fetal DNA (cffDNA) was extracted from the mother's peripheral venous blood and a high-throughput sequencing procedure was undertaken. Patients identified as having pregnancies associated with SCAs were offered prenatal fetal chromosomal karyotyping. Results: The study enrolled 10 275 pregnant women who were prepared to undergo NIPT. Of these, 57 pregnant women (0.55%) showed fetal SCA, including 27 with Turner syndrome (45,X), eight with Triple X syndrome (47,XXX), 12 with Klinefelter syndrome (47,XXY) and three with 47,XYY. Thirty-three pregnant women agreed to undergo fetal karyotyping and 18 had results consistent with NIPT, while 15 patients received a normal karyotype result. The overall positive predictive value of NIPT for detecting SCAs was 54.54% (18/33) and for detecting Turner syndrome (45,X) was 29.41% (5/17). Conclusion: NIPT can be used to identify fetal SCAs by analysing cffDNA using massively parallel genomic sequencing, although the accuracy needs to be improved particularly for Turner syndrome (45,X).
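    The positive predictive values in the abstract are simple ratios of karyotype-confirmed cases to NIPT-positive cases that underwent karyotyping; the small helper below reproduces them (the abstract appears to truncate rather than round to two decimals, so the sketch does the same).

```python
import math

def ppv_pct(confirmed, tested_positive):
    """Positive predictive value in percent, truncated to two decimals."""
    return math.floor(10000 * confirmed / tested_positive) / 100

assert ppv_pct(18, 33) == 54.54   # all SCAs: 18 confirmed of 33 karyotyped
assert ppv_pct(5, 17) == 29.41    # Turner syndrome (45,X) subset
```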

  1. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.

    PubMed

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi

    2016-05-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meru, Farzana; Juhász, Attila; Ilee, John D.

    The young star Elias 2–27 has recently been observed to possess a massive circumstellar disk with two prominent large-scale spiral arms. In this Letter, we perform three-dimensional Smoothed Particle Hydrodynamics simulations, radiative transfer modeling, synthetic ALMA imaging, and an unsharped masking technique to explore three possibilities for the origin of the observed structures: an undetected companion either internal or external to the spirals, and a self-gravitating disk. We find that a gravitationally unstable disk and a disk with an external companion can produce morphology that is consistent with the observations. In addition, for the latter, we find that the companion could be a relatively massive planetary-mass companion (≲10–13 M_Jup) and located at large radial distances (between ≈300–700 au). We therefore suggest that Elias 2–27 may be one of the first detections of a disk undergoing gravitational instabilities, or a disk that has recently undergone fragmentation to produce a massive companion.

  3. Parallel computation in a three-dimensional elastic-plastic finite-element analysis

    NASA Technical Reports Server (NTRS)

    Shivakumar, K. N.; Bigelow, C. A.; Newman, J. C., Jr.

    1992-01-01

    A CRAY parallel processing technique called autotasking was implemented in a three-dimensional elasto-plastic finite-element code. The technique was evaluated on two CRAY supercomputers, a CRAY 2 and a CRAY Y-MP. Autotasking was implemented in all major portions of the code, except the matrix equations solver. Compiler directives alone were not able to properly multitask the code; user-inserted directives were required to achieve better performance. It was noted that the connect time, rather than wall-clock time, was more appropriate to determine speedup in multiuser environments. For a typical example problem, a speedup of 2.1 (1.8 when the solution time was included) was achieved in a dedicated environment and 1.7 (1.6 with solution time) in a multiuser environment on a four-processor CRAY 2 supercomputer. The speedup on a three-processor CRAY Y-MP was about 2.4 (2.0 with solution time) in a multiuser environment.

  4. Distributed Function Mining for Gene Expression Programming Based on Fast Reduction.

    PubMed

    Deng, Song; Yue, Dong; Yang, Le-chan; Fu, Xiong; Feng, Ya-zhou

    2016-01-01

    For high-dimensional and massive data sets, traditional centralized gene expression programming (GEP) or improved algorithms lead to increased run-time and decreased prediction accuracy. To solve this problem, this paper proposes a new improved algorithm called distributed function mining for gene expression programming based on fast reduction (DFMGEP-FR). In DFMGEP-FR, fast attribution reduction in binary search algorithms (FAR-BSA) is proposed to quickly find the optimal attribution set, and the function consistency replacement algorithm is given to solve integration of the local function model. Thorough comparative experiments for DFMGEP-FR, centralized GEP and the parallel gene expression programming algorithm based on simulated annealing (parallel GEPSA) are included in this paper. For the waveform, mushroom, connect-4 and musk datasets, the comparative results show that the average time-consumption of DFMGEP-FR drops by 89.09%, 88.85%, 85.79% and 93.06%, respectively, in contrast to centralized GEP and by 12.5%, 8.42%, 9.62% and 13.75%, respectively, compared with parallel GEPSA. Six well-studied UCI test data sets demonstrate the efficiency and capability of our proposed DFMGEP-FR algorithm for distributed function mining.

  5. Externally calibrated parallel imaging for 3D multispectral imaging near metallic implants using broadband ultrashort echo time imaging.

    PubMed

    Wiens, Curtis N; Artz, Nathan S; Jang, Hyungseok; McMillan, Alan B; Reeder, Scott B

    2017-06-01

    To develop an externally calibrated parallel imaging technique for three-dimensional multispectral imaging (3D-MSI) in the presence of metallic implants. A fast, ultrashort echo time (UTE) calibration acquisition is proposed to enable externally calibrated parallel imaging techniques near metallic implants. The proposed calibration acquisition uses a broadband radiofrequency (RF) pulse to excite the off-resonance induced by the metallic implant, fully phase-encoded imaging to prevent in-plane distortions, and UTE to capture rapidly decaying signal. The performance of the externally calibrated parallel imaging reconstructions was assessed using phantoms and in vivo examples. Phantom and in vivo comparisons to self-calibrated parallel imaging acquisitions show that significant reductions in acquisition times can be achieved using externally calibrated parallel imaging with comparable image quality. Acquisition time reductions are particularly large for fully phase-encoded methods such as spectrally resolved fully phase-encoded three-dimensional (3D) fast spin-echo (SR-FPE), in which scan time reductions of up to 8 min were obtained. A fully phase-encoded acquisition with broadband excitation and UTE enabled externally calibrated parallel imaging for 3D-MSI, eliminating the need for repeated calibration regions at each frequency offset. Significant reductions in acquisition time can be achieved, particularly for fully phase-encoded methods like SR-FPE. Magn Reson Med 77:2303-2309, 2017. © 2016 International Society for Magnetic Resonance in Medicine.

  6. [Effect of the number and inclination of implant on stress distribution for mandibular full-arch fixed prosthesis].

    PubMed

    Zheng, Xiaoying; Li, Xiaomei; Tang, Zhen; Gong, Lulu; Wang, Dalin

    2014-06-01

    To study the effect of implant number and inclination on stress distribution in the implant and its surrounding bone with three-dimensional finite element analysis. A special denture was made for an edentulous mandible cast to collect three-dimensional finite element data. Three three-dimensional finite element models were established as follows. Model 1: 6 parallel implants; model 2: 4 parallel implants; model 3: 4 implants, with the two anterior implants parallel and the two distal implants tilted 30° distally. Among the three models, the maximum stress values found in anterior implants, posterior implants, and peri-implant bone were model 3

  7. Massively parallel and linear-scaling algorithm for second-order Moller–Plesset perturbation theory applied to the study of supramolecular wires

    DOE PAGES

    Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; ...

    2016-11-16

    Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the “resolution of the identity second-order Moller–Plesset perturbation theory” (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24,440 basis functions and 91,280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  8. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential of high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
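    The Amdahl's-law reasoning in the abstract can be written down in a few lines. With parallel fraction p and n cores, the speedup is 1 / ((1 - p) + p / n); a 12-fold speedup on 12 cores therefore implies an essentially fully parallelizable workload (p close to 1), since even p = 0.95 would cap the 12-core speedup below 8x.

```python
# Amdahl's law: serial fraction (1 - p) limits the achievable speedup.
def amdahl_speedup(n_cores, parallel_fraction):
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / n_cores)

assert abs(amdahl_speedup(12, 1.0) - 12.0) < 1e-12   # perfect scaling
assert amdahl_speedup(12, 0.95) < 8.0                # 5% serial work caps it
```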

  9. Parallel phase-sensitive three-dimensional imaging camera

    DOEpatents

    Smithpeter, Colin L.; Hoover, Eddie R.; Pain, Bedabrata; Hancock, Bruce R.; Nellums, Robert O.

    2007-09-25

    An apparatus is disclosed for generating a three-dimensional (3-D) image of a scene illuminated by a pulsed light source (e.g. a laser or light-emitting diode). The apparatus, referred to as a phase-sensitive 3-D imaging camera, utilizes a two-dimensional (2-D) array of photodetectors to receive light that is reflected or scattered from the scene, and processes the electrical output signal from each photodetector in the 2-D array in parallel using multiple modulators. Each modulator takes as inputs the photodetector output signal and a reference signal, with the reference signal provided to each modulator having a different phase delay. The output from each modulator is provided to a computational unit which can be used to generate intensity and range information for use in generating a 3-D image of the scene. The 3-D camera is capable of generating a 3-D image using a single pulse of light, or alternately can be used to generate subsequent 3-D images with each additional pulse of light.
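    As a rough illustration of the phase-sensitive principle (a generic four-phase demodulation sketch, not the patented circuit: the 90° phase steps, the A + B·cos sample model, and the modulation frequency are our assumptions), range can be recovered from four modulator outputs whose references are phase-delayed by 0°, 90°, 180°, and 270°:

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s

def range_from_phases(c0, c1, c2, c3, f_mod):
    """Recover range from four modulator outputs with reference phases
    delayed by 0, 90, 180 and 270 degrees.  Each sample is modeled as
    c_k = A + B*cos(phi + k*pi/2), where phi = 4*pi*f_mod*d / c is the
    round-trip phase shift for a target at distance d."""
    phi = math.atan2(c3 - c1, c0 - c2) % (2 * math.pi)
    return C_LIGHT * phi / (4 * math.pi * f_mod)

# Synthetic check: a target at 3 m with a 20 MHz reference
d, f = 3.0, 20e6
phi = 4 * math.pi * f * d / C_LIGHT
samples = [1.0 + 0.5 * math.cos(phi + k * math.pi / 2) for k in range(4)]
print(range_from_phases(*samples, f))  # ~3.0
```

    The differences c3 - c1 and c0 - c2 cancel the common offset A, so the arctangent yields the phase (and hence range) independently of ambient intensity.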

  10. Nebula: reconstruction and visualization of scattering data in reciprocal space.

    PubMed

    Reiten, Andreas; Chernyshov, Dmitry; Mathiesen, Ragnvald H

    2015-04-01

    Two-dimensional solid-state X-ray detectors can now operate at considerable data throughput rates that allow full three-dimensional sampling of scattering data from extended volumes of reciprocal space on second-to-minute timescales. For such experiments, simultaneous analysis and visualization allows for remeasurements and a more dynamic measurement strategy. New software, Nebula, is presented. It efficiently reconstructs X-ray scattering data, generates three-dimensional reciprocal space data sets that can be visualized interactively, and aims to enable real-time processing in high-throughput measurements by employing parallel computing on commodity hardware.

  11. Nebula: reconstruction and visualization of scattering data in reciprocal space

    PubMed Central

    Reiten, Andreas; Chernyshov, Dmitry; Mathiesen, Ragnvald H.

    2015-01-01

    Two-dimensional solid-state X-ray detectors can now operate at considerable data throughput rates that allow full three-dimensional sampling of scattering data from extended volumes of reciprocal space on second-to-minute timescales. For such experiments, simultaneous analysis and visualization allows for remeasurements and a more dynamic measurement strategy. New software, Nebula, is presented. It efficiently reconstructs X-ray scattering data, generates three-dimensional reciprocal space data sets that can be visualized interactively, and aims to enable real-time processing in high-throughput measurements by employing parallel computing on commodity hardware. PMID:25844083

  12. Storm Observations of Persistent Three-Dimensional Shoreline Morphology and Bathymetry Along a Geologically Influenced Shoreface Using X-Band Radar (BASIR)

    NASA Astrophysics Data System (ADS)

    Brodie, K. L.; McNinch, J. E.

    2008-12-01

    Accurate predictions of shoreline response to storms are contingent upon coastal-morphodynamic models effectively synthesizing the complex evolving relationships between beach topography, sandbar morphology, nearshore bathymetry, underlying geology, and the nearshore wave-field during storm events. Analysis of "pre" and "post" storm data sets has led to a common theory for event response of the nearshore system: pre-storm three-dimensional bar and shoreline configurations shift to two-dimensional, linear forms post-storm. A lack of data during storms has unfortunately left a gap in our knowledge of how the system explicitly changes during the storm event. This work presents daily observations of the beach and nearshore during high-energy storm events over a spatially extensive field site (order of magnitude: 10 km) using Bar and Swash Imaging Radar (BASIR), a mobile x-band radar system. The field site contains a complexity of features including shore-oblique bars and troughs, heterogeneous sediment, and an erosional hotspot. BASIR data provide observations of the evolution of shoreline and bar morphology, as well as nearshore bathymetry, throughout the storm events. Nearshore bathymetry is calculated using a bathymetry inversion from radar-derived wave celerity measurements. Preliminary results show a relatively stable but non-linear shore-parallel bar and a non-linear shoreline with megacusp and embayment features (order of magnitude: 1 km) that are enhanced during the wave events. Both the shoreline and shore-parallel bar undulate at a similar spatial frequency to the nearshore shore-oblique bar-field. Large-scale shore-oblique bars and troughs remain relatively static in position and morphology throughout the storm events.
The persistence of a three-dimensional shoreline, shore-parallel bar, and large-scale shore-oblique bars and troughs, contradicts the idea of event-driven shifts to two-dimensional morphology and suggests that beach and nearshore response to storms may be location specific. We hypothesize that the influence of underlying geology, defined by (1) the introduction of heterogeneous sediment and (2) the possible creation of shore-oblique bars and troughs in the nearshore, may be responsible for the persistence of three-dimensional forms and the associated shoreline hotspots during storm events.
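    Bathymetry inversion from radar-derived wave celerity typically rests on the linear dispersion relation for surface gravity waves, ω² = g k tanh(k h). Given a measured phase speed c = ω/k and wave period T, depth follows directly (a generic sketch of the principle, not the authors' processing chain):

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def depth_from_celerity(c, T):
    """Invert the linear dispersion relation c^2 = (g/k) tanh(k h)
    for water depth h, given phase speed c (m/s) and period T (s)."""
    k = 2 * math.pi / (c * T)   # wavenumber, from c = omega/k with omega = 2*pi/T
    x = c * c * k / G           # equals tanh(k h)
    if x >= 1.0:
        return math.inf         # deep-water limit: depth no longer resolvable
    return math.atanh(x) / k
```

    In deep water c saturates at its depth-independent limit, so the inversion is only informative where the waves "feel" the bottom, which is exactly the nearshore zone imaged by the radar.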

  13. Using CLIPS in the domain of knowledge-based massively parallel programming

    NASA Technical Reports Server (NTRS)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency with respect to parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about the application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with the C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering is discussed.

  14. Randomized Dynamic Mode Decomposition

    NASA Astrophysics Data System (ADS)

    Erichson, N. Benjamin; Brunton, Steven L.; Kutz, J. Nathan

    2017-11-01

    The dynamic mode decomposition (DMD) is an equation-free, data-driven matrix decomposition that is capable of providing accurate reconstructions of spatio-temporal coherent structures arising in dynamical systems. We present randomized algorithms to compute the near-optimal low-rank dynamic mode decomposition for massive datasets. Randomized algorithms are simple, accurate and able to ease the computational challenges arising with 'big data'. Moreover, randomized algorithms are amenable to modern parallel and distributed computing. The idea is to derive a smaller matrix from the high-dimensional input data matrix using randomness as a computational strategy. Then, the dynamic modes and eigenvalues are accurately learned from this smaller representation of the data, whereby the approximation quality can be controlled via oversampling and power iterations. Here, we present randomized DMD algorithms that are categorized by how many passes the algorithm takes through the data. Specifically, the single-pass randomized DMD does not require data to be stored for subsequent passes. Thus, it is possible to approximately decompose massive fluid flows (stored out of core memory, or not stored at all) using single-pass algorithms, which is infeasible with traditional DMD algorithms.
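    A minimal dense sketch of the idea (illustrative numpy code; `rdmd` and its parameters are our naming, not the authors' released implementation): sketch the column space of the snapshot matrix with a random test matrix, then perform standard exact DMD in the much smaller compressed space.

```python
import numpy as np

def rdmd(X, Y, r, p=10, rng=None):
    """Randomized DMD sketch (real-valued snapshots).  X and Y are
    snapshot matrices with Y ~ A X for an unknown linear operator A.
    Returns r DMD eigenvalues and the corresponding exact DMD modes;
    p is the oversampling parameter."""
    rng = np.random.default_rng(rng)
    # Compress: sketch the range of X with a random test matrix
    omega = rng.standard_normal((X.shape[1], r + p))
    Q, _ = np.linalg.qr(X @ omega)       # orthonormal basis for the sketch
    Xs, Ys = Q.T @ X, Q.T @ Y            # compressed snapshot matrices
    # Standard exact DMD on the compressed problem
    U, s, Vh = np.linalg.svd(Xs, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    Atilde = U.T @ Ys @ Vh.T / s         # reduced operator (columns scaled by 1/s)
    evals, W = np.linalg.eig(Atilde)
    modes = Q @ (Ys @ Vh.T / s) @ W      # lift DMD modes back to full space
    return evals, modes
```

    For data with exact low-rank dynamics the eigenvalues are recovered to machine precision; for noisy data, oversampling and power iterations (not shown) control the approximation quality.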

  15. Identification of rare X-linked neuroligin variants by massively parallel sequencing in males with autism spectrum disorder.

    PubMed

    Steinberg, Karyn Meltz; Ramachandran, Dhanya; Patel, Viren C; Shetty, Amol C; Cutler, David J; Zwick, Michael E

    2012-09-28

    Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions of these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects.

  16. Identification of rare X-linked neuroligin variants by massively parallel sequencing in males with autism spectrum disorder

    PubMed Central

    2012-01-01

    Background Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. Methods We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions of these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. Results We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. Conclusions These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects. PMID:23020841

  17. Hypercluster Parallel Processor

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela

    1992-01-01

    Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.

  18. Design of an open-ended plenoptic camera for three-dimensional imaging of dusty plasmas

    NASA Astrophysics Data System (ADS)

    Sanpei, Akio; Tokunaga, Kazuya; Hayashi, Yasuaki

    2017-08-01

    Herein, we report the design of a plenoptic imaging system for three-dimensional reconstruction of dusty plasmas based on an integral photography technique. This open-ended system is constructed from a multi-convex lens array and a typical reflex CMOS camera. We validated the design of the reconstruction system using known target particles. Additionally, the system has been applied to observations of fine particles floating in a horizontal, parallel-plate radio-frequency plasma, and it performs well over the parameter range of our dusty plasma experiment. We can identify the three-dimensional positions of dust particles from a single-exposure image obtained through one viewing port.

  19. Transition of a Three-Dimensional Unsteady Viscous Flow Analysis from a Research Environment to the Design Environment

    NASA Technical Reports Server (NTRS)

    Dorney, Suzanne; Dorney, Daniel J.; Huber, Frank; Sheffler, David A.; Turner, James E. (Technical Monitor)

    2001-01-01

    The advent of advanced computer architectures and parallel computing have led to a revolutionary change in the design process for turbomachinery components. Two- and three-dimensional steady-state computational flow procedures are now routinely used in the early stages of design. Unsteady flow analyses, however, are just beginning to be incorporated into design systems. This paper outlines the transition of a three-dimensional unsteady viscous flow analysis from the research environment into the design environment. The test case used to demonstrate the analysis is the full turbine system (high-pressure turbine, inter-turbine duct and low-pressure turbine) from an advanced turboprop engine.

  20. Massively parallel sparse matrix function calculations with NTPoly

    NASA Astrophysics Data System (ADS)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing functions of sparse, symmetric matrices. The theory of matrix functions is a well-developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization-free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication-avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
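    The flavor of such diagonalization-free polynomial methods can be sketched with a dense Newton–Schulz iteration for the matrix sign function (a generic textbook example in numpy, not NTPoly's distributed sparse API):

```python
import numpy as np

def matrix_sign(A, tol=1e-10, max_iter=100):
    """Matrix sign function via the Newton-Schulz polynomial iteration
    X <- X (3 I - X^2) / 2.  Only matrix multiplications are needed, so
    for sparse inputs the cost can stay linear in system size, which is
    the property diagonalization-free methods exploit.  A must be
    symmetric and nonsingular."""
    X = A / np.linalg.norm(A, 2)      # scale the spectrum into [-1, 1]
    I = np.eye(A.shape[0])
    for _ in range(max_iter):
        X_next = 0.5 * X @ (3 * I - X @ X)
        if np.linalg.norm(X_next - X, 'fro') < tol:
            return X_next
        X = X_next
    return X
```

    Each iteration applies a fixed cubic polynomial that pushes every eigenvalue toward ±1, so the result shares A's eigenvectors while its eigenvalues become the signs of A's eigenvalues.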

  1. Black hole perturbation under a 2+2 decomposition in the action

    NASA Astrophysics Data System (ADS)

    Ripley, Justin L.; Yagi, Kent

    2018-01-01

    Black hole perturbation theory is useful for studying the stability of black holes and calculating ringdown gravitational waves after the collision of two black holes. Most previous calculations were carried out at the level of the field equations instead of the action. In this work, we compute the Einstein-Hilbert action to quadratic order in linear metric perturbations about a spherically symmetric vacuum background in Regge-Wheeler gauge. Using a 2+2 splitting of spacetime, we expand the metric perturbations into a sum over scalar, vector, and tensor spherical harmonics, and dimensionally reduce the action to two dimensions by integrating over the two-sphere. We find that the axial perturbation degree of freedom is described by a two-dimensional massive vector action, and that the polar perturbation degree of freedom is described by a two-dimensional dilaton massive gravity action. Varying the dimensionally reduced actions, we rederive covariant and gauge-invariant master equations for the axial and polar degrees of freedom. Thus, the two-dimensional massive vector and massive gravity actions we derive by dimensionally reducing the perturbed Einstein-Hilbert action describe the dynamics of a well-studied physical system: the metric perturbations of a static black hole. The 2+2 formalism we present can be generalized to m+n-dimensional spacetime splittings, which may be useful in more generic situations, such as expanding metric perturbations in higher dimensional gravity. We provide a self-contained presentation of the m+n formalism for vacuum spacetime splittings.

  2. 3D Feature Extraction for Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Silver, Deborah

    1996-01-01

    Visualization techniques provide tools that help scientists identify observed phenomena in scientific simulation. To be useful, these tools must allow the user to extract regions, classify and visualize them, abstract them for simplified representations, and track their evolution. Object Segmentation provides a technique to extract and quantify regions of interest within these massive datasets. This article explores basic algorithms to extract coherent amorphous regions from two-dimensional and three-dimensional scalar unstructured grids. The techniques are applied to datasets from Computational Fluid Dynamics and those from Finite Element Analysis.

  3. Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

    1999-01-01

    A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.

  4. Three is much more than two in coarsening dynamics of cyclic competitions

    NASA Astrophysics Data System (ADS)

    Mitarai, Namiko; Gunnarson, Ivar; Pedersen, Buster Niels; Rosiek, Christian Anker; Sneppen, Kim

    2016-04-01

    The classical game of rock-paper-scissors has inspired experiments and spatial model systems that address the robustness of biological diversity. In particular, the game nicely illustrates that cyclic interactions allow multiple strategies to coexist for long-time intervals. When formulated in terms of a one-dimensional cellular automaton, the spatial distribution of strategies exhibits coarsening with algebraically growing domain size over time, while the two-dimensional version allows domains to break and thereby opens the possibility for long-time coexistence. We consider a quasi-one-dimensional implementation of the cyclic competition, and study the long-term dynamics as a function of rare invasions between parallel linear ecosystems. We find that increasing the complexity from two to three parallel subsystems allows a transition from complete coarsening to an active steady state where the domain size stays finite. We further find that this transition happens irrespective of whether the update is done in parallel for all sites simultaneously or done randomly in sequential order. In both cases, the active state is characterized by localized bursts of dislocations, followed by longer periods of coarsening. In the case of the parallel dynamics, we find that there is another phase transition between the active steady state and the coarsening state within the three-line system when the invasion rate between the subsystems is varied. We identify the critical parameter for this transition and show that the density of active boundaries has critical exponents that are consistent with the directed percolation universality class. On the other hand, numerical simulations with the random sequential dynamics suggest that the system may exhibit an active steady state as long as the invasion rate is finite.
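    A minimal single-line version of such a cyclic-competition lattice can be sketched as follows (our toy implementation of the generic rock-paper-scissors update with random sequential dynamics and periodic boundaries; not the authors' code):

```python
import random

def beats(a, b):
    """Cyclic dominance: state 0 beats 1, 1 beats 2, 2 beats 0."""
    return (b - a) % 3 == 1

def sweep(lat, rng):
    """One random-sequential sweep: each attempt picks a site and a random
    neighbour, and the site invades the neighbour if it beats it.  An
    invasion only moves a domain wall or annihilates a pair of walls, so
    the wall count never grows."""
    L = len(lat)
    for _ in range(L):
        i = rng.randrange(L)
        j = (i + rng.choice((-1, 1))) % L   # periodic boundary
        if beats(lat[i], lat[j]):
            lat[j] = lat[i]

def wall_count(lat):
    """Number of domain walls (neighbouring sites in different states)."""
    return sum(lat[i] != lat[(i + 1) % len(lat)] for i in range(len(lat)))
```

    Starting from a random strip, repeated sweeps coarsen the pattern: `wall_count` decreases as oppositely moving walls annihilate, which is the one-dimensional coarsening the abstract describes.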

  5. A universal counting of black hole microstates in AdS4

    NASA Astrophysics Data System (ADS)

    Azzurli, Francesco; Bobev, Nikolay; Crichigno, P. Marcos; Min, Vincent S.; Zaffaroni, Alberto

    2018-02-01

    Many three-dimensional N=2 SCFTs admit a universal partial topological twist when placed on hyperbolic Riemann surfaces. We exploit this fact to derive a universal formula which relates the planar limit of the topologically twisted index of these SCFTs and their three-sphere partition function. We then utilize this to account for the entropy of a large class of supersymmetric asymptotically AdS4 magnetically charged black holes in M-theory and massive type IIA string theory. In this context we also discuss novel AdS2 solutions of eleven-dimensional supergravity which describe the near horizon region of large new families of supersymmetric black holes arising from M2-branes wrapping Riemann surfaces.

  6. Inner mechanics of three-dimensional black holes.

    PubMed

    Detournay, Stéphane

    2012-07-20

    We investigate properties of the inner horizons of certain black holes in higher-derivative three-dimensional gravity theories. We focus on Bañados-Teitelboim-Zanelli and spacelike warped anti-de Sitter black holes, as well as on asymptotically warped de Sitter solutions exhibiting both a cosmological and a black hole horizon. We verify that a first law is satisfied at the inner horizon, in agreement with the proposal of Castro and Rodriguez [arXiv:1204.1284]. We then show that, in topologically massive gravity, the product of the areas of the inner and outer horizons fails to be independent of the mass, and we trace this to the diffeomorphism anomaly of the theory.

  7. The Safety of Aircraft Exposed to Electromagnetic Fields: HIRF Testing of Aircraft Using Direct Current Injection

    DTIC Science & Technology

    2007-06-01

    …massive RF power to the antenna feed points without providing an inductive path to earth. Given all the above challenges, and especially the… circuit theory currents are flowing, limited by the three parallel 50 ohm resistances and low inductive reactance. This plateaus at eigencurrent… relative to nett TEM cell input power has been calculated. [Figure 86: Expected power output from probe, neglecting probe inductance.] (DSTO-RR-0329)

  8. Parallel Planes Information Visualization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bush, Brian

    2015-12-26

    This software presents a user-provided multivariate dataset as an interactive three-dimensional visualization so that the user can explore the correlation between variables in the observations and the distribution of observations among the variables.

  9. Algorithmic commonalities in the parallel environment

    NASA Technical Reports Server (NTRS)

    Mcanulty, Michael A.; Wainer, Michael S.

    1987-01-01

    The ultimate aim of this project was to analyze procedures from substantially different application areas to discover what is either common or peculiar in the process of conversion to the Massively Parallel Processor (MPP). Three areas were identified: molecular dynamic simulation, production systems (rule systems), and various graphics and vision algorithms. To date, only selected graphics procedures have been investigated. They are the most readily available, and produce the most visible results. These include simple polygon patch rendering, raycasting against a constructive solid geometric model, and stochastic or fractal based textured surface algorithms. Only the simplest of conversion strategies, mapping a major loop to the array, has been investigated so far. It is not entirely satisfactory.

  10. Black hole entropy in massive Type IIA

    NASA Astrophysics Data System (ADS)

    Benini, Francesco; Khachatryan, Hrachya; Milan, Paolo

    2018-02-01

    We study the entropy of static dyonic BPS black holes in AdS4 in 4d N=2 gauged supergravities with vector and hyper multiplets, and how the entropy can be reproduced with a microscopic counting of states in the AdS/CFT dual field theory. We focus on the particular example of BPS black holes in AdS4 × S6 in massive Type IIA, whose dual three-dimensional boundary description is known and simple. To count the states in field theory we employ a supersymmetric topologically twisted index, which can be computed exactly with localization techniques. We find a perfect match at leading order.

  11. Massive Vector Fields in Rotating Black-Hole Spacetimes: Separability and Quasinormal Modes

    NASA Astrophysics Data System (ADS)

    Frolov, Valeri P.; Krtouš, Pavel; Kubizňák, David; Santos, Jorge E.

    2018-06-01

    We demonstrate the separability of the massive vector (Proca) field equation in general Kerr-NUT-AdS black-hole spacetimes in any number of dimensions, filling a long-standing gap in the literature. The obtained separated equations are studied in more detail for the four-dimensional Kerr geometry and the corresponding quasinormal modes are calculated. Two of the three independent polarizations of the Proca field are shown to emerge from the separation ansatz, and the results are found to be in excellent agreement with those of the recent numerical study where the full coupled partial differential equations were tackled without using the separability property.

  12. Massive Vector Fields in Rotating Black-Hole Spacetimes: Separability and Quasinormal Modes.

    PubMed

    Frolov, Valeri P; Krtouš, Pavel; Kubizňák, David; Santos, Jorge E

    2018-06-08

    We demonstrate the separability of the massive vector (Proca) field equation in general Kerr-NUT-AdS black-hole spacetimes in any number of dimensions, filling a long-standing gap in the literature. The obtained separated equations are studied in more detail for the four-dimensional Kerr geometry and the corresponding quasinormal modes are calculated. Two of the three independent polarizations of the Proca field are shown to emerge from the separation ansatz, and the results are found to be in excellent agreement with those of the recent numerical study where the full coupled partial differential equations were tackled without using the separability property.

  13. A two-dimensional Zn coordination polymer with a three-dimensional supramolecular architecture.

    PubMed

    Liu, Fuhong; Ding, Yan; Li, Qiuyu; Zhang, Liping

    2017-10-01

    The title compound, poly[bis{μ2-4,4'-bis[(1,2,4-triazol-1-yl)methyl]biphenyl-κ2N4:N4'}bis(nitrato-κO)zinc(II)], [Zn(NO3)2(C18H16N6)2]n, is a two-dimensional zinc coordination polymer constructed from 4,4'-bis[(1H-1,2,4-triazol-1-yl)methyl]-1,1'-biphenyl units. It was synthesized and characterized by elemental analysis and single-crystal X-ray diffraction. The ZnII cation is located on an inversion centre and is coordinated by two O atoms from two symmetry-related nitrate groups and four N atoms from four symmetry-related 4,4'-bis[(1H-1,2,4-triazol-1-yl)methyl]-1,1'-biphenyl ligands, forming a distorted octahedral {ZnN4O2} coordination geometry. The linear 4,4'-bis[(1H-1,2,4-triazol-1-yl)methyl]-1,1'-biphenyl ligand links two ZnII cations, generating two-dimensional layers parallel to the crystallographic (132) plane. The parallel layers are connected by C-H⋯O, C-H⋯N, C-H⋯π and π-π stacking interactions, resulting in a three-dimensional supramolecular architecture.

  14. Spectroscopic evidence for bulk-band inversion and three-dimensional massive Dirac fermions in ZrTe5

    PubMed Central

    Chen, Zhi-Guo; Chen, R. Y.; Zhong, R. D.; Schneeloch, John; Zhang, C.; Huang, Y.; Qu, Fanming; Yu, Rui; Gu, G. D.; Wang, N. L.

    2017-01-01

    Three-dimensional topological insulators (3D TIs) represent states of quantum matter in which surface states are protected by time-reversal symmetry and an inversion occurs between bulk conduction and valence bands. However, the bulk-band inversion, which is intimately tied to the topologically nontrivial nature of 3D TIs, has rarely been investigated by experiments. Besides, 3D massive Dirac fermions with nearly linear band dispersions were seldom observed in TIs. Recently, a van der Waals crystal, ZrTe5, was theoretically predicted to be a TI. Here, we report an infrared transmission study of a high-mobility [∼33,000 cm2/(V ⋅ s)] multilayer ZrTe5 flake at magnetic fields (B) up to 35 T. Our observation of a linear relationship between the zero-magnetic-field optical absorption and the photon energy, a bandgap of ∼10 meV and a √B dependence of the Landau level (LL) transition energies at low magnetic fields demonstrates 3D massive Dirac fermions with nearly linear band dispersions in this system. More importantly, the reemergence of the intra-LL transitions at magnetic fields higher than 17 T reveals the energy cross between the two zeroth LLs, which reflects the inversion between the bulk conduction and valence bands. Our results not only provide spectroscopic evidence for the TI state in ZrTe5 but also open up a new avenue for fundamental studies of Dirac fermions in van der Waals materials. PMID:28096330

  15. Spectroscopic evidence for bulk-band inversion and three-dimensional massive Dirac fermions in ZrTe5

    DOE PAGES

    Chen, Zhi -Guo; Chen, R. Y.; Zhong, R. D.; ...

    2017-01-17

    Three-dimensional topological insulators (3D TIs) represent states of quantum matter in which surface states are protected by time-reversal symmetry and an inversion occurs between bulk conduction and valence bands. However, the bulk-band inversion, which is intimately tied to the topologically nontrivial nature of 3D TIs, has rarely been investigated by experiments. Besides, 3D massive Dirac fermions with nearly linear band dispersions were seldom observed in TIs. Recently, a van der Waals crystal, ZrTe5, was theoretically predicted to be a TI. Here, we report an infrared transmission study of a high-mobility [~33,000 cm²/(V⋅s)] multilayer ZrTe5 flake at magnetic fields (B) up to 35 T. Our observation of a linear relationship between the zero-magnetic-field optical absorption and the photon energy, a bandgap of ~10 meV and a √B dependence of the Landau level (LL) transition energies at low magnetic fields demonstrates 3D massive Dirac fermions with nearly linear band dispersions in this system. More importantly, the reemergence of the intra-LL transitions at magnetic fields higher than 17 T reveals the energy crossing between the two zeroth LLs, which reflects the inversion between the bulk conduction and valence bands. Finally, our results not only provide spectroscopic evidence for the TI state in ZrTe5 but also open up a new avenue for fundamental studies of Dirac fermions in van der Waals materials.
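The √B dependence of the low-field LL transition energies reported above is the fingerprint of Dirac-like Landau levels. A minimal numerical sketch using the standard textbook spectrum for massive Dirac fermions, E_n = √(Δ² + 2eħv_F²Bn); the Fermi velocity and gap values below are illustrative assumptions, not parameters fitted in the paper:

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def landau_level_energy(n, B, v_F=5e5, gap_meV=10.0):
    """Energy (meV) of the n-th Landau level of a massive Dirac band,
    E_n = sqrt(Delta^2 + 2*e*hbar*v_F^2*B*n), with Delta = half the gap.
    v_F and gap_meV are illustrative placeholder values."""
    delta_J = 0.5 * gap_meV * 1e-3 * E_CHARGE   # half-gap in joules
    e_J = math.sqrt(delta_J**2 + 2.0 * E_CHARGE * HBAR * v_F**2 * B * n)
    return e_J / E_CHARGE * 1e3                  # back to meV
```

In the gapless limit the levels scale exactly as √(Bn), so quadrupling B doubles each transition energy; a finite gap bends the low-field behaviour away from pure √B, which is how the ~10 meV bandgap shows up in the data.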

  16. Early stage aggregation of a coarse-grained model of polyglutamine

    NASA Astrophysics Data System (ADS)

    Haaga, Jason; Gunton, J. D.; Buckles, C. Nadia; Rickman, J. M.

    2018-01-01

    In this paper, we study the early stages of aggregation of a model of polyglutamine (polyQ) for different repeat lengths (number of glutamine amino acid groups in the chain). In particular, we use the Large-scale Atomic/Molecular Massively Parallel Simulator to study a generic coarse-grained model proposed by Bereau and Deserno. We focus on the primary nucleation mechanism involved and find that our results for the initial self-assembly process are consistent with the two-dimensional classical nucleation theory of Kashchiev and Auer. More specifically, we find that with decreasing supersaturation, the oligomer fibril (protofibril) transforms from a one-dimensional β sheet to two-, three-, and higher layer β sheets as the critical nucleus size increases. We also show that the results are consistent with several predictions of their theory, including the dependence of the critical nucleus size on the supersaturation. Our results for the time dependence of the mass aggregation are in reasonable agreement with an approximate analytical solution of the filament theory by Knowles and collaborators that corresponds to an additional secondary nucleation arising from filament fragmentation. Finally, we study the dependence of the critical nucleus size on the repeat length of polyQ. We find that for the larger length polyglutamine chain that we study, the critical nucleus is a monomer, in agreement with experiment and in contrast to the case for the smaller chain, for which the smallest critical nucleus size is four.

  17. NASA high performance computing, communications, image processing, and data visualization-potential applications to medicine.

    PubMed

    Kukkonen, C A

    1995-06-01

    High-speed information processing technologies being developed and applied by the Jet Propulsion Laboratory for NASA and Department of Defense mission needs have potential dual uses in telemedicine and other medical applications. Fiber optic ground networks connected with microwave satellite links allow NASA to communicate with its astronauts in Earth orbit or on the moon, and with its deep space probes billions of miles away. These networks monitor the health of astronauts and/or robotic spacecraft. Similar communications technology will also allow patients to communicate with doctors anywhere on Earth. NASA space missions have science as a major objective. Science sensors have become so sophisticated that they can take more data than our scientists can analyze by hand. High performance computers--workstations, supercomputers and massively parallel computers--are being used to transform this data into knowledge. This is done using image processing, data visualization and other techniques to present the data--ones and zeros--in forms that a human analyst can readily relate to and understand. Medical sensors have also exploded in data output--witness CT scans, MRI, and ultrasound. This data must be presented in visual form, and computers will allow routine combination of many two-dimensional MRI images into three-dimensional reconstructions of organs that then can be fully examined by physicians. Emerging technologies such as neural networks that are being "trained" to detect craters on planets or incoming missiles amongst decoys can be used to identify microcalcifications in mammograms.

  18. A fast ultrasonic simulation tool based on massively parallel implementations

    NASA Astrophysics Data System (ADS)

    Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain

    2014-02-01

    This paper presents a CIVA optimized ultrasonic inspection simulation tool, which takes benefit of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, planar rectangular defects or side drilled holes of small diameter. Validations of the model accuracy and performance measurements are presented.

  19. Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1991-01-01

    The present evaluation of alternative designs for ordered radix-2 decimation-in-frequency FFT algorithms on massively parallel hypercube processors gives attention to reducing communication, which dominates computation time. A combination of the ordering and computational phases of the FFT is accordingly employed, in conjunction with sequence-to-processor maps which reduce communication. Two orderings, 'standard' and 'cyclic', in which the order of the transform is the same as that of the input sequence, can be implemented with ease on the Connection Machine (where orderings are determined by geometries and priorities). A parallel method for trigonometric coefficient computation is presented which does not employ trigonometric functions or interprocessor communication.
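To make the decimation-in-frequency structure concrete, here is a minimal serial sketch: each butterfly stage corresponds to exchanges across one hypercube dimension, and the closing bit-reversal pass is the "ordering" phase that the paper combines with computation. The function and its layout are ours, not from the paper:

```python
import cmath

def fft_dif(x):
    """Ordered radix-2 decimation-in-frequency FFT (serial sketch).
    Butterflies run first; DIF outputs emerge in bit-reversed order,
    so a final permutation restores natural ordering."""
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    a = list(x)
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for i in range(span):
                # twiddle factor W_{2*span}^i for this butterfly
                w = cmath.exp(-2j * cmath.pi * i / (2 * span))
                u, v = a[start + i], a[start + i + span]
                a[start + i] = u + v
                a[start + i + span] = (u - v) * w
        span //= 2
    # ordering phase: undo the bit-reversed output order
    bits = n.bit_length() - 1
    out = [0j] * n
    for i, val in enumerate(a):
        out[int(format(i, f"0{bits}b")[::-1], 2)] = val
    return out
```

On a hypercube, the `span`-distance exchanges map to nearest-neighbor links, which is why the choice of sequence-to-processor map controls the communication cost.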

  20. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE PAGES

    O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; ...

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  1. On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods

    PubMed Central

    Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.

    2011-01-01

    We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35- to 500-fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex, data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276

  2. Parallel Monte Carlo transport modeling in the context of a time-dependent, three-dimensional multi-physics code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Procassini, R.J.

    1997-12-31

    The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.

  3. MPSalsa Version 1.5: A Finite Element Computer Program for Reacting Flow Problems: Part 1 - Theoretical Development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devine, K.D.; Hennigan, G.L.; Hutchinson, S.A.

    1999-01-01

    The theoretical background for the finite element computer program, MPSalsa Version 1.5, is presented in detail. MPSalsa is designed to solve laminar or turbulent low Mach number, two- or three-dimensional incompressible and variable density reacting fluid flows on massively parallel computers, using a Petrov-Galerkin finite element formulation. The code has the capability to solve coupled fluid flow (with auxiliary turbulence equations), heat transport, multicomponent species transport, and finite-rate chemical reactions, and to solve coupled multiple Poisson or advection-diffusion-reaction equations. The program employs the CHEMKIN library to provide a rigorous treatment of multicomponent ideal gas kinetics and transport. Chemical reactions occurring in the gas phase and on surfaces are treated by calls to CHEMKIN and SURFACE CHEMKIN, respectively. The code employs unstructured meshes, using the EXODUS II finite element database suite of programs for its input and output files. MPSalsa solves both transient and steady flows by using fully implicit time integration, an inexact Newton method and iterative solvers based on preconditioned Krylov methods as implemented in the Aztec solver library.

  4. Modeling UV Radiation Feedback from Massive Stars. I. Implementation of Adaptive Ray-tracing Method and Tests

    NASA Astrophysics Data System (ADS)

    Kim, Jeong-Gyu; Kim, Woong-Tae; Ostriker, Eve C.; Skinner, M. Aaron

    2017-12-01

    We present an implementation of an adaptive ray-tracing (ART) module in the Athena hydrodynamics code that accurately and efficiently handles the radiative transfer involving multiple point sources on a three-dimensional Cartesian grid. We adopt a recently proposed parallel algorithm that uses nonblocking, asynchronous MPI communications to accelerate transport of rays across the computational domain. We validate our implementation through several standard test problems, including the propagation of radiation in vacuum and the expansions of various types of H II regions. Additionally, scaling tests show that the cost of a full ray trace per source remains comparable to that of the hydrodynamics update on up to ∼10³ processors. To demonstrate application of our ART implementation, we perform a simulation of star cluster formation in a marginally bound, turbulent cloud, finding that its star formation efficiency is 12% when both radiation pressure forces and photoionization by UV radiation are treated. We directly compare the radiation forces computed from the ART scheme with those from the M1 closure relation. Although the ART and M1 schemes yield similar results on large scales, the latter is unable to resolve the radiation field accurately near individual point sources.

  5. A forward-adjoint operator pair based on the elastic wave equation for use in transcranial photoacoustic computed tomography

    PubMed Central

    Mitsuhashi, Kenji; Poudel, Joemini; Matthews, Thomas P.; Garcia-Uribe, Alejandro; Wang, Lihong V.; Anastasio, Mark A.

    2017-01-01

    Photoacoustic computed tomography (PACT) is an emerging imaging modality that exploits optical contrast and ultrasonic detection principles to form images of the photoacoustically induced initial pressure distribution within tissue. The PACT reconstruction problem corresponds to an inverse source problem in which the initial pressure distribution is recovered from measurements of the radiated wavefield. A major challenge in transcranial PACT brain imaging is compensation for aberrations in the measured data due to the presence of the skull. Ultrasonic waves undergo absorption, scattering and longitudinal-to-shear wave mode conversion as they propagate through the skull. To properly account for these effects, a wave-equation-based inversion method should be employed that can model the heterogeneous elastic properties of the skull. In this work, a forward model based on a finite-difference time-domain discretization of the three-dimensional elastic wave equation is established and a procedure for computing the corresponding adjoint of the forward operator is presented. Massively parallel implementations of these operators employing multiple graphics processing units (GPUs) are also developed. The developed numerical framework is validated and investigated in computer-simulation and experimental phantom studies whose designs are motivated by transcranial PACT applications. PMID:29387291

  6. Massively parallel sequencing of 124 SNPs included in the precision ID identity panel in three East Asian minority ethnicities.

    PubMed

    Liu, Jing; Wang, Zheng; He, Guanglin; Zhao, Xueying; Wang, Mengge; Luo, Tao; Li, Chengtao; Hou, Yiping

    2018-07-01

    Massively parallel sequencing (MPS) technologies can sequence many targeted regions of multiple samples simultaneously and are gaining great interest in the forensic community. The Precision ID Identity Panel contains 90 autosomal SNPs and 34 upper Y-Clade SNPs, which was designed with small amplicons and optimized for forensic degraded or challenging samples. Here, 184 unrelated individuals from three East Asian minority ethnicities (Tibetan, Uygur and Hui) were analyzed using the Precision ID Identity Panel and the Ion PGM System. The sequencing performance and corresponding forensic statistical parameters of this MPS-SNP panel were investigated. The inter-population relationships and substructures among the three investigated populations and 30 worldwide populations were further investigated using PCA, MDS, cladogram and STRUCTURE. No significant deviation from Hardy-Weinberg equilibrium (HWE) and Linkage Disequilibrium (LD) tests was observed across all 90 autosomal SNPs. The combined matching probability (CMP) for Tibetan, Uygur and Hui were 2.5880 × 10⁻³³, 1.7480 × 10⁻³⁵ and 4.6326 × 10⁻³⁴ respectively, and the combined power of exclusion (CPE) were 0.999999386152271, 0.999999607712827 and 0.999999696360182 respectively. For the 34 Y-SNPs, only 16 haplogroups were obtained, but the haplogroup distributions differ among the three populations. Tibetans from the Sino-Tibetan population and Hui, a multi-ethnic admixture population, have genetic affinity with East Asian populations, while Uygurs, a Eurasian admixture population, have genetic components similar to South Asian populations and are distributed between East Asian and European populations. The aforementioned results suggest that the Precision ID Identity Panel is informative and polymorphic in the three investigated populations and could be used as an effective tool for human forensics. Copyright © 2018 Elsevier B.V. All rights reserved.
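The combined statistics quoted above follow the standard forensic formulas: per-locus match probabilities multiply across independent loci, and exclusion powers combine as one minus the product of per-locus non-exclusion probabilities (locus independence is exactly why the HWE and LD checks matter). A small sketch with made-up per-locus values, not figures from the study:

```python
def combined_match_probability(per_locus_mp):
    """CMP = product of per-locus match probabilities
    (assumes the loci are statistically independent)."""
    cmp_ = 1.0
    for p in per_locus_mp:
        cmp_ *= p
    return cmp_

def combined_power_of_exclusion(per_locus_pe):
    """CPE = 1 - prod(1 - PE_i) across independent loci: an individual
    is excluded if at least one locus excludes them."""
    miss = 1.0
    for pe in per_locus_pe:
        miss *= (1.0 - pe)
    return 1.0 - miss
```

With 90 loci, even modest per-locus probabilities compound to the vanishingly small CMPs (∼10⁻³³) reported above.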

  7. A parallel reaction-transport model applied to cement hydration and microstructure development

    NASA Astrophysics Data System (ADS)

    Bullard, Jeffrey W.; Enjolras, Edith; George, William L.; Satterfield, Steven G.; Terrill, Judith E.

    2010-03-01

    A recently described stochastic reaction-transport model on three-dimensional lattices is parallelized and is used to simulate the time-dependent structural and chemical evolution in multicomponent reactive systems. The model, called HydratiCA, uses probabilistic rules to simulate the kinetics of diffusion, homogeneous reactions and heterogeneous phenomena such as solid nucleation, growth and dissolution in complex three-dimensional systems. The algorithms require information only from each lattice site and its immediate neighbors, and this localization enables the parallelized model to exhibit near-linear scaling up to several hundred processors. Although applicable to a wide range of material systems, including sedimentary rock beds, reacting colloids and biochemical systems, validation is performed here on two minerals that are commonly found in Portland cement paste, calcium hydroxide and ettringite, by comparing their simulated dissolution or precipitation rates far from equilibrium to standard rate equations, and also by comparing simulated equilibrium states to thermodynamic calculations, as a function of temperature and pH. Finally, we demonstrate how HydratiCA can be used to investigate microstructure characteristics, such as spatial correlations between different condensed phases, in more complex microstructures.
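The near-linear parallel scaling described above comes from locality: each lattice site updates using only its own state and its immediate neighbors. A toy one-dimensional analogue of such a probabilistic, neighbor-local transport rule (our simplification, far from the full HydratiCA model):

```python
import random

def diffuse_step(conc, p_hop=0.1, rng=random):
    """One probabilistic diffusion sweep over a 1D lattice of particle
    counts. Each particle hops left or right with probability p_hop each,
    consulting only immediate neighbors -- the locality that makes the
    scheme easy to domain-decompose across processors."""
    n = len(conc)
    moves = [0] * n
    for i, count in enumerate(conc):
        for _ in range(count):
            r = rng.random()
            if r < p_hop and i > 0:              # hop left
                moves[i] -= 1
                moves[i - 1] += 1
            elif r < 2 * p_hop and i < n - 1:    # hop right
                moves[i] -= 1
                moves[i + 1] += 1
    return [c + m for c, m in zip(conc, moves)]
```

In a parallel setting, only the sites on subdomain boundaries need their counts exchanged each sweep, so communication stays proportional to surface area rather than volume.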

  8. Spatial structure of ion beams in an expanding plasma

    NASA Astrophysics Data System (ADS)

    Aguirre, E. M.; Scime, E. E.; Thompson, D. S.; Good, T. N.

    2017-12-01

    We report spatially resolved ion velocity distribution function (IVDF) measurements, both perpendicular and parallel to the magnetic field, in an expanding argon helicon plasma. The parallel IVDFs, obtained through laser induced fluorescence (LIF), show an ion beam with v ≈ 8000 m/s flowing downstream and confined to the center of the discharge. The ion beam is measurable for tens of centimeters along the expansion axis before the LIF signal fades, likely a result of metastable quenching of the beam ions. The parallel ion beam velocity slows in agreement with expectations for the measured parallel electric field. The perpendicular IVDFs show an ion population with a radially outward flow that increases with distance from the plasma axis. Structures aligned to the expanding magnetic field appear in the DC electric field, the electron temperature, and the plasma density in the plasma plume. These measurements demonstrate that at least two-dimensional, and perhaps fully three-dimensional, models are needed to accurately describe the spontaneous acceleration of ion beams in expanding plasmas.

  9. First experiments probing the collision of parallel magnetic fields using laser-produced plasmas

    DOE PAGES

    Rosenberg, M. J.; Li, C. K.; Fox, W.; ...

    2015-04-08

    Novel experiments to study the strongly-driven collision of parallel magnetic fields in β~10, laser-produced plasmas have been conducted using monoenergetic proton radiography. These experiments were designed to probe the process of magnetic flux pileup, which has been identified in prior laser-plasma experiments as a key physical mechanism in the reconnection of anti-parallel magnetic fields when the reconnection inflow is dominated by strong plasma flows. In the present experiments using colliding plasmas carrying parallel magnetic fields, the magnetic flux is found to be conserved and slightly compressed in the collision region. Two-dimensional (2D) particle-in-cell (PIC) simulations predict a stronger flux compression and amplification of the magnetic field strength, and this discrepancy is attributed to the three-dimensional (3D) collision geometry. Future experiments may drive a stronger collision and further explore flux pileup in the context of the strongly-driven interaction of magnetic fields.

  10. Sequential Feedback Scheme Outperforms the Parallel Scheme for Hamiltonian Parameter Estimation.

    PubMed

    Yuan, Haidong

    2016-10-14

    Measurement and estimation of parameters are essential for science and engineering, where the main quest is to find the highest achievable precision with the given resources and to design schemes to attain it. Two schemes, the sequential feedback scheme and the parallel scheme, are usually studied in quantum parameter estimation. While the sequential feedback scheme represents the most general scheme, it remains unknown whether it can outperform the parallel scheme for any quantum estimation tasks. In this Letter, we show that the sequential feedback scheme has a threefold improvement over the parallel scheme for Hamiltonian parameter estimation on two-dimensional systems, and an order of O(d+1) improvement for Hamiltonian parameter estimation on d-dimensional systems. We also show that, contrary to the conventional belief, it is possible to simultaneously achieve the highest precision for estimating all three components of a magnetic field, which sets a benchmark on the local precision limit for the estimation of a magnetic field.

  11. High-performance computing — an overview

    NASA Astrophysics Data System (ADS)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

  12. Performance Evaluation in Network-Based Parallel Computing

    NASA Technical Reports Server (NTRS)

    Dezhgosha, Kamyar

    1996-01-01

    Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine), which is a software system for linking clusters of machines. Second, a set of three basic applications was selected: a parallel search, a parallel sort, and a parallel matrix multiplication. These application programs were implemented in the C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time, which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism, which requires less frequent communication between processes, will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
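The finding that communication latency caps performance can be captured in a generic speedup model (a standard textbook-style sketch, not formulas from the report): ideal work division plus a fixed communication cost per exchange step.

```python
def speedup(t_serial, t_parallel):
    """Observed speedup from measured elapsed times."""
    return t_serial / t_parallel

def predicted_speedup(t_serial, p, t_comm_per_step=0.0, steps=0):
    """Toy model: compute time divides perfectly over p processors, but
    each of `steps` communication rounds adds a fixed latency cost.
    Coarse-grain programs (fewer, larger exchanges) stay near speedup p."""
    return t_serial / (t_serial / p + t_comm_per_step * steps)
```

For example, a job with frequent small messages (many `steps`) predicts a much lower speedup than the same job restructured into a few large exchanges, which is the coarse-grain effect the report observes.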

  13. Fast I/O for Massively Parallel Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew T.

    1996-01-01

    The two primary goals for this report were the design, construction and modeling of parallel disk arrays for scientific visualization and animation, and a study of the I/O requirements of highly parallel applications. In addition, we describe further work on the parallel display systems required to project and animate the very high-resolution frames resulting from our supercomputing simulations in ocean circulation and compressible gas dynamics.
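The core idea behind a parallel disk array is striping: spreading consecutive logical blocks round-robin across disks so that a large frame can be read or written by all disks at once. A minimal sketch of that mapping (our illustration of the general technique, not the report's specific design):

```python
def stripe_location(block, n_disks):
    """Map a logical block number to (disk, offset-on-disk) under
    simple round-robin striping: consecutive blocks land on
    consecutive disks, so a large sequential read engages every disk."""
    return block % n_disks, block // n_disks
```

Reading blocks 0..7 from a 4-disk array touches each disk exactly twice, giving (ideally) a 4x bandwidth gain over a single disk.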

  14. Performance Analysis and Optimization on the UCLA Parallel Atmospheric General Circulation Model Code

    NASA Technical Reports Server (NTRS)

    Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos

    1996-01-01

    An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, interprocessor communication cost, load balance and issues affecting single-node code performance are discussed.

  15. LDRD final report on massively-parallel linear programming : the parPCx system.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.

  16. Comparison of the MPP with other supercomputers for LANDSAT data processing

    NASA Technical Reports Server (NTRS)

    Ozga, Martin

    1987-01-01

    The Massively Parallel Processor (MPP) is compared to the CRAY X-MP and the CYBER-205 for LANDSAT data processing. The maximum likelihood classification algorithm is the basis for comparison, since this algorithm is simple to implement and vectorizes very well. The algorithm was implemented on all three machines and tested by classifying the same full scene of LANDSAT multispectral scanner data. Timings are compared, as well as features of the machines and available software.
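Maximum likelihood classification assigns each pixel to the class whose Gaussian model gives it the highest log-likelihood, which is why it vectorizes so well: every pixel is an independent evaluation. A minimal sketch with a diagonal-covariance simplification (full-covariance Gaussians are the usual formulation; the class statistics below are invented for illustration):

```python
import math

def gaussian_ml_classify(pixel, classes):
    """Assign a pixel (tuple of band values) to the class with the highest
    Gaussian log-likelihood. `classes` maps name -> (means, variances),
    one value per spectral band; covariances are assumed diagonal here."""
    best, best_ll = None, -math.inf
    for name, (means, variances) in classes.items():
        ll = 0.0
        for x, mu, var in zip(pixel, means, variances):
            ll += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        if ll > best_ll:
            best, best_ll = name, ll
    return best

# Hypothetical two-band class statistics (means, variances) per class:
CLASSES = {"water": ((20.0, 15.0), (4.0, 4.0)),
           "soil":  ((80.0, 90.0), (25.0, 25.0))}
```

Because the per-class statistics are fixed for a whole scene, the inner loop reduces to a handful of multiply-adds per pixel, the pattern that maps cleanly onto both vector pipelines and SIMD arrays like the MPP.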

  17. High Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

    NASA Technical Reports Server (NTRS)

    Felippa, C. A.; Farhat, C.; Lanteri, S.; Maman, N.; Piperno, S.; Gumaste, U.

    1994-01-01

    In order to predict the dynamic response of a flexible structure in a fluid flow, the equations of motion of the structure and the fluid must be solved simultaneously. In this paper, we present several partitioned procedures for time-integrating this coupled problem and discuss their merits in terms of accuracy, stability, heterogeneous computing, I/O transfers, subcycling, and parallel processing. All theoretical results are derived for a one-dimensional piston model problem with a compressible flow, because the complete three-dimensional aeroelastic problem is difficult to analyze mathematically. However, the insight gained from the analysis of the coupled piston problem and the conclusions drawn from its numerical investigation are confirmed with the numerical simulation of the two-dimensional transient aeroelastic response of a flexible panel in a transonic nonlinear Euler flow regime.
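A partitioned (staggered) time integrator in its simplest form advances one field using the other's lagged value, then advances the second field with the fresh value. A sketch on a toy coupled system u' = -k v, v' = k u (our example, far simpler than the piston problem analyzed in the paper):

```python
def staggered_step(u, v, dt, k=1.0):
    """One conventional serial staggered step for the toy coupled system
    u' = -k*v, v' = k*u: field u advances with the lagged v, then field v
    advances with the freshly updated u (semi-implicit Euler)."""
    u_new = u - dt * k * v       # fluid-like field sees old structure state
    v_new = v + dt * k * u_new   # structure-like field sees new fluid state
    return u_new, v_new
```

Using the updated value in the second substep is what distinguishes this from a fully explicit (parallel) exchange; for this toy system it keeps the coupled "energy" u² + v² bounded over long runs rather than growing.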

  18. Approach and separation of quantum vortices with balanced cores

    NASA Astrophysics Data System (ADS)

    Kerr, Robert M.; Rorai, C.; Skipper, J.; Sreenivasan, K. R.

    2014-11-01

    Using two innovations, smooth but different scaling laws for the reconnection of pairs of initially orthogonal and anti-parallel quantum vortices are obtained using the three-dimensional Gross-Pitaevskii equations. For the anti-parallel case, the scaling laws just before and after reconnection obey the dimensional δ ~ |t − t_r|^{1/2} prediction, with temporal symmetry about the reconnection time t_r and physical-space symmetry about x_r, the mid-point between the vortices, with extensions forming the edges of an equilateral pyramid. For all of the orthogonal cases, before reconnection δ_in ~ (t_r − t)^{1/3} and after reconnection δ_out ~ (t − t_r)^{2/3}, which are respectively slower and faster than the dimensional prediction. In these cases, the reconnection takes place in a plane defined by the directions of the curvature and vorticity.

  19. The UPSF code: a metaprogramming-based high-performance automatically parallelized plasma simulation framework

    NASA Astrophysics Data System (ADS)

    Gao, Xiatian; Wang, Xiaogang; Jiang, Binhao

    2017-10-01

    UPSF (Universal Plasma Simulation Framework) is a new plasma simulation code designed for maximum flexibility using cutting-edge techniques supported by the C++17 standard. Through metaprogramming, UPSF provides arbitrary-dimensional data structures and methods to support various kinds of plasma simulation models, such as Vlasov, particle-in-cell (PIC), fluid, and Fokker-Planck models, together with their variants and hybrid methods. Because of this metaprogramming approach, a single code can be applied to systems of arbitrary dimension with no loss of performance. UPSF can also automatically parallelize the distributed data structure and accelerate matrix and tensor operations with BLAS. A three-dimensional particle-in-cell code has been developed based on UPSF. Two test cases, Landau damping and the Weibel instability for the electrostatic and electromagnetic cases respectively, are presented to demonstrate the validity and performance of the UPSF code.

  20. A new conformal absorbing boundary condition for finite element meshes and parallelization of FEMATS

    NASA Technical Reports Server (NTRS)

    Chatterjee, A.; Volakis, J. L.; Nguyen, J.; Nurnberger, M.; Ross, D.

    1993-01-01

    Some of the progress toward the development and parallelization of an improved version of the finite element code FEMATS is described. This is a finite element code for computing the scattering by arbitrarily shaped three-dimensional composite scatterers. The following tasks were worked on during the report period: (1) new absorbing boundary conditions (ABCs) for truncating the finite element mesh; (2) mixed mesh termination schemes; (3) hierarchical elements and multigridding; (4) parallelization; and (5) various modeling enhancements (antenna feeds, anisotropy, and higher order GIBC).

  1. Three-dimensional Hybrid Simulation Study of Anisotropic Turbulence in the Proton Kinetic Regime

    NASA Astrophysics Data System (ADS)

    Vasquez, Bernard J.; Markovskii, Sergei A.; Chandran, Benjamin D. G.

    2014-06-01

    Three-dimensional numerical hybrid simulations with particle protons and quasi-neutralizing fluid electrons are conducted for a freely decaying turbulence that is anisotropic with respect to the background magnetic field. The turbulence evolution is determined both by the combined root-mean-square (rms) amplitude of the fluctuating proton bulk velocity and magnetic field and by the ratio of perpendicular to parallel wavenumbers. This kind of relationship has been considered in the past with regard to interplanetary turbulence. The fluctuations nonlinearly evolve to a turbulent phase whose net wave vector anisotropy is usually more perpendicular than the initial one, irrespective of the initial ratio of perpendicular to parallel wavenumbers. Self-similar anisotropy evolution is found as a function of the rms amplitude and parallel wavenumber. Proton heating rates in the turbulent phase vary strongly with the rms amplitude but only weakly with the initial wave vector anisotropy. Even in the limit where wave vectors are confined to the plane perpendicular to the background magnetic field, the heating rate remains close to the corresponding case with finite parallel wave vector components. Simulation results obtained as a function of the ratio of proton plasma to background magnetic pressure, β_p, in the range 0.1-0.5 show that the wave vector anisotropy also depends weakly on β_p.

  2. Numerical Simulation of Dual-Mode Scramjet Combustors

    NASA Technical Reports Server (NTRS)

    Rodriguez, C. G.; Riggins, D. W.; Bittner, R. D.

    2000-01-01

    Results of a numerical investigation of a three-dimensional dual-mode scramjet isolator-combustor flow-field are presented. Specifically, the effect of wall cooling on upstream interaction and flow structure is examined for a case assuming jet-to-jet symmetry within the combustor. Comparisons are made with available experimental wall pressures. The full half-duct of the isolator-combustor is then modeled in order to study the influence of side-walls. Large-scale three-dimensionality is observed in the flow, with massive separation forward on the side-walls of the duct. A brief review of convergence-acceleration techniques useful in dual-mode simulations is presented, followed by recommendations regarding the development of a reliable and unambiguous experimental database for guiding CFD code assessments in this area.

  3. Block iterative restoration of astronomical images with the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Heap, Sara R.; Lindler, Don J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 121 x 121 linear systems. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images.

  4. A Massively Parallel Code for Polarization Calculations

    NASA Astrophysics Data System (ADS)

    Akiyama, Shizuka; Höflich, Peter

    2001-03-01

    We present an implementation of our Monte-Carlo radiation transport method for rapidly expanding, NLTE atmospheres for massively parallel computers which utilizes both the distributed and shared memory models. This allows us to take full advantage of the fast communication and low latency inherent to nodes with multiple CPUs, and to stretch the limits of scalability with the number of nodes compared to a version which is based on the shared memory model. Test calculations on a local 20-node Beowulf cluster with dual CPUs showed an improved scalability by about 40%.

  5. Routing performance analysis and optimization within a massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  6. Progressive Vector Quantization on a massively parallel SIMD machine with application to multispectral image data

    NASA Technical Reports Server (NTRS)

    Manohar, Mareboyana; Tilton, James C.

    1994-01-01

    A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.
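
    The multi-level idea can be sketched as residual vector quantization: each level quantizes what the previous level left behind, so a partial decode gives a coarse image and further levels refine it. Codebooks below are drawn from the data plus a zero codeword (an assumption for the sketch; the paper trains codebooks properly):

```python
import numpy as np

# Toy sketch of progressive (multi-level residual) VQ on synthetic
# 4-dimensional vectors standing in for small pixel blocks.
rng = np.random.default_rng(5)
vectors = rng.random((1000, 4))

def full_search_vq(data, codebook):
    """Nearest codeword per vector (full-search VQ)."""
    d = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return codebook[d.argmin(1)]

residual, errors = vectors.copy(), []
for level in range(3):
    # Codebook: 15 residual samples plus the zero vector; the zero
    # codeword guarantees the residual norm never grows.
    picks = residual[rng.choice(len(residual), 15, replace=False)]
    codebook = np.vstack([np.zeros((1, 4)), picks])
    residual = residual - full_search_vq(residual, codebook)
    errors.append(np.linalg.norm(residual, axis=1).mean())
print(errors[0] > errors[-1])   # residual shrinks as levels are added
```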

  7. Modelling of Heat and Moisture Loss Through NBC Ensembles

    DTIC Science & Technology

    1991-11-01

    The report models the heat and moisture transport through various NBC clothing ensembles. The analysis simplifies the three-dimensional physical problem of clothing on a person to a one-dimensional problem of flow through parallel layers of clothing and air. Body temperatures are calculated based on prescribed work rates, ambient conditions, and clothing properties. Sweat response and respiration rates are estimated from empirical data.

  8. Regional-scale calculation of the LS factor using parallel processing

    NASA Astrophysics Data System (ADS)

    Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

    2015-05-01

    With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementations of algorithms for computing the LS factor are becoming a bottleneck. In this paper, a parallel processing model based on the message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length, and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the characteristics of the algorithms, including a decomposition method that maintains the integrity of the results, an optimized workflow that avoids exporting unnecessary intermediate data, and a buffer-communication-computation strategy that improves communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
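
    Once slope length and slope are known, the LS factor itself is a purely local, per-cell formula, which is why it decomposes so cleanly across processes. A hedged sketch under one common USLE formulation (metric unit plot length 22.13 m, exponent m = 0.5; the paper's flow routing and MPI halo exchange are not reproduced):

```python
import numpy as np

# Per-cell LS factor under an assumed common USLE form; row-strip
# decomposition mimics the per-process domain split used with MPI.
def ls_factor(slope_len_m, slope_rad, m=0.5):
    s = np.sin(slope_rad)
    return (slope_len_m / 22.13) ** m * (65.41 * s**2 + 4.56 * s + 0.065)

rng = np.random.default_rng(1)
length = rng.uniform(1.0, 300.0, size=(400, 400))   # synthetic slope length (m)
slope = rng.uniform(0.0, 0.4, size=(400, 400))      # synthetic slope (rad)

# A local operation splits into independent strips with no communication.
strips = [ls_factor(l, s) for l, s in
          zip(np.array_split(length, 4), np.array_split(slope, 4))]
full = ls_factor(length, slope)
print(np.allclose(np.vstack(strips), full))          # True
```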

  9. Massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Fung, L. W. (Inventor)

    1983-01-01

    An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array is described. It comprises a large number (e.g., 16,384 in a 128 x 128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered parallel data, including spatial translation by shifting or sliding of bits vertically or horizontally to neighboring processing elements.

  10. Compact holographic optical neural network system for real-time pattern recognition

    NASA Astrophysics Data System (ADS)

    Lu, Taiwei; Mintzer, David T.; Kostrzewski, Andrew A.; Lin, Freddie S.

    1996-08-01

    One of the important characteristics of artificial neural networks is their capability for massive interconnection and parallel processing. Recently, specialized electronic neural network processors and VLSI neural chips have been introduced in the commercial market. The number of parallel channels they can handle is limited because of the limited parallel interconnections that can be implemented with 1D electronic wires. High-resolution pattern recognition problems can require a large number of neurons for parallel processing of an image. This paper describes a holographic optical neural network (HONN) that is based on high-resolution volume holographic materials and is capable of performing massive 3D parallel interconnection of tens of thousands of neurons. A HONN with more than 16,000 neurons packaged in an attaché case has been developed. Rotation-, shift-, and scale-invariant pattern recognition operations have been demonstrated with this system. System parameters such as the signal-to-noise ratio, dynamic range, and processing speed are discussed.

  11. Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.

    PubMed

    Slażyński, Leszek; Bohte, Sander

    2012-01-01

    The arrival of graphics processing (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state for each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures to the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate in better-than-realtime plausible spiking neural networks of up to 50 000 neurons, processing over 35 million spiking events per second.
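
    The additive filter update described above can be sketched in a few vectorized lines; sizes, time constants, and input statistics below are illustrative assumptions, and NumPy stands in for the GPU kernels:

```python
import numpy as np

# Exponential membrane/synaptic filters advanced in parallel across
# all neurons: because the dynamics are additive, every neuron's
# state updates in one vectorized step per time step -- the property
# that maps so well onto a GPU's fine-grained parallelism.
n_neurons, tau, dt = 50_000, 20e-3, 1e-3
decay = np.float32(np.exp(-dt / tau))          # per-step exponential decay

rng = np.random.default_rng(2)
v = np.zeros(n_neurons, dtype=np.float32)      # filtered potentials
for _ in range(100):                           # 100 ms of simulated time
    spikes_in = (rng.random(n_neurons) < 0.02).astype(np.float32)
    v = v * decay + spikes_in                  # one parallel additive update
print(v.shape)
```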

  12. Chromatographic background drift correction coupled with parallel factor analysis to resolve coelution problems in three-dimensional chromatographic data: quantification of eleven antibiotics in tap water samples by high-performance liquid chromatography coupled with a diode array detector.

    PubMed

    Yu, Yong-Jie; Wu, Hai-Long; Fu, Hai-Yan; Zhao, Juan; Li, Yuan-Na; Li, Shu-Fang; Kang, Chao; Yu, Ru-Qin

    2013-08-09

    Chromatographic background drift correction has been an important field of research in chromatographic analysis. In the present work, orthogonal spectral space projection for background drift correction of three-dimensional chromatographic data was described in detail and combined with parallel factor analysis (PARAFAC) to resolve overlapped chromatographic peaks and obtain the second-order advantage. This strategy was verified by simulated chromatographic data and afforded significant improvement in quantitative results. Finally, this strategy was successfully utilized to quantify eleven antibiotics in tap water samples. Compared with the traditional methodology of introducing excessive factors for the PARAFAC model to eliminate the effect of background drift, clear improvement in the quantitative performance of PARAFAC was observed after background drift correction by orthogonal spectral space projection.
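
    A hedged sketch of the projection idea: build an orthonormal basis for the background drift's spectral subspace, then project every measured spectrum onto its orthogonal complement. Matrix sizes and the rank-1 drift are illustrative assumptions; the actual method estimates the background subspace from the chromatogram itself:

```python
import numpy as np

# Synthetic time x wavelength data contaminated by a drifting baseline
# whose spectral shape is fixed (rank-1 background, an assumption).
rng = np.random.default_rng(3)
n_times, n_wl = 200, 64

signal = rng.random((n_times, n_wl))
bg_spectrum = rng.random(n_wl)                      # drift spectral shape
drift = np.outer(np.linspace(0, 1, n_times), bg_spectrum)
X = signal + drift

# Orthonormal basis V of the background subspace (here rank 1), then
# project onto the orthogonal complement: X_corr = X (I - V V^T).
V, _ = np.linalg.qr(bg_spectrum[:, None])
X_corr = X - X @ V @ V.T

# The drift itself is annihilated by the projection.
print(np.allclose(drift - drift @ V @ V.T, 0.0))    # True
```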

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrett, Brian; Brightwell, Ronald B.; Grant, Ryan

    This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

  14. The Portals 4.0 network programming interface.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin

    2012-11-01

    This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

  15. Coaxial microreactor for particle synthesis

    DOEpatents

    Bartsch, Michael; Kanouff, Michael P; Ferko, Scott M; Crocker, Robert W; Wally, Karl

    2013-10-22

    A coaxial fluid flow microreactor system disposed on a microfluidic chip utilizing laminar flow for synthesizing particles from solution. Flow geometries produced by the mixing system make use of hydrodynamic focusing to confine a core flow to a small axially-symmetric, centrally positioned and spatially well-defined portion of a flow channel cross-section to provide highly uniform diffusional mixing between a reactant core and sheath flow streams. The microreactor is fabricated in such a way that a substantially planar two-dimensional arrangement of microfluidic channels will produce a three-dimensional core/sheath flow geometry. The microreactor system can comprise one or more coaxial mixing stages that can be arranged singly, in series, in parallel or nested concentrically in parallel.

  16. Turbulence modeling of free shear layers for high performance aircraft

    NASA Technical Reports Server (NTRS)

    Sondak, Douglas

    1993-01-01

    In many flowfield computations, accuracy of the turbulence model employed is frequently a limiting factor in the overall accuracy of the computation. This is particularly true for complex flowfields such as those around full aircraft configurations. Free shear layers such as wakes, impinging jets (in V/STOL applications), and mixing layers over cavities are often part of these flowfields. Although flowfields have been computed for full aircraft, the memory and CPU requirements for these computations are often excessive. Additional computer power is required for multidisciplinary computations such as coupled fluid dynamics and conduction heat transfer analysis. Massively parallel computers show promise in alleviating this situation, and the purpose of this effort was to adapt and optimize CFD codes to these new machines. The objective of this research effort was to compute the flowfield and heat transfer for a two-dimensional jet impinging normally on a cool plate. The results of this research effort were summarized in an AIAA paper titled 'Parallel Implementation of the k-epsilon Turbulence Model'. Appendix A contains the full paper.

  17. Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters

    PubMed Central

    Bajaj, Chandrajit

    2009-01-01

    Tomographic imaging and computer simulations are increasingly yielding massive datasets. Interactive and exploratory visualizations have rapidly become indispensable tools to study large volumetric imaging and simulation data. Our scalable isosurface visualization framework on commodity off-the-shelf clusters is an end-to-end parallel and progressive platform, from initial data access to the final display. Interactive browsing of extracted isosurfaces is made possible by using parallel isosurface extraction and rendering in conjunction with a new specialized piece of image compositing hardware called Metabuffer. In this paper, we focus on the back-end scalability by introducing a fully parallel and out-of-core isosurface extraction algorithm. It achieves scalability by using both parallel and out-of-core processing and parallel disks. It statically partitions the volume data to parallel disks with a balanced workload spectrum, and builds I/O-optimal external interval trees to minimize the number of I/O operations of loading large data from disk. We also describe an isosurface compression scheme that is efficient for progressive extraction, transmission and storage of isosurfaces. PMID:19756231
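
    The I/O-saving idea can be illustrated with per-block [min, max] intervals: a block is only ever read from disk when the isovalue falls inside its interval. The paper's I/O-optimal external interval trees are replaced here by a plain scan over block intervals, and the volume is synthetic:

```python
import numpy as np

# Partition a synthetic volume into 4 x 4 x 4 blocks of 16^3 samples
# and record each block's [min, max] value interval.
rng = np.random.default_rng(4)
volume = rng.random((64, 64, 64))
nb = 16

blocks, intervals = [], []
for i in range(0, 64, nb):
    for j in range(0, 64, nb):
        for k in range(0, 64, nb):
            b = volume[i:i+nb, j:j+nb, k:k+nb]
            blocks.append(b)
            intervals.append((b.min(), b.max()))

# Only blocks whose interval brackets the isovalue would be loaded
# and meshed; the rest never touch the disk.
isovalue = 0.999
active = [idx for idx, (lo, hi) in enumerate(intervals) if lo <= isovalue <= hi]
print(len(active), len(blocks))
```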

  18. 3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.

    1997-12-31

    The aim of this work is to develop a 3D parallel program for the numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), satisfying the condition that the numerical results be independent of the number of processors involved. Two basically different approaches to the structure of massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition reconstructed at each temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shared memory. The second approach is based on a 3D data matrix decomposition that is not reconstructed during a temporal cycle. The program was developed on the 8-processor CS MP-3 made in VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL by the joint efforts of VNIIEF and LLNL staff. A large number of numerical experiments have been carried out with different numbers of processors, up to 256, and the efficiency of parallelization has been evaluated as a function of the number of processors and their parameters.

  19. Explosive Disintegration of a Massive Young Stellar System in Orion

    NASA Astrophysics Data System (ADS)

    Zapata, Luis A.; Schmid-Burgk, Johannes; Ho, Paul T. P.; Rodríguez, Luis F.; Menten, Karl M.

    2009-10-01

    Young massive stars in the center of crowded star clusters are expected to undergo close dynamical encounters that could lead to energetic, explosive events. However, there has so far never been clear observational evidence of such a remarkable phenomenon. We here report new interferometric observations that indicate the well-known enigmatic wide-angle outflow located in the Orion BN/KL star-forming region to have been produced by such a violent explosion during the disruption of a massive young stellar system, and that this was caused by a close dynamical interaction about 500 years ago. This outflow thus belongs to a totally different family of molecular flows that is not related to the classical bipolar flows that are generated by stars during their formation process. Our molecular data allow us to create a three-dimensional view of the debris flow and to link this directly to the well-known Orion H2 "fingers" farther out.

  20. Gravitational Instabilities in the Disks of Massive Protostars as an Explanation for Linear Distributions of Methanol Masers

    NASA Astrophysics Data System (ADS)

    Durisen, Richard H.; Mejia, Annie C.; Pickett, Brian K.; Hartquist, Thomas W.

    2001-12-01

    Evidence suggests that some masers associated with massive protostars may originate in the outer regions of large disks, at radii of hundreds to thousands of AU from the central mass. This is particularly true for methanol (CH3OH), for which linear distributions of masers are found with disklike kinematics. In three-dimensional hydrodynamics simulations we have made to study the effects of gravitational instabilities in the outer parts of disks around young low-mass stars, the nonlinear development of the instabilities leads to a complex of intersecting spiral shocks, clumps, and arclets within the disk and to significant time-dependent, nonaxisymmetric distortions of the disk surface. A rescaling of our disk simulations to the case of a massive protostar shows that conditions in the disturbed outer disk seem conducive to the appearance of masers if it is viewed edge-on.

  1. Three-dimensional magnetohydrodynamic equilibrium of quiescent H-modes in tokamak systems

    NASA Astrophysics Data System (ADS)

    Cooper, W. A.; Graves, J. P.; Duval, B. P.; Sauter, O.; Faustin, J. M.; Kleiner, A.; Lanthaler, S.; Patten, H.; Raghunathan, M.; Tran, T.-M.; Chapman, I. T.; Ham, C. J.

    2016-06-01

    Three-dimensional free boundary magnetohydrodynamic equilibria that recover saturated ideal kink/peeling structures are obtained numerically. Simulations that model the JET tokamak at fixed ⟨β⟩ = 1.7% with a large edge bootstrap current that flattens the q-profile near the plasma boundary demonstrate that a radial parallel current density ribbon with a dominant m/n = 5/1 Fourier component at I_t = 2.2 MA develops into a broadband spectrum when the toroidal current I_t is increased to 2.5 MA.

  2. A lattice relaxation algorithm for three-dimensional Poisson-Nernst-Planck theory with application to ion transport through the gramicidin A channel.

    PubMed Central

    Kurnikova, M G; Coalson, R D; Graf, P; Nitzan, A

    1999-01-01

    A lattice relaxation algorithm is developed to solve the Poisson-Nernst-Planck (PNP) equations for ion transport through arbitrary three-dimensional volumes. Calculations of systems characterized by simple parallel plate and cylindrical pore geometries are presented in order to calibrate the accuracy of the method. A study of ion transport through gramicidin A dimer is carried out within this PNP framework. Good agreement with experimental measurements is obtained. Strengths and weaknesses of the PNP approach are discussed. PMID:9929470
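
    The relaxation idea can be sketched as a Jacobi sweep for the Poisson half of the PNP system on a uniform 3-D grid (grid size, source, and iteration count below are illustrative assumptions; the full solver couples this with Nernst-Planck updates):

```python
import numpy as np

# Jacobi lattice relaxation for the 3-D Poisson equation with zero
# boundary values and an assumed point source at the grid center.
n, h = 17, 1.0
rho = np.zeros((n, n, n))
rho[n // 2, n // 2, n // 2] = 1.0          # point charge

phi = np.zeros_like(rho)
for _ in range(200):                       # fixed number of sweeps
    phi_new = np.zeros_like(phi)           # boundaries stay at zero
    phi_new[1:-1, 1:-1, 1:-1] = (
        phi[2:, 1:-1, 1:-1] + phi[:-2, 1:-1, 1:-1] +
        phi[1:-1, 2:, 1:-1] + phi[1:-1, :-2, 1:-1] +
        phi[1:-1, 1:-1, 2:] + phi[1:-1, 1:-1, :-2] +
        h * h * rho[1:-1, 1:-1, 1:-1]
    ) / 6.0                                # average of six neighbors + source
    phi = phi_new
print(phi.max() > 0.0)                     # potential builds around the source
```

Each sweep updates every lattice site from its six neighbors, which is why the method vectorizes (and parallelizes) so naturally.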

  3. Supercomputer algorithms for efficient linear octree encoding of three-dimensional brain images.

    PubMed

    Berger, S B; Reis, D J

    1995-02-01

    We designed and implemented algorithms for three-dimensional (3-D) reconstruction of brain images from serial sections using two important supercomputer architectures, vector and parallel. These architectures were represented by the Cray YMP and Connection Machine CM-2, respectively. The programs operated on linear octree representations of the brain data sets, and achieved 500-800 times acceleration when compared with a conventional laboratory workstation. As the need for higher resolution data sets increases, supercomputer algorithms may offer a means of performing 3-D reconstruction well above current experimental limits.
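
    A linear octree stores occupied nodes as a sorted list of locational codes rather than as a pointer-based tree; a common choice of code is the Morton code, which interleaves the bits of the (x, y, z) coordinates. The encoding below is a generic sketch under that assumption, not the paper's exact scheme:

```python
# Morton (Z-order) encoding for a linear octree over 8-bit coordinates.
def morton3(x, y, z, bits=8):
    """Interleave the bits of x, y, z into one locational code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

# A sorted list of codes of occupied voxels is the "linear" octree.
voxels = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
codes = sorted(morton3(*v) for v in voxels)
print(codes)   # [1, 2, 4, 7]
```

Because Z-order preserves spatial locality, sorted code ranges correspond to octree subtrees, which suits both vector and data-parallel traversal.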

  4. Self-assembled three-dimensional chiral colloidal architecture

    NASA Astrophysics Data System (ADS)

    Ben Zion, Matan Yah; He, Xiaojin; Maass, Corinna C.; Sha, Ruojie; Seeman, Nadrian C.; Chaikin, Paul M.

    2017-11-01

    Although stereochemistry has been a central focus of the molecular sciences since Pasteur, its province has previously been restricted to the nanometric scale. We have programmed the self-assembly of micron-sized colloidal clusters with structural information stemming from a nanometric arrangement. This was done by combining DNA nanotechnology with colloidal science. Using the functional flexibility of DNA origami in conjunction with the structural rigidity of colloidal particles, we demonstrate the parallel self-assembly of three-dimensional microconstructs, evincing highly specific geometry that includes control over position, dihedral angles, and cluster chirality.

  5. Signatures of extra dimensions in gravitational waves from black hole quasinormal modes

    NASA Astrophysics Data System (ADS)

    Chakraborty, Sumanta; Chakravarti, Kabir; Bose, Sukanta; SenGupta, Soumitra

    2018-05-01

    In this work, we have derived the evolution equation for gravitational perturbation in four-dimensional spacetime in the presence of a spatial extra dimension. The evolution equation is derived by perturbing the effective gravitational field equations on the four-dimensional spacetime, which inherits nontrivial higher-dimensional effects. Note that this is different from the perturbation of the five-dimensional gravitational field equations that exist in the literature and possess quantitatively new features. The gravitational perturbation has further been decomposed into a purely four-dimensional part and another piece that depends on extra dimensions. The four-dimensional gravitational perturbation now admits massive propagating degrees of freedom, owing to the existence of higher dimensions. We have also studied the influence of these massive propagating modes on the quasinormal mode frequencies, signaling the higher-dimensional nature of the spacetime, and have contrasted these massive modes with the massless modes in general relativity. Surprisingly, it turns out that the massive modes experience damping much smaller than that of the massless modes in general relativity and may even dominate over and above the general relativity contribution if one observes the ringdown phase of a black hole merger event at sufficiently late times. Furthermore, the whole analytical framework has been supplemented by the fully numerical Cauchy evolution problem, as well. In this context, we have shown that, except for minute details, the overall features of the gravitational perturbations are captured both in the Cauchy evolution as well as in the analysis of quasinormal modes. The implications on observations of black holes with LIGO and proposed space missions such as LISA are also discussed.

  6. Gravity-oriented microfluidic device for uniform and massive cell spheroid formation

    PubMed Central

    Lee, Kangsun; Kim, Choong; Young Yang, Jae; Lee, Hun; Ahn, Byungwook; Xu, Linfeng; Yoon Kang, Ji; Oh, Kwang W.

    2012-01-01

    We propose a simple method for forming massive and uniform three-dimensional (3-D) cell spheroids in a multi-level structured microfluidic device by gravitational force. The concept of orienting the device vertically has allowed spheroid formation, long-term perfusion, and retrieval of the cultured spheroids by user-friendly standard pipetting. We have successfully formed, perfused, and retrieved uniform, size-controllable, well-conditioned spheroids of human embryonic kidney 293 cells (HEK 293) in the gravity-oriented microfluidic device. We expect the proposed method will be a useful tool to study in-vitro 3-D cell models for the proliferation, differentiation, and metabolism of embryoid bodies or tumours. PMID:22662098

  7. Low-resistance gateless high electron mobility transistors using three-dimensional inverted pyramidal AlGaN/GaN surfaces

    NASA Astrophysics Data System (ADS)

    So, Hongyun; Senesky, Debbie G.

    2016-01-01

    In this letter, three-dimensional gateless AlGaN/GaN high electron mobility transistors (HEMTs) were demonstrated with 54% reduction in electrical resistance and 73% increase in surface area compared with conventional gateless HEMTs on planar substrates. Inverted pyramidal AlGaN/GaN surfaces were microfabricated using potassium hydroxide etched silicon with exposed (111) surfaces and metal-organic chemical vapor deposition of coherent AlGaN/GaN thin films. In addition, electrical characterization of the devices showed that a combination of series and parallel connections of the highly conductive two-dimensional electron gas along the pyramidal geometry resulted in a significant reduction in electrical resistance at both room and high temperatures (up to 300 °C). This three-dimensional HEMT architecture can be leveraged to realize low-power and reliable power electronics, as well as harsh environment sensors with increased surface area.

  8. Reconnection in Three Dimensions

    NASA Technical Reports Server (NTRS)

    Hesse, Michael

    1999-01-01

    Analyzing the qualitative three-dimensional magnetic structure of a plasmoid, we were led to reconsider the concept of magnetic reconnection from a general point of view. The properties of relatively simple magnetic field models provide a strong preference for one of two definitions of magnetic reconnection that exist in the literature. Any concept of magnetic reconnection defined in terms of magnetic topology seems naturally restricted to cases where the magnetic field vanishes somewhere in the nonideal (diffusion) region. The main part of this paper is concerned with magnetic reconnection in nonvanishing magnetic fields (finite-B reconnection), which has attracted less attention in the past. We show that the electric field component parallel to the magnetic field plays a crucial physical role in finite-B reconnection, and we present two theorems involving this quantity. The first states a necessary and sufficient condition on the parallel electric field for global reconnection to occur. Here the term "global" means the generic case where the breakdown of magnetic connection occurs for plasma elements that stay outside the nonideal region. The second theorem relates the change of magnetic helicity to the parallel electric field for cases where the electric field vanishes at large distances. That these results provide new insight into three-dimensional reconnection processes is illustrated in terms of the plasmoid configuration, which was our starting point.

  9. Thought Leaders during Crises in Massive Social Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corley, Courtney D.; Farber, Robert M.; Reynolds, William

    The vast amount of social media data that can be gathered from the internet, coupled with workflows that utilize both commodity systems and massively parallel supercomputers such as the Cray XMT, opens new vistas for research to support health, defense, and national security. Computer technology now enables the analysis of graph structures containing more than 4 billion vertices joined by 34 billion edges, along with metrics and massively parallel algorithms that exhibit near-linear scalability with the number of processors. The challenge lies in making this massive data and analysis comprehensible to analysts and end-users who require actionable knowledge to carry out their duties. Simply stated, we have developed language- and content-agnostic techniques to reduce large graphs built from vast media corpora into forms people can understand. Specifically, our tools and metrics act as a survey tool to identify 'thought leaders' -- those members that lead or reflect the thoughts and opinions of an online community, independent of the source language.

  10. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma

    PubMed Central

    Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-01-01

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations. PMID:28533481

  11. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma.

    PubMed

    Wu, Ren-Chin; Chao, An-Shine; Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-07-18

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations.

  12. A review of bioinformatic methods for forensic DNA analyses.

    PubMed

    Liu, Yao-Yuan; Harbison, SallyAnn

    2018-03-01

    Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals previously unseen insights into the complexity and variability of these markers, along with amounts of data too immense for manual analysis. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing are being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Development of iterative techniques for the solution of unsteady compressible viscous flows

    NASA Technical Reports Server (NTRS)

    Hixon, Duane; Sankar, L. N.

    1993-01-01

    During the past two decades, there has been significant progress in the field of numerical simulation of unsteady compressible viscous flows. At present, a variety of solution techniques exist, such as the transonic small disturbance analyses (TSD), transonic full potential equation-based methods, unsteady Euler solvers, and unsteady Navier-Stokes solvers. These advances have been made possible by developments in three areas: (1) improved numerical algorithms; (2) automation of body-fitted grid generation schemes; and (3) advanced computer architectures with vector processing and massively parallel processing features. In this work, the GMRES scheme has been considered as a candidate for acceleration of a Newton iteration time marching scheme for unsteady 2-D and 3-D compressible viscous flow calculations; preliminary calculations indicate that this will provide up to a 65 percent reduction in the computer time requirements over the existing class of explicit and implicit time marching schemes. The proposed method has been tested on structured grids, but is flexible enough for extension to unstructured grids. The described scheme has been tested only on the current generation of vector processor architectures of the Cray Y/MP class, but should be suitable for adaptation to massively parallel machines.
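    The Newton/GMRES combination described in this abstract can be sketched in miniature. The following is a hedged, illustrative Python sketch, not the flow solver from this work: GMRES needs only Jacobian-vector products, which are approximated here by finite differences of the residual so the Jacobian is never formed explicitly. The 2x2 algebraic test system and every name below are assumptions for illustration.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return dot(x, x) ** 0.5

def axpy(a, x, y):
    """Return a*x + y for plain Python lists."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def gauss_solve(A, b):
    """Tiny dense solver (partial pivoting) for the small GMRES system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        p = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[p] = M[p], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for k in range(col, n + 1):
                M[i][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gmres(matvec, b, m):
    """Unrestarted GMRES: Arnoldi process, then the small least-squares
    problem solved via normal equations (adequate for tiny m)."""
    beta = norm(b)
    V = [[bi / beta for bi in b]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = matvec(V[j])
        for i in range(j + 1):          # modified Gram-Schmidt
            H[i][j] = dot(w, V[i])
            w = axpy(-H[i][j], V[i], w)
        H[j + 1][j] = norm(w)
        if H[j + 1][j] < 1e-12:         # happy breakdown
            m = j + 1
            break
        V.append([wi / H[j + 1][j] for wi in w])
    A = [[sum(H[k][i] * H[k][j] for k in range(m + 1)) for j in range(m)]
         for i in range(m)]
    rhs = [H[0][i] * beta for i in range(m)]
    y = gauss_solve(A, rhs)
    x = [0.0] * len(b)
    for j in range(m):
        x = axpy(y[j], V[j], x)
    return x

def newton_gmres(F, u, tol=1e-10, max_iter=20):
    """Jacobian-free Newton-Krylov: solve J du = -F(u) with GMRES, using
    finite-difference Jacobian-vector products."""
    for _ in range(max_iter):
        r = [-fi for fi in F(u)]
        if norm(r) < tol:
            break
        eps = 1e-7
        matvec = lambda v: [(fp - f0) / eps
                            for fp, f0 in zip(F(axpy(eps, v, u)), F(u))]
        u = axpy(1.0, gmres(matvec, r, m=len(u)), u)
    return u

# Illustrative nonlinear system with a root at (1, 2):
F = lambda u: [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]
root = newton_gmres(F, [1.5, 1.5])
```

In a production solver the matvec would wrap the discretized flow residual, and the small least-squares problem would normally be solved with Givens rotations rather than normal equations.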

  14. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the solution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the solution of a large sparse system of linear equations, which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, the latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion, ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
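    The multiscale frequency-continuation strategy (successive mono-frequency inversions from low to high frequency) can be illustrated on a deliberately tiny toy problem. This is a hedged sketch, not the FWT2D algorithm: a single scalar "velocity" is recovered by gradient descent on a data-misfit functional, inverting low frequencies first to stay inside the basin of attraction; every name and value below is an illustrative assumption.

```python
import cmath

def synthetic(c, freqs, x=1.0):
    """Monochromatic 'data' for a one-parameter medium: d(w) = exp(-i w x / c)."""
    return {w: cmath.exp(-1j * w * x / c) for w in freqs}

def misfit(c, data, freqs):
    """Least-squares difference between predicted and observed data."""
    pred = synthetic(c, freqs)
    return sum(abs(pred[w] - data[w]) ** 2 for w in freqs)

def invert(data, c0, bands, steps=200, lr=0.1, eps=1e-6):
    """Successive mono-frequency gradient-descent inversion, low to high."""
    c = c0
    for band in bands:                  # frequency continuation
        for _ in range(steps):
            # central finite-difference gradient of the misfit
            g = (misfit(c + eps, data, band)
                 - misfit(c - eps, data, band)) / (2.0 * eps)
            c -= lr * g
    return c

true_c = 2.0
freqs = [1.0, 2.0, 4.0, 8.0]
data = synthetic(true_c, freqs)
c_est = invert(data, c0=1.6, bands=[[w] for w in freqs])
```

Starting the descent at the highest frequency alone would risk cycle skipping (a local minimum one phase wrap away); sweeping the bands from low to high is the toy analogue of the multiscale strategies the paper compares.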

  15. DNA Brick Crystals with Prescribed Depth

    PubMed Central

    Ke, Yonggang; Ong, Luvena L.; Sun, Wei; Song, Jie; Dong, Mingdong; Shih, William M.; Yin, Peng

    2014-01-01

    We describe a general framework for constructing two-dimensional crystals with prescribed depth and sophisticated three-dimensional features. These crystals may serve as scaffolds for the precise spatial arrangements of functional materials for diverse applications. The crystals are self-assembled from single-stranded DNA components called DNA bricks. We demonstrate the experimental construction of DNA brick crystals that can grow to micron-size in the lateral dimensions with precisely controlled depth up to 80 nanometers. They can be designed to display user-specified sophisticated three-dimensional nanoscale features, such as continuous or discontinuous cavities and channels, and to pack DNA helices at parallel and perpendicular angles relative to the plane of the crystals. PMID:25343605

  16. Parallel deterministic neutronics with AMR in 3D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clouse, C.; Ferguson, J.; Hendrickson, C.

    1997-12-31

    AMTRAN, a three-dimensional Sn neutronics code with adaptive mesh refinement (AMR), has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block-refined AMR is used with linear finite element representations for the fluxes, which allows for a straightforward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.

  17. MPI-AMRVAC 2.0 for Solar and Astrophysical Applications

    NASA Astrophysics Data System (ADS)

    Xia, C.; Teunissen, J.; El Mellah, I.; Chané, E.; Keppens, R.

    2018-02-01

    We report on the development of MPI-AMRVAC version 2.0, which is an open-source framework for parallel, grid-adaptive simulations of hydrodynamic and magnetohydrodynamic (MHD) astrophysical applications. The framework now supports radial grid stretching in combination with adaptive mesh refinement (AMR). The advantages of this combined approach are demonstrated with one-dimensional, two-dimensional, and three-dimensional examples of spherically symmetric Bondi accretion, steady planar Bondi–Hoyle–Lyttleton flows, and wind accretion in supergiant X-ray binaries. Another improvement is support for the generic splitting of any background magnetic field. We present several tests relevant for solar physics applications to demonstrate the advantages of field splitting on accuracy and robustness in extremely low plasma-β environments: a static magnetic flux rope, a magnetic null-point, and magnetic reconnection in a current sheet with either uniform or anomalous resistivity. Our implementation for treating anisotropic thermal conduction in multi-dimensional MHD applications is also described, which generalizes the original slope-limited symmetric scheme from two to three dimensions. We perform ring diffusion tests that demonstrate its accuracy and robustness, and show that it prevents the unphysical thermal flux present in traditional schemes. The improved parallel scaling of the code is demonstrated with three-dimensional AMR simulations of solar coronal rain, which show satisfactory strong scaling up to 2000 cores. Other framework improvements are also reported: the modernization and reorganization into a library, the handling of automatic regression tests, the use of inline/online Doxygen documentation, and a new future-proof data format for input/output.

  18. Massively Parallel, Molecular Analysis Platform Developed Using a CMOS Integrated Circuit With Biological Nanopores

    PubMed Central

    Roever, Stefan

    2012-01-01

    A massively parallel, low-cost molecular analysis platform will dramatically change the nature of protein, molecular, and genomics research, DNA sequencing, and, ultimately, molecular diagnostics. An integrated circuit (IC) with 264 sensors was fabricated using standard CMOS semiconductor processing technology. Each of these sensors is individually controlled with precision analog circuitry and is capable of single-molecule measurements. Under electronic and software control, the IC was used to demonstrate the feasibility of creating and detecting lipid bilayers and biological nanopores using wild-type α-hemolysin. The ability to dynamically create bilayers over each of the sensors will greatly accelerate pore development and pore mutation analysis. In addition, the noise performance of the IC was measured to be 30 fA (rms). With this noise performance, single-base detection of DNA was demonstrated using α-hemolysin. The data show that a single-molecule electrical detection platform using biological nanopores can be operationalized and can ultimately scale to millions of sensors. Such a massively parallel platform will revolutionize molecular analysis and will completely change the field of molecular diagnostics in the future.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wylie, Brian Neil; Moreland, Kenneth D.

    Graphs are a vital way of organizing data with complex correlations. A good visualization of a graph can fundamentally change human understanding of the data. Consequently, there is a rich body of work on graph visualization. Although there are many techniques that are effective on small to medium sized graphs (tens of thousands of nodes), there is a void in the research for visualizing massive graphs containing millions of nodes. Sandia is one of the few entities in the world that has the means and motivation to handle data on such a massive scale. For example, homeland security generates graphs from prolific media sources such as television, telephone, and the Internet. The purpose of this project is to provide the groundwork for visualizing such massive graphs. The research provides for two major feature gaps: a parallel, interactive visualization framework and scalable algorithms to make the framework usable to a practical application. Both the frameworks and algorithms are designed to run on distributed parallel computers, which are already available at Sandia. Some features are integrated into the ThreatView™ application and future work will integrate further parallel algorithms.

  20. Forming spectroscopic massive protobinaries by disc fragmentation

    NASA Astrophysics Data System (ADS)

    Meyer, D. M.-A.; Kuiper, R.; Kley, W.; Johnston, K. G.; Vorobyov, E.

    2018-01-01

    The surroundings of massive protostars constitute an accretion disc, which has numerically been shown to be subject to fragmentation and responsible for luminous accretion-driven outbursts. Moreover, it is suspected to produce close binary companions which will later strongly influence the star's future evolution in the Hertzsprung-Russell diagram. We present three-dimensional gravitation-radiation-hydrodynamic numerical simulations of 100 M⊙ pre-stellar cores. We find that accretion discs of young massive stars violently fragment without preventing the (highly variable) accretion of gaseous clumps on to the protostars. While acquiring the characteristics of a nascent low-mass companion, some disc fragments migrate on to the central massive protostar with dynamical properties showing that the final Keplerian orbit is close enough to constitute a close massive protobinary system with a young high-mass and a low-mass component. We conclude that the disc fragmentation channel is viable for the formation of such short-period binaries, and that both processes - close massive binary formation and accretion bursts - may happen at the same time. FU-Orionis-type bursts, such as observed in the young high-mass star S255IR-NIRS3, may not only indicate ongoing disc fragmentation, but may also be considered a tracer for the formation of close massive binaries - progenitors of the subsequent massive spectroscopic binaries - once the high-mass component of the system enters the main-sequence phase of its evolution. Finally, we investigate the Atacama Large (sub-)Millimeter Array observability of the disc fragments.

  1. Bioactive Natural Products Prioritization Using Massive Multi-informational Molecular Networks.

    PubMed

    Olivon, Florent; Allard, Pierre-Marie; Koval, Alexey; Righi, Davide; Genta-Jouve, Gregory; Neyts, Johan; Apel, Cécile; Pannecouque, Christophe; Nothias, Louis-Félix; Cachet, Xavier; Marcourt, Laurence; Roussi, Fanny; Katanaev, Vladimir L; Touboul, David; Wolfender, Jean-Luc; Litaudon, Marc

    2017-10-20

    Natural products represent an inexhaustible source of novel therapeutic agents. Their complex and constrained three-dimensional structures endow these molecules with exceptional biological properties, thereby giving them a major role in drug discovery programs. However, the search for new bioactive metabolites is hampered by the chemical complexity of the biological matrices in which they are found. The purification of single constituents from such matrices requires such a significant amount of work that it should ideally be performed only on molecules of high potential value (i.e., chemical novelty and biological activity). Recent bioinformatics approaches based on mass spectrometry metabolite profiling methods are beginning to address the complex task of compound identification within complex mixtures. In parallel to these developments, however, methods providing information on the bioactivity potential of natural products prior to their isolation are still lacking, and such methods are of key interest for targeting the isolation of valuable natural products only. In the present investigation, we propose an integrated analysis strategy for bioactive natural products prioritization. Our approach uses massive molecular networks embedding various informational layers (bioactivity and taxonomical data) to highlight potentially bioactive scaffolds within the chemical diversity of crude extract collections. We exemplify this workflow by targeting the isolation of predicted active and nonactive metabolites from two botanical sources (Bocquillonia nervosa and Neoguillauminia cleopatra) against two biological targets (the Wnt signaling pathway and chikungunya virus replication). Finally, we detail the detection and isolation of a daphnane diterpene orthoester and four 12-deoxyphorbols that inhibit the Wnt signaling pathway and exhibit potent antiviral activity against chikungunya virus.
Combined with efficient metabolite annotation tools, this bioactive natural products prioritization pipeline proves to be efficient. Implementation of this approach in drug discovery programs based on natural extract screening should speed up and rationalize the isolation of bioactive natural products.

  2. Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

    NASA Astrophysics Data System (ADS)

    Liakos, Anastasios; Malamataris, Nikolaos A.

    2014-05-01

    The topology and evolution of flow around a surface-mounted cubical object in three-dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via an in-house parallel finite element code. The computational domain was designed according to actual laboratory experiment conditions. Analysis of the results is performed using the three-dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horseshoe vortex upstream of the cube forms at a Reynolds number of approximately 1266. Pressure distributions are shown along with three-dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance with previous work, our results indicate that the upper limit of the Reynolds number for which steady-state results are physically realizable is roughly 2000.

  3. Carbon nanotube-based three-dimensional monolithic optoelectronic integrated system

    NASA Astrophysics Data System (ADS)

    Liu, Yang; Wang, Sheng; Liu, Huaping; Peng, Lian-Mao

    2017-06-01

    Single material-based monolithic optoelectronic integration with complementary metal oxide semiconductor-compatible signal processing circuits is one of the most pursued approaches in the post-Moore era to realize rapid data communication and functional diversification in a limited three-dimensional space. Here, we report an electrically driven carbon nanotube-based on-chip three-dimensional optoelectronic integrated circuit. We demonstrate that photovoltaic receivers, electrically driven transmitters and on-chip electronic circuits can all be fabricated using carbon nanotubes via a complementary metal oxide semiconductor-compatible low-temperature process, providing a seamless integration platform for realizing monolithic three-dimensional optoelectronic integrated circuits with diversified functionality such as the heterogeneous AND gates. These circuits can be vertically scaled down to sub-30 nm and operate in photovoltaic mode at room temperature. Parallel optical communication between functional layers, for example, bottom-layer digital circuits and top-layer memory, has been demonstrated by mapping data using a 2 × 2 transmitter/receiver array, which could be extended as the next generation energy-efficient signal processing paradigm.

  4. Multitasking a three-dimensional Navier-Stokes algorithm on the Cray-2

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.

    1989-01-01

    A three-dimensional computational aerodynamics algorithm has been multitasked for efficient parallel execution on the Cray-2. It provides a means for examining the multitasking performance of a complete CFD application code. An embedded zonal multigrid scheme is used to solve the Reynolds-averaged Navier-Stokes equations for an internal flow model problem. The explicit nature of each component of the method allows a spatial partitioning of the computational domain to achieve a well-balanced task load for MIMD computers with vector-processing capability. Experiments have been conducted with both two- and three-dimensional multitasked cases. The best speedup attained by an individual task group was 3.54 on four processors of the Cray-2, while the entire solver yielded a speedup of 2.67 on four processors for the three-dimensional case. The multiprocessing efficiency of various types of computational tasks is examined, performance on two Cray-2s with different memory access speeds is compared, and extrapolation to larger problems is discussed.

  5. Faulting of Rocks in a Three-Dimensional Stress Field by Micro-Anticracks

    PubMed Central

    Ghaffari, H. O.; Nasseri, M. H. B.; Young, R. Paul

    2014-01-01

    Nucleation and propagation of a shear fault is known to be the result of interaction and coalescence of many microcracks. Yet the character and rate of the microcracks' interactions, and their dependence on the three-dimensional stress state, are poorly understood. Here we investigate formation of microcracks during sandstone faulting under 3D-polyaxial stress fields by analyzing multi-stationary acoustic waveforms. We show that in a true three-dimensional stress state (a) faulting forms in an orthorhombic pattern, and (b) the emitted acoustic waveforms from microcracking carry a shorter rapid slip phase. The latter is associated with microcracking that dominantly develops parallel to the minimum stress direction. Our results imply that, by inducing micro-anticracks, the three-dimensional (3D) stress state can quicken dynamic weakening and rupture propagation by a factor of two relative to simpler stress states. The results suggest a new nucleation mechanism of 3D-faulting with implications for earthquakes' instabilities, as well as the understanding of avalanches associated with dislocations. PMID:24862447

  6. Parallel Adaptive Mesh Refinement Library

    NASA Technical Reports Server (NTRS)

    MacNeice, Peter; Olson, Kevin

    2005-01-01

    Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.
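    The tree-of-blocks idea behind the package can be sketched compactly. The following minimal Python quad-tree is an illustrative assumption, not PARAMESH's Fortran 90 interface: leaf blocks stand in for the logically Cartesian sub-grids, and refinement is driven by an application-supplied criterion, just as the abstract describes.

```python
class Block:
    """A logically Cartesian sub-grid block; one node of a quad-tree."""
    def __init__(self, x0, y0, size, level=0):
        self.x0, self.y0, self.size, self.level = x0, y0, size, level
        self.children = []            # empty list => leaf block carrying a mesh

    def refine(self):
        """Split a leaf into four children with half the block size."""
        h = self.size / 2.0
        self.children = [Block(self.x0 + i * h, self.y0 + j * h, h,
                               self.level + 1)
                         for j in (0, 1) for i in (0, 1)]

def adapt(block, needs_refinement, max_level=4):
    """Recursively refine leaf blocks wherever the application asks for it."""
    if not block.children and block.level < max_level and needs_refinement(block):
        block.refine()
    for child in block.children:
        adapt(child, needs_refinement, max_level)

def leaves(block):
    """Collect the leaf blocks that actually carry the solution mesh."""
    if not block.children:
        return [block]
    return [leaf for child in block.children for leaf in leaves(child)]

# Resolve a feature near the origin of a unit square: blocks whose corner
# lies within radius 0.3 of the origin are refined down to max_level.
root = Block(0.0, 0.0, 1.0)
adapt(root, lambda b: (b.x0 ** 2 + b.y0 ** 2) ** 0.5 < 0.3)
```

In three dimensions each split would produce eight children (an oct-tree), and in the real library each leaf holds a structured mesh with guard cells exchanged between neighbouring blocks.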

  7. Mountain-Wave Induced Rotors in the Lee of Three-Dimensional Ridges

    NASA Astrophysics Data System (ADS)

    Doyle, J. D.; Durran, D. R.

    2003-12-01

    Mountain waves forced by elongated ridges are often accompanied by low-level vortices that have horizontal circulation axes parallel to the ridgeline. These horizontal vortices, known as rotors, can be severe aeronautical hazards and have been cited as contributing to numerous aircraft accidents. In spite of their obvious importance, mountain-induced rotors still remain poorly understood, particularly with respect to three-dimensional aspects of the flow. In this study, the dynamics of rotors forced by three-dimensional topography are investigated through a series of high-resolution idealized simulations with the non-hydrostatic COAMPS model. The focus of this investigation is on the internal structure of rotors and in particular on the dynamics of small-scale intense circulations within rotors that we refer to as "sub-rotors". These are the first known simulations of sub-rotors in three dimensions, likely because explicit simulations have only just recently become computationally feasible with the new generation of massively parallel computers. The calculations were performed on an SGI Origin 3000 at the DoD Major Shared Resource Facility High Performance Computing Facility at the U.S. Army Engineer Research and Development Center (ERDC) in Vicksburg, Mississippi as part of the DoD Challenge program. Simulations are conducted using an upstream reference state representative of the conditions under which rotors form in the real atmosphere; in particular, a vertical profile approximating the conditions upstream of the Colorado Front Range at 1200 UTC 3 March 1991. This is a few hours prior to a B737 crash at the Colorado Springs, CO airport that was initially linked to rotors, and near the time when rotor clouds were observed in the vicinity. The topography is specified as a 1000-m high elongated ridge with a half-width of 15 km on the upstream portion and 5 km on the downstream side.
In several experiments, a 500-m circular peak with a half-width of 7.5 km is used to investigate the sensitivity of the rotor dynamics to topographic variations in the cross-flow direction. As many as six nested grids are used, with a minimum horizontal resolution of 22 m and 90 vertical levels, in order to resolve the internal rotor structure and sub-rotors. The simulation results indicate that a thin sheet of high-vorticity fluid develops adjacent to the ground along the lee slope and then ascends abruptly as it is advected into the updraft at the leading edge of the first trapped lee wave. This vortex sheet is primarily forced by mechanical shear associated with frictional processes at the surface. Instability of the horizontal vortex sheet occurs along the leading edge of the "parent" rotor, and as a result coherent sub-rotor circulations subsequently develop. These sub-rotors intensify and are advected downstream, or back toward the mountain into the parent rotor at low levels, leading to an enhancement of the near-surface horizontal vorticity. Horizontal vorticity within the sub-rotors is enhanced severalfold. The horizontal vorticity generation appears to be enhanced near the edges of the wake emanating from the circular peak, due to vortex stretching of the parent rotor, and is further maximized by stretching associated with three-dimensional turbulent eddies. The results suggest that preferred regions of intense rotors may exist near topographic features that enhance vortex stretching.

  8. de Sitter space from dilatino condensates in massive IIA supergravity

    NASA Astrophysics Data System (ADS)

    Souères, Bertrand; Tsimpis, Dimitrios

    2018-02-01

    We use the superspace formulation of (massive) IIA supergravity to obtain the explicit form of the dilatino terms, and we find that the quartic-dilatino term is positive. The theory admits a ten-dimensional de Sitter solution, obtained by assuming a nonvanishing quartic-dilatino condensate which generates a positive cosmological constant. Moreover, in the presence of dilatino condensates, the theory admits formal four-dimensional de Sitter solutions of the form dS4 × M6, where M6 is a six-dimensional Kähler-Einstein manifold of positive scalar curvature.

  9. Highly Parallel Alternating Directions Algorithm for Time Dependent Problems

    NASA Astrophysics Data System (ADS)

    Ganzha, M.; Georgiev, K.; Lirkov, I.; Margenov, S.; Paprzycki, M.

    2011-11-01

    In our work, we consider the time-dependent Stokes equation on a finite time interval and on a uniform rectangular mesh, written in terms of velocity and pressure. For this problem, a parallel algorithm based on a novel direction splitting approach is developed. Here, the pressure equation is derived from a perturbed form of the continuity equation, in which the incompressibility constraint is penalized in a negative norm induced by the direction splitting. The scheme used in the algorithm is composed of two parts: (i) velocity prediction, and (ii) pressure correction. This is a Crank-Nicolson-type two-stage time integration scheme for two- and three-dimensional parabolic problems, in which the second-order derivative with respect to each space variable is treated implicitly while the other variables are treated explicitly at each time sub-step. In order to achieve good parallel performance, the solution of the Poisson problem for the pressure correction is replaced by solving a sequence of one-dimensional second-order elliptic boundary value problems in each spatial direction. The parallel code is implemented using the standard MPI functions and tested on two modern parallel computer systems. The performed numerical tests demonstrate a good level of parallel efficiency and scalability of the studied direction-splitting-based algorithm.
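Each one-dimensional second-order elliptic boundary value problem in such a directional sweep discretizes to a tridiagonal linear system, which can be solved in linear time. A minimal sketch of such a solve using the Thomas algorithm, applied to a hypothetical model problem (not the authors' code):

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system: sub-diagonal a, diagonal b,
    super-diagonal c, right-hand side d (Thomas algorithm, O(n))."""
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):          # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# 1-D model problem: -u'' = f on (0, 1), u(0) = u(1) = 0
n = 100
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
f = np.pi**2 * np.sin(np.pi * x)           # exact solution u = sin(pi x)
a = np.full(n, -1.0 / h**2); a[0] = 0.0    # sub-diagonal (first entry unused)
c = np.full(n, -1.0 / h**2); c[-1] = 0.0   # super-diagonal (last entry unused)
b = np.full(n, 2.0 / h**2)                 # diagonal
u = thomas_solve(a, b, c, f)
err = np.max(np.abs(u - np.sin(np.pi * x)))
```

In a direction-splitting scheme, each process can sweep its own set of grid lines independently in each direction, which is the source of the parallelism.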

  10. The portals 4.0.1 network programming interface.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin

    2013-04-01

    This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

  11. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  12. Massively parallel quantum computer simulator

    NASA Astrophysics Data System (ADS)

    De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

    2007-01-01

    We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700, and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
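The memory figures follow from the state-vector representation: n qubits require 2^n complex amplitudes, so 36 qubits at 16 bytes per amplitude is about 1 TB, which must be distributed across nodes. A minimal single-node sketch of gate application on a state vector (an illustration of the representation, not the authors' distributed simulator):

```python
import numpy as np

def apply_single_qubit_gate(state, gate, target, n_qubits):
    """Apply a 2x2 gate to the `target` qubit of an n-qubit state vector.
    The vector has 2**n_qubits amplitudes, which is why memory grows
    exponentially with the number of simulated qubits."""
    psi = state.reshape([2] * n_qubits)       # one axis per qubit
    psi = np.moveaxis(psi, target, 0)         # bring target axis to front
    psi = np.tensordot(gate, psi, axes=([1], [0]))
    psi = np.moveaxis(psi, 0, target)         # restore axis order
    return psi.reshape(-1)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
n = 3
state = np.zeros(2**n, dtype=complex)
state[0] = 1.0                                # start in |000>
for q in range(n):                            # uniform superposition
    state = apply_single_qubit_gate(state, H, q, n)
```

After the three Hadamards, every basis state carries probability 1/8, the expected uniform superposition.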

  13. Dual-dimensional microscopy: real-time in vivo three-dimensional observation method using high-resolution light-field microscopy and light-field display.

    PubMed

    Kim, Jonghyun; Moon, Seokil; Jeong, Youngmo; Jang, Changwon; Kim, Youngmin; Lee, Byoungho

    2018-06-01

    Here, we present dual-dimensional microscopy that captures both two-dimensional (2-D) and light-field images of an in-vivo sample simultaneously, synthesizes an upsampled light-field image in real time, and visualizes it with a computational light-field display system in real time. Compared with conventional light-field microscopy, the additional 2-D image greatly enhances the lateral resolution at the native object plane up to the diffraction limit and compensates for the image degradation at the native object plane. The whole process from capturing to displaying is done in real time with the parallel computation algorithm, which enables the observation of the sample's three-dimensional (3-D) movement and direct interaction with the in-vivo sample. We demonstrate a real-time 3-D interactive experiment with Caenorhabditis elegans. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  14. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamic global mapping of contended links

    DOEpatents

    Archer, Charles Jens [Rochester, MN; Musselman, Roy Glenn [Rochester, MN; Peters, Amanda [Rochester, MN; Pinnow, Kurt Walter [Rochester, MN; Swartz, Brent Allen [Chippewa Falls, WI; Wallenfelt, Brian Paul [Eden Prairie, MN

    2011-10-04

    A massively parallel nodal computer system periodically collects and broadcasts usage data for an internal communications network. A node sending data over the network makes a global routing determination using the network usage data. Preferably, network usage data comprises an N-bit usage value for each output buffer associated with a network link. An optimum routing is determined by summing the N-bit values associated with each link through which a data packet must pass, and comparing the sums associated with different possible routes.
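The route-selection rule described above can be paraphrased in a few lines: sum the usage values over the links of each candidate route and choose the route with the smallest sum. A hypothetical sketch (node names, data layout, and values are illustrative, not from the patent):

```python
# usage[(a, b)] holds the N-bit usage counter broadcast for the
# network link from node a to node b.

def route_cost(route, usage):
    """Sum the usage values of every link along a candidate route."""
    return sum(usage[link] for link in route)

def choose_route(routes, usage):
    """Pick the candidate route with the lowest total usage."""
    return min(routes, key=lambda r: route_cost(r, usage))

usage = {("A", "B"): 3, ("B", "D"): 7, ("A", "C"): 2, ("C", "D"): 1}
routes = [
    [("A", "B"), ("B", "D")],  # total usage 10
    [("A", "C"), ("C", "D")],  # total usage 3
]
best = choose_route(routes, usage)
```

Here the route through C wins with a total usage of 3 against 10, so the sending node would dispatch the packet along A-C-D.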

  15. Scalable Visual Analytics of Massive Textual Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krishnan, Manoj Kumar; Bohn, Shawn J.; Cowley, Wendy E.

    2007-04-01

    This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools aid information analysts in interacting with and understanding large textual information content through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte datasets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.

  16. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-03-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.

  17. Parallel phase-shifting self-interference digital holography with faithful reconstruction using compressive sensing

    NASA Astrophysics Data System (ADS)

    Wan, Yuhong; Man, Tianlong; Wu, Fan; Kim, Myung K.; Wang, Dayong

    2016-11-01

    We present a new self-interference digital holographic approach that allows single-shot capture of the three-dimensional intensity distribution of spatially incoherent objects. Fresnel incoherent correlation holographic microscopy is combined with a parallel phase-shifting technique to instantaneously obtain spatially multiplexed phase-shifting holograms. A compressive-sensing-based reconstruction algorithm is implemented to reconstruct the original object from the undersampled demultiplexed holograms. The scheme is verified with simulations. The validity of the proposed method is demonstrated experimentally, in an indirect way, by simulating the use of a specific parallel phase-shifting recording device.

  18. Two-dimensional shape recognition using sparse distributed memory

    NASA Technical Reports Server (NTRS)

    Kanerva, Pentti; Olshausen, Bruno

    1990-01-01

    Researchers propose a method for recognizing two-dimensional shapes (hand-drawn characters, for example) with an associative memory. The method consists of two stages: first, the image is preprocessed to extract tangents to the contour of the shape; second, the set of tangents is converted to a long bit string for recognition with sparse distributed memory (SDM). SDM provides a simple, massively parallel architecture for an associative memory. Long bit vectors (256 to 1000 bits, for example) serve as both data and addresses to the memory, and patterns are grouped or classified according to similarity in Hamming distance. At the moment, tangents are extracted in a simple manner by progressively blurring the image and then using a Canny-type edge detector (Canny, 1986) to find edges at each stage of blurring. This results in a grid of tangents. While the technique used for obtaining the tangents is at present rather ad hoc, researchers plan to adopt an existing framework for extracting edge orientation information over a variety of resolutions, such as suggested by Watson (1987, 1983), Marr and Hildreth (1980), or Canny (1986).
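Classification by similarity in Hamming distance, as used in the second stage, can be sketched as a nearest-neighbor lookup over long bit vectors. This simplified sketch (illustrative data, not the researchers' system) omits what makes SDM "sparse and distributed", namely spreading each write across many hard storage locations:

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming(a, b):
    """Hamming distance between two bit vectors (number of differing bits)."""
    return int(np.count_nonzero(a != b))

def classify(query, stored, labels):
    """Return the label of the stored pattern nearest in Hamming distance."""
    dists = [hamming(query, s) for s in stored]
    return labels[int(np.argmin(dists))]

# Two stored 256-bit patterns standing in for encoded tangent sets
n_bits = 256
patterns = [rng.integers(0, 2, n_bits) for _ in range(2)]
labels = ["circle", "square"]

# A noisy copy of pattern 0: flip 20 of its 256 bits
noisy = patterns[0].copy()
flip = rng.choice(n_bits, size=20, replace=False)
noisy[flip] ^= 1
guess = classify(noisy, patterns, labels)
```

Because two unrelated random 256-bit vectors differ in about 128 bits on average, a query 20 bits away from its source pattern is still classified correctly with overwhelming probability.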

  19. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets are characterized by large volume, a high degree of spatiotemporal complexity, and rapid evolution over time. As visualizing large distributed climate datasets is computationally intensive, traditional desktop-based visualization applications fail to handle the computational intensity. Recently, scientists have developed remote visualization techniques to address this computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients through the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built on ParaView, one of the most popular open-source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support its deployment. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access are supported: accessing remote datasets provided by OPeNDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the data access method, all visualization tasks are completed at the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitations of desktop-based visualization applications.

  20. A Chip-Capillary Hybrid Device for Automated Transfer of Sample Pre-Separated by Capillary Isoelectric Focusing to Parallel Capillary Gel Electrophoresis for Two-Dimensional Protein Separation

    PubMed Central

    Lu, Joann J.; Wang, Shili; Li, Guanbin; Wang, Wei; Pu, Qiaosheng; Liu, Shaorong

    2012-01-01

    In this report, we introduce a chip-capillary hybrid device to integrate capillary isoelectric focusing (CIEF) with parallel capillary sodium dodecyl sulfate – polyacrylamide gel electrophoresis (SDS-PAGE) or capillary gel electrophoresis (CGE) toward automating two-dimensional (2D) protein separations. The hybrid device consists of three chips that are butted together. The middle chip can be moved between two positions to re-route the fluidic paths, which enables the performance of CIEF and injection of proteins partially resolved by CIEF to CGE capillaries for parallel CGE separations in a continuous and automated fashion. Capillaries are attached to the other two chips to facilitate CIEF and CGE separations and to extend the effective lengths of CGE columns. Specifically, we illustrate the working principle of the hybrid device, develop protocols for producing and preparing the hybrid device, and demonstrate the feasibility of using this hybrid device for automated injection of CIEF-separated sample to parallel CGE for 2D protein separations. Potentials and problems associated with the hybrid device are also discussed. PMID:22830584

  1. Parallel processing architecture for H.264 deblocking filter on multi-core platforms

    NASA Astrophysics Data System (ADS)

    Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao

    2012-03-01

    Massively parallel computing (multi-core) chips offer outstanding new solutions that satisfy the increasing demand for high-resolution, high-quality video compression technologies such as H.264. Such solutions provide not only exceptional quality but also efficiency, low power, and low latency, previously unattainable in software-based designs. While custom hardware and Application Specific Integrated Circuit (ASIC) technologies may achieve low-latency, low-power, and real-time performance in some consumer devices, many applications require a flexible and scalable software-defined solution. The deblocking filter in the H.264 encoder/decoder poses difficult implementation challenges because of heavy data dependencies and the conditional nature of the computations. Deblocking filter implementations tend to be fixed and difficult to reconfigure for different needs. The ability to scale up for higher quality requirements, such as 10-bit pixel depth or a 4:2:2 chroma format, often reduces the throughput of a parallel architecture designed for a lower feature set. A scalable architecture for deblocking filtering, created with a massively parallel processor-based solution, means that the same encoder or decoder can be deployed in a variety of applications, at different video resolutions, for different power requirements, and at higher bit depths and richer color subsampling patterns such as YUV 4:2:2 or 4:4:4. Low-power, software-defined encoders/decoders may be implemented using a massively parallel processor array, like that found in HyperX technology, with 100 or more cores and distributed memory. The large number of processor elements allows the silicon device to operate more efficiently than conventional DSP or CPU technology. This software programming model for massively parallel processors offers a flexible implementation and a power efficiency close to that of ASIC solutions.
This work describes a scalable parallel architecture for an H.264-compliant deblocking filter for multi-core platforms such as HyperX technology. Parallel techniques such as parallel processing at the level of independent macroblocks, sub-blocks, and pixel rows are examined. The deblocking architecture consists of a basic cell called the deblocking filter unit (DFU) and a dependent data buffer manager (DFM). The DFU can be instantiated multiple times to meet different performance needs; the DFM supplies the data required by the DFUs and also manages all the neighboring data required for their future processing. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.

  2. Parallel-Processing Software for Correlating Stereo Images

    NASA Technical Reports Server (NTRS)

    Klimeck, Gerhard; Deen, Robert; Mcauley, Michael; DeJong, Eric

    2007-01-01

    A computer program implements parallel-processing algorithms for correlating images of terrain acquired by stereoscopic pairs of digital stereo cameras on an exploratory robotic vehicle (e.g., a Mars rover). Such correlations are used to create three-dimensional computational models of the terrain for navigation. In this program, the scene viewed by the cameras is segmented into subimages. Each subimage is assigned to one of a number of central processing units (CPUs) operating simultaneously.

  3. Massively parallel data processing for quantitative total flow imaging with optical coherence microscopy and tomography

    NASA Astrophysics Data System (ADS)

    Sylwestrzak, Marcin; Szlag, Daniel; Marchand, Paul J.; Kumar, Ashwin S.; Lasser, Theo

    2017-08-01

    We present an application of massively parallel processing to quantitative flow measurement data acquired using spectral optical coherence microscopy (SOCM). The need for massive signal processing of these particular datasets has been a major hurdle for many applications based on SOCM. In view of this difficulty, we implemented and adapted quantitative total flow estimation algorithms on graphics processing units (GPUs) and achieved a 150-fold reduction in processing time compared to a former CPU implementation. As SOCM constitutes the microscopy counterpart of spectral optical coherence tomography (SOCT), the developed processing procedure can be applied to both imaging modalities. We present the developed DLL library integrated in MATLAB (with an example) and have included the source code for adaptations and future improvements. Catalogue identifier: AFBT_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AFBT_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GNU GPLv3. No. of lines in distributed program, including test data, etc.: 913552. No. of bytes in distributed program, including test data, etc.: 270876249. Distribution format: tar.gz. Programming language: CUDA/C, MATLAB. Computer: Intel x64 CPU, GPU supporting CUDA technology. Operating system: 64-bit Windows 7 Professional. Has the code been vectorized or parallelized?: Yes; the CPU code has been vectorized in MATLAB and the CUDA code has been parallelized. RAM: dependent on the user's parameters, typically between several gigabytes and several tens of gigabytes. Classification: 6.5, 18. 
Nature of problem: speed-up of data processing in optical coherence microscopy. Solution method: utilization of GPUs for massively parallel data processing. Additional comments: compiled DLL library with source code and documentation, and an example of utilization (MATLAB script with raw data). Running time: 1.8 s for one B-scan (150× faster than the CPU data processing time).

  4. Impact of turbulence anisotropy near walls in room airflow.

    PubMed

    Schälin, A; Nielsen, P V

    2004-06-01

    The influence of different turbulence models used in computational fluid dynamics predictions is studied in connection with room air movement. The turbulence models used are the high-Re-number kappa-epsilon model and the high-Re-number Reynolds stress model (RSM). The three-dimensional wall jet is selected for this work. The growth rate parallel to the wall in a three-dimensional wall jet is large compared with the growth rate perpendicular to the wall, and also large compared with the growth rate in a free circular jet. It is shown that it is not possible to predict the high growth rate parallel to the surface in a three-dimensional wall jet with the kappa-epsilon turbulence model. Furthermore, it is shown that the growth rate can be predicted to a certain extent by the RSM with wall reflection terms. The flow in a deep room can be strongly influenced by details such as the growth rate of a three-dimensional wall jet. Predictions by a kappa-epsilon model and the RSM show large deviations in the occupied zone. Measurements and observations of streamline patterns in model experiments indicate that a reasonable solution is obtained by the RSM compared with the solution obtained by the kappa-epsilon model. Computational fluid dynamics (CFD) is often used for the prediction of air distribution in rooms and for the evaluation of thermal comfort and indoor air quality. The most used turbulence model in CFD is the kappa-epsilon model. This model often produces good results; however, some cases require more sophisticated models. The prediction of a three-dimensional wall jet is improved if it is made by a Reynolds stress model (RSM). This model improves the prediction of the velocity level in the jet, and in some special cases it may influence the entire flow in the occupied zone.

  5. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    NASA Astrophysics Data System (ADS)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  6. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    PubMed

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  7. Optical Symbolic Computing

    NASA Astrophysics Data System (ADS)

    Neff, John A.

    1989-12-01

    Experiments originating from Gestalt psychology have shown that representing information in a symbolic form provides a more effective means to understanding. Computer scientists have been struggling for the last two decades to determine how best to create, manipulate, and store collections of symbolic structures. In the past, much of this struggling led to software innovations because that was the path of least resistance. For example, the development of heuristics for organizing the searching through knowledge bases was much less expensive than building massively parallel machines that could search in parallel. That is now beginning to change with the emergence of parallel architectures which are showing the potential for handling symbolic structures. This paper will review the relationships between symbolic computing and parallel computing architectures, and will identify opportunities for optics to significantly impact the performance of such computing machines. Although neural networks are an exciting subset of massively parallel computing structures, this paper will not touch on this area since it is receiving a great deal of attention in the literature. That is, the concepts presented herein do not consider the distributed representation of knowledge.

  8. Duct flow nonuniformities: Effect of struts in SSME HGM II(+)

    NASA Technical Reports Server (NTRS)

    Burke, Roger

    1988-01-01

    A numerical study, using the INS3D flow solver, of laminar and turbulent flow around a two dimensional strut, and three dimensional flow around a strut in an annulus is presented. A multi-block procedure was used to calculate two dimensional laminar flow around two struts in parallel, with each strut represented by one computational block. Single block calculations were performed for turbulent flow around a two dimensional strut, using a Baldwin-Lomax turbulence model to parameterize the turbulent shear stresses. A modified Baldwin-Lomax model was applied to the case of a three dimensional strut in an annulus. The results displayed the essential features of wing-body flows, including the presence of a horseshoe vortex system at the junction of the strut and the lower annulus surface. A similar system was observed at the upper annulus surface. The test geometries discussed were useful in developing the capability to perform multiblock calculations, and to simulate turbulent flow around obstructions located between curved walls. Both of these skills will be necessary to model the three dimensional flow in the strut assembly of the SSME. Work is now in progress on performing a three dimensional two block turbulent calculation of the flow in the turnaround duct (TAD) and strut/fuel bowl juncture region.

  9. ColDICE: A parallel Vlasov–Poisson solver using moving adaptive simplicial tessellation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sousbie, Thierry, E-mail: tsousbie@gmail.com; Department of Physics, The University of Tokyo, Tokyo 113-0033; Research Center for the Early Universe, School of Science, The University of Tokyo, Tokyo 113-0033

    2016-09-15

    Resolving the Vlasov–Poisson equations numerically for initially cold systems can be reduced to following the evolution of a three-dimensional sheet evolving in six-dimensional phase-space. We describe a public parallel numerical algorithm consisting of representing the phase-space sheet with a conforming, self-adaptive simplicial tessellation whose vertices follow the Lagrangian equations of motion. The algorithm is implemented in both six- and four-dimensional phase-space. Refinement of the tessellation mesh is performed using the bisection method and a local representation of the phase-space sheet at second order, relying on additional tracers created when needed at runtime. In order to preserve in the best way the Hamiltonian nature of the system, refinement is anisotropic and constrained by measurements of local Poincaré invariants. Resolution of the Poisson equation is performed using the fast Fourier method on a regular rectangular grid, similarly to particle-in-cell codes. To compute the density projected onto this grid, the intersection of the tessellation and the grid is calculated using the method of Franklin and Kankanhalli [65–67] generalised to linear order. As preliminary tests of the code, we study in four-dimensional phase-space the evolution of an initially small patch in a chaotic potential and the cosmological collapse of a fluctuation composed of two sinusoidal waves. We also perform a "warm" dark matter simulation in six-dimensional phase-space that we use to check the parallel scaling of the code.

  10. A High-Order Method Using Unstructured Grids for the Aeroacoustic Analysis of Realistic Aircraft Configurations

    NASA Technical Reports Server (NTRS)

    Atkins, Harold L.; Lockard, David P.

    1999-01-01

    A method for the prediction of acoustic scatter from complex geometries is presented. The discontinuous Galerkin method provides a framework for the development of a high-order method using unstructured grids. The method's compact form contributes to its accuracy and efficiency, and makes the method well suited for distributed-memory parallel computing platforms. Mesh refinement studies are presented to validate the expected convergence properties of the method, and to establish the absolute levels of error one can expect at a given level of resolution. For a two-dimensional shear layer instability wave and for three-dimensional wave propagation, the method is demonstrated to be insensitive to mesh smoothness. Simulations of scatter from a two-dimensional slat configuration and a three-dimensional blended-wing-body demonstrate the capability of the method to efficiently treat realistic geometries.

  11. Three-dimensional real-time imaging of bi-phasic flow through porous media

    NASA Astrophysics Data System (ADS)

    Sharma, Prerna; Aswathi, P.; Sane, Anit; Ghosh, Shankar; Bhattacharya, S.

    2011-11-01

    We present a scanning laser-sheet video imaging technique to image bi-phasic flow in three-dimensional porous media in real time with pore-scale spatial resolution, i.e., 35 μm and 500 μm for directions parallel and perpendicular to the flow, respectively. The technique is illustrated for the case of viscous fingering. Using suitable image processing protocols, both the morphology and the movement of the two-fluid interface were quantitatively estimated. Furthermore, a macroscopic parameter such as the displacement efficiency obtained from a microscopic (pore-scale) analysis demonstrates the versatility and usefulness of the method.

  12. Parallel image-acquisition in continuous-wave electron paramagnetic resonance imaging with a surface coil array: Proof-of-concept experiments

    NASA Astrophysics Data System (ADS)

    Enomoto, Ayano; Hirata, Hiroshi

    2014-02-01

    This article describes a feasibility study of parallel image-acquisition using a two-channel surface coil array in continuous-wave electron paramagnetic resonance (CW-EPR) imaging. Parallel EPR imaging was performed by multiplexing of EPR detection in the frequency domain. The parallel acquisition system consists of two surface coil resonators and radiofrequency (RF) bridges for EPR detection. To demonstrate the feasibility of this method of parallel image-acquisition with a surface coil array, three-dimensional EPR imaging was carried out using a tube phantom. Technical issues in the multiplexing method of EPR detection were also clarified. We found that degradation in the signal-to-noise ratio due to the interference of RF carriers is a key problem to be solved.

  13. The tidal disruption of a star by a massive black hole

    NASA Technical Reports Server (NTRS)

    Evans, Charles R.; Kochanek, Christopher S.

    1989-01-01

    Results are reported from a three-dimensional numerical calculation of the tidal disruption of a low-mass main-sequence star on a parabolic orbit around a massive black hole (Mh = 10^6 solar masses). The postdisruption evolution is followed until hydrodynamic forces become negligible and the liberated gas becomes ballistic. Also given is the rate at which bound mass returns to pericenter after orbiting the hole once. The processes that determine the time scale to circularize the debris orbits and allow an accretion torus to form are discussed. This time scale and the time scales for radiative cooling and accretion inflow determine the onset and duration of the subsequent flare in the AGN luminosity.

  14. Dependence of energy characteristics of ascending swirling air flow on velocity of vertical blowing

    NASA Astrophysics Data System (ADS)

    Volkov, R. E.; Obukhov, A. G.; Kutrunov, V. N.

    2018-05-01

    In the model of a compressible continuous medium, an initial-boundary-value problem for the complete Navier-Stokes system of equations is proposed that corresponds to conducted and planned experiments and describes complex three-dimensional flows of a viscous compressible heat-conducting gas in ascending swirling flows initiated by a vertical cold blowing. Using parallelization methods, three-dimensional nonstationary flows of a polytropic viscous compressible heat-conducting gas are constructed numerically for ascending swirling flows of different scales under the action of gravity and Coriolis forces. With the help of explicit difference schemes and the proposed initial boundary conditions, approximate solutions of the complete system of Navier-Stokes equations are constructed, and the velocity and energy characteristics of three-dimensional nonstationary gas flows in ascending swirling flows are determined.

  15. A heterogeneous computing environment for simulating astrophysical fluid flows

    NASA Technical Reports Server (NTRS)

    Cazes, J.

    1994-01-01

    In the Concurrent Computing Laboratory in the Department of Physics and Astronomy at Louisiana State University we have constructed a heterogeneous computing environment that permits us to routinely simulate complicated three-dimensional fluid flows and to readily visualize the results of each simulation via three-dimensional animation sequences. An 8192-node MasPar MP-1 computer with 0.5 GBytes of RAM provides 250 MFlops of execution speed for our fluid flow simulations. Utilizing the parallel virtual machine (PVM) language, at periodic intervals data is automatically transferred from the MP-1 to a cluster of workstations where individual three-dimensional images are rendered for inclusion in a single animation sequence. Work is underway to replace executions on the MP-1 with simulations performed on the 512-node CM-5 at NCSA and to simultaneously gain access to more potent volume rendering workstations.

  16. Numerical solution of supersonic three-dimensional free-mixing flows using the parabolic-elliptic Navier-Stokes equations

    NASA Technical Reports Server (NTRS)

    Hirsh, R. S.

    1976-01-01

    A numerical method is presented for solving the parabolic-elliptic Navier-Stokes equations. The solution procedure is applied to three-dimensional supersonic laminar jet flow issuing parallel with a supersonic free stream. A coordinate transformation is introduced which maps the boundaries at infinity into a finite computational domain in order to eliminate difficulties associated with the imposition of free-stream boundary conditions. Results are presented for an approximate circular jet, a square jet, varying aspect ratio rectangular jets, and interacting square jets. The solution behavior varies from axisymmetric to nearly two-dimensional in character. For cases where comparisons of the present results with those obtained from shear layer calculations could be made, agreement was good.
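A coordinate transformation that maps boundaries at infinity into a finite computational domain, as used above, is typically an algebraic stretching. The particular map η = y/(y + c) below is an illustrative choice, not necessarily the one in the report.

```python
import numpy as np

# Algebraic mapping eta = y / (y + c): sends y in [0, inf) to
# eta in [0, 1), so free-stream conditions can be imposed at a
# finite boundary; c controls clustering of points near y = 0.
def to_computational(y, c=1.0):
    return y / (y + c)

def to_physical(eta, c=1.0):
    # Inverse map: y = c * eta / (1 - eta).
    return c * eta / (1.0 - eta)

# A uniform grid in eta clusters physical points near the jet and
# pushes the outermost point far into the free stream.
eta = np.linspace(0.0, 0.99, 100)
y = to_physical(eta)
assert np.allclose(to_computational(y), eta)   # round trip
assert abs(y[-1] - 99.0) < 1e-6                # eta=0.99 -> y = 0.99/0.01
```

Derivatives in the governing equations pick up a metric factor dη/dy = c/(y + c)², which is what concentrates resolution where the jet shear layers live.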

  17. Transport of cosmic-ray protons in intermittent heliospheric turbulence: Model and simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alouani-Bibi, Fathallah; Le Roux, Jakobus A., E-mail: fb0006@uah.edu

    The transport of charged energetic particles in the presence of strong intermittent heliospheric turbulence is computationally analyzed based on known properties of the interplanetary magnetic field and solar wind plasma at 1 astronomical unit. The turbulence is assumed to be static, composite, and quasi-three-dimensional with a varying energy distribution between a one-dimensional Alfvénic (slab) and a structured two-dimensional component. The spatial fluctuations of the turbulent magnetic field are modeled either as homogeneous with a Gaussian probability distribution function (PDF), or as intermittent on large and small scales with a q-Gaussian PDF. Simulations showed that energetic particle diffusion coefficients both parallel and perpendicular to the background magnetic field are significantly affected by intermittency in the turbulence. This effect is especially strong for parallel transport, where for large-scale intermittency results show an extended phase of subdiffusive parallel transport during which cross-field transport diffusion dominates. The effects of intermittency are found to depend on particle rigidity and the fraction of slab energy in the turbulence, yielding a perpendicular to parallel mean free path ratio close to 1 for large-scale intermittency. Investigation of higher order transport moments (kurtosis) indicates that non-Gaussian statistical properties of the intermittent turbulent magnetic field are present in the parallel transport, especially for low rigidity particles at all times.
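The higher-order moment diagnostic mentioned above (kurtosis) can be illustrated with synthetic samples. Here a Student-t distribution stands in for the intermittent, heavy-tailed q-Gaussian field (for 1 < q < 3 a q-Gaussian is a rescaled Student-t); the sample size and degrees of freedom are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

gaussian = rng.normal(size=n)
# Student-t samples model an intermittent (q-Gaussian-like) field;
# df=6 gives excess kurtosis 3 in the infinite-sample limit.
heavy = rng.standard_t(df=6, size=n)

assert abs(excess_kurtosis(gaussian)) < 0.1   # consistent with zero
assert excess_kurtosis(heavy) > 0.5           # clearly non-Gaussian
```

The same statistic, applied to simulated particle displacements rather than field samples, is how non-Gaussian transport signatures of the kind reported above are detected.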

  18. The Destructive Birth of Massive Stars and Massive Star Clusters

    NASA Astrophysics Data System (ADS)

    Rosen, Anna; Krumholz, Mark; McKee, Christopher F.; Klein, Richard I.; Ramirez-Ruiz, Enrico

    2017-01-01

    Massive stars play an essential role in the Universe. They are rare, yet the energy and momentum they inject into the interstellar medium with their intense radiation fields dwarfs the contribution by their vastly more numerous low-mass cousins. Previous theoretical and observational studies have concluded that the feedback associated with massive stars' radiation fields is the dominant mechanism regulating massive star and massive star cluster (MSC) formation. Therefore detailed simulation of the formation of massive stars and MSCs, which host hundreds to thousands of massive stars, requires an accurate treatment of radiation. For this purpose, we have developed a new, highly accurate hybrid radiation algorithm that properly treats the absorption of the direct radiation field from stars and the re-emission and processing by interstellar dust. We use our new tool to perform a suite of three-dimensional radiation-hydrodynamic simulations of the formation of massive stars and MSCs. For individual massive stellar systems, we simulate the collapse of massive pre-stellar cores with laminar and turbulent initial conditions and properly resolve regions where we expect instabilities to grow. We find that mass is channeled to the massive stellar system via gravitational and Rayleigh-Taylor (RT) instabilities. For laminar initial conditions, proper treatment of the direct radiation field produces later onset of RT instability, but does not suppress it entirely provided the edges of the radiation-dominated bubbles are adequately resolved. RT instabilities arise immediately for turbulent pre-stellar cores because the initial turbulence seeds the instabilities. To model MSC formation, we simulate the collapse of a dense, turbulent, magnetized Mcl = 10^6 M⊙ molecular cloud. We find that the influence of the magnetic pressure and radiative feedback slows down star formation. Furthermore, we find that star formation is suppressed along dense filaments where the magnetic field is amplified. Our results suggest that the combined effect of turbulence, magnetic pressure, and radiative feedback from massive stars is responsible for the low star formation efficiencies observed in molecular clouds.

  19. A hierarchical, automated target recognition algorithm for a parallel analog processor

    NASA Technical Reports Server (NTRS)

    Woodward, Gail; Padgett, Curtis

    1997-01-01

    A hierarchical approach is described for an automated target recognition (ATR) system, VIGILANTE, that uses a massively parallel, analog processor (3DANN). The 3DANN processor is capable of performing 64 concurrent inner products of size 1x4096 every 250 nanoseconds.

  20. Real-time electron dynamics for massively parallel excited-state simulations

    NASA Astrophysics Data System (ADS)

    Andrade, Xavier

    The simulation of the real-time dynamics of electrons, based on time-dependent density functional theory (TDDFT), is a powerful approach to study electronic excited states in molecular and crystalline systems. What makes the method attractive is its flexibility to simulate different kinds of phenomena beyond the linear-response regime, including strongly perturbed electronic systems and non-adiabatic electron-ion dynamics. Electron-dynamics simulations are also attractive from a computational point of view. They can run efficiently on massively parallel architectures due to the low communication requirements. Our implementations of electron dynamics, based on the codes Octopus (real-space) and Qball (plane-waves), allow us to simulate systems composed of thousands of atoms and to obtain good parallel scaling up to 1.6 million processor cores. Due to the versatility of real-time electron dynamics and its parallel performance, we expect it to become the method of choice to apply the capabilities of exascale supercomputers for the simulation of electronic excited states.
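As a toy illustration of real-time propagation (not the actual propagators in Octopus or Qball), the Crank-Nicolson step below is exactly unitary for a Hermitian Hamiltonian, so the norm of the propagated state is preserved over many steps. The 4x4 random Hermitian matrix is a made-up model Hamiltonian, not anything from the codes named above.

```python
import numpy as np

rng = np.random.default_rng(3)
h = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (h + h.conj().T) / 2.0          # Hermitian model Hamiltonian
dt = 0.05
eye = np.eye(4)

# Crank-Nicolson (Cayley) propagator:
# psi(t+dt) = (1 + i dt H/2)^(-1) (1 - i dt H/2) psi(t)
step = np.linalg.solve(eye + 0.5j * dt * H, eye - 0.5j * dt * H)

psi = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)
for _ in range(1000):
    psi = step @ psi

# Exact unitarity of the Cayley transform keeps the norm at 1.
assert np.isclose(np.linalg.norm(psi), 1.0)
```

Norm conservation of this kind is one reason implicit, norm-conserving propagators are attractive for long real-time TDDFT runs.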

  1. Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe

    A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals for the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. In conclusion, the chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted to date.

  2. Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

    DOE PAGES

    Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe; ...

    2017-11-14

    A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals for the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. In conclusion, the chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted to date.

  3. Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

    NASA Astrophysics Data System (ADS)

    Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe; Gagliardi, Laura; de Jong, Wibe A.

    2017-11-01

    A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals for the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. The chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted to date.
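The batching idea described above, partitioning a long CI vector into batches that can be distributed to the available cores, can be sketched in a toy form. The partial-norm reduction below stands in for the real per-batch sigma-vector work, and all sizes and names are illustrative, not taken from NWChem.

```python
import numpy as np

def make_batches(n_det, n_batches):
    """Split indices 0..n_det-1 into contiguous batches of nearly
    equal size, as when a long CI vector is distributed to cores."""
    edges = np.linspace(0, n_det, n_batches + 1, dtype=int)
    return [range(edges[i], edges[i + 1]) for i in range(n_batches)]

# Toy "CI vector": each core would own one batch and contribute a
# partial result; summing the partials reproduces the full norm.
n_det, n_cores = 1_000_003, 8
c = np.random.default_rng(1).normal(size=n_det)
batches = make_batches(n_det, n_cores)
assert sum(len(b) for b in batches) == n_det  # nothing lost or doubled

partial = [np.dot(c[b.start:b.stop], c[b.start:b.stop]) for b in batches]
assert np.isclose(sum(partial), np.dot(c, c))
```

In a real MPI implementation each batch lives on a different rank and the final sum is an all-reduce, but the bookkeeping invariant checked here (batches tile the vector exactly) is the same.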

  4. A comprehensive study of MPI parallelism in three-dimensional discrete element method (DEM) simulation of complex-shaped granular particles

    NASA Astrophysics Data System (ADS)

    Yan, Beichuan; Regueiro, Richard A.

    2018-02-01

    A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the message-passing interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for the design of the parallel algorithm, and theoretical functions for 3D DEM scalability and memory usage are derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and is tested on up to 2048 compute nodes simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code, including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles), are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
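The speedup and efficiency metrics reported in strong-scaling studies like the one above are computed from wall-clock times at a fixed problem size. The helper below shows the standard definitions; the timing numbers are invented for illustration.

```python
def strong_scaling(times_by_nodes):
    """times_by_nodes: {node_count: wall_time} at fixed problem size.
    Returns {node_count: (speedup, efficiency)} relative to the
    smallest run, since a true single-node baseline is often
    infeasible at scale."""
    base_n = min(times_by_nodes)
    base_t = times_by_nodes[base_n]
    out = {}
    for n, t in sorted(times_by_nodes.items()):
        speedup = base_t / t               # how much faster than baseline
        efficiency = speedup * base_n / n  # speedup per added node
        out[n] = (speedup, efficiency)
    return out

timings = {1: 1000.0, 2: 520.0, 4: 270.0, 8: 150.0}  # hypothetical seconds
metrics = strong_scaling(timings)
assert metrics[1] == (1.0, 1.0)
assert abs(metrics[8][0] - 1000.0 / 150.0) < 1e-12
```

With these made-up timings the 8-node efficiency is about 0.83, i.e. communication and load imbalance cost roughly 17% of ideal scaling.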

  5. Development of a Three-Dimensional PSE Code for Compressible Flows: Stability of Three-Dimensional Compressible Boundary Layers

    NASA Technical Reports Server (NTRS)

    Balakumar, P.; Jeyasingham, Samarasingham

    1999-01-01

    A program is developed to investigate the linear stability of three-dimensional compressible boundary layer flows over bodies of revolution. The problem is formulated as a two-dimensional (2D) eigenvalue problem incorporating the meanflow variations in the normal and azimuthal directions. Normal mode solutions are sought in the whole plane rather than in a line normal to the wall as is done in the classical one-dimensional (1D) stability theory. The stability characteristics of a supersonic boundary layer over a sharp cone with 5° half-angle at 2 degrees angle of attack are investigated. The 1D eigenvalue computations showed that the most amplified disturbances occur around x(sub 2) = 90 degrees and that the azimuthal mode numbers of the most amplified disturbances range between m = -30 and -40. The frequencies of the most amplified waves are smaller in the middle region, where the crossflow dominates the instability, than near the windward and leeward planes. The 2D eigenvalue computations showed that, due to the variations in the azimuthal direction, the eigenmodes are clustered into isolated confined regions; for some eigenvalues, the eigenfunctions are clustered in two regions. Due to the nonparallel effect in the azimuthal direction, the most amplified disturbances are shifted to 120 degrees, compared to 90 degrees for the parallel theory. It is also observed that the nonparallel amplification rates are smaller than those obtained from the parallel theory.

  6. A cost-effective methodology for the design of massively-parallel VLSI functional units

    NASA Technical Reports Server (NTRS)

    Venkateswaran, N.; Sriram, G.; Desouza, J.

    1993-01-01

    In this paper we propose a generalized methodology for the design of cost-effective massively-parallel VLSI Functional Units. This methodology is based on a technique of generating and reducing a massive bit-array on the mask-programmable PAcube VLSI array. This methodology unifies (maintains identical data flow and control) the execution of complex arithmetic functions on PAcube arrays. It is highly regular, expandable and uniform with respect to problem-size and wordlength, thereby reducing the communication complexity. The memory-functional unit interface is regular and expandable. Using this technique, functional units of dedicated processors can be mask-programmed on the naked PAcube arrays, reducing the turn-around time. The production cost of such dedicated processors can be drastically reduced since the naked PAcube arrays can be mass-produced. Analysis of the performance of functional units designed by our method yields promising results.

  7. Parallel computing on Unix workstation arrays

    NASA Astrophysics Data System (ADS)

    Reale, F.; Bocchino, F.; Sciortino, S.

    1994-12-01

    We have tested arrays of general-purpose Unix workstations used as MIMD systems for massive parallel computations. In particular we have solved numerically a demanding test problem with a 2D hydrodynamic code, developed to study astrophysical flows, by executing it on arrays either of DECstations 5000/200 on an Ethernet LAN, or of DECstations 3000/400, equipped with powerful Alpha processors, on an FDDI LAN. The code is appropriate for data-domain decomposition, and we have used a library for parallelization previously developed in our Institute, easily extended to work on Unix workstation arrays by using the PVM software toolset. We have compared the parallel efficiencies obtained on arrays of several processors to those obtained on a dedicated MIMD parallel system, namely a Meiko Computing Surface (CS-1), equipped with Intel i860 processors. We discuss the feasibility of using non-dedicated parallel systems and conclude that the convenience depends essentially on the size of the computational domain as compared to the relative processor power and network bandwidth. We point out that for future perspectives a parallel development of processor and network technology is important, and that the software still offers great opportunities for improvement, especially in terms of latency times in the message-passing protocols. In conditions of significant gain in terms of speedup, such workstation arrays represent a cost-effective approach to massive parallel computations.

  8. Passive contribution of the rotator cuff to abduction and joint stability.

    PubMed

    Tétreault, Patrice; Levasseur, Annie; Lin, Jenny C; de Guise, Jacques; Nuño, Natalia; Hagemeister, Nicola

    2011-11-01

    The purpose of this study is to compare shoulder joint biomechanics during abduction with and without intact non-functioning rotator cuff tissue. A cadaver model was devised to simulate the clinical findings seen in patients with a massive cuff tear. Eight full upper limb shoulder specimens were studied. Initially, the rotator cuff tendons were left intact, representing a non-functional rotator cuff, as seen in suprascapular nerve paralysis or in cuff repair with a patch. Subsequently, a massive rotator cuff tear was re-created. Three-dimensional kinematics and force requirements for shoulder abduction were analyzed for each condition using ten abduction cycles in the plane of the scapula. Mediolateral displacements of the glenohumeral rotation center (GHRC) during abduction with an intact non-functioning cuff were minimal, but massive cuff tear resulted in significant lateral displacement of the GHRC (p < 0.013). Similarly, massive cuff tear caused increased superior migration of the GHRC during abduction compared with intact non-functional cuff (p < 0.01). From 5 to 30° of abduction, force requirements were significantly less with an intact non-functioning cuff than with massive cuff tear (p < 0.009). During abduction, an intact but non-functioning rotator cuff resulted in decreased GHRC displacement in two axes as well as lowered the force requirement for abduction from 5 to 30° as compared with the results following a massive rotator cuff tear. This provides insight into the potential biomechanical effect of repairing massive rotator cuff tears with a biological or synthetic "patch," which is a new treatment for massive cuff tear.

  9. Topological charges in SL(2,R) covariant massive 11-dimensional and type IIB supergravity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Callister, Andrew K.; Smith, Douglas J.

    2009-12-15

    In this paper we construct closed expressions that correspond to the topological charges of the various 1/2-BPS states of the maximal 10- and 11-dimensional supergravity theories. These expressions are related to the structure of the supersymmetry algebras in curved spacetimes. We mainly focus on IIB supergravity and 11-dimensional supergravity in a double M9-brane background, with an emphasis on the SL(2,R) multiplet structure of the charges and how these map between theories. This includes the charges corresponding to the multiplets of 7- and 9-branes in IIB. We find that examining the possible multiplet structures of the charges provides another tool for exploring the spectrum of BPS states that appear in these theories. As a prerequisite to constructing the charges we determine the field equations and multiplet structure of the 11-dimensional gauge potentials, extending previous results on the subject. The massive gauge transformations of the fields are also discussed. We also demonstrate how these massive gauge transformations are compatible with the construction of an SL(2,R) covariant kinetic term in the 11-dimensional Kaluza-Klein monopole worldvolume action.

  10. Rotator cuff tear shape characterization: a comparison of two-dimensional imaging and three-dimensional magnetic resonance reconstructions.

    PubMed

    Gyftopoulos, Soterios; Beltran, Luis S; Gibbs, Kevin; Jazrawi, Laith; Berman, Phillip; Babb, James; Meislin, Robert

    2016-01-01

    The purpose of this study was to see if 3-dimensional (3D) magnetic resonance imaging (MRI) could improve our understanding of rotator cuff tendon tear shapes. We believed that 3D MRI would be more accurate than two-dimensional (2D) MRI for classifying tear shapes. We performed a retrospective review of MRI studies of patients with arthroscopically proven full-thickness rotator cuff tears. Two orthopedic surgeons reviewed the information for each case, including scope images, and characterized the shape of the cuff tear into crescent, longitudinal, U- or L-shaped longitudinal, and massive type. Two musculoskeletal radiologists reviewed the corresponding MRI studies independently and blind to the arthroscopic findings and characterized the shape on the basis of the tear's retraction and size using 2D MRI. The 3D reconstructions of each cuff tear were reviewed by each radiologist to characterize the shape. Statistical analysis included 95% confidence intervals and intraclass correlation coefficients. The study reviewed 34 patients. The accuracy for differentiating between crescent-shaped, longitudinal, and massive tears using measurements on 2D MRI was 70.6% for reader 1 and 67.6% for reader 2. The accuracy for tear shape characterization into crescent and longitudinal U- or L-shaped using 3D MRI was 97.1% for reader 1 and 82.4% for reader 2. When further characterizing the longitudinal tears as massive or not using 3D MRI, both readers had an accuracy of 76.9% (10 of 13). The overall accuracy of 3D MRI was 82.4% (56 of 68), significantly different (P = .021) from 2D MRI accuracy (64.7%). Our study has demonstrated that 3D MR reconstructions of the rotator cuff improve the accuracy of characterizing rotator cuff tear shapes compared with current 2D MRI-based techniques. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  11. The ABC of ACM: asteroids, Buffon and comets

    NASA Astrophysics Data System (ADS)

    Steel, D. I.

    1997-12-01

    Most of the participants in the ACM 96 conference would have made use of facilities in a building named for Georges-Louis Leclerc, the Comte de Buffon (1707-1788). Buffon made many major contributions to the natural sciences, and may be considered to be one of the founders of planetary science. He proposed a theory for the origin of the planets which involved a massive comet having an oblique impact upon the Sun, the ejected material condensing so as to form a regular system of planets. Amongst his mathematical contributions is what is known as Buffon's Needle, whereby experimental evaluations of π may be made by randomly dropping a needle onto a set of parallel lines of separation greater than the needle length, and accumulating the fraction of times that the needle cuts one of the lines. Near-Earth asteroid (NEA) trails imaged onto a CCD chip provide a two-dimensional analogue of this, and where the pixel size is very large (this having some advantages for NEA searching) an analysis based on Buffon's Needle provides probabilities of the NEA trail lying within one, two or three pixels, such probabilities affecting the chances of detection. It is therefore appropriate that Buffon and his contributions to studies of comets and asteroids be remembered in these conference proceedings.
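Buffon's Needle, as described above, estimates π from the fraction of drops in which the needle crosses a line; for needle length l and line spacing d ≥ l the crossing probability is 2l/(πd). A minimal Monte Carlo version (needle length, spacing, and sample count chosen arbitrarily) looks like this:

```python
import math
import random

def buffon_pi(n_drops, needle_len=1.0, line_gap=2.0, seed=42):
    """Estimate pi by Buffon's Needle: the crossing probability for a
    needle of length l on lines spaced d >= l apart is 2l/(pi*d)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_drops):
        # Distance from the needle centre to the nearest line, and
        # the acute angle between needle and lines.
        y = rng.uniform(0.0, line_gap / 2.0)
        theta = rng.uniform(0.0, math.pi / 2.0)
        if y <= (needle_len / 2.0) * math.sin(theta):
            hits += 1
    # Invert P = 2l/(pi*d): pi ~= 2*l*n / (d*hits)
    return 2.0 * needle_len * n_drops / (line_gap * hits)

estimate = buffon_pi(1_000_000)
assert abs(estimate - math.pi) < 0.05
```

The statistical error shrinks only as 1/sqrt(n), which mirrors the pixel-counting analysis for NEA trails: with coarse sampling, one can only assign probabilities, not certainties, to a crossing.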

  12. An Overview of Spray Modeling With OpenNCC and its Application to Emissions Predictions of a LDI Combustor at High Pressure

    NASA Technical Reports Server (NTRS)

    Raju, M. S.

    2016-01-01

    The Open National Combustion Code (OpenNCC) is developed with the aim of advancing the current multi-dimensional computational tools used in the design of advanced technology combustors. In this paper we provide an overview of the spray module, LSPRAY-V, developed as a part of this effort. The spray solver is mainly designed to predict the flow, thermal, and transport properties of a rapidly evaporating multi-component liquid spray. The modeling approach is applicable over a wide range of evaporating conditions (normal, superheat, and supercritical). The modeling approach is based on several well-established atomization, vaporization, and wall/droplet impingement models. It facilitates large-scale combustor computations through the use of massively parallel computers, with the ability to perform the computations on either structured or unstructured grids. The spray module has a multi-liquid and multi-injector capability, and can be used in both steady and unsteady computations. We conclude the paper by providing the results for a reacting spray generated by a single injector element with 60° axially swept swirler vanes. It is a configuration based on the next-generation lean-direct injection (LDI) combustor concept. The results include comparisons for both combustor exit temperature and EINOx at three different fuel/air ratios.

  13. A hybrid interface tracking - level set technique for multiphase flow with soluble surfactant

    NASA Astrophysics Data System (ADS)

    Shin, Seungwon; Chergui, Jalel; Juric, Damir; Kahouadji, Lyes; Matar, Omar K.; Craster, Richard V.

    2018-04-01

    A formulation for soluble surfactant transport in multiphase flows recently presented by Muradoglu and Tryggvason (JCP 274 (2014) 737-757) [17] is adapted to the context of the Level Contour Reconstruction Method, LCRM, (Shin et al. IJNMF 60 (2009) 753-778, [8]) which is a hybrid method that combines the advantages of the Front-tracking and Level Set methods. Particularly close attention is paid to the formulation and numerical implementation of the surface gradients of surfactant concentration and surface tension. Various benchmark tests are performed to demonstrate the accuracy of different elements of the algorithm. To verify surfactant mass conservation, values for surfactant diffusion along the interface are compared with the exact solution for the problem of uniform expansion of a sphere. The numerical implementation of the discontinuous boundary condition for the source term in the bulk concentration is compared with the approximate solution. Surface tension forces are tested for Marangoni drop translation. Our numerical results for drop deformation in simple shear are compared with experiments and results from previous simulations. All benchmarking tests compare well with existing data thus providing confidence that the adapted LCRM formulation for surfactant advection and diffusion is accurate and effective in three-dimensional multiphase flows with a structured mesh. We also demonstrate that this approach applies easily to massively parallel simulations.
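The mass-conservation benchmark mentioned above compares against the exact solution for uniform expansion of a sphere. In the simple limit of an insoluble, initially uniform surfactant with no bulk exchange, that exact solution is pure area dilution, Γ(t) = Γ₀ (R₀/R(t))²; the numbers below are made up for the check.

```python
import math

def gamma_exact(gamma0, r0, r):
    """Surface concentration on a uniformly expanding sphere when
    total surfactant mass is conserved: area grows as r^2, so the
    concentration dilutes as (r0/r)^2."""
    return gamma0 * (r0 / r) ** 2

gamma0, r0 = 2.5, 1.0
initial_mass = gamma0 * 4.0 * math.pi * r0 * r0
for r in (1.0, 1.5, 2.0, 4.0):
    area = 4.0 * math.pi * r * r
    mass = gamma_exact(gamma0, r0, r) * area
    # Concentration times area must equal the initial mass at every radius.
    assert math.isclose(mass, initial_mass)
```

A numerical surfactant solver passing this test conserves mass under interface stretching, which is the property the benchmark in the paper is designed to verify (the paper's full test also includes diffusion along the interface).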

  14. The Use of Modelling for Improving Pupils' Learning about Cells.

    ERIC Educational Resources Information Center

    Tregidgo, David; Ratcliffe, Mary

    2000-01-01

    Outlines the use of modeling in science teaching. Describes a study in which two parallel groups of year seven pupils modeled concepts of cell structure and function as they produced two- or three-dimensional representations of plant and animal cells. (Author/CCM)

  15. Finite fringe hologram

    NASA Technical Reports Server (NTRS)

    Heflinger, L. O.

    1970-01-01

    In holographic interferometry, a small movement of the apparatus between exposures causes the background of the reconstructed scene to be covered with interference fringes that are approximately parallel to each other. The three-dimensional quality of the holographic image is retained, since a mathematical model gives the location of the fringes.

  16. A Multibody Formulation for Three Dimensional Brick Finite Element Based Parallel and Scalable Rotor Dynamic Analysis

    DTIC Science & Technology

    2010-05-01

    connections near the hub end, and containing up to 0.48 million degrees of freedom. The models are analyzed for scalability and timing for hover and ... will enable the modeling of critical couplings that occur in hingeless and bearingless hubs with advanced flex structures. Second, it will enable the

  17. An Investigation of the Flow Physics of Acoustic Liners by Direct Numerical Simulation

    NASA Technical Reports Server (NTRS)

    Watson, Willie R. (Technical Monitor); Tam, Christopher

    2004-01-01

    This report concentrates on reporting the effort and status of work done on three dimensional (3-D) simulation of a multi-hole resonator in an impedance tube. This work is coordinated with a parallel experimental effort to be carried out at the NASA Langley Research Center. The outline of this report is as follows: 1. Preliminary consideration. 2. Computation model. 3. Mesh design and parallel computing. 4. Visualization. 5. Status of computer code development.

  18. Random-subset fitting of digital holograms for fast three-dimensional particle tracking [invited].

    PubMed

    Dimiduk, Thomas G; Perry, Rebecca W; Fung, Jerome; Manoharan, Vinothan N

    2014-09-20

    Fitting scattering solutions to time series of digital holograms is a precise way to measure three-dimensional dynamics of microscale objects such as colloidal particles. However, this inverse-problem approach is computationally expensive. We show that the computational time can be reduced by an order of magnitude or more by fitting to a random subset of the pixels in a hologram. We demonstrate our algorithm on experimentally measured holograms of micrometer-scale colloidal particles, and we show that 20-fold increases in speed, relative to fitting full frames, can be attained while introducing errors in the particle positions of 10 nm or less. The method is straightforward to implement and works for any scattering model. It also enables a parallelization strategy wherein random-subset fitting is used to quickly determine initial guesses that are subsequently used to fit full frames in parallel. This approach may prove particularly useful for studying rare events, such as nucleation, that can only be captured with high frame rates over long times.
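
    The core idea, fitting to a random subset of hologram pixels rather than to every pixel, can be sketched with a toy fringe model. The paper fits rigorous light-scattering solutions with gradient-based optimizers; the concentric-fringe model, image size, and coarse grid search below are illustrative assumptions only:

```python
import numpy as np

def model(params, xs, ys):
    """Toy 'hologram': concentric fringes around a center (x0, y0)."""
    x0, y0 = params
    r = np.hypot(xs - x0, ys - y0)
    return 1.0 + np.cos(2 * np.pi * r / 8.0)

rng = np.random.default_rng(0)
n = 64
xx, yy = np.meshgrid(np.arange(n), np.arange(n))
true_center = (30.0, 34.0)
data = model(true_center, xx, yy) + 0.05 * rng.standard_normal((n, n))

# Fit using only a random 5% subset of the pixels.
idx = rng.choice(n * n, size=(n * n) // 20, replace=False)
sub = (xx.ravel()[idx], yy.ravel()[idx], data.ravel()[idx])

def sse(params, xs, ys, vals):
    """Sum of squared residuals over the chosen pixel subset."""
    return np.sum((vals - model(params, xs, ys)) ** 2)

# A coarse grid search stands in for the gradient-based fits in the paper.
candidates = [(x0, y0) for x0 in np.arange(28.0, 33.0, 0.5)
                       for y0 in np.arange(32.0, 37.0, 0.5)]
best = min(candidates, key=lambda p: sse(p, *sub))
print(best[0], best[1])  # close to the true center (30.0, 34.0)
```

    Because the fringe pattern constrains the parameters at every pixel, even a small random subset retains enough information to localize the particle; this redundancy is what makes the reported order-of-magnitude speedups possible.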

  19. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by employing bandwidth shells at areas of overutilization

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-04-27

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a final destination. The default routing strategy is altered responsive to detection of overutilization of a particular path of one or more links, and at least some traffic is re-routed by distributing the traffic among multiple paths (which may include the default path). An alternative path may require a greater number of link traversals to reach the destination node.

  20. A Massively Parallel Bayesian Approach to Planetary Protection Trajectory Analysis and Design

    NASA Technical Reports Server (NTRS)

    Wallace, Mark S.

    2015-01-01

    The NASA Planetary Protection Office has levied a requirement that the upper stage of future planetary launches have a less than 10^-4 chance of impacting Mars within 50 years after launch. A brute-force approach requires a decade of computer time to demonstrate compliance. By using a Bayesian approach and taking advantage of the demonstrated reliability of the upper stage, the required number of fifty-year propagations can be massively reduced. By spreading the remaining embarrassingly parallel Monte Carlo simulations across multiple computers, compliance can be demonstrated in a reasonable time frame. The method used is described here.
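
    A minimal sketch of why a probabilistic argument shrinks the workload, assuming a uniform prior and treating each 50-year propagation as an independent pass/fail trial (these assumptions and numbers are illustrative, not the mission analysis actually used by the author):

```python
import math

# Requirement (illustrative): impact probability p < 1e-4.
eps = 1e-4
conf = 0.95  # desired posterior confidence that p < eps

# With a uniform Beta(1, 1) prior and n successful (non-impacting)
# propagations, the posterior is Beta(1, n + 1), so
#   P(p > eps | n successes) = (1 - eps)**(n + 1).
# Find the smallest n for which that tail probability drops below 1 - conf.
n = math.ceil(math.log(1.0 - conf) / math.log(1.0 - eps)) - 1
print(n)  # about 3e4 propagations, each an independent, parallel trial
```

    Roughly 3 x 10^4 successful propagations already bound the impact probability below 10^-4 at 95% posterior confidence, and because the trials are independent they can be spread across as many machines as are available.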

  1. Systems and methods for rapid processing and storage of data

    DOEpatents

    Stalzer, Mark A.

    2017-01-24

    Systems and methods of building massively parallel computing systems using low power computing complexes in accordance with embodiments of the invention are disclosed. A massively parallel computing system in accordance with one embodiment of the invention includes at least one Solid State Blade configured to communicate via a high performance network fabric. In addition, each Solid State Blade includes a processor configured to communicate with a plurality of low power computing complexes interconnected by a router, and each low power computing complex includes at least one general processing core, an accelerator, an I/O interface, and cache memory and is configured to communicate with non-volatile solid state memory.

  2. Massively Parallel Sequencing Detected a Mutation in the MFN2 Gene Missed by Sanger Sequencing Due to a Primer Mismatch on an SNP Site.

    PubMed

    Neupauerová, Jana; Grečmalová, Dagmar; Seeman, Pavel; Laššuthová, Petra

    2016-05-01

    We describe a patient with early onset severe axonal Charcot-Marie-Tooth disease (CMT2) with dominant inheritance, in whom Sanger sequencing failed to detect a mutation in the mitofusin 2 (MFN2) gene because of a single nucleotide polymorphism (rs2236057) under the PCR primer sequence. The severe early onset phenotype and the family history with severely affected mother (died after delivery) was very suggestive of CMT2A and this suspicion was finally confirmed by a MFN2 mutation. The mutation p.His361Tyr was later detected in the patient by massively parallel sequencing with a gene panel for hereditary neuropathies. According to this information, new primers for amplification and sequencing were designed which bind away from the polymorphic sites of the patient's DNA. Sanger sequencing with these new primers then confirmed the heterozygous mutation in the MFN2 gene in this patient. This case report shows that massively parallel sequencing may in some rare cases be more sensitive than Sanger sequencing and highlights the importance of accurate primer design which requires special attention. © 2016 John Wiley & Sons Ltd/University College London.

  3. Massively parallel whole genome amplification for single-cell sequencing using droplet microfluidics.

    PubMed

    Hosokawa, Masahito; Nishikawa, Yohei; Kogawa, Masato; Takeyama, Haruko

    2017-07-12

    Massively parallel single-cell genome sequencing is required to further understand genetic diversities in complex biological systems. Whole genome amplification (WGA) is the first step for single-cell sequencing, but its throughput and accuracy are insufficient in conventional reaction platforms. Here, we introduce single droplet multiple displacement amplification (sd-MDA), a method that enables massively parallel amplification of single cell genomes while maintaining sequence accuracy and specificity. Tens of thousands of single cells are compartmentalized in millions of picoliter droplets and then subjected to lysis and WGA by passive droplet fusion in microfluidic channels. Because single cells are isolated in compartments, their genomes are amplified to saturation without contamination. This enables the high-throughput acquisition of contamination-free and cell specific sequence reads from single cells (21,000 single-cells/h), resulting in enhancement of the sequence data quality compared to conventional methods. This method allowed WGA of both single bacterial cells and human cancer cells. The obtained sequencing coverage rivals those of conventional techniques with superior sequence quality. In addition, we also demonstrate de novo assembly of uncultured soil bacteria and obtain draft genomes from single cell sequencing. This sd-MDA is promising for flexible and scalable use in single-cell sequencing.

  4. Efficient Iterative Methods Applied to the Solution of Transonic Flows

    NASA Astrophysics Data System (ADS)

    Wissink, Andrew M.; Lyrintzis, Anastasios S.; Chronopoulos, Anthony T.

    1996-02-01

    We investigate the use of an inexact Newton's method to solve the potential equations in the transonic regime. As a test case, we solve the two-dimensional steady transonic small disturbance equation. Approximate factorization/ADI techniques have traditionally been employed for implicit solutions of this nonlinear equation. Instead, we apply Newton's method using an exact analytical determination of the Jacobian with preconditioned conjugate gradient-like iterative solvers for solution of the linear systems in each Newton iteration. Two iterative solvers are tested: a block s-step version of the classical Orthomin(k) algorithm called orthogonal s-step Orthomin (OSOmin) and the well-known GMRES method. The preconditioner is a vectorizable and parallelizable version of incomplete LU (ILU) factorization. Efficiency of the Newton-iterative method on vector and parallel computer architectures is the main issue addressed. In vectorized tests on a single processor of the Cray C-90, the performance of Newton-OSOmin is superior to Newton-GMRES and a more traditional monotone AF/ADI method (MAF) for a variety of transonic Mach numbers and mesh sizes. Newton-GMRES is superior to MAF for some cases. The parallel performance of the Newton method is also found to be very good on multiple processors of the Cray C-90 and on the massively parallel Thinking Machines CM-5, where very fast execution rates (up to 9 Gflops) are found for large problems.
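
    The structure of such a Newton-iterative solver can be illustrated on a tiny dense system. This is a sketch only: the toy 2x2 system and the direct np.linalg.solve stand in for the discretized transonic small disturbance equation and the preconditioned OSOmin/GMRES inner iterations used in the paper:

```python
import numpy as np

def F(u):
    """Toy nonlinear residual standing in for the discretized TSD equation."""
    x, y = u
    return np.array([x**2 + y**2 - 4.0, x - y])

def J(u):
    """Exact analytic Jacobian, as advocated in the paper."""
    x, y = u
    return np.array([[2.0 * x, 2.0 * y],
                     [1.0, -1.0]])

u = np.array([1.0, 0.5])
for _ in range(20):
    # In the paper this linear solve is a preconditioned Krylov iteration
    # (OSOmin or GMRES); a direct solve suffices for a 2x2 illustration.
    du = np.linalg.solve(J(u), -F(u))
    u = u + du
    if np.linalg.norm(F(u)) < 1e-12:
        break
print(u)  # converges to [sqrt(2), sqrt(2)]
```

    Replacing the exact inner solve with a loosely converged Krylov iteration is what makes the overall method "inexact" Newton, trading inner-solve accuracy for cheaper, highly parallel iterations.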

  5. The art of seeing and painting.

    PubMed

    Grossberg, Stephen

    2008-01-01

    The human urge to represent the three-dimensional world using two-dimensional pictorial representations dates back at least to Paleolithic times. Artists from ancient to modern times have struggled to understand how a few contours or color patches on a flat surface can induce mental representations of a three-dimensional scene. This article summarizes some of the recent breakthroughs in scientifically understanding how the brain sees that shed light on these struggles. These breakthroughs illustrate how various artists have intuitively understood paradoxical properties about how the brain sees, and have used that understanding to create great art. These paradoxical properties arise from how the brain forms the units of conscious visual perception; namely, representations of three-dimensional boundaries and surfaces. Boundaries and surfaces are computed in parallel cortical processing streams that obey computationally complementary properties. These streams interact at multiple levels to overcome their complementary weaknesses and to transform their complementary properties into consistent percepts. The article describes how properties of complementary consistency have guided the creation of many great works of art.

  6. A plasma source driven predator-prey like mechanism as a potential cause of spiraling intermittencies in linear plasma devices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reiser, D.; Ohno, N.; Tanaka, H.

    2014-03-15

    Three-dimensional global drift fluid simulations are carried out to analyze coherent plasma structures appearing in the NAGDIS-II linear device (Nagoya Divertor Plasma Simulator-II). The numerical simulations reproduce several features of the intermittent spiraling structures observed, for instance their statistical properties, rotation frequency, and the frequency of plasma expulsion. The detailed inspection of the three-dimensional plasma dynamics makes it possible to identify the key mechanism behind the formation of these intermittent events. The resistive coupling between electron pressure and parallel electric field in the plasma source region gives rise to a quasilinear predator-prey like dynamics where the axisymmetric mode represents the prey and the spiraling structure with low azimuthal mode number represents the predator. This interpretation is confirmed by a reduced one-dimensional quasilinear model derived on the basis of the findings in the full three-dimensional simulations. The dominant dynamics reveals certain similarities to the classical Lotka-Volterra cycle.

  7. PCTDSE: A parallel Cartesian-grid-based TDSE solver for modeling laser-atom interactions

    NASA Astrophysics Data System (ADS)

    Fu, Yongsheng; Zeng, Jiaolong; Yuan, Jianmin

    2017-01-01

    We present a parallel Cartesian-grid-based time-dependent Schrödinger equation (TDSE) solver for modeling laser-atom interactions. It can simulate the single-electron dynamics of atoms in arbitrary time-dependent vector potentials. We use a split-operator method combined with fast Fourier transforms (FFT), on a three-dimensional (3D) Cartesian grid. Parallelization is realized using a 2D decomposition strategy based on the Message Passing Interface (MPI) library, which results in a good parallel scaling on modern supercomputers. We give simple applications for the hydrogen atom using the benchmark problems coming from the references and obtain repeatable results. The extensions to other laser-atom systems are straightforward with minimal modifications of the source code.
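
    The split-operator/FFT propagation at the heart of such solvers can be sketched in one dimension. The assumed details below (harmonic potential, grid size, atomic units) are illustrative only; PCTDSE itself works on a 3D Cartesian grid with 2D MPI decomposition:

```python
import numpy as np

# Minimal 1D split-operator TDSE sketch in atomic units.
n, L = 256, 20.0
x = (np.arange(n) - n // 2) * (L / n)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)
V = 0.5 * x**2                        # harmonic potential (illustrative)
psi = np.exp(-x**2 / 2.0)             # its analytic ground state
psi = psi / np.sqrt(np.sum(np.abs(psi)**2) * (L / n))
psi0 = psi.copy()

dt, steps = 0.005, 400
half_kin = np.exp(-0.5j * dt * (k**2) / 2.0)   # exp(-i T dt/2), k-space
pot = np.exp(-1j * dt * V)                     # exp(-i V dt), real space
for _ in range(steps):
    psi = np.fft.ifft(half_kin * np.fft.fft(psi))  # half kinetic step
    psi = pot * psi                                # full potential step
    psi = np.fft.ifft(half_kin * np.fft.fft(psi))  # half kinetic step

norm = np.sum(np.abs(psi)**2) * (L / n)
drift = np.max(np.abs(np.abs(psi)**2 - np.abs(psi0)**2))
print(norm, drift)  # unitary evolution: norm stays 1, density barely moves
```

    Each substep is a pointwise multiplication by a unit-modulus phase, either in real space or in Fourier space, so the evolution is exactly unitary and the FFTs are the only global operations, which is precisely what the 2D MPI decomposition has to parallelize.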

  8. 3D unstructured-mesh radiation transport codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morel, J.

    1997-12-31

    Three unstructured-mesh radiation transport codes are currently being developed at Los Alamos National Laboratory. The first code is ATTILA, which uses an unstructured tetrahedral mesh in conjunction with standard S_n (discrete-ordinates) angular discretization, standard multigroup energy discretization, and linear-discontinuous spatial differencing. ATTILA solves the standard first-order form of the transport equation using source iteration in conjunction with diffusion-synthetic acceleration of the within-group source iterations, and is designed to run primarily on workstations. The second code is DANTE, which uses a hybrid finite-element mesh consisting of arbitrary combinations of hexahedra, wedges, pyramids, and tetrahedra. DANTE solves several second-order self-adjoint forms of the transport equation including the even-parity equation, the odd-parity equation, and a new equation called the self-adjoint angular flux equation. DANTE also offers three angular discretization options: S_n (discrete-ordinates), P_n (spherical harmonics), and SP_n (simplified spherical harmonics). DANTE is designed to run primarily on massively parallel message-passing machines, such as the ASCI-Blue machines at LANL and LLNL. The third code is PERICLES, which uses the same hybrid finite-element mesh as DANTE, but solves the standard first-order form of the transport equation rather than a second-order self-adjoint form. PERICLES uses a standard S_n discretization in angle in conjunction with trilinear-discontinuous spatial differencing, and diffusion-synthetic acceleration of the within-group source iterations. PERICLES was initially designed to run on workstations, but a version for massively parallel message-passing machines will be built. The three codes will be described in detail and computational results will be presented.

  9. Quantitative assessment of fatty infiltration and muscle volume of the rotator cuff muscles using 3-dimensional 2-point Dixon magnetic resonance imaging.

    PubMed

    Matsumura, Noboru; Oguro, Sota; Okuda, Shigeo; Jinzaki, Masahiro; Matsumoto, Morio; Nakamura, Masaya; Nagura, Takeo

    2017-10-01

    In patients with rotator cuff tears, muscle degeneration is known to be a predictor of irreparable tears and poor outcomes after surgical repair. Fatty infiltration and volume of the whole muscles constituting the rotator cuff were quantitatively assessed using 3-dimensional 2-point Dixon magnetic resonance imaging. Ten shoulders with a partial-thickness tear, 10 shoulders with an isolated supraspinatus tear, and 10 shoulders with a massive tear involving supraspinatus and infraspinatus were compared with 10 control shoulders after matching age and sex. With segmentation of muscle boundaries, the fat fraction value and the volume of the whole rotator cuff muscles were computed. After reliabilities were determined, differences in fat fraction, muscle volume, and fat-free muscle volume were evaluated. Intra-rater and inter-rater reliabilities were regarded as excellent for fat fraction and muscle volume. Tendon rupture adversely increased the fat fraction value of the respective rotator cuff muscle (P < .002). In the massive tear group, muscle volume was significantly decreased in the infraspinatus (P = .035) and increased in the teres minor (P = .039). With subtraction of fat volume, a significant decrease of fat-free volume of the supraspinatus muscle became apparent with a massive tear (P = .003). Three-dimensional measurement could evaluate fatty infiltration and muscular volume with excellent reliabilities. The present study showed that chronic rupture of the tendon adversely increases the fat fraction of the respective muscle and indicates that the residual capacity of the rotator cuff muscles might be overestimated in patients with severe fatty infiltration. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  10. 7-Methoxyindan-1-one

    PubMed Central

    Chang, Yuan Jay; Chen, Kew-Yu

    2012-01-01

    In the title compound, C10H10O2, the 1-indanone unit is essentially planar (r.m.s. deviation = 0.028 Å). In the crystal, molecules are linked via C—H⋯O hydrogen bonds, forming layers lying parallel to the ab plane. This two-dimensional structure is stabilized by a weak C—H⋯π interaction. A second weak C—H⋯π interaction links the layers, forming a three-dimensional structure. PMID:23284398

  11. Simulation of confined magnetohydrodynamic flows with Dirichlet boundary conditions using a pseudo-spectral method with volume penalization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morales, Jorge A.; Leroy, Matthieu; Bos, Wouter J.T.

    A volume penalization approach to simulate magnetohydrodynamic (MHD) flows in confined domains is presented. Here the incompressible visco-resistive MHD equations are solved using parallel pseudo-spectral solvers in Cartesian geometries. The volume penalization technique is an immersed boundary method which is characterized by a high flexibility for the geometry of the considered flow. In the present case, it allows the use of boundary conditions other than periodic ones in a Fourier pseudo-spectral approach. The numerical method is validated and its convergence is assessed for two- and three-dimensional hydrodynamic (HD) and MHD flows, by comparing the numerical results with results from the literature and analytical solutions. The test cases considered are two-dimensional Taylor–Couette flow, the z-pinch configuration, three-dimensional Orszag–Tang flow, Ohmic decay in a periodic cylinder, three-dimensional Taylor–Couette flow with and without axial magnetic field, and three-dimensional Hartmann instabilities in a cylinder with an imposed helical magnetic field. Finally, we present a magnetohydrodynamic flow simulation in toroidal geometry with non-symmetric cross section and an imposed helical magnetic field to illustrate the potential of the method.
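
    The essence of volume penalization, adding a damping term that forces the solution toward the wall value inside a mask while keeping a purely periodic Fourier discretization, can be sketched on the 1D heat equation. The parameters (penalization constant eta, mask, grid) are illustrative assumptions, not those of the paper:

```python
import numpy as np

# 1D heat equation with volume penalization:
#   u_t = nu * u_xx - (chi / eta) * u,
# where chi = 1 inside the solid and 0 in the fluid. The penalization
# drives u -> 0 in the solid, imposing an effective u = 0 wall condition.
n, Lx, nu, eta = 256, 2.0 * np.pi, 0.05, 1e-4
x = np.arange(n) * Lx / n
k = 2.0 * np.pi * np.fft.fftfreq(n, d=Lx / n)
chi = ((x < 0.5) | (x > Lx - 0.5)).astype(float)  # solid obstacle mask

u = np.sin(x) + 0.5 * np.cos(2.0 * x)
dt, steps = 1e-3, 2000
decay = np.exp(-nu * k**2 * dt)        # exact diffusion step in Fourier space
for _ in range(steps):
    u = np.fft.ifft(decay * np.fft.fft(u)).real
    u = u / (1.0 + dt * chi / eta)     # implicit penalization, pointwise

print(np.max(np.abs(u[chi == 1.0])))  # small inside the solid
```

    The penalization term is local, so it can be treated implicitly pointwise (allowing a small eta), while diffusion keeps its exact Fourier-space treatment; the same splitting idea carries over to the visco-resistive MHD equations of the paper.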

  12. Explicit finite-difference simulation of optical integrated devices on massive parallel computers.

    PubMed

    Sterkenburgh, T; Michels, R M; Dress, P; Franke, H

    1997-02-20

    An explicit method for the numerical simulation of optical integrated circuits by means of the finite-difference time-domain (FDTD) method is presented. This method, based on an explicit solution of Maxwell's equations, is well established in microwave technology. Although the simulation areas are small, we verified the behavior of three interesting problems, especially nonparaxial problems, with typical aspects of integrated optical devices. Because numerical losses are within acceptable limits, we suggest the use of the FDTD method to achieve promising quantitative simulation results.
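
    A minimal 1D Yee-grid FDTD update illustrates the explicit leapfrog structure of the method. This is a generic textbook sketch in normalized units (Courant number 1), not the authors' code, which tackles 2D integrated-optic device geometries:

```python
import numpy as np

# 1D vacuum FDTD on a Yee grid, normalized so the update coefficients are 1.
nz, steps = 400, 150
ez = np.zeros(nz)   # electric field samples
hy = np.zeros(nz)   # magnetic field samples, staggered half a cell
src = 40            # soft-source location

for t in range(steps):
    hy[:-1] += ez[1:] - ez[:-1]                 # update H from curl of E
    ez[1:] += hy[1:] - hy[:-1]                  # update E from curl of H
    ez[src] += np.exp(-((t - 30) / 8.0) ** 2)   # soft Gaussian source

print(int(np.argmax(np.abs(ez))))  # the pulse has propagated away from z=40
```

    Every cell update depends only on nearest neighbors from the previous half step, so each time step is a purely local sweep, which is what makes explicit FDTD so well suited to massively parallel machines.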

  13. Oxygen transport properties estimation by DSMC-CT simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bruno, Domenico; Frezzotti, Aldo; Ghiroldi, Gian Pietro

    Coupling DSMC simulations with classical trajectory calculations is emerging as a powerful tool to improve the predictive capabilities of computational rarefied gas dynamics. The considerable increase of computational effort outlined in the early applications of the method (Koura, 1997) can be compensated by running simulations on massively parallel computers. In particular, GPU acceleration has been found quite effective in reducing the computing time (Ferrigni, 2012; Norman et al., 2013) of DSMC-CT simulations. The aim of the present work is to study rarefied Oxygen flows by modeling binary collisions through an accurate potential energy surface, obtained by molecular beam scattering (Aquilanti et al., 1999). The accuracy of the method is assessed by calculating molecular Oxygen shear viscosity and heat conductivity following three different DSMC-CT simulation methods. In the first one, transport properties are obtained from DSMC-CT simulations of spontaneous fluctuations of an equilibrium state (Bruno et al., Phys. Fluids, 23, 093104, 2011). In the second method, the collision trajectory calculation is incorporated in a Monte Carlo integration procedure to evaluate Taxman's expressions for the transport properties of polyatomic gases (Taxman, 1959). In the third, non-equilibrium zero- and one-dimensional rarefied gas dynamic simulations are adopted and the transport properties are computed from the non-equilibrium fluxes of momentum and energy. The three methods provide close values of the transport properties, their estimated statistical error not exceeding 3%. The experimental values are slightly underestimated, the percentage deviation being, again, a few percent.

  14. Targeted next-generation sequencing in steroid-resistant nephrotic syndrome: mutations in multiple glomerular genes may influence disease severity.

    PubMed

    Bullich, Gemma; Trujillano, Daniel; Santín, Sheila; Ossowski, Stephan; Mendizábal, Santiago; Fraga, Gloria; Madrid, Álvaro; Ariceta, Gema; Ballarín, José; Torra, Roser; Estivill, Xavier; Ars, Elisabet

    2015-09-01

    Genetic diagnosis of steroid-resistant nephrotic syndrome (SRNS) using Sanger sequencing is complicated by the high genetic heterogeneity and phenotypic variability of this disease. We aimed to improve the genetic diagnosis of SRNS by simultaneously sequencing 26 glomerular genes using massive parallel sequencing and to study whether mutations in multiple genes increase disease severity. High-throughput mutation analysis was performed in 50 SRNS and/or focal segmental glomerulosclerosis (FSGS) patients, a validation cohort of 25 patients with known pathogenic mutations, and a discovery cohort of 25 uncharacterized patients with probable genetic etiology. In the validation cohort, we identified the 42 previously known pathogenic mutations across NPHS1, NPHS2, WT1, TRPC6, and INF2 genes. In the discovery cohort, disease-causing mutations in SRNS/FSGS genes were found in nine patients. We detected three patients with mutations in an SRNS/FSGS gene and COL4A3. Two of them were familial cases and presented a more severe phenotype than family members with mutation in only one gene. In conclusion, our results show that massive parallel sequencing is feasible and robust for genetic diagnosis of SRNS/FSGS. Our results indicate that patients carrying mutations in an SRNS/FSGS gene and also in COL4A3 gene have increased disease severity.

  15. Exact solutions in 3D gravity with torsion

    NASA Astrophysics Data System (ADS)

    González, P. A.; Vásquez, Yerko

    2011-08-01

    We study the three-dimensional gravity with torsion given by the Mielke-Baekler (MB) model coupled to a gravitational Chern-Simons term, with electric charge described by Maxwell-Chern-Simons electrodynamics. We find and discuss the charged and uncharged black hole solutions of this theory. We find that for vanishing torsion our solutions can, by means of a coordinate transformation, be written as three-dimensional Chern-Simons black holes. We also discuss a special case of this theory, Topologically Massive Gravity (TMG) at the chiral point, and we show that the logarithmic solution of TMG is also a solution of the MB model at a fixed point in the space of parameters. Furthermore, we show that our solutions generalize Gödel-type solutions in a particular case. Also, we recover the BTZ black hole in Riemann-Cartan spacetime for vanishing charge.

  16. Rational Protein Engineering Guided by Deep Mutational Scanning

    PubMed Central

    Shin, HyeonSeok; Cho, Byung-Kwan

    2015-01-01

    Sequence–function relationship in a protein is commonly determined by the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes made by the substitution of each amino acid sequence that constitutes a protein. Such an informational resource provides the functional role of each amino acid sequence, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design. PMID:26404267

  17. Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

    NASA Astrophysics Data System (ADS)

    Liakos, Anastasios; Malamataris, Nikolaos

    2014-11-01

    The topology and evolution of flow around a surface-mounted cubical object in three-dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via a home-made parallel finite element code. The computational domain has been designed according to actual laboratory experimental conditions. Analysis of the results is performed using the three-dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horseshoe vortex upstream from the cube was formed at a Reynolds number of approximately 1266. Pressure distributions are shown along with three-dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance with previous work, our results indicate that the upper limit of the Reynolds number for which steady-state results are physically realizable is roughly 2000. Financial support of author NM from the Office of Naval Research Global (ONRG-VSP, N62909-13-1-V016) is acknowledged.

  18. Differentiation of Benign and Malignant Breast Tumors by In-Vivo Three-Dimensional Parallel-Plate Diffuse Optical Tomography

    PubMed Central

    Choe, Regine; Konecky, Soren D.; Corlu, Alper; Lee, Kijoon; Durduran, Turgut; Busch, David R.; Pathak, Saurav; Czerniecki, Brian J.; Tchou, Julia; Fraker, Douglas L.; DeMichele, Angela; Chance, Britton; Arridge, Simon R.; Schweiger, Martin; Culver, Joseph P.; Schnall, Mitchell D.; Putt, Mary E.; Rosen, Mark A.; Yodh, Arjun G.

    2009-01-01

    We have developed a novel parallel-plate diffuse optical tomography (DOT) system for three-dimensional in vivo imaging of human breast tumor based on large optical data sets. Images of oxy-, deoxy-, total-hemoglobin concentration, blood oxygen saturation, and tissue scattering were reconstructed. Tumor margins were derived using the optical data with guidance from radiology reports and Magnetic Resonance Imaging. Tumor-to-normal ratios of these endogenous physiological parameters and an optical index were computed for 51 biopsy-proven lesions from 47 subjects. Malignant cancers (N=41) showed statistically significant higher total hemoglobin, oxy-hemoglobin concentration, and scattering compared to normal tissue. Furthermore, malignant lesions exhibited a two-fold average increase in optical index. The influence of core biopsy on DOT results was also explored; the difference between the malignant group measured before core biopsy and the group measured more than one week after core biopsy was not significant. Benign tumors (N=10) did not exhibit statistical significance in the tumor-to-normal ratios of any parameter. Optical index and tumor-to-normal ratios of total hemoglobin, oxy-hemoglobin concentration, and scattering exhibited high area under the receiver operating characteristic curve values from 0.90 to 0.99, suggesting good discriminatory power. The data demonstrate that benign and malignant lesions can be distinguished by quantitative three-dimensional DOT. PMID:19405750

  19. Development of a two-dimensional zonally averaged statistical-dynamical model. III - The parameterization of the eddy fluxes of heat and moisture

    NASA Technical Reports Server (NTRS)

    Stone, Peter H.; Yao, Mao-Sung

    1990-01-01

    A number of perpetual January simulations are carried out with a two-dimensional zonally averaged model employing various parameterizations of the eddy fluxes of heat (potential temperature) and moisture. The parameterizations are evaluated by comparing these results with the eddy fluxes calculated in a parallel simulation using a three-dimensional general circulation model with zonally symmetric forcing. The three-dimensional model's performance in turn is evaluated by comparing its results using realistic (nonsymmetric) boundary conditions with observations. Branscome's parameterization of the meridional eddy flux of heat and Leovy's parameterization of the meridional eddy flux of moisture simulate the seasonal and latitudinal variations of these fluxes reasonably well, while somewhat underestimating their magnitudes. New parameterizations of the vertical eddy fluxes are developed that take into account the enhancement of the eddy mixing slope in a growing baroclinic wave due to condensation, and also the effect of eddy fluctuations in relative humidity. The new parameterizations, when tested in the two-dimensional model, simulate the seasonal, latitudinal, and vertical variations of the vertical eddy fluxes quite well, when compared with the three-dimensional model, and only underestimate the magnitude of the fluxes by 10 to 20 percent.

  20. A Comparison of Numerical and Analytical Radiative-Transfer Solutions for Plane Albedo of Natural Waters

    EPA Science Inventory

    Three numerical algorithms were compared to provide a solution of a radiative transfer equation (RTE) for plane albedo (hemispherical reflectance) in semi-infinite one-dimensional plane-parallel layer. Algorithms were based on the invariant imbedding method and two different var...

  1. A Survey of Parallel Computing

    DTIC Science & Technology

    1988-07-01

    Evaluating Two Massively Parallel Machines. Communications of the ACM 29, 8 (August), pp. 752-758. Gajski, D.D., Padua, D.A., Kuck...Computer Architecture, edited by Gajski, D. D., Milutinovic, V. M., Siegel, H. J. and Furht, B. P. IEEE Computer Society Press, Washington, D.C., pp. 387-407

  2. Crystal structures of isomeric 3,5-dichloro-N-(2,3-dimethylphenyl)benzenesulfonamide, 3,5-dichloro-N-(2,6-dimethylphenyl)benzenesulfonamide and 3,5-dichloro-N-(3,5-dimethylphenyl)benzenesulfonamide.

    PubMed

    Shakuntala, K; Naveen, S; Lokanath, N K; Suchetan, P A

    2017-05-01

    The crystal structures of three isomeric compounds of formula C14H13Cl2NO2S, namely 3,5-dichloro-N-(2,3-dimethylphenyl)benzenesulfonamide (I), 3,5-dichloro-N-(2,6-dimethylphenyl)benzenesulfonamide (II) and 3,5-dichloro-N-(3,5-dimethylphenyl)benzenesulfonamide (III), are described. The molecules of all three compounds are U-shaped, with the two aromatic rings inclined at 41.3 (6)° in (I), 42.1 (2)° in (II) and 54.4 (3)° in (III). The molecular conformation of (II) is stabilized by intramolecular C-H⋯O hydrogen bonds and C-H⋯π interactions. The crystal structure of (I) features N-H⋯O hydrogen-bonded R₂²(8) loops interconnected via C(7) chains of C-H⋯O interactions, forming a three-dimensional architecture. The structure also features π-π interactions [Cg⋯Cg = 3.6970 (14) Å]. In (II), N-H⋯O hydrogen-bonded R₂²(8) loops are interconnected via π-π interactions [intercentroid distance = 3.606 (3) Å] to form a one-dimensional architecture running parallel to the a axis. In (III), adjacent C(4) chains of N-H⋯O hydrogen-bonded molecules running parallel to [010] are connected via C-H⋯π interactions, forming sheets parallel to the ab plane. Neighbouring sheets are linked via offset π-π interactions [intercentroid distance = 3.8303 (16) Å] to form a three-dimensional architecture.

  3. Multi-Dimensional Analysis of Large, Complex Slope Instability: Case study of Downie Slide, British Columbia, Canada. (Invited)

    NASA Astrophysics Data System (ADS)

    Kalenchuk, K. S.; Hutchinson, D.; Diederichs, M. S.

    2013-12-01

    Downie Slide, one of the world's largest landslides, is a massive, active, composite, extremely slow rockslide located on the west bank of the Revelstoke Reservoir in British Columbia. It is a 1.5 billion m³ rockslide measuring 2400 m along the river valley, 3300 m from toe to headscarp, and up to 245 m thick. Significant contributions to the field of landslide geomechanics have been made by analyses of spatially and temporally discriminated slope deformations and of how these are controlled by complex geological and geotechnical factors. The Downie Slide research demonstrates the importance of delineating massive landslides into morphological regions in order to characterize global slope behaviour and identify localized events, which may or may not influence the overall slope deformation patterns. Massive slope instabilities do not behave as monolithic masses; rather, different landslide zones can display specific landslide processes occurring at variable rates of deformation. The global deformation of Downie Slide is extremely slow; however, localized regions of the slope incur moderate to high rates of movement. The complex deformation processes and composite failure mechanism arise from topography, non-uniform shear surfaces, and heterogeneous rock mass and shear-zone strength and stiffness characteristics. Further, analysis of temporal changes in landslide behaviour clearly shows that different regions of the slope respond differently to changing hydrogeological boundary conditions. State-of-the-art methodologies have been developed for numerical simulation of large landslides; these provide important tools for investigating dynamic landslide systems, accounting for complex three-dimensional geometries, heterogeneous shear-zone strength parameters, internal shear zones, the interaction of discrete landslide zones, and piezometric fluctuations.
Numerical models of Downie Slide have been calibrated to reproduce observed slope behaviour, and the calibration process has provided important insight into the key factors controlling massive slope mechanics. Numerical studies have shown that the three-dimensional interpretation of basal slip surface geometry and the spatial heterogeneity in shear-zone stiffness are important factors controlling large-scale slope deformation processes. The role of secondary internal shears and the interaction between landslide morphological zones have also been assessed. Further, numerical simulation of changing groundwater conditions has produced reasonable correlation with field observations. Calibrated models are valuable tools for the forward prediction of landslide dynamics. Calibrated Downie Slide models have been used to investigate how trigger scenarios may accelerate deformations at Downie Slide. The ability to reproduce observed behaviour and to forward-test hypothesized changes to boundary conditions has valuable application in hazard management of massive landslides. It enhances the capacity of decision makers to interpret large amounts of data, respond to rapid changes in a system, and understand complex slope dynamics.

  4. A low-cost and portable realization on fringe projection three-dimensional measurement

    NASA Astrophysics Data System (ADS)

    Xiao, Suzhi; Tao, Wei; Zhao, Hui

    2015-12-01

    Fringe projection three-dimensional measurement is widely used across industrial applications. Traditional fringe projection systems suffer from high cost, large size, and complicated calibration requirements. In this paper we introduce a low-cost, portable realization of three-dimensional measurement using a Pico projector. It has the advantages of low cost, compact physical size, and flexible configuration. In the proposed fringe projection system, installation imposes no restriction on the parallelism or perpendicularity of the relative alignment of the camera and projector. Moreover, a plane-based calibration method is adopted that avoids critical requirements on the calibration setup, such as an additional gauge block or a precise linear z stage. Error sources present in the proposed system are also discussed. The experimental results demonstrate the feasibility of the proposed low-cost, portable fringe projection system.

  5. Mineralization of collagen may occur on fibril surfaces: evidence from conventional and high-voltage electron microscopy and three-dimensional imaging

    NASA Technical Reports Server (NTRS)

    Landis, W. J.; Hodgens, K. J.; Song, M. J.; Arena, J.; Kiyonaga, S.; Marko, M.; Owen, C.; McEwen, B. F.

    1996-01-01

    The interaction between collagen and mineral crystals in the normally calcifying leg tendons from the domestic turkey, Meleagris gallopavo, has been investigated at an ultrastructural level with conventional and high-voltage electron microscopy, computed tomography, and three-dimensional image reconstruction methods. Specimens treated by either aqueous or anhydrous techniques and resin-embedded were appropriately sectioned and regions of early tendon mineralization were photographed. On the basis of individual photomicrographs, stereoscopic pairs of images, and tomographic three-dimensional image reconstructions, platelet-shaped crystals may be demonstrated for the first time in association with the surface of collagen fibrils. Mineral is also observed in closely parallel arrays within collagen hole and overlap zones. The mineral deposition at these spatially distinct locations in the tendon provides insight into possible means by which calcification is mediated by collagen as a fundamental event in skeletal and dental formation among vertebrates.

  6. Three-dimensional fine structure of the organization of microtubules in neurite varicosities by ultra-high voltage electron microscope tomography.

    PubMed

    Nishida, Tomoki; Yoshimura, Ryoichi; Endo, Yasuhisa

    2017-09-01

    Neurite varicosities are highly specialized compartments that are involved in neurotransmitter/neuromodulator release and provide a physiological platform for neural functions. However, it remains unclear how microtubule organization contributes to the form of varicosities. Here, we examine the three-dimensional structure of microtubules in varicosities of a differentiated PC12 neural cell line using ultra-high-voltage electron microscope tomography. Three-dimensional imaging showed that some varicosities contained an accumulation of organelles separated from parallel microtubule arrays. Further detailed analysis using serial sections and whole-mount tomography revealed microtubules running through a spindle-shaped swelling in some other types of varicosities. These electron tomographic results show that the structural diversity and heterogeneity of microtubule organization support the form of varicosities, suggesting that the differing distribution patterns of microtubules in varicosities are crucial to the regulation of varicosity development.

  7. Lax pair, conservation laws and solitons for a (2 + 1)-dimensional fourth-order nonlinear Schrödinger equation governing an α-helical protein

    NASA Astrophysics Data System (ADS)

    Chai, Jun; Tian, Bo; Zhen, Hui-Ling; Sun, Wen-Rong

    2015-11-01

    Energy transfer through a (2+1)-dimensional α-helical protein can be described by a (2+1)-dimensional fourth-order nonlinear Schrödinger equation. For such an equation, a Lax pair and the infinitely many conservation laws are derived. Using an auxiliary function and a bilinear formulation, we obtain the one-, two-, three- and N-soliton solutions via the Hirota method. The soliton velocity is linearly related to the lattice parameter γ, while the solitons' direction and amplitude do not depend on γ. Interactions between the two solitons are elastic, while those among the three solitons are pairwise elastic. Oblique, head-on and overtaking interactions between the two solitons are displayed. Oblique interaction among the three solitons and interactions between two parallel solitons and a single one are presented as well.

  8. Observation of the Chiral and Achiral Hexatic Phases of Self-assembled Micellar polymers

    PubMed Central

    Pal, Antara; Kamal, Md. Arif; Raghunathan, V. A.

    2016-01-01

    We report the discovery of a thermodynamically stable line hexatic (N + 6) phase in a three-dimensional (3D) system made up of self-assembled polymer-like micelles of amphiphilic molecules. The experimentally observed phase transition sequence nematic (N) → N + 6 → two-dimensional hexagonal (2D-H) is in good agreement with the theoretical predictions. Further, the present study also brings to light the effect of chirality on the N + 6 phase. In the chiral N + 6 phase the bond-orientational order within each “polymer” bundle is found to be twisted about an axis parallel to the average polymer direction. This structure is consistent with the theoretically envisaged Moiré state, thereby providing the first experimental demonstration of the Moiré structure. In addition to confirming the predictions of fundamental theories of two-dimensional melting, these results are relevant in a variety of situations in chemistry, physics and biology where parallel packing of polymer-like objects is encountered. PMID:27577927

  9. Parallel solution of sparse one-dimensional dynamic programming problems

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1989-01-01

    Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.
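
The abstract's central point, that serial recurrences must be reformulated (often less obviously) to expose parallelism, can be illustrated with a classic example that is not the authors' method: the linear recurrence x_i = a_i·x_{i-1} + b_i looks inherently serial, but each step is an affine map and affine-map composition is associative, so a parallel prefix scan can evaluate all x_i in logarithmic depth. A small sketch (the scan is written serially here; associativity is what would let a tree-structured parallel scan replace it):

```python
# Serial evaluation of the linear recurrence x_i = a_i * x_{i-1} + b_i.
def serial(a, b, x0):
    out, x = [], x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out

# Reformulation: step i is the affine map x -> a_i * x + b_i, and
# composition of affine maps is associative -- the property a parallel
# prefix scan needs.  Represent a map as the pair (a, b).
def compose(f, g):          # apply f first, then g
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)

def scan_affine(a, b, x0):
    out, acc = [], (1, 0)   # identity map
    for step in zip(a, b):  # a parallel machine would do this as a tree scan
        acc = compose(acc, step)
        out.append(acc[0] * x0 + acc[1])
    return out

print(serial([2, 3], [1, 1], 1))       # [3, 10]
print(scan_affine([2, 3], [1, 1], 1))  # [3, 10]
```

As the abstract notes, the reformulated version is more complex than its serial counterpart, which is exactly the trade-off being studied.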

  10. Hiding in Plain Sight: An Abundance of Compact Massive Spheroids in the Local Universe

    NASA Astrophysics Data System (ADS)

    Graham, Alister W.; Dullo, Bililign T.; Savorgnan, Giulia A. D.

    2015-05-01

    It has been widely remarked that compact, massive, elliptical-like galaxies are abundant at high redshifts but exceedingly rare in the universe today, implying significant evolution such that their sizes at z ~ 2 ± 0.6 have increased by factors of 3 to 6 to become today’s massive elliptical galaxies. These claims have been based on studies that measured the half-light radii of galaxies as though they are all single-component systems. Here we identify 21 spheroidal stellar systems within 90 Mpc that have half-light, major-axis radii R_e ≲ 2 kpc, stellar masses 0.7 × 10^11 < M*/M☉ < 1.4 × 10^11, and Sérsic indices typically around a value of n = 2-3. This abundance of compact, massive spheroids in our own backyard, with a number density of 6.9 × 10^-6 Mpc^-3 (or 3.5 × 10^-5 Mpc^-3 per dex in stellar mass) and with the same physical properties as the high-redshift galaxies, had been overlooked because they are encased in stellar disks that usually result in galaxy sizes notably larger than 2 kpc. Moreover, this number density is a lower limit because it has not come from a volume-limited sample. The actual density may be closer to 10^-4, although further work is required to confirm this. We therefore conclude that not all massive “spheroids” have undergone dramatic structural and size evolution since z ~ 2 ± 0.6. Given that the bulges of local early-type disk galaxies are known to consist of predominantly old stars that existed at z ~ 2, it seems likely that some of the observed high-redshift spheroids did not increase in size by building (three-dimensional) triaxial envelopes as commonly advocated, and that the growth of (two-dimensional) disks has also been important over the past 9-11 billion years.

  11. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.

  12. Parallel Preconditioning for CFD Problems on the CM-5

    NASA Technical Reports Server (NTRS)

    Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

    1994-01-01

    To date, preconditioning methods on massively parallel systems have faced a major difficulty. The preconditioning methods most successful at accelerating the convergence of iterative solvers, such as incomplete LU factorizations, are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e., triangular solves) is difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric linear systems which avoids both difficulties. We explicitly compute an approximate inverse of the original matrix. This new preconditioning matrix can be applied most efficiently in iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, possibly with a dense matrix. Furthermore, the actual computation of the preconditioning matrix has natural parallelism: for a problem of size n, the preconditioning matrix can be computed by solving n independent small least-squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
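
The key property claimed above is that each column of the approximate inverse comes from an independent small least-squares problem, so all n columns can be computed in parallel. A dense toy sketch of that idea (a generic sparse-approximate-inverse illustration, not the authors' code; the tridiagonal sparsity pattern below is assumed for the example):

```python
import numpy as np

def approx_inverse(A, pattern):
    """Right approximate inverse M ~ A^{-1}: column j of M minimizes
    ||A m_j - e_j||_2 over the entries allowed by pattern[j].
    The n least-squares problems are independent, hence the loop
    below is embarrassingly parallel on a real machine."""
    n = A.shape[0]
    M = np.zeros_like(A, dtype=float)
    for j in range(n):                  # independent problems
        cols = pattern[j]               # allowed nonzero rows of column j
        mj, *_ = np.linalg.lstsq(A[:, cols], np.eye(n)[:, j], rcond=None)
        M[cols, j] = mj
    return M

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
pattern = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}  # assumed tridiagonal pattern
M = approx_inverse(A, pattern)
print(np.linalg.norm(A @ M - np.eye(3)))  # residual of the approximation
```

Applying M in the iteration phase is then a single matrix-vector product, which is exactly the operation that parallelizes well, in contrast to the recursive triangular solves of an incomplete LU preconditioner.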

  13. A biconjugate gradient type algorithm on massively parallel architectures

    NASA Technical Reports Server (NTRS)

    Freund, Roland W.; Hochbruck, Marlis

    1991-01-01

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal proposed a novel BCG-type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation of QMR is presented based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products except one can be computed in parallel at the end of each block; this is unlike the standard Lanczos process, where inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.
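
The s-step idea, deferring inner products so that a whole block of them reduces at once, can be seen in miniature: s dot products against a common vector are one fused matrix-vector reduction, i.e. one global communication instead of s separate synchronizations. A sketch with illustrative shapes (not the look-ahead Lanczos recurrence itself):

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 1000, 6
V = rng.standard_normal((s, n))   # a block of s basis vectors, length n each
w = rng.standard_normal(n)

# Standard Lanczos style: s separate dot products, i.e. s global reductions,
# each a synchronization point on a parallel machine.
dots_serial = np.array([v @ w for v in V])

# s-step style: the same s inner products as one fused matrix-vector
# reduction -- a single collective communication.
dots_fused = V @ w

assert np.allclose(dots_serial, dots_fused)
```

On a SIMD machine like the CM-5 mentioned above, collapsing s reductions into one is what makes the block formulation attractive, at the cost of a more delicate recurrence.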

  14. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    DOE PAGES

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptops to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  15. Crystal MD: The massively parallel molecular dynamics software for metal with BCC structure

    NASA Astrophysics Data System (ADS)

    Hu, Changjun; Bai, He; He, Xinfu; Zhang, Boyao; Nie, Ningming; Wang, Xianmeng; Ren, Yingwen

    2017-02-01

    Material irradiation effects are among the most important issues for the use of nuclear power. However, the lack of high-throughput irradiation facilities and the limited knowledge of the evolution process have left these issues poorly understood. With the help of high-performance computing, we can gain a deeper understanding of materials at the micro level. In this paper, a new data structure is proposed for the massively parallel simulation of the evolution of metal materials in an irradiation environment. Based on the proposed data structure, we developed new molecular dynamics software named Crystal MD. Simulations with Crystal MD achieved over 90% parallel efficiency in test cases, and it uses more than 25% less memory on multi-core clusters than LAMMPS and IMD, two popular molecular dynamics packages. Using Crystal MD, a two-trillion-particle simulation has been performed on the Tianhe-2 cluster.
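
For orientation, the inner loop that such MD codes parallelize and optimize memory layout for is pairwise force evaluation plus time integration. A minimal serial velocity-Verlet sketch with a Lennard-Jones pair potential (a generic textbook illustration; it reflects nothing of Crystal MD's actual data structures or potentials):

```python
import numpy as np

def lj_forces(x, eps=1.0, sig=1.0):
    """Lennard-Jones forces for an (n, 3) array of positions, O(n^2) pairs.
    Production codes replace this loop with cell/neighbor lists and
    distribute it across nodes."""
    n = len(x)
    f = np.zeros_like(x)
    for i in range(n):
        for j in range(i + 1, n):
            r = x[i] - x[j]                      # vector from j to i
            d2 = np.dot(r, r)
            s6 = (sig * sig / d2) ** 3
            mag = 24.0 * eps * (2.0 * s6 * s6 - s6) / d2
            f[i] += mag * r                      # Newton's third law pair
            f[j] -= mag * r
    return f

def verlet_step(x, v, dt, m=1.0):
    """One velocity-Verlet step: half-kick, drift, half-kick."""
    v_half = v + 0.5 * dt * lj_forces(x) / m
    x_new = x + dt * v_half
    v_new = v_half + 0.5 * dt * lj_forces(x_new) / m
    return x_new, v_new

# Two particles at the LJ equilibrium separation 2^(1/6)*sig: zero net force.
x = np.array([[0.0, 0.0, 0.0], [2 ** (1 / 6), 0.0, 0.0]])
print(lj_forces(x))  # ~0 at the potential minimum
```

The memory-layout question the abstract addresses enters precisely here: how positions, velocities, and neighbor data are arranged per atom dominates cache behavior and communication volume at scale.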

  16. Three dimensional simulations of viscous folding in diverging microchannels

    NASA Astrophysics Data System (ADS)

    Xu, Bingrui; Chergui, Jalel; Shin, Seungwon; Juric, Damir

    2016-11-01

    Three-dimensional simulations of the viscous folding in diverging microchannels reported by Cubaud and Mason are performed using the parallel code BLUE for multi-phase flows. The more viscous liquid L1 is injected into the channel from the center inlet, and the less viscous liquid L2 from two side inlets. Liquid L1 takes the form of a thin filament due to hydrodynamic focusing in the long channel that leads to the diverging region. The thread then becomes unstable to a folding instability, due to the longitudinal compressive stress applied to it by the diverging flow of liquid L2. We performed a parameter study in which the flow rate ratio, the viscosity ratio, the Reynolds number, and the shape of the channel were varied relative to a reference model. In our simulations, the cross section of the thread produced by focusing is elliptical rather than circular. The initial folding axis can be either parallel or perpendicular to the narrow dimension of the chamber. In the former case, the folding slowly transforms via twisting to perpendicular folding, or it may remain parallel. The direction of folding onset is determined by the velocity profile and the elliptical shape of the thread cross section in the channel that feeds the diverging part of the cell.

  17. Application of CHAD hydrodynamics to shock-wave problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trease, H.E.; O`Rourke, P.J.; Sahota, M.S.

    1997-12-31

    CHAD is the latest in a sequence of continually evolving computer codes written to effectively utilize massively parallel computer architectures and the latest grid generators for unstructured meshes. Its applications range from automotive design issues such as in-cylinder and manifold flows of internal combustion engines, vehicle aerodynamics, underhood cooling, and passenger compartment heating, ventilation, and air conditioning, to shock hydrodynamics and materials modeling. CHAD solves the full unsteady Navier-Stokes equations with the k-epsilon turbulence model in three space dimensions. The code has four major features that distinguish it from the earlier KIVA code, also developed at Los Alamos. First, it is based on a node-centered, finite-volume method in which, like finite element methods, all fluid variables are located at computational nodes. The computational mesh efficiently and accurately handles all element shapes ranging from tetrahedra to hexahedra. Second, it is written in standard Fortran 90 and relies on automatic domain decomposition and a universal communication library, written in standard C and MPI for unstructured grids, to effectively exploit distributed-memory parallel architectures. Thus the code is fully portable to a variety of computing platforms such as uniprocessor workstations, symmetric multiprocessors, clusters of workstations, and massively parallel platforms. Third, CHAD utilizes a variable explicit/implicit upwind method for convection that improves computational efficiency in flows that have large Courant number variations due to velocity or mesh-size variations. Fourth, CHAD is designed to also simulate shock hydrodynamics involving multimaterial anisotropic behavior under high shear. The authors discuss CHAD's capabilities and show several sample calculations illustrating the strengths and weaknesses of CHAD.

  18. Dynamic current-current susceptibility in three-dimensional Dirac and Weyl semimetals

    NASA Astrophysics Data System (ADS)

    Thakur, Anmol; Sadhukhan, Krishanu; Agarwal, Amit

    2018-01-01

    We study the linear response of doped three-dimensional Dirac and Weyl semimetals to vector potentials, by calculating the wave-vector- and frequency-dependent current-current response function analytically. The longitudinal part of the dynamic current-current response function is then used to study the plasmon dispersion and the optical conductivity. The transverse response in the static limit yields the orbital magnetic susceptibility. In a Weyl semimetal, along with the current-current response function, all these quantities are significantly impacted by the presence of parallel electric and magnetic fields (a finite E·B term) and can be used to experimentally explore the chiral anomaly.

  19. Unstructured mesh algorithms for aerodynamic calculations

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.

    1992-01-01

    The use of unstructured mesh techniques for solving complex aerodynamic flows is discussed. The principal advantages of unstructured mesh strategies, as they relate to complex geometries, adaptive meshing capabilities, and parallel processing, are emphasized. The various aspects required for the efficient and accurate solution of aerodynamic flows are addressed. These include mesh generation, mesh adaptivity, solution algorithms, convergence acceleration, and turbulence modeling. Computations of viscous turbulent two-dimensional flows and inviscid three-dimensional flows about complex configurations are demonstrated. Remaining obstacles and directions for future research are also outlined.

  20. Cultural Narratives: Developing a Three-Dimensional Learning Community through Braided Understanding

    ERIC Educational Resources Information Center

    Heck, Marsha L.

    2004-01-01

    Paula Underwood's "Learning Stories" braid together body, mind, and spirit to enable understanding that does not easily unravel. They tell of relationships among individual and community learning that parallel other ancient and contemporary ideas about learning in caring communities. Underwood's tradition considers learning sacred; everyone's…
