Sample records for scale parallel structured

  1. A parallel orbital-updating based plane-wave basis method for electronic structure calculations

    NASA Astrophysics Data System (ADS)

    Pan, Yan; Dai, Xiaoying; de Gironcoli, Stefano; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui

    2017-11-01

    Motivated by the recently proposed parallel orbital-updating approach in the real-space method [1], we propose a parallel orbital-updating based plane-wave basis method for solving the eigenvalue problems arising in electronic structure calculations. In addition, we propose two new modified parallel orbital-updating methods. Compared to traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large-scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large-scale calculations on modern supercomputers.
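The two-level idea (independent per-orbital updates at the outer level, a small coupled problem at the inner level) can be caricatured with a toy subspace-iteration eigensolver. This sketch illustrates only the general structure, not the authors' algorithm; all names and the shift choice are invented:

```python
import numpy as np

def lowest_states(H, k, iters=500, seed=0):
    """Toy two-level scheme: each column (orbital) is updated independently
    (outer, embarrassingly parallel level); the small k-by-k Rayleigh-Ritz
    problem that couples the orbitals is the inner level."""
    n = H.shape[0]
    U = np.linalg.qr(np.random.default_rng(seed).standard_normal((n, k)))[0]
    sigma = np.linalg.norm(H, 1)        # cheap upper bound on the spectrum
    for _ in range(iters):
        W = sigma * U - H @ U           # per-orbital updates, no coupling
        U = np.linalg.qr(W)[0]          # re-orthonormalize
        evals, V = np.linalg.eigh(U.T @ H @ U)   # Rayleigh-Ritz coupling
        U = U @ V
    return evals, U
```

In a plane-wave code the second level would additionally distribute each column's coefficients across processors; here every update is a dense matrix product for brevity.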

  2. Accelerating large-scale protein structure alignments with graphics processing units

    PubMed Central

    2012-01-01

    Background: Large-scale protein structure alignment, an indispensable tool in structural bioinformatics, poses a tremendous challenge to computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings: We present ppsAlign, a Parallel Protein Structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign can incorporate many current methods, such as TM-align and Fr-TM-align, into its parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions: ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues of protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs. PMID:22357132

  3. Multi-thread parallel algorithm for reconstructing 3D large-scale porous structures

    NASA Astrophysics Data System (ADS)

    Ju, Yang; Huang, Yaohui; Zheng, Jiangtao; Qian, Xu; Xie, Heping; Zhao, Xi

    2017-04-01

    Geomaterials inherently contain many discontinuous, multi-scale, geometrically irregular pores, forming a complex porous structure that governs their mechanical and transport properties. An efficient method for reconstructing porous structures can therefore contribute significantly to understanding how those structures govern the properties of porous materials. In order to improve the efficiency of reconstructing large-scale porous structures, a multi-thread parallel scheme was incorporated into the simulated annealing reconstruction method. In this method, four correlation functions, which include the two-point probability function, the linear-path functions for the pore phase and the solid phase, and the fractal system function for the solid phase, were employed for better reproduction of complex well-connected porous structures. In addition, a random sphere packing method and a self-developed pre-conditioning method were incorporated to cast the initial reconstructed model and select independent interchanging pairs for parallel multi-thread calculation, respectively. The accuracy of the proposed algorithm was evaluated by examining the similarity between the reconstructed structure and a prototype in terms of their geometrical, topological, and mechanical properties. Comparisons of the reconstruction efficiency of porous models at various scales indicated that the parallel multi-thread scheme significantly shortened the execution time for reconstruction of a large-scale well-connected porous model compared to a sequential single-thread procedure.
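The first of the four correlation functions, the two-point probability function, is straightforward to sketch for a periodic binary image. This is a minimal illustration (sampling along one axis only), not the authors' implementation:

```python
import numpy as np

def two_point_probability(img, max_lag):
    """S2(r): probability that two points a distance r apart (here along the
    x-axis, with periodic wrap-around) both lie in the pore phase (value 1)."""
    pore = (img == 1)
    return np.array([np.mean(pore & np.roll(pore, -r, axis=1))
                     for r in range(max_lag + 1)])
```

S2(0) equals the porosity, and for an uncorrelated medium S2(r) decays toward the porosity squared; simulated annealing reconstruction perturbs a trial image until such measured functions match the prototype's.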

  4. Symposium on Parallel Computational Methods for Large-scale Structural Analysis and Design, 2nd, Norfolk, VA, US

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O. (Editor); Housner, Jerrold M. (Editor)

    1993-01-01

    Computing speed is leaping forward by several orders of magnitude each decade. Engineers and scientists gathered at a NASA Langley symposium to discuss these exciting trends as they apply to parallel computational methods for large-scale structural analysis and design. Among the topics discussed were: large-scale static analysis; dynamic, transient, and thermal analysis; domain decomposition (substructuring); and nonlinear and numerical methods.

  5. Linear static structural and vibration analysis on high-performance computers

    NASA Technical Reports Server (NTRS)

    Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

    1993-01-01

    Parallel computers offer the opportunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively parallel computers, hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks in structural analysis: generation and assembly of system matrices, solution of systems of equations, and calculation of eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e., models of the High-Speed Civil Transport). The goal of this research is to develop an efficient technique that extends structural analysis to SHPC and makes large-scale structural analyses tractable.
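The assembly step named above parallelizes naturally because each element's contribution to the global matrix is independent. A minimal serial sketch for 1-D bar elements (the dense storage and names are illustrative assumptions, not the paper's code):

```python
import numpy as np

def assemble_global(n_dof, elements, k_el):
    """Scatter each element stiffness into the global matrix. Every element's
    contribution is independent, which is what makes generation and assembly
    easy to distribute across processors."""
    K = np.zeros((n_dof, n_dof))
    for dofs in elements:
        K[np.ix_(dofs, dofs)] += k_el
    return K

# two 1-D bar elements in a chain: nodes (0, 1) and (1, 2)
k_el = np.array([[1.0, -1.0], [-1.0, 1.0]])
K = assemble_global(3, [(0, 1), (1, 2)], k_el)
```

In a distributed implementation each processor would assemble only the rows it owns; the scatter pattern is the same.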

  6. Automatic differentiation for design sensitivity analysis of structural systems using multiple processors

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.; Storaasli, Olaf O.; Qin, Jiangning; Qamar, Ramzi

    1994-01-01

    An automatic differentiation tool (ADIFOR) is incorporated into a finite element based structural analysis program for shape and non-shape design sensitivity analysis of structural systems. The entire analysis and sensitivity procedures are parallelized and vectorized for high performance computation. Small scale examples to verify the accuracy of the proposed program and a medium scale example to demonstrate the parallel vector performance on multiple CRAY C90 processors are included.

  7. Parallel-vector solution of large-scale structural analysis problems on supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

    1989-01-01

    A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
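The Choleski-based solution pattern (factor once, then triangular solves whose inner dot products are the vectorizable loops) can be sketched as follows. This is a generic serial illustration, not the Force-based implementation described above:

```python
import numpy as np

def cholesky_solve(K, f):
    """Solve K x = f for symmetric positive definite K via K = L L^T; the
    inner dot products are the loops a vector machine would pipeline."""
    L = np.linalg.cholesky(K)
    n = f.size
    y = np.empty(n)
    for i in range(n):                       # forward substitution
        y[i] = (f[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.empty(n)
    for i in range(n - 1, -1, -1):           # back substitution
        x[i] = (y[i] - L[i + 1:, i] @ x[i + 1:]) / L[i, i]
    return x
```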

  8. Disappearance of Anisotropic Intermittency in Large-amplitude MHD Turbulence and Its Comparison with Small-amplitude MHD Turbulence

    NASA Astrophysics Data System (ADS)

    Yang, Liping; Zhang, Lei; He, Jiansen; Tu, Chuanyi; Li, Shengtai; Wang, Xin; Wang, Linghua

    2018-03-01

    Multi-order structure functions in the solar wind are reported to display a monofractal scaling when sampled parallel to the local magnetic field and a multifractal scaling when measured perpendicularly. Whether and to what extent will the scaling anisotropy be weakened by the enhancement of turbulence amplitude relative to the background magnetic strength? In this study, based on two runs of the magnetohydrodynamic (MHD) turbulence simulation with different relative levels of turbulence amplitude, we investigate and compare the scaling of multi-order magnetic structure functions and magnetic probability distribution functions (PDFs) as well as their dependence on the direction of the local field. The numerical results show that for the case of large-amplitude MHD turbulence, the multi-order structure functions display a multifractal scaling at all angles to the local magnetic field, with PDFs deviating significantly from the Gaussian distribution and a flatness larger than 3 at all angles. In contrast, for the case of small-amplitude MHD turbulence, the multi-order structure functions and PDFs have different features in the quasi-parallel and quasi-perpendicular directions: a monofractal scaling and Gaussian-like distribution in the former, and a conversion of a monofractal scaling and Gaussian-like distribution into a multifractal scaling and non-Gaussian tail distribution in the latter. These results hint that when intermittencies are abundant and intense, the multifractal scaling in the structure functions can appear even if it is in the quasi-parallel direction; otherwise, the monofractal scaling in the structure functions remains even if it is in the quasi-perpendicular direction.
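The multi-order structure functions S_q(l) = <|b(x + l) - b(x)|^q>, and the flatness S_4/S_2^2 used to detect non-Gaussianity, can be sketched for a 1-D series. This is a simplified stand-in for the conditioned, field-aligned sampling used in the paper:

```python
import numpy as np

def structure_function(b, lag, order):
    """S_q(l) = <|b(x + l) - b(x)|^q> for a 1-D series b."""
    db = b[lag:] - b[:-lag]
    return np.mean(np.abs(db) ** order)

def flatness(b, lag):
    """S_4 / S_2^2: equals 3 for Gaussian increments and rises above 3
    as intermittency strengthens."""
    return structure_function(b, lag, 4) / structure_function(b, lag, 2) ** 2
```

In the paper the increments are additionally binned by the angle between the separation vector and the local mean field before the moments are taken.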

  9. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)

    2002-01-01

    A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach is demonstrated on several parallel computers.

  10. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Byun, Chansup; Kwak, Dochan (Technical Monitor)

    2001-01-01

    A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel super computers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach is demonstrated on several parallel computers.

  11. A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

    1992-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
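FETI itself is a dual method that glues subdomains together with Lagrange multipliers; as a simplified stand-in, the sketch below shows the closely related primal substructuring idea: eliminate each subdomain's interior independently (the naturally parallel step), solve a small interface system, then back-substitute. Everything here is illustrative, not the authors' method:

```python
import numpy as np

def substructure_solve(K, f, interiors, interface):
    """Solve K u = f by static condensation: each subdomain's interior is
    eliminated independently, the reduced interface (Schur complement)
    system is solved, and interiors are recovered in parallel."""
    S = K[np.ix_(interface, interface)].astype(float).copy()
    g = f[interface].astype(float).copy()
    local = []
    for I in interiors:
        KII = K[np.ix_(I, I)]
        KIb = K[np.ix_(I, interface)]
        X = np.linalg.solve(KII, np.column_stack([KIb, f[I]]))
        XIb, xIf = X[:, :-1], X[:, -1]
        S -= KIb.T @ XIb              # Schur complement update
        g -= KIb.T @ xIf
        local.append((I, XIb, xIf))
    u = np.zeros(len(f))
    u[interface] = np.linalg.solve(S, g)
    for I, XIb, xIf in local:         # independent back-substitutions
        u[I] = xIf - XIb @ u[interface]
    return u
```

FETI instead leaves the subdomains disconnected, enforces interface continuity with multipliers, and solves the resulting dual problem iteratively, which is what avoids the large factorizations the abstract criticizes.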

  12. Magnetic intermittency of solar wind turbulence in the dissipation range

    NASA Astrophysics Data System (ADS)

    Pei, Zhongtian; He, Jiansen; Tu, Chuanyi; Marsch, Eckart; Wang, Linghua

    2016-04-01

    The feature, nature, and fate of intermittency in the dissipation range are an interesting topic in solar wind turbulence. We calculate the distribution of flatness for the magnetic field fluctuations as a function of angle and scale. The flatness distribution shows a "butterfly" pattern, with two wings located at angles parallel/anti-parallel to the local mean magnetic field direction and the main body located at angles perpendicular to local B0. This "butterfly" pattern illustrates that the flatness profile in the (anti-)parallel direction approaches its maximum value at larger scale and drops faster than that in the perpendicular direction. The contours of the probability distribution functions at different scales illustrate a "vase" pattern, clearer in the parallel direction, which confirms the scale variation of flatness and indicates intermittency generation and dissipation. The angular distribution of the structure function in the dissipation range shows an anisotropic pattern. The quasi-mono-fractal scaling of the structure function in the dissipation range is also illustrated and investigated with the mathematical model for inhomogeneous cascading (extended p-model). Different from the inertial range, the extended p-model for the dissipation range results in an approximately uniform fragmentation measure. However, a more complete mathematical and physical model involving both non-uniform cascading and dissipation is needed. The nature of intermittency may be strong structures or large-amplitude fluctuations, which may be tested with magnetic helicity. In one case study, we find that the heating effect, in terms of entropy, for large-amplitude fluctuations seems to be more obvious than for strong structures.

  13. MAGNETIC BRAIDING AND PARALLEL ELECTRIC FIELDS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilmot-Smith, A. L.; Hornig, G.; Pontin, D. I.

    2009-05-10

    The braiding of the solar coronal magnetic field via photospheric motions -- with subsequent relaxation and magnetic reconnection -- is one of the most widely debated ideas of solar physics. We readdress the theory in light of developments in three-dimensional magnetic reconnection theory. It is known that the integrated parallel electric field along field lines is the key quantity determining the rate of reconnection, in contrast with the two-dimensional case where the electric field itself is the important quantity. We demonstrate that this difference becomes crucial for sufficiently complex magnetic field structures. A numerical method is used to relax a braided magnetic field toward an ideal force-free equilibrium; the field is found to remain smooth throughout the relaxation, with only large-scale current structures. However, a highly filamentary integrated parallel current structure with extremely short length-scales is found in the field, with the associated gradients intensifying during the relaxation process. An analytical model is developed to show that, in a coronal situation, the length scales associated with the integrated parallel current structures will rapidly decrease with increasing complexity, or degree of braiding, of the magnetic field. Analysis shows the decrease in these length scales will, for any finite resistivity, eventually become inconsistent with the stability of the coronal field. Thus the inevitable consequence of the magnetic braiding process is a loss of equilibrium of the magnetic field, probably via magnetic reconnection events.

  14. A low-altitude mechanism for mesoscale dynamics, structure, and current filamentation in the discrete aurora

    NASA Technical Reports Server (NTRS)

    Keskinen, M. J.; Chaturvedi, P. K.; Ossakow, S. L.

    1992-01-01

    The 2D nonlinear evolution of the ionization-driven adiabatic auroral arc instability is studied. We find: (1) the adiabatic auroral arc instability can fully develop on time scales of tens to hundreds of seconds and on spatial scales of tens to hundreds of kilometers; (2) the evolution of this instability leads to nonlinear 'hook-shaped' conductivity structures; (3) this instability can lead to parallel current filamentation over a wide range of scale sizes; and (4) the k-spectra of the density, electric field, and parallel current develop into inverse power laws, in agreement with satellite observations. Comparison with mesoscale auroral phenomenology and current filamentation structures is made.

  15. Parallel-SymD: A Parallel Approach to Detect Internal Symmetry in Protein Domains.

    PubMed

    Jha, Ashwani; Flurchick, K M; Bikdash, Marwan; Kc, Dukka B

    2016-01-01

    Internally symmetric proteins are proteins that have a symmetrical structure in their monomeric single-chain form. Around 10-15% of the protein domains can be regarded as having some sort of internal symmetry. In this regard, we previously published SymD (symmetry detection), an algorithm that determines whether a given protein structure has internal symmetry by attempting to align the protein to its own copy after the copy is circularly permuted by all possible numbers of residues. SymD has proven to be a useful algorithm to detect symmetry. In this paper, we present a new parallelized algorithm called Parallel-SymD for detecting symmetry of proteins on clusters of computers. The achieved speedup of the new Parallel-SymD algorithm scales well with the number of computing processors. Scaling is better for proteins with a larger number of residues. For a protein of 509 residues, a speedup of 63 was achieved on a parallel system with 100 processors.
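The core of the SymD idea, scoring a structure against every circular permutation of itself with each permutation handled independently, can be caricatured at the sequence level. The toy scoring below (exact character matches, thread-pool parallelism) is an invented stand-in for the real structural alignment:

```python
from concurrent.futures import ThreadPoolExecutor

def shift_score(s, k):
    """Toy similarity between a sequence and its circular permutation by k."""
    t = s[k:] + s[:k]
    return sum(a == b for a, b in zip(s, t))

def detect_symmetry(s, workers=4):
    """Score every nonzero circular shift; each shift is independent, so this
    map step is the kind of work Parallel-SymD distributes across processors."""
    shifts = range(1, len(s))
    with ThreadPoolExecutor(workers) as ex:
        scores = list(ex.map(lambda k: shift_score(s, k), shifts))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1, scores[best]
```

A strong score at a nonzero shift (here 5, half the length) is the signature of internal symmetry; for a repeat-free string the best nonzero shift scores low.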

  16. Parallel-SymD: A Parallel Approach to Detect Internal Symmetry in Protein Domains

    PubMed Central

    Jha, Ashwani; Flurchick, K. M.; Bikdash, Marwan

    2016-01-01

    Internally symmetric proteins are proteins that have a symmetrical structure in their monomeric single-chain form. Around 10–15% of the protein domains can be regarded as having some sort of internal symmetry. In this regard, we previously published SymD (symmetry detection), an algorithm that determines whether a given protein structure has internal symmetry by attempting to align the protein to its own copy after the copy is circularly permuted by all possible numbers of residues. SymD has proven to be a useful algorithm to detect symmetry. In this paper, we present a new parallelized algorithm called Parallel-SymD for detecting symmetry of proteins on clusters of computers. The achieved speedup of the new Parallel-SymD algorithm scales well with the number of computing processors. Scaling is better for proteins with a larger number of residues. For a protein of 509 residues, a speedup of 63 was achieved on a parallel system with 100 processors. PMID:27747230

  17. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
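The abstract does not spell out the stochastic preconditioned conjugate gradient method; one plausible reading, sketched here purely as an illustration, is to build a single preconditioner from the mean system and reuse it across all random realizations:

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-12, maxit=1000):
    """Preconditioned conjugate gradients; M_inv applies a fixed
    preconditioner. Reusing one M_inv across many sampled systems is the
    assumed reading of the 'stochastic preconditioned' idea."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M_inv(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x

# mean system and a Jacobi preconditioner built from it once
n = 30
A0 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
jacobi = lambda r: r / np.diag(A0)
```

Since the Monte Carlo samples are perturbations of the same mean stiffness, one preconditioner stays effective for all of them, and the independent solves are the embarrassingly parallel level.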

  18. MMS Observations of Parallel Electric Fields During a Quasi-Perpendicular Bow Shock Crossing

    NASA Astrophysics Data System (ADS)

    Goodrich, K.; Schwartz, S. J.; Ergun, R.; Wilder, F. D.; Holmes, J.; Burch, J. L.; Gershman, D. J.; Giles, B. L.; Khotyaintsev, Y. V.; Le Contel, O.; Lindqvist, P. A.; Strangeway, R. J.; Russell, C.; Torbert, R. B.

    2016-12-01

    Previous observations of the terrestrial bow shock have frequently shown large-amplitude fluctuations in the parallel electric field. These parallel electric fields are seen as both nonlinear solitary structures, such as double layers and electron phase-space holes, and short-wavelength waves, which can reach amplitudes greater than 100 mV/m. The Magnetospheric Multi-Scale (MMS) Mission has crossed the Earth's bow shock more than 200 times. The parallel electric field signatures observed in these crossings are seen in very discrete packets and evolve over time scales of less than a second, indicating the presence of a wealth of kinetic-scale activity. The high time resolution of the Fast Particle Instrument (FPI) available on MMS offers greater detail of the kinetic-scale physics that occur at bow shocks than ever before, allowing greater insight into the overall effect of these observed electric fields. We present a characterization of these parallel electric fields found in a single bow shock event and how it reflects the kinetic-scale activity that can occur at the terrestrial bow shock.

  19. Displacement and deformation measurement for large structures by camera network

    NASA Astrophysics Data System (ADS)

    Shang, Yang; Yu, Qifeng; Yang, Zhen; Xu, Zhiqiang; Zhang, Xiaohu

    2014-03-01

    A displacement and deformation measurement method for large structures based on a series-parallel connection camera network is presented. Taking the dynamic monitoring of a large-scale crane during lifting operations as an example, a series-parallel connection camera network is designed, and the displacement and deformation measurement method using this network is studied. The movement range of the crane body is small, while that of the crane arm is large. The displacement of the crane body, the displacement of the crane arm relative to the body, and the deformation of the arm are measured. Compared with a pure series or pure parallel connection camera network, the designed series-parallel connection camera network can measure not only the movement and displacement of a large structure but also the relative movement and deformation of particular parts of interest, using a relatively simple optical measurement system.

  20. Parallel Large-Scale Molecular Dynamics Simulation Opens New Perspective to Clarify the Effect of a Porous Structure on the Sintering Process of Ni/YSZ Multiparticles.

    PubMed

    Xu, Jingxiang; Higuchi, Yuji; Ozawa, Nobuki; Sato, Kazuhisa; Hashida, Toshiyuki; Kubo, Momoji

    2017-09-20

    Ni sintering in the Ni/YSZ porous anode of a solid oxide fuel cell changes the porous structure, leading to degradation. Preventing sintering and degradation during operation is a great challenge. Usually, a sintering molecular dynamics (MD) simulation model consisting of two particles on a substrate is used; however, the model cannot reflect the porous structure effect on sintering. In our previous study, a multi-nanoparticle sintering modeling method with tens of thousands of atoms revealed the effect of the particle framework and porosity on sintering. However, the method cannot reveal the effect of the particle size on sintering and the effect of sintering on the change in the porous structure. In the present study, we report a strategy to reveal them in the porous structure by using our multi-nanoparticle modeling method and a parallel large-scale multimillion-atom MD simulator. We used this method to investigate the effect of YSZ particle size and tortuosity on sintering and degradation in the Ni/YSZ anodes. Our parallel large-scale MD simulation showed that the sintering degree decreased as the YSZ particle size decreased. The gas fuel diffusion path, which reflects the overpotential, was blocked by pore coalescence during sintering. The degradation of gas diffusion performance increased as the YSZ particle size increased. Furthermore, the gas diffusion performance was quantified by a tortuosity parameter and an optimal YSZ particle size, which is equal to that of Ni, was found for good diffusion after sintering. These findings cannot be obtained by previous MD sintering studies with tens of thousands of atoms. The present parallel large-scale multimillion-atom MD simulation makes it possible to clarify the effects of the particle size and tortuosity on sintering and degradation.

  21. A new parallel-vector finite element analysis software on distributed-memory computers

    NASA Technical Reports Server (NTRS)

    Qin, Jiangning; Nguyen, Duc T.

    1993-01-01

    A new parallel-vector finite element analysis software package, MPFEA (Massively Parallel-vector Finite Element Analysis), is developed for large-scale structural analysis on massively parallel computers with distributed memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. A block-skyline storage scheme, along with vector-unrolling techniques, is used to enhance vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
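A plain (non-block) column-skyline storage scheme, the basis of the block-skyline scheme mentioned above, can be sketched as follows; this is a generic illustration, not MPFEA's data structure:

```python
import numpy as np

def skyline_store(K):
    """Column-skyline storage of a symmetric matrix: for each column j keep
    only the entries from the first nonzero row down to the diagonal."""
    n = K.shape[0]
    first, cols = [], []
    for j in range(n):
        i0 = next(i for i in range(j + 1) if K[i, j] != 0.0 or i == j)
        first.append(i0)
        cols.append(K[i0:j + 1, j].copy())
    return first, cols

def skyline_get(first, cols, i, j):
    """Read K[i, j] back out of the skyline arrays."""
    if i > j:
        i, j = j, i               # symmetry: only the upper triangle is stored
    if i < first[j]:
        return 0.0                # below the skyline: structurally zero
    return cols[j][i - first[j]]
```

The appeal for Choleski-type factorization is that fill-in stays inside the skyline, so the storage profile fixed here survives the factorization unchanged.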

  22. Thermal stability analysis of the fine structure of solar prominences

    NASA Technical Reports Server (NTRS)

    Demoulin, Pascal; Malherbe, Jean-Marie; Schmieder, Brigitte; Raadu, Mickael A.

    1986-01-01

    The linear thermal stability of a 2D periodic structure (alternately hot and cold) in a uniform magnetic field is analyzed. The energy equation includes wave heating (assumed proportional to density), radiative cooling, and conduction both parallel and orthogonal to the magnetic field lines. The equilibrium is perturbed at constant gas pressure. With parallel conduction only, it is found to be unstable when the parallel length scale l∥ is greater than 45 Mm. In that case, orthogonal conduction becomes important and stabilizes the structure when the length scale is smaller than 5 km. On the other hand, when the length scale is greater than 5 km, the thermal equilibrium is unstable, and the corresponding time scale is about 10,000 s. This result may be compared to observations showing that the lifetime of the fine structure of solar prominences is about one hour; consequently, our computations suggest that the size of the unresolved threads could be of the order of 10 km only.

  23. Parallel computing for probabilistic fatigue analysis

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.

    1993-01-01

    This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single- or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism, to keep a large number of processors busy, and to treat problems with the large memory requirements encountered in practice. We also conclude that a distributed-memory architecture is preferable to shared memory for achieving large-scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.

  24. Development of parallel scales to measure HIV-related stigma

    PubMed Central

    Visser, Maretha J.; Kershaw, Trace; Makin, Jennifer D.; Forsyth, Brian W.C.

    2014-01-01

    HIV-related stigma is a multidimensional concept which has pervasive effects on the lives of HIV-infected people as well as serious consequences for the management of HIV/AIDS. In this research, three parallel stigma scales were developed to assess personal views of stigma, stigma attributed to others, and internalized stigma experienced by HIV-infected individuals. The stigma scales were administered in two samples: a community sample of 1077 respondents and 317 HIV-infected pregnant women recruited at clinics in the same community in Tshwane (South Africa). A two-factor structure referring to moral judgment and interpersonal distancing was confirmed across scales and sample groups. The internal consistency of the scales was acceptable, and evidence of validity is reported. Parallel scales to assess and compare different perspectives of stigma provide opportunities for research aimed at understanding stigma, assessing its consequences, or evaluating possible interventions aimed at reducing stigma. PMID:18266101

  25. Implementation of molecular dynamics and its extensions with the coarse-grained UNRES force field on massively parallel systems; towards millisecond-scale simulations of protein structure, dynamics, and thermodynamics

    PubMed Central

    Liwo, Adam; Ołdziej, Stanisław; Czaplewski, Cezary; Kleinerman, Dana S.; Blood, Philip; Scheraga, Harold A.

    2010-01-01

    We report the implementation of our united-residue UNRES force field for simulations of protein structure and dynamics with massively parallel architectures. In addition to coarse-grained parallelism already implemented in our previous work, in which each conformation was treated by a different task, we introduce a fine-grained level in which energy and gradient evaluation are split between several tasks. The Message Passing Interface (MPI) libraries have been utilized to construct the parallel code. The parallel performance of the code has been tested on a professional Beowulf cluster (Xeon Quad Core), a Cray XT3 supercomputer, and two IBM BlueGene/P supercomputers with canonical and replica-exchange molecular dynamics. With IBM BlueGene/P, about 50 % efficiency and 120-fold speed-up of the fine-grained part was achieved for a single trajectory of a 767-residue protein with use of 256 processors/trajectory. Because of averaging over the fast degrees of freedom, UNRES provides an effective 1000-fold speed-up compared to the experimental time scale and, therefore, enables us to effectively carry out millisecond-scale simulations of proteins with 500 and more amino-acid residues in days of wall-clock time. PMID:20305729
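Combining the coarse-grained level (one task group per trajectory) with the fine-grained level (energy and gradient evaluation split within a group) amounts to splitting a flat MPI rank space the way MPI_Comm_split does. A pure-Python sketch of that arithmetic (no MPI runtime; the names and group shape are invented for illustration):

```python
def split_ranks(world_size, ranks_per_trajectory):
    """Mimic the effect of MPI_Comm_split: color = trajectory index (the
    coarse-grained level), key = rank within that trajectory's
    force-evaluation group (the fine-grained level)."""
    assert world_size % ranks_per_trajectory == 0
    groups = {}
    for rank in range(world_size):
        color = rank // ranks_per_trajectory   # which trajectory
        key = rank % ranks_per_trajectory      # position inside it
        groups.setdefault(color, []).append((key, rank))
    return groups
```

With the paper's figures this mapping would carve, say, a 1024-rank world into four trajectories of 256 ranks each, with energy and gradient terms scattered over the 256 ranks of each group.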

  6. Structural parallels between terrestrial microbialites and Martian sediments: are all cases of `Pareidolia'?

    NASA Astrophysics Data System (ADS)

    Rizzo, Vincenzo; Cantasano, Nicola

    2017-10-01

    The study analyses possible parallels between known microbialite structures and a set of similar settings selected through a systematic investigation of the wide record of images shot by NASA rovers. Terrestrial cases involve structures due both to bio-mineralization processes and to bacterial metabolism, occurring at sizes larger than 0.1 mm, at micro, meso and macro scales. The study highlights the occurrence in Martian sediments of widespread structures such as microspherules, often organized into higher-order settings. Such structures also occur on terrestrial stromatolites in a great variety of `Microbially Induced Sedimentary Structures', such as voids, gas domes and layer deformations of microbial mats. We present a suite of analogies, at different scales of morphological, structural and conceptual relevance, compelling enough to make the case that similarities between Martian sediment structures and terrestrial microbialites are not all cases of `Pareidolia'.

  7. Highly efficient spatial data filtering in parallel using the opensource library CPPPO

    NASA Astrophysics Data System (ADS)

    Municchi, Federico; Goniva, Christoph; Radl, Stefan

    2016-10-01

    CPPPO is a compilation of parallel data processing routines developed with the aim of creating a library for "scale bridging" (i.e. connecting different scales by means of closure models) in a multi-scale approach. CPPPO features a number of parallel filtering algorithms designed for use with structured and unstructured Eulerian meshes, as well as Lagrangian data sets. In addition, data can be processed on the fly, allowing the collection of relevant statistics without saving individual snapshots of the simulation state. Our library is provided with an interface to the widely-used CFD solver OpenFOAM®, and can be easily connected to any other software package via interface modules. Also, we introduce a novel, extremely efficient approach to parallel data filtering, and show that our algorithms scale super-linearly on multi-core clusters. Furthermore, we provide a guideline for choosing the optimal Eulerian cell selection algorithm depending on the number of CPU cores used. Finally, we demonstrate the accuracy and the parallel scalability of CPPPO in a showcase focusing on heat and mass transfer from a dense bed of particles.
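The "on the fly" processing described above can be sketched with a running-statistics accumulator; this is a generic Welford update, not CPPPO's actual API:

```python
class RunningStats:
    """Accumulate mean and variance incrementally (Welford's algorithm),
    so statistics can be gathered during a simulation without storing
    per-timestep snapshots of the full field."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the samples seen so far."""
        return self._m2 / self.n if self.n else 0.0

stats = RunningStats()
for sample in [1.0, 2.0, 3.0, 4.0]:   # e.g. one value per timestep
    stats.update(sample)
print(stats.mean, stats.variance)     # 2.5 1.25
```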

  8. Measures of three-dimensional anisotropy and intermittency in strong Alfvénic turbulence

    NASA Astrophysics Data System (ADS)

    Mallet, A.; Schekochihin, A. A.; Chandran, B. D. G.; Chen, C. H. K.; Horbury, T. S.; Wicks, R. T.; Greenan, C. C.

    2016-06-01

    We measure the local anisotropy of numerically simulated strong Alfvénic turbulence with respect to two local, physically relevant directions: along the local mean magnetic field and along the local direction of one of the fluctuating Elsasser fields. We find significant scaling anisotropy with respect to both these directions: the fluctuations are `ribbon-like' - statistically, they are elongated along both the mean magnetic field and the fluctuating field. The latter form of anisotropy is due to scale-dependent alignment of the fluctuating fields. The intermittent scalings of the nth-order conditional structure functions in the direction perpendicular to both the local mean field and the fluctuations agree well with the theory of Chandran, Schekochihin & Mallet, while the parallel scalings are consistent with those implied by the critical-balance conjecture. We quantify the relationship between the perpendicular scalings and those in the fluctuation and parallel directions, and find that the scaling exponent of the perpendicular anisotropy (i.e. of the aspect ratio of the Alfvénic structures in the plane perpendicular to the mean magnetic field) depends on the amplitude of the fluctuations. This is shown to be equivalent to the anticorrelation of fluctuation amplitude and alignment at each scale. The dependence of the anisotropy on amplitude is shown to be more significant for the anisotropy between the perpendicular and fluctuation-direction scales than it is between the perpendicular and parallel scales.
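The conditional structure functions measured above build on the plain nth-order structure function; a minimal, unconditioned, one-dimensional version for illustration (the paper's variants additionally condition on local field direction and amplitude):

```python
import numpy as np

def structure_function(field: np.ndarray, lag: int, order: int) -> float:
    """S_n(l) = <|f(x + l) - f(x)|^n>, averaged over all positions x
    for a 1-D sampled field with unit grid spacing."""
    increments = field[lag:] - field[:-lag]
    return float(np.mean(np.abs(increments) ** order))

signal = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
print(structure_function(signal, lag=1, order=2))  # 1.0
```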

  9. Self-propulsion of Leidenfrost Drops between Non-Parallel Structures.

    PubMed

    Luo, Cheng; Mrinal, Manjarik; Wang, Xiang

    2017-09-20

    In this work, we explored self-propulsion of a Leidenfrost drop between non-parallel structures. A theoretical model was first developed to determine conditions for liquid drops to start moving away from the corner of two non-parallel plates. These conditions were then simplified for the case of a Leidenfrost drop. Furthermore, ejection speeds and travel distances of Leidenfrost drops were derived using a scaling law. Subsequently, the theoretical models were validated by experiments. Finally, three new devices have been developed to manipulate Leidenfrost drops in different ways.

  10. Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A.; Kabel, A.; Lee, L.

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.

  11. Space Technology 5 Multi-point Observations of Field-aligned Currents: Temporal Variability of Meso-Scale Structures

    NASA Technical Reports Server (NTRS)

    Le, Guan; Wang, Yongli; Slavin, James A.; Strangeway, Robert J.

    2007-01-01

    Space Technology 5 (ST5) is a three micro-satellite constellation deployed into a 300 x 4500 km, dawn-dusk, sun-synchronous polar orbit from March 22 to June 21, 2006, for technology validations. In this paper, we present a study of the temporal variability of field-aligned currents using multi-point magnetic field measurements from ST5. The data demonstrate that meso-scale current structures are commonly embedded within large-scale field-aligned current sheets. The meso-scale current structures are very dynamic, with highly variable current density and/or polarity on time scales of approximately 10 min. They exhibit large temporal variations during both quiet and disturbed times on such time scales. On the other hand, the data also show that the time scales for the currents to be relatively stable are approximately 1 min for meso-scale currents and approximately 10 min for large-scale current sheets. These temporal features are evidently associated with dynamic variations of their particle carriers (mainly electrons) as they respond to variations of the parallel electric field in the auroral acceleration region. The characteristic time scales for the temporal variability of meso-scale field-aligned currents are found to be consistent with those of the auroral parallel electric field.

  12. DGDFT: A massively parallel method for large scale density functional theory calculations.

    PubMed

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  13. Probabilistic structural mechanics research for parallel processing computers

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Martin, William R.

    1991-01-01

    Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods has been hampered by their computationally intensive nature. Solution of PSM problems requires repeated analyses of structures that are often large, and exhibit nonlinear and/or dynamic response behavior. These methods are all inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures and innovative control software and solution methodologies are needed to make solution of large scale PSM problems practical.

  14. A Structure-Based Distance Metric for High-Dimensional Space Exploration with Multi-Dimensional Scaling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Hyun Jung; McDonnell, Kevin T.; Zelenyuk, Alla

    2014-03-01

    Although the Euclidean distance does well in measuring data distances within high-dimensional clusters, it does poorly when it comes to gauging inter-cluster distances. This significantly impacts the quality of global, low-dimensional space embedding procedures such as the popular multi-dimensional scaling (MDS) where one can often observe non-intuitive layouts. We were inspired by the perceptual processes evoked in the method of parallel coordinates which enables users to visually aggregate the data by the patterns the polylines exhibit across the dimension axes. We call the path of such a polyline its structure and suggest a metric that captures this structure directly in high-dimensional space. This allows us to better gauge the distances of spatially distant data constellations and so achieve data aggregations in MDS plots that are more cognizant of existing high-dimensional structure similarities. Our MDS plots also exhibit similar visual relationships as the method of parallel coordinates which is often used alongside to visualize the high-dimensional data in raw form. We then cast our metric into a bi-scale framework which distinguishes far-distances from near-distances. The coarser scale uses the structural similarity metric to separate data aggregates obtained by prior classification or clustering, while the finer scale employs the appropriate Euclidean distance.
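One plausible reading of such a structure metric (our sketch, not the authors' exact formulation) compares the slope profiles of the parallel-coordinates polylines rather than the raw coordinates:

```python
import numpy as np

def structure_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Compare two data points by the *shape* of their parallel-coordinates
    polylines: the vector of segment slopes between consecutive axes
    (assuming unit axis spacing). Polylines with the same shape but a
    constant vertical offset come out identical."""
    return float(np.linalg.norm(np.diff(p) - np.diff(q)))

a = np.array([0.0, 2.0, 1.0, 3.0])
b = a + 5.0                      # same polyline shape, shifted up
print(structure_distance(a, b))  # 0.0, although the Euclidean distance is 10
```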

  15. Electron Currents and Heating in the Ion Diffusion Region of Asymmetric Reconnection

    NASA Technical Reports Server (NTRS)

    Graham, D. B.; Khotyaintsev, Yu. V.; Norgren, C.; Vaivads, A.; Andre, M.; Lindqvist, P. A.; Marklund, G. T.; Ergun, R. E.; Paterson, W. R.; Gershman, D. J.

    2016-01-01

    In this letter the structure of the ion diffusion region of magnetic reconnection at Earth's magnetopause is investigated using the Magnetospheric Multiscale (MMS) spacecraft. The ion diffusion region is characterized by a strong DC electric field, approximately equal to the Hall electric field, intense currents, and electron heating parallel to the background magnetic field. Current structures well below ion spatial scales are resolved, and the electron motion associated with lower hybrid drift waves is shown to contribute significantly to the total current density. The electron heating is shown to be consistent with large-scale parallel electric fields trapping and accelerating electrons, rather than wave-particle interactions. These results show that sub-ion scale processes occur in the ion diffusion region and are important for understanding electron heating and acceleration.

  16. Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

    NASA Technical Reports Server (NTRS)

    Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

    2005-01-01

    The overall objectives of this research are to formulate and validate efficient parallel algorithms, and to design and implement computer software for solving large-scale acoustic problems arising from the unified framework of finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high-performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures is evaluated by solving a series of structural and acoustic (symmetrical and unsymmetrical) problems on different computing platforms. Comparisons with existing commercial and/or public-domain software are also included, whenever possible.

  17. Graph-based linear scaling electronic structure theory.

    PubMed

    Niklasson, Anders M N; Mniszewski, Susan M; Negre, Christian F A; Cawkwell, Marc J; Swart, Pieter J; Mohd-Yusof, Jamal; Germann, Timothy C; Wall, Michael E; Bock, Nicolas; Rubensson, Emanuel H; Djidjev, Hristo

    2016-06-21

    We show how graph theory can be combined with quantum theory to calculate the electronic structure of large complex systems. The graph formalism is general and applicable to a broad range of electronic structure methods and materials, including challenging systems such as biomolecules. The methodology combines well-controlled accuracy, low computational cost, and natural low-communication parallelism. This combination addresses substantial shortcomings of linear scaling electronic structure theory, in particular with respect to quantum-based molecular dynamics simulations.

  18. Graph-based linear scaling electronic structure theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Niklasson, Anders M. N., E-mail: amn@lanl.gov; Negre, Christian F. A.; Cawkwell, Marc J.

    2016-06-21

    We show how graph theory can be combined with quantum theory to calculate the electronic structure of large complex systems. The graph formalism is general and applicable to a broad range of electronic structure methods and materials, including challenging systems such as biomolecules. The methodology combines well-controlled accuracy, low computational cost, and natural low-communication parallelism. This combination addresses substantial shortcomings of linear scaling electronic structure theory, in particular with respect to quantum-based molecular dynamics simulations.

  19. Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.

    2005-01-01

    A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and exploit modern computer architecture, such as memory and parallel computing. The most time-consuming part of the formulation is identified, and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation, and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate performance on ORIGIN 2000 processors and a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments and that it is significantly faster and consumes less memory than one of the best available commercial parallel sparse solvers.

  20. Two-dimensional displacement measurement based on two parallel gratings

    NASA Astrophysics Data System (ADS)

    Wei, Peipei; Lu, Xi; Qiao, Decheng; Zou, Limin; Huang, Xiangdong; Tan, Jiubin; Lu, Zhengang

    2018-06-01

    In this paper, a two-dimensional (2-D) planar encoder based on two parallel gratings, which includes a scanning grating and scale grating, is presented. The scanning grating is a combined transmission rectangular grating comprised of a 2-D grating located at the center and two one-dimensional (1-D) gratings located at the sides. The grating lines of the two 1-D gratings are perpendicular to each other and parallel with the 2-D grating lines. The scale grating is a 2-D reflective-type rectangular grating placed in parallel with the scanning grating, and there is an angular difference of 45° between the grating lines of the two 2-D gratings. With the special structural design of the scanning grating, the encoder can measure the 2-D displacement in the grating plane simultaneously, and the measured interference signals in the two directions are uncoupled. Moreover, by utilizing the scanning grating to modulate the phase of the interference signals instead of the prisms, the structure of the encoder is compact. Experiments were implemented, and the results demonstrate the validity of the 2-D planar grating encoder.

  1. Personality Assessment Inventory scale characteristics and factor structure in the assessment of alcohol dependency.

    PubMed

    Schinka, J A

    1995-02-01

    Individual scale characteristics and the inventory structure of the Personality Assessment Inventory (PAI; Morey, 1991) were examined by conducting internal consistency and factor analyses of item and scale score data from a large group (N = 301) of alcohol-dependent patients. Alpha coefficients, mean inter-item correlations, and corrected item-total scale correlations for the sample paralleled values reported by Morey for a large clinical sample. Minor differences in the scale factor structure of the inventory from Morey's clinical sample were found. Overall, the findings support the use of the PAI in the assessment of personality and psychopathology of alcohol-dependent patients.

  2. Correlation of generation interval and scale of large-scale submarine landslides using 3D seismic data off Shimokita Peninsula, Northeast Japan

    NASA Astrophysics Data System (ADS)

    Nakamura, Yuki; Ashi, Juichiro; Morita, Sumito

    2016-04-01

    Clarifying the timing and scale of past submarine landslides is important for understanding their formation processes. The study area is part of the continental slope of the Japan Trench, where a number of large-scale submarine landslide (slump) deposits have been identified in Pliocene and Quaternary formations by analysing METI's 3D seismic data "Sanrikuoki 3D" off Shimokita Peninsula (Morita et al., 2011). Among the structural features are swarms of parallel dikes, which are likely dewatering paths formed during the slumping deformation; slip directions are basically perpendicular to the parallel dikes, so the dikes are a good indicator for estimating slip directions. The slip direction of each slide was determined on a one-kilometre grid across the 40 km x 20 km survey area. The dominant slip direction varies from the Pliocene to the Quaternary in the survey area. Parallel dike structure is also useful for distinguishing slump deposits from normal deposits on time-slice images. By tracing the outline of slump deposits at each depth, we identified the general morphology of the overall slump deposits, and calculated the volume of the extracted slump deposits so as to estimate the scale of each event. We investigated the temporal and spatial variation of the depositional pattern of the slump deposits. Calculating the generation interval of the slumps, some periodicity is recognized; in particular, large slumps do not occur in succession. Additionally, examining the relationship between cumulative volume and generation interval, a certain correlation is observed in both the Pliocene and the Quaternary. Key words: submarine landslides, 3D seismic data, Shimokita Peninsula
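The volume estimate from outlines traced at each depth reduces to summing per-slice areas times the slice spacing; a minimal sketch (function name, data, and rectangle rule are our choices, not the authors'):

```python
import numpy as np

def deposit_volume(slice_areas: np.ndarray, dz: float) -> float:
    """Approximate a slump-deposit volume from the areas enclosed by its
    traced outline at successive, equally spaced depth slices
    (rectangle rule: sum of areas times slice spacing)."""
    return float(np.sum(slice_areas) * dz)

# Hypothetical deposit: outline areas in km^2 at slices 0.5 km apart.
areas = np.array([4.0, 6.0, 5.0, 2.0])
print(deposit_volume(areas, dz=0.5))  # 8.5 (km^3)
```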

  3. Electron Heating at Kinetic Scales in Magnetosheath Turbulence

    NASA Technical Reports Server (NTRS)

    Chasapis, Alexandros; Matthaeus, W. H.; Parashar, T. N.; Lecontel, O.; Retino, A.; Breuillard, H.; Khotyaintsev, Y.; Vaivads, A.; Lavraud, B.; Eriksson, E.

    2017-01-01

    We present a statistical study of coherent structures at kinetic scales, using data from the Magnetospheric Multiscale mission in the Earth's magnetosheath. We implemented the multi-spacecraft partial variance of increments (PVI) technique to detect these structures, which are associated with intermittency at kinetic scales. We examine the properties of the electron heating occurring within such structures. We find that, statistically, structures with a high PVI index are regions of significant electron heating. We also focus on one such structure, a current sheet, which shows some signatures consistent with magnetic reconnection. Strong parallel electron heating coincides with whistler emissions at the edges of the current sheet.
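The PVI index has a standard single-time-series form, PVI(t) = |ΔB(t)| / sqrt(⟨|ΔB|²⟩); a minimal sketch (the multi-spacecraft variant used above combines increments across spacecraft, which we do not attempt here):

```python
import numpy as np

def pvi(b: np.ndarray, lag: int) -> np.ndarray:
    """Partial variance of increments for a vector time series b of shape
    (N, 3): the magnitude of the increment over `lag` samples, normalized
    by its rms value. Values well above ~3 are commonly used to flag
    candidate coherent structures."""
    db = b[lag:] - b[:-lag]
    mag = np.linalg.norm(db, axis=1)
    return mag / np.sqrt(np.mean(mag ** 2))

# A field with uniform increments has PVI = 1 everywhere (up to rounding).
b = np.cumsum(np.ones((6, 3)), axis=0)
print(pvi(b, lag=1))
```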

  4. Parallel-vector out-of-core equation solver for computational mechanics

    NASA Technical Reports Server (NTRS)

    Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

    1993-01-01

    A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT statements, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.

  5. Space Technology 5 Multi-Point Observations of Temporal Variability of Field-Aligned Currents

    NASA Technical Reports Server (NTRS)

    Le, Guan; Wang, Yongli; Slavin, James A.; Strangeway, Robert J.

    2008-01-01

    Space Technology 5 (ST5) is a three micro-satellite constellation deployed into a 300 x 4500 km, dawn-dusk, sun-synchronous polar orbit from March 22 to June 21, 2006, for technology validations. In this paper, we present a study of the temporal variability of field-aligned currents using multi-point magnetic field measurements from ST5. The data demonstrate that meso-scale current structures are commonly embedded within large-scale field-aligned current sheets. The meso-scale current structures are very dynamic, with highly variable current density and/or polarity on time scales of approximately 10 min. They exhibit large temporal variations during both quiet and disturbed times on such time scales. On the other hand, the data also show that the time scales for the currents to be relatively stable are approximately 1 min for meso-scale currents and approximately 10 min for large-scale current sheets. These temporal features are evidently associated with dynamic variations of their particle carriers (mainly electrons) as they respond to variations of the parallel electric field in the auroral acceleration region. The characteristic time scales for the temporal variability of meso-scale field-aligned currents are found to be consistent with those of the auroral parallel electric field.

  6. A gyrofluid description of Alfvenic turbulence and its parallel electric field

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bian, N. H.; Kontar, E. P.

    2010-06-15

    Anisotropic Alfvénic fluctuations with k∥/k⊥ ≪ 1 remain at frequencies much smaller than the ion cyclotron frequency in the presence of a strong background magnetic field. Based on the simplest truncation of the electromagnetic gyrofluid equations in a homogeneous plasma, a model for the energy cascade produced by Alfvénic turbulence is constructed, which smoothly connects the large magnetohydrodynamic scales and the small 'kinetic' scales. Scaling relations are obtained for the electromagnetic fluctuations as a function of k⊥ and k∥. Moreover, particular attention is paid to the spectral structure of the parallel electric field produced by Alfvénic turbulence, because of its potential implication in turbulent acceleration and transport of particles. For electromagnetic turbulence, this issue was raised some time ago by Hasegawa and Mima [J. Geophys. Res. 83, 1117 (1978)].

  7. Constructing Neuronal Network Models in Massively Parallel Environments.

    PubMed

    Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.

  8. Constructing Neuronal Network Models in Massively Parallel Environments

    PubMed Central

    Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808

  9. Electron Heating at Kinetic Scales in Magnetosheath Turbulence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chasapis, Alexandros; Matthaeus, W. H.; Parashar, T. N.

    2017-02-20

    We present a statistical study of coherent structures at kinetic scales, using data from the Magnetospheric Multiscale mission in the Earth’s magnetosheath. We implemented the multi-spacecraft partial variance of increments (PVI) technique to detect these structures, which are associated with intermittency at kinetic scales. We examine the properties of the electron heating occurring within such structures. We find that, statistically, structures with a high PVI index are regions of significant electron heating. We also focus on one such structure, a current sheet, which shows some signatures consistent with magnetic reconnection. Strong parallel electron heating coincides with whistler emissions at the edges of the current sheet.

  10. Space Technology 5 (ST-5) Observations of Field-Aligned Currents: Temporal Variability

    NASA Technical Reports Server (NTRS)

    Le, Guan

    2010-01-01

    Space Technology 5 (ST-5) is a three micro-satellite constellation deployed into a 300 x 4500 km, dawn-dusk, sun-synchronous polar orbit from March 22 to June 21, 2006, for technology validations. In this paper, we present a study of the temporal variability of field-aligned currents using multi-point magnetic field measurements from ST-5. The data demonstrate that meso-scale current structures are commonly embedded within large-scale field-aligned current sheets. The meso-scale current structures are very dynamic, with highly variable current density and/or polarity on time scales of about 10 min. They exhibit large temporal variations during both quiet and disturbed times on such time scales. On the other hand, the data also show that the time scales for the currents to be relatively stable are about 1 min for meso-scale currents and about 10 min for large-scale current sheets. These temporal features are evidently associated with dynamic variations of their particle carriers (mainly electrons) as they respond to variations of the parallel electric field in the auroral acceleration region. The characteristic time scales for the temporal variability of meso-scale field-aligned currents are found to be consistent with those of the auroral parallel electric field.

  11. Space Technology 5 (ST-5) Multipoint Observations of Temporal and Spatial Variability of Field-Aligned Currents

    NASA Technical Reports Server (NTRS)

    Le, Guan

    2010-01-01

    Space Technology 5 (ST-5) is a three micro-satellite constellation deployed into a 300 x 4500 km, dawn-dusk, sun-synchronous polar orbit from March 22 to June 21, 2006, for technology validations. In this paper, we present a study of the temporal variability of field-aligned currents using multi-point magnetic field measurements from ST5. The data demonstrate that mesoscale current structures are commonly embedded within large-scale field-aligned current sheets. The meso-scale current structures are very dynamic with highly variable current density and/or polarity in time scales of about 10 min. They exhibit large temporal variations during both quiet and disturbed times in such time scales. On the other hand, the data also show that the time scales for the currents to be relatively stable are about 1 min for meso-scale currents and about 10 min for large-scale current sheets. These temporal features are evidently associated with dynamic variations of their particle carriers (mainly electrons) as they respond to the variations of the parallel electric field in the auroral acceleration region. The characteristic time scales for the temporal variability of meso-scale field-aligned currents are found to be consistent with those of the auroral parallel electric field.

  12. Parallel block schemes for large scale least squares computations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Golub, G.H.; Plemmons, R.J.; Sameh, A.

    1986-04-01

    Large scale least squares computations arise in a variety of scientific and engineering problems, including geodetic adjustments and surveys, medical image analysis, molecular structures, partial differential equations and substructuring methods in structural engineering. In each of these problems, matrices often arise which possess a block structure which reflects the local connection nature of the underlying physical problem. For example, such super-large nonlinear least squares computations arise in geodesy. Here the coordinates of positions are calculated by iteratively solving overdetermined systems of nonlinear equations by the Gauss-Newton method. The US National Geodetic Survey will complete this year (1986) the readjustment of the North American Datum, a problem which involves over 540 thousand unknowns and over 6.5 million observations (equations). The observation matrix for these least squares computations has a block angular form with 161 diagonal blocks, each containing 3 to 4 thousand unknowns. In this paper parallel schemes are suggested for the orthogonal factorization of matrices in block angular form and for the associated backsubstitution phase of the least squares computations. In addition, a parallel scheme for the calculation of certain elements of the covariance matrix for such problems is described. It is shown that these algorithms are ideally suited for multiprocessors with three levels of parallelism such as the Cedar system at the University of Illinois. 20 refs., 7 figs.
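
    The orthogonal factorization being parallelized exploits the block angular form: each diagonal block can be QR-factorized independently, leaving only a small reduced least squares system for the coupling unknowns, followed by backsubstitution. A serial numpy sketch of that reduction (function and variable names are ours; the paper's covariance computation is omitted):

```python
import numpy as np

def block_angular_lstsq(blocks):
    """Least squares for a block-angular matrix [diag(A_i) | B_i].

    blocks: list of (A_i, B_i, b_i); A_i is m_i x n_i (local unknowns),
    B_i couples block i to the shared unknowns y. Each block's QR
    factorization is independent, so this loop is the parallel stage.
    """
    reduced_T, reduced_d, factors = [], [], []
    for A, B, b in blocks:
        n = A.shape[1]
        Q, R = np.linalg.qr(A, mode="complete")  # orthogonal factorization
        S = Q.T @ B
        c = Q.T @ b
        factors.append((R[:n], S[:n], c[:n]))    # kept for backsubstitution
        reduced_T.append(S[n:])                  # rows involving only y
        reduced_d.append(c[n:])
    # Small dense least squares problem for the shared unknowns y
    y = np.linalg.lstsq(np.vstack(reduced_T),
                        np.concatenate(reduced_d), rcond=None)[0]
    # Backsubstitution phase: recover each block's local unknowns
    xs = [np.linalg.solve(R, c - S @ y) for R, S, c in factors]
    return xs, y
```

    In the geodetic setting each block corresponds to a geographic region, and only the reduced system for the shared (junction) unknowns requires communication between processors.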

  13. Modification in drag of turbulent boundary layers resulting from manipulation of large-scale structures

    NASA Technical Reports Server (NTRS)

    Corke, T. C.; Guezennec, Y.; Nagib, H. M.

    1981-01-01

    The effects of placing a parallel-plate turbulence manipulator in a boundary layer are documented through flow visualization and hot wire measurements. The boundary layer manipulator was designed to manage the large scale structures of turbulence leading to a reduction in surface drag. The differences in the turbulent structure of the boundary layer are summarized to demonstrate differences in various flow properties. The manipulator inhibited the intermittent large scale structure of the turbulent boundary layer for at least 70 boundary layer thicknesses downstream. With the removal of the large scale, the streamwise turbulence intensity levels near the wall were reduced. The downstream distribution of the skin friction was also altered by the introduction of the manipulator.

  14. Efficient multitasking of Choleski matrix factorization on CRAY supercomputers

    NASA Technical Reports Server (NTRS)

    Overman, Andrea L.; Poole, Eugene L.

    1991-01-01

    A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
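
    The kernel being multitasked here is an ordinary Choleski factorization; in a column-oriented formulation each column update is an independent unit of work that can be handed to a processor. A dense serial sketch (the paper's variable-band storage and cache blocking are not reproduced, and the names are ours):

```python
import numpy as np

def cholesky_column(a):
    """Column-oriented Choleski factorization A = L L^T (L lower triangular).

    Left-looking variant: column j is updated using all previously
    finished columns. The per-column updates are the units of work a
    multitasked implementation distributes across processors.
    """
    a = a.copy().astype(float)
    n = a.shape[0]
    for j in range(n):
        # Diagonal entry: subtract contributions of finished columns
        a[j, j] = np.sqrt(a[j, j] - a[j, :j] @ a[j, :j])
        # Subdiagonal entries of column j
        a[j+1:, j] = (a[j+1:, j] - a[j+1:, :j] @ a[j, :j]) / a[j, j]
    return np.tril(a)
```

    Solving K u = f then reduces to two triangular solves with L, which is the backsubstitution step timed in the paper's benchmarks.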

  15. apGA: An adaptive parallel genetic algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liepins, G.E.; Baluja, S.

    1991-01-01

    We develop apGA, a parallel variant of the standard generational GA, that combines aggressive search with perpetual novelty, yet is able to preserve enough genetic structure to optimally solve variably scaled, non-uniform block deceptive and hierarchical deceptive problems. apGA combines elitism, adaptive mutation, adaptive exponential scaling, and temporal memory. We present empirical results for six classes of problems, including the DeJong test suite. Although we have not investigated hybrids, we note that apGA could be incorporated into other recent GA variants such as GENITOR, CHC, and the recombination stage of mGA. 12 refs., 2 figs., 2 tabs.
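
    For readers unfamiliar with the ingredients, a toy generational GA with elitism and a crude adaptive mutation rate fits in a few lines. This is only an illustration on the OneMax problem; apGA's adaptive exponential scaling, temporal memory, and parallel structure are not reproduced, and all names and parameter values here are ours:

```python
import random

def ga_onemax(n_bits=32, pop_size=40, generations=200, seed=0):
    """Toy generational GA: elitism + adaptive mutation on OneMax."""
    rng = random.Random(seed)
    fitness = lambda ind: sum(ind)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best = pop[0]
        if fitness(best) == n_bits:
            return best
        # Adaptive mutation: mutate more as the population converges
        diversity = len({tuple(p) for p in pop}) / pop_size
        p_mut = 0.5 / n_bits + (1 - diversity) * 2.0 / n_bits
        nxt = [best[:]]                                  # elitism
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[: pop_size // 2], 2)   # truncation selection
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]                    # one-point crossover
            nxt.append([g ^ (rng.random() < p_mut) for g in child])
        pop = nxt
    pop.sort(key=fitness, reverse=True)
    return pop[0]
```
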

  16. Multidisciplinary Optimization Methods for Aircraft Preliminary Design

    NASA Technical Reports Server (NTRS)

    Kroo, Ilan; Altus, Steve; Braun, Robert; Gage, Peter; Sobieski, Ian

    1994-01-01

    This paper describes a research program aimed at improved methods for multidisciplinary design and optimization of large-scale aeronautical systems. The research involves new approaches to system decomposition, interdisciplinary communication, and methods of exploiting coarse-grained parallelism for analysis and optimization. A new architecture, that involves a tight coupling between optimization and analysis, is intended to improve efficiency while simplifying the structure of multidisciplinary, computation-intensive design problems involving many analysis disciplines and perhaps hundreds of design variables. Work in two areas is described here: system decomposition using compatibility constraints to simplify the analysis structure and take advantage of coarse-grained parallelism; and collaborative optimization, a decomposition of the optimization process to permit parallel design and to simplify interdisciplinary communication requirements.

  17. Method and apparatus for fabrication of high gradient insulators with parallel surface conductors spaced less than one millimeter apart

    DOEpatents

    Sanders, David M.; Decker, Derek E.

    1999-01-01

    Optical patterns and lithographic techniques are used as part of a process to embed parallel and evenly spaced conductors in the non-planar surfaces of an insulator to produce high gradient insulators. The approach extends the size that high gradient insulating structures can be fabricated as well as improves the performance of those insulators by reducing the scale of the alternating parallel lines of insulator and conductor along the surface. This fabrication approach also substantially decreases the cost required to produce high gradient insulators.

  18. Monitoring Data-Structure Evolution in Distributed Message-Passing Programs

    NASA Technical Reports Server (NTRS)

    Sarukkai, Sekhar R.; Beers, Andrew; Woodrow, Thomas S. (Technical Monitor)

    1996-01-01

    Monitoring the evolution of data structures in parallel and distributed programs, is critical for debugging its semantics and performance. However, the current state-of-art in tracking and presenting data-structure information on parallel and distributed environments is cumbersome and does not scale. In this paper we present a methodology that automatically tracks memory bindings (not the actual contents) of static and dynamic data-structures of message-passing C programs, using PVM. With the help of a number of examples we show that in addition to determining the impact of memory allocation overheads on program performance, graphical views can help in debugging the semantics of program execution. Scalable animations of virtual address bindings of source-level data-structures are used for debugging the semantics of parallel programs across all processors. In conjunction with light-weight core-files, this technique can be used to complement traditional debuggers on single processors. Detailed information (such as data-structure contents), on specific nodes, can be determined using traditional debuggers after the data structure evolution leading to the semantic error is observed graphically.

  19. Electron Scale Structures and Magnetic Reconnection Signatures in the Turbulent Magnetosheath

    NASA Technical Reports Server (NTRS)

    Yordanova, E.; Voros, Z.; Varsani, A.; Graham, D. B.; Norgren, C.; Khotyaintsev, Yu. V.; Vaivads, A.; Eriksson, E.; Nakamura, R.; Lindqvist, P.-A.

    2016-01-01

    Collisionless space plasma turbulence can generate reconnecting thin current sheets as suggested by recent results of numerical magnetohydrodynamic simulations. The Magnetospheric Multiscale (MMS) mission provides the first serious opportunity to verify whether small ion-electron-scale reconnection, generated by turbulence, resembles the reconnection events frequently observed in the magnetotail or at the magnetopause. Here we investigate field and particle observations obtained by the MMS fleet in the turbulent terrestrial magnetosheath behind quasi-parallel bow shock geometry. We observe multiple small-scale current sheets during the event and present a detailed look of one of the detected structures. The emergence of thin current sheets can lead to electron scale structures. Within these structures, we see signatures of ion demagnetization, electron jets, electron heating, and agyrotropy suggesting that MMS spacecraft observe reconnection at these scales.

  20. The implementation of an aeronautical CFD flow code onto distributed memory parallel systems

    NASA Astrophysics Data System (ADS)

    Ierotheou, C. S.; Forsey, C. R.; Leatham, M.

    2000-04-01

    The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single program multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations.

  1. Massively parallel and linear-scaling algorithm for second-order Møller-Plesset perturbation theory applied to the study of supramolecular wires

    NASA Astrophysics Data System (ADS)

    Kjærgaard, Thomas; Baudin, Pablo; Bykov, Dmytro; Eriksen, Janus Juul; Ettenhuber, Patrick; Kristensen, Kasper; Larkin, Jeff; Liakh, Dmitry; Pawłowski, Filip; Vose, Aaron; Wang, Yang Min; Jørgensen, Poul

    2017-03-01

    We present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide-Expand-Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide-Expand-Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Møller-Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24 440 basis functions and 91 280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  2. Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Biros, George

    Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set out to address the most difficult class of UQ problems: those for which both the underlying PDE model and the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. These include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high-dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a central challenge in UQ, especially for large-scale models. We propose to develop the mathematical tools to address these challenges in the context of extreme-scale problems. 4. Parallel scalable algorithms for Bayesian optimal experimental design (OED). Bayesian inversion yields quantified uncertainties in the model parameters, which can be propagated forward through the model to yield uncertainty in outputs of interest. This opens the way for designing new experiments to reduce the uncertainties in the model parameters and model predictions. Such experimental design problems have been intractable for large-scale problems using conventional methods; we will create OED algorithms that exploit the structure of the PDE model and the parameter-to-output map to overcome these challenges. Parallel algorithms for these four problems were created, analyzed, prototyped, implemented, tuned, and scaled up for leading-edge supercomputers, including UT-Austin’s own 10 petaflops Stampede system, ANL’s Mira system, and ORNL’s Titan system. While our focus is on fundamental mathematical/computational methods and algorithms, we will assess our methods on model problems derived from several DOE mission applications, including multiscale mechanics and ice sheet dynamics.
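
    Of the four areas, the randomized maximum likelihood (RML) idea in item 1 is the easiest to illustrate: each posterior sample is obtained by solving an independent, randomly perturbed optimization problem, which is why the method parallelizes so well. A linear-Gaussian sketch, where RML draws happen to be exact posterior samples (all names and dimensions here are ours, not the project's code):

```python
import numpy as np

def rml_sample(G, y, gamma, C, rng):
    """One randomized-maximum-likelihood draw for a linear-Gaussian model.

    Model: y = G x + noise, noise ~ N(0, gamma), prior x ~ N(0, C).
    Each draw perturbs the data and the prior mean and solves the
    resulting quadratic optimization problem; draws are independent,
    so they can be computed in parallel.
    """
    eps = rng.multivariate_normal(np.zeros(len(y)), gamma)   # data perturbation
    x0 = rng.multivariate_normal(np.zeros(C.shape[0]), C)    # prior draw
    H = G.T @ np.linalg.inv(gamma) @ G + np.linalg.inv(C)    # Hessian of objective
    rhs = G.T @ np.linalg.inv(gamma) @ (y + eps) + np.linalg.inv(C) @ x0
    return np.linalg.solve(H, rhs)                           # minimizer
```

    For nonlinear parameter-to-observable maps the inner solve becomes a PDE-constrained optimization problem, which is where the project's structure-exploiting solvers enter.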

  3. Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics

    NASA Technical Reports Server (NTRS)

    Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

    2016-01-01

    This paper presents large-scale MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta, and spatially fifth-order accurate WENO-5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion computational cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks containing boundaries. Limits to scaling beyond 32k cores are identified, and targeted code optimizations are discussed.

  4. Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics

    NASA Technical Reports Server (NTRS)

    Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

    2015-01-01

    This paper presents large-scale MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta, and a spatially fifth-order accurate WENO-5Z scheme were used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion computational cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks containing boundaries. Limits to scaling beyond 32k cores are identified, and targeted code optimizations are discussed.

  5. Parallel Tensor Compression for Large-Scale Scientific Data.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan

    As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
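
    The Tucker decomposition at the heart of the method can be sketched serially via the truncated higher-order SVD; the paper's contribution is the distributed-memory data layout, which this small illustration (names ours) does not attempt:

```python
import numpy as np

def unfold(t, mode):
    """Matricize tensor t along the given mode."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def hosvd(t, ranks):
    """Truncated higher-order SVD: a simple (non-distributed) Tucker sketch.

    Returns core g and factor matrices us with t ~= g x_0 us[0] x_1 us[1] ...
    The storage drops from prod(t.shape) to prod(ranks) plus the factors.
    """
    us = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        us.append(u[:, :r])                  # leading r left singular vectors
    g = t
    for mode, u in enumerate(us):            # project onto each factor
        g = np.moveaxis(np.tensordot(u.T, g, axes=(1, mode)), 0, mode)
    return g, us

def reconstruct(g, us):
    """Multiply the core back out by every factor matrix."""
    t = g
    for mode, u in enumerate(us):
        t = np.moveaxis(np.tensordot(u, t, axes=(1, mode)), 0, mode)
    return t
```

    In the distributed setting each of these dense linear algebra steps (unfoldings, Gram/SVD computations, tensor-times-matrix products) becomes a parallel operation over the chosen tensor data distribution.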

  6. Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.

    PubMed

    Slażyński, Leszek; Bohte, Sander

    2012-01-01

    The arrival of graphics processing (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state for each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures to the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate in better-than-realtime plausible spiking neural networks of up to 50 000 neurons, processing over 35 million spiking events per second.
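
    The additive membrane dynamics described above are what make the update embarrassingly parallel: every neuron decays its filtered potential and adds its input independently. A vectorized sketch (numpy lanes standing in for GPU threads; parameter values and names are ours, not the paper's):

```python
import numpy as np

def step(state, inputs, dt=1e-3, tau=0.02, threshold=1.0):
    """One vectorized update for filter-based spiking neurons.

    Each neuron's membrane potential is an exponentially decaying filter
    of its inputs; the additive dynamics let every neuron be updated
    independently and in parallel.
    """
    state = state * np.exp(-dt / tau) + inputs   # decay, then add input
    spikes = state >= threshold                  # threshold crossing
    state = np.where(spikes, 0.0, state)         # reset neurons that fired
    return state, spikes
```

    On a GPU the same elementwise pattern maps one neuron per thread, and laying the state arrays out contiguously matches the memory-coalescing optimization the authors report.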

  7. Building up the spin - orbit alignment of interacting galaxy pairs

    NASA Astrophysics Data System (ADS)

    Moon, Jun-Sung; Yoon, Suk-Jin

    2018-01-01

    Galaxies are not just randomly distributed throughout space. Instead, they are in alignment over a wide range of scales, from the cosmic web down to a pair of galaxies. Motivated by recent findings that the spin and the orbital angular momentum vectors of galaxy pairs tend to be parallel, we here investigate the spin - orbit orientation in close pairs using the Illustris cosmological simulation. We find that since z ~ 1, the parallel alignment has become progressively stronger with time through repetitive encounters. The pair interactions are preferentially prograde at z = 0 (over 5 sigma significance). The prograde fraction at z = 0 is larger for the pairs influenced more heavily by each other during their evolution. We find no correlation between the spin - orbit orientation and the surrounding large-scale structure. Our results favor the scenario in which the alignment in close pairs is caused by tidal interactions later on, rather than by primordial torquing from the large-scale structures.

  8. Energy Dependence of Electron-Scale Currents and Dissipation During Magnetopause Reconnection

    NASA Astrophysics Data System (ADS)

    Shuster, J. R.; Gershman, D. J.; Giles, B. L.; Dorelli, J.; Avanov, L. A.; Chen, L. J.; Wang, S.; Bessho, N.; Torbert, R. B.; Farrugia, C. J.; Argall, M. R.; Strangeway, R. J.; Schwartz, S. J.

    2017-12-01

    We investigate the electron-scale physics of reconnecting current structures observed at the magnetopause during Phase 1B of the Magnetospheric Multiscale (MMS) mission when the spacecraft separation was less than 10 km. Using single-spacecraft measurements of the current density vector Jplasma = en(vi - ve) enabled by the accuracy of the Fast Plasma Investigation (FPI) electron moments as demonstrated by Phan et al. [2016], we consider perpendicular (J⊥1 and J⊥2) and parallel (J//) currents and their corresponding kinetic electron signatures. These currents can correspond to a variety of structures in the electron velocity distribution functions measured by FPI, including perpendicular and parallel crescents like those first reported by Burch et al. [2016], parallel electron beams, counter-streaming electron populations, or sometimes simply a bulk velocity shift. By integrating the distribution function over only its angular dimensions, we compute energy-dependent 'partial' moments and employ them to characterize the energy dependence of velocities, currents, and dissipation associated with magnetic reconnection diffusion regions caught by MMS. Our technique aids in visualizing and elucidating the plasma energization mechanisms that operate during collisionless reconnection.
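
    The 'partial' moments described here follow from integrating the distribution function over its angular dimensions only, leaving a speed- (equivalently, energy-) resolved contribution to each moment. A sketch of the density case on a regular (speed, theta, phi) grid (the grid, quadrature weights, and names are our own simplifications, not the FPI pipeline):

```python
import numpy as np

def partial_density(f, v, theta, phi):
    """Angle-integrated 'partial' density per speed bin.

    f : distribution function on a (speed, theta, phi) grid
    Returns dn(v_i) = v_i^2 dv_i * sum_angles f sin(theta) dtheta dphi,
    so that dn.sum() is the total density and np.cumsum(dn) gives the
    energy-resolved buildup of the moment.
    """
    dv = np.gradient(v)                      # speed-bin widths
    dth = np.gradient(theta)
    dph = np.gradient(phi)
    # Collapse the two angular dimensions with the spherical weight
    ang = np.einsum('ijk,j,j,k->i', f, np.sin(theta), dth, dph)
    return v**2 * dv * ang
```

    The same angular collapse applied to f times the velocity components yields energy-dependent partial velocities and, with the density, partial currents.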

  9. New Insights into the Nature of Turbulence in the Earth's Magnetosheath Using Magnetospheric MultiScale Mission Data

    NASA Astrophysics Data System (ADS)

    Breuillard, H.; Matteini, L.; Argall, M. R.; Sahraoui, F.; Andriopoulou, M.; Le Contel, O.; Retinò, A.; Mirioni, L.; Huang, S. Y.; Gershman, D. J.; Ergun, R. E.; Wilder, F. D.; Goodrich, K. A.; Ahmadi, N.; Yordanova, E.; Vaivads, A.; Turner, D. L.; Khotyaintsev, Yu. V.; Graham, D. B.; Lindqvist, P.-A.; Chasapis, A.; Burch, J. L.; Torbert, R. B.; Russell, C. T.; Magnes, W.; Strangeway, R. J.; Plaschke, F.; Moore, T. E.; Giles, B. L.; Paterson, W. R.; Pollock, C. J.; Lavraud, B.; Fuselier, S. A.; Cohen, I. J.

    2018-06-01

    The Earth’s magnetosheath, which is characterized by highly turbulent fluctuations, is usually divided into two regions of different properties as a function of the angle between the interplanetary magnetic field and the shock normal. In this study, we make use of high-time resolution instruments on board the Magnetospheric MultiScale spacecraft to determine and compare the properties of subsolar magnetosheath turbulence in both regions, i.e., downstream of the quasi-parallel and quasi-perpendicular bow shocks. In particular, we take advantage of the unprecedented temporal resolution of the Fast Plasma Investigation instrument to show the density fluctuations down to sub-ion scales for the first time. We show that the nature of turbulence is highly compressible down to electron scales, particularly in the quasi-parallel magnetosheath. In this region, the magnetic turbulence also shows an inertial (Kolmogorov-like) range, indicating that the fluctuations are not formed locally, in contrast with the quasi-perpendicular magnetosheath. We also show that the electromagnetic turbulence is dominated by electric fluctuations at sub-ion scales (f > 1 Hz) and that magnetic and electric spectra steepen at the largest-electron scale. The latter indicates a change in the nature of turbulence at electron scales. Finally, we show that the electric fluctuations around the electron gyrofrequency are mostly parallel in the quasi-perpendicular magnetosheath, where intense whistlers are observed. This result suggests that energy dissipation, plasma heating, and acceleration might be driven by intense electrostatic parallel structures/waves, which can be linked to whistler waves.

  10. Large-scale molecular dynamics simulation of DNA: implementation and validation of the AMBER98 force field in LAMMPS.

    PubMed

    Grindon, Christina; Harris, Sarah; Evans, Tom; Novik, Keir; Coveney, Peter; Laughton, Charles

    2004-07-15

    Molecular modelling played a central role in the discovery of the structure of DNA by Watson and Crick. Today, such modelling is done on computers: the more powerful these computers are, the more detailed and extensive can be the study of the dynamics of such biological macromolecules. To fully harness the power of modern massively parallel computers, however, we need to develop and deploy algorithms which can exploit the structure of such hardware. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a scalable molecular dynamics code including long-range Coulomb interactions, which has been specifically designed to function efficiently on parallel platforms. Here we describe the implementation of the AMBER98 force field in LAMMPS and its validation for molecular dynamics investigations of DNA structure and flexibility against the benchmark of results obtained with the long-established code AMBER6 (Assisted Model Building with Energy Refinement, version 6). Extended molecular dynamics simulations on the hydrated DNA dodecamer d(CTTTTGCAAAAG)(2), which has previously been the subject of extensive dynamical analysis using AMBER6, show that it is possible to obtain excellent agreement in terms of static, dynamic and thermodynamic parameters between AMBER6 and LAMMPS. In comparison with AMBER6, LAMMPS shows greatly improved scalability in massively parallel environments, opening up the possibility of efficient simulations of order-of-magnitude larger systems and/or for order-of-magnitude greater simulation times.

  11. Validating the Factor Structure of the Self-Report Psychopathy Scale in a Community Sample

    ERIC Educational Resources Information Center

    Mahmut, Mehmet K.; Menictas, Con; Stevenson, Richard J.; Homewood, Judi

    2011-01-01

    Currently, there is no standard self-report measure of psychopathy in community-dwelling samples that parallels the most commonly used measure of psychopathy in forensic and clinical samples, the Psychopathy Checklist. A promising instrument is the Self-Report Psychopathy scale (SRP), which was derived from the original version the Psychopathy…

  12. Modeling Cooperative Threads to Project GPU Performance for Adaptive Parallelism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meng, Jiayuan; Uram, Thomas; Morozov, Vitali A.

    Most accelerators, such as graphics processing units (GPUs) and vector processors, are particularly suitable for accelerating massively parallel workloads. On the other hand, conventional workloads are developed for multi-core parallelism, which often scale to only a few dozen OpenMP threads. When hardware threads significantly outnumber the degree of parallelism in the outer loop, programmers are challenged with efficient hardware utilization. A common solution is to further exploit the parallelism hidden deep in the code structure. Such parallelism is less structured: parallel and sequential loops may be imperfectly nested within each other, neighboring inner loops may exhibit different concurrency patterns (e.g., Reduction vs. Forall), yet have to be parallelized in the same parallel section. Many input-dependent transformations have to be explored. A programmer often employs a larger group of hardware threads to cooperatively walk through a smaller outer loop partition and adaptively exploit any encountered parallelism. This process is time-consuming and error-prone, yet the risk of gaining little or no performance remains high for such workloads. To reduce risk and guide implementation, we propose a technique to model workloads with limited parallelism that can automatically explore and evaluate transformations involving cooperative threads. Eventually, our framework projects the best achievable performance and the most promising transformations without implementing GPU code or using physical hardware. We envision our technique to be integrated into future compilers or optimization frameworks for autotuning.

  13. Two-stage bulk electron heating in the diffusion region of anti-parallel symmetric reconnection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Le, Ari Yitzchak; Egedal, Jan; Daughton, William Scott

    2016-10-13

    Electron bulk energization in the diffusion region during anti-parallel symmetric reconnection entails two stages. First, the inflowing electrons are adiabatically trapped and energized by an ambipolar parallel electric field. Next, the electrons gain energy from the reconnection electric field as they undergo meandering motion. These collisionless mechanisms have been described previously, and they lead to highly structured electron velocity distributions. Furthermore, a simplified control-volume analysis gives estimates for how the net effective heating scales with the upstream plasma conditions in agreement with fully kinetic simulations and spacecraft observations.

  14. Portable parallel portfolio optimization in the Aurora Financial Management System

    NASA Astrophysics Data System (ADS)

    Laure, Erwin; Moritsch, Hans

    2001-07-01

    Financial planning problems are formulated as large-scale, stochastic, multiperiod, tree-structured optimization problems. An efficient technique for solving this kind of problem is the nested Benders decomposition method. In this paper we present a parallel, portable, asynchronous implementation of this technique. To achieve our portability goals we selected the programming language Java for our implementation and used a high-level Java-based framework, called OpusJava, for expressing the parallelism potential as well as synchronization constraints. Our implementation is embedded within a modular decision support tool for portfolio and asset liability management, the Aurora Financial Management System.

  15. Fast Particle Methods for Multiscale Phenomena Simulations

    NASA Technical Reports Server (NTRS)

    Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew

    2000-01-01

    We are developing particle methods aimed at improving computational modeling capabilities for multiscale physical phenomena in: (i) high Reynolds number unsteady vortical flows, (ii) particle-laden and interfacial flows, and (iii) molecular dynamics studies of nanoscale droplets and of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented on parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods make them a multidisciplinary computational tool capable of bridging the gap between micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena. The current challenges in these simulations are: [i] the proper formulation of particle methods at the molecular and continuum levels for the discretization of the governing equations; [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation; [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration; and [iv] the parallelization of processes such as tree traversal and grid-particle interpolations. We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as: the solution of the N-body problem on parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFTs, and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to move between seemingly unrelated areas of research.
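
    One of the building blocks mentioned above, particle-grid interpolation, can be sketched in a few lines. The 1D "cloud-in-cell" scheme below is a generic textbook illustration, not code from this project; the function name and the periodic-domain assumption are ours.

```python
# Hypothetical sketch of a particle-grid interpolation step: 1D
# "cloud-in-cell" deposition of particle weights onto a uniform,
# periodic grid. Each particle's weight is split linearly between
# its two nearest grid nodes.

def deposit_cic(positions, weights, n_cells, length):
    h = length / n_cells              # grid spacing
    grid = [0.0] * n_cells
    for x, w in zip(positions, weights):
        s = x / h
        i = int(s) % n_cells          # left node (periodic domain)
        frac = s - int(s)             # fractional distance past the left node
        grid[i] += w * (1.0 - frac)
        grid[(i + 1) % n_cells] += w * frac
    return grid
```

    The scheme conserves the total deposited weight, which is the property that makes it a sound basis for grid-particle coupling.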

  16. Hybrid MPI-OpenMP Parallelism in the ONETEP Linear-Scaling Electronic Structure Code: Application to the Delamination of Cellulose Nanofibrils.

    PubMed

    Wilkinson, Karl A; Hine, Nicholas D M; Skylaris, Chris-Kriton

    2014-11-11

    We present a hybrid MPI-OpenMP implementation of Linear-Scaling Density Functional Theory within the ONETEP code. We illustrate its performance on a range of high performance computing (HPC) platforms comprising shared-memory nodes with fast interconnect. Our work has focused on applying OpenMP parallelism to the routines which dominate the computational load, attempting where possible to parallelize different loops from those already parallelized within MPI. This includes 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations on an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils when undergoing sonication, a process which is controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and as these systems comprise up to 21,276 atoms this would not have been feasible without the developments described here.

  17. Massively parallel and linear-scaling algorithm for second-order Moller–Plesset perturbation theory applied to the study of supramolecular wires

    DOE PAGES

    Kjaergaard, Thomas; Baudin, Pablo; Bykov, Dmytro; ...

    2016-11-16

    Here, we present a scalable cross-platform hybrid MPI/OpenMP/OpenACC implementation of the Divide–Expand–Consolidate (DEC) formalism with portable performance on heterogeneous HPC architectures. The Divide–Expand–Consolidate formalism is designed to reduce the steep computational scaling of conventional many-body methods employed in electronic structure theory to linear scaling, while providing a simple mechanism for controlling the error introduced by this approximation. Our massively parallel implementation of this general scheme has three levels of parallelism, being a hybrid of the loosely coupled task-based parallelization approach and the conventional MPI+X programming model, where X is either OpenMP or OpenACC. We demonstrate strong and weak scalability of this implementation on heterogeneous HPC systems, namely on the GPU-based Cray XK7 Titan supercomputer at the Oak Ridge National Laboratory. Using the "resolution of the identity second-order Moller–Plesset perturbation theory" (RI-MP2) as the physical model for simulating correlated electron motion, the linear-scaling DEC implementation is applied to 1-aza-adamantane-trione (AAT) supramolecular wires containing up to 40 monomers (2440 atoms, 6800 correlated electrons, 24,440 basis functions and 91,280 auxiliary functions). This represents the largest molecular system treated at the MP2 level of theory, demonstrating an efficient removal of the scaling wall pertinent to conventional quantum many-body methods.

  18. Evaluation of parallel milliliter-scale stirred-tank bioreactors for the study of biphasic whole-cell biocatalysis with ionic liquids.

    PubMed

    Dennewald, Danielle; Hortsch, Ralf; Weuster-Botz, Dirk

    2012-01-01

    As clear structure-activity relationships are still rare for ionic liquids, preliminary experiments are necessary for the process development of biphasic whole-cell processes involving these solvents. To reduce the time investment and the material costs, the process development of such biphasic reaction systems would profit from a small-scale high-throughput platform. As a case study, the reduction of 2-octanone to (R)-2-octanol by a recombinant Escherichia coli in a biphasic ionic liquid/water system was studied in a miniaturized stirred-tank bioreactor system allowing the parallel operation of up to 48 reactors at the mL-scale. The results were compared to those obtained in a 20-fold larger stirred-tank reactor. The maximum local energy dissipation was evaluated at the larger scale and compared to the data available for the small-scale reactors, to verify whether similar mass transfer could be obtained at both scales. Thereafter, the reaction kinetics and final conversions reached in different reaction setups were analysed. The results were in good agreement between both scales for varying ionic liquids and for ionic liquid volume fractions up to 40%. The parallel bioreactor system can thus be used for the process development of the majority of biphasic reaction systems involving ionic liquids, reducing the time and resource investment during the process development of this type of application.

  19. A parallel adaptive mesh refinement algorithm

    NASA Technical Reports Server (NTRS)

    Quirk, James J.; Hanebutte, Ulf R.

    1993-01-01

    Over recent years, Adaptive Mesh Refinement (AMR) algorithms which dynamically match the local resolution of the computational grid to the numerical solution being sought have emerged as powerful tools for solving problems that contain disparate length and time scales. In particular, several workers have demonstrated the effectiveness of employing an adaptive, block-structured hierarchical grid system for simulations of complex shock wave phenomena. Unfortunately, from the parallel algorithm developer's viewpoint, this class of scheme is quite involved; these schemes cannot be distilled down to a small kernel upon which various parallelizing strategies may be tested. However, because of their block-structured nature such schemes are inherently parallel, so all is not lost. In this paper we describe the method by which Quirk's AMR algorithm has been parallelized. This method is built upon just a few simple message passing routines and so it may be implemented across a broad class of MIMD machines. Moreover, the method of parallelization is such that the original serial code is left virtually intact, and so we are left with just a single product to support. The importance of this fact should not be underestimated given the size and complexity of the original algorithm.
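
    The claim that the parallelization rests on "just a few simple message passing routines" can be pictured with a toy ghost-cell exchange between 1D grid blocks. Everything below (the Channel class, the message tags, the block layout) is an invented stand-in for illustration, not the actual AMR code.

```python
# Toy message-passing sketch: each block sends its edge cells to its
# neighbors, then fills its own ghost cells from received messages.
from queue import Queue

class Channel:
    """Minimal stand-in for the few message-passing primitives needed."""
    def __init__(self):
        self.boxes = {}
    def send(self, dest, payload):
        self.boxes.setdefault(dest, Queue()).put(payload)
    def recv(self, dest):
        return self.boxes[dest].get()

def exchange_ghosts(blocks, channel):
    """blocks: {block_id: list of cell values}, 1D left-to-right layout."""
    for bid, data in blocks.items():
        if bid - 1 in blocks:                       # left neighbor exists
            channel.send(bid - 1, ("from_right", data[0]))
        if bid + 1 in blocks:                       # right neighbor exists
            channel.send(bid + 1, ("from_left", data[-1]))
    ghosts = {}
    for bid in blocks:
        g = {}
        n_msgs = (bid - 1 in blocks) + (bid + 1 in blocks)
        for _ in range(n_msgs):
            tag, value = channel.recv(bid)
            g[tag] = value
        ghosts[bid] = g
    return ghosts
```

    Because every block interaction reduces to a send and a matching receive, the same serial driver maps onto any MIMD message-passing machine, which is the portability argument the abstract makes.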

  20. Evaluation of concurrent priority queue algorithms. Technical report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Q.

    1991-02-01

    The priority queue is a fundamental data structure that is used in a large variety of parallel algorithms, such as multiprocessor scheduling and parallel best-first search of state-space graphs. This thesis addresses the design and experimental evaluation of two novel concurrent priority queues: a parallel Fibonacci heap and a concurrent priority pool, and compares them with the concurrent binary heap. The parallel Fibonacci heap is based on the sequential Fibonacci heap, which is theoretically the most efficient data structure for sequential priority queues. This scheme not only preserves the efficient operation time bounds of its sequential counterpart, but also has very low contention by distributing locks over the entire data structure. The experimental results show its linearly scalable throughput and speedup up to as many processors as tested (currently 18). A concurrent access scheme for a doubly linked list is described as part of the implementation of the parallel Fibonacci heap. The concurrent priority pool is based on the concurrent B-tree and the concurrent pool. The concurrent priority pool has the highest throughput among the priority queues studied. Like the parallel Fibonacci heap, the concurrent priority pool scales linearly up to as many processors as tested. The priority queues are evaluated in terms of throughput and speedup. Some applications of concurrent priority queues such as the vertex cover problem and the single source shortest path problem are tested.
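
    For context, the simplest thread-safe priority queue, and the natural baseline for the fine-grained designs above, is a binary heap behind a single global lock. The sketch below is illustrative only (it is not the thesis implementation); its single lock is exactly the contention bottleneck that distributing locks over the structure is meant to remove.

```python
# Baseline concurrent priority queue: one binary heap, one lock.
import heapq
import threading

class LockedHeap:
    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()

    def insert(self, priority, item):
        with self._lock:                    # every operation serializes here
            heapq.heappush(self._heap, (priority, item))

    def delete_min(self):
        with self._lock:
            return heapq.heappop(self._heap) if self._heap else None
```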

  1. Seismic stratigraphic characteristics of upper Louisiana continental slope: an area east of Green Canyon

    USGS Publications Warehouse

    Bouma, Arnold H.; Feeley, Mary H.; Kindinger, Jack G.; Stelting, Charles E.; Hilde, Thomas W.C.

    1981-01-01

    A high-resolution seismic reflection survey was conducted in a small area of the upper Louisiana Continental Slope known as the Green Canyon Area. This area includes tracts 427, 428, 471, 472, 515, and 516, which will be offered for sale in March 1982 as part of Lease Sale 67. The sea floor of this region is slightly hummocky and is underlain by salt diapirs that are mantled by early Tertiary shale. Most of the shale is overlain by younger Tertiary and Quaternary deposits, although locally some of the shale protrudes through the sea floor. Because of proximity to older Mississippi River sources, the sediments are thick. The sediment cover shows an abundance of geologic phenomena such as horsts, grabens, growth faults, normal faults, and consolidation faults; zones with distinct and indistinct parallel reflections; semi-transparent zones; distorted zones; and angular unconformities. The major feature of this region is a N-S linear zone of uplifted and intruded sedimentary deposits formed by diapiric intrusion. Small-scale graben development over the crest of the structure can be attributed to extension and collapse. Large-scale undulations of reflections well off the flanks of the uplifted structure suggest sediment creep and slumping. Dipping of parallel reflections shows block faulting and tilting. Air gun (5 and 40 cubic inch) records reveal at least five major sequences that show masked onlap and slumping in their lower parts, grading into more distinct parallel reflections in their upper parts. Such sequences can be related to local uplift and sea level changes. Minisparker records of this area show similar sequences but on a smaller scale. The distinct parallel reflections often onlap the diapir flanks. The highly reflective parts of these sequences may represent turbidite-type deposition, possibly at times of lower sea level. The acoustically more transparent parts of each sequence may represent deposits containing primarily hemipelagic and pelagic sediment. A complex ridge system is present along the west side of the area, and distinct parallel reflections onlap onto this structure primarily from the east. Much of this deposition may be ascribed to sedimentation within a submarine canyon whose position is controlled by this ridge.

  2. Efficient partitioning and assignment of programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and assignment of program elements are of great importance, since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures that take a shared-memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large-scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for approximating the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product-form solutions.

  3. Parallelization of the FLAPW method

    NASA Astrophysics Data System (ADS)

    Canning, A.; Mannstadt, W.; Freeman, A. J.

    2000-08-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
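
    The parallelization strategy described above, dividing the plane-wave components of each state among processors, can be pictured with a toy decomposition in which each "processor" holds a strided slice of the coefficient array and global quantities are assembled from partial sums (a stand-in for an all-reduce). The function names and the round-robin layout are illustrative assumptions, not taken from the paper.

```python
# Sketch of data decomposition over plane-wave components.

def split_coefficients(coeffs, nprocs):
    """Round-robin distribution of plane-wave components across processors."""
    return [coeffs[p::nprocs] for p in range(nprocs)]

def parallel_norm_sq(coeffs, nprocs):
    """Each 'processor' sums the squares of its own components; the final
    sum stands in for a global reduction (all-reduce)."""
    partial = [sum(c * c for c in chunk)
               for chunk in split_coefficients(coeffs, nprocs)]
    return sum(partial)
```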

  4. A study of the parallel algorithm for large-scale DC simulation of nonlinear systems

    NASA Astrophysics Data System (ADS)

    Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

    Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time-consuming process, even if sparse matrix techniques and bypassing of nonlinear model calculations are used. The time required for this task may be slightly reduced on multi-core, multithreaded computers if the calculation of the mathematical models for the nonlinear elements, as well as the stamp management of the sparse matrix entries, is handled by concurrent processes. This numerical complexity can be further reduced via circuit decomposition and parallel solution of blocks, taking as a departure point the BBD matrix structure. This block-parallel approach may yield considerable gains, though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents an easily parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
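
    The block-parallel idea can be sketched in miniature: once the matrix is in bordered-block-diagonal (BBD) form, the independent diagonal blocks can be solved concurrently. The toy code below solves small dense blocks in parallel threads; it is a hedged illustration of the decomposition step only (no border coupling), not the authors' simulator.

```python
# Concurrent solution of independent diagonal blocks (A_i x_i = b_i).
from concurrent.futures import ThreadPoolExecutor

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for one small block."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))   # pivot row
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

def solve_blocks_in_parallel(blocks):
    """blocks: list of (A_i, b_i) pairs, solved concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda ab: solve_dense(*ab), blocks))
```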

  5. Scalable parallel distance field construction for large-scale applications

    DOE PAGES

    Yu, Hongfeng; Xie, Jinrong; Ma, Kwan -Liu; ...

    2015-10-01

    Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named the parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial location. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. In conclusion, our work greatly extends the usability of distance fields for demanding applications.

  6. Scalable Parallel Distance Field Construction for Large-Scale Applications.

    PubMed

    Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; Kolla, Hemanth; Chen, Jacqueline H

    2015-10-01

    Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named the parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial location. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications.
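
    A minimal serial analogue of a distance-field computation is a multi-source breadth-first search outward from the surface cells of a grid, as sketched below (2D grid, Manhattan metric). This is purely illustrative and unrelated to the parallel distance tree itself.

```python
# Discrete distance field: BFS from all surface cells at once.
from collections import deque

def distance_field(shape, surface_cells):
    """Return a 2D grid of hop distances to the nearest surface cell."""
    rows, cols = shape
    dist = [[None] * cols for _ in range(rows)]
    q = deque()
    for r, c in surface_cells:          # all sources start at distance 0
        dist[r][c] = 0
        q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist
```

    A distributed version must additionally exchange frontier cells across subdomain boundaries, which is the communication cost the parallel distance tree is designed to reduce.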

  7. Composing Data Parallel Code for a SPARQL Graph Engine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste

    Big data analytics applications process large amounts of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 shared-memory multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces the memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.
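
    The core graph-matching operation that SPARQL queries compile down to can be illustrated with a tiny triple-pattern matcher, written here as serial Python rather than the generated OpenMP code; the '?variable' convention follows SPARQL, but the function and data names are ours.

```python
# Match one SPARQL-style (subject, predicate, object) pattern against
# a set of RDF triples; terms beginning with '?' are variables.

def match_pattern(triples, pattern):
    """Return a list of variable-binding dicts, one per matching triple."""
    results = []
    for triple in triples:
        binding = {}
        ok = True
        for pat, val in zip(pattern, triple):
            if pat.startswith('?'):
                if binding.get(pat, val) != val:   # repeated variable must agree
                    ok = False
                    break
                binding[pat] = val
            elif pat != val:                       # constant must match exactly
                ok = False
                break
        if ok:
            results.append(binding)
    return results
```

    The outer loop over triples is embarrassingly parallel, which is what makes OpenMP annotation of the generated code natural; joins across multiple patterns are where the memory-occupation question arises.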

  8. Large-scale virtual screening on public cloud resources with Apache Spark.

    PubMed

    Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola

    2017-01-01

    Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive; however, it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on the message passing interface, relying on low-failure-rate hardware and fast network connections. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against approximately 2.2 M compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows for trying out our method on relatively small libraries first and then scaling to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).
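
    The MapReduce pattern described above can be sketched with the standard library in place of Spark: "map" scores chunks of a compound library in parallel, and "reduce" keeps the best-scoring hits. The scoring function below is a deterministic mock standing in for a real docking program; all names and parameters are illustrative.

```python
# Map-reduce sketch of docking-based virtual screening.
import heapq
from concurrent.futures import ThreadPoolExecutor

def mock_score(compound):
    """Stand-in for a docking score (lower is better); NOT a real model."""
    return sum(map(ord, compound)) % 100

def screen(library, score, chunk_size=2, top_k=3):
    # map: score each chunk of the library in parallel workers
    chunks = [library[i:i + chunk_size]
              for i in range(0, len(library), chunk_size)]
    with ThreadPoolExecutor() as pool:
        scored = pool.map(lambda ch: [(score(c), c) for c in ch], chunks)
    # reduce: merge all partial results and keep the top_k best scores
    merged = [pair for chunk in scored for pair in chunk]
    return heapq.nsmallest(top_k, merged)
```

    In Spark the same shape would be an RDD `map` over library partitions followed by a top-k reduction, with fault tolerance coming for free from the framework.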

  9. The chiral structure of porous chitin within the wing-scales of Callophrys rubi.

    PubMed

    Schröder-Turk, G E; Wickham, S; Averdunk, H; Brink, F; Fitz Gerald, J D; Poladian, L; Large, M C J; Hyde, S T

    2011-05-01

    The structure of the porous three-dimensional reticulated pattern in the wing scales of the butterfly Callophrys rubi (the Green Hairstreak) is explored in detail, via scanning and transmission electron microscopy. A full 3D tomographic reconstruction of a section of this material reveals that the predominantly chitin material is assembled in the wing scale to form a structure whose geometry bears a remarkable correspondence to the srs net, well-known in solid state chemistry and soft materials science. The porous solid is bounded to an excellent approximation by a parallel surface to the Gyroid, a three-periodic minimal surface with cubic crystallographic symmetry I4₁32, as foreshadowed by Stavenga and Michielsen. The scale of the structure is commensurate with the wavelength of visible light, with an edge of the conventional cubic unit cell of the parallel-Gyroid of approximately 310 nm. The genesis of this structure is discussed, and we suggest it affords a remarkable example of templating of a chiral material via soft matter, analogous to the formation of mesoporous silica via surfactant assemblies in solution. In the butterfly, the templating is achieved by the lipid-protein membranes within the smooth endoplasmic reticulum (while it remains in the chrysalis), that likely form cubic membranes, folded according to the form of the Gyroid. The subsequent formation of the chiral hard chitin framework is suggested to be driven by the gradual polymerisation of the chitin precursors, whose inherent chiral assembly in solution (during growth) promotes the formation of a single enantiomer.
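
    The "parallel surface to the Gyroid" can be visualized with the standard trigonometric level-set approximation of the Gyroid, g(x,y,z) = sin x cos y + sin y cos z + sin z cos x: points with |g| ≤ t form a solid slab bounded by two surfaces roughly parallel to the Gyroid. The sketch below uses this common approximation, not the exact minimal surface from the paper.

```python
# Level-set approximation of the Gyroid and a "parallel-surface" slab.
import math

def gyroid(x, y, z):
    """Standard nodal approximation of the Gyroid minimal surface."""
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))

def in_parallel_slab(x, y, z, thickness):
    """True if the point lies in the solid bounded by the two level
    surfaces at +/- thickness (a proxy for a parallel-Gyroid solid)."""
    return abs(gyroid(x, y, z)) <= thickness
```

    Sampling this predicate on a grid with a unit cell scaled to ~310 nm reproduces the kind of srs-net-like porous solid the tomography reveals.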

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

    We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

  11. Making Ordered DNA and Protein Structures from Computer-Printed Transparency Film Cut-Outs

    ERIC Educational Resources Information Center

    Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo

    2009-01-01

    Instructions are given for building physical scale models of ordered structures of B-form DNA, the protein α-helix, and parallel and antiparallel protein β-pleated sheets, made from colored computer printouts designed for transparency film sheets. Cut-outs from these sheets are easily assembled. Conventional color coding for atoms is used…

  12. Implementation of highly parallel and large scale GW calculations within the OpenAtom software

    NASA Astrophysics Data System (ADS)

    Ismail-Beigi, Sohrab

    The need to describe electronic excitations with better accuracy than provided by band structures produced by Density Functional Theory (DFT) has been a long-term enterprise for the computational condensed matter and materials theory communities. In some cases, appropriate theoretical frameworks have existed for some time but have been difficult to apply widely due to computational cost. For example, the GW approximation incorporates a great deal of important non-local and dynamical electronic interaction effects but has been too computationally expensive for routine use in large materials simulations. OpenAtom is an open source massively parallel ab initio density functional software package based on plane waves and pseudopotentials (http://charm.cs.uiuc.edu/OpenAtom/) that takes advantage of the Charm++ parallel framework. At present, it is developed via a three-way collaboration, funded by an NSF SI2-SSI grant (ACI-1339804), between Yale (Ismail-Beigi), IBM T. J. Watson (Glenn Martyna) and the University of Illinois at Urbana-Champaign (Laxmikant Kale). We will describe the project and our current approach towards implementing large scale GW calculations with OpenAtom. Potential applications of large scale parallel GW software for problems involving electronic excitations in semiconductor and/or metal oxide systems will also be pointed out.

  13. The build up of the correlation between halo spin and the large-scale structure

    NASA Astrophysics Data System (ADS)

    Wang, Peng; Kang, Xi

    2018-01-01

    Both simulations and observations have confirmed that the spin of haloes/galaxies is correlated with the large-scale structure (LSS), with a mass dependence such that the spin of low-mass haloes/galaxies tends to be parallel to the LSS, while that of massive haloes/galaxies tends to be perpendicular to the LSS. It is still unclear how this mass dependence is built up over time. We use N-body simulations to trace the evolution of the halo spin-LSS correlation and find that at early times the spin of all halo progenitors is parallel to the LSS. As time goes on, mass collapse around massive haloes becomes more isotropic; in particular, recent mass accretion along the slowest collapsing direction is significant and brings the halo spin to be perpendicular to the LSS. Adopting the fractional anisotropy (FA) parameter to describe the degree of anisotropy of the large-scale environment, we find that the spin-LSS correlation is a strong function of the environment, such that a higher FA (more anisotropic environment) leads to an aligned signal, while a lower anisotropy leads to a misaligned signal. In general, our results show that the spin-LSS correlation is a combined consequence of mass flow and halo growth within the cosmic web. Our predicted environmental dependence between spin and large-scale structure can be further tested using galaxy surveys.
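
    The fractional anisotropy (FA) parameter mentioned above is commonly computed from the three eigenvalues of a local tensor describing the environment. The standard FA formula is sketched below; we assume this is the form intended (the paper may normalize slightly differently).

```python
# Fractional anisotropy from three tensor eigenvalues:
# FA = sqrt( ((l1-l2)^2 + (l2-l3)^2 + (l3-l1)^2) / (2*(l1^2+l2^2+l3^2)) )
# FA = 0 for a perfectly isotropic environment, FA -> 1 for a highly
# anisotropic (filament/sheet-like) one.
import math

def fractional_anisotropy(l1, l2, l3):
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    den = 2.0 * (l1 * l1 + l2 * l2 + l3 * l3)
    return math.sqrt(num / den)
```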

  14. DL_MG: A Parallel Multigrid Poisson and Poisson-Boltzmann Solver for Electronic Structure Calculations in Vacuum and Solution.

    PubMed

    Womack, James C; Anton, Lucian; Dziedzic, Jacek; Hasnip, Phil J; Probert, Matt I J; Skylaris, Chris-Kriton

    2018-03-13

    The solution of the Poisson equation is a crucial step in electronic structure calculations, yielding the electrostatic potential, a key component of the quantum mechanical Hamiltonian. In recent decades, theoretical advances and increases in computer performance have made it possible to simulate the electronic structure of extended systems in complex environments. This requires the solution of more complicated variants of the Poisson equation, featuring nonhomogeneous dielectric permittivities, ionic concentrations with nonlinear dependencies, and diverse boundary conditions. The analytic solutions generally used to solve the Poisson equation in vacuum (or with homogeneous permittivity) are not applicable in these circumstances, and numerical methods must be used. In this work, we present DL_MG, a flexible, scalable, and accurate solver library, developed specifically to tackle the challenges of solving the Poisson equation in modern large-scale electronic structure calculations on parallel computers. Our solver is based on the multigrid approach and uses an iterative high-order defect correction method to improve the accuracy of solutions. Using two chemically relevant model systems, we tested the accuracy and computational performance of DL_MG when solving the generalized Poisson and Poisson-Boltzmann equations, demonstrating excellent agreement with analytic solutions and efficient scaling to ~10^9 unknowns and hundreds of CPU cores. We also applied DL_MG in actual large-scale electronic structure calculations, using the ONETEP linear-scaling electronic structure package to study a 2615 atom protein-ligand complex with routinely available computational resources. In these calculations, the overall execution time with DL_MG was not significantly greater than the time required for calculations using a conventional FFT-based solver.
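
    The defect-correction strategy can be illustrated in miniature: repeatedly compute the residual (defect) of an approximate solution and feed it back through the approximate solver as a correction. The sketch below uses plain Jacobi sweeps on a 1D Poisson problem -u'' = f with zero boundary values; it is a schematic of the idea only, not DL_MG's multigrid or its high-order discretization.

```python
# Iterative defect correction with an inexact (Jacobi) inner solver.

def jacobi_poisson_1d(f, h, sweeps, u=None):
    """Approximately solve -u'' = f (zero Dirichlet BCs) by Jacobi sweeps."""
    n = len(f)
    u = [0.0] * n if u is None else u[:]
    for _ in range(sweeps):
        new = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            new[i] = 0.5 * (left + right + h * h * f[i])
        u = new
    return u

def residual(f, u, h):
    """Defect r = f - A u for the standard 3-point Laplacian."""
    n = len(f)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r

def defect_correct(f, h, outer=5, sweeps=20):
    """Each outer pass solves A e = r approximately and adds e to u."""
    u = jacobi_poisson_1d(f, h, sweeps)
    for _ in range(outer):
        r = residual(f, u, h)
        e = jacobi_poisson_1d(r, h, sweeps)
        u = [ui + ei for ui, ei in zip(u, e)]
    return u
```

    The point of the pattern is that the inner solver need only be approximate: each outer correction multiplies the remaining error by the inner solver's contraction factor, so accuracy accumulates geometrically.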

  15. Fine-scale characteristics of interplanetary sector

    NASA Technical Reports Server (NTRS)

    Behannon, K. W.; Neubauer, F. M.; Barnstoff, H.

    1980-01-01

    The structure of the interplanetary sector boundaries observed by Helios 1 within sector transition regions was studied. Such regions consist of intermediate (nonspiral) average field orientations in some cases, as well as a number of large-angle directional discontinuities (DD's) on the fine scale (time scales ≤ 1 hour). Such DD's are found to be more similar to tangential than rotational discontinuities, to be oriented on average more nearly perpendicular than parallel to the ecliptic plane, to be accompanied usually by a large dip (~80%) in B and, with a most probable thickness of 3 × 10⁴ km, to be significantly thicker than those previously studied. It is hypothesized that the observed structures represent multiple traversals of the global heliospheric current sheet due to local fluctuations in the position of the sheet. There is evidence that such fluctuations are sometimes produced by wavelike motions or surface corrugations of scale length 0.05-0.1 AU superimposed on the large-scale structure.

  16. Local and nonlocal parallel heat transport in general magnetic fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Del-Castillo-Negrete, Diego B; Chacon, Luis

    2011-01-01

    A novel approach for the study of parallel transport in magnetized plasmas is presented. The method avoids numerical pollution issues of grid-based formulations and applies to integrable and chaotic magnetic fields with local or nonlocal parallel closures. In weakly chaotic fields, the method gives the fractal structure of the devil's staircase radial temperature profile. In fully chaotic fields, the temperature exhibits self-similar spatiotemporal evolution with a stretched-exponential scaling function for local closures and an algebraically decaying one for nonlocal closures. It is shown that, for both closures, the effective radial heat transport is incompatible with the quasilinear diffusion model.

  17. Towards anatomic scale agent-based modeling with a massively parallel spatially explicit general-purpose model of enteric tissue (SEGMEnT_HPC).

    PubMed

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic-scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract exemplify pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic, and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic-scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis.

  18. A Dual Super-Element Domain Decomposition Approach for Parallel Nonlinear Finite Element Analysis

    NASA Astrophysics Data System (ADS)

    Jokhio, G. A.; Izzuddin, B. A.

    2015-05-01

    This article presents a new domain decomposition method for nonlinear finite element analysis introducing the concept of dual partition super-elements. The method extends ideas from the displacement frame method and is ideally suited for parallel nonlinear static/dynamic analysis of structural systems. In the new method, domain decomposition is realized by replacing one or more subdomains in a "parent system," each with a placeholder super-element, where the subdomains are processed separately as "child partitions," each wrapped by a dual super-element along the partition boundary. The analysis of the overall system, including the satisfaction of equilibrium and compatibility at all partition boundaries, is realized through direct communication between all pairs of placeholder and dual super-elements. The proposed method has particular advantages for matrix solution methods based on the frontal scheme, and can be readily implemented for existing finite element analysis programs to achieve parallelization on distributed memory systems with minimal intervention, thus overcoming memory bottlenecks typically faced in the analysis of large-scale problems. Several examples are presented in this article which demonstrate the computational benefits of the proposed parallel domain decomposition approach and its applicability to the nonlinear structural analysis of realistic structural systems.

  19. Nongyrotropic Electrons in Guide Field Reconnection

    NASA Technical Reports Server (NTRS)

    Wendel, D. E.; Hesse, M.; Bessho, N.; Adrian, M. L.; Kuznetsova, M.

    2016-01-01

    We apply a scalar measure of nongyrotropy to the electron pressure tensor in a 2D particle-in-cell simulation of guide field reconnection and assess the corresponding electron distributions and the forces that account for the nongyrotropy. The scalar measure reveals that the nongyrotropy lies in bands that straddle the electron diffusion region and the separatrices, in the same regions where there are parallel electric fields. Analysis of electron distributions and fields shows that the nongyrotropy along the inflow and outflow separatrices emerges as a result of multiple populations of electrons influenced differently by large- and small-scale parallel electric fields and by gradients in the electric field. The relevant parallel electric fields include large-scale potential ramps emanating from the x-line and sub-ion inertial scale bipolar electron holes. Gradients in the perpendicular electric field modify electrons differently depending on their phase, thus producing nongyrotropy. Magnetic flux violation occurs along portions of the separatrices that coincide with the parallel electric fields. An inductive electric field in the electron E×B drift frame thus develops, which has the effect of enhancing nongyrotropies already produced by other mechanisms and, under certain conditions, producing its own nongyrotropy. Particle tracing of electrons from nongyrotropic populations along the inflows and outflows shows that the striated structure of nongyrotropy corresponds to electrons arriving from different source regions. We also show that the relevant parallel electric fields receive important contributions not only from the nongyrotropic portion of the electron pressure tensor but from electron spatial and temporal inertial terms as well.

  20. A Linked-Cell Domain Decomposition Method for Molecular Dynamics Simulation on a Scalable Multiprocessor

    DOE PAGES

    Yang, L. H.; Brooks III, E. D.; Belak, J.

    1992-01-01

    A molecular dynamics algorithm for performing large-scale simulations using the Parallel C Preprocessor (PCP) programming paradigm on the BBN TC2000, a massively parallel computer, is discussed. The algorithm uses a linked-cell data structure to obtain the near neighbors of each atom as time evolves. Each processor is assigned to a geometric domain containing many subcells, and the storage for that domain is private to the processor. Within this scheme, the interdomain (i.e., interprocessor) communication is minimized.
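The linked-cell idea described above can be sketched in a few lines (an illustrative toy, not the paper's PCP/BBN implementation): atoms are binned into subcells whose edge is at least the cutoff, so each atom's neighbors are found by scanning only the 27 surrounding subcells instead of all atom pairs.

```python
import numpy as np
from collections import defaultdict

# Hedged sketch of a linked-cell neighbor search (illustrative only): bin
# atoms into subcells of edge >= rc, then look for pairs within cutoff rc
# only in the 27 neighboring subcells. No periodic images for simplicity.

rng = np.random.default_rng(0)
L, rc, n = 10.0, 1.0, 200                  # box length, cutoff, atom count
pos = rng.uniform(0.0, L, size=(n, 3))

ncell = int(L // rc)                       # subcells per dimension
cells = defaultdict(list)                  # (i,j,k) -> atom indices in that cell
for a, p in enumerate(pos):
    cells[tuple((p // (L / ncell)).astype(int))].append(a)

def neighbor_pairs():
    """Pairs (a, b), a < b, with |pos[a] - pos[b]| < rc."""
    found = set()
    for (i, j, k), atoms in cells.items():
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                for dk in (-1, 0, 1):
                    for a in atoms:
                        for b in cells.get((i + di, j + dj, k + dk), ()):
                            if a < b and np.linalg.norm(pos[a] - pos[b]) < rc:
                                found.add((a, b))
    return found

pairs = neighbor_pairs()
```

Because the cell edge is no smaller than the cutoff, any pair within `rc` is guaranteed to lie in the same or adjacent cells, which is what makes the O(N) scan correct.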

  1. State of the art in electromagnetic modeling for the Compact Linear Collider

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, Arno; Kabel, Andreas; Lee, Lie-Quan

    SLAC's Advanced Computations Department (ACD) has developed the parallel 3D electromagnetic time-domain code T3P for simulations of wakefields and transients in complex accelerator structures. T3P is based on state-of-the-art Finite Element methods on unstructured grids and features unconditional stability, quadratic surface approximation, and up to 6th-order vector basis functions for unprecedented simulation accuracy. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with fast turn-around times, aiding the design of the next generation of accelerator facilities. Applications include simulations of the proposed two-beam accelerator structures for the Compact Linear Collider (CLIC): wakefield damping in the Power Extraction and Transfer Structure (PETS) and power transfer to the main beam accelerating structures are investigated.

  2. Reduced-Order Structure-Preserving Model for Parallel-Connected Three-Phase Grid-Tied Inverters: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Brian B; Purba, Victor; Jafarpour, Saber

    Given that next-generation infrastructures will contain large numbers of grid-connected inverters and these interfaces will be satisfying a growing fraction of system load, it is imperative to analyze the impacts of power electronics on such systems. However, since each inverter model has a relatively large number of dynamic states, it would be impractical to execute complex system models where the full dynamics of each inverter are retained. To address this challenge, we derive a reduced-order structure-preserving model for parallel-connected grid-tied three-phase inverters. Here, each inverter in the system is assumed to have a full-bridge topology, an LCL filter at the point of common coupling, and a control architecture that includes a current controller, a power controller, and a phase-locked loop for grid synchronization. We outline a structure-preserving reduced-order inverter model for the setting where the parallel inverters are each designed such that the filter components and controller gains scale linearly with the power rating. By structure preserving, we mean that the reduced-order three-phase inverter model is also composed of an LCL filter, a power controller, a current controller, and a PLL. That is, we show that the system of parallel inverters can be modeled exactly as one aggregated inverter unit, and this equivalent model has the same number of dynamical states as an individual inverter in the paralleled system. Numerical simulations validate the reduced-order models.
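The linear-scaling assumption behind such aggregation can be checked with a trivial numeric sketch (illustrative component values only, not the authors' model): if each unit's filter inductance scales as 1/P and its capacitance as P, then N identical parallel units have exactly the component values of a single unit rated N·P.

```python
# Hedged numeric sketch of the aggregation idea (toy values, not the paper's
# parameters): under linear power-rating scaling, N parallel units collapse
# to one aggregate unit rated N * P with the same filter structure.

Lbase, Cbase, Pbase = 2e-3, 10e-6, 1e3     # illustrative 1 kW unit values

def lcl(P):
    """Filter inductance and capacitance of a unit rated P, linear scaling."""
    return Lbase * Pbase / P, Cbase * P / Pbase

N = 5
L1, C1 = lcl(Pbase)
L_par = L1 / N                 # N identical inductors in parallel
C_par = N * C1                 # N identical capacitors in parallel
L_agg, C_agg = lcl(N * Pbase)  # one aggregate unit rated N * P
```

The parallel combination and the scaled single unit coincide, which is the structural reason the aggregate model keeps the same number of states as one inverter.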

  3. Deformational sequence of a portion of the Michipicoten greenstone belt, Chabanel Township, Ontario

    NASA Technical Reports Server (NTRS)

    Shrady, C. H.; Mcgill, G. E.

    1986-01-01

    Detailed mapping at a scale of one inch = 400 feet is being carried out within a fume kill, having excellent exposure, located in the southwestern portion of the Michipicoten Greenstone Belt near Wawa, Ontario. The rocks are metasediments and metavolcanics of lower greenschist facies. U-Pb geochronology indicates that they are at least 2698 ± 11 Ma old. The lithologic packages strike northeast to northwest, but the dominant strike is approximately east-west. Sedimentary structures and graded bedding are well preserved, aiding in the structural interpretation of this multiply deformed area. At least six phases of deformation within a relatively small area of the Michipicoten Greenstone Belt have been tentatively identified. These include the following structural features in approximate order of occurrence: (0) soft-sediment structures; (1) regionally overturned rocks, flattened pebbles, bedding parallel cleavage, and early, approximately bedding parallel faults; (2) northwest to north striking cleavage; (3) northeast striking cleavage and associated folds, and at least some late movement on approximately bedding parallel faults; (4) north-northwest and northeast trending faults; and (5) diabase dikes and associated fracture cleavages. Minor displacement of the diabase dikes occurs on faults that appear to be reactivated older structures.

  4. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE PAGES

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...

    2017-01-18

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.

  5. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.

  6. Massively parallel sparse matrix function calculations with NTPoly

    NASA Astrophysics Data System (ADS)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well-developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization-free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication-avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
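As a flavor of such diagonalization-free, multiplication-only methods (a generic textbook iteration, not NTPoly's code), the Newton-Schulz iteration computes a matrix square root and inverse square root purely from matrix products, so sparsity of the operands can be exploited at scale:

```python
import numpy as np

# Hedged sketch (generic iteration, not NTPoly's implementation): Newton-
# Schulz computes sqrt(A) and A^(-1/2) using only matrix multiplications.
# Convergence requires the spectrum of A to lie in (0, 1], hence the scaling.

rng = np.random.default_rng(1)
B = rng.standard_normal((50, 50))
A = B @ B.T / 50 + np.eye(50)              # symmetric positive definite
A /= np.linalg.norm(A, 2)                  # scale eigenvalues into (0, 1]

Y, Z = A.copy(), np.eye(50)
for _ in range(30):
    T = 0.5 * (3.0 * np.eye(50) - Z @ Y)   # polynomial update factor
    Y, Z = Y @ T, T @ Z                    # Y -> sqrt(A), Z -> A^(-1/2)
```

At convergence `Z` is the inverse square root, the quantity used, for example, to orthogonalize overlap matrices in linear-scaling electronic structure methods.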

  7. The method of parallel-hierarchical transformation for rapid recognition of dynamic images using GPGPU technology

    NASA Astrophysics Data System (ADS)

    Timchenko, Leonid; Yarovyi, Andrii; Kokriatskaya, Nataliya; Nakonechna, Svitlana; Abramenko, Ludmila; Ławicki, Tomasz; Popiel, Piotr; Yesmakhanova, Laura

    2016-09-01

    The paper presents a method of parallel-hierarchical transformations for rapid recognition of dynamic images using GPU technology. The direct parallel-hierarchical transformations are based on a cluster hardware platform oriented to both CPUs and GPUs. Mathematical models for training the parallel-hierarchical (PH) network for the transformation are developed, as well as a training method of the PH network for recognition of dynamic images. This research is most relevant to problems of organizing high-performance computations on very large arrays of information, designed to implement multi-stage sensing and processing as well as compaction and recognition of data in informational structures and computer devices. The advantages of this method include high performance through the use of recent advances in parallelization, the ability to work with images of very large dimensions, ease of scaling when the number of nodes in the cluster changes, and automatic scanning of the local network to detect compute nodes.

  8. Exact coherent structures in an asymptotically reduced description of parallel shear flows

    NASA Astrophysics Data System (ADS)

    Beaume, Cédric; Knobloch, Edgar; Chini, Gregory P.; Julien, Keith

    2015-02-01

    A reduced description of shear flows motivated by the Reynolds number scaling of lower-branch exact coherent states in plane Couette flow (Wang J, Gibson J and Waleffe F 2007 Phys. Rev. Lett. 98 204501) is constructed. Exact time-independent nonlinear solutions of the reduced equations corresponding to both lower and upper branch states are found for a sinusoidal, body-forced shear flow. The lower branch solution is characterized by fluctuations that vary slowly along the critical layer while the upper branch solutions display a bimodal structure and are more strongly focused on the critical layer. The reduced equations provide a rational framework for investigations of subcritical spatiotemporal patterns in parallel shear flows.

  9. Spectral enstrophy budget in a shear-less flow with turbulent/non-turbulent interface

    NASA Astrophysics Data System (ADS)

    Cimarelli, Andrea; Cocconi, Giacomo; Frohnapfel, Bettina; De Angelis, Elisabetta

    2015-12-01

    A numerical analysis of the interaction between decaying shear-free turbulence and quiescent fluid is performed by means of global statistical budgets of enstrophy at both the single-point and two-point levels. The single-point enstrophy budget allows us to recognize three physically relevant layers: a bulk turbulent region, an inhomogeneous turbulent layer, and an interfacial layer. Within these layers, enstrophy is produced, transferred, and finally destroyed while leading to a propagation of the turbulent front. These processes depend not only on the position in the flow field but are also strongly scale-dependent. In order to tackle this multi-dimensional behaviour of enstrophy in the space of scales and in physical space, we analyse the spectral enstrophy budget equation. The picture consists of an inviscid spatial cascade of enstrophy from large to small scales parallel to the interface, moving towards the interface. At the interface, this phenomenon breaks down, giving way to an anisotropic cascade where large-scale structures exhibit only a cascade process normal to the interface, thus reducing their thickness while retaining their lengths parallel to the interface. The observed behaviour could be relevant for both theoretical and modelling approaches to flows with interacting turbulent/non-turbulent regions. The scale properties of the turbulent propagation mechanisms highlight that the inviscid turbulent transport is a large-scale phenomenon. On the contrary, the viscous diffusion, commonly associated with small-scale mechanisms, exhibits a much richer physics involving small lengths normal to the interface but, at the same time, large scales parallel to the interface.

  10. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

    DOE PAGES

    Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De; ...

    2017-01-28

    Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is Computed Tomography, which can generate tens of GB/s depending on x-ray range. A large-scale tomographic dataset, such as a mouse brain, may require hours of computation time on a medium-sized workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread level) shared memory and (process level) distributed memory parallelization. Trace utilizes a special data structure called a replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide a 158x speedup (using 32 compute nodes) over a single-core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.
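The replicated-object pattern can be sketched generically (a toy illustration, not Trace's middleware): each worker accumulates into a private copy of the output array, and the copies are summed in a final reduction, keeping the accumulation hot path free of locks.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch of the "replicated object" pattern (illustrative only): every
# worker writes to its own replica of the shared output array; replicas are
# combined once at the end in a reduction step.

nworkers, size = 4, 1000
# Toy work items: worker w updates every nworkers-th index by +1.0
updates = [[(idx, 1.0) for idx in range(w, size, nworkers)]
           for w in range(nworkers)]

def worker(items):
    replica = np.zeros(size)               # private replica: no contention
    for idx, val in items:
        replica[idx] += val
    return replica

with ThreadPoolExecutor(nworkers) as ex:
    replicas = list(ex.map(worker, updates))
result = np.sum(replicas, axis=0)          # final reduction over replicas
```

The trade-off is memory (one replica per worker) for contention-free updates, which is the same trade the abstract describes optimizing.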

  11. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De

    Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is Computed Tomography, which can generate tens of GB/s depending on x-ray range. A large-scale tomographic dataset, such as a mouse brain, may require hours of computation time on a medium-sized workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread level) shared memory and (process level) distributed memory parallelization. Trace utilizes a special data structure called a replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide a 158x speedup (using 32 compute nodes) over a single-core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.

  12. A parallelized three-dimensional cellular automaton model for grain growth during additive manufacturing

    NASA Astrophysics Data System (ADS)

    Lian, Yanping; Lin, Stephen; Yan, Wentao; Liu, Wing Kam; Wagner, Gregory J.

    2018-05-01

    In this paper, a parallelized 3D cellular automaton computational model is developed to predict grain morphology for solidification of metal during the additive manufacturing process. Solidification phenomena are characterized by highly localized events, such as the nucleation and growth of multiple grains. As a result, parallelization requires careful treatment of load balancing between processors as well as interprocess communication in order to maintain a high parallel efficiency. We give a detailed summary of the formulation of the model, as well as a description of the communication strategies implemented to ensure parallel efficiency. Scaling tests on a representative problem with about half a billion cells demonstrate parallel efficiency of more than 80% on 8 processors and around 50% on 64; loss of efficiency is attributable to load imbalance due to near-surface grain nucleation in this test problem. The model is further demonstrated through an additive manufacturing simulation with resulting grain structures showing reasonable agreement with those observed in experiments.
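A drastically simplified 2D analogue of such a cellular automaton (a toy of my own, far simpler than the paper's 3D model) captures the basic mechanism: liquid cells adopt the grain identity of a neighboring solid cell, step by step, until the domain has fully solidified.

```python
import numpy as np

# Hedged toy 2D cellular automaton for grain growth (illustrative only):
# liquid cells (id 0) capture the grain id of a randomly chosen solid von
# Neumann neighbor each step, on a periodic grid.

rng = np.random.default_rng(2)
n, nseeds = 32, 6
grid = np.zeros((n, n), dtype=int)
for gid in range(1, nseeds + 1):           # nucleate seed grains
    i, j = rng.integers(0, n, size=2)
    grid[i, j] = gid

def step(grid):
    new = grid.copy()
    for i in range(n):
        for j in range(n):
            if grid[i, j] == 0:
                nbrs = [grid[(i + di) % n, (j + dj) % n]
                        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))]
                solid = [g for g in nbrs if g > 0]
                if solid:                  # capture by a neighboring grain
                    new[i, j] = rng.choice(solid)
    return new

while (grid == 0).any():                   # grow until fully solidified
    grid = step(grid)
```

The synchronous update (read `grid`, write `new`) is also what makes domain-decomposed parallel versions need only a halo exchange per step.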

  13. A parallelized three-dimensional cellular automaton model for grain growth during additive manufacturing

    NASA Astrophysics Data System (ADS)

    Lian, Yanping; Lin, Stephen; Yan, Wentao; Liu, Wing Kam; Wagner, Gregory J.

    2018-01-01

    In this paper, a parallelized 3D cellular automaton computational model is developed to predict grain morphology for solidification of metal during the additive manufacturing process. Solidification phenomena are characterized by highly localized events, such as the nucleation and growth of multiple grains. As a result, parallelization requires careful treatment of load balancing between processors as well as interprocess communication in order to maintain a high parallel efficiency. We give a detailed summary of the formulation of the model, as well as a description of the communication strategies implemented to ensure parallel efficiency. Scaling tests on a representative problem with about half a billion cells demonstrate parallel efficiency of more than 80% on 8 processors and around 50% on 64; loss of efficiency is attributable to load imbalance due to near-surface grain nucleation in this test problem. The model is further demonstrated through an additive manufacturing simulation with resulting grain structures showing reasonable agreement with those observed in experiments.

  14. Spectral Anisotropy of Magnetic Field Fluctuations around Ion Scales in the Fast Solar Wind

    NASA Astrophysics Data System (ADS)

    Wang, X.; Tu, C.; He, J.; Marsch, E.; Wang, L.

    2016-12-01

    The power spectra of the magnetic field at ion scales are significantly influenced by waves and structures. In this work, we study the dependence on the angle Θ_RB of the contribution of waves to the spectral index of the magnetic field. A wavelet technique is applied to the high time-resolution magnetic field data from WIND spacecraft measurements in the fast solar wind. It is found that around ion scales, the parallel spectrum originally has a slope of -4.6±0.1. When we remove the waves, which correspond to the data points with relatively larger values of magnetic helicity, the parallel spectrum gradually becomes shallower, reaching -3.2±0.2. However, the perpendicular spectrum does not change significantly during the wave-removal process, and its slope remains -3.1±0.1. This means that when the waves are removed from the original data, the spectral anisotropy becomes weaker. This result may help us understand the physical nature of the spectral anisotropy around ion scales.

  15. An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH

    NASA Astrophysics Data System (ADS)

    Lee, D.; Gopal, S.; Mohapatra, P.

    2012-07-01

    We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
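The core JFNK idea, Newton's method whose Krylov iterations use only Jacobian-vector products approximated from finite differences of the residual, can be demonstrated with SciPy's generic `newton_krylov` solver (an off-the-shelf stand-in, not FLASH's preconditioned implementation):

```python
import numpy as np
from scipy.optimize import newton_krylov

# Hedged sketch of JFNK on a stiff-ish nonlinear BVP (illustrative only):
# solve -u'' + u**3 = 1 with u(0) = u(1) = 0. The Jacobian is never formed;
# newton_krylov approximates J*v by finite differences of the residual.

n = 64
h = 1.0 / (n + 1)

def residual(u):
    """Discrete residual of -u'' + u**3 = 1, homogeneous Dirichlet BCs."""
    upad = np.concatenate(([0.0], u, [0.0]))
    lap = (upad[:-2] - 2.0 * upad[1:-1] + upad[2:]) / h**2
    return -lap + u**3 - 1.0

u = newton_krylov(residual, np.zeros(n), f_tol=1e-8)
```

In practice (as the abstract notes) the method lives or dies by its preconditioner; SciPy's solver accepts one through its `inner_M` option, which a production solver would supply.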

  16. Highly Efficient Parallel Multigrid Solver For Large-Scale Simulation of Grain Growth Using the Structural Phase Field Crystal Model

    NASA Astrophysics Data System (ADS)

    Guan, Zhen; Pekurovsky, Dmitry; Luce, Jason; Thornton, Katsuyo; Lowengrub, John

    The structural phase field crystal (XPFC) model can be used to model grain growth in polycrystalline materials at diffusive time scales while maintaining atomic-scale resolution. However, the governing equation of the XPFC model is an integral partial differential equation (IPDE), which poses challenges for implementation on high performance computing (HPC) platforms. In collaboration with the XSEDE Extended Collaborative Support Service, we developed a distributed-memory HPC solver for the XPFC model, which combines parallel multigrid and P3DFFT. Performance benchmarking on the Stampede supercomputer indicates near-linear strong and weak scaling, for both multigrid and the transfer time between the multigrid and FFT modules, up to 1024 cores. Scalability of the FFT module begins to decline at 128 cores, but it is sufficient for the type of problem we will be examining. We have demonstrated simulations using 1024 cores, and we expect to achieve 4096 cores and beyond. Ongoing work involves optimization of MPI/OpenMP-based codes for the Intel KNL many-core architecture. This optimizes the code for upcoming pre-exascale systems, in particular many-core systems such as Stampede 2.0 and Cori 2 at NERSC, without sacrificing efficiency on other general HPC systems.

  17. An Expert Assistant for Computer Aided Parallelization

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Chun, Robert; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit

    2004-01-01

    The prototype implementation of an expert system was developed to assist the user in the computer-aided parallelization process. The system interfaces to tools for automatic parallelization and performance analysis. By fusing static program structure information with dynamic performance analysis data, the expert system can help the user to filter, correlate, and interpret the data gathered by the existing tools. Sections of the code that show poor performance and require further attention are rapidly identified, and suggestions for improvements are presented to the user. In this paper we describe the components of the expert system and discuss its interface to the existing tools. We present a case study to demonstrate its successful use in full-scale scientific applications.

  18. A template-based approach for parallel hexahedral two-refinement

    DOE PAGES

    Owen, Steven J.; Shih, Ryan M.; Ernst, Corey D.

    2016-10-17

    Here, we provide a template-based approach for generating locally refined all-hex meshes. We focus specifically on refinement of initially structured grids utilizing a 2-refinement approach where uniformly refined hexes are subdivided into eight child elements. The refinement algorithm consists of identifying marked nodes that are used as the basis for a set of four simple refinement templates. The target application for 2-refinement is a parallel grid-based all-hex meshing tool for high performance computing in a distributed environment. The result is a parallel-consistent locally refined mesh requiring minimal communication, with a minimum mesh quality greater than a scaled Jacobian of 0.3 prior to smoothing.
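    The basic operation the templates build on, subdividing one hex into eight children, amounts to trilinear interpolation of the corner coordinates onto a 3x3x3 lattice. A minimal sketch (node ordering and helper names are illustrative, not the paper's template encoding):

```python
import numpy as np

# Hypothetical sketch of uniform hex subdivision into eight children.
def subdivide_hex(corners):
    """corners: (2, 2, 2, 3) array of hex corner coordinates, indexed (i, j, k).
    Returns the 27 refined lattice points and the 8 child hexes."""
    t = np.array([0.0, 0.5, 1.0])
    lattice = np.zeros((3, 3, 3, 3))
    for a, u in enumerate(t):
        for b, v in enumerate(t):
            for c, w in enumerate(t):
                # trilinear interpolation of the eight corners
                lattice[a, b, c] = (
                    (1-u)*(1-v)*(1-w)*corners[0,0,0] + u*(1-v)*(1-w)*corners[1,0,0]
                  + (1-u)*v*(1-w)*corners[0,1,0]     + u*v*(1-w)*corners[1,1,0]
                  + (1-u)*(1-v)*w*corners[0,0,1]     + u*(1-v)*w*corners[1,0,1]
                  + (1-u)*v*w*corners[0,1,1]         + u*v*w*corners[1,1,1]
                )
    children = [lattice[a:a+2, b:b+2, c:c+2]
                for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    return lattice, children

unit = np.array([[[[0,0,0],[0,0,1]],[[0,1,0],[0,1,1]]],
                 [[[1,0,0],[1,0,1]],[[1,1,0],[1,1,1]]]], dtype=float)
lattice, children = subdivide_hex(unit)
```

The marked-node templates in the paper then decide which cells receive this subdivision so that the refined region transitions conformally into the coarse grid.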

  20. Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing

    NASA Astrophysics Data System (ADS)

    Meng, Xiang

    The domain of nanoscale optical science and technology combines the classical world of electromagnetics with the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allow optical structures to be scaled down to nanoscale or even atomic dimensions, far smaller than the wavelengths they are designed for. These nanostructures can have unique, controllable, and tunable optical properties, and their interactions with quantum materials can produce important near-field and far-field optical responses. These optical properties have many important applications, ranging from efficient and tunable light sources, detectors, filters, modulators, and high-speed all-optical switches, to next-generation classical and quantum computation and biophotonic medical sensors. This emerging field of nanoscience, known as nanophotonics, is highly interdisciplinary, requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling, and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales, where the nature of the nanostructured matter controls the interactions. In addition, advances in computing capabilities, such as parallel computing, have become a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, as designing these devices requires extensive memory and extremely long core hours; distributed computing platforms built on parallel computing are therefore required for faster design processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses computing machines to analyze and solve otherwise intractable scientific challenges.
    In particular, parallel computing is a form of computation operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently. In this dissertation, we report a series of new nanophotonic developments using advanced parallel computing techniques. The applications include structure optimizations at the nanoscale, both to control the electromagnetic response of materials and to manipulate nanoscale structures for enhanced field concentration, enabling breakthroughs in imaging and sensing systems (chapters 3 and 4) and improving the spatio-temporal resolution of spectroscopies (chapter 5). We also report investigations of the confinement of light-matter interactions in the quantum mechanical regime, where size-dependent novel properties enhance a wide range of technologies, from tunable and efficient light sources and detectors to other nanophotonic elements with enhanced functionality (chapters 6 and 7).

  1. Transport of cosmic-ray protons in intermittent heliospheric turbulence: Model and simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alouani-Bibi, Fathallah; Le Roux, Jakobus A., E-mail: fb0006@uah.edu

    The transport of charged energetic particles in the presence of strong intermittent heliospheric turbulence is computationally analyzed based on known properties of the interplanetary magnetic field and solar wind plasma at 1 astronomical unit. The turbulence is assumed to be static, composite, and quasi-three-dimensional with a varying energy distribution between a one-dimensional Alfvénic (slab) and a structured two-dimensional component. The spatial fluctuations of the turbulent magnetic field are modeled either as homogeneous with a Gaussian probability distribution function (PDF), or as intermittent on large and small scales with a q-Gaussian PDF. Simulations showed that energetic particle diffusion coefficients both parallel and perpendicular to the background magnetic field are significantly affected by intermittency in the turbulence. This effect is especially strong for parallel transport, where for large-scale intermittency the results show an extended phase of subdiffusive parallel transport during which cross-field diffusion dominates. The effects of intermittency are found to depend on particle rigidity and the fraction of slab energy in the turbulence, yielding a perpendicular-to-parallel mean free path ratio close to 1 for large-scale intermittency. Investigation of higher-order transport moments (kurtosis) indicates that non-Gaussian statistical properties of the intermittent turbulent magnetic field are present in the parallel transport, especially for low-rigidity particles at all times.
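    The contrast between the two fluctuation models is the heavier tail of the q-Gaussian, which shows up directly in the fourth moment (kurtosis) the study uses as a diagnostic. A toy numerical comparison (the q values and beta are illustrative demo numbers, not fitted solar-wind parameters):

```python
import numpy as np

# Toy sketch: kurtosis of a near-Gaussian vs a heavy-tailed q-Gaussian PDF.
def q_gaussian(x, q, beta=1.0):
    # reduces to exp(-beta * x**2) in the limit q -> 1
    base = np.maximum(1.0 - (1.0 - q) * beta * x**2, 0.0)
    return base ** (1.0 / (1.0 - q))

x = np.linspace(-20.0, 20.0, 200001)
dx = x[1] - x[0]

kurt = {}
for q in (1.001, 1.2):                    # near-Gaussian vs intermittent
    p = q_gaussian(x, q)
    p /= p.sum() * dx                     # normalize numerically
    m2 = (x**2 * p).sum() * dx
    m4 = (x**4 * p).sum() * dx
    kurt[q] = m4 / m2**2                  # 3 for a Gaussian, larger for q > 1
```

For q near 1 the kurtosis stays at the Gaussian value of 3, while q = 1.2 gives a visibly larger value, the signature of intermittency in the higher-order moments.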

  2. The Center for Optimized Structural Studies (COSS) platform for automation in cloning, expression, and purification of single proteins and protein-protein complexes.

    PubMed

    Mlynek, Georg; Lehner, Anita; Neuhold, Jana; Leeb, Sarah; Kostan, Julius; Charnagalov, Alexej; Stolt-Bergner, Peggy; Djinović-Carugo, Kristina; Pinotsis, Nikos

    2014-06-01

    Expression in Escherichia coli represents the simplest and most cost-effective means for the production of recombinant proteins. This is a routine task in structural biology and biochemistry where milligrams of the target protein are required in high purity and monodispersity. To achieve these criteria, the user often needs to screen several constructs in different expression and purification conditions in parallel. We describe a pipeline, implemented in the Center for Optimized Structural Studies, that enables the systematic screening of expression and purification conditions for recombinant proteins and relies on a series of logical decisions. We first use bioinformatics tools to design a series of protein fragments, which we clone in parallel, and subsequently screen in small scale for optimal expression and purification conditions. Based on a scoring system that assesses soluble expression, we then select the top-ranking targets for large-scale purification. In the establishment of our pipeline, emphasis was put on streamlining the processes such that they can be easily, though not necessarily, automated. In a typical run of about 2 weeks, we are able to prepare and perform small-scale expression screens for 20-100 different constructs followed by large-scale purification of at least 4-6 proteins. The major advantage of our approach is its flexibility, which allows for easy adoption, either partially or entirely, by any average hypothesis-driven laboratory in a manual or robot-assisted manner.

  3. Reduced-Order Structure-Preserving Model for Parallel-Connected Three-Phase Grid-Tied Inverters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Brian B; Purba, Victor; Jafarpour, Saber

    Next-generation power networks will contain large numbers of grid-connected inverters satisfying a significant fraction of system load. Since each inverter model has a relatively large number of dynamic states, it is impractical to analyze complex system models where the full dynamics of each inverter are retained. To address this challenge, we derive a reduced-order structure-preserving model for parallel-connected grid-tied three-phase inverters. Here, each inverter in the system is assumed to have a full-bridge topology, an LCL filter at the point of common coupling, and a control architecture that includes a current controller, a power controller, and a phase-locked loop (PLL) for grid synchronization. We outline a structure-preserving reduced-order inverter model with lumped parameters for the setting where the parallel inverters are each designed such that the filter components and controller gains scale linearly with the power rating. By structure-preserving, we mean that the reduced-order three-phase inverter model is also composed of an LCL filter, a power controller, a current controller, and a PLL. We show that the system of parallel inverters can be modeled exactly as one aggregated inverter unit, and this equivalent model has the same number of dynamical states as any individual inverter in the system. Numerical simulations validate the reduced-order model.
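    The lumped-parameter aggregation described here can be sketched for the passive filter components: with N identical units in parallel, inductances divide by N and capacitances multiply by N, while the power rating sums. A minimal sketch (the numeric values and key names are illustrative, not from the paper):

```python
# Hedged sketch of lumped-parameter aggregation of N identical parallel
# inverters; field names and values are illustrative assumptions.
def aggregate(params, n):
    """params: per-inverter LCL filter values; returns the equivalent unit."""
    return {
        "L_inv": params["L_inv"] / n,    # n inverter-side inductors in parallel
        "L_grid": params["L_grid"] / n,  # n grid-side inductors in parallel
        "C_f": params["C_f"] * n,        # n filter capacitors in parallel
        "P_rated": params["P_rated"] * n,
    }

one = {"L_inv": 1.0e-3, "L_grid": 0.5e-3, "C_f": 10.0e-6, "P_rated": 50.0e3}
agg = aggregate(one, 4)
```

Because the controller gains are assumed to scale linearly with the power rating as well, the aggregated unit keeps the same dynamic structure and state count as a single inverter.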

  4. Observations of Magnetosphere-Ionosphere Coupling Processes in Jupiter's Downward Auroral Current Region

    NASA Astrophysics Data System (ADS)

    Clark, G. B.; Mauk, B.; Allegrini, F.; Bagenal, F.; Bolton, S. J.; Bunce, E. J.; Connerney, J. E. P.; Ebert, R. W.; Gershman, D. J.; Gladstone, R.; Haggerty, D. K.; Hospodarsky, G. B.; Kotsiaros, S.; Kollmann, P.; Kurth, W. S.; Levin, S.; McComas, D. J.; Paranicas, C.; Rymer, A. M.; Saur, J.; Szalay, J. R.; Tetrick, S.; Valek, P. W.

    2017-12-01

    Our view and understanding of Jupiter's auroral regions are ever-changing as Juno continues to map out this region with every auroral pass. For example, since last year's Fall AGU and the release of publications regarding the first perijove orbit, the Juno particles and fields teams have found direct evidence of parallel potential drops in addition to the stochastic broad energy distributions associated with the downward current auroral acceleration region. In this region, which appears to exist in an altitude range of 1.5-3 Jovian radii, the potential drops can reach as high as several megavolts. Associated with these potentials are anti-planetward electron angle beams, energetic ion conics and precipitating protons, oxygen and sulfur. Sometimes the potentials within the downward current region are structured such that they look like the inverted-V type distributions typically found in Earth's upward current region. This is true for both the ion and electron energy distributions. Other times, the parallel potentials appear to be intermittent or spatially structured in a way such that they do not look like the canonical diverging electrostatic potential structure. Furthermore, the parallel potentials vary grossly in spatial/temporal scale, peak voltage and associated parallel current density. Here, we present a comprehensive study of these structures in Jupiter's downward current region focusing on energetic particle measurements from Juno-JEDI.

  5. Towards Anatomic Scale Agent-Based Modeling with a Massively Parallel Spatially Explicit General-Purpose Model of Enteric Tissue (SEGMEnT_HPC)

    PubMed Central

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are prime examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis. PMID:25806784

  6. Solvers for $\mathcal{O}(N)$ Electronic Structure in the Strong Scaling Limit

    DOE PAGES

    Bock, Nicolas; Challacombe, William M.; Kale, Laxmikant

    2016-01-26

    Here we present a hybrid OpenMP/Charm++ framework for solving the $\mathcal{O}(N)$ self-consistent-field eigenvalue problem with parallelism in the strong scaling regime, $P \gg N$, where $P$ is the number of cores and $N$ is a measure of system size, i.e., the number of matrix rows/columns, basis functions, atoms, molecules, etc. This result is achieved with a nested approach to spectral projection and the sparse approximate matrix multiply [Bock and Challacombe, SIAM J. Sci. Comput., 35 (2013), pp. C72--C98], and involves a recursive, task-parallel algorithm, often employed by generalized $N$-body solvers, applied to the occlusion and culling of negligible products in the case of matrices with decay. Employing classic technologies associated with generalized $N$-body solvers, including over-decomposition, recursive task parallelism, orderings that preserve locality, and persistence-based load balancing, we obtain scaling beyond hundreds of cores per molecule for small water clusters ([H$_2$O]$_N$, $N \in \{30, 90, 150\}$, $P/N \approx \{819, 273, 164\}$) and find support for an increasingly strong scalability with increasing system size $N$.
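    Spectral projection of the kind this solver parallelizes can be illustrated with a dense toy version of second-order purification (SP2-style), one recursive polynomial scheme behind O(N) density-matrix solvers; the real method pairs such iterations with a sparse approximate matrix multiply. The Hamiltonian and occupation below are toys, not from the paper:

```python
import numpy as np

# Dense toy sketch of second-order spectral projection (purification).
np.random.seed(0)
n, n_occ = 8, 3
H = np.random.rand(n, n)
H = 0.5 * (H + H.T)                        # toy symmetric "Hamiltonian"

# map the spectrum into [0, 1], sending the lowest eigenvalues near 1
e = np.linalg.eigvalsh(H)
X = (e[-1] * np.eye(n) - H) / (e[-1] - e[0])

for _ in range(60):
    X2 = X @ X
    # pick the branch that steers trace(X) toward the occupation number
    if abs(np.trace(X2) - n_occ) < abs(2.0 * np.trace(X) - np.trace(X2) - n_occ):
        X = X2                             # lowers the trace
    else:
        X = 2.0 * X - X2                   # raises the trace

# X converges to the projector onto the n_occ lowest eigenstates of H
```

Each step is just a matrix square and an update, so when the matrices have decay the multiplies can be made sparse and task-parallel, which is where the strong-scaling framework above comes in.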

  7. Scaling Semantic Graph Databases in Size and Performance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morari, Alessandro; Castellana, Vito G.; Villa, Oreste

    In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by employing only graph methods at all levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms. These features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grained data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL-to-data-parallel-C compiler, a library of parallel graph methods and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions and show how we solved the challenges posed by irregular behaviors. We present the results of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.

  8. Two-dimensional analysis of coupled heat and moisture transport in masonry structures

    NASA Astrophysics Data System (ADS)

    Krejčí, Tomáš

    2016-06-01

    Reconstruction and maintenance of historical buildings and bridges require good knowledge of the temperature and moisture distribution, as sharp changes in temperature and moisture can lead to damage. This paper describes an analysis of coupled heat and moisture transfer in masonry based on a two-level approach: the macro-scale level describes the whole structure, while the meso-scale level takes into account the detailed composition of the masonry. Because the two-level approach is very computationally demanding, it was implemented in parallel. The approach was used to analyze the temperature and moisture distribution in the Charles Bridge in Prague, Czech Republic.

  9. SKIRT: Hybrid parallelization of radiative transfer simulations

    NASA Astrophysics Data System (ADS)

    Verstocken, S.; Van De Putte, D.; Camps, P.; Baes, M.

    2017-07-01

    We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modelling the continuum radiation of dusty astrophysical systems including late-type galaxies and dusty tori. The hybrid scheme combines distributed memory parallelization, using the standard Message Passing Interface (MPI) to communicate between processes, and shared memory parallelization, providing multiple execution threads within each process to avoid duplication of data structures. The synchronization between multiple threads is accomplished through atomic operations without high-level locking (also called lock-free programming). This improves the scaling behaviour of the code and substantially simplifies the implementation of the hybrid scheme. The result is an extremely flexible solution that adjusts to the number of available nodes, processors and memory, and consequently performs well on a wide variety of computing architectures.

  10. Parasitic momentum flux in the tokamak core

    DOE PAGES

    Stoltzfus-Dueck, T.

    2017-03-06

    A geometrical correction to the E × B drift causes an outward flux of co-current momentum whenever electrostatic potential energy is transferred to ion parallel flows. The robust, fully nonlinear symmetry breaking follows from the free-energy flow in phase space and does not depend on any assumed linear eigenmode structure. The resulting rotation peaking is counter-current and scales as temperature over plasma current. Lastly, this peaking mechanism can only act when fluctuations are low-frequency enough to excite ion parallel flows, which may explain some recent experimental observations related to rotation reversals.

  11. Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem

    DOE PAGES

    Shao, Meiyue; da Jornada, Felipe H.; Yang, Chao; ...

    2015-10-02

    The Bethe–Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from the discretized Bethe–Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. In this paper, we establish the equivalence between Bethe–Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure-preserving algorithms for a class of Bethe–Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm–Dancoff approximation are overestimated. In order to solve large-scale problems of practical interest, we discuss parallel implementations of our algorithms targeting distributed-memory systems. Finally, several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.

  12. Accelerating the Pace of Protein Functional Annotation With Intel Xeon Phi Coprocessors.

    PubMed

    Feinstein, Wei P; Moreno, Juana; Jarrell, Mark; Brylinski, Michal

    2015-06-01

    Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of eFindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of eFindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of eFindSite is freely available to the academic community at www.brylinski.org/efindsite.

  13. COHERENT EVENTS AND SPECTRAL SHAPE AT ION KINETIC SCALES IN THE FAST SOLAR WIND TURBULENCE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lion, Sonny; Alexandrova, Olga; Zaslavsky, Arnaud, E-mail: sonny.lion@obspm.fr

    2016-06-10

    In this paper we investigate spectral and phase coherence properties of magnetic fluctuations in the vicinity of the spectral transition from large, magnetohydrodynamic to sub-ion scales using in situ measurements of the Wind spacecraft in a fast stream. For the time interval investigated by Leamon et al. (1998), the phase coherence analysis shows the presence of sporadic quasi-parallel Alfvén ion cyclotron (AIC) waves as well as coherent structures in the form of large-amplitude, quasi-perpendicular Alfvén vortex-like structures and current sheets. These waves and structures contribute importantly to the observed power spectrum of magnetic fluctuations around ion scales; AIC waves contribute to the spectrum in a narrow frequency range, whereas the coherent structures contribute over a wide frequency band from the inertial range to the sub-ion frequency range. We conclude that a particular combination of waves and coherent structures determines the spectral shape of the magnetic field spectrum around ion scales. This phenomenon provides a possible explanation for the high variability of the magnetic power spectra around ion scales observed in the solar wind.

  14. Psychometric properties of the Haitian Creole version of the Resilience Scale with a sample of adult survivors of the 2010 earthquake.

    PubMed

    Cénat, Jude Mary; Derivois, Daniel; Hébert, Martine; Eid, Patricia; Mouchenik, Yoram

    2015-11-01

    Resilience is defined as the ability of people to cope with disasters and significant life adversities. The present paper investigates the underlying structure of the Haitian Creole version of the Resilience Scale and its psychometric properties using a sample of 1355 adult survivors of the 2010 earthquake, recruited from areas where the earthquake struck, with an average age of 31.57 years (SD = 14.42). A parallel analysis was conducted to determine the number of factors to extract, and confirmatory factor analysis was performed. All participants completed the Creole version of the Resilience Scale (RS), the Impact of Event Scale-Revised (IES-R), the Beck Depression Inventory (BDI) and the Social Support Questionnaire (SSQ-6). To facilitate exploratory (EFA) and confirmatory factor analysis (CFA), the sample was divided into two subsamples (subsample 1 for EFA and subsample 2 for CFA). Parallel analysis and confirmatory factor analysis results supported a good-fitting 3-factor structure. Cronbach's α was .79, .74 and .72 for factors 1, 2 and 3, respectively, and the factors were correlated with each other. Construct validity of the Resilience Scale was supported by significant correlations with measures of depression and social support satisfaction, but no correlation was found with the posttraumatic stress disorder measure, except for factor 2. The results reveal a different factorial structure comprising 25 items of the RS. Nevertheless, the Haitian Creole version of the RS is a valid and reliable measure for assessing resilience in adults in Haiti. Copyright © 2015 Elsevier Inc. All rights reserved.
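    Parallel analysis, the factor-retention method used here, keeps only factors whose sample eigenvalues exceed those obtained from random data of the same shape. A minimal sketch on simulated responses (the data are synthetic with one built-in common factor, not the survey data):

```python
import numpy as np

# Minimal sketch of Horn's parallel analysis on simulated item responses.
rng = np.random.default_rng(0)
n_obs, n_items, n_sims = 500, 10, 200

# simulated responses with one strong common factor plus item noise
factor = rng.normal(size=(n_obs, 1))
data = factor @ np.ones((1, n_items)) + rng.normal(size=(n_obs, n_items))

obs_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]

# mean eigenvalues of correlation matrices of pure-noise data, same shape
rand_eig = np.mean(
    [np.sort(np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(n_obs, n_items)).T)))[::-1]
     for _ in range(n_sims)], axis=0)

n_retain = int(np.sum(obs_eig > rand_eig))   # factors to retain
```

With a single planted factor the procedure retains exactly one, which is the kind of evidence the paper combines with CFA fit indices to settle on its 3-factor solution.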

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Janjusic, Tommy; Kartsaklis, Christos

    Memory scalability is an enduring problem and bottleneck that plagues many parallel codes. Parallel codes designed for high performance systems are typically developed over the span of several, and in some instances 10+, years. As a result, optimization practices which were appropriate for earlier systems may no longer be valid and thus require careful reconsideration. Specifically, parallel codes whose memory footprint is a function of their scalability must be carefully considered for future exa-scale systems. In this paper we present a methodology and tool to study the memory scalability of parallel codes. Using our methodology we evaluate an application's memory footprint as a function of scalability, which we coined memory efficiency, and describe our results. In particular, using our in-house tools we can pinpoint the specific application components which contribute to the application's overall memory footprint (application data structures, libraries, etc.).

  16. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit

    PubMed Central

    Pronk, Sander; Páll, Szilárd; Schulz, Roland; Larsson, Per; Bjelkmar, Pär; Apostolov, Rossen; Shirts, Michael R.; Smith, Jeremy C.; Kasson, Peter M.; van der Spoel, David; Hess, Berk; Lindahl, Erik

    2013-01-01

    Motivation: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed at massive scale in clusters, web servers, distributed computing or cloud resources. Results: Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations. Availability: GROMACS is an open source and free software available from http://www.gromacs.org. Contact: erik.lindahl@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23407358

  17. Chebyshev polynomial filtered subspace iteration in the discontinuous Galerkin method for large-scale electronic structure calculations

    DOE PAGES

    Banerjee, Amartya S.; Lin, Lin; Hu, Wei; ...

    2016-10-21

    The Discontinuous Galerkin (DG) electronic structure method employs an adaptive local basis (ALB) set to solve the Kohn-Sham equations of density functional theory in a discontinuous Galerkin framework. The adaptive local basis is generated on-the-fly to capture the local material physics and can systematically attain chemical accuracy with only a few tens of degrees of freedom per atom. A central issue for large-scale calculations, however, is the computation of the electron density (and subsequently, ground state properties) from the discretized Hamiltonian in an efficient and scalable manner. We show in this work how Chebyshev polynomial filtered subspace iteration (CheFSI) can be used to address this issue and push the envelope in large-scale materials simulations in a discontinuous Galerkin framework. We describe how the subspace filtering steps can be performed in an efficient and scalable manner using a two-dimensional parallelization scheme, thanks to the orthogonality of the DG basis set and the block-sparse structure of the DG Hamiltonian matrix. The on-the-fly nature of the ALB functions requires additional care in carrying out the subspace iterations. We demonstrate the parallel scalability of the DG-CheFSI approach in calculations of large-scale two-dimensional graphene sheets and bulk three-dimensional lithium-ion electrolyte systems. Employing 55,296 computational cores, the time per self-consistent field iteration for a sample of the bulk 3D electrolyte containing 8586 atoms is 90 s, and the time for a graphene sheet containing 11,520 atoms is 75 s.
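    The CheFSI idea itself can be sketched densely at toy scale: a Chebyshev polynomial of the Hamiltonian damps an unwanted spectral interval [a, b] and amplifies the occupied end, after which a Rayleigh-Ritz step extracts the eigenpairs. For simplicity the filter bounds below come from a direct solve; production codes estimate them instead, and the matrix and degree are toys:

```python
import numpy as np

# Dense toy sketch of Chebyshev filtered subspace iteration (CheFSI).
np.random.seed(2)
n, n_want, degree = 50, 5, 20
H = np.random.randn(n, n)
H = 0.5 * (H + H.T)

evals = np.linalg.eigvalsh(H)
a, b = evals[n_want], evals[-1]            # spectral interval to damp

def cheb_filter(V, deg, a, b):
    """Return T_deg(M) @ V for M = (H - c*I)/e mapping [a, b] to [-1, 1]."""
    e, c = (b - a) / 2.0, (b + a) / 2.0
    Y_prev, Y = V, (H @ V - c * V) / e
    for _ in range(2, deg + 1):
        Y_prev, Y = Y, 2.0 * (H @ Y - c * Y) / e - Y_prev
    return Y

V = np.linalg.qr(np.random.randn(n, n_want))[0]
for _ in range(50):
    V = np.linalg.qr(cheb_filter(V, degree, a, b))[0]   # filter + re-orthonormalize
    theta, S = np.linalg.eigh(V.T @ H @ V)              # Rayleigh-Ritz step
    V = V @ S

# theta now approximates the n_want lowest eigenvalues of H
```

The appeal for DG is that only matrix-vector (here matrix-block) products with H are needed, which parallelize well over the block-sparse DG Hamiltonian.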

  18. Parallel integer sorting with medium and fine-scale parallelism

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. Performance results are given for an implementation of queue-sort on a Connection Machine CM-2 and of barrel-sort on a 128-processor iPSC/860. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
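    The barrel-sort strategy can be sketched serially: each processor owns one contiguous key range (a "barrel"), keys are first distributed to their barrels, and each barrel is then sorted locally. A serial stand-in for the message-passing version (parameters are illustrative):

```python
# Sketch in the spirit of barrel-sort: range-partition, then sort locally.
def barrel_sort(keys, n_procs, key_max):
    width = (key_max + n_procs) // n_procs           # keys per barrel
    barrels = [[] for _ in range(n_procs)]
    for k in keys:                                   # "communication" phase
        barrels[k // width].append(k)
    out = []
    for b in barrels:                                # local sort phase
        out.extend(sorted(b))
    return out

data = [27, 3, 99, 54, 3, 71, 0, 42]
result = barrel_sort(data, n_procs=4, key_max=100)
```

In the parallel version the distribution loop becomes one bulk message exchange per processor pair, which is why the scheme suits machines with high per-message overhead.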

  19. Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

    NASA Astrophysics Data System (ADS)

    Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

    2017-01-01

    Multi-scale high-resolution modeling of the rock failure process is a powerful means in modern rock mechanics studies to reveal complex failure mechanisms and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation and damage to failure, places high demands on the design, implementation scheme, and computational capacity of the numerical software system. This study develops a parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator, capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties and to represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume elements by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows-Linux interactive platform. A numerical model is built to test the parallel performance of the FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, a field-scale net fracture spacing example, and an engineering-scale rock slope example. The simulation results indicate that relatively high speedup and computational efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In the laboratory-scale simulation, well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In the field-scale simulation, the formation of net fracture spacing, from initiation and propagation to saturation, can be revealed completely. In the engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.

  20. The influence of the self-consistent mode structure on the Coriolis pinch effect

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peeters, A. G.; Camenen, Y.; Casson, F. J.

    This paper discusses the effect of the mode structure on the Coriolis pinch effect [A. G. Peeters, C. Angioni, and D. Strintzi, Phys. Rev. Lett. 98, 265003 (2007)]. It is shown that the Coriolis drift effect can be compensated for by a finite parallel wave vector, resulting in a reduced momentum pinch velocity. Gyrokinetic simulations in full toroidal geometry reveal that parallel dynamics effectively removes the Coriolis pinch for the case of adiabatic electrons, while the compensation due to the parallel dynamics is incomplete for the case of kinetic electrons, resulting in a finite pinch velocity. The finite flux in the case of kinetic electrons is interpreted to be related to the electron trapping, which prevents a strong asymmetry in the electrostatic potential with respect to the low field side position. The physics picture developed here leads to the discovery and explanation of two unexpected effects: first, the pinch velocity scales with the trapped particle fraction (root of the inverse aspect ratio), and second, there is no strong collisionality dependence. The latter is related to the role of the trapped electrons, which retain some symmetry in the eigenmode, but play no role in the perturbed parallel velocity.

  1. Concurrent electromagnetic scattering analysis

    NASA Technical Reports Server (NTRS)

    Patterson, Jean E.; Cwik, Tom; Ferraro, Robert D.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Parker, Jay

    1989-01-01

    The computational power of the hypercube parallel computing architecture is applied to the solution of large-scale electromagnetic scattering and radiation problems. Three analysis codes have been implemented. A Hypercube Electromagnetic Interactive Analysis Workstation was developed to aid in the design and analysis of metallic structures such as antennas and to facilitate the use of these analysis codes. The workstation provides a general user environment for specification of the structure to be analyzed and graphical representations of the results.

  2. HACC: Extreme Scaling and Performance Across Diverse Architectures

    NASA Astrophysics Data System (ADS)

    Habib, Salman; Morozov, Vitali; Frontiere, Nicholas; Finkel, Hal; Pope, Adrian; Heitmann, Katrin

    2013-11-01

    Supercomputing is evolving towards hybrid and accelerator-based architectures with millions of cores. The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles. On Sequoia, we reach 13.94 PFlops (69.2% of peak) and 90% parallel efficiency on 1,572,864 cores, with 3.6 trillion particles, the largest cosmological benchmark yet performed. HACC design concepts are applicable to several other supercomputer applications.
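
The efficiency figures quoted above follow from the standard definitions of parallel efficiency; a small sketch of those definitions (illustrative helper functions, not part of HACC):

```python
def strong_scaling_efficiency(t_base, p_base, t, p):
    """Strong scaling: fixed total problem size; ideal time falls as 1/p.

    t_base is the runtime on p_base processors, t the runtime on p."""
    return (t_base * p_base) / (t * p)

def weak_scaling_efficiency(t_base, t):
    """Weak scaling: problem size grows with p; ideal runtime stays constant."""
    return t_base / t
```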

  3. Parallel Adaptive High-Order CFD Simulations Characterizing Cavity Acoustics for the Complete SOFIA Aircraft

    NASA Technical Reports Server (NTRS)

    Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

    2014-01-01

    This paper presents one-of-a-kind MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting of the telescope. A temporally fourth-order Runge-Kutta and spatially fifth-order WENO-5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement (AMR) framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32,000 cores and 4 billion cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregularities caused by the highly complex geometry. Limits to scaling beyond 32K cores are identified, and targeted code optimizations are discussed.

  4. Parallel-vector computation for structural analysis and nonlinear unconstrained optimization problems

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.

    1990-01-01

    Practical engineering applications can often be formulated as constrained optimization problems, for which several solution algorithms exist. One approach is to convert a constrained problem into a series of unconstrained problems; unconstrained solution algorithms can then be used as components of constrained solution algorithms. Structural optimization is an iterative process: starting from an initial design, a finite element structural analysis is performed to calculate the response of the system (such as displacements, stresses, and eigenvalues). Based on sensitivity information for the objective and constraint functions, an optimizer such as ADS or IDESIGN can be used to find a new, improved design. For the structural analysis phase, the equation solver for the system of simultaneous linear equations plays a key role, since it is needed for static, eigenvalue, and dynamic analysis alike. For practical, large-scale structural analysis-synthesis applications, computational time can be excessively large. Thus, it is necessary to have a new structural analysis-synthesis code which employs new solution algorithms to exploit both the parallel and vector capabilities offered by modern, high-performance computers such as the Convex, Cray-2, and Cray-YMP. The objective of this research project is, therefore, to incorporate the latest development in parallel-vector equation solvers, PVSOLVE, into a widely used finite-element production code, SAP-4. Furthermore, several nonlinear unconstrained optimization subroutines have also been developed and tested in a parallel computing environment. The unconstrained optimization subroutines are not only useful in their own right, but can also be incorporated into a more popular constrained optimization code, such as ADS.
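
As one concrete example of the kind of unconstrained routine such a code might contain (the abstract does not name the algorithms, so this sketch uses plain steepest descent with Armijo backtracking; all names are illustrative):

```python
import numpy as np

def minimize_unconstrained(f, grad, x0, tol=1e-8, max_iter=500):
    """Steepest descent with Armijo backtracking line search.

    A simple instance of the unconstrained solvers that a constrained
    optimizer can call as an inner routine."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # converged: gradient (near) zero
            break
        t, fx = 1.0, f(x)
        # backtrack until the Armijo sufficient-decrease condition holds
        while f(x - t * g) > fx - 1e-4 * t * (g @ g) and t > 1e-12:
            t *= 0.5
        x = x - t * g
    return x
```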

  5. Slip-parallel seismic lineations on the Northern Hayward Fault, California

    USGS Publications Warehouse

    Waldhauser, F.; Ellsworth, W.L.; Cole, A.

    1999-01-01

    A high-resolution relative earthquake location procedure is used to image the fine-scale seismicity structure of the northern Hayward fault, California. The seismicity defines a narrow, near-vertical fault zone containing horizontal alignments of hypocenters extending along the fault zone. The lineations persist over the 15-year observation interval, implying the localization of conditions on the fault where brittle failure conditions are met. The horizontal orientation of the lineations parallels the slip direction of the fault, suggesting that they are the result of the smearing of frictionally weak material along the fault plane over thousands of years.

  6. Large trench-parallel gravity variations predict seismogenic behavior in subduction zones.

    PubMed

    Song, Teh-Ru Alex; Simons, Mark

    2003-08-01

    We demonstrate that great earthquakes occur predominantly in regions with a strongly negative trench-parallel gravity anomaly (TPGA), whereas regions with strongly positive TPGA are relatively aseismic. These observations suggest that, over time scales up to at least 1 million years, spatial variations of seismogenic behavior within a given subduction zone are stationary and linked to the geological structure of the fore-arc. The correlations we observe are consistent with a model in which spatial variations in frictional properties on the plate interface control trench-parallel variations in fore-arc topography, gravity, and seismogenic behavior.

  7. Large-eddy simulations of compressible convection on massively parallel computers. [stellar physics

    NASA Technical Reports Server (NTRS)

    Xie, Xin; Toomre, Juri

    1993-01-01

    We report preliminary implementation of the large-eddy simulation (LES) technique in 2D simulations of compressible convection carried out on the CM-2 massively parallel computer. The convective flow fields in our simulations possess structures similar to those found in a number of direct simulations, with roll-like flows coherent across the entire depth of the layer that spans several density scale heights. Our detailed assessment of the effects of various subgrid scale (SGS) terms reveals that they may affect the gross character of convection. Yet, somewhat surprisingly, we find that our LES solutions, and another in which the SGS terms are turned off, only show modest differences. The resulting 2D flows realized here are rather laminar in character, and achieving substantial turbulence may require stronger forcing and less dissipation.
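
The abstract does not state which SGS closure was used, but the classic example of such a term is the Smagorinsky eddy viscosity, sketched here in 2D for illustration only:

```python
import numpy as np

def smagorinsky_nu_t(dudx, dudy, dvdx, dvdy, delta, cs=0.17):
    """Smagorinsky eddy viscosity nu_t = (Cs * delta)^2 * |S| in 2D.

    delta is the filter width, cs the Smagorinsky constant, and |S| the
    magnitude of the resolved strain-rate tensor."""
    sxx, syy = dudx, dvdy
    sxy = 0.5 * (dudy + dvdx)
    s_mag = np.sqrt(2.0 * (sxx ** 2 + syy ** 2 + 2.0 * sxy ** 2))
    return (cs * delta) ** 2 * s_mag
```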

  8. Performance of Extended Local Clustering Organization (LCO) for Large Scale Job-Shop Scheduling Problem (JSP)

    NASA Astrophysics Data System (ADS)

    Konno, Yohko; Suzuki, Keiji

    This paper describes an approach to development of a solution algorithm of a general-purpose for large scale problems using “Local Clustering Organization (LCO)” as a new solution for Job-shop scheduling problem (JSP). Using a performance effective large scale scheduling in the study of usual LCO, a solving JSP keep stability induced better solution is examined. In this study for an improvement of a performance of a solution for JSP, processes to a optimization by LCO is examined, and a scheduling solution-structure is extended to a new solution-structure based on machine-division. A solving method introduced into effective local clustering for the solution-structure is proposed as an extended LCO. An extended LCO has an algorithm which improves scheduling evaluation efficiently by clustering of parallel search which extends over plural machines. A result verified by an application of extended LCO on various scale of problems proved to conduce to minimizing make-span and improving on the stable performance.

  9. Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

    PubMed

    Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

    2016-04-01

    Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station; however, due to multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface is used to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
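
mpiWrapper itself is built on MPI, but the master-worker pattern it implements (farm independent subtasks out to workers, resubmit a subtask when it fails) can be sketched with a thread pool; the function and names below are illustrative, not mpiWrapper's API:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_task_farm(tasks, workers=4, retries=1):
    """Run independent subtasks on a worker pool, resubmitting failed ones.

    Thread-pool stand-in for an MPI master/worker scheme: `tasks` is a list
    of zero-argument callables; a subtask that raises is resubmitted up to
    `retries` times (cf. resubmission on node failure), else yields None."""
    results = {}
    attempts = {i: 0 for i in range(len(tasks))}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(t): i for i, t in enumerate(tasks)}
        while pending:
            for fut in as_completed(list(pending)):
                i = pending.pop(fut)
                try:
                    results[i] = fut.result()
                except Exception:
                    if attempts[i] < retries:          # resubmit failed subtask
                        attempts[i] += 1
                        pending[pool.submit(tasks[i])] = i
                    else:
                        results[i] = None
    return [results[i] for i in range(len(tasks))]
```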

  10. Space Technology 5 Multipoint Observations of Temporal and Spatial Variability of Field-Aligned Currents

    NASA Technical Reports Server (NTRS)

    Le, G.; Wang, Y.; Slavin, J. A.; Strangeway, R. L.

    2009-01-01

    Space Technology 5 (ST5) is a constellation mission consisting of three microsatellites. It provides the first multipoint magnetic field measurements in low Earth orbit, which enables us to separate spatial and temporal variations. In this paper, we present a study of the temporal variability of field-aligned currents using the ST5 data. We examine the field-aligned current observations during and after a geomagnetic storm and compare the magnetic field profiles at the three spacecraft. The multipoint data demonstrate that mesoscale current structures, commonly embedded within large-scale current sheets, are very dynamic, with highly variable current density and/or polarity on approx. 10 min time scales. On the other hand, the data also show that the time scales for the currents to be relatively stable are approx. 1 min for mesoscale currents and approx. 10 min for large-scale currents. These temporal features are very likely associated with dynamic variations of their charge carriers (mainly electrons) as they respond to variations of the parallel electric field in the auroral acceleration region. The characteristic time scales for the temporal variability of mesoscale field-aligned currents are found to be consistent with those of the auroral parallel electric field.

  11. Implementation and performance of FDPS: a framework for developing parallel particle simulation codes

    NASA Astrophysics Data System (ADS)

    Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

    2016-08-01

    We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which is necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some method to limit the calculation to neighbor particles is required. FDPS provides all of these functions, which are necessary for efficient parallel execution of particle-based simulations, as "templates" that are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N²) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10⁷) to 300 ms (N = 10⁹). These are currently limited by the time for the calculation of the domain decomposition and the communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
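
For illustration, the "simple, sequential and unoptimized" O(N²) kernel that a user might hand to such a framework could look like the following (a sketch in Python with G = 1 and Plummer softening; FDPS itself is a C++ template library and this is not its API):

```python
import numpy as np

def direct_gravity(pos, mass, eps=1e-3):
    """O(N^2) direct-summation gravitational acceleration (G = 1).

    pos is (N, 3), mass is (N,); eps is the Plummer softening length
    that regularizes close encounters."""
    dx = pos[None, :, :] - pos[:, None, :]          # pairwise separations
    r2 = (dx ** 2).sum(-1) + eps ** 2
    inv_r3 = r2 ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                   # no self-interaction
    return (dx * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)
```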

  12. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    PubMed Central

    Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

    2018-01-01

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
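
The MCL iteration that HipMCL parallelizes alternates expansion (matrix squaring) with inflation (elementwise powering and column renormalization). A minimal dense sketch, for illustration only (HipMCL operates on distributed sparse matrices, and the names here are not its API):

```python
import numpy as np

def mcl_labels(adj, inflation=2.0, iters=30):
    """Dense sketch of Markov Clustering on a symmetric adjacency matrix.

    Expansion spreads flow, inflation sharpens it; each node's cluster is
    read off from the attractor its column converges to."""
    M = np.asarray(adj, dtype=float) + np.eye(len(adj))  # add self-loops
    M /= M.sum(axis=0)                                   # column-stochastic
    for _ in range(iters):
        M = M @ M                                        # expansion
        M = M ** inflation                               # inflation
        M /= M.sum(axis=0)                               # renormalize columns
    return M.argmax(axis=0)                              # attractor per node
```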

  13. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    DOE PAGES

    Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

    2018-01-05

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.

  15. Shear fabrics reveal orogen-parallel deformations, NW Lesser Garhwal Himalaya, Uttarakhand, India

    NASA Astrophysics Data System (ADS)

    Biswas, T.; Bose, N.; Mukherjee, S.

    2017-12-01

    Shear deformation along the Himalayan belt is poorly understood, unlike that across the orogen. Field observations and structural analysis along the Bhagirathi river section, along National Highway 34, reveal that the NW Lesser Himalaya (Garhwal region, India) suffered both compression and extension parallel to the orogenic belt, and it thus forms a unique venue of great structural and tectonic interest. Meso-scale ductile and brittle shear fabrics, such as S-C, C-P, Y-P, and Y-S, are emphasized in describing such deformations. The extensional shear fabric strikes N43°E and the compressional shear fabric N39.5°E, both at a low angle to the orogenic trend. Our study reviews orogen-parallel deformation, both extension and compression, taking examples from other parts of the world (e.g., the Central Andes, the Northern Apennines, and the SW Alps) and from other terrains in the Himalaya. Proposed models are evaluated and compared with the study area. The results show that pre-existing remnant structures (e.g., the Delhi-Haridwar ridge) on the under-thrusting Indian shield/plate play a vital role in modifying thin-skinned tectonics, along with the migration of the eastward extrusion of the Tibetan plateau (hinterland deformation) into the Himalayan foreland.

  16. Coarse-grained component concurrency in Earth system modeling: parallelizing atmospheric radiative transfer in the GFDL AM3 model using the Flexible Modeling System coupling framework

    NASA Astrophysics Data System (ADS)

    Balaji, V.; Benson, Rusty; Wyman, Bruce; Held, Isaac

    2016-10-01

    Climate models represent a large variety of processes on a variety of timescales and space scales, a canonical example of multi-physics multi-scale modeling. Current hardware trends, such as Graphical Processing Units (GPUs) and Many Integrated Core (MIC) chips, are based on, at best, marginal increases in clock speed, coupled with vast increases in concurrency, particularly at the fine grain. Multi-physics codes face particular challenges in achieving fine-grained concurrency, as different physics and dynamics components have different computational profiles, and universal solutions are hard to come by. We propose here one approach for multi-physics codes. These codes are typically structured as components interacting via software frameworks. The component structure of a typical Earth system model consists of a hierarchical and recursive tree of components, each representing a different climate process or dynamical system. This recursive structure generally encompasses a modest level of concurrency at the highest level (e.g., atmosphere and ocean on different processor sets) with serial organization underneath. We propose to extend concurrency much further by running more and more lower- and higher-level components in parallel with each other. Each component can further be parallelized on the fine grain, potentially offering a major increase in the scalability of Earth system models. We present here first results from this approach, called coarse-grained component concurrency, or CCC. Within the Geophysical Fluid Dynamics Laboratory (GFDL) Flexible Modeling System (FMS), the atmospheric radiative transfer component has been configured to run in parallel with a composite component consisting of every other atmospheric component, including the atmospheric dynamics and all other atmospheric physics components. We will explore the algorithmic challenges involved in such an approach, and present results from such simulations. 
Plans to achieve even greater levels of coarse-grained concurrency by extending this approach within other components, such as the ocean, will be discussed.

  17. Astrophysical N-body Simulations Using Hierarchical Tree Data Structures

    NASA Astrophysics Data System (ADS)

    Warren, M. S.; Salmon, J. K.

    The authors report on recent large astrophysical N-body simulations executed on the Intel Touchstone Delta system. They review the astrophysical motivation and the numerical techniques and discuss steps taken to parallelize these simulations. The methods scale as O(N log N) for large values of N, and also scale linearly with the number of processors. The performance, sustained for a duration of 67 h, was between 5.1 and 5.4 Gflop/s on a 512-processor system.

  18. GSHR-Tree: a spatial index tree based on dynamic spatial slot and hash table in grid environments

    NASA Astrophysics Data System (ADS)

    Chen, Zhanlong; Wu, Xin-cai; Wu, Liang

    2008-12-01

    Computation Grids enable the coordinated sharing of large-scale distributed heterogeneous computing resources that can be used to solve computationally intensive problems in science, engineering, and commerce. Grid spatial applications are made possible by high-speed networks and a new generation of Grid middleware that resides between networks and traditional GIS applications. The integration of the multi-sources and heterogeneous spatial information and the management of the distributed spatial resources and the sharing and cooperative of the spatial data and Grid services are the key problems to resolve in the development of the Grid GIS. The performance of the spatial index mechanism is the key technology of the Grid GIS and spatial database affects the holistic performance of the GIS in Grid Environments. In order to improve the efficiency of parallel processing of a spatial mass data under the distributed parallel computing grid environment, this paper presents a new grid slot hash parallel spatial index GSHR-Tree structure established in the parallel spatial indexing mechanism. Based on the hash table and dynamic spatial slot, this paper has improved the structure of the classical parallel R tree index. The GSHR-Tree index makes full use of the good qualities of R-Tree and hash data structure. This paper has constructed a new parallel spatial index that can meet the needs of parallel grid computing about the magnanimous spatial data in the distributed network. This arithmetic splits space in to multi-slots by multiplying and reverting and maps these slots to sites in distributed and parallel system. Each sites constructs the spatial objects in its spatial slot into an R tree. On the basis of this tree structure, the index data was distributed among multiple nodes in the grid networks by using large node R-tree method. The unbalance during process can be quickly adjusted by means of a dynamical adjusting algorithm. 
This tree structure has considered the distributed operation, reduplication operation transfer operation of spatial index in the grid environment. The design of GSHR-Tree has ensured the performance of the load balance in the parallel computation. This tree structure is fit for the parallel process of the spatial information in the distributed network environments. Instead of spatial object's recursive comparison where original R tree has been used, the algorithm builds the spatial index by applying binary code operation in which computer runs more efficiently, and extended dynamic hash code for bit comparison. In GSHR-Tree, a new server is assigned to the network whenever a split of a full node is required. We describe a more flexible allocation protocol which copes with a temporary shortage of storage resources. It uses a distributed balanced binary spatial tree that scales with insertions to potentially any number of storage servers through splits of the overloaded ones. The application manipulates the GSHR-Tree structure from a node in the grid environment. The node addresses the tree through its image that the splits can make outdated. This may generate addressing errors, solved by the forwarding among the servers. In this paper, a spatial index data distribution algorithm that limits the number of servers has been proposed. We improve the storage utilization at the cost of additional messages. The structure of GSHR-Tree is believed that the scheme of this grid spatial index should fit the needs of new applications using endlessly larger sets of spatial data. Our proposal constitutes a flexible storage allocation method for a distributed spatial index. The insertion policy can be tuned dynamically to cope with periods of storage shortage. In such cases storage balancing should be favored for better space utilization, at the price of extra message exchanges between servers. 
    The structure strikes a compromise between updating the replicated index and transferring spatial index data. To meet the needs of Grid computing, GSHR-Tree has a flexible structure that can accommodate future requirements. It provides R-tree capabilities for large spatial datasets stored over interconnected servers. The analysis, including experiments, confirms the efficiency of our design choices. Using the system response time of a parallel spatial range query as the performance measure, the simulation results demonstrate the sound design and high performance of the proposed index structure.
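
    As a rough illustration of the grid-slot idea, the following Python sketch (a hypothetical simplification: the flat per-slot object lists stand in for the per-slot R-trees, and the dynamic rebalancing described above is omitted) maps objects to slots and slots to sites:

```python
# Minimal sketch of the grid-slot hash mapping behind a GSHR-Tree-like index.
# Assumptions (not from the paper): 2-D points, fixed slot size, static
# slot-to-server hashing, and plain lists instead of per-slot R-trees.

def slot_of(point, origin=(0.0, 0.0), slot_size=10.0):
    """Map a 2-D point to its integer grid-slot coordinate."""
    return (int((point[0] - origin[0]) // slot_size),
            int((point[1] - origin[1]) // slot_size))

def server_of(slot, n_servers):
    """Hash a slot to one of n_servers sites (static allocation)."""
    return hash(slot) % n_servers

class Site:
    """One storage server: holds the objects of the slots assigned to it."""
    def __init__(self):
        self.objects = {}               # slot -> list of (obj_id, point)

    def insert(self, slot, obj_id, point):
        self.objects.setdefault(slot, []).append((obj_id, point))

def insert_all(points, n_servers=4):
    """Distribute points over sites via their grid slot."""
    sites = [Site() for _ in range(n_servers)]
    for obj_id, p in enumerate(points):
        s = slot_of(p)
        sites[server_of(s, n_servers)].insert(s, obj_id, p)
    return sites
```

    A real implementation would replace the per-slot lists with R-trees and migrate overloaded slots between sites, as the abstract describes.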

  19. TURBULENCE-GENERATED PROTON-SCALE STRUCTURES IN THE TERRESTRIAL MAGNETOSHEATH

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vörös, Zoltán; Narita, Yasuhito; Yordanova, Emiliya

    2016-03-01

    Recent results of numerical magnetohydrodynamic simulations suggest that in collisionless space plasmas, turbulence can spontaneously generate thin current sheets. These coherent structures can partially explain the intermittency and the non-homogeneous distribution of localized plasma heating in turbulence. In this Letter, Cluster multi-point observations are used to investigate the distribution of magnetic field discontinuities and the associated small-scale current sheets in the terrestrial magnetosheath downstream of a quasi-parallel bow shock. It is shown experimentally, for the first time, that the strongest turbulence-generated current sheets occupy the long tails of probability distribution functions associated with extremal values of magnetic field partial derivatives. During the analyzed one-hour time interval, about a hundred strong discontinuities, possibly proton-scale current sheets, were observed.

  20. The connection-set algebra--a novel formalism for the representation of connectivity structure in neuronal network models.

    PubMed

    Djurfeldt, Mikael

    2012-07-01

    The connection-set algebra (CSA) is a novel and general formalism for the description of connectivity in neuronal network models, from small-scale to large-scale structure. The algebra provides operators to form more complex sets of connections from simpler ones and also provides parameterization of such sets. CSA is expressive enough to describe a wide range of connection patterns, including multiple types of random and/or geometrically dependent connectivity, and can serve as a concise notation for network structure in scientific writing. CSA implementations allow for scalable and efficient representation of connectivity in parallel neuronal network simulators and could even allow for avoiding explicit representation of connections in computer memory. The expressiveness of CSA makes prototyping of network structure easy. A C++ version of the algebra has been implemented and used in a large-scale neuronal network simulation (Djurfeldt et al., IBM J Res Dev 52(1/2):31-42, 2008b) and an implementation in Python has been publicly released.
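
    The operator style of CSA can be conveyed with a toy Python sketch; the class and operator names below are illustrative and are not the released CSA API. A connection set is modeled as a predicate over (source, target) pairs, so complex sets are built algebraically from simpler ones:

```python
# Toy connection-set algebra: sets as predicates over (source, target) pairs.
# Illustrative names only; the real CSA library defines its own operators.
import random

class ConnSet:
    def __init__(self, pred):
        self.pred = pred
    def __contains__(self, pair):
        return self.pred(*pair)
    def __add__(self, other):       # union of connection sets
        return ConnSet(lambda i, j: self.pred(i, j) or other.pred(i, j))
    def __mul__(self, other):       # intersection
        return ConnSet(lambda i, j: self.pred(i, j) and other.pred(i, j))
    def __sub__(self, other):       # difference
        return ConnSet(lambda i, j: self.pred(i, j) and not other.pred(i, j))

full = ConnSet(lambda i, j: True)           # all-to-all
oneToOne = ConnSet(lambda i, j: i == j)     # diagonal

def random_set(p, seed=0):
    """Random connectivity with probability p (memoized for consistency)."""
    rng, cache = random.Random(seed), {}
    def pred(i, j):
        if (i, j) not in cache:
            cache[(i, j)] = rng.random() < p
        return cache[(i, j)]
    return ConnSet(pred)

# e.g. random connectivity with self-connections removed:
net = random_set(0.1) - oneToOne
```

    The point of the formalism is that `net` is never enumerated: membership is evaluated lazily, which is what allows simulators to avoid explicit connection tables.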

  1. A novel milliliter-scale chemostat system for parallel cultivation of microorganisms in stirred-tank bioreactors.

    PubMed

    Schmideder, Andreas; Severin, Timm Steffen; Cremer, Johannes Heinrich; Weuster-Botz, Dirk

    2015-09-20

    A pH-controlled parallel stirred-tank bioreactor system was modified for parallel continuous cultivation on a 10 mL-scale by connecting multichannel peristaltic pumps for feeding and medium removal with micro-pipes (250 μm inner diameter). Parallel chemostat processes with Escherichia coli as an example showed high reproducibility with regard to culture volume and flow rates as well as dry cell weight, dissolved oxygen concentration and pH control at steady states (n=8, coefficient of variation <5%). Reliable estimation of kinetic growth parameters of E. coli was easily achieved within one parallel experiment by preselecting ten different steady states. Scalability of milliliter-scale steady state results was demonstrated by chemostat studies with a stirred-tank bioreactor on a liter-scale. Thus, parallel and continuously operated stirred-tank bioreactors on a milliliter-scale facilitate timesaving and cost reducing steady state studies with microorganisms. The applied continuous bioreactor system overcomes the drawbacks of existing miniaturized bioreactors, like poor mass transfer and insufficient process control. Copyright © 2015 Elsevier B.V. All rights reserved.
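
    The steady-state principle behind the parameter estimation can be sketched as follows: at chemostat steady state the specific growth rate equals the dilution rate D = F/V, so preselected steady states yield (D, S) pairs from which Monod parameters can be fitted. The linearized fit below is one common choice, not necessarily the method used by the authors:

```python
# Hedged sketch: fit Monod parameters (mu_max, Ks) from chemostat steady
# states, using the Lineweaver-Burk style linearization
#   1/D = (Ks/mu_max) * (1/S) + 1/mu_max
# D: dilution rates (1/h), S: residual substrate concentrations (g/L).

def fit_monod(D, S):
    x = [1.0 / s for s in S]
    y = [1.0 / d for d in D]
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    slope = (sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
             / sum((xi - xb) ** 2 for xi in x))
    intercept = yb - slope * xb
    mu_max = 1.0 / intercept
    Ks = slope * mu_max
    return mu_max, Ks
```

    With ten steady states per run, as in the paper, such a fit is heavily overdetermined, which is what makes a single parallel experiment sufficient.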

  2. Cellular automata with object-oriented features for parallel molecular network modeling.

    PubMed

    Zhu, Hao; Wu, Yinghui; Huang, Sui; Sun, Yan; Dhar, Pawan

    2005-06-01

    Cellular automata are an important modeling paradigm for studying the dynamics of large, parallel systems composed of multiple, interacting components. However, to model biological systems, cellular automata need to be extended beyond the large-scale parallelism and intensive communication in order to capture two fundamental properties characteristic of complex biological systems: hierarchy and heterogeneity. This paper proposes extensions to a cellular automata language, Cellang, to meet this purpose. The extended language, with object-oriented features, can be used to describe the structure and activity of parallel molecular networks within cells. Capabilities of this new programming language include object structure to define molecular programs within a cell, floating-point data type and mathematical functions to perform quantitative computation, message passing capability to describe molecular interactions, as well as new operators, statements, and built-in functions. We discuss relevant programming issues of these features, including the object-oriented description of molecular interactions with molecule encapsulation, message passing, and the description of heterogeneity and anisotropy at the cell and molecule levels. By enabling the integration of modeling at the molecular level with system behavior at cell, tissue, organ, or even organism levels, the program will help improve our understanding of how complex and dynamic biological activities are generated and controlled by parallel functioning of molecular networks. Index terms: cellular automata, modeling, molecular network, object-oriented.
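
    A minimal Python sketch (not Cellang syntax) of the object-and-message-passing idea: cells encapsulate molecular state and interact only through explicit messages to their neighbours, here a simple diffusion-like exchange on a 1-D lattice:

```python
# Illustrative object-oriented cellular automaton: cells hold a molecule
# concentration and exchange a fraction of it with lattice neighbours via
# messages, then update synchronously. Hypothetical simplification.

class Cell:
    def __init__(self, conc):
        self.conc = conc            # encapsulated molecular state
        self.inbox = []             # messages received this step

    def send(self, neighbor, amount):
        neighbor.inbox.append(amount)
        self.conc -= amount         # mass leaves the sender immediately

    def step(self):
        self.conc += sum(self.inbox)
        self.inbox.clear()

def simulate(concs, steps=1, rate=0.1):
    cells = [Cell(c) for c in concs]
    for _ in range(steps):
        for i, c in enumerate(cells):          # message-passing phase
            for j in (i - 1, i + 1):
                if 0 <= j < len(cells):
                    c.send(cells[j], rate * c.conc)
        for c in cells:                         # synchronous update phase
            c.step()
    return [c.conc for c in cells]
```

    Because every transfer is a send/receive pair, total molecule mass is conserved, mirroring how molecule encapsulation keeps state changes local to interactions.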

  3. A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nomura, K; Seymour, R; Wang, W

    2009-02-17

    A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based on hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e. petaflops·day of computing) is estimated as NT = 2.14 (e.g. N = 2.14 million atoms for T = 1 microsecond).
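
    The spatial-locality principle underlying the EDC framework can be illustrated with a generic linked-cell decomposition (not the authors' code): binning particles into cells no smaller than the interaction cutoff means each particle only checks the 27 surrounding cells, giving O(N) neighbour search:

```python
# Generic linked-cell neighbour search, the classic O(N) divide-and-conquer
# pattern behind cutoff-based atomistic simulations. Illustrative only.
from collections import defaultdict
from itertools import product

def build_cells(positions, cutoff):
    """Bin particle indices into cubic cells of edge >= cutoff."""
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(positions):
        cells[(int(x // cutoff), int(y // cutoff), int(z // cutoff))].append(idx)
    return cells

def neighbors_within(positions, cutoff):
    """Return sorted pairs (i, j), i < j, with distance < cutoff."""
    cells = build_cells(positions, cutoff)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j:
                        p, q = positions[i], positions[j]
                        if sum((a - b) ** 2 for a, b in zip(p, q)) < cutoff ** 2:
                            pairs.add((i, j))
    return sorted(pairs)
```

    The hierarchical cellular decomposition of the paper layers the same idea across nodes, cores, and threads.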

  4. Advances in Parallelization for Large Scale Oct-Tree Mesh Generation

    NASA Technical Reports Server (NTRS)

    O'Connell, Matthew; Karman, Steve L.

    2015-01-01

    Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
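
    The "top down" half of the hybrid approach can be sketched generically (a hypothetical simplification: the actual mesher works in parallel, uses oct-trees around full geometries, and also builds bottom-up):

```python
# Top-down octree refinement sketch: recursively split any cell containing a
# refinement point until a minimum cell size is reached. Illustrative only;
# cell = (x, y, z, size) with the cell spanning [x, x+size) in each axis.

def refine(cell, points, min_size):
    x, y, z, s = cell
    inside = [p for p in points
              if x <= p[0] < x + s and y <= p[1] < y + s and z <= p[2] < z + s]
    if not inside or s <= min_size:
        return [cell]                     # leaf: empty or fine enough
    half = s / 2.0
    leaves = []
    for dx in (0.0, half):                # split into 8 octants
        for dy in (0.0, half):
            for dz in (0.0, half):
                leaves += refine((x + dx, y + dy, z + dz, half),
                                 points, min_size)
    return leaves
```

    In the parallel setting, subtrees at a chosen level become independent units of work, which is what makes the oct-tree structure attractive for distributed generation.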

  5. Trace: a high-throughput tomographic reconstruction engine for large-scale datasets.

    PubMed

    Bicer, Tekin; Gürsoy, Doğa; Andrade, Vincent De; Kettimuthu, Rajkumar; Scullin, William; Carlo, Francesco De; Foster, Ian T

    2017-01-01

    Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used imaging techniques that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments result in rapid data generation, the analysis and reconstruction of the collected data may require hours or even days of computation time with a medium-sized workstation, which hinders the scientific progress that relies on the results of analysis. We present Trace, a data-intensive computing engine that we have developed to enable high-performance implementation of iterative tomographic reconstruction algorithms for parallel computers. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations that we apply to the replicated reconstruction objects and evaluate them using tomography datasets collected at the Advanced Photon Source. Our experimental evaluations show that our optimizations and parallelization techniques can provide 158× speedup using 32 compute nodes (384 cores) over a single-core configuration and decrease the end-to-end processing time of a large sinogram (with 4501 × 1 × 22,400 dimensions) from 12.5 h to <5 min per iteration. The proposed tomographic reconstruction engine can efficiently process large-scale tomographic data using many compute nodes and minimize reconstruction times.
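
    The replicated-reconstruction-object idea can be shown in miniature: each thread accumulates updates into a private replica of the reconstruction array, and the replicas are reduced afterwards, avoiding fine-grained synchronization (illustrative Python, not the Trace code):

```python
# Sketch of thread-level accumulation into replicated objects followed by a
# reduction, the lock-free pattern the paper's "replicated reconstruction
# object" optimizes. Hypothetical simplification of the real engine.
from concurrent.futures import ThreadPoolExecutor

def process(updates, n_bins, n_threads=4):
    """updates: list of (bin_index, value) contributions."""
    chunks = [updates[i::n_threads] for i in range(n_threads)]

    def worker(chunk):
        local = [0.0] * n_bins          # private replica: no locks needed
        for idx, val in chunk:
            local[idx] += val
        return local

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        replicas = pool.map(worker, chunks)

    result = [0.0] * n_bins             # reduction over replicas
    for rep in replicas:
        for i, v in enumerate(rep):
            result[i] += v
    return result
```

    The trade-off is memory (one replica per thread) for contention-free updates, which pays off when many threads hit the same reconstruction voxels.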

  6. From chemotaxis to the cognitive map: The function of olfaction

    PubMed Central

    Jacobs, Lucia F.

    2012-01-01

    A paradox of vertebrate brain evolution is the unexplained variability in the size of the olfactory bulb (OB), in contrast to other brain regions, which scale predictably with brain size. Such variability appears to be the result of selection for olfactory function, yet there is no obvious concordance that would predict the causal relationship between OB size and behavior. This discordance may derive from assuming the primary function of olfaction is odorant discrimination and acuity. If instead the primary function of olfaction is navigation, i.e., predicting odorant distributions in time and space, variability in absolute OB size could be ascribed and explained by variability in navigational demand. This olfactory spatial hypothesis offers a single functional explanation to account for patterns of olfactory system scaling in vertebrates, the primacy of olfaction in spatial navigation, even in visual specialists, and proposes an evolutionary scenario to account for the convergence in olfactory structure and function across protostomes and deuterostomes. In addition, the unique percepts of olfaction may organize odorant information in a parallel map structure. This could have served as a scaffold for the evolution of the parallel map structure of the mammalian hippocampus, and possibly the arthropod mushroom body, and offers an explanation for similar flexible spatial navigation strategies in arthropods and vertebrates. PMID:22723365

  7. Atomic scale structure and chemistry of interfaces by Z-contrast imaging and electron energy loss spectroscopy in the stem

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McGibbon, M.M.; Browning, N.D.; Chisholm, M.F.

    The macroscopic properties of many materials are controlled by the structure and chemistry at grain boundaries. A basic understanding of the structure-property relationship requires a technique which probes both composition and chemical bonding on an atomic scale. High-resolution Z-contrast imaging in the scanning transmission electron microscope (STEM) forms an incoherent image in which changes in atomic structure and composition across an interface can be interpreted directly without the need for preconceived atomic structure models. Since the Z-contrast image is formed by electrons scattered through high angles, parallel detection electron energy loss spectroscopy (PEELS) can be used simultaneously to provide complementary chemical information on an atomic scale. The fine structure in the PEEL spectra can be used to investigate the local electronic structure and the nature of the bonding across the interface. In this paper we use the complementary techniques of high resolution Z-contrast imaging and PEELS to investigate the atomic structure and chemistry of a 25° symmetric tilt boundary in a bicrystal of the electroceramic SrTiO3.

  8. Moose: An Open-Source Framework to Enable Rapid Development of Collaborative, Multi-Scale, Multi-Physics Simulation Tools

    NASA Astrophysics Data System (ADS)

    Slaughter, A. E.; Permann, C.; Peterson, J. W.; Gaston, D.; Andrs, D.; Miller, J.

    2014-12-01

    The Idaho National Laboratory (INL)-developed Multiphysics Object Oriented Simulation Environment (MOOSE; www.mooseframework.org), is an open-source, parallel computational framework for enabling the solution of complex, fully implicit multiphysics systems. MOOSE provides a set of computational tools that scientists and engineers can use to create sophisticated multiphysics simulations. Applications built using MOOSE have computed solutions for chemical reaction and transport equations, computational fluid dynamics, solid mechanics, heat conduction, mesoscale materials modeling, geomechanics, and others. To facilitate the coupling of diverse and highly-coupled physical systems, MOOSE employs the Jacobian-free Newton-Krylov (JFNK) method when solving the coupled nonlinear systems of equations arising in multiphysics applications. The MOOSE framework is written in C++, and leverages other high-quality, open-source scientific software packages such as LibMesh, Hypre, and PETSc. MOOSE uses a "hybrid parallel" model which combines both shared memory (thread-based) and distributed memory (MPI-based) parallelism to ensure efficient resource utilization on a wide range of computational hardware. MOOSE-based applications are inherently modular, which allows for simulation expansion (via coupling of additional physics modules) and the creation of multi-scale simulations. Any application developed with MOOSE supports running (in parallel) any other MOOSE-based application. Each application can be developed independently, yet easily communicate with other applications (e.g., conductivity in a slope-scale model could be a constant input, or a complete phase-field micro-structure simulation) without additional code being written. This method of development has proven effective at INL and expedites the development of sophisticated, sustainable, and collaborative simulation tools.

  9. Coloration principles of nymphaline butterflies - thin films, melanin, ommochromes and wing scale stacking.

    PubMed

    Stavenga, Doekele G; Leertouwer, Hein L; Wilts, Bodo D

    2014-06-15

    The coloration of the common butterflies Aglais urticae (small tortoiseshell), Aglais io (peacock) and Vanessa atalanta (red admiral), belonging to the butterfly subfamily Nymphalinae, is due to the species-specific patterning of differently coloured scales on their wings. We investigated the scales' structural and pigmentary properties by applying scanning electron microscopy, (micro)spectrophotometry and imaging scatterometry. The anatomy of the wing scales appears to be basically identical, with an approximately flat lower lamina connected by trabeculae to a highly structured upper lamina, which consists of an array of longitudinal, parallel ridges and transversal crossribs. Isolated scales observed at the abwing (upper) side are blue, yellow, orange, red, brown or black, depending on their pigmentation. The yellow, orange and red scales contain various amounts of 3-OH-kynurenine and ommochrome pigment, black scales contain a high density of melanin, and blue scales have a minor amount of melanin pigment. Observing the scales from their adwing (lower) side always revealed a structural colour, which is blue in the case of blue, red and black scales, but orange for orange scales. The structural colours are created by the lower lamina, which acts as an optical thin film. Its reflectance spectrum, crucially determined by the lamina thickness, appears to be well tuned to the scales' pigmentary spectrum. The colours observed locally on the wing are also due to the degree of scale stacking. Thin films, tuned pigments and combinations of stacked scales together determine the wing coloration of nymphaline butterflies. © 2014. Published by The Company of Biologists Ltd.
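
    The thin-film mechanism can be made concrete with a standard Airy reflectance calculation for a free-standing film in air at normal incidence (illustrative; the refractive index n ≈ 1.56 for chitin and the thickness below are assumed values, not figures from the paper):

```python
# Airy reflectance of a single free-standing thin film in air at normal
# incidence: r = (r12 + r23*exp(2i*beta)) / (1 + r12*r23*exp(2i*beta)),
# beta = 2*pi*n*d/lambda. Assumed chitin index n = 1.56 (not from the paper).
import cmath
import math

def film_reflectance(d_nm, lam_nm, n=1.56):
    r12 = (1 - n) / (1 + n)       # air -> chitin Fresnel coefficient
    r23 = -r12                    # chitin -> air
    beta = 2 * math.pi * n * d_nm / lam_nm
    phase = cmath.exp(2j * beta)
    r = (r12 + r23 * phase) / (1 + r12 * r23 * phase)
    return abs(r) ** 2
```

    Reflectance peaks where 2nd equals an odd multiple of a half wavelength and vanishes where it equals a whole multiple, which is why the lamina thickness tunes the structural colour the abstract describes.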

  10. Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software

    DTIC Science & Technology

    2015-08-01

    Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software, by N Scott Weingarten (Weapons and Materials Research Directorate, ARL) and James P Larentzos (Engility). Approved for public release.

  11. A parallel-vector algorithm for rapid structural analysis on high-performance computers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

    1990-01-01

    A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers demonstrate the accuracy and speed of the method.
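
    The core factorization can be sketched compactly (plain column-oriented Choleski in Python; the paper's variable-band storage, loop unrolling, and parallel outer loop are described in the text but omitted here):

```python
# Dense column-oriented Choleski factorization A = L * L^T for a symmetric
# positive-definite matrix. In a skyline/variable-band code the inner sums
# would start at each column's height instead of 0, skipping known zeros.
import math

def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):                         # one column at a time
        s = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):              # entries below the diagonal
            L[i][j] = (A[i][j]
                       - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L
```

    Once L is available, the system A x = b reduces to two triangular solves, which is what makes the method attractive for repeated structural analyses with many load cases.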

  13. MOOSE: A PARALLEL COMPUTATIONAL FRAMEWORK FOR COUPLED SYSTEMS OF NONLINEAR EQUATIONS.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    G. Hansen; C. Newman; D. Gaston

    Systems of coupled, nonlinear partial differential equations often arise in simulation of nuclear processes. MOOSE: Multiphysics Object Oriented Simulation Environment, a parallel computational framework targeted at solving these systems is presented. As opposed to traditional data-flow oriented computational frameworks, MOOSE is instead founded on mathematics based on Jacobian-free Newton-Krylov (JFNK). Utilizing the mathematical structure present in JFNK, physics are modularized into “Kernels” allowing for rapid production of new simulation tools. In addition, systems are solved fully coupled and fully implicit employing physics based preconditioning allowing for a large amount of flexibility even with large variance in time scales. Background on the mathematics, an inspection of the structure of MOOSE and several representative solutions from applications built on the framework are presented.

  14. MOOSE: A parallel computational framework for coupled systems of nonlinear equations.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Derek Gaston; Chris Newman; Glen Hansen

    Systems of coupled, nonlinear partial differential equations (PDEs) often arise in simulation of nuclear processes. MOOSE: Multiphysics Object Oriented Simulation Environment, a parallel computational framework targeted at the solution of such systems, is presented. As opposed to traditional data-flow oriented computational frameworks, MOOSE is instead founded on the mathematical principle of Jacobian-free Newton-Krylov (JFNK) solution methods. Utilizing the mathematical structure present in JFNK, physics expressions are modularized into 'Kernels,' allowing for rapid production of new simulation tools. In addition, systems are solved implicitly and fully coupled, employing physics based preconditioning, which provides great flexibility even with large variance in time scales. A summary of the mathematics, an overview of the structure of MOOSE, and several representative solutions from applications built on the framework are presented.
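
    The key JFNK ingredient is that the Jacobian is never formed explicitly: its action on a vector is approximated by a finite difference, J(u)v ≈ (F(u + εv) − F(u))/ε. The sketch below illustrates this for a tiny 2-equation system (illustrative, not MOOSE code; a real JFNK solver hands J·v directly to a Krylov method such as GMRES instead of reconstructing J column by column as done here for simplicity):

```python
# Matrix-free Jacobian-vector product and a Newton iteration for a
# 2-equation nonlinear system F(u) = 0. Hypothetical minimal example.

def jv(F, u, v, eps=1e-7):
    """Approximate J(u)*v by the finite difference (F(u+eps*v)-F(u))/eps."""
    Fu = F(u)
    up = [ui + eps * vi for ui, vi in zip(u, v)]
    return [(a - b) / eps for a, b in zip(F(up), Fu)]

def newton_2x2(F, u, iters=25):
    """Newton for n = 2, with the Jacobian known only through jv()."""
    for _ in range(iters):
        c0 = jv(F, u, [1.0, 0.0])   # column 0 of J (cheap only for tiny n)
        c1 = jv(F, u, [0.0, 1.0])   # column 1 of J
        a, c = c0                   # J = [[a, b], [c, d]]
        b, d = c1
        f0, f1 = F(u)
        det = a * d - b * c
        # solve J * du = -F(u) by Cramer's rule
        du0 = (-f0 * d + f1 * b) / det
        du1 = (f0 * c - f1 * a) / det
        u = [u[0] + du0, u[1] + du1]
    return u
```

    Because only F evaluations are needed, each physics "Kernel" contributes its residual without ever assembling Jacobian entries, which is what makes the modularization described above possible.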

  15. Research of the effectiveness of parallel multithreaded realizations of interpolation methods for scaling raster images

    NASA Astrophysics Data System (ADS)

    Vnukov, A. A.; Shershnev, M. B.

    2018-01-01

    The aim of this work is the software implementation of three image scaling algorithms using parallel computations, along with a graphical Windows application that demonstrates the algorithms and allows studying the relationship between system performance, algorithm execution time, and the degree of parallelization. Three interpolation methods were studied, formalized, and adapted to image scaling. The result is a program that scales images by the different methods and compares the quality of their output.
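
    One of the commonly studied interpolation methods, bilinear scaling, can be sketched as follows (serial reference version assuming an input of at least 2×2 pixels; in parallel variants the outer row loop is the natural unit to distribute across threads):

```python
# Bilinear scaling of a grayscale raster stored as a list of rows.
# Illustrative serial sketch; assumes the input image is at least 2x2.

def scale_bilinear(img, new_w, new_h):
    h, w = len(img), len(img[0])
    out = []
    for j in range(new_h):                     # rows: the parallelizable loop
        y = j * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = min(int(y), h - 2)
        fy = y - y0
        row = []
        for i in range(new_w):
            x = i * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = min(int(x), w - 2)
            fx = x - x0
            # blend the four surrounding pixels by their fractional offsets
            row.append(img[y0][x0] * (1 - fx) * (1 - fy)
                       + img[y0][x0 + 1] * fx * (1 - fy)
                       + img[y0 + 1][x0] * (1 - fx) * fy
                       + img[y0 + 1][x0 + 1] * fx * fy)
        out.append(row)
    return out
```

    Because each output row depends only on the input image, rows can be computed independently, which is why the speedup study varies the number of worker threads.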

  16. Alignment between Protostellar Outflows and Filamentary Structure

    NASA Astrophysics Data System (ADS)

    Stephens, Ian W.; Dunham, Michael M.; Myers, Philip C.; Pokhrel, Riwaj; Sadavoy, Sarah I.; Vorobyov, Eduard I.; Tobin, John J.; Pineda, Jaime E.; Offner, Stella S. R.; Lee, Katherine I.; Kristensen, Lars E.; Jørgensen, Jes K.; Goodman, Alyssa A.; Bourke, Tyler L.; Arce, Héctor G.; Plunkett, Adele L.

    2017-09-01

    We present new Submillimeter Array (SMA) observations of CO(2-1) outflows toward young, embedded protostars in the Perseus molecular cloud as part of the Mass Assembly of Stellar Systems and their Evolution with the SMA (MASSES) survey. For 57 Perseus protostars, we characterize the orientation of the outflow angles and compare them with the orientation of the local filaments as derived from Herschel observations. We find that the relative angles between outflows and filaments are inconsistent with purely parallel or purely perpendicular distributions. Instead, the observed distribution of outflow-filament angles are more consistent with either randomly aligned angles or a mix of projected parallel and perpendicular angles. A mix of parallel and perpendicular angles requires perpendicular alignment to be more common by a factor of ˜3. Our results show that the observed distributions probably hold regardless of the protostar’s multiplicity, age, or the host core’s opacity. These observations indicate that the angular momentum axis of a protostar may be independent of the large-scale structure. We discuss the significance of independent protostellar rotation axes in the general picture of filament-based star formation.

  17. Schnek: A C++ library for the development of parallel simulation codes on regular grids

    NASA Astrophysics Data System (ADS)

    Schmitz, Holger

    2018-05-01

    A large number of algorithms across the field of computational physics are formulated on grids with a regular topology. We present Schnek, a library that enables fast development of parallel simulations on regular grids. Schnek contains a number of easy-to-use modules that greatly reduce the amount of administrative code for large-scale simulation codes. The library provides an interface for reading simulation setup files with a hierarchical structure. The structure of the setup file is translated into a hierarchy of simulation modules that the developer can specify. The reader parses and evaluates mathematical expressions and initialises variables or grid data. This enables developers to write modular and flexible simulation codes with minimal effort. Regular grids of arbitrary dimension are defined as well as mechanisms for defining physical domain sizes, grid staggering, and ghost cells on these grids. Ghost cells can be exchanged between neighbouring processes using MPI with a simple interface. The grid data can easily be written into HDF5 files using serial or parallel I/O.
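
    The ghost-cell exchange can be illustrated with a serial stand-in (illustrative Python, not Schnek's C++ API): each subdomain carries one ghost cell per side, filled from the neighbouring subdomain; under MPI the two copy statements below become send/receive pairs between neighbouring ranks:

```python
# Serial sketch of a periodic 1-D ghost-cell exchange. Each subdomain is a
# list [ghost_lo, interior..., ghost_hi]; ghosts mirror the neighbour's
# boundary interior cells. Hypothetical simplification of an MPI exchange.

def exchange_ghosts(subdomains):
    n = len(subdomains)
    for r, sub in enumerate(subdomains):
        left = subdomains[(r - 1) % n]         # periodic neighbours
        right = subdomains[(r + 1) % n]
        sub[0] = left[-2]      # low ghost  <- neighbour's last interior cell
        sub[-1] = right[1]     # high ghost <- neighbour's first interior cell
    return subdomains
```

    After the exchange, stencil updates on interior cells need no further communication until the next step, which is the pattern the library wraps behind a simple interface.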

  18. Scaling Irregular Applications through Data Aggregation and Software Multithreading

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morari, Alessandro; Tumeo, Antonino; Chavarría-Miranda, Daniel

    Bioinformatics, data analytics, semantic databases, knowledge discovery are emerging high performance application areas that exploit dynamic, linked data structures such as graphs, unbalanced trees or unstructured grids. These data structures usually are very large, requiring significantly more memory than available on single shared memory systems. Additionally, these data structures are difficult to partition on distributed memory systems. They also present poor spatial and temporal locality, thus generating unpredictable memory and network accesses. The Partitioned Global Address Space (PGAS) programming model seems suitable for these applications, because it allows using a shared memory abstraction across distributed-memory clusters. However, current PGAS languages and libraries are built to target regular remote data accesses and block transfers. Furthermore, they usually rely on the Single Program Multiple Data (SPMD) parallel control model, which is not well suited to the fine grained, dynamic and unbalanced parallelism of irregular applications. In this paper we present GMT (Global Memory and Threading library), a custom runtime library that enables efficient execution of irregular applications on commodity clusters. GMT integrates a PGAS data substrate with simple fork/join parallelism and provides automatic load balancing on a per node basis. It implements multi-level aggregation and lightweight multithreading to maximize memory and network bandwidth with fine-grained data accesses and tolerate long data access latencies. A key innovation in the GMT runtime is its thread specialization (workers, helpers and communication threads) that realizes the overall functionality. We compare our approach with other PGAS models, such as UPC running using GASNet, and hand-optimized MPI code on a set of typical large-scale irregular applications, demonstrating speedups of an order of magnitude.

  19. Spontaneous Hot Flow Anomalies at Quasi-Parallel Shocks: 2. Hybrid Simulations

    NASA Technical Reports Server (NTRS)

    Omidi, N.; Zhang, H.; Sibeck, D.; Turner, D.

    2013-01-01

    Motivated by recent THEMIS observations, this paper uses 2.5-D electromagnetic hybrid simulations to investigate the formation of Spontaneous Hot Flow Anomalies (SHFA) upstream of quasi-parallel bow shocks during steady solar wind conditions and in the absence of discontinuities. The results show the formation of a large number of structures along and upstream of the quasi-parallel bow shock. Their outer edges exhibit density and magnetic field enhancements, while their cores exhibit drops in density, magnetic field, solar wind velocity and enhancements in ion temperature. Using virtual spacecraft in the simulation, we show that the signatures of these structures in the time series data are very similar to those of SHFAs seen in THEMIS data and conclude that they correspond to SHFAs. Examination of the simulation data shows that SHFAs form as the result of foreshock cavitons interacting with the bow shock. Foreshock cavitons in turn form due to the nonlinear evolution of ULF waves generated by the interaction of the solar wind with the backstreaming ions. Because foreshock cavitons are an inherent part of the shock dissipation process, the formation of SHFAs is also an inherent part of the dissipation process leading to a highly non-uniform plasma in the quasi-parallel magnetosheath including large scale density and magnetic field cavities.

  20. MMS Observations and Hybrid Simulations of Surface Ripples at a Marginally Quasi-Parallel Shock

    NASA Astrophysics Data System (ADS)

    Gingell, Imogen; Schwartz, Steven J.; Burgess, David; Johlander, Andreas; Russell, Christopher T.; Burch, James L.; Ergun, Robert E.; Fuselier, Stephen; Gershman, Daniel J.; Giles, Barbara L.; Goodrich, Katherine A.; Khotyaintsev, Yuri V.; Lavraud, Benoit; Lindqvist, Per-Arne; Strangeway, Robert J.; Trattner, Karlheinz; Torbert, Roy B.; Wei, Hanying; Wilder, Frederick

    2017-11-01

    Simulations and observations of collisionless shocks have shown that deviations of the nominal local shock normal orientation, that is, surface waves or ripples, are expected to propagate in the ramp and overshoot of quasi-perpendicular shocks. Here we identify signatures of a surface ripple propagating during a crossing of Earth's marginally quasi-parallel (θBn ≈ 45°) or quasi-parallel bow shock on 27 November 2015 06:01:44 UTC by the Magnetospheric Multiscale (MMS) mission and determine the ripple's properties using multispacecraft methods. Using two-dimensional hybrid simulations, we confirm that surface ripples are a feature of marginally quasi-parallel and quasi-parallel shocks under the observed solar wind conditions. In addition, since these marginally quasi-parallel and quasi-parallel shocks are expected to undergo a cyclic reformation of the shock front, we discuss the impact of multiple sources of nonstationarity on shock structure. Importantly, ripples are shown to be transient phenomena, developing faster than an ion gyroperiod and only during the period of the reformation cycle when a newly developed shock ramp is unaffected by turbulence in the foot. We conclude that the change in properties of the ripple observed by MMS is consistent with the reformation of the shock front over a time scale of an ion gyroperiod.

  1. Block-Parallel Data Analysis with DIY2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morozov, Dmitriy; Peterka, Tom

DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.
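The block-structured data parallelism described above can be illustrated with a minimal serial sketch. All names here are illustrative Python stand-ins; DIY2 itself is a C++ library with MPI-backed communication patterns.

```python
# Minimal sketch of block-structured data parallelism in the DIY2 style:
# data are decomposed into blocks, computation is an iteration over
# blocks, and communication follows a reusable pattern (here: a simple
# all-reduce of per-block results). Names are illustrative.

def decompose(data, nblocks):
    """Split a sequence into contiguous blocks."""
    size = len(data) // nblocks
    return [data[i * size:(i + 1) * size] for i in range(nblocks)]

def local_compute(block):
    """Per-block computation (here: a running sum)."""
    return sum(block)

def exchange(partials):
    """Communication pattern: reduce the per-block partial results."""
    return sum(partials)

data = list(range(16))
blocks = decompose(data, nblocks=4)            # blocks -> processing elements
partials = [local_compute(b) for b in blocks]  # iterate over blocks
total = exchange(partials)                     # communicate between blocks
```

Because the computation is expressed only in terms of blocks, a runtime is free to execute the per-block loop with threads, across processes, or out-of-core, which is the flexibility the abstract describes.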

  2. Polarization-sensitive color in butterfly scales: polarization conversion from ridges with reflecting elements.

    PubMed

    Zhang, Ke; Tang, Yiwen; Meng, Jinsong; Wang, Ge; Zhou, Han; Fan, Tongxiang; Zhang, Di

    2014-11-03

    Polarization-sensitive color originates from polarization-dependent reflection or transmission, exhibiting abundant light information, including intensity, spectral distribution, and polarization. A wide range of butterflies are physiologically sensitive to polarized light, but the origins of polarized signal have not been fully understood. Here we systematically investigate the colorful scales of six species of butterfly to reveal the physical origins of polarization-sensitive color. Microscopic optical images under crossed polarizers exhibit their polarization-sensitive characteristic, and micro-structural characterizations clarify their structural commonality. In the case of the structural scales that have deep ridges, the polarization-sensitive color related with scale azimuth is remarkable. Periodic ridges lead to the anisotropic effective refractive indices in the parallel and perpendicular grating orientations, which achieves form-birefringence, resulting in the phase difference of two different component polarized lights. Simulated results show that ridge structures with reflecting elements reflect and rotate the incident p-polarized light into s-polarized light. The dimensional parameters and shapes of grating greatly affect the polarization conversion process, and the triangular deep grating extends the outstanding polarization conversion effect from the sub-wavelength period to the period comparable to visible light wavelength. The parameters of ridge structures in butterfly scales have been optimized to fulfill the polarization-dependent reflection for secret communication. The structural and physical origin of polarization conversion provides a more comprehensive perspective on the creation of polarization-sensitive color in butterfly wing scales. These findings show great potential in anti-counterfeiting technology and advanced optical material design.
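The form-birefringence mechanism described above introduces a phase difference Δφ = 2πdΔn/λ between light polarized parallel and perpendicular to the ridges; when Δφ reaches π (half-wave retardation), incident p-polarized light is converted to s-polarized light. A sketch with illustrative numbers (not measured butterfly-scale values):

```python
import math

# Phase difference from form birefringence: a ridge of height d with
# effective indices n_par (parallel to ridges) and n_perp introduces
#   delta_phi = 2*pi * d * (n_par - n_perp) / wavelength.
# The dimensions and indices below are illustrative only.

def phase_difference(d_nm, n_par, n_perp, wavelength_nm):
    return 2 * math.pi * d_nm * (n_par - n_perp) / wavelength_nm

dphi = phase_difference(d_nm=2500, n_par=1.25, n_perp=1.15, wavelength_nm=500)
half_wave = abs(dphi - math.pi) < 1e-9   # half-wave -> p-to-s conversion
```

With d·Δn equal to half the wavelength, the ridge acts as a half-wave retarder, which is the polarization-conversion condition discussed in the abstract.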

  3. Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, George; Marquez, Andres; Choudhury, Sutanay

    2012-09-01

Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.
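The triad census described above can be sketched in miniature. The version below is simplified: it classifies undirected triads by how many of their three possible edges are present, whereas the full directed census distinguishes 16 isomorphism classes; the counting structure (enumerate every 3-node subset, classify, tally) is the same.

```python
from itertools import combinations

# Simplified triad census for an undirected graph: classify every
# 3-node subset by the number of edges present among it (0..3).
# The full directed census has 16 classes but the same loop structure.

def triad_census(nodes, edges):
    edge_set = {frozenset(e) for e in edges}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    for triple in combinations(nodes, 3):
        k = sum(1 for pair in combinations(triple, 2)
                if frozenset(pair) in edge_set)
        census[k] += 1
    return census

# A 4-node path graph 0-1-2-3 has four triads: two with one edge,
# two with two edges.
census = triad_census([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
```

The O(n³) subset enumeration is exactly why census algorithms stop scaling at tens of millions of nodes, motivating the shared-memory parallelization the paper develops.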

  4. In situ patterned micro 3D liver constructs for parallel toxicology testing in a fluidic device

    PubMed Central

    Skardal, Aleksander; Devarasetty, Mahesh; Soker, Shay; Hall, Adam R

    2017-01-01

    3D tissue models are increasingly being implemented for drug and toxicology testing. However, the creation of tissue-engineered constructs for this purpose often relies on complex biofabrication techniques that are time consuming, expensive, and difficult to scale up. Here, we describe a strategy for realizing multiple tissue constructs in a parallel microfluidic platform using an approach that is simple and can be easily scaled for high-throughput formats. Liver cells mixed with a UV-crosslinkable hydrogel solution are introduced into parallel channels of a sealed microfluidic device and photopatterned to produce stable tissue constructs in situ. The remaining uncrosslinked material is washed away, leaving the structures in place. By using a hydrogel that specifically mimics the properties of the natural extracellular matrix, we closely emulate native tissue, resulting in constructs that remain stable and functional in the device during a 7-day culture time course under recirculating media flow. As proof of principle for toxicology analysis, we expose the constructs to ethyl alcohol (0–500 mM) and show that the cell viability and the secretion of urea and albumin decrease with increasing alcohol exposure, while markers for cell damage increase. PMID:26355538

  5. In situ patterned micro 3D liver constructs for parallel toxicology testing in a fluidic device.

    PubMed

    Skardal, Aleksander; Devarasetty, Mahesh; Soker, Shay; Hall, Adam R

    2015-09-10

    3D tissue models are increasingly being implemented for drug and toxicology testing. However, the creation of tissue-engineered constructs for this purpose often relies on complex biofabrication techniques that are time consuming, expensive, and difficult to scale up. Here, we describe a strategy for realizing multiple tissue constructs in a parallel microfluidic platform using an approach that is simple and can be easily scaled for high-throughput formats. Liver cells mixed with a UV-crosslinkable hydrogel solution are introduced into parallel channels of a sealed microfluidic device and photopatterned to produce stable tissue constructs in situ. The remaining uncrosslinked material is washed away, leaving the structures in place. By using a hydrogel that specifically mimics the properties of the natural extracellular matrix, we closely emulate native tissue, resulting in constructs that remain stable and functional in the device during a 7-day culture time course under recirculating media flow. As proof of principle for toxicology analysis, we expose the constructs to ethyl alcohol (0-500 mM) and show that the cell viability and the secretion of urea and albumin decrease with increasing alcohol exposure, while markers for cell damage increase.

  6. Shift-and-invert parallel spectral transformation eigensolver: Massively parallel performance for density-functional based tight-binding

    DOE PAGES

    Zhang, Hong; Zapol, Peter; Dixon, David A.; ...

    2015-11-17

The Shift-and-invert parallel spectral transformations (SIPs), a computational approach to solve sparse eigenvalue problems, is developed for massively parallel architectures with exceptional parallel scalability and robustness. The capabilities of SIPs are demonstrated by diagonalization of density-functional based tight-binding (DFTB) Hamiltonian and overlap matrices for single-wall metallic carbon nanotubes, diamond nanowires, and bulk diamond crystals. The largest (smallest) example studied is a 128,000 (2000) atom nanotube for which ~330,000 (~5600) eigenvalues and eigenfunctions are obtained in ~190 (~5) seconds when parallelized over 266,144 (16,384) Blue Gene/Q cores. Weak scaling and strong scaling of SIPs are analyzed and the performance of SIPs is compared with other novel methods. Different matrix ordering methods are investigated to reduce the cost of the factorization step, which dominates the time-to-solution at the strong scaling limit. As a result, a parallel implementation of assembling the density matrix from the distributed eigenvectors is demonstrated.
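The shift-and-invert spectral transformation that SIPs parallelizes can be tried serially with SciPy's shift-invert mode: eigenvalues of A nearest a shift σ become extremal eigenvalues of (A − σI)⁻¹, which Lanczos-type iterations find quickly. The matrix and shift below are illustrative stand-ins, not the paper's DFTB matrices.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

# Single-shift serial analogue of the SIPs idea: sigma selects a window
# of the spectrum, and shift-invert Lanczos finds the eigenvalues of A
# nearest sigma. SIPs runs many such shifted solves concurrently.

n = 200
# 1D discrete Laplacian as a stand-in for a sparse symmetric matrix;
# its exact eigenvalues are 2 - 2*cos(k*pi/(n+1)), k = 1..n.
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)).tocsc()

sigma = 0.5
vals, vecs = eigsh(A, k=4, sigma=sigma, which='LM')  # shift-invert mode

# The four computed eigenvalues should be those of A closest to sigma.
exact = 2.0 - 2.0 * np.cos(np.pi * np.arange(1, n + 1) / (n + 1))
closest = np.sort(exact[np.argsort(np.abs(exact - sigma))[:4]])
```

The factorization of (A − σI) inside each solve is the step the abstract notes dominates time-to-solution at the strong-scaling limit, hence the interest in matrix ordering.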

  7. Shift-and-invert parallel spectral transformation eigensolver: Massively parallel performance for density-functional based tight-binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Hong; Zapol, Peter; Dixon, David A.

The Shift-and-invert parallel spectral transformations (SIPs), a computational approach to solve sparse eigenvalue problems, is developed for massively parallel architectures with exceptional parallel scalability and robustness. The capabilities of SIPs are demonstrated by diagonalization of density-functional based tight-binding (DFTB) Hamiltonian and overlap matrices for single-wall metallic carbon nanotubes, diamond nanowires, and bulk diamond crystals. The largest (smallest) example studied is a 128,000 (2000) atom nanotube for which ~330,000 (~5600) eigenvalues and eigenfunctions are obtained in ~190 (~5) seconds when parallelized over 266,144 (16,384) Blue Gene/Q cores. Weak scaling and strong scaling of SIPs are analyzed and the performance of SIPs is compared with other novel methods. Different matrix ordering methods are investigated to reduce the cost of the factorization step, which dominates the time-to-solution at the strong scaling limit. As a result, a parallel implementation of assembling the density matrix from the distributed eigenvectors is demonstrated.

  8. Turbulent statistics in flow field due to interaction of two plane parallel jets

    NASA Astrophysics Data System (ADS)

    Bisoi, Mukul; Das, Manab Kumar; Roy, Subhransu; Patel, Devendra Kumar

    2017-12-01

Turbulent characteristics of the flow field due to the interaction of two plane parallel jets separated by the jet-width distance are studied. Numerical simulation is carried out by large eddy simulation with a dynamic Smagorinsky model for the sub-grid scale stresses. The energy spectra are observed to follow the -5/3 power law in the inertial sub-range. A proper orthogonal decomposition study indicates that energy-carrying large coherent structures are present close to the nozzle exit. It is shown that these coherent structures interact with each other and finally disintegrate into smaller vortices further downstream. The turbulent fluctuations in the longitudinal and lateral directions are shown to follow a similarity profile, and the mean flow likewise maintains a close similarity. Prandtl's mixing length, the Taylor microscale, and the Kolmogorov length scale are shown along the lateral direction for different downstream locations. The autocorrelation in the longitudinal and transverse directions is seen to follow a similarity profile. From the probability density function, the skewness and the flatness (kurtosis) are analyzed. The Reynolds stress anisotropy tensor is calculated, and the anisotropy invariant map known as Lumley's triangle is presented and analyzed.
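The skewness and flatness (kurtosis) statistics analyzed above are simple moment ratios of the fluctuating signal. A sketch on a synthetic Gaussian record, for which the reference values are 0 (skewness) and 3 (flatness); the record here is a stand-in, not simulation data:

```python
import numpy as np

# Skewness and flatness of a fluctuation record u' = u - <u>:
#   S = <u'^3> / <u'^2>^(3/2),   F = <u'^4> / <u'^2>^2.
# Gaussian statistics give S = 0 and F = 3.

def skewness(u):
    f = u - u.mean()
    return (f**3).mean() / (f**2).mean()**1.5

def flatness(u):
    f = u - u.mean()
    return (f**4).mean() / (f**2).mean()**2

rng = np.random.default_rng(0)
u = rng.normal(size=200_000)   # stand-in for a velocity fluctuation record
S, F = skewness(u), flatness(u)
```

Departures from (0, 3) in real jet data quantify the intermittency and asymmetry of the turbulence that the study examines.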

  9. A derivation and scalable implementation of the synchronous parallel kinetic Monte Carlo method for simulating long-time dynamics

    NASA Astrophysics Data System (ADS)

    Byun, Hye Suk; El-Naggar, Mohamed Y.; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya

    2017-10-01

    Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time-dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked-list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.
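The size-dependence the paper addresses comes from the conventional KMC clock: each step advances time by Δt = −ln(r)/R_total, where R_total sums all event rates and therefore grows with system size, so simulated time per step shrinks as the system grows. A minimal serial sketch of one such step (rates are illustrative, not the paper's electron-transfer rates):

```python
import math
import random

# One step of conventional (serial) kinetic Monte Carlo: select an event
# with probability rate/R_total, then draw an exponentially distributed
# time increment with mean 1/R_total. Because R_total scales with system
# size, so does the step cost per unit simulated time -- the dependence
# that synchronous parallel KMC removes.

def kmc_step(rates, rng):
    r_total = sum(rates)
    x = rng.random() * r_total
    acc = 0.0
    for i, rate in enumerate(rates):
        acc += rate
        if x < acc:
            event = i
            break
    else:
        event = len(rates) - 1          # guard against float round-off
    dt = -math.log(rng.random()) / r_total
    return event, dt

rng = random.Random(42)
rates = [1.0] * 1000                    # 1000 identical candidate events
event, dt = kmc_step(rates, rng)
```

SPKMC instead advances spatial domains synchronously, which is what makes the time scale independent of total system size.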

  10. Parallelization of fine-scale computation in Agile Multiscale Modelling Methodology

    NASA Astrophysics Data System (ADS)

    Macioł, Piotr; Michalik, Kazimierz

    2016-10-01

Nowadays, multiscale modelling of material behavior is an extensively developed area. An important obstacle to its wide application is its high computational demands. Among other solutions, the parallelization of multiscale computations is promising. Heterogeneous multiscale models are good candidates for parallelization, since communication between sub-models is limited. In this paper, the possibility of parallelizing multiscale models based on the Agile Multiscale Methodology framework is discussed. A sequential, FEM-based macroscopic model has been combined with concurrently computed fine-scale models employing the MatCalc thermodynamic simulator. The main issues investigated in this work are (i) the speed-up of multiscale models, with special focus on fine-scale computations, and (ii) the decrease in the quality of computations enforced by parallel execution. Speed-up has been evaluated on the basis of Amdahl's law. The problem of `delay error', arising from the parallel execution of fine-scale sub-models controlled by the sequential macroscopic sub-model, is discussed. Some technical aspects of combining third-party commercial modelling software with an in-house multiscale framework and an MPI library are also discussed.
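Amdahl's law, used above to evaluate speed-up, fits in a few lines: with serial fraction s, the speed-up on p processors is S(p) = 1 / (s + (1 − s)/p), bounded above by 1/s. The 10% serial fraction below is illustrative, not a figure from the paper:

```python
# Amdahl's law: S(p) = 1 / (s + (1 - s)/p), where s is the fraction of
# the work that cannot be parallelized (e.g. a sequential macroscopic
# sub-model coordinating parallel fine-scale sub-models).

def amdahl_speedup(serial_fraction, processors):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# A workload that is 10% serial reaches 6.4x on 16 processors and can
# never exceed 10x regardless of processor count.
s16 = amdahl_speedup(0.10, 16)
limit = amdahl_speedup(0.10, 10**9)
```

This is why the paper focuses on the fine-scale computations: they form the parallelizable fraction, while the sequential macroscopic model bounds the achievable speed-up.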

  11. F3D Image Processing and Analysis for Many - and Multi-core Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

F3D is written in OpenCL, so it achieves platform-portable parallelism on modern multi-core CPUs and many-core GPUs. The interface and mechanisms to access the F3D core are written in Java as a plugin for Fiji/ImageJ to deliver several key image-processing algorithms necessary to remove artifacts from micro-tomography data. The algorithms consist of data-parallel-aware filters that can efficiently utilize resources, work on out-of-core datasets, and scale efficiently across multiple accelerators. Optimizing for data-parallel filters, streaming out-of-core datasets, and efficient resource, memory, and data management over complex execution sequences of filters greatly expedites any scientific workflow with image-processing requirements. F3D performs several different types of 3D image-processing operations, such as non-linear filtering using bilateral filtering, median filtering, and/or morphological operators (MM). F3D gray-level MM operators are one-pass, constant-time methods that can perform morphological transformations with a line structuring element oriented in discrete directions. Additionally, MM operators can be applied to gray-scale images, and consist of two parts: (a) a reference shape, or structuring element, which is translated over the image, and (b) a mechanism, or operation, that defines the comparisons to be performed between the image and the structuring element. This tool provides a critical component within many complex pipelines, such as those for performing automated segmentation of image stacks. F3D is also described as a "descendant" of Quant-CT, another software tool we developed in the past; the two modules are to be integrated in a future version. Further details were reported in: D.M. Ushizima, T. Perciano, H. Krishnan, B. Loring, H. Bale, D. Parkinson, and J. Sethian. Structure recognition from high-resolution images of ceramic composites. IEEE International Conference on Big Data, October 2014.

  12. CSM parallel structural methods research

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.

    1989-01-01

    Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.

  13. New Parallel Algorithms for Structural Analysis and Design of Aerospace Structures

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.

    1998-01-01

Subspace and Lanczos iterations have been developed, well documented, and widely accepted as efficient methods for obtaining the p lowest eigenpair solutions of large-scale, practical engineering problems. The focus of this paper is to incorporate recent developments in vectorized sparse technologies in conjunction with Subspace and Lanczos iterative algorithms for computational enhancements. Numerical performance, in terms of accuracy and efficiency of the proposed sparse strategies for the Subspace and Lanczos algorithms, is demonstrated by solving for the lowest frequencies and mode shapes of structural problems on the IBM-R6000/590 and SunSparc 20 workstations.
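The idea behind subspace iteration for the p lowest eigenpairs can be sketched densely in numpy: repeatedly solve A·Y = X (inverse iteration), re-orthonormalize, and extract Ritz values. This is a generic textbook version under the assumption of a symmetric positive-definite A, not the paper's vectorized sparse implementation:

```python
import numpy as np

# Inverse subspace iteration with Rayleigh-Ritz extraction for the
# p lowest eigenpairs of a symmetric positive-definite matrix A.
# Dense numpy sketch; the paper's contribution is doing the
# factorization/solve step with vectorized sparse storage.

def subspace_iteration(A, p, iters=50):
    n = A.shape[0]
    rng = np.random.default_rng(1)
    X = np.linalg.qr(rng.standard_normal((n, p)))[0]  # random orthonormal start
    for _ in range(iters):
        Y = np.linalg.solve(A, X)     # inverse iteration amplifies low modes
        X, _ = np.linalg.qr(Y)        # re-orthonormalize the subspace
    T = X.T @ A @ X                   # Rayleigh-Ritz projection (p x p)
    w, V = np.linalg.eigh(T)
    return w, X @ V

# Small SPD test matrix with known spectrum 1, 2, ..., 8.
Q = np.linalg.qr(np.random.default_rng(2).standard_normal((8, 8)))[0]
A = Q @ np.diag(np.arange(1.0, 9.0)) @ Q.T
w, _ = subspace_iteration(A, p=3)     # converges toward eigenvalues 1, 2, 3
```

In structural dynamics the solve step is a sparse factorization of the stiffness matrix, which is where the vectorized sparse strategies in the paper pay off.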

  14. Measuring attitudes towards suicide: Preliminary evaluation of an attitude towards suicide scale.

    PubMed

    Cwik, Jan Christopher; Till, Benedikt; Bieda, Angela; Blackwell, Simon E; Walter, Carolin; Teismann, Tobias

    2017-01-01

Our study aimed to validate a previously published scale assessing attitudes towards suicide. Factor structure, convergent and discriminant validity, and predictive validity were investigated. Adult German participants (N = 503; mean age = 24.74 years; age range = 18-67 years) anonymously completed a set of questionnaires. An exploratory factor analysis was conducted, and incongruous items were deleted. Subsequently, scale properties of the reduced scale and its construct validity were analyzed. A confirmatory factor analysis was then conducted in an independent sample (N = 266; mean age = 28.77 years; age range = 18-88 years) to further confirm the factor structure of the questionnaire. Parallel analysis indicated a three-factor solution, which was also supported by confirmatory factor analysis: right to commit suicide, interpersonal gesture, and resilience. The subscales demonstrated acceptable construct and discriminant validity. Cronbach's α for the subscales ranged from 0.67 to 0.83, and the three factors explained 49.70% of the total variance. Positive attitudes towards suicide proved to be predictive of suicide risk status, providing preliminary evidence for the utility of the scale. Future studies aiming to reproduce the factor structure in a more heterogeneous sample are warranted.
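The Cronbach's α reliabilities reported above are computed from item and total-score variances: α = k/(k−1) · (1 − Σ var_item / var_total). A sketch on synthetic respondent data; the sample, loadings, and noise level below are illustrative, not the study's:

```python
import numpy as np

# Cronbach's alpha for a k-item scale:
#   alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))

def cronbach_alpha(items):
    """items: (n_respondents, k_items) array of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Synthetic respondents: a shared latent trait plus item noise yields
# highly inter-correlated items and therefore a high alpha.
rng = np.random.default_rng(0)
trait = rng.normal(size=(500, 1))
items = trait + 0.5 * rng.normal(size=(500, 6))
alpha = cronbach_alpha(items)
```

Lower inter-item correlation (more noise, weaker loadings) drives α down toward the 0.67 end of the range the study reports.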

  15. Robotically Assembled Aerospace Structures: Digital Material Assembly using a Gantry-Type Assembler

    NASA Technical Reports Server (NTRS)

    Trinh, Greenfield; Copplestone, Grace; O'Connor, Molly; Hu, Steven; Nowak, Sebastian; Cheung, Kenneth; Jenett, Benjamin; Cellucci, Daniel

    2017-01-01

This paper evaluates the development of automated assembly techniques for discrete lattice structures using a multi-axis gantry-type CNC machine. These lattices are made of discrete components called digital materials. We present the development of a specialized end effector that works in conjunction with the CNC machine to assemble these lattices. With this configuration we are able to place voxels at a rate of 1.5 per minute. The scalability of digital material structures due to their incremental modular assembly is one of their key traits and an important metric of interest. We investigate the build times of a 5x5 beam structure on the scale of 1 meter (325 parts), 10 meters (3,250 parts), and 30 meters (9,750 parts). Utilizing the current configuration with a single end effector, performing serial assembly with a globally fixed feed station at the edge of the build volume, the build time increases according to a scaling law of n⁴, where n is the build scale. Build times can be reduced significantly by integrating feed systems into the gantry itself, resulting in a scaling law of n³. A completely serial assembly process will encounter time limitations as build scale increases. Automated assembly for digital materials can assemble high-performance structures from discrete parts, and techniques such as built-in feed systems, parallelization, and optimization of the fastening process will yield much higher throughput.
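Taken at face value, the quoted scaling laws imply an n-fold build-time reduction at build scale n from integrating the feed system into the gantry. Stated as code, with purely illustrative constants:

```python
# Build-time scaling laws as quoted in the abstract: total build time
# grows as n^4 with a fixed edge feed station and as n^3 with a
# gantry-integrated feed, so integrating the feed yields an n-fold
# reduction at build scale n. The prefactor c is illustrative.

def build_time_fixed_feed(n, c=1.0):
    return c * n**4

def build_time_integrated_feed(n, c=1.0):
    return c * n**3

reduction_10 = build_time_fixed_feed(10) / build_time_integrated_feed(10)
reduction_30 = build_time_fixed_feed(30) / build_time_integrated_feed(30)
```

At the 30-meter scale discussed in the paper, that is a 30-fold difference, which is why feed-system placement dominates the serial-assembly time budget.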

  17. An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator

    PubMed Central

    Wang, Runchun M.; Thakur, Chetan S.; van Schaik, André

    2018-01-01

    This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks. PMID:29692702
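A discrete-time leaky integrate-and-fire update of the kind the simulator implements can be sketched in a few lines. The leak, threshold, and input values below are illustrative, not the FPGA's fixed-point parameters:

```python
import numpy as np

# Vectorized discrete-time LIF update: leak the membrane potential,
# integrate input, fire where the threshold is crossed, reset fired
# neurons. Parameters are illustrative.

def lif_step(v, i_in, leak=0.95, v_thresh=1.0, v_reset=0.0):
    """One time step for a vector of membrane potentials."""
    v = leak * v + i_in                 # leak and integrate
    spikes = v >= v_thresh              # threshold crossing
    v = np.where(spikes, v_reset, v)    # reset neurons that fired
    return v, spikes

v = np.zeros(4)
i_in = np.array([0.0, 0.3, 0.6, 1.2])   # constant input currents
for _ in range(20):
    v, spikes = lif_step(v, i_in)        # neuron 0 never fires; neuron 3 fires every step
```

In the FPGA architecture, this per-neuron arithmetic is what gets replicated massively in parallel, while the minicolumn/hypercolumn connectivity scheme keeps the routing tables in on-chip memory.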

  18. An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator.

    PubMed

    Wang, Runchun M; Thakur, Chetan S; van Schaik, André

    2018-01-01

    This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.

  19. An evolving view of Saturn's dynamic rings.

    PubMed

    Cuzzi, J N; Burns, J A; Charnoz, S; Clark, R N; Colwell, J E; Dones, L; Esposito, L W; Filacchione, G; French, R G; Hedman, M M; Kempf, S; Marouf, E A; Murray, C D; Nicholson, P D; Porco, C C; Schmidt, J; Showalter, M R; Spilker, L J; Spitale, J N; Srama, R; Sremcević, M; Tiscareno, M S; Weiss, J

    2010-03-19

    We review our understanding of Saturn's rings after nearly 6 years of observations by the Cassini spacecraft. Saturn's rings are composed mostly of water ice but also contain an undetermined reddish contaminant. The rings exhibit a range of structure across many spatial scales; some of this involves the interplay of the fluid nature and the self-gravity of innumerable orbiting centimeter- to meter-sized particles, and the effects of several peripheral and embedded moonlets, but much remains unexplained. A few aspects of ring structure change on time scales as short as days. It remains unclear whether the vigorous evolutionary processes to which the rings are subject imply a much younger age than that of the solar system. Processes on view at Saturn have parallels in circumstellar disks.

  20. Two-dimensional quasineutral description of particles and fields above discrete auroral arcs

    NASA Technical Reports Server (NTRS)

    Newman, A. L.; Chiu, Y. T.; Cornwall, J. M.

    1985-01-01

    Stationary hot and cool particle distributions in the auroral magnetosphere are modelled using adiabatic assumptions of particle motion in the presence of broad-scale electrostatic potential structure. The study has identified geometrical restrictions on the type of broadscale potential structure which can be supported by a multispecies plasma having specified sources and energies. Without energization of cool thermal ionospheric electrons, a substantial parallel potential drop cannot be supported down to altitudes of 2000 km or less. Observed upward-directed field-aligned currents must be closed by return currents along field lines which support little net potential drop. In such regions the plasma density appears significantly enhanced. Model details agree well with recent broad-scale implications of satellite observations.

  1. Controls on Early-Rift Geometry: New Perspectives From the Bilila-Mtakataka Fault, Malawi

    NASA Astrophysics Data System (ADS)

    Hodge, M.; Fagereng, Å.; Biggs, J.; Mdala, H.

    2018-05-01

We use the ∼110-km-long Bilila-Mtakataka fault in the amagmatic southern East African Rift, Malawi, to investigate the controls on early-rift geometry at the scale of a major border fault. Morphological variations along the 14 ± 8-m high scarp define six 10- to 40-km long segments, which are either foliation parallel or oblique to both foliation and the current regional extension direction. As the scarp is neither consistently parallel to foliation nor well oriented for the current regional extension direction, we suggest that the segmented surface expression is related to the local reactivation of well-oriented weak shallow fabrics above a broadly continuous structure at depth. Using a geometrical model, the geometry of the best fitting subsurface structure is consistent with the local strain field from recent seismicity. In conclusion, within this early rift, preexisting weaknesses only locally control border fault geometry near the surface.

  2. Decarboxylation of furfural on Pd(111): Ab initio molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Xue, Wenhua; Dang, Hongli; Shields, Darwin; Liu, Yingdi; Jentoft, Friederike; Resasco, Daniel; Wang, Sanwu

    2013-03-01

Furfural conversion over metal catalysts plays an important role in studies of biomass-derived feedstocks. We report ab initio molecular dynamics simulations of the decarboxylation process of furfural on the palladium surface at finite temperatures. We observed and analyzed the atomic-scale dynamics of furfural on the Pd(111) surface and the fluctuations of the bond lengths between the atoms in furfural. We found that the dominant bonding structure is the parallel structure, in which the furfural plane, while slightly distorted, is parallel to the Pd surface. Analysis of the bond-length fluctuations indicates that the C-H bond in the aldehyde group of a furfural molecule is likely to be broken first, while the C=O bond has a tendency to be isolated as CO. Our results show that the reaction of decarbonylation dominates, consistent with the experimental measurements. Supported by DOE (DE-SC0004600). Simulations and calculations were performed on XSEDE's and NERSC's supercomputers.

  3. Large-scale three-dimensional phase-field simulations for phase coarsening at ultrahigh volume fraction on high-performance architectures

    NASA Astrophysics Data System (ADS)

    Yan, Hui; Wang, K. G.; Jones, Jim E.

    2016-06-01

    A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, new phase-coarsening kinetics are found in the regime of ultrahigh volume fraction. The parallel implementation is capable of harnessing the greater computing power available from high-performance architectures. The parallelized code enables an increase in three-dimensional simulation system size up to a 512³ grid cube. Through the parallelized code, practical runtimes can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high-resolution parallel simulations is greatly improved over that obtainable from serial simulations. A detailed performance analysis of speed-up and scalability is presented, showing good scalability that improves with increasing problem size. In addition, a model for predicting runtime is developed, which shows good agreement with actual run times from numerical tests.
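The abstract does not specify the form of the runtime-prediction model; a common baseline for this kind of speed-up and scalability analysis is an Amdahl-style model, sketched below. The function names and the parallel-fraction values are illustrative assumptions, not the paper's model.

```python
def predicted_runtime(t_serial, parallel_fraction, procs):
    """Amdahl-style runtime model: the serial part of the work is fixed,
    while the parallel part shrinks with the processor count."""
    return t_serial * ((1.0 - parallel_fraction) + parallel_fraction / procs)

def speedup(parallel_fraction, procs):
    """Predicted speed-up on `procs` processors for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / procs)
```

Such a model also explains why scalability improves with problem size: larger grids typically raise the parallel fraction, moving the speed-up curve closer to ideal.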

  4. Linked exploratory visualizations for uncertain MR spectroscopy data

    NASA Astrophysics Data System (ADS)

    Feng, David; Kwock, Lester; Lee, Yueh; Taylor, Russell M., II

    2010-01-01

    We present a system for visualizing magnetic resonance spectroscopy (MRS) data sets. Using MRS, radiologists generate multiple 3D scalar fields of metabolite concentrations within the brain and compare them to anatomical magnetic resonance imaging. By understanding the relationship between metabolic makeup and anatomical structure, radiologists hope to better diagnose and treat tumors and lesions. Our system consists of three linked visualizations: a spatial glyph-based technique we call Scaled Data-Driven Spheres, a parallel coordinates visualization augmented to incorporate uncertainty in the data, and a slice plane for accurate data value extraction. The parallel coordinates visualization uses specialized brush interactions designed to help users identify nontrivial linear relationships between scalar fields. We describe two novel contributions to parallel coordinates visualizations: linear function brushing and new axis construction. Users have discovered significant relationships among metabolites and anatomy by linking interactions between the three visualizations.
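A linear-function brush, as described, selects points whose values on two axes satisfy an approximate linear relationship. The sketch below is a minimal stand-in with hypothetical data and tolerance, not the authors' implementation.

```python
def linear_brush(points, slope, intercept, tol):
    """Select points whose (x, y) pairs lie near the line y = slope*x + intercept.

    points: list of (x, y) values drawn from two parallel-coordinates axes.
    Returns the indices of points within `tol` of the line.
    """
    return [i for i, (x, y) in enumerate(points)
            if abs(y - (slope * x + intercept)) <= tol]

# Toy data: three points near y = 2x + 1 and one outlier.
pts = [(0.0, 1.0), (1.0, 3.1), (2.0, 5.0), (3.0, 9.0)]
selected = linear_brush(pts, 2.0, 1.0, 0.2)
```

In an interactive system the selected indices would be highlighted across all three linked views.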

  5. Linked Exploratory Visualizations for Uncertain MR Spectroscopy Data

    PubMed Central

    Feng, David; Kwock, Lester; Lee, Yueh; Taylor, Russell M.

    2010-01-01

    We present a system for visualizing magnetic resonance spectroscopy (MRS) data sets. Using MRS, radiologists generate multiple 3D scalar fields of metabolite concentrations within the brain and compare them to anatomical magnetic resonance imaging. By understanding the relationship between metabolic makeup and anatomical structure, radiologists hope to better diagnose and treat tumors and lesions. Our system consists of three linked visualizations: a spatial glyph-based technique we call Scaled Data-Driven Spheres, a parallel coordinates visualization augmented to incorporate uncertainty in the data, and a slice plane for accurate data value extraction. The parallel coordinates visualization uses specialized brush interactions designed to help users identify nontrivial linear relationships between scalar fields. We describe two novel contributions to parallel coordinates visualizations: linear function brushing and new axis construction. Users have discovered significant relationships among metabolites and anatomy by linking interactions between the three visualizations. PMID:21152337

  6. Large-scale structure after COBE: Peculiar velocities and correlations of cold dark matter halos

    NASA Technical Reports Server (NTRS)

    Zurek, Wojciech H.; Quinn, Peter J.; Salmon, John K.; Warren, Michael S.

    1994-01-01

    Large N-body simulations on parallel supercomputers allow one to simultaneously investigate large-scale structure and the formation of galactic halos with unprecedented resolution. Our study shows that the masses as well as the spatial distribution of halos on scales of tens of megaparsecs in a cold dark matter (CDM) universe, with the spectrum normalized to the anisotropies detected by the Cosmic Background Explorer (COBE), are compatible with the observations. We also show that the average value of the relative pairwise velocity dispersion σ_v, used as a principal argument against COBE-normalized CDM models, is significantly lower for halos than for individual particles. When the observational methods of extracting σ_v are applied to the redshift catalogs obtained from the numerical experiments, estimates differ significantly between different observation-sized samples and overlap observational estimates obtained following the same procedure.
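As a rough illustration of the statistic discussed, the pairwise velocity dispersion can be estimated as the RMS of relative velocities over all distinct pairs; the observational estimators applied to redshift catalogs are considerably more involved. The values below are made up.

```python
import itertools
import math

def pairwise_velocity_dispersion(velocities):
    """RMS of relative line-of-sight velocities over all distinct pairs
    (a crude sketch of sigma_v, not the paper's estimator)."""
    diffs = [va - vb for va, vb in itertools.combinations(velocities, 2)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical line-of-sight velocities (km/s) for three halos.
v = [300.0, 100.0, -100.0]
sigma_v = pairwise_velocity_dispersion(v)
```

Computing this over halos rather than individual particles is what the paper finds lowers the estimate.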

  7. Virtual earthquake engineering laboratory with physics-based degrading materials on parallel computers

    NASA Astrophysics Data System (ADS)

    Cho, In Ho

    For the last few decades, we have gained tremendous insight into the underlying microscopic mechanisms of degrading quasi-brittle materials from persistent and near-saintly efforts in laboratories, and at the same time we have seen unprecedented evolution in computational technology such as massively parallel computers. Thus, the time is ripe to embark on a novel approach to settle unanswered questions, especially for the earthquake engineering community, by harmoniously combining microphysical mechanisms with advanced parallel computing technology. To begin with, it should be stressed that we placed a great deal of emphasis on preserving the clear meaning and physical counterparts of all the microscopic material models proposed herein, in the belief that the more physical mechanisms we incorporate, the better predictions we can obtain. We began by reviewing representative microscopic analysis methodologies, selecting the "fixed-type" multidirectional smeared crack model as the base framework for nonlinear quasi-brittle materials, since it is widely believed to best retain the physical nature of actual cracks. Microscopic stress functions are proposed by integrating well-received existing models to update normal stresses on the crack surfaces (three orthogonal surfaces are allowed to initiate herein) under cyclic loading. Unlike the normal stress update, special attention had to be paid to the shear stress update on the crack surfaces, due primarily to the well-known pathological nature of the fixed-type smeared crack model: spurious large stress transfer over an open crack under nonproportional loading. In hopes of exploiting a physical mechanism to resolve this deleterious trait of the fixed crack model, a tribology-inspired three-dimensional (3d) interlocking mechanism is proposed. 
Following the main thrust of tribology (i.e., the science and engineering of interacting surfaces), we introduced a base fabric of solid particles in a soft matrix to explain realistic interlocking over rough crack surfaces, with an adopted Gaussian distribution supplying random particle sizes across the entire domain. Validation against a well-documented rough-crack experiment reveals promising accuracy of the proposed 3d interlocking model. A consumed-energy-based damage model is proposed to capture the weak correlation between the normal and shear stresses on the crack surfaces and to describe the irrecoverable nature of damage. Since the evaluation of the consumed energy is directly linked to the microscopic deformation, which can be efficiently tracked on the crack surfaces, the proposed damage model is believed to provide a more physical interpretation than existing damage mechanics, which fundamentally stems from mathematical derivation with few physical counterparts. Another novel point of the present work lies in the topological-transition-based "smart" steel bar model, notably with an evolving compressive buckling length. We present a systematic framework for information flow between the key ingredients of the composite material (i.e., a steel bar and its surrounding concrete elements). The smart steel model can incorporate smooth transitions during reversal loading, tensile rupture, early buckling after reversal from excessive tensile loading, and compressive buckling. In particular, the buckling length evolves according to the damage states of the elements surrounding each bar, whereas all other dominant models leave the length unchanged. What lies behind all the aforementioned novel attempts is, of course, a problem-optimized parallel platform. In fact, parallel computing in our field has been restricted to monotonic shock or blast loading with explicit algorithms, which are characteristically easy to parallelize. 
In the present study, efficient parallelization strategies are proposed for a highly demanding implicit nonlinear finite element analysis (FEA) program for real-scale reinforced concrete (RC) structures under cyclic loading. A quantitative comparison of state-of-the-art parallel strategies, in terms of factorization, was carried out, leading to a problem-optimized solver that successfully embraces the penalty method and the banded nature of the system. In particular, the penalty method employed imparts considerable smoothness to the global response, which yields a practical superiority of the parallel triangular system solver over more advanced solvers such as the parallel preconditioned conjugate gradient method. Other salient issues in parallelization are also addressed. The parallel platform established offers unprecedented access to simulations of real-scale structures, giving new understanding of the physics-based mechanisms adopted and of probabilistic randomness at the whole-system level. Notably, the platform enables bold simulations of real-scale RC structures exposed to cyclic loading: an H-shaped wall system and a 4-story T-shaped wall system. The simulations show the desired capability of accurately predicting global force-displacement responses, postpeak softening behavior, and compressive buckling of longitudinal steel bars. It is fascinating to see that the intrinsic randomness of the 3d interlocking model appears to cause "localized" damage in the real-scale structures, consistent with observations reported in different fields such as granular media. Equipped with accuracy, stability, and scalability as demonstrated so far, the parallel platform is believed to serve as fertile ground for introducing further physical mechanisms into various research fields as well as the earthquake engineering community. In the near future, it can be further expanded to run in concert with established FEA programs such as FRAME3d or OPENSEES. 
Following the central notion of "multiscale" analysis, actual infrastructure exposed to extreme natural hazards can be successfully tackled by this next-generation analysis tool: the harmonious union of the parallel platform and a general FEA program. At the same time, many kinds of experiments can be conducted easily in this "virtual laboratory."
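The parallel triangular system solver highlighted above ultimately rests on substitution through a banded factor. Below is a minimal serial sketch of forward substitution for a lower-bidiagonal system; it is illustrative only, as the dissertation's solver is parallel and handles general bandwidths.

```python
def solve_banded_lower(diag, sub, rhs):
    """Forward substitution for a lower-bidiagonal system L x = rhs.

    diag[i] holds L[i][i]; sub[i-1] holds L[i][i-1] (the single subdiagonal).
    Exploiting the band means each row touches only its immediate neighbor,
    so the work per row is constant.
    """
    n = len(diag)
    x = [0.0] * n
    for i in range(n):
        acc = rhs[i] - (sub[i - 1] * x[i - 1] if i > 0 else 0.0)
        x[i] = acc / diag[i]
    return x

x = solve_banded_lower([2.0, 2.0, 2.0], [1.0, 1.0], [2.0, 4.0, 6.0])
```

The sequential dependence of x[i] on x[i-1] is exactly what a parallel triangular solver must restructure around.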

  8. Three sets of crystallographic sub-planar structures in quartz formed by tectonic deformation

    NASA Astrophysics Data System (ADS)

    Derez, Tine; Pennock, Gill; Drury, Martyn; Sintubin, Manuel

    2016-05-01

    In quartz, multiple sets of fine planar deformation microstructures that have specific crystallographic orientations parallel to planes with low Miller-Bravais indices are commonly considered as shock-induced planar deformation features (PDFs) diagnostic of shock metamorphism. Using polarized light microscopy, we demonstrate that up to three sets of tectonically induced sub-planar fine extinction bands (FEBs), sub-parallel to the basal, γ, ω, and π crystallographic planes, are common in vein quartz in low-grade tectonometamorphic settings. We conclude that the observation of multiple (2-3) sets of fine scale, closely spaced, crystallographically controlled, sub-planar microstructures is not sufficient to unambiguously distinguish PDFs from tectonic FEBs.

  9. PETSc Users Manual Revision 3.7

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balay, Satish; Abhyankar, S.; Adams, M.

    This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.

  10. PETSc Users Manual Revision 3.8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balay, S.; Abhyankar, S.; Adams, M.

    This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.

  11. THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

    NASA Astrophysics Data System (ADS)

    Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

    2015-07-01

    The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT, a well-established code for simulating subsurface multiphase flow and reactive transport problems, we developed THC-MP, a high-performance code for massively parallel computers that greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structures and implemented the data initialization and exchange between the computing nodes and the core solving module using hybrid parallel iterative and direct solvers. Numerical accuracy of THC-MP was verified on a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing with those from sequential computing (the original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results successfully demonstrate the enhanced performance of THC-MP on parallel computing facilities.
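Domain decomposition of the kind applied here can be illustrated with a toy 1D example: split the cells across ranks and copy each neighbor's boundary cell into a local "ghost" slot, which in a real code would be an MPI message. Everything below is a hypothetical stand-in, not THC-MP's data structures.

```python
def decompose(n_cells, n_ranks):
    """Split n_cells contiguous cells as evenly as possible over n_ranks,
    returning (start, end) index ranges."""
    base, extra = divmod(n_cells, n_ranks)
    sizes = [base + (1 if r < extra else 0) for r in range(n_ranks)]
    starts = [sum(sizes[:r]) for r in range(n_ranks)]
    return [(s, s + sz) for s, sz in zip(starts, sizes)]

def exchange_ghosts(subdomains):
    """Copy each neighbor's boundary cell into a ghost slot (a stand-in for
    the halo exchange a distributed-memory code performs with MPI)."""
    for r, dom in enumerate(subdomains):
        dom["left_ghost"] = subdomains[r - 1]["cells"][-1] if r > 0 else None
        dom["right_ghost"] = (subdomains[r + 1]["cells"][0]
                              if r + 1 < len(subdomains) else None)

ranges = decompose(10, 3)
subs = [{"cells": list(range(a, b))} for a, b in ranges]
exchange_ghosts(subs)
```

After the exchange, each rank can update its local cells using only local and ghost data, which is what makes the coupled solve parallelizable.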

  12. Structural control of coalbed methane production in Alabama

    USGS Publications Warehouse

    Pashin, J.C.; Groshong, R.H.

    1998-01-01

    Thin-skinned structures are distributed throughout the Alabama coalbed methane fields, and these structures affect the production of gas and water from coal-bearing strata. Extensional structures in Deerlick Creek and Cedar Cove fields include normal faults and hanging-wall rollovers, and area balancing indicates that these structures are detached in the Pottsville Formation. Compressional folds in Gurnee and Oak Grove fields, by comparison, are interpreted to be detachment folds formed above decollements at different stratigraphic levels. Patterns of gas and water production reflect the structural style of each field and further indicate that folding and faulting have affected the distribution of permeability and the overall success of coalbed methane operations. Area balancing can be an effective way to characterize coalbed methane reservoirs in structurally complex regions because it constrains structural geometry and can be used to determine the distribution of layer-parallel strain. Comparison of calculated requisite strain and borehole expansion data from calliper logs suggests that strain in coalbed methane reservoirs is predictable and can be expressed as fracturing and small-scale faulting. However, refined methodology is needed to analyze heterogeneous strain distributions in discrete bed segments. Understanding temporal variation of production patterns in areas where gas and water production are influenced by map-scale structure will further facilitate effective management of coalbed methane fields.

  13. Multicoil resonance-based parallel array for smart wireless power delivery.

    PubMed

    Mirbozorgi, S A; Sawan, M; Gosselin, B

    2013-01-01

    This paper presents a novel resonance-based multicoil structure that acts as a smart power surface to wirelessly power devices such as mobile equipment, animal headstages, and implants. The proposed powering system is based on a 4-coil resonance-based inductive link whose resonance coil is formed by an array of several paralleled coils acting as a smart power transmitter. The power transmitter employs simple circuit connections and includes only one power driver circuit per multicoil resonance-based array, which enables higher power transfer efficiency and power delivery to the load. The power transmitted by the driver circuit is proportional to the load seen by each individual coil in the array. Thus, the transmitted power scales with the load of the electric/electronic system being powered, and does not divide equally over the parallel coils that form the array. Instead, only the loaded coils of the parallel array transmit a significant part of the total power to the receiver. Such adaptive behavior enables superior power, size, and cost efficiency compared with other solutions, since it does not require complex detection circuitry to find the location of the load. The performance of the proposed structure is verified by measurement results. Natural load detection and coverage of a four times larger area than conventional topologies, with a power transfer efficiency of 55%, are the novelties of the presented work.

  14. Self-assembled three-dimensional chiral colloidal architecture

    NASA Astrophysics Data System (ADS)

    Ben Zion, Matan Yah; He, Xiaojin; Maass, Corinna C.; Sha, Ruojie; Seeman, Nadrian C.; Chaikin, Paul M.

    2017-11-01

    Although stereochemistry has been a central focus of the molecular sciences since Pasteur, its province has previously been restricted to the nanometric scale. We have programmed the self-assembly of micron-sized colloidal clusters with structural information stemming from a nanometric arrangement. This was done by combining DNA nanotechnology with colloidal science. Using the functional flexibility of DNA origami in conjunction with the structural rigidity of colloidal particles, we demonstrate the parallel self-assembly of three-dimensional microconstructs, evincing highly specific geometry that includes control over position, dihedral angles, and cluster chirality.

  15. Prosodic Structure as a Parallel to Musical Structure

    PubMed Central

    Heffner, Christopher C.; Slevc, L. Robert

    2015-01-01

    What structural properties do language and music share? Although early speculation identified a wide variety of possibilities, the literature has largely focused on the parallels between musical structure and syntactic structure. Here, we argue that parallels between musical structure and prosodic structure deserve more attention. We review the evidence for a link between musical and prosodic structure and find it to be strong. In fact, certain elements of prosodic structure may provide a parsimonious comparison with musical structure without sacrificing empirical findings related to the parallels between language and music. We then develop several predictions related to such a hypothesis. PMID:26733930

  16. Alignment between Protostellar Outflows and Filamentary Structure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stephens, Ian W.; Dunham, Michael M.; Myers, Philip C.

    2017-09-01

    We present new Submillimeter Array (SMA) observations of CO(2–1) outflows toward young, embedded protostars in the Perseus molecular cloud as part of the Mass Assembly of Stellar Systems and their Evolution with the SMA (MASSES) survey. For 57 Perseus protostars, we characterize the orientation of the outflow angles and compare them with the orientation of the local filaments as derived from Herschel observations. We find that the relative angles between outflows and filaments are inconsistent with purely parallel or purely perpendicular distributions. Instead, the observed distribution of outflow-filament angles is more consistent with either randomly aligned angles or a mix of projected parallel and perpendicular angles. A mix of parallel and perpendicular angles requires perpendicular alignment to be more common by a factor of ∼3. Our results show that the observed distributions probably hold regardless of the protostar's multiplicity, age, or the host core's opacity. These observations indicate that the angular momentum axis of a protostar may be independent of the large-scale structure. We discuss the significance of independent protostellar rotation axes in the general picture of filament-based star formation.
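The comparison against randomly aligned angles can be mimicked with a small Monte Carlo: draw pairs of random 3D directions, project them onto the plane of the sky, and examine the relative angle. This is a generic sketch of that idea, not the MASSES analysis; for random orientations the projected relative angle is uniform on [0°, 90°], so about one third of pairs fall below 30°.

```python
import math
import random

def fraction_below_30_deg(n_samples, seed=0):
    """Fraction of randomly oriented pairs whose projected (sky-plane)
    relative angle is below 30 degrees."""
    rng = random.Random(seed)

    def random_direction():
        # Uniform direction on the sphere via normalized Gaussian components.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    below = 0
    for _ in range(n_samples):
        a, b = random_direction(), random_direction()
        # Projection onto the plane of the sky drops the z component.
        ang_a = math.atan2(a[1], a[0])
        ang_b = math.atan2(b[1], b[0])
        diff = abs(ang_a - ang_b) % math.pi
        diff = min(diff, math.pi - diff)  # fold into [0, 90] degrees
        below += diff < math.radians(30.0)
    return below / n_samples

frac = fraction_below_30_deg(20000)
```

An observed angle distribution that deviates from this uniform baseline is evidence for intrinsic alignment.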

  17. Parallel Optimization of Polynomials for Large-scale Problems in Stability and Control

    NASA Astrophysics Data System (ADS)

    Kamyar, Reza

    In this thesis, we focus on some of the NP-hard problems in control theory. Thanks to converse Lyapunov theory, these problems can often be modeled as optimization over polynomials. To avoid the problem of intractability, we establish a trade-off between accuracy and complexity. In particular, we develop a sequence of tractable optimization problems, in the form of Linear Programs (LPs) and/or Semi-Definite Programs (SDPs), whose solutions converge to the exact solution of the NP-hard problem. However, the computational and memory complexity of these LPs and SDPs grows exponentially with the progress of the sequence, meaning that improving the accuracy of the solutions requires solving SDPs with tens of thousands of decision variables and constraints. Setting up and solving such problems is a significant challenge. The existing optimization algorithms and software are designed only for desktop computers or small cluster computers, machines which do not have sufficient memory for solving such large SDPs. Moreover, the speed-up of these algorithms does not scale beyond dozens of processors. This, in fact, is why we seek parallel algorithms for setting up and solving large SDPs on large clusters and/or supercomputers. We propose parallel algorithms for stability analysis of two classes of systems: 1) linear systems with a large number of uncertain parameters; 2) nonlinear systems defined by polynomial vector fields. First, we develop a distributed parallel algorithm which applies Polya's and/or Handelman's theorems to some variants of parameter-dependent Lyapunov inequalities with parameters defined over the standard simplex. The result is a sequence of SDPs which possess a block-diagonal structure. We then develop a parallel SDP solver which exploits this structure in order to map the computation, memory, and communication to a distributed parallel environment. 
Numerical tests on a supercomputer demonstrate the ability of the algorithm to efficiently utilize hundreds and potentially thousands of processors, and analyze systems with 100+ dimensional state-space. Furthermore, we extend our algorithms to analyze robust stability over more complicated geometries such as hypercubes and arbitrary convex polytopes. Our algorithms can be readily extended to address a wide variety of problems in control such as Hinfinity synthesis for systems with parametric uncertainty and computing control Lyapunov functions.

  18. Depression Anxiety Stress Scale: is it valid for children and adolescents?

    PubMed

    Patrick, Jeff; Dyck, Murray; Bramston, Paul

    2010-09-01

    The Depression Anxiety Stress Scale (Lovibond & Lovibond, 1995) is used to assess the severity of symptoms in child and adolescent samples although its validity in these populations has not been demonstrated. The authors assessed the latent structure of the 21-item version of the scale in samples of 425 and 285 children and adolescents on two occasions, one year apart. On each occasion, parallel analyses suggested that only one component should be extracted, indicating that the test does not differentiate depression, anxiety, and stress in children and adolescents. The results provide additional evidence that adult models of depression do not describe the experience of depression in children and adolescents. (c) 2010 Wiley Periodicals, Inc.

  19. Atomic scale structure and chemistry of interfaces by Z-contrast imaging and electron energy loss spectroscopy in the STEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McGibbon, M.M.; Browning, N.D.; Chisholm, M.F.

    The macroscopic properties of many materials are controlled by the structure and chemistry at the grain boundaries. A basic understanding of the structure-property relationship requires a technique which probes both composition and chemical bonding on an atomic scale. The high-resolution Z-contrast imaging technique in the scanning transmission electron microscope (STEM) forms an incoherent image in which changes in atomic structure and composition can be interpreted intuitively. This direct image allows the electron probe to be positioned over individual atomic columns for parallel detection electron energy loss spectroscopy (PEELS) at a spatial resolution approaching 0.22 nm. The bonding information which can be obtained from the fine structure within the PEELS edges can then be used in conjunction with the Z-contrast images to determine the structure at the grain boundary. In this paper we present three examples of correlations between the structural, chemical, and electronic properties at materials interfaces in metal-semiconductor systems, superconducting and ferroelectric materials.

  20. Bounds on the attractor dimension for magnetohydrodynamic channel flow with parallel magnetic field at low magnetic Reynolds number.

    PubMed

    Low, R; Pothérat, A

    2015-05-01

    We investigate aspects of low-magnetic-Reynolds-number flow between two parallel, perfectly insulating walls in the presence of an imposed magnetic field parallel to the bounding walls. We find a functional basis to describe the flow, well adapted to the problem of finding the attractor dimension and which is also used in subsequent direct numerical simulation of these flows. For given Reynolds and Hartmann numbers, we obtain an upper bound for the dimension of the attractor by means of known bounds on the nonlinear inertial term and this functional basis for the flow. Three distinct flow regimes emerge: a quasi-isotropic three-dimensional (3D) flow, a nonisotropic 3D flow, and a 2D flow. We find the transition curves between these regimes in the space parametrized by Hartmann number Ha and attractor dimension d_att. We find how the attractor dimension scales as a function of Reynolds and Hartmann numbers (Re and Ha) in each regime. We also investigate the thickness of the boundary layer along the bounding wall and find that in all regimes this scales as 1/Re, independently of the value of Ha, unlike Hartmann boundary layers found when the field is normal to the channel. The structure of the set of least dissipative modes is indeed quite different between these two cases but the properties of turbulence far from the walls (smallest scales and number of degrees of freedom) are found to be very similar.

  1. A New Numerical Scheme for Cosmic-Ray Transport

    NASA Astrophysics Data System (ADS)

    Jiang, Yan-Fei; Oh, S. Peng

    2018-02-01

    Numerical solutions of the cosmic-ray (CR) magnetohydrodynamic equations are dogged by a powerful numerical instability, which arises from the constraint that CRs can only stream down their gradient. The standard cure is to regularize by adding artificial diffusion. Besides introducing ad hoc smoothing, this has a significant negative impact on either computational cost or on complexity and parallel scaling. We describe a new numerical algorithm for CR transport, with close parallels to two-moment methods for radiative transfer under the reduced speed of light approximation. It stably and robustly handles CR streaming without any artificial diffusion. It allows for both isotropic and field-aligned CR streaming and diffusion, with arbitrary streaming and diffusion coefficients. CR transport is handled explicitly, while source terms are handled implicitly. The overall time step scales linearly with resolution (even when computing CR diffusion) and exhibits perfect parallel scaling. It is given by the standard Courant condition with respect to a constant maximum velocity over the entire simulation domain. The computational cost is comparable to that of solving the ideal MHD equations. We demonstrate the accuracy and stability of this new scheme with a wide variety of tests, including anisotropic streaming and diffusion tests, CR-modified shocks, CR-driven blast waves, and CR transport in multiphase media. The new algorithm opens doors to much more ambitious and hitherto intractable calculations of CR physics in galaxies and galaxy clusters. It can also be applied to other physical processes with similar mathematical structure, such as saturated, anisotropic heat conduction.
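The claim that the time step scales linearly with resolution follows directly from the Courant condition with a fixed maximum signal speed; halving the cell size halves the allowed time step. A one-line sketch with illustrative numbers:

```python
def cfl_timestep(dx, v_max, courant=0.5):
    """Explicit time step from the Courant condition: dt = C * dx / v_max.
    With v_max fixed over the whole domain (as in the reduced-speed-of-light
    approach), dt shrinks only linearly as the grid is refined."""
    return courant * dx / v_max

dt_coarse = cfl_timestep(1.0 / 64, 100.0)   # coarser grid
dt_fine = cfl_timestep(1.0 / 128, 100.0)    # twice the resolution
```

Contrast this with an explicit diffusion update, whose stable time step would shrink quadratically with dx.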

  2. Efficient parallelization for AMR MHD multiphysics calculations; implementation in AstroBEAR

    NASA Astrophysics Data System (ADS)

    Carroll-Nellenback, Jonathan J.; Shroyer, Brandon; Frank, Adam; Ding, Chen

    2013-03-01

    Current adaptive mesh refinement (AMR) simulations require algorithms that are highly parallelized and manage memory efficiently. As compute engines grow larger, AMR simulations will require algorithms that achieve new levels of efficient parallelization and memory management. We have attempted to employ new techniques to achieve both of these goals. Patch or grid based AMR often employs ghost cells to decouple the hyperbolic advances of each grid on a given refinement level. This decoupling allows each grid to be advanced independently. In AstroBEAR we utilize this independence by threading the grid advances on each level with preference going to the finer level grids. This allows for global load balancing instead of level by level load balancing and allows for greater parallelization across both physical space and AMR level. Threading of level advances can also improve performance by interleaving communication with computation, especially in deep simulations with many levels of refinement. While we see improvements of up to 30% on deep simulations run on a few cores, the speedup is typically more modest (5-20%) for larger scale simulations. To improve memory management we have employed a distributed tree algorithm that requires processors to only store and communicate local sections of the AMR tree structure with neighboring processors. Using this distributed approach we are able to get reasonable scaling efficiency (>80%) out to 12288 cores and up to 8 levels of AMR - independent of the use of threading.
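The finer-levels-first threading policy can be caricatured as a priority queue over grids keyed by refinement level. Below is a toy scheduler, not AstroBEAR's implementation; the grid names are made up.

```python
import heapq

def schedule_grid_advances(grids):
    """Order grid advances so that finer AMR levels are served first,
    mimicking the described preference for fine-level grids.

    grids: list of (level, grid_id); a higher level means finer refinement.
    """
    # Negate the level so the max-level grid pops first from the min-heap.
    heap = [(-level, gid) for level, gid in grids]
    heapq.heapify(heap)
    order = []
    while heap:
        _, gid = heapq.heappop(heap)
        order.append(gid)
    return order

order = schedule_grid_advances(
    [(0, "root"), (2, "fine-a"), (1, "mid"), (2, "fine-b")])
```

Scheduling over all levels at once is what allows global, rather than level-by-level, load balancing.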

  3. HPCC Methodologies for Structural Design and Analysis on Parallel and Distributed Computing Platforms

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1998-01-01

    In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.

  4. Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library.

    PubMed

    Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi

    2017-10-10

    We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary exponents, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.
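The core numerical idea, expanding a matrix function in Chebyshev polynomials and summing it with the three-term recurrence, can be sketched in a few dense lines. This is a toy illustration only: CheSS itself exploits sparsity and runs in parallel, and the function name below is made up.

```python
import numpy as np

def chebyshev_matrix_function(H, f, n_terms=60, eps=0.01):
    """Approximate f(H) for a symmetric matrix H by a Chebyshev expansion
    (dense sketch of the idea behind CheSS, not the library itself).

    The spectrum of H is mapped into [-1, 1]; coefficients come from
    Chebyshev-Gauss quadrature; the series is summed with the recurrence
    T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x).
    """
    lmin, lmax = np.linalg.eigvalsh(H)[[0, -1]]
    a = (lmax - lmin) / 2 * (1 + eps)     # scale, with a small safety margin
    b = (lmax + lmin) / 2
    Hs = (H - b * np.eye(len(H))) / a     # spectrum now inside [-1, 1]

    # Chebyshev coefficients of x -> f(a*x + b) at the Chebyshev nodes
    k = np.arange(n_terms)
    theta = np.pi * (k + 0.5) / n_terms
    fvals = f(a * np.cos(theta) + b)
    c = 2.0 / n_terms * np.cos(np.outer(k, theta)) @ fvals
    c[0] /= 2.0

    # three-term recurrence: T0 = I, T1 = Hs
    T_prev, T_curr = np.eye(len(H)), Hs
    result = c[0] * T_prev + c[1] * T_curr
    for ck in c[2:]:
        T_prev, T_curr = T_curr, 2 * Hs @ T_curr - T_prev
        result += ck * T_curr
    return result
```

For smooth functions on a narrow spectral width, the coefficients decay rapidly, which is exactly the "small spectral width" regime where the abstract says the method excels.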

  5. Predictive wind turbine simulation with an adaptive lattice Boltzmann method for moving boundaries

    NASA Astrophysics Data System (ADS)

    Deiterding, Ralf; Wood, Stephen L.

    2016-09-01

    Operating horizontal axis wind turbines create large-scale turbulent wake structures that affect the power output of downwind turbines considerably. The computational prediction of this phenomenon is challenging as efficient low dissipation schemes are necessary that represent the vorticity production by the moving structures accurately and that are able to transport wakes without significant artificial decay over distances of several rotor diameters. We have developed a parallel adaptive lattice Boltzmann method for large eddy simulation of turbulent weakly compressible flows with embedded moving structures that considers these requirements rather naturally and enables first principle simulations of wake-turbine interaction phenomena at reasonable computational costs. The paper describes the employed computational techniques and presents validation simulations for the Mexnext benchmark experiments as well as simulations of the wake propagation in the Scaled Wind Farm Technology (SWIFT) array consisting of three Vestas V27 turbines in triangular arrangement.

  6. Arkas: Rapid reproducible RNAseq analysis

    PubMed Central

    Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

    2017-01-01

    The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer the cloud-scale RNAseq pipelines Arkas-Quantification and Arkas-Analysis, available within Illumina's BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and calculates differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134

  7. Parallel Scaling Characteristics of Selected NERSC User ProjectCodes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skinner, David; Verdier, Francesca; Anand, Harsh

    This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080 CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers, we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.
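The scaling quantities such a workload report relies on are easy to state precisely. Below is a sketch of strong-scaling speedup and parallel efficiency relative to a baseline run; these are the generic textbook definitions, not necessarily NERSC's exact methodology, and the numbers in the usage line are invented.

```python
def strong_scaling(baseline, runs):
    """Speedup and parallel efficiency for a fixed problem size.

    baseline: (procs, seconds) for the reference run.
    runs:     list of (procs, seconds) at higher concurrency.
    Efficiency divides the speedup by the increase in processor count,
    so 1.0 means perfect scaling.
    """
    p0, t0 = baseline
    results = []
    for p, t in runs:
        speedup = t0 / t
        efficiency = speedup / (p / p0)
        results.append((p, speedup, efficiency))
    return results

# usage with made-up timings: 64-way baseline, then 128 and 256 processors
rows = strong_scaling((64, 100.0), [(128, 55.0), (256, 30.0)])
```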

  8. Tracing WR wind structures by using the orbiting companion in the 29d WC8d + O8-9IV binary CV Ser

    NASA Astrophysics Data System (ADS)

    David-Uraz, Alexandre; Moffat, Anthony F. J.; Chené, André Nicolas; Lange, Nicholas

    2011-01-01

    We have obtained continuous, high-precision, broadband visible photometry from the MOST satellite of CV Ser over more than a full orbit in order to link the small-scale light-curve variations to extinction due to wind structures in the WR component, thus permitting us to trace these structures. The light curve presented unexpected characteristics, in particular eclipses of varying depth. Parallel optical spectroscopy from the Mont Megantic Observatory and Dominion Astrophysical Observatory was obtained to refine the orbital and wind-collision parameters, as well as to reveal line emission from clumps.

  9. An evolving view of Saturn’s dynamic rings

    USGS Publications Warehouse

    Cuzzi, J.N.; Burns, J.A.; Charnoz, S.; Clark, Roger N.; Colwell, J.E.; Dones, L.; Esposito, L.W.; Filacchione, G.; French, R.G.; Hedman, M.M.; Kempf, S.; Marouf, E.A.; Murray, C.D.; Nicholson, P.D.; Porco, C.C.; Schmidt, J.; Showalter, M.R.; Spilker, L.J.; Spitale, J.; Srama, R.; Sremčević, M.; Tiscareno, M.S.; Weiss, J.

    2010-01-01

    We review our understanding of Saturn’s rings after nearly 6 years of observations by the Cassini spacecraft. Saturn’s rings are composed mostly of water ice but also contain an undetermined reddish contaminant. The rings exhibit a range of structure across many spatial scales; some of this involves the interplay of the fluid nature and the self-gravity of innumerable orbiting centimeter- to meter-sized particles, and the effects of several peripheral and embedded moonlets, but much remains unexplained. A few aspects of ring structure change on time scales as short as days. It remains unclear whether the vigorous evolutionary processes to which the rings are subject imply a much younger age than that of the solar system. Processes on view at Saturn have parallels in circumstellar disks.

  10. The role of parallelism in the real-time processing of anaphora.

    PubMed

    Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P

    2012-06-01

    Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution.

  11. The role of parallelism in the real-time processing of anaphora

    PubMed Central

    Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P.

    2012-01-01

    Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution. PMID:23741080

  12. Consistency between hydrological models and field observations: Linking processes at the hillslope scale to hydrological responses at the watershed scale

    USGS Publications Warehouse

    Clark, M.P.; Rupp, D.E.; Woods, R.A.; Tromp-van, Meerveld; Peters, N.E.; Freer, J.E.

    2009-01-01

    The purpose of this paper is to identify simple connections between observations of hydrological processes at the hillslope scale and observations of the response of watersheds following rainfall, with a view to building a parsimonious model of catchment processes. The focus is on the well-studied Panola Mountain Research Watershed (PMRW), Georgia, USA. Recession analysis of discharge Q shows that while the relationship between dQ/dt and Q is approximately consistent with a linear reservoir for the hillslope, there is a deviation from linearity that becomes progressively larger with increasing spatial scale. To account for these scale differences conceptual models of streamflow recession are defined at both the hillslope scale and the watershed scale, and an assessment made as to whether models at the hillslope scale can be aggregated to be consistent with models at the watershed scale. Results from this study show that a model with parallel linear reservoirs provides the most plausible explanation (of those tested) for both the linear hillslope response to rainfall and non-linear recession behaviour observed at the watershed outlet. In this model each linear reservoir is associated with a landscape type. The parallel reservoir model is consistent with both geochemical analyses of hydrological flow paths and water balance estimates of bedrock recharge. Overall, this study demonstrates that standard approaches of using recession analysis to identify the functional form of storage-discharge relationships identify model structures that are inconsistent with field evidence, and that recession analysis at multiple spatial scales can provide useful insights into catchment behaviour. Copyright © 2008 John Wiley & Sons, Ltd.
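The parallel-linear-reservoir structure favoured by the study can be sketched directly: each reservoir is individually linear (Q = kS, so dS/dt = -kS), yet their summed discharge produces a non-linear -dQ/dt vs. Q relation at the outlet. The parameter values below are invented for illustration, not calibrated PMRW values.

```python
import numpy as np

def parallel_reservoir_recession(q0, k, t):
    """Total discharge from parallel linear reservoirs.

    Each reservoir drains exponentially at its own rate k[i] (one per
    landscape type in the paper's model); the watershed-scale hydrograph
    is the sum of the exponentials.
    """
    q0, k = np.asarray(q0, float), np.asarray(k, float)
    return (q0[:, None] * np.exp(-k[:, None] * t)).sum(axis=0)

# usage: a fast reservoir (k=2.0) plus a slow one (k=0.2), made-up values
t = np.linspace(0, 10, 2001)
q = parallel_reservoir_recession([5.0, 1.0], [2.0, 0.2], t)
ratio = -np.gradient(q, t) / q   # constant only for a single linear reservoir
```

Early in the recession the ratio -dQ/dt / Q is dominated by the fast reservoir and later collapses toward the slow reservoir's rate, reproducing the scale-dependent deviation from linearity the paper reports.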

  13. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    PubMed

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  14. Seismic analysis of parallel structures coupled by lead extrusion dampers

    NASA Astrophysics Data System (ADS)

    Patel, C. C.

    2017-06-01

    In this paper, the response behaviors of two parallel structures coupled by Lead Extrusion Dampers (LEDs) under various earthquake ground motion excitations are investigated. The equation of motion for the two parallel, multi-degree-of-freedom (MDOF) structures connected by LEDs is formulated. To explore the viability of LEDs in controlling the responses, namely displacement, acceleration and shear force, of the coupled parallel structures, the numerical study is done in two parts: (1) two parallel MDOF structures connected with LEDs having the same damper damping in all the dampers and (2) two parallel MDOF structures connected with LEDs having different damper damping. A parametric study is conducted to investigate the optimum damping of the dampers. Moreover, to limit the cost of the dampers, the study is conducted with only 50% of the total dampers at optimal locations, instead of placing dampers at every floor level. Results show that with LEDs connecting parallel structures of different fundamental frequencies, the earthquake-induced responses of either structure can be effectively reduced. Further, it is not necessary to connect the two structures at all floors; fewer dampers at appropriate locations can significantly reduce the earthquake response of the coupled system, thus reducing the cost of the dampers significantly.
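The mechanism can be illustrated with two single-DOF oscillators joined by a viscous connecting damper, a deliberately simplified stand-in for the paper's MDOF structures and hysteretic LED model; all masses, stiffnesses and damping values below are hypothetical.

```python
import numpy as np

def coupled_free_vibration(m, k, c_d, x0, dt=0.001, steps=20000):
    """Free vibration of two SDOF structures linked by a viscous damper c_d.

    The damper force is proportional to the relative velocity of the two
    structures, so it dissipates energy only when they move differently,
    i.e. when their fundamental frequencies differ.  Semi-implicit Euler
    integration; returns the final (x, v) state.
    """
    x = np.array(x0, float)
    v = np.zeros(2)
    for _ in range(steps):
        rel_v = v[0] - v[1]                      # relative velocity at the link
        a = np.array([(-k[0] * x[0] - c_d * rel_v) / m[0],
                      (-k[1] * x[1] + c_d * rel_v) / m[1]])
        v += a * dt
        x += v * dt
    return x, v

def energy(m, k, x, v):
    """Total mechanical energy of the two-structure system."""
    return 0.5 * sum(m[i] * v[i] ** 2 + k[i] * x[i] ** 2 for i in range(2))
```

With identical structures the relative velocity vanishes and the damper does nothing, consistent with the paper's finding that the benefit comes from connecting structures of different fundamental frequencies.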

  15. Investigations on the hierarchy of reference frames in geodesy and geodynamics

    NASA Technical Reports Server (NTRS)

    Grafarend, E. W.; Mueller, I. I.; Papo, H. B.; Richter, B.

    1979-01-01

    Problems related to reference directions were investigated. Space and time variant angular parameters are illustrated in hierarchic structures or towers. Using least squares techniques, model towers of triads are presented which allow the formation of linear observation equations. Translational and rotational degrees of freedom (origin and orientation) are discussed along with the notion of length and scale degrees of freedom. According to the notion of scale parallelism, scale factors with respect to a unit length are given. Three-dimensional geodesy was constructed from the set of three base vectors (gravity, earth-rotation and the ecliptic normal vector). Space and time variations are given with respect to a polar and singular value decomposition or in terms of changes in translation, rotation, deformation (shear, dilatation or angular and scale distortions).

  16. A compositional reservoir simulator on distributed memory parallel computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rame, M.; Delshad, M.

    1995-12-31

    This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer floods and polymer floods. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
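The domain-decomposition layout and ghost-cell sharing described above can be sketched in one dimension; the simulator itself decomposes a 3-D reservoir model, and the function names here are illustrative.

```python
import numpy as np

def decompose_1d(n_cells, n_procs):
    """Even 1-D block decomposition of a grid among processors, the kind of
    load-balanced data layout the set-up routine performs.

    Returns (start, stop) index pairs; remainder cells go one per rank.
    """
    base, extra = divmod(n_cells, n_procs)
    bounds, start = [], 0
    for r in range(n_procs):
        stop = start + base + (1 if r < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def exchange_ghosts(blocks):
    """Fill one ghost cell on each side of every local array from its
    neighbours, mimicking the sharing needed for stencil computation.
    Each block has layout [left ghost, interior..., right ghost].
    """
    for i, b in enumerate(blocks):
        if i > 0:
            b[0] = blocks[i - 1][-2]   # left ghost <- neighbour's last interior
        if i < len(blocks) - 1:
            b[-1] = blocks[i + 1][1]   # right ghost <- neighbour's first interior
```

In the real code the exchange is a message-passing step between adjacent processors rather than an in-memory copy, but the data dependency is the same.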

  17. Global kinetic simulations of neoclassical toroidal viscosity in low-collisional perturbed tokamak plasmas

    NASA Astrophysics Data System (ADS)

    Matsuoka, Seikichi; Idomura, Yasuhiro; Satake, Shinsuke

    2017-10-01

    The neoclassical toroidal viscosity (NTV) caused by a non-axisymmetric magnetic field perturbation is numerically studied using two global kinetic simulations with different numerical approaches. Both simulations reproduce similar collisionality (νb*) dependencies over wide νb* ranges. It is demonstrated that resonant structures in the velocity space predicted by the conventional superbanana-plateau theory exist in the small banana width limit, while the resonances diminish when the banana width becomes large. It is also found that fine scale structures are generated in the velocity space as νb* decreases in the large banana width simulations, leading to the νb*-dependency of the NTV. From the analyses of the particle orbit, it is found that the finite k∥ mode structure along the bounce motion appears owing to the finite orbit width, and it suffers from bounce phase mixing, suggesting the generation of the fine scale structures by a similar mechanism to the parallel phase mixing of passing particles.

  18. George E. Pake Prize Lecture: CMOS Technology Roadmap: Is Scaling Ending?

    NASA Astrophysics Data System (ADS)

    Chen, Tze-Chiang (T. C.)

    The development of silicon technology has been based on the principles of physics and driven by system needs. Traditionally, the system needs have been satisfied by the increase in transistor density and performance, as suggested by Moore's Law and guided by ''Dennard CMOS scaling theory''. As the silicon industry moves towards the 14nm node and beyond, three of the most important challenges facing Moore's Law and continued CMOS scaling are the growing standby power dissipation, the increasing variability in device characteristics and the ever increasing manufacturing cost. The first two factors are embodiments of CMOS approaching atomistic and quantum-mechanical physics boundaries. Industry directions for addressing these challenges are developing along three primary approaches: extending silicon scaling through innovations in materials and device structure; expanding the level of integration through three-dimensional structures comprised of through-silicon vias and chip stacking in order to enhance functionality and parallelism; and exploring post-silicon CMOS innovation with new nano-devices based on distinctly different principles of physics, new materials and new processes such as spintronics, carbon nanotubes and nanowires. Hence, the infusion of new materials, innovative integration and novel device structures will continue to extend CMOS technology scaling for at least another decade.

  19. Molecular-Scale Structural Controls on Nanoscale Growth Processes: Step-Specific Regulation of Biomineral Morphology

    NASA Astrophysics Data System (ADS)

    Dove, P. M.; Davis, K. J.; De Yoreo, J. J.; Orme, C. A.

    2001-12-01

    Deciphering the complex strategies by which organisms produce nanocrystalline materials with exquisite morphologies is central to understanding biomineralizing systems. One control on the morphology of biogenic nanoparticles is the specific interactions of their surfaces with the organic functional groups provided by the organism and the various inorganic species present in the ambient environment. It is now possible to directly probe the microscopic structural controls on crystal morphology by making quantitative measurements of the dynamic processes occurring at the mineral-water interface. These observations can provide crucial information concerning the actual mechanisms of growth that is otherwise unobtainable through macroscopic techniques. Here we use in situ molecular-scale observations of step dynamics and growth hillock morphology to directly resolve roles of principal impurities in regulating calcite surface morphologies. We show that the interactions of certain inorganic as well as organic impurities with the calcite surface are dependent upon the molecular-scale structures of step-edges. These interactions can assume a primary role in directing crystal morphology. In calcite growth experiments containing magnesium, we show that growth hillock structures become modified owing to the preferential inhibition of step motion along directions approximately parallel to the [010]. Compositional analyses have shown that Mg incorporates at different levels into the two types of nonequivalent steps, which meet at the hillock corner parallel to [010]. A simple calculation of the strain caused by this difference indicates that we should expect a significant retardation at this corner, in agreement with the observed development of [010] steps. 
    If the low-energy step-risers produced by these [010] steps are perpendicular to the c-axis, as seems likely from crystallographic considerations, this effect provides a plausible mechanism for the elongated calcite crystal habits found in natural environments that contain magnesium. In a separate study, step-specific interactions are also found between chiral aspartate molecules and the calcite surface. The L- and D-aspartate enantiomers exhibit structural preferences for the different types of step-risers on the calcite surface. These site-specific interactions result in the transfer of asymmetry from the organic molecule to the crystal surface through the formation of chiral growth hillocks and surface morphologies. These studies yield direct experimental insight into the molecular-scale structural controls on nanocrystal morphology in biomineralizing systems.

  20. Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

    PubMed

    Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

    2018-04-20

    A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
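The underlying computation being parallelized is a sum of independent point-source contributions on the hologram plane, which is why it decomposes so naturally across GPU nodes. Below is a brute-force dense sketch of that sum, not the paper's decomposition method; all parameter values in the usage line are illustrative.

```python
import numpy as np

def fresnel_cgh(points, amps, nx, ny, pitch, wavelength):
    """Brute-force complex field of 3-D point sources on a hologram plane.

    Each object point (x, y, z) contributes a spherical wave exp(i*k*r)/r
    sampled at every hologram pixel; the contributions are independent,
    so the outer loop parallelizes trivially across devices.
    """
    k = 2 * np.pi / wavelength
    ys, xs = np.mgrid[0:ny, 0:nx]
    xh = (xs - nx / 2) * pitch          # physical pixel coordinates
    yh = (ys - ny / 2) * pitch
    field = np.zeros((ny, nx), dtype=complex)
    for (x, y, z), a in zip(points, amps):
        r = np.sqrt((xh - x) ** 2 + (yh - y) ** 2 + z ** 2)
        field += a * np.exp(1j * k * r) / r
    return field

# usage: two made-up object points 10 cm behind a tiny 32x32 hologram
pA, pB = (0.0, 0.0, 0.1), (1e-4, 0.0, 0.1)
f_both = fresnel_cgh([pA, pB], [1.0, 1.0], 32, 32, 1e-5, 633e-9)
```

The field is linear in the object points, so partial sums computed on different nodes can simply be added, which is the first of the method's layers of parallelism.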

  1. Structure and Dissipation Characteristics of an Electron Diffusion Region Observed by MMS During a Rapid, Normal-Incidence Magnetopause Crossing

    NASA Astrophysics Data System (ADS)

    Torbert, R. B.; Burch, J. L.; Argall, M. R.; Alm, L.; Farrugia, C. J.; Forbes, T. G.; Giles, B. L.; Rager, A.; Dorelli, J.; Strangeway, R. J.; Ergun, R. E.; Wilder, F. D.; Ahmadi, N.; Lindqvist, P.-A.; Khotyaintsev, Y.

    2017-12-01

    On 22 October 2016, the Magnetospheric Multiscale (MMS) spacecraft encountered the electron diffusion region (EDR) when the magnetosheath field was southward, and there were signatures of fast reconnection, including flow jets, Hall fields, and large power dissipation. One rapid, normal-incidence crossing, during which the EDR structure was almost stationary in the boundary frame, provided an opportunity to observe the spatial structure for the zero guide field case of magnetic reconnection. The reconnection electric field was determined unambiguously to be 2-3 mV/m. There were clear signals of fluctuating parallel electric fields, up to 6 mV/m on the magnetosphere side of the diffusion region, associated with a Hall-like parallel current feature on the electron scale. The width of the main EDR structure was determined to be 2 km (1.8 de). Although the MMS spacecraft were in their closest tetrahedral separation of 8 km, the divergences and curls for these thin current structures could therefore not be computed in the usual manner. A method is developed to determine these quantities on a much smaller scale and applied to compute the normal component of terms in the generalized Ohm's law for the positions of each individual spacecraft (not a barycentric average). Although the gradient pressure term has a qualitative dependence that follows the observed variation of E + Ve × B, the quantitative magnitude of these terms differs by more than a factor of 2, which is shown to be greater than the respective errors. Thus, future research is required to find the manner in which Ohm's law is balanced.

  2. Scalable Preconditioners for Structure Preserving Discretizations of Maxwell Equations in First Order Form

    DOE PAGES

    Phillips, Edward Geoffrey; Shadid, John N.; Cyr, Eric C.

    2018-05-01

    Here, we report that multiple physical time-scales can arise in electromagnetic simulations when dissipative effects are introduced through boundary conditions, when currents follow external time-scales, and when material parameters vary spatially. In such scenarios, the time-scales of interest may be much slower than the fastest time-scales supported by the Maxwell equations, therefore making implicit time integration an efficient approach. The use of implicit temporal discretizations results in linear systems in which fast time-scales, which severely constrain the stability of an explicit method, can manifest as so-called stiff modes. This study proposes a new block preconditioner for structure preserving (also termed physics compatible) discretizations of the Maxwell equations in first order form. The intent of the preconditioner is to enable the efficient solution of multiple-time-scale Maxwell type systems. An additional benefit of the developed preconditioner is that it requires only a traditional multigrid method for its subsolves and compares well against alternative approaches that rely on specialized edge-based multigrid routines that may not be readily available. Lastly, results demonstrate parallel scalability at large electromagnetic wave CFL numbers on a variety of test problems.
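The block-preconditioning idea can be sketched with dense linear algebra: for a 2x2 block system M = [[A, B], [C, D]], the block upper-triangular preconditioner P built from the exact Schur complement makes M P^{-1} have every eigenvalue equal to 1, so a Krylov solver converges in at most two iterations. This is a dense toy, not the authors' implementation, which replaces the exact inverses with multigrid subsolves.

```python
import numpy as np

def block_triangular_preconditioner(A, B, C, D):
    """Block upper-triangular preconditioner P = [[A, B], [0, S]] with the
    exact Schur complement S = D - C A^{-1} B.

    With exact S, M P^{-1} = [[I, 0], [C A^{-1}, I]]: block triangular with
    identity diagonal blocks, hence all eigenvalues are 1.  Practical
    preconditioners approximate A^{-1} and S (e.g. with multigrid).
    """
    S = D - C @ np.linalg.solve(A, B)
    n, m = A.shape[0], D.shape[0]
    P = np.zeros((n + m, n + m))
    P[:n, :n], P[:n, n:], P[n:, n:] = A, B, S
    return P
```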

  3. Scalable Preconditioners for Structure Preserving Discretizations of Maxwell Equations in First Order Form

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Phillips, Edward Geoffrey; Shadid, John N.; Cyr, Eric C.

    Here, we report that multiple physical time-scales can arise in electromagnetic simulations when dissipative effects are introduced through boundary conditions, when currents follow external time-scales, and when material parameters vary spatially. In such scenarios, the time-scales of interest may be much slower than the fastest time-scales supported by the Maxwell equations, therefore making implicit time integration an efficient approach. The use of implicit temporal discretizations results in linear systems in which fast time-scales, which severely constrain the stability of an explicit method, can manifest as so-called stiff modes. This study proposes a new block preconditioner for structure preserving (also termed physics compatible) discretizations of the Maxwell equations in first order form. The intent of the preconditioner is to enable the efficient solution of multiple-time-scale Maxwell type systems. An additional benefit of the developed preconditioner is that it requires only a traditional multigrid method for its subsolves and compares well against alternative approaches that rely on specialized edge-based multigrid routines that may not be readily available. Lastly, results demonstrate parallel scalability at large electromagnetic wave CFL numbers on a variety of test problems.

  4. Parallel Multiscale Algorithms for Astrophysical Fluid Dynamics Simulations

    NASA Technical Reports Server (NTRS)

    Norman, Michael L.

    1997-01-01

    Our goal is to develop software libraries and applications for astrophysical fluid dynamics simulations in multidimensions that will enable us to resolve the large spatial and temporal variations that inevitably arise due to gravity, fronts and microphysical phenomena. The software must run efficiently on parallel computers and be general enough to allow the incorporation of a wide variety of physics. Cosmological structure formation with realistic gas physics is the primary application driver in this work. Accurate simulations of e.g. galaxy formation require a spatial dynamic range (i.e., ratio of system scale to smallest resolved feature) of 10^4 or more in three dimensions in arbitrary topologies. We take this as our technical requirement. We have achieved, and in fact, surpassed these goals.

  5. Load Balancing Strategies for Multi-Block Overset Grid Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Biswas, Rupak; Lopez-Benitez, Noe; Biegel, Bryan (Technical Monitor)

    2002-01-01

    The multi-block overset grid method is a powerful technique for high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process uses a grid system that discretizes the problem domain by using separately generated but overlapping structured grids that periodically update and exchange boundary information through interpolation. For efficient high performance computations of large-scale realistic applications using this methodology, the individual grids must be properly partitioned among the parallel processors. Overall performance, therefore, largely depends on the quality of load balancing. In this paper, we present three different load balancing strategies for overset grids and analyze their effects on the parallel efficiency of a Navier-Stokes CFD application running on an SGI Origin2000 machine.
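
As a hedged sketch of the general idea (the paper's three specific strategies are not reproduced here), one classic heuristic assigns grids, weighted by point count, to the currently least-loaded processor, largest grids first:

```python
# Longest-processing-time greedy partitioning of overset grids (illustrative only).
import heapq

def lpt_partition(grid_sizes, n_procs):
    """Assign each grid to the least-loaded processor, largest grids first."""
    heap = [(0, p) for p in range(n_procs)]            # (load, processor id)
    assignment = {}
    for gid, size in sorted(enumerate(grid_sizes), key=lambda x: -x[1]):
        load, p = heapq.heappop(heap)                  # least-loaded processor
        assignment[gid] = p
        heapq.heappush(heap, (load + size, p))
    loads = [load for load, _ in heap]
    return assignment, max(loads) / (sum(loads) / n_procs)   # load imbalance ratio

sizes = [90, 70, 50, 30, 20, 20]                       # points per overset grid
assignment, imbalance = lpt_partition(sizes, 3)
print(assignment, round(imbalance, 3))
```

An imbalance ratio near 1.0 means the busiest processor carries close to the average load.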

  6. Nonlinear Tollmien-Schlichting/vortex interaction in boundary layers

    NASA Technical Reports Server (NTRS)

    Hall, P.; Smith, F. T.

    1988-01-01

    The nonlinear interaction between two oblique 3-D Tollmien-Schlichting (TS) waves and their induced streamwise-vortex flow is considered theoretically for an incompressible boundary layer. The same theory applies to the destabilization of an incident vortex motion by subharmonic TS waves, followed by interaction. The scales and flow structure involved are addressed for high Reynolds numbers. The nonlinear interaction is powerful, starting at quite low amplitudes with a triple-deck structure for the TS waves but a large-scale structure for the induced vortex, after which strong nonlinear amplification occurs. This includes nonparallel-flow effects. The nonlinear interaction is governed by a partial differential system for the vortex flow coupled with an ordinary-differential one for the TS pressure. The solution properties found produce a breakup within a finite distance in some cases and further downstream in others, depending on the input amplitudes upstream and on the wave angles; this then leads to second stages of interaction associated with higher amplitudes, the main second stages giving either long-scale phenomena significantly affected by nonparallelism or shorter quasi-parallel ones governed by the full nonlinear triple-deck response.

  7. Dissipative structures of diffuse molecular gas. III. Small-scale intermittency of intense velocity-shears

    NASA Astrophysics Data System (ADS)

    Hily-Blant, P.; Falgarone, E.; Pety, J.

    2008-04-01

    Aims: We further characterize the structures tentatively identified on thermal and chemical grounds as the sites of dissipation of turbulence in molecular clouds (Papers I and II). Methods: Our study is based on two-point statistics of line centroid velocities (CV), computed from three large ¹²CO maps of two fields. We build the probability density functions (PDF) of the CO line centroid velocity increments (CVI) over lags varying by an order of magnitude. Structure functions of the line CV are computed up to the 6th order. We compare these statistical properties in two translucent parsec-scale fields embedded in different large-scale environments, one far from virial balance and the other virialized. We also address their scale dependence in the former, more turbulent, field. Results: The statistical properties of the line CV bear the three signatures of intermittency in a turbulent velocity field: (1) the non-Gaussian tails in the CVI PDF grow as the lag decreases, (2) the departure from Kolmogorov scaling of the high-order structure functions is more pronounced in the more turbulent field, (3) the positions contributing to the CVI PDF tails delineate narrow filamentary structures (thickness ~0.02 pc), uncorrelated to dense gas structures and spatially coherent with thicker ones (~0.18 pc) observed on larger scales. We show that the largest CVI trace sharp variations of the extreme CO linewings and that they actually capture properties of the underlying velocity field, uncontaminated by density fluctuations. The confrontation with theoretical predictions leads us to identify these small-scale filamentary structures with extrema of velocity-shears. We estimate that viscous dissipation at the 0.02 pc-scale in these structures is up to 10 times higher than average, consistent with their being associated with gas warmer than the bulk. Last, their average direction is parallel (or close) to that of the local magnetic field projection.
Conclusions: Turbulence in these translucent fields exhibits the statistical and structural signatures of small-scale and inertial-range intermittency. The more turbulent field on the 30 pc-scale is also the more intermittent on small scales. The small-scale intermittent structures coincide with those formerly identified as sites of enhanced dissipation. They are organized into parsec-scale coherent structures, coupling a broad range of scales. Based on observations carried out with the IRAM-30 m telescope. IRAM is supported by INSU-CNRS/MPG/IGN.
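
The structure functions examined above can be sketched on a synthetic 1-D centroid-velocity series (an illustration, not the authors' pipeline); a single sharp shear dominates the small-lag statistics:

```python
# p-th order structure function S_p(l) = <|v(x+l) - v(x)|^p> of a 1-D series
# of line centroid velocities (toy data, not observations).
def structure_function(v, lag, p):
    increments = [abs(v[i + lag] - v[i]) for i in range(len(v) - lag)]
    return sum(d ** p for d in increments) / len(increments)

# Toy centroid-velocity series: a smooth ramp plus one sharp "shear" jump.
v = [0.1 * i for i in range(50)] + [10.0 + 0.1 * i for i in range(50)]

for lag in (1, 2, 4, 8):
    print(lag, round(structure_function(v, lag, 2), 3))
```

The single large increment at the jump produces the heavy tail that grows as the lag shrinks, mimicking the intermittency signature described above.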

  8. A new deadlock resolution protocol and message matching algorithm for the extreme-scale simulator

    DOE PAGES

    Engelmann, Christian; Naughton, III, Thomas J.

    2016-03-22

    Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different HPC architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1) a new deadlock resolution protocol to reduce the parallel discrete event simulation overhead and (2) a new simulated MPI message matching algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement. The simulation overhead for running the NAS Parallel Benchmark suite was reduced from 102% to 0% for the embarrassingly parallel (EP) benchmark and from 1,020% to 238% for the conjugate gradient (CG) benchmark. xSim offers a highly accurate simulation mode for better tracking of injected MPI process failures. Furthermore, with highly accurate simulation, the overhead was reduced from 3,332% to 204% for EP and from 37,511% to 13,808% for CG.
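
For reference, the overhead percentages quoted above follow the usual convention (an assumption here, not spelled out in the abstract) of extra simulated runtime relative to native runtime:

```python
# Simulation overhead as a percentage of native runtime (assumed definition).
def overhead_pct(t_simulated, t_native):
    return 100.0 * (t_simulated - t_native) / t_native

# e.g. a benchmark that natively runs in 10 s and takes 11.2 s under simulation:
print(round(overhead_pct(11.2, 10.0), 1))   # 12.0 -> "12% overhead"
```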

  9. OpenMP parallelization of a gridded SWAT (SWATG)

    NASA Astrophysics Data System (ADS)

    Zhang, Ying; Hou, Jinliang; Cao, Yongpan; Gu, Juan; Huang, Chunlin

    2017-12-01

    Large-scale, long-term and high spatial resolution simulation is a common issue in environmental modeling. A Gridded Hydrologic Response Unit (HRU)-based Soil and Water Assessment Tool (SWATG) that integrates a grid modeling scheme with different spatial representations also presents such problems. This computational cost limits applications of very high resolution, large-scale watershed modeling. The OpenMP (Open Multi-Processing) parallel application interface is integrated with SWATG (yielding SWATGP) to accelerate grid modeling at the HRU level. Such a parallel implementation takes better advantage of the computational power of a shared-memory computer system. We conducted two experiments at multiple temporal and spatial scales of hydrological modeling using SWATG and SWATGP on a high-end server. At 500-m resolution, SWATGP was found to be up to nine times faster than SWATG in modeling over a roughly 2000 km2 watershed on a single CPU with a 15-thread configuration. The study results demonstrate that parallel models save considerable time relative to traditional sequential simulation runs. Parallel computations of environmental models are beneficial for model applications, especially at large spatial and temporal scales and at high resolutions. The proposed SWATGP model is thus a promising tool for large-scale and high-resolution water resources research and management in addition to offering data fusion and model coupling ability.
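
The quoted nine-fold speedup with 15 threads can be turned into a parallel efficiency figure; the wall-clock times below are hypothetical:

```python
# Parallel speedup and efficiency from serial vs. parallel wall-clock times.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_threads):
    return speedup(t_serial, t_parallel) / n_threads

t_swatg, t_swatgp = 90.0, 10.0           # hypothetical wall-clock minutes
print(speedup(t_swatg, t_swatgp))                    # 9.0
print(round(efficiency(t_swatg, t_swatgp, 15), 2))   # 0.6
```

An efficiency of 0.6 means each thread delivers 60% of ideal linear scaling, a typical figure once serial sections and memory contention are accounted for.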

  10. Biophysical Discovery through the Lens of a Computational Microscope

    NASA Astrophysics Data System (ADS)

    Amaro, Rommie

    With exascale computing power on the horizon, improvements in the underlying algorithms and available structural experimental data are enabling new paradigms for chemical discovery. My work has provided key insights for the systematic incorporation of structural information resulting from state-of-the-art biophysical simulations into protocols for inhibitor and drug discovery. We have shown that many disease targets have druggable pockets that are otherwise "hidden" in high resolution x-ray structures, and that this is a common theme across a wide range of targets in different disease areas. We continue to push the limits of computational biophysical modeling by expanding the time and length scales accessible to molecular simulation. My sights are set on, ultimately, the development of detailed physical models of cells, as the fundamental unit of life, and two recent achievements highlight our efforts in this arena. First is the development of a molecular and Brownian dynamics multi-scale modeling framework, which allows us to investigate drug binding kinetics in addition to thermodynamics. In parallel, we have made significant progress developing new tools to extend molecular structure to cellular environments. Collectively, these achievements are enabling the investigation of the chemical and biophysical nature of cells at unprecedented scales.

  11. Spatial structures of stream and hillslope drainage networks following gully erosion after wildfire

    USGS Publications Warehouse

    Moody, J.A.; Kinner, D.A.

    2006-01-01

    The drainage networks of catchment areas burned by wildfire were analysed at several scales. The smallest scale (1-1000 m²), representative of hillslopes, and the small scale (1000 m² to 1 km²), representative of small catchments, were characterized by the analysis of field measurements. The large scale (1-1000 km²), representative of perennial stream networks, was derived from a 30-m digital elevation model and analysed by computer analysis. Scaling laws used to describe large-scale drainage networks could be extrapolated to the small scale but could not describe the smallest scale of drainage structures observed in the hillslope region. The hillslope drainage network appears to have a second-order effect that reduces the number of order 1 and order 2 streams predicted by the large-scale channel structure. This network comprises two spatial patterns of rills with width-to-depth ratios typically less than 10. One pattern is parallel rills draining nearly planar hillslope surfaces, and the other pattern is three to six converging rills draining the critical source area uphill from an order 1 channel head. The magnitude of this critical area depends on infiltration, hillslope roughness and critical shear stress for erosion of sediment, all of which can be substantially altered by wildfire. Order 1 and 2 streams were found to constitute the interface region, which is altered by a disturbance, like wildfire, from subtle unchannelized drainages in unburned catchments to incised drainages. These drainages are characterized by gullies also with width-to-depth ratios typically less than 10 in burned catchments. The regions (hillslope, interface and channel) had different drainage network structures to collect and transfer water and sediment. Copyright © 2005 John Wiley & Sons, Ltd.

  12. Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S; Seal, Sudip K

    2010-01-01

    The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the μsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised.
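
The combination of reverse computation with incremental state saving can be sketched with a simplified toy event (an assumption for illustration, not the paper's epidemic model): a reversible quantity is undone by inverting its increment, while an irreversible flag is saved before being overwritten:

```python
# Toy reversible event for optimistic parallel discrete-event simulation.
class ReversibleInfectionEvent:
    def __init__(self, delta):
        self.delta = delta
        self.saved_flag = None                   # incremental state saving

    def forward(self, state):
        self.saved_flag = state["outbreak"]      # save only the irreversible bit
        state["infected"] += self.delta          # reversible by computation
        state["outbreak"] = state["infected"] > 100
        return state

    def reverse(self, state):
        state["infected"] -= self.delta          # reverse computation: invert increment
        state["outbreak"] = self.saved_flag      # restore the saved state
        return state

s = {"infected": 95, "outbreak": False}
e = ReversibleInfectionEvent(10)
e.forward(s)     # infected=105, outbreak=True
e.reverse(s)     # rollback: infected=95, outbreak=False
print(s)
```

Only the flag needs saving; the counter is recomputed, which is what keeps the per-event rollback memory footprint small.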

  13. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    DOE PAGES

    Persaud, A.; Ji, Q.; Feinberg, E.; ...

    2017-06-08

    Here, a new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Finally, ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  14. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    NASA Astrophysics Data System (ADS)

    Persaud, A.; Ji, Q.; Feinberg, E.; Seidl, P. A.; Waldron, W. L.; Schenkel, T.; Lal, A.; Vinayakumar, K. B.; Ardanuc, S.; Hammer, D. A.

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  15. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure.

    PubMed

    Persaud, A; Ji, Q; Feinberg, E; Seidl, P A; Waldron, W L; Schenkel, T; Lal, A; Vinayakumar, K B; Ardanuc, S; Hammer, D A

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  16. Parallel computations and control of adaptive structures

    NASA Technical Reports Server (NTRS)

    Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)

    1991-01-01

    The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction to the state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer, and the impact of the presented approach on applications in disciplines other than the aerospace industry is assessed.

  17. Compiled MPI: Cost-Effective Exascale Applications Development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bronevetsky, G; Quinlan, D; Lumsdaine, A

    2012-04-10

    The complexity of petascale and exascale machines makes it increasingly difficult to develop applications that can take advantage of them. Future systems are expected to feature billion-way parallelism, complex heterogeneous compute nodes and poor availability of memory (Peter Kogge, 2008). This new challenge for application development is motivating a significant amount of research and development on new programming models and runtime systems designed to simplify large-scale application development. Unfortunately, DoE has significant multi-decadal investment in a large family of mission-critical scientific applications. Scaling these applications to exascale machines will require a significant investment that will dwarf the costs of hardware procurement. A key reason for the difficulty in transitioning today's applications to exascale hardware is their reliance on explicit programming techniques, such as the Message Passing Interface (MPI) programming model to enable parallelism. MPI provides a portable and high performance message-passing system that enables scalable performance on a wide variety of platforms. However, it also forces developers to lock the details of parallelization together with application logic, making it very difficult to adapt the application to significant changes in the underlying system. Further, MPI's explicit interface makes it difficult to separate the application's synchronization and communication structure, reducing the amount of support that can be provided by compiler and run-time tools. This is in contrast to the recent research on more implicit parallel programming models such as Chapel, OpenMP and OpenCL, which promise to provide significantly more flexibility at the cost of reimplementing significant portions of the application. 
We are developing CoMPI, a novel compiler-driven approach to enable existing MPI applications to scale to exascale systems with minimal modifications that can be made incrementally over the application's lifetime. It includes: (1) New set of source code annotations, inserted either manually or automatically, that will clarify the application's use of MPI to the compiler infrastructure, enabling greater accuracy where needed; (2) A compiler transformation framework that leverages these annotations to transform the original MPI source code to improve its performance and scalability; (3) Novel MPI runtime implementation techniques that will provide a rich set of functionality extensions to be used by applications that have been transformed by our compiler; and (4) A novel compiler analysis that leverages simple user annotations to automatically extract the application's communication structure and synthesize the most complex code annotations.

  18. Molecular-dynamics simulations of self-assembled monolayers (SAM) on parallel computers

    NASA Astrophysics Data System (ADS)

    Vemparala, Satyavani

    The purpose of this dissertation is to investigate the properties of self-assembled monolayers, particularly alkanethiols and poly(ethylene glycol)-terminated alkanethiols. These simulations are based on realistic interatomic potentials and require scalable and portable multiresolution algorithms implemented on parallel computers. Large-scale molecular dynamics simulations of self-assembled alkanethiol monolayer systems have been carried out using an all-atom model involving a million atoms to investigate their structural properties as a function of temperature, lattice spacing and molecular chain-length. Results show that the alkanethiol chains tilt from the surface normal by a collective angle of 25° along the next-nearest neighbor direction at 300 K. At 350 K the system transforms to a disordered phase characterized by a small tilt angle, flexible tilt direction, and random distribution of backbone planes. With increasing lattice spacing, a, the tilt angle increases rapidly from a nearly zero value at a = 4.7 Å to as high as 34° at a = 5.3 Å at 300 K. We also studied the effect of end groups on the tilt structure of SAM films. We characterized the system with respect to temperature, the alkane chain length, lattice spacing, and the length of the end group. We found that the gauche defects were predominant only in the tails, and the gauche defects increased with the temperature and number of EG units. The effect of an electric field on the structure of poly(ethylene glycol) (PEG)-terminated alkanethiol self-assembled monolayers (SAMs) on gold has been studied using a parallel molecular dynamics method. An applied electric field triggers a conformational transition from all-trans to a mostly gauche conformation. The polarity of the electric field has a significant effect on the surface structure of PEG, leading to a profound effect on the hydrophilicity of the surface. 
The electric field applied anti-parallel to the surface normal causes a reversible transition to an ordered state in which the oxygen atoms are exposed. On the other hand, an electric field applied in a direction parallel to the surface normal introduces considerable disorder in the system and the oxygen atoms are buried inside.
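
As an illustrative analysis step (assumed, not taken from the dissertation), the collective tilt angle is simply the angle between the average chain backbone vector and the surface normal:

```python
# Tilt angle of a SAM chain relative to the surface normal (toy geometry).
import math

def tilt_angle_deg(chain_vec, normal=(0.0, 0.0, 1.0)):
    """Angle in degrees between a chain backbone vector and the surface normal."""
    dot = sum(a * b for a, b in zip(chain_vec, normal))
    norms = (math.sqrt(sum(a * a for a in chain_vec))
             * math.sqrt(sum(b * b for b in normal)))
    return math.degrees(math.acos(dot / norms))

# A chain tilted 25 degrees from the z-axis in the xz-plane:
v = (math.sin(math.radians(25.0)), 0.0, math.cos(math.radians(25.0)))
print(round(tilt_angle_deg(v), 1))   # 25.0
```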

  19. Seismic Anisotropy And Upper Mantle Structure In Se Brazil

    NASA Astrophysics Data System (ADS)

    Heintz, M.; Vauchez, A.; Assumpcao, M.; Egydio-Silva, M.

    We present preliminary shear wave splitting measurements performed in south-east Brazil, a geologically complex region. Seismic anisotropy is the result of a preferred orientation of anisotropic minerals (olivine) in the upper mantle, due to deformation. The splitting parameter φ (polarization direction of the fast S wave) is compared to large-scale tectonic structures of the area, in order to infer to what extent the deformations in the upper mantle and in the crust are mechanically coupled. The field of study is a region of 1000 by 1000 km, along the Atlantic coast from São Paulo to 500 km north of Rio de Janeiro. This region is made up of large-scale geological units such as the southern termination of the São Francisco craton, of Archean age, surrounded by two Neoproterozoic belts (the Ribeira belt to the east and the Brasilia belt to the west), and the Parana basin, which is a vast flood basalt region. Teleseismic events were recorded by 39 seismological stations well distributed over the region of interest. The results highlight the fact that the orientations of the polarization plane of the fast split shear wave vary considerably in this region, and the measurements fall into five groups: directions parallel to the NE-SW trend of the Ribeira belt, directions parallel to the NW-SE trend of the Brasilia belt, directions along the NE-SW Transbrasiliano lineament, directions parallel to the absolute plate motion (APM), which is E-W in this region, and directions turning around a cylindrical low velocity anomaly imaged in the Parana basin and supposed to be the fossil conduit of the Tristan da Cunha plume head.

  20. New type of kinematic indicator in bed-parallel veins, Late Jurassic-Early Cretaceous Vaca Muerta Formation, Argentina: E-W shortening during Late Cretaceous vein opening

    NASA Astrophysics Data System (ADS)

    Ukar, Estibalitz; Lopez, Ramiro G.; Gale, Julia F. W.; Laubach, Stephen E.; Manceda, Rene

    2017-11-01

    In the Late Jurassic-Early Cretaceous Vaca Muerta Formation, previously unrecognized yet abundant structures constituting a new category of kinematic indicator occur within bed-parallel fibrous calcite veins (BPVs) in shale. Domal shapes result from localized shortening and thickening of BPVs and the intercalation of centimeter-thick, host-rock shale inclusions within fibrous calcite beef, forming thrust fault-bounded pop-up structures. Ellipsoidal and rounded structures show consistent orientations, lineaments of interlayered shale and fibrous calcite, and local centimeter-scale offset thrust faults that at least in some cases cut across the median line of the BPV and indicate E-W shortening. Continuity of crystal fibers shows the domal structures are contemporaneous with BPV formation and help establish timing of fibrous vein growth in the Late Cretaceous, when shortening directions were oriented E-W. Differences in the number of opening stages and the deformational style of the different BPVs indicate they may have opened at different times. The new domal kinematic indicators described in this study are small enough to be captured in core. When present in the subsurface, domal structures can be used to either infer paleostress orientation during the formation of BPVs or to orient core in cases where the paleostress is independently known.

  1. Image quality analysis to reduce dental artifacts in head and neck imaging with dual-source computed tomography.

    PubMed

    Ketelsen, D; Werner, M K; Thomas, C; Tsiflikas, I; Koitschev, A; Reimann, A; Claussen, C D; Heuschmid, M

    2009-01-01

    Important oropharyngeal structures can be superimposed by metallic artifacts due to dental implants. The aim of this study was to compare the image quality of multiplanar reconstructions and an angulated spiral in dual-source computed tomography (DSCT) of the neck. Sixty-two patients were included for neck imaging with DSCT. MPRs from an axial dataset and an additional short spiral parallel to the mouth floor were acquired. Leading anatomical structures were then evaluated with respect to the extent to which they were affected by dental artifacts using a visual scale, ranging from 1 (least artifacts) to 4 (most artifacts). In MPR, 87.1% of anatomical structures had significant artifacts (mean score 3.12 ± 0.86), while in angulated slices leading anatomical structures of the oropharynx showed negligible artifacts (1.28 ± 0.46). The reduction in artifact severity achieved by primarily angulated slices was significant (p < 0.01). MPRs are not capable of reducing dental artifacts sufficiently. In patients with dental artifacts overlying the anatomical structures of the oropharynx, an additional short angulated spiral parallel to the floor of the mouth is recommended and should be applied in daily routine. As a result of the static gantry design of DSCT, the use of a flexible head holder is essential.

  2. Anomalous transport in discrete arcs and simulation of double layers in a model auroral circuit

    NASA Technical Reports Server (NTRS)

    Smith, Robert A.

    1987-01-01

    The evolution and long-time stability of a double layer (DL) in a discrete auroral arc requires that the parallel current in the arc, which may be considered uniform at the source, be diverted within the arc to charge the flanks of the U-shaped double layer potential structure. A simple model is presented in which this current redistribution is effected by anomalous transport based on electrostatic lower hybrid waves driven by the flank structure itself. This process provides the limiting constraint on the double layer potential. The flank charging may be represented as that of a nonlinear transmission line. A simplified model circuit, in which the transmission line is represented by a nonlinear impedance in parallel with a variable resistor, is incorporated in a one-dimensional simulation model to give the current density at the DL boundaries. Results are presented for the scaling of the DL potential as a function of the width of the arc and the saturation efficiency of the lower hybrid instability mechanism.

  3. Anomalous transport in discrete arcs and simulation of double layers in a model auroral circuit

    NASA Technical Reports Server (NTRS)

    Smith, Robert A.

    1987-01-01

    The evolution and long-time stability of a double layer in a discrete auroral arc requires that the parallel current in the arc, which may be considered uniform at the source, be diverted within the arc to charge the flanks of the U-shaped double-layer potential structure. A simple model is presented in which this current redistribution is effected by anomalous transport based on electrostatic lower hybrid waves driven by the flank structure itself. This process provides the limiting constraint on the double-layer potential. The flank charging may be represented as that of a nonlinear transmission line. A simplified model circuit, in which the transmission line is represented by a nonlinear impedance in parallel with a variable resistor, is incorporated in a one-dimensional simulation model to give the current density at the DL boundaries. Results are presented for the scaling of the DL potential as a function of the width of the arc and the saturation efficiency of the lower hybrid instability mechanism.

  4. Regional-scale calculation of the LS factor using parallel processing

    NASA Astrophysics Data System (ADS)

    Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

    2015-05-01

With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on the message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the characteristics of each algorithm, including a decomposition method that maintains the integrity of the results, an optimized workflow that avoids exporting unnecessary intermediate data, and a buffer-communication-computation strategy that improves communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
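The decomposition strategy described here can be emulated serially: split the raster into row blocks, give each block a one-row halo (the buffer that would be exchanged over MPI), compute a local terrain operator per block, and keep only each block's own rows. Everything below is a hedged sketch — the 3x3 mean is a stand-in for the real slope/flow operators, and the function names are invented for illustration.

```python
import numpy as np

def local_op(a):
    """3x3 moving average as a stand-in for a local terrain operator
    (e.g. slope); border cells are left at zero."""
    n, m = a.shape
    out = np.zeros_like(a)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out[1:-1, 1:-1] += a[1 + di:n - 1 + di, 1 + dj:m - 1 + dj]
    out[1:-1, 1:-1] /= 9.0
    return out

def blockwise(a, nblocks):
    """Emulate MPI row-block decomposition with one-row halos: each block
    computes on its own rows plus halo rows borrowed from neighbouring
    blocks, then keeps only its own rows — so the stitched result is
    identical to the global computation."""
    n = a.shape[0]
    out = np.zeros_like(a)
    bounds = np.linspace(0, n, nblocks + 1).astype(int)
    for b in range(nblocks):
        r0, r1 = bounds[b], bounds[b + 1]
        e0, e1 = max(r0 - 1, 0), min(r1 + 1, n)   # halo-extended range
        ext = local_op(a[e0:e1])
        out[r0:r1] = ext[r0 - e0:r0 - e0 + (r1 - r0)]
    return out
```

The equality of `blockwise` and `local_op` on the full grid is exactly the "integrity of the results" property the decomposition method must preserve.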

  5. Scaling Optimization of the SIESTA MHD Code

    NASA Astrophysics Data System (ADS)

    Seal, Sudip; Hirshman, Steven; Perumalla, Kalyan

    2013-10-01

    SIESTA is a parallel three-dimensional plasma equilibrium code capable of resolving magnetic islands at high spatial resolutions for toroidal plasmas. Originally designed to exploit small-scale parallelism, SIESTA has now been scaled to execute efficiently over several thousands of processors P. This scaling improvement was accomplished with minimal intrusion to the execution flow of the original version. First, the efficiency of the iterative solutions was improved by integrating the parallel tridiagonal block solver code BCYCLIC. Krylov-space generation in GMRES was then accelerated using a customized parallel matrix-vector multiplication algorithm. Novel parallel Hessian generation algorithms were integrated and memory access latencies were dramatically reduced through loop nest optimizations and data layout rearrangement. These optimizations sped up equilibria calculations by factors of 30-50. It is possible to compute solutions with granularity N/P near unity on extremely fine radial meshes (N > 1024 points). Grid separation in SIESTA, which manifests itself primarily in the resonant components of the pressure far from rational surfaces, is strongly suppressed by finer meshes. Large problem sizes of up to 300 K simultaneous non-linear coupled equations have been solved on the NERSC supercomputers. Work supported by U.S. DOE under Contract DE-AC05-00OR22725 with UT-Battelle, LLC.
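BCYCLIC, mentioned above, is a parallel block cyclic-reduction solver for block-tridiagonal systems. As background, here is the serial counterpart it parallelizes — the block Thomas algorithm. This is a generic sketch, not the SIESTA/BCYCLIC code; variable names and the test problem are illustrative.

```python
import numpy as np

def block_thomas(L, D, U, b):
    """Serial block-tridiagonal solve (block Thomas algorithm).
    L[i], D[i], U[i] are the sub-, main- and super-diagonal blocks of
    row i (L[0] and U[-1] unused); cyclic reduction, as in BCYCLIC,
    reorders these eliminations so independent rows can proceed
    concurrently across processors."""
    n = len(D)
    Dh = [None] * n
    bh = [None] * n
    Dh[0], bh[0] = D[0].copy(), b[0].copy()
    for i in range(1, n):                      # forward block elimination
        W = L[i] @ np.linalg.inv(Dh[i - 1])
        Dh[i] = D[i] - W @ U[i - 1]
        bh[i] = b[i] - W @ bh[i - 1]
    x = [None] * n
    x[n - 1] = np.linalg.solve(Dh[n - 1], bh[n - 1])
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = np.linalg.solve(Dh[i], bh[i] - U[i] @ x[i + 1])
    return np.concatenate(x)
```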

  6. Electrically driven nanopillars for THz quantum cascade lasers.

    PubMed

    Amanti, M I; Bismuto, A; Beck, M; Isa, L; Kumar, K; Reimhult, E; Faist, J

    2013-05-06

In this work we present a rapid and parallel process for the fabrication of large-scale arrays of electrically driven nanopillars for THz quantum cascade active media. We demonstrate electrical injection of pillars of 200 nm diameter and 2 µm height, over a surface of 1 mm(2). THz electroluminescence from the nanopillars is reported. This result is a promising step toward the realization of zero-dimensional structures for terahertz quantum cascade lasers.

  7. Hybrid MPI+OpenMP Programming of an Overset CFD Solver and Performance Investigations

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Jin, Haoqiang H.; Biegel, Bryan (Technical Monitor)

    2002-01-01

This report describes a two-level parallelization of a Computational Fluid Dynamics (CFD) solver with multi-zone overset structured grids. The approach is based on a hybrid MPI+OpenMP programming model suitable for shared memory machines and clusters of shared memory machines. The performance investigations of the hybrid application on an SGI Origin2000 (O2K) machine are reported using medium and large scale test problems.

  8. Design and Verification of Remote Sensing Image Data Center Storage Architecture Based on Hadoop

    NASA Astrophysics Data System (ADS)

    Tang, D.; Zhou, X.; Jing, Y.; Cong, W.; Li, C.

    2018-04-01

The data center is a new concept of data processing and application proposed in recent years. It is built on data-centric parallel computing technologies and is compatible with different hardware clusters. While optimizing the data storage management structure, it fully utilizes the computing nodes of the cluster and improves the efficiency of parallel data applications. This paper used mature Hadoop technology to build a large-scale distributed image management architecture for remote sensing imagery. Using MapReduce parallel processing technology, many computing nodes process image storage blocks and pyramids in the background to improve the efficiency of image reading and application, solving the need for concurrent multi-user high-speed access to remotely sensed data. The rationality, reliability and superiority of the system design were verified by testing the storage efficiency for different image data and multiple users, and by analyzing how the distributed storage architecture improves the application efficiency of remote sensing images in an actual Hadoop service system.
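The pyramid-building step mentioned above maps naturally onto MapReduce: mappers emit each tile keyed by its parent tile at the next-coarser zoom level, and reducers aggregate the children. The sketch below imitates that pattern in plain Python — it is a conceptual illustration with invented names and per-tile scalar "values" standing in for image blocks, not the paper's Hadoop code.

```python
from collections import defaultdict

def map_phase(tiles):
    """Map: each base tile (keyed by its x, y grid index) emits a record
    keyed by its parent tile at the next-coarser pyramid level."""
    for (x, y), value in tiles.items():
        yield (x // 2, y // 2), value

def reduce_phase(pairs):
    """Reduce: group records by parent key and average the (up to four)
    child tiles to form the coarser pyramid level."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

Chaining `reduce_phase(map_phase(...))` level by level yields the whole pyramid, and in a real Hadoop job each reduce group would run on whichever node holds the relevant blocks.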

  9. Parallel Force Assay for Protein-Protein Interactions

    PubMed Central

    Aschenbrenner, Daniela; Pippig, Diana A.; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E.

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay. PMID:25546146

  10. Parallel force assay for protein-protein interactions.

    PubMed

    Aschenbrenner, Daniela; Pippig, Diana A; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay.

  11. Generation and evolution of anisotropic turbulence and related energy transfer in a multi-species solar wind

    NASA Astrophysics Data System (ADS)

    Maneva, Yana; Poedts, Stefaan

    2017-04-01

The electromagnetic fluctuations in the solar wind represent a zoo of plasma waves with different properties, whose wavelengths range from the largest fluid scales to the smallest dissipation scales. By nature the power spectrum of the magnetic fluctuations is anisotropic with different spectral slopes in parallel and perpendicular directions with respect to the background magnetic field. Furthermore, the magnetic field power spectra steepen as one moves from the inertial to the dissipation range and we observe multiple spectral breaks with different slopes in parallel and perpendicular direction at the ion scales and beyond. The turbulent dissipation of magnetic field fluctuations at the sub-ion scales is believed to go into local ion heating and acceleration, so that the spectral breaks are typically associated with particle energization. The gained energy can be in the form of anisotropic heating, formation of non-thermal features in the particle velocity distribution functions, and redistribution of the differential acceleration between the different ion populations. To study the relation between the evolution of the anisotropic turbulent spectra and the particle heating at the ion and sub-ion scales we perform a series of 2.5D hybrid simulations in a collisionless drifting proton-alpha plasma. We neglect the fast electron dynamics and treat the electrons as an isothermal fluid, whereas the protons and a minor population of alpha particles are evolved in a fully kinetic manner. We start with a given wave spectrum and study the evolution of the magnetic field spectral slopes as a function of the parallel and perpendicular wavenumbers. Simultaneously, we track the particle response and the energy exchange between the parallel and perpendicular scales. We observe anisotropic behavior of the turbulent power spectra with steeper slopes along the dominant energy-containing direction. 
This means that for parallel and quasi-parallel waves we have steeper spectral slope in parallel direction, whereas for highly oblique waves the dissipation occurs predominantly in perpendicular direction and the spectral slopes are steeper across the background magnetic field. The value of the spectral slopes depends on the angle of propagation, the spectral range, as well as the plasma properties. In general the dissipation is stronger at small scales and the corresponding spectral slopes there are steeper. For parallel and quasi-parallel propagation the prevailing energy cascade remains along the magnetic field, whereas for initially isotropic oblique turbulence the cascade develops mainly in perpendicular direction.
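Spectral slopes of the kind compared throughout this abstract are usually read off a simulated (or observed) power spectrum by a least-squares fit in log-log space over a chosen wavenumber range. The helper below is a generic sketch of that measurement, not the authors' analysis code.

```python
import numpy as np

def spectral_slope(k, power, k_min, k_max):
    """Least-squares slope of log10(P) versus log10(k) over the range
    [k_min, k_max] — the usual way a spectral index (e.g. -5/3 in the
    inertial range, steeper past a spectral break) is measured."""
    sel = (k >= k_min) & (k <= k_max)
    slope, _ = np.polyfit(np.log10(k[sel]), np.log10(power[sel]), 1)
    return slope
```

Fitting the parallel and perpendicular spectra separately over the inertial and sub-ion ranges gives the anisotropic slopes and break locations discussed above.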

  12. The linearly scaling 3D fragment method for large scale electronic structure calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Zhengji; Meza, Juan; Lee, Byounghak

    2009-07-28

The Linearly Scaling three-dimensional fragment (LS3DF) method is an O(N) ab initio electronic structure method for large-scale nano material simulations. It is a divide-and-conquer approach with a novel patching scheme that effectively cancels out the artificial boundary effects, which exist in all divide-and-conquer schemes. This method has made ab initio simulations of thousand-atom nanosystems feasible in a couple of hours, while retaining essentially the same accuracy as the direct calculation methods. The LS3DF method won the 2008 ACM Gordon Bell Prize for algorithm innovation. Our code has reached 442 Tflop/s running on 147,456 processors on the Cray XT5 (Jaguar) at OLCF, has been run on 163,840 processors on the Blue Gene/P (Intrepid) at ALCF, and has been applied to a system containing 36,000 atoms. In this paper, we will present the recent parallel performance results of this code, and will apply the method to asymmetric CdSe/CdS core/shell nanorods, which have potential applications in electronic devices and solar cells.
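The patching idea behind LS3DF — sum overlapping large fragments and subtract the smaller overlap fragments so boundary contributions cancel — can be illustrated in one dimension with a toy energy functional (on-site terms plus nearest-neighbour bonds on a periodic chain). This is a schematic analogue under those stated assumptions, not the actual LS3DF formulation.

```python
import numpy as np

def frag_energy(cells, a, b):
    """Energy of a fragment: on-site terms a[i] plus bond terms b[(i, j)]
    for bonds interior to the fragment."""
    e = sum(a[i] for i in cells)
    e += sum(b[(i, j)] for i, j in zip(cells, cells[1:]))
    return e

def ls3df_total(N, a, b):
    """1D analogue of the LS3DF patching sum: overlapping 2-cell fragments
    minus 1-cell fragments. The subtraction cancels the double-counted
    overlap exactly, so the patched total equals the direct total for this
    short-ranged model."""
    total = 0.0
    for i in range(N):
        j = (i + 1) % N
        total += frag_energy([i, j], a, b) - frag_energy([j], a, b)
    return total
```

Each fragment energy depends only on its own cells, which is what makes the fragments independently computable in parallel.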

  13. The Linearly Scaling 3D Fragment Method for Large Scale Electronic Structure Calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Zhengji; Meza, Juan; Lee, Byounghak

    2009-06-26

The Linearly Scaling three-dimensional fragment (LS3DF) method is an O(N) ab initio electronic structure method for large-scale nano material simulations. It is a divide-and-conquer approach with a novel patching scheme that effectively cancels out the artificial boundary effects, which exist in all divide-and-conquer schemes. This method has made ab initio simulations of thousand-atom nanosystems feasible in a couple of hours, while retaining essentially the same accuracy as the direct calculation methods. The LS3DF method won the 2008 ACM Gordon Bell Prize for algorithm innovation. Our code has reached 442 Tflop/s running on 147,456 processors on the Cray XT5 (Jaguar) at OLCF, has been run on 163,840 processors on the Blue Gene/P (Intrepid) at ALCF, and has been applied to a system containing 36,000 atoms. In this paper, we will present the recent parallel performance results of this code, and will apply the method to asymmetric CdSe/CdS core/shell nanorods, which have potential applications in electronic devices and solar cells.

  14. Random number generators for large-scale parallel Monte Carlo simulations on FPGA

    NASA Astrophysics Data System (ADS)

    Lin, Y.; Wang, F.; Liu, B.

    2018-05-01

Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations, a parallel version of the additive lagged Fibonacci generator (Parallel ALFG), is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.

  15. Preparation of Protein Samples for NMR Structure, Function, and Small Molecule Screening Studies

    PubMed Central

    Acton, Thomas B.; Xiao, Rong; Anderson, Stephen; Aramini, James; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Kornhaber, Gregory; Lau, Jessica; Lee, Dong Yup; Liu, Gaohua; Maglaqui, Melissa; Ma, Lichung; Mao, Lei; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Shastry, Ritu; Swapna, G.V.T.; Tang, Yeufeng; Tong, Saichiu; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.

    2014-01-01

    In this chapter, we concentrate on the production of high quality protein samples for NMR studies. In particular, we provide an in-depth description of recent advances in the production of NMR samples and their synergistic use with recent advancements in NMR hardware. We describe the protein production platform of the Northeast Structural Genomics Consortium, and outline our high-throughput strategies for producing high quality protein samples for nuclear magnetic resonance (NMR) studies. Our strategy is based on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems and isotope enrichment in minimal media. We describe 96-well ligation-independent cloning and analytical expression systems, parallel preparative scale fermentation, and high-throughput purification protocols. The 6X-His affinity tag allows for a similar two-step purification procedure implemented in a parallel high-throughput fashion that routinely results in purity levels sufficient for NMR studies (> 97% homogeneity). Using this platform, the protein open reading frames of over 17,500 different targeted proteins (or domains) have been cloned as over 28,000 constructs. Nearly 5,000 of these proteins have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html), resulting in more than 950 new protein structures, including more than 400 NMR structures, deposited in the Protein Data Bank. The Northeast Structural Genomics Consortium pipeline has been effective in producing protein samples of both prokaryotic and eukaryotic origin. Although this paper describes our entire pipeline for producing isotope-enriched protein samples, it focuses on the major updates introduced during the last 5 years (Phase 2 of the National Institute of General Medical Sciences Protein Structure Initiative). 
Our advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are suitable for implementation in a large individual laboratory or by a small group of collaborating investigators for structural biology, functional proteomics, ligand screening and structural genomics research. PMID:21371586

  16. Multiscale Modeling: A Review

    NASA Astrophysics Data System (ADS)

    Horstemeyer, M. F.

This review of multiscale modeling covers a brief history of various multiscale methodologies related to solid materials and the associated experimental influences, the influence of multiscale modeling on different disciplines, and some examples of multiscale modeling in the design of structural components. Although computational multiscale modeling methodologies were developed in the late twentieth century, the fundamental notions of multiscale modeling have been around since da Vinci studied different sizes of ropes. The recent rapid growth in multiscale modeling is the result of the confluence of parallel computing power, experimental capabilities to characterize structure-property relations down to the atomic level, and theories that admit multiple length scales. Research focused on multiscale modeling has broached different disciplines (solid mechanics, fluid mechanics, materials science, physics, mathematics, biology, and chemistry), different regions of the world (most continents), and different length scales (from atoms to autos).

  17. Using Agent Base Models to Optimize Large Scale Network for Large System Inventories

    NASA Technical Reports Server (NTRS)

    Shameldin, Ramez Ahmed; Bowling, Shannon R.

    2010-01-01

The aim of this paper is to use Agent Base Models (ABM) to optimize large scale network handling capabilities for large system inventories and to implement strategies for the purpose of reducing capital expenses. The models used in this paper implement computational algorithms and procedures in Matlab to simulate agent-based models, and run on clusters that provide the high-performance computing needed to execute the programs in parallel. In both cases, a model is defined as a compilation of a set of structures and processes assumed to underlie the behavior of a network system.

  18. Magnetic Shear Damped Polar Convective Fluid Instabilities

    NASA Astrophysics Data System (ADS)

    Atul, Jyoti K.; Singh, Rameswar; Sarkar, Sanjib; Kravchenko, Oleg V.; Singh, Sushil K.; Chattopadhyaya, Prabal K.; Kaw, Predhiman K.

    2018-01-01

The influence of magnetic field shear on the E × B (and/or gravitational) and Current Convective Instabilities (CCI) occurring in the high-latitude F-layer ionosphere is studied. It is shown that magnetic shear reduces the growth rate of these instabilities. The magnetic shear-induced stabilization is more effective at the larger scale sizes (≥ tens of kilometers), while at the scintillation-causing intermediate scale sizes (˜ a few kilometers) the growth rate remains largely unaffected. The eigenmode structure gets localized about a rational surface due to finite magnetic shear and has broken reflectional symmetry due to centroid shift of the mode by equilibrium parallel flow or current.

  19. Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

    NASA Astrophysics Data System (ADS)

    Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

    2016-07-01

    Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
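The domain-decomposition schemes described above start from the locality of the potential: with a finite interaction range, atoms can be binned into a uniform grid of cells, and each cell interacts only with its 26 neighbours, so cells can be distributed over processors. The binning step is sketched below as a generic illustration (not BOPfox code); box size and cell counts are arbitrary.

```python
import numpy as np

def decompose(positions, box, ncells):
    """Assign atoms to cells of a uniform ncells^3 grid over a cubic box
    of side `box` (the linked-cell step underlying 3-D domain
    decomposition). Returns {flat_cell_index: [atom indices]}."""
    idx = np.floor(positions / box * ncells).astype(int) % ncells
    flat = idx[:, 0] * ncells * ncells + idx[:, 1] * ncells + idx[:, 2]
    cells = {}
    for atom, c in enumerate(flat):
        cells.setdefault(int(c), []).append(atom)
    return cells
```

In a parallel code, each MPI rank would own a contiguous block of cells and exchange only the boundary ("halo") cells with its neighbours each step.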

  20. Partitioning problems in parallel, pipelined and distributed computing

    NASA Technical Reports Server (NTRS)

    Bokhari, S.

    1985-01-01

    The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple satellite system: partitioning multiple chain structured parallel programs, multiple arbitrarily structured serial programs and single tree structured parallel programs. In addition, the problems of partitioning chain structured parallel programs across chain connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.
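The flavour of problem addressed here — split a chain-structured program into contiguous pieces so the most heavily loaded processor is as light as possible — can be illustrated compactly. The sketch below solves that bottleneck chain-partitioning problem with the standard binary-search-plus-greedy technique; it is a simpler stand-in for, not an implementation of, Bokhari's Sum-Bottleneck path algorithm.

```python
def partition_chain(weights, p):
    """Minimum achievable bottleneck (largest per-processor load) when a
    chain of module weights is split into at most p contiguous pieces.
    Binary-search the answer; a greedy left-to-right packing checks
    whether a candidate capacity is feasible."""
    def feasible(cap):
        parts, load = 1, 0
        for w in weights:
            if w > cap:
                return False
            if load + w > cap:
                parts += 1
                load = w
            else:
                load += w
        return parts <= p

    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For example, splitting the chain [1, 2, 3, 4, 5] over two processors is best done as [1, 2, 3] | [4, 5], giving a bottleneck of 9.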

  1. Extending substructure based iterative solvers to multiple load and repeated analyses

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1993-01-01

Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--often also called domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. 
As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel processor, a pair of forward/backward substitutions at each step consumes 15.0 seconds.
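The core idea — reuse information from earlier right-hand sides so later solves take very few iterations — can be sketched with a much-simplified stand-in for the paper's K-orthogonal subspace method: project the new right-hand side onto the span of previously computed solutions to build a warm-start guess, then run plain conjugate gradients. Everything below (function names, test matrix, tolerances) is an illustrative assumption, not the authors' algorithm.

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, maxit=500):
    """Textbook conjugate gradients for SPD A, started from x0.
    Returns the solution and the iteration count."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    it = 0
    while np.linalg.norm(r) > tol and it < maxit:
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
        it += 1
    return x, it

def projected_guess(A, b, prev_solutions):
    """Galerkin projection of the new right-hand side onto the span of
    previous solutions — a simplified stand-in for reusing K-orthogonal
    subspaces across repeated right-hand sides."""
    X = np.column_stack(prev_solutions)
    c = np.linalg.solve(X.T @ A @ X, X.T @ b)
    return X @ c
```

When successive right-hand sides are close (as in time-stepping), the projected guess leaves only a tiny residual, so the warm-started solve needs no more iterations than a cold start — the same effect as the one-iteration-per-step behaviour reported above.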

  2. Synchronized motion control and precision positioning compensation of a 3-DOFs macro-micro parallel manipulator fully actuated by piezoelectric actuators

    NASA Astrophysics Data System (ADS)

    Zhang, Quan; Li, Chaodong; Zhang, Jiantao; Zhang, Xu

    2017-11-01

The macro-micro combined approach, as an effective way to realize trans-scale nano-precision positioning with multiple dimensions and high velocity, plays a significant role in the integrated circuit manufacturing field. A 3-degrees-of-freedom (3-DOFs) macro-micro manipulator is designed and analyzed to reconcile the conflicting demands of large stroke, high precision and multiple DOFs. The macro manipulator is a 3-Prismatic-Revolute-Revolute (3-PRR) structure parallel manipulator which is driven by three linear ultrasonic motors. The dynamic model and the cross-coupling error based synchronized motion controller of the 3-PRR parallel manipulator are theoretically analyzed and experimentally tested. To further improve the positioning accuracy, a 3-DOFs monolithic compliant manipulator actuated by three piezoelectric stack actuators is designed. Then a multilayer BP neural network based inverse kinematic model identifier is developed to perform the positioning control. Finally, by forming the macro-micro structure, the dual stage manipulator successfully achieved the positioning task from the point (2 mm, 2 mm, 0 rad) back to the original point (0 mm, 0 mm, 0 rad) with the translation errors in X and Y directions less than ±50 nm and the rotation error around Z axis less than ±1 μrad, respectively.
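The cross-coupling-error idea mentioned above augments each axis's tracking feedback with a term penalizing its deviation from the other axes, so mismatched actuators stay synchronized while converging. The toy simulation below illustrates that effect for three first-order axes with deliberately mismatched gains; the plant model, gains, and step sizes are all invented for illustration and bear no relation to the paper's 3-PRR dynamics.

```python
import numpy as np

def run(kc, kp=2.0, g=(1.0, 0.8, 1.2), dt=0.01, steps=400):
    """Three first-order axes with mismatched gains g track a unit step.
    Control u_i = kp*e_i + kc*(3*e_i - sum(e)) adds a cross-coupling term
    (each axis's deviation from the others) on top of plain tracking.
    Returns the worst synchronization error (max spread between axes)."""
    x = np.zeros(3)
    worst_sync = 0.0
    for _ in range(steps):
        e = 1.0 - x
        u = kp * e + kc * (3.0 * e - e.sum())
        x = x + dt * np.asarray(g) * u
        worst_sync = max(worst_sync, x.max() - x.min())
    return worst_sync
```

With the coupling on (kc > 0), inter-axis deviations decay several times faster than the tracking error itself, so the axes move together even though their gains differ.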

  3. Simulation of double layers in a model auroral circuit with nonlinear impedance

    NASA Technical Reports Server (NTRS)

    Smith, R. A.

    1986-01-01

A reduced circuit description of the U-shaped potential structure of a discrete auroral arc, consisting of the flank transmission line plus parallel-electric-field region, is used to provide the boundary condition for one-dimensional simulations of the double-layer evolution. The model yields asymptotic scalings of the double-layer potential, as a function of an anomalous transport coefficient alpha and of the perpendicular length scale l(a) of the arc. The arc potential phi(DL) scales approximately linearly with alpha and, for fixed alpha, approximately as a power z of the arc scale l(a). Using parameters appropriate to the auroral zone acceleration region, potentials phi(DL) of about 10 kV scale to projected ionospheric dimensions of about 1 km, with power flows of the order of magnitude of substorm dissipation rates.

  4. The Role of the Mantle on Structural Reactivation at the Plate Tectonics Scale (Invited)

    NASA Astrophysics Data System (ADS)

    Vauchez, A. R.; Tommasi, A.

    2009-12-01

During orogeny, rifting, and in major strike-slip faults, the lithospheric mantle undergoes solid-state flow to accommodate the imposed strain. This deformation occurs mostly through crystal plasticity processes, like dislocation creep, and results in the development of a crystallographic preferred orientation (CPO) of olivine and pyroxene. Because these minerals, especially olivine, display strongly anisotropic physical properties, their preferred orientation confers anisotropic properties at the scale of the rock. When the deformation event comes to its end, the CPO are "frozen" and remain stable for millions or even billions of years if no other deformation subsequently affects the lithospheric mantle. This means that anisotropic properties preserving a memory of previous deformation events may subsist in the continental mantle over very long periods of time. One of the main consequences of a well-developed olivine CPO is an anisotropic mantle viscosity and hence a deformation dependent on the orientation of the tectonic stresses relative to the orientation of the olivine CPO inherited from past orogenic events. The most obvious expression of this anisotropic mechanical behaviour is the influence of the inherited tectonic fabric on continental rifting. Most continental rifts that lead to successful continental breakup, like in the early Atlantic or the western Indian systems, formed parallel to ancient collisional belts. Moreover, the early stages of deformation in these systems are characterized by a transtensional strain regime involving a large component of strike-slip shearing parallel to the inherited fabric. 
The link between the lithospheric mantle fabric and the rift structure is further supported by seismic anisotropy measurements in major rifts (e.g., the East-African Rift) or at passive continental margins (e.g., the Atlantic Ocean) that show fast split S-waves polarized in a direction parallel to both the inherited fabric and the trend of the rift, and by the analysis of the CPO in mantle xenoliths collected in such areas. These observations are consistent with recent multi-scale numerical models showing that olivine CPO frozen in the lithospheric mantle result in an anisotropic mechanical behaviour. In a plate submitted to extension, CPO-induced anisotropy favours the reactivation in transtension of lithospheric-scale strike slip faults that are oblique to the imposed tensional stresses. Further investigation is needed to constrain the role of an inherited mechanical anisotropy of the lithosphere during compressional events and the possible feedbacks between an anisotropic viscous deformation of the lithospheric mantle and the seismic cycle. In both cases, crust-mantle coupling is likely for large-scale structures and mantle CPO may influence the kinematics of tectonic systems, at least during the initial stages of their evolution.

  5. Neurite, a Finite Difference Large Scale Parallel Program for the Simulation of Electrical Signal Propagation in Neurites under Mechanical Loading

    PubMed Central

    García-Grajales, Julián A.; Rucabado, Gabriel; García-Dopico, Antonio; Peña, José-María; Jérusalem, Antoine

    2015-01-01

With the growing body of research on traumatic brain injury and spinal cord injury, computational neuroscience has recently focused its modeling efforts on neuronal functional deficits following mechanical loading. However, in most of these efforts, cell damage is generally only characterized by purely mechanistic criteria, functions of quantities such as stress, strain or their corresponding rates. The modeling of functional deficits in neurites as a consequence of macroscopic mechanical insults has been rarely explored. In particular, a quantitative mechanically based model of electrophysiological impairment in neuronal cells, Neurite, has only very recently been proposed. In this paper, we present the implementation details of this model: a finite difference parallel program for simulating electrical signal propagation along neurites under mechanical loading. Following the application of a macroscopic strain at a given strain rate produced by a mechanical insult, Neurite is able to simulate the resulting neuronal electrical signal propagation, and thus the corresponding functional deficits. The simulation of the coupled mechanical and electrophysiological behaviors requires computationally expensive calculations that increase in complexity as the network of the simulated cells grows. The solvers implemented in Neurite (explicit and implicit) were therefore parallelized using graphics processing units in order to reduce the burden of the simulation costs of large scale scenarios. Cable Theory and Hodgkin-Huxley models were implemented to account for the electrophysiological passive and active regions of a neurite, respectively, whereas a coupled mechanical model accounting for the neurite mechanical behavior within its surrounding medium was adopted as a link between electrophysiology and mechanics. 
This paper provides the details of the parallel implementation of Neurite, along with three different application examples: a long myelinated axon, a segmented dendritic tree, and a damaged axon. The capabilities of the program to deal with large scale scenarios, segmented neuronal structures, and functional deficits under mechanical loading are specifically highlighted. PMID:25680098
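    As a rough illustration of the numerics involved, the passive (Cable Theory) part of such a solver reduces to an explicit finite-difference update of the cable equation. The following is a minimal sketch under assumed, dimensionless parameters; names and values are illustrative, not taken from Neurite:

```python
# Hypothetical sketch of one explicit finite-difference step for the
# passive cable equation, tau * dV/dt = lambda^2 * d^2V/dx^2 - V.
# Parameter values are illustrative assumptions.

def cable_step(V, dx, dt, lam=0.1, tau=1.0):
    """Advance membrane potential V (list of floats) by one time step."""
    n = len(V)
    V_new = V[:]
    for i in range(1, n - 1):
        d2V = (V[i - 1] - 2.0 * V[i] + V[i + 1]) / dx**2
        V_new[i] = V[i] + dt / tau * (lam**2 * d2V - V[i])
    # Sealed-end (zero-flux) boundaries
    V_new[0] = V_new[1]
    V_new[-1] = V_new[-2]
    return V_new

# A localized depolarization spreads to neighbors while decaying:
V = [0.0] * 21
V[10] = 1.0
for _ in range(100):
    V = cable_step(V, dx=0.05, dt=0.001)
```

    With stable step sizes (dt small relative to dx**2), this passive spread-and-decay behavior is what the program couples to Hodgkin-Huxley active regions.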

  6. System for inspecting large size structural components

    DOEpatents

    Birks, Albert S.; Skorpik, James R.

    1990-01-01

    The present invention relates to a system for inspecting large scale structural components such as concrete walls or the like. The system includes a mobile gamma radiation source and a mobile gamma radiation detector. The source and detector are constructed and arranged for simultaneous movement along parallel paths in alignment with one another on opposite sides of a structural component being inspected. A control system provides signals which coordinate the movements of the source and detector and receives and records the radiation level data developed by the detector as a function of source and detector positions. The radiation level data is then analyzed to identify areas containing defects corresponding to unexpected variations in the radiation levels detected.

  7. A parallel form of the Gudjonsson Suggestibility Scale.

    PubMed

    Gudjonsson, G H

    1987-09-01

    The purpose of this study is twofold: (1) to present a parallel form of the Gudjonsson Suggestibility Scale (GSS, Form 1); (2) to study test-retest reliabilities of interrogative suggestibility. Three groups of subjects were administered the two suggestibility scales in a counterbalanced order. Group 1 (28 normal subjects) and Group 2 (32 'forensic' patients) completed both scales within the same testing session, whereas Group 3 (30 'forensic' patients) completed the two scales between one week and eight months apart. All the correlations were highly significant, giving support for high 'temporal consistency' of interrogative suggestibility.
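    The reported reliabilities rest on product-moment correlations between Form 1 and Form 2 totals; a minimal sketch with made-up scores (the data below are illustrative, not the study's):

```python
# Pearson correlation between two parallel-form score lists.
# The score values are hypothetical, for illustration only.
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

form1 = [5, 9, 12, 7, 10, 14, 6, 11]   # hypothetical GSS Form 1 totals
form2 = [6, 10, 11, 8, 9, 15, 5, 12]   # hypothetical Form 2 totals
r = pearson_r(form1, form2)            # high r suggests parallel forms
```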

  8. Parallel Simulation of Unsteady Turbulent Flames

    NASA Technical Reports Server (NTRS)

    Menon, Suresh

    1996-01-01

Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, their high cost and limited availability make practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render the unsteady simulations of the type discussed above more feasible and affordable. This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy viscosity based subgrid models. This is acceptable for the moment/energy closure since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used.
Recently, a new model for turbulent combustion was developed, in which combustion is modeled within the subgrid (small scales) using a methodology that simulates the mixing, molecular transport, and chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion and chemical kinetics and, therefore, within each grid cell, a significant amount of computation must be carried out before the large-scale (LES resolved) effects are incorporated. Therefore, this approach is uniquely suited for parallel processing and has been implemented on various systems such as the Intel Paragon, IBM SP-2, Cray T3D and SGI Power Challenge (PC) using the system-independent Message Passing Interface (MPI). In this paper, timing data on these machines is reported along with some characteristic results.

  9. Integration experiences and performance studies of a COTS parallel archive system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Hsing-bung; Scott, Cody; Grider, Gary

    2010-01-01

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interfaces, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds, such as more caching and less robust semantics. Currently the number of extremely scalable parallel archive solutions is very small, especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc.
We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of future archival storage systems.
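    The central capability described here, moving a single large striped file to many targets in parallel, can be sketched as follows; the striping layout and file names are illustrative assumptions, not the actual file-system/backup-product integration:

```python
# Toy parallel "archive": split one large file into N stripes and write
# each stripe concurrently (the targets stand in for tape drives).
import os, tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_stripe(src_path, dst_path, offset, length):
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        dst.write(src.read(length))

def parallel_archive(src_path, n_stripes, out_dir):
    size = os.path.getsize(src_path)
    stripe = -(-size // n_stripes)  # ceiling division
    with ThreadPoolExecutor(max_workers=n_stripes) as pool:
        jobs = [pool.submit(copy_stripe, src_path,
                            os.path.join(out_dir, f"stripe_{i}.bin"),
                            i * stripe, stripe)
                for i in range(n_stripes)]
    for j in jobs:
        j.result()  # propagate any I/O errors
    return [os.path.join(out_dir, f"stripe_{i}.bin")
            for i in range(n_stripes)]

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "big.dat")
    with open(src, "wb") as f:
        f.write(os.urandom(1 << 16))
    stripes = parallel_archive(src, 4, d)
    original = open(src, "rb").read()
    reassembled = b"".join(open(p, "rb").read() for p in stripes)
```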

  10. Integration experiments and performance studies of a COTS parallel archive system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Hsing-bung; Scott, Cody; Grider, Gary

    2010-06-16

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interfaces, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds, such as more caching and less robust semantics. Currently the number of extremely scalable parallel archive solutions is very small, especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc.
We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address requirements of future archival storage systems.

  11. Anisotropic transverse mixing and its effect on reaction rates in multi-scale, 3D heterogeneous porous media

    NASA Astrophysics Data System (ADS)

    Engdahl, N. B.

    2016-12-01

    Mixing rates in porous media have been a heavily researched topic in recent years, covering analytic, random, and structured fields. However, there are some persistent assumptions and common features to these models that raise questions about the generality of the results. One of these commonalities is the orientation of the flow field with respect to the heterogeneity structure: the two are almost always defined to be parallel to each other if there is an elongated axis of permeability correlation. Given the vastly different tortuosities for flow parallel to bedding and flow transverse to bedding, this assumption of parallel orientation may have significant effects on reaction rates when natural flows deviate from this assumed setting. This study investigates the role of orientation on mixing and reaction rates in multi-scale, 3D heterogeneous porous media with varying degrees of anisotropy in the correlation structure. Ten realizations of a small flow field, with three anisotropy levels, were simulated for flow parallel and transverse to bedding. Transport was simulated in each model with an advective-diffusive random walk and reactions were simulated using the chemical Langevin equation. The reaction system is a vertically segregated, transverse mixing problem between two mobile reactants. The results show that different transport behaviors and reaction rates are obtained by simply rotating the direction of flow relative to bedding, even when the net flux in both directions is the same. This kind of behavior was observed for three different weightings of the initial condition: 1) uniform, 2) flux-based, and 3) travel-time based. The different schemes resulted in 20-50% more mass formation in the transverse direction than the longitudinal. The greatest variability in mass was observed for the flux weights and these were proportionate to the level of anisotropy.
The implications of this study are that flux or travel time weights do not provide any guarantee of a fair comparison in this kind of a mixing scenario and that the role of directional tendencies on reaction rates can be significant. Further, it may be necessary to include anisotropy in future upscaled models to create robust methods that give representative reaction rates for any flow direction relative to geologic bedding.
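    The transport scheme named above, an advective-diffusive random walk, can be sketched in one dimension; the velocity, diffusion coefficient, and particle count below are illustrative assumptions, not the study's parameters:

```python
# One-dimensional advective-diffusive random walk:
# each particle moves x += v*dt + sqrt(2*D*dt) * N(0, 1) per step.
import random

def random_walk(n_particles, n_steps, v=1.0, D=0.01, dt=0.1, seed=42):
    rng = random.Random(seed)
    sigma = (2.0 * D * dt) ** 0.5
    xs = [0.0] * n_particles
    for _ in range(n_steps):
        xs = [x + v * dt + sigma * rng.gauss(0.0, 1.0) for x in xs]
    return xs

xs = random_walk(2000, 50)
mean_x = sum(xs) / len(xs)   # expected near v * n_steps * dt = 5.0
```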

  12. Transpressional deformation style and AMS fabrics adjacent to the southernmost segment of the San Andreas fault, Durmid Hill, CA

    NASA Astrophysics Data System (ADS)

    French, M.; Wojtal, S. F.; Housen, B.

    2006-12-01

    In the Salton Trough, the trace of the San Andreas Fault (SAF) ends where it intersects the NNW-trending Brawley seismic zone at Durmid Hill (DH). The topographic relief of DH is a product of faulting and folding of Pleistocene Borrego Formation strata (Babcock, 1974). Burgmann's (1991) detailed mapping and analysis of the western part of DH showed that the folds and faults accommodate transpression. Key to Burgmann's work was the recognition that the ~2m thick Bishop Ash, a prominent marker horizon, has been elongated parallel to the hinges of folds and boudinaged. We are mapping in detail the eastern portion of DH, nearer to the trace of the SAF. Folds in the eastern part of DH are tighter and thrust faulting is more prominent, consistent with greater shortening magnitude oblique to the SAF. Boudinage of the ash layer again indicates elongation parallel to fold hinges and subparallel to the SAF. The Bishop Ash locally is <1m thick along fold limbs in eastern DH, suggesting that significant continuous deformation accompanied the development of map-scale features. We measured anisotropy of magnetic susceptibility (AMS) fabrics in the Bishop Ash in order to assess continuous deformation in the Ash at DH. Because the Bishop Ash at DH is altered, consisting mainly of silica glass and clay minerals, samples from DH have significantly lower magnetic susceptibilities than Bishop Ash samples from elsewhere in the Salton Trough. With such low susceptibilities, there is significant scatter in the orientation of magnetic foliation and lineation in our samples. Still, in some Bishop samples within 1 km of the SAF, magnetic foliation is consistent with fold-related flattening. Magnetic lineation in these samples is consistently sub-parallel to fold hinges, parallel to the elongation direction inferred from boudinage. 
Even close to the trace of the SAF, this correlation breaks down in map-scale zones where fold hinge lines change attitude, fold shapes change, and the distribution and orientations of fractures and veins change. These zones of structural complication separate broader regions of more uniform deformation patterns. Together, the geometry of structures and AMS fabrics suggest that deformation in eastern DH occurs by the distortion and reorientation of more or less coherent blocks separated by narrow zones where structural elements change orientation.

  13. Scale-space for empty catheter segmentation in PCI fluoroscopic images.

    PubMed

    Bacchuwar, Ketan; Cousty, Jean; Vaillant, Régis; Najman, Laurent

    2017-07-01

    In this article, we present a method for empty guiding catheter segmentation in fluoroscopic X-ray images. Because the guiding catheter is a commonly visible landmark, its segmentation is an important and difficult building block for Percutaneous Coronary Intervention (PCI) procedure modeling. In a number of clinical situations, the catheter is empty and appears as a low-contrast structure with two parallel and partially disconnected edges. To segment it, we work on the level-set scale-space of the image, the min tree, to extract curve blobs. We then propose a novel structural scale-space, a hierarchy built on these curve blobs. The deep connected component, i.e. the cluster of curve blobs on this hierarchy, that maximizes the likelihood to be an empty catheter is retained as the final segmentation. We evaluate the performance of the algorithm on a database of 1250 fluoroscopic images from 6 patients. As a result, we obtain very good qualitative and quantitative segmentation performance, with mean precision and recall of 80.48% and 63.04%, respectively. We develop a novel structural scale-space to segment a structured object, the empty catheter, in challenging situations where the information content is very sparse in the images. Fully-automatic empty catheter segmentation in X-ray fluoroscopic images is an important and preliminary step in PCI procedure modeling, as it aids in tagging the arrival and removal location of other interventional tools.

  14. Field of genes: using Apache Kafka as a bioinformatic data repository.

    PubMed

    Lawlor, Brendan; Lynch, Richard; Mac Aogáin, Micheál; Walsh, Paul

    2018-04-01

    Bioinformatic research is increasingly dependent on large-scale datasets, accessed either from private or public repositories. An example of a public repository is the National Center for Biotechnology Information's (NCBI) Reference Sequence (RefSeq) database. These repositories must decide in what form to make their data available. Unstructured data can be put to almost any use but are limited in how access to them can be scaled. Highly structured data offer improved performance for specific algorithms but limit the wider usefulness of the data. We present an alternative: lightly structured data stored in Apache Kafka in a way that is amenable to parallel access and streamed processing, including subsequent transformations into more highly structured representations. We contend that this approach could provide a flexible and powerful nexus of bioinformatic data, bridging the gap between low structure on one hand, and high performance and scale on the other. To demonstrate this, we present a proof-of-concept version of NCBI's RefSeq database using this technology. We measure the performance and scalability characteristics of this alternative with respect to flat files. The proof of concept scales almost linearly as more compute nodes are added, outperforming the standard approach using files. Apache Kafka merits consideration as a fast and more scalable but general-purpose way to store and retrieve bioinformatic data, for public, centralized reference datasets such as RefSeq and for private clinical and experimental data.
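    The parallel-access property described here rests on keyed partitioning: records with the same key always land in the same partition, so partitions can be scanned independently by concurrent readers. A minimal sketch of the idea; the hash scheme and accession keys are illustrative, not Kafka's actual default partitioner:

```python
# Deterministic key-to-partition mapping, the mechanism that lets a
# partitioned log be consumed in parallel. Illustrative only.
import hashlib

def partition_for(key: str, n_partitions: int) -> int:
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_partitions

# Hypothetical RefSeq-style accession keys:
records = ["NC_000001.11", "NC_000002.12", "NC_000001.11", "NM_000546.6"]
partitions = [partition_for(acc, 8) for acc in records]
```

    Because the mapping is a pure function of the key, updates to the same accession are totally ordered within one partition while the overall load spreads across all eight.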

  15. A learnable parallel processing architecture towards unity of memory and computing

    NASA Astrophysics Data System (ADS)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with an aggressive 700-fold reduction in circuit area.

  16. A learnable parallel processing architecture towards unity of memory and computing.

    PubMed

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with an aggressive 700-fold reduction in circuit area.

  17. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

    PubMed Central

    Manolakos, Elias S.

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332

  18. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

    PubMed

    Sharma, Anuj; Manolakos, Elias S

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
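    The efficiency and F-measure figures quoted in the two records above follow from standard definitions; a quick sketch (the symmetric precision/recall pair below is an illustrative assumption, not the paper's data):

```python
# Parallel efficiency and F-measure, the two metrics reported above.
# 42x speedup on the 48-core SCC gives efficiency 42/48 = 0.875,
# consistent with the ~0.9 quoted in the abstracts.

def efficiency(speedup, n_cores):
    return speedup / n_cores

def f_measure(precision, recall):
    return 2 * precision * recall / (precision + recall)

eff = efficiency(42, 48)
f = f_measure(0.91, 0.91)   # symmetric example: F equals the common value
```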

  19. Developing eThread pipeline using SAGA-pilot abstraction for large-scale structural bioinformatics.

    PubMed

    Ragothaman, Anjani; Boddu, Sairam Chowdary; Kim, Nayong; Feinstein, Wei; Brylinski, Michal; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large, typically containing many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread--a meta-threading protein structure modeling tool--that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly amenable for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.

  20. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    PubMed Central

    Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large, typically containing many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread—a meta-threading protein structure modeling tool—that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly amenable for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285

  1. Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation

    PubMed Central

    Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho

    2014-01-01

    The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it usable in a clinical setting. PMID:27081299

  2. Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation.

    PubMed

    Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho

    2014-11-01

    The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it usable in a clinical setting.
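    The MLEM update at the heart of both records above is a simple multiplicative iteration; a minimal sketch on a toy system matrix (the matrix and counts are illustrative, not the SPECT data):

```python
# MLEM update: x_j <- (x_j / sum_i a_ij) * sum_i a_ij * y_i / (A x)_i.
# Tiny consistent toy problem for illustration only.

def mlem(A, y, n_iter=200):
    """A: list of rows a_ij (detector i, voxel j); y: measured counts."""
    n_vox = len(A[0])
    x = [1.0] * n_vox                                        # uniform start
    sens = [sum(row[j] for row in A) for j in range(n_vox)]  # sum_i a_ij
    for _ in range(n_iter):
        proj = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        back = [sum(A[i][j] * y[i] / proj[i] for i in range(len(A)))
                for j in range(n_vox)]
        x = [x[j] * back[j] / sens[j] for j in range(n_vox)]
    return x

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 detectors, 2 voxels
y = [2.0, 3.0, 5.0]                        # consistent with x = [2, 3]
x = mlem(A, y)
```

    Each iteration is a forward projection, a backprojection of measured-to-estimated count ratios, and a per-voxel multiplicative correction; the forward and backprojections are the sparse matrix-vector products that Spark/GraphX parallelizes.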

  3. Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

    2000-01-01

    HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).

  4. Data Acquisition and Linguistic Resources

    NASA Astrophysics Data System (ADS)

    Strassel, Stephanie; Christianson, Caitlin; McCary, John; Staderman, William; Olive, Joseph

    All human language technology demands substantial quantities of data for system training and development, plus stable benchmark data to measure ongoing progress. While creation of high quality linguistic resources is both costly and time consuming, such data has the potential to profoundly impact not just a single evaluation program but language technology research in general. GALE's challenging performance targets demand linguistic data on a scale and complexity never before encountered. Resources cover multiple languages (Arabic, Chinese, and English) and multiple genres -- both structured (newswire and broadcast news) and unstructured (web text, including blogs and newsgroups, and broadcast conversation). These resources include significant volumes of monolingual text and speech, parallel text, and transcribed audio combined with multiple layers of linguistic annotation, ranging from word aligned parallel text and Treebanks to rich semantic annotation.

  5. Parallel family trees for transfer matrices in the Potts model

    NASA Astrophysics Data System (ADS)

    Navarro, Cristobal A.; Canfora, Fabrizio; Hitschfeld, Nancy; Navarro, Gonzalo

    2015-02-01

The computational cost of transfer matrix methods for the Potts model is related to the question: in how many ways can two layers of a lattice be connected? Answering the question leads to the generation of a combinatorial set of lattice configurations. This set defines the configuration space of the problem, and the smaller it is, the faster the transfer matrix can be computed. The configuration space of generic (q, v) transfer matrix methods for strips is on the order of the Catalan numbers, which grow asymptotically as O(4^m), where m is the width of the strip. Other transfer matrix methods with a smaller configuration space do exist, but they make assumptions on the temperature or the number of spin states, or restrict the structure of the lattice. In this paper we propose a parallel algorithm that uses a sub-Catalan configuration space of O(3^m) to build the generic (q, v) transfer matrix in a compressed form. The improvement is achieved by grouping the original set of Catalan configurations into a forest of family trees, in such a way that the solution to the problem is now computed by solving the root node of each family. As a result, the algorithm becomes exponentially faster than the Catalan approach while remaining highly parallel. The resulting matrix is stored in a compressed form using O(3^m × 4^m) space, making numerical evaluation and decompression faster than evaluating the matrix in its O(4^m × 4^m) uncompressed form. Experimental results for different sizes of strip lattices show that the parallel family trees (PFT) strategy indeed runs exponentially faster than the Catalan Parallel Method (CPM), especially when dealing with dense transfer matrices. In terms of parallel performance, we report strong-scaling speedups of up to 5.7× when running on an 8-core shared memory machine and 28× for a 32-core cluster. 
The best balance of speedup and efficiency for the multi-core machine was achieved when using p = 4 processors, while for the cluster scenario it was in the range p ∈ [8, 10]. Because of the parallel capabilities of the algorithm, a large-scale execution of the parallel family trees strategy on a supercomputer could contribute to the study of wider strip lattices.
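The gap between the Catalan-sized configuration space and a sub-Catalan O(3^m) space can be seen by comparing the two counts directly. A back-of-the-envelope sketch (not the paper's algorithm):

```python
from math import comb

def catalan(m):
    """m-th Catalan number: the count of non-crossing layer
    configurations for a strip of width m, growing as ~4^m / m^(3/2)."""
    return comb(2 * m, m) // (m + 1)

# Catalan space vs. a 3^m-sized space: the ratio grows without bound,
# which is why a sub-Catalan algorithm is exponentially faster.
for m in (4, 8, 12, 16):
    print(m, catalan(m), 3 ** m, catalan(m) / 3 ** m)
```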

  6. Factor structure and criterion validity across the full scale and ten short forms of the CES-D among Chinese adolescents.

    PubMed

    Yang, Wenhui; Xiong, Ge; Garrido, Luis Eduardo; Zhang, John X; Wang, Meng-Cheng; Wang, Chong

    2018-04-16

We systematically examined the factor structure and criterion validity across the full scale and 10 short forms of the Center for Epidemiological Studies Depression Scale (CES-D) with Chinese youth. Participants were 5,434 Chinese adolescents in Grades 7 to 12 who completed the full CES-D; 612 of them further completed a structured diagnostic interview with the major depressive disorder (MDD) module of the Kiddie Schedule for Affective Disorder and Schizophrenia for School-age Children. Using a split-sample approach, a series of 4-, 3-, 2-, and 1-factor models were tested using exploratory structural equation modeling and cross-validated using confirmatory factor analysis; the dimensionality was also evaluated by parallel analysis in conjunction with the scree test and aided by factor mixture analysis. The results indicated that a single-factor model of depression with a wording method factor fitted the data well, and was the optimal structure underlying the scores of the full and shortened CES-D. Additionally, receiver operating characteristic curve analyses for MDD case detection showed that the CES-D full-scale scores accurately detected MDD youth (area under the curve [AUC] = .84). Furthermore, the short-form scores produced comparable AUCs with the full scale (.82 to .85), as well as similar levels of sensitivity and specificity when using optimal cutoffs. These findings suggest that depression among Chinese adolescents can be adequately measured and screened for by a single-factor structure underlying the CES-D scores, and that the short forms provide a viable alternative to the full instrument.

  7. Xray: N-dimensional, labeled arrays for analyzing physical datasets in Python

    NASA Astrophysics Data System (ADS)

    Hoyer, S.

    2015-12-01

Efficient analysis of geophysical datasets requires tools that both preserve and utilize metadata, and that transparently scale to process large datasets. Xray is such a tool, in the form of an open source Python library for analyzing the labeled, multi-dimensional array (tensor) datasets that are ubiquitous in the Earth sciences. Xray's approach pairs Python data structures based on the data model of the netCDF file format with the proven design and user interface of pandas, the popular Python data analysis library for labeled tabular data. On top of the NumPy array, xray adds labeled dimensions (e.g., "time") and coordinate values (e.g., "2015-04-10"), which it uses to enable a host of operations powered by these labels: selection, aggregation, alignment, broadcasting, split-apply-combine, interoperability with pandas and serialization to netCDF/HDF5. Many of these operations are enabled by xray's tight integration with pandas. Finally, to allow for easy parallelism and to enable its labeled data operations to scale to datasets that do not fit into memory, xray integrates with the parallel processing library dask.
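The label-based selection that xray provides can be illustrated with a toy class. This is a deliberately simplified sketch of the idea (dimension names plus coordinate labels resolving to positional indices), not xray's actual API:

```python
import numpy as np

class LabeledArray:
    """Toy labeled array: maps coordinate labels to array positions,
    in the spirit of xray's dimension/coordinate model."""
    def __init__(self, data, dims, coords):
        self.data = np.asarray(data)
        self.dims = dims                          # e.g. ("time", "city")
        self.coords = {d: list(c) for d, c in coords.items()}

    def sel(self, **labels):
        """Select by coordinate label instead of integer position."""
        idx = [slice(None)] * self.data.ndim
        for dim, label in labels.items():
            idx[self.dims.index(dim)] = self.coords[dim].index(label)
        return self.data[tuple(idx)]

temps = LabeledArray([[11.0, 14.5], [12.2, 15.1]],
                     dims=("time", "city"),
                     coords={"time": ["2015-04-10", "2015-04-11"],
                             "city": ["Utrecht", "Delft"]})
print(temps.sel(time="2015-04-10", city="Delft"))  # 14.5
```

The real library layers alignment, broadcasting, and netCDF serialization on top of this labeling idea.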

  8. Towards a large-scale scalable adaptive heart model using shallow tree meshes

    NASA Astrophysics Data System (ADS)

    Krause, Dorian; Dickopf, Thomas; Potse, Mark; Krause, Rolf

    2015-10-01

Electrophysiological heart models are sophisticated computational tools that place high demands on the computing hardware due to the high spatial resolution required to capture the steep depolarization front. To address this challenge, we present a novel adaptive scheme for resolving the depolarization front accurately using adaptivity in space. Our adaptive scheme is based on locally structured meshes. These tensor meshes in space are organized in a parallel forest of trees, which allows us to resolve complicated geometries and to realize high variations in the local mesh sizes with a minimal memory footprint in the adaptive scheme. We discuss both a non-conforming mortar element approximation and a conforming finite element space and present an efficient technique for the assembly of the respective stiffness matrices using matrix representations of the inclusion operators into the product space on the so-called shallow tree meshes. We analyzed the parallel performance and scalability for a two-dimensional ventricle slice as well as for a full large-scale heart model. Our results demonstrate that the method has good performance and high accuracy.

  9. Structural control on the emplacement of contemporaneous Sn-Ta-Nb mineralized LCT pegmatites and Sn bearing quartz veins: Insights from the Musha and Ntunga deposits of the Karagwe-Ankole Belt, Rwanda

    NASA Astrophysics Data System (ADS)

    Hulsbosch, Niels; Van Daele, Johanna; Reinders, Nathan; Dewaele, Stijn; Jacques, Dominique; Muchez, Philippe

    2017-10-01

The Nb-Ta-Sn pegmatites and Sn quartz veins of the Rwamagana-Musha-Ntunga area in eastern Rwanda are part of the Mesoproterozoic Karagwe-Ankole Belt. These commodities are, on a regional scale, spatiotemporally associated with the early Neoproterozoic fertile G4-granite generation. Although a transition from the lithium-cesium-tantalum pegmatites to cassiterite-microcline-quartz veins has been observed in the Rwamagana-Musha-Ntunga area, the structural control and the paragenetic relationship between the mineralized pegmatites and the Sn bearing quartz veins are largely unknown. Consequently, this study investigates the occurrence of pegmatites and quartz veins and the structural and lithological controls on their emplacement. The metasediments in the area are affected by a regional compressional regime with a shortening direction oriented N70E, which resulted in a N20W-oriented fold sequence. The Lake Muhazi granite is present in the center of the Karehe anticline. The structural orientations of pegmatites and quartz veins show that two important factors control their emplacement. The first control is the reactivation of pre-existing discontinuities such as the bedding, bedding-parallel joints or strike-slip fault planes. In view of the regional structural grain in the Rwamagana-Musha-Ntunga area, this corresponds with abundant N20W-oriented pegmatites and quartz veins. The reactivation is strongly related to the lithology of the host rocks. The Musha Formation, which mainly consists of decimeter- to meter-scale lithological alternations of metapelite, metasiltstone and metasandstone, represents the most suitable environment for bedding reactivation. This is reflected in the predominance of bedding-parallel pegmatites and quartz veins hosted by the Musha Formation. Strike-parallel joints were mainly observed in the competent lithologies. The second controlling factor is related to the regional post-compressional stress regime. 
New joints initiated upon emplacement of the pegmatites and quartz veins. The orientations of these joints are influenced by the regional stress regime and resulted in steep EW-oriented pegmatites and quartz veins in the Rwamagana-Musha-Ntunga area. The pegmatites and quartz veins are interpreted as having been emplaced under the influence of the prevailing regional stress regime. This post-compressional stress regime is characterized by a subvertical maximum compressive stress.

  10. Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ibrahim, Khaled Z.; Epifanovsky, Evgeny; Williams, Samuel

Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from compute-bound DGEMMs to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load imbalance: tasking and bulk-synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.

  11. Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends

    DOE PAGES

    Ibrahim, Khaled Z.; Epifanovsky, Evgeny; Williams, Samuel; ...

    2017-03-08

Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from compute-bound DGEMMs to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load imbalance: tasking and bulk-synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.

  12. Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

    NASA Technical Reports Server (NTRS)

    Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

    1991-01-01

    Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
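The Golden Block Search mentioned above parallelizes a golden-section-style line search; the serial 1-D version it builds on can be sketched as follows (an illustration of the technique, not the PV-SAP code):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Minimize a unimodal f on [a, b] by golden-section search.
    Each iteration shrinks the bracket by the inverse golden ratio;
    f is re-evaluated at both interior points for clarity."""
    invphi = (math.sqrt(5) - 1) / 2          # ~0.618
    c = b - invphi * (b - a)                 # left interior point
    d = a + invphi * (b - a)                 # right interior point
    while abs(b - a) > tol:
        if f(c) < f(d):                      # minimum lies in [a, d]
            b, d = d, c
            c = b - invphi * (b - a)
        else:                                # minimum lies in [c, b]
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2
```

In the block variant, several trial points per bracket are evaluated concurrently, one per processor, which is what makes the technique attractive on parallel-vector machines.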

  13. Acceleration of barium ions near 8000 km above an aurora

    NASA Technical Reports Server (NTRS)

    Stenbaek-Nielsen, H. C.; Hallinan, T. J.; Wescott, E. M.; Foeppl, H.

    1984-01-01

    A barium shaped charge, named Limerick, was released from a rocket launched from Poker Flat Research Range, Alaska, on March 30, 1982, at 1033 UT. The release took place in a small auroral breakup. The jet of ionized barium reached an altitude of 8100 km 14.5 min after release, indicating that there were no parallel electric fields below this altitude. At 8100 km the jet appeared to stop. Analysis shows that the barium at this altitude was effectively removed from the tip. It is concluded that the barium was actually accelerated upward, resulting in a large decrease in the line-of-sight density and hence the optical intensity. The parallel electric potential in the acceleration region must have been greater than 1 kV over an altitude interval of less than 200 km. The acceleration region, although presumably auroral in origin, did not seem to be related to individual auroral structures, but appeared to be a large-scale horizontal structure. The perpendicular electric field below, as deduced from the drift of the barium, was temporally and spatially very uniform and showed no variation related to individual auroral structures passing through.

  14. Some TEM observations of Al2O3 scales formed on NiCrAl alloys

    NASA Technical Reports Server (NTRS)

    Smialek, J.; Gibala, R.

    1979-01-01

The microstructural development of Al2O3 scales on NiCrAl alloys has been examined by transmission electron microscopy. Voids were observed within grains in scales formed on a pure NiCrAl alloy. Both voids and oxide grains grew measurably with oxidation time at 1100 C. The size and amount of porosity decreased towards the oxide-metal growth interface. The voids resulted from an excess number of oxygen vacancies near the oxide-metal interface. Short-circuit diffusion paths were discussed in reference to current growth stress models for oxide scales. Transient oxidation of pure, Y-doped, and Zr-doped NiCrAl was also examined. Oriented alpha-(Al, Cr)2O3 and Ni(Al, Cr)2O4 scales often coexisted in layered structures on all three alloys. Close-packed oxygen planes and directions in the corundum and spinel layers were parallel. The close relationship between oxide layers provided a gradual transition from initial transient scales to steady state Al2O3 growth.

  15. Analysis of passive scalar advection in parallel shear flows: Sorting of modes at intermediate time scales

    NASA Astrophysics Data System (ADS)

    Camassa, Roberto; McLaughlin, Richard M.; Viotti, Claudio

    2010-11-01

The time evolution of a passive scalar advected by parallel shear flows is studied for a class of rapidly varying initial data. Such situations are of practical importance in a wide range of applications from microfluidics to geophysics. In these contexts, it is well-known that the long-time evolution of the tracer concentration is governed by Taylor's asymptotic theory of dispersion. In contrast, we focus here on the evolution of the tracer at intermediate time scales. We show how intermediate regimes can be identified before Taylor's, and in particular, how the Taylor regime can be delayed indefinitely by properly manufactured initial data. A complete characterization of the sorting of these time scales and their associated spatial structures is presented. These analytical predictions are compared with highly resolved numerical simulations. Specifically, this comparison is carried out for the case of periodic variations in the streamwise direction on the short scale with envelope modulations on the long scales, and shows how this structure can lead to "anomalously" diffusive transients in the evolution of the scalar onto the ultimate regime governed by Taylor dispersion. Mathematically, the occurrence of these transients can be viewed as a competition in the asymptotic dominance between large Péclet (Pe) numbers and the long/short scale aspect ratios (L_Vel/L_Tracer ≡ k), two independent nondimensional parameters of the problem. We provide analytical predictions of the associated time scales by a modal analysis of the eigenvalue problem arising in the separation of variables of the governing advection-diffusion equation. The anomalous time scale in the asymptotic limit of large k Pe is derived for the short scale periodic structure of the scalar's initial data, for both exactly solvable cases and in general with WKBJ analysis. 
In particular, the exactly solvable sawtooth flow is especially important in that it provides a short cut to the exact solution to the eigenvalue problem for the physically relevant vanishing Neumann boundary conditions in linear-shear channel flow. We show that the life of the corresponding modes at large Pe for this case is shorter than the ones arising from shear free zones in the fluid's interior. A WKBJ study of the latter modes provides a longer intermediate time evolution. This part of the analysis is technical, as the corresponding spectrum is dominated by asymptotically coalescing turning points in the limit of large Pe numbers. When large scale initial data components are present, the transient regime of the WKBJ (anomalous) modes evolves into one governed by Taylor dispersion. This is studied by a regular perturbation expansion of the spectrum in the small wavenumber regimes.

  16. Liquid-liquid transition in the ST2 model of water

    NASA Astrophysics Data System (ADS)

    Debenedetti, Pablo

    2013-03-01

    We present clear evidence of the existence of a metastable liquid-liquid phase transition in the ST2 model of water. Using four different techniques (the weighted histogram analysis method with single-particle moves, well-tempered metadynamics with single-particle moves, weighted histograms with parallel tempering and collective particle moves, and conventional molecular dynamics), we calculate the free energy surface over a range of thermodynamic conditions, we perform a finite size scaling analysis for the free energy barrier between the coexisting liquid phases, we demonstrate the attainment of diffusive behavior, and we perform stringent thermodynamic consistency checks. The results provide conclusive evidence of a first-order liquid-liquid transition. We also show that structural equilibration in the sluggish low-density phase is attained over the time scale of our simulations, and that crystallization times are significantly longer than structural equilibration, even under deeply supercooled conditions. We place our results in the context of the theory of metastability.

  17. An information-theoretic approach to motor action decoding with a reconfigurable parallel architecture.

    PubMed

    Craciun, Stefan; Brockmeier, Austin J; George, Alan D; Lam, Herman; Príncipe, José C

    2011-01-01

    Methods for decoding movements from neural spike counts using adaptive filters often rely on minimizing the mean-squared error. However, for non-Gaussian distribution of errors, this approach is not optimal for performance. Therefore, rather than using probabilistic modeling, we propose an alternate non-parametric approach. In order to extract more structure from the input signal (neuronal spike counts) we propose using minimum error entropy (MEE), an information-theoretic approach that minimizes the error entropy as part of an iterative cost function. However, the disadvantage of using MEE as the cost function for adaptive filters is the increase in computational complexity. In this paper we present a comparison between the decoding performance of the analytic Wiener filter and a linear filter trained with MEE, which is then mapped to a parallel architecture in reconfigurable hardware tailored to the computational needs of the MEE filter. We observe considerable speedup from the hardware design. The adaptation of filter weights for the multiple-input, multiple-output linear filters, necessary in motor decoding, is a highly parallelizable algorithm. It can be decomposed into many independent computational blocks with a parallel architecture readily mapped to a field-programmable gate array (FPGA) and scales to large numbers of neurons. By pipelining and parallelizing independent computations in the algorithm, the proposed parallel architecture has sublinear increases in execution time with respect to both window size and filter order.
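The MEE cost can be written in terms of a Gaussian (Parzen) kernel estimate of the quadratic information potential of the errors, and the pairwise structure of its gradient is exactly what makes the algorithm decomposable into independent blocks. A batch-gradient sketch for a linear filter follows; the variable names, learning rate, and kernel width are illustrative, not the paper's FPGA design:

```python
import numpy as np

def mee_update(w, X, d, sigma=1.0, lr=0.2):
    """One batch MEE step for a linear filter y = X @ w.
    Ascends the information potential V(e) = mean_ij G_sigma(e_i - e_j),
    which is equivalent to descending the quadratic Renyi error entropy.
    The O(N^2) pairwise terms are independent, hence parallelizable."""
    e = d - X @ w                          # per-sample errors
    diff = e[:, None] - e[None, :]         # pairwise error differences
    G = np.exp(-diff**2 / (2 * sigma**2))  # Gaussian kernel values
    # dV/dw: each pair (i, j) contributes G * diff * (x_i - x_j) / sigma^2
    grad = (G * diff)[:, :, None] * (X[:, None, :] - X[None, :, :])
    return w + lr * grad.mean(axis=(0, 1)) / sigma**2
```

In hardware, each pairwise kernel evaluation maps to its own computational block, which is the parallelism the FPGA design exploits.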

  18. Processor farming in two-level analysis of historical bridge

    NASA Astrophysics Data System (ADS)

    Krejčí, T.; Kruis, J.; Koudelka, T.; Šejnoha, M.

    2017-11-01

This contribution presents a processor farming method in connection with a multi-scale analysis. In this method, each macroscopic integration point or each finite element is connected with a certain meso-scopic problem represented by an appropriate representative volume element (RVE). The solution of a meso-scale problem then provides the effective parameters needed on the macro-scale. Such an analysis is suitable for parallel computing because the meso-scale problems can be distributed among many processors. The application of the processor farming method to a real world masonry structure is illustrated by an analysis of Charles Bridge in Prague. The three-dimensional numerical model simulates the coupled heat and moisture transfer of one half of arch No. 3, and it is part of a complex hygro-thermo-mechanical analysis which has been developed to determine the influence of climatic loading on the current state of the bridge.

  19. Using the orbiting companion to trace WR wind structures in the 29d WC8d + O8-9IV binary CV Ser

    NASA Astrophysics Data System (ADS)

    David-Uraz, Alexandre; Moffat, Anthony F. J.

    2011-07-01

    We have used continuous, high-precision, broadband visible photometry from the MOST satellite to trace wind structures in the WR component of CV Ser over more than a full orbit. Most of the small-scale light-curve variations are likely due to extinction by clumps along the line of sight to the O companion as it orbits and shines through varying columns of the WR wind. Parallel optical spectroscopy from the Mont Megantic Observatory is used to refine the orbital and wind-collision parameters, as well as to reveal line emission from clumps.

  20. Chromatin organization and global regulation of Hox gene clusters

    PubMed Central

    Montavon, Thomas; Duboule, Denis

    2013-01-01

During development, a properly coordinated expression of Hox genes within their different genomic clusters is critical for patterning the body plans of many animals with a bilateral symmetry. The fascinating correspondence between the topological organization of Hox clusters and their transcriptional activation in space and time has served as a paradigm for understanding the relationships between genome structure and function. Here, we review some recent observations, which revealed highly dynamic changes in the structure of chromatin at Hox clusters, in parallel with their activation during embryonic development. We discuss the relevance of these findings for our understanding of large-scale gene regulation. PMID:23650639

  1. First Applications of the New Parallel Krylov Solver for MODFLOW on a National and Global Scale

    NASA Astrophysics Data System (ADS)

    Verkaik, J.; Hughes, J. D.; Sutanudjaja, E.; van Walsum, P.

    2016-12-01

    Integrated high-resolution hydrologic models are increasingly being used for evaluating water management measures at field scale. Their drawbacks are large memory requirements and long run times. Examples of such models are The Netherlands Hydrological Instrument (NHI) model and the PCRaster Global Water Balance (PCR-GLOBWB) model. Typical simulation periods are 30-100 years with daily timesteps. The NHI model predicts water demands in periods of drought, supporting operational and long-term water-supply decisions. The NHI is a state-of-the-art coupling of several models: a 7-layer MODFLOW groundwater model (≈6.5M 250-m cells), a MetaSWAP model for the unsaturated zone (Richards emulator of ≈0.5M cells), and a surface water model (MOZART-DM). The PCR-GLOBWB model provides a grid-based representation of global terrestrial hydrology and this work uses the version that includes a 2-layer MODFLOW groundwater model (≈4.5M 10-km cells). The Parallel Krylov Solver (PKS) speeds up computation by both distributed memory parallelization (Message Passing Interface) and shared memory parallelization (Open Multi-Processing). PKS includes conjugate gradient, bi-conjugate gradient stabilized, and generalized minimal residual linear accelerators that use an overlapping additive Schwarz domain decomposition preconditioner. PKS can be used for both structured and unstructured grids and has been fully integrated in MODFLOW-USG using METIS partitioning and in iMODFLOW using RCB partitioning. iMODFLOW is an accelerated version of MODFLOW-2005 that is implicitly and online coupled to MetaSWAP. Results for benchmarks carried out on the Cartesius Dutch supercomputer (https://userinfo.surfsara.nl/systems/cartesius) for the PCR-GLOBWB model and on a 2x16 core Windows machine for the NHI model show speedups of up to 10-20× and 5-10×, respectively.
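The conjugate-gradient accelerator at the core of PKS can be sketched serially; the parallel solver distributes the matrix-vector product and dot products across Schwarz subdomains, none of which is shown in this minimal sketch:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=200):
    """Serial CG for a symmetric positive-definite system A x = b.
    Only two kernels dominate: A @ p (halo exchange in the parallel
    version) and the dot products (global reductions)."""
    x = np.zeros_like(b)
    r = b - A @ x                 # initial residual
    p = r.copy()                  # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)     # optimal step along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p # conjugate direction update
        rs = rs_new
    return x
```

In PKS, the same iteration runs per subdomain with an overlapping additive Schwarz preconditioner applied before each direction update.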

  2. A parallel algorithm for the initial screening of space debris collisions prediction using the SGP4/SDP4 models and GPU acceleration

    NASA Astrophysics Data System (ADS)

    Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

    2017-05-01

    Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a simulation of collision prediction for space debris of any amount and for any time span can be executed. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on Tesla C2075 of NVIDIA. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation.
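The initial screening step reduces, per batch, to evaluating many debris-asset separations at once. A NumPy sketch of that distance kernel follows; on the GPU each CUDA block would handle one batch of debris, and the function name and threshold here are illustrative:

```python
import numpy as np

def screen_close_approaches(debris_pos, assets_pos, threshold_km=10.0):
    """Flag (debris, asset) pairs whose instantaneous separation is
    below a threshold. Broadcasting gives the full (n_debris, n_assets)
    distance matrix in one vectorized pass, standing in for the
    per-batch GPU kernel."""
    diff = debris_pos[:, None, :] - assets_pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return np.argwhere(dist < threshold_km)   # rows: [debris_idx, asset_idx]
```

The full pipeline would propagate each object with SGP4/SDP4 to many epochs first, then run this screen per epoch batch.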

  3. Porting LAMMPS to GPUs.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, William Michael; Plimpton, Steven James; Wang, Peng

    2010-03-01

    LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality.
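The spatial decomposition LAMMPS uses assigns each particle to the processor that owns its region of the simulation box. A simplified sketch of that mapping, assuming a uniform periodic box and grid and ignoring load balancing and ghost atoms:

```python
import numpy as np

def spatial_decomposition(positions, box, grid):
    """Map each particle to a processor (rank) id by binning its
    periodically wrapped position into a regular 3-D grid of
    subdomains, as in a simplified LAMMPS-style decomposition."""
    frac = (positions % box) / box             # wrap into [0, 1) per axis
    cells = np.floor(frac * grid).astype(int)  # per-axis subdomain index
    # flatten the 3-D cell index to a single rank id
    return cells[:, 0] * grid[1] * grid[2] + cells[:, 1] * grid[2] + cells[:, 2]
```

With short-range potentials, each rank then only communicates with neighboring cells, which is what keeps the message-passing cost local as the system scales.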

  4. Butterfly scale form birefringence related to photonics.

    PubMed

    Vidal, Benedicto de Campos

    2011-12-01

Wings of the butterflies Morpho aega and Eryphanis reevesi were investigated in the present study by fluorescence, polarization and infra-red (IR) spectroscopic microscopy with the aim of identifying the oriented organization of their components and morphological details of their substructures. These wings were found to exhibit a strong iridescent glow depending on the angle of the incident light; their isolated scales exhibited blue fluorescence. Parallel columns or ridges extend from the pad and sockets to the dented apical region of the scale, and they are perpendicular to the ribs that connect the columnar ridges. The scales reveal linear dichroism (LD) visually, when attached to the wing matrix or isolated on slides. The LD was inferred to be textural and positive and was also demonstrated with IR microscopy. The scale columns and ribs are birefringent structures. Images obtained before and after birefringence compensation allowed a detailed study of the scale morphology. Form and intrinsic birefringence findings, estimated and discussed here in the context of nonlinear optical properties, bring the state of molecular order and the periodicity of the wing structure to the level of morphology. FT-IR absorption peaks were found at wavenumbers which correspond to symmetric and asymmetric (-N-H) stretching, symmetric (-C-H) stretching, amide I (-CO) stretching, amide II (-N-H), and β-linking. Based on LD results obtained with polarized IR, the molecular vibrations of the wing scales of M. aega and E. reevesi are assumed to be oriented with respect to the long axis of these structures.

  5. Very large scale characterization of graphene mechanical devices using a colorimetry technique.

    PubMed

    Cartamil-Bueno, Santiago Jose; Centeno, Alba; Zurutuza, Amaia; Steeneken, Peter Gerard; van der Zant, Herre Sjoerd Jan; Houri, Samer

    2017-06-08

    We use a scalable optical technique to characterize more than 21 000 circular nanomechanical devices made of suspended single- and double-layer graphene on cavities with different diameters (D) and depths (g). To maximize the contrast between suspended and broken membranes we used a model for selecting the optimal color filter. The method enables parallel and automatized image processing for yield statistics. We find the survival probability to be correlated with a structural mechanics scaling parameter given by D⁴/g³. Moreover, we extract a median adhesion energy of Γ = 0.9 J m⁻² between the membrane and the native SiO₂ at the bottom of the cavities.
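    The scaling parameter D^4/g^3 quoted above can be computed directly from the drum geometry. A minimal helper, with function name and unit choices of our own (not from the paper):

```python
# Hypothetical helper illustrating the structural-mechanics scaling
# parameter D^4 / g^3 reported for drum survival. The function name
# and unit conventions are our own choices, not the paper's.

def scaling_parameter(D_um, g_nm):
    """D^4 / g^3 with diameter D in micrometres and cavity depth g in nanometres."""
    D = D_um * 1e-6   # convert to metres
    g = g_nm * 1e-9   # convert to metres
    return D**4 / g**3

# Doubling the drum diameter raises the parameter 16-fold (D^4),
# while deeper cavities suppress it sharply (g^-3).
```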

  6. COLA with scale-dependent growth: applications to screened modified gravity models

    NASA Astrophysics Data System (ADS)

    Winther, Hans A.; Koyama, Kazuya; Manera, Marc; Wright, Bill S.; Zhao, Gong-Bo

    2017-08-01

    We present a general parallelized and easy-to-use code to perform numerical simulations of structure formation using the COLA (COmoving Lagrangian Acceleration) method for cosmological models that exhibit scale-dependent growth at the level of first and second order Lagrangian perturbation theory. For modified gravity theories we also include screening using a fast approximate method that covers all the main examples of screening mechanisms in the literature. We test the code by comparing it to full simulations of two popular modified gravity models, namely f(R) gravity and nDGP, and find good agreement in the modified gravity boost-factors relative to ΛCDM even when using a fairly small number of COLA time steps.

  7. Design of a space shuttle structural dynamics model

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A 1/8 scale structural dynamics model of a parallel burn space shuttle has been designed. Basic objectives were to represent the significant low frequency structural dynamic characteristics while keeping the fabrication costs low. The model was derived from the proposed Grumman Design 619 space shuttle. The design includes an orbiter, two solid rocket motors (SRM) and an external tank (ET). The ET consists of a monocoque LO2 tank, an intertank skirt with three frames to accept SRM attachment members, an LH2 tank with 10 frames of which 3 provide for orbiter attachment members, and an aft skirt with one frame to provide for aft SRM attachment members. The frames designed for the SRM attachments are fitted with transverse struts to take symmetric loads.

  8. Parallelization Issues and Particle-In-Cell Codes.

    NASA Astrophysics Data System (ADS)

    Elster, Anne Cathrine

    1994-01-01

    "Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate that it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies become significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache-line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache.
A consideration of the input data's effect on the simulation may lead to further improvements. For example, in the case of mean particle drift, it is often advantageous to partition the grid primarily along the direction of the drift. The particle-in-cell codes for this study were tested using physical parameters, which lead to predictable phenomena including plasma oscillations and two-stream instabilities. An overview of the most central references related to parallel particle codes is also given.
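    The cache-friendly grid reordering described above can be illustrated with a blocked (tiled) index mapping, a simplified stand-in of our own rather than the thesis's exact hierarchical data structure: grid points are laid out block by block so that a small neighbourhood occupies one contiguous stretch of memory.

```python
# Sketch of blocked (tiled) grid indexing in the spirit of the
# cache-line-friendly reordering described above (our own simplified
# illustration, not the original data structure).

def blocked_index(i, j, n, b):
    """Map 2D index (i, j) on an n-by-n grid to a 1D offset using
    b-by-b blocks. Assumes n is a multiple of b."""
    bi, bj = i // b, j // b          # which block the point falls in
    li, lj = i % b, j % b            # position inside that block
    blocks_per_row = n // b
    return (bi * blocks_per_row + bj) * b * b + li * b + lj

# With 4-by-4 blocks, any two points in the same block are at most
# 15 slots apart, whereas row-major order separates vertical
# neighbours by a full row of n slots.
```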

  9. Ultrasonic Nondestructive Evaluation of Pultruded Rod Stitched Efficient Unitized Structure (PRSEUS) During Large-Scale Load Testing and Rod Push-Out Testing

    NASA Technical Reports Server (NTRS)

    Johnston, Patrick H.; Juarez, Peter D.

    2016-01-01

    The Pultruded Rod Stitched Efficient Unitized Structure (PRSEUS) is a structural concept developed by the Boeing Company to address the complex structural design aspects associated with a pressurized hybrid wing body (HWB) aircraft configuration. The HWB has long been a focus of NASA's environmentally responsible aviation (ERA) project, following a building block approach to structures development, culminating with the testing of a nearly full-scale multi-bay box (MBB), representing a segment of the pressurized, non-circular fuselage portion of the HWB. PRSEUS is an integral structural concept wherein skins, frames, stringers and tear straps made of a variable number of layers of dry warp-knit carbon-fiber stacks are stitched together, then resin-infused and cured in an out-of-autoclave process. The PRSEUS concept has the potential for reducing the weight and cost and increasing the structural efficiency of transport aircraft structures. A key feature of PRSEUS is the damage-arresting nature of the stitches, which enables the use of fail-safe design principles. During the load testing of the MBB, ultrasonic nondestructive evaluation (NDE) was used to monitor several sites of intentional barely-visible impact damage (BVID) as well as to survey the areas surrounding the failure cracks after final loading to catastrophic failure. The damage-arresting ability of PRSEUS was confirmed by the results of NDE. In parallel with the large-scale structural testing of the MBB, mechanical tests of the PRSEUS rod-to-overwrap bonds were conducted by pushing the rod axially from a short length of stringer.

  10. Kink-style detachment folding in Bachu fold belt of central Tarim Basin, China: geometry and seismic interpretation

    NASA Astrophysics Data System (ADS)

    Bo, Zhang; Jinjiang, Zhang; Shuyu, Yan; Jiang, Liu; Jinhai, Zhang; Zhongpei, Zhang

    2010-05-01

    The phenomenon of kink banding is well known throughout the engineering and geophysical sciences. Associated with layered structures compressed in a layer-parallel direction, it arises for example in stratified geological systems under tectonic compression. Our work documents that it is also possible to develop super large-scale kink-bands in sedimentary sequences. We interpret the Bachu fold uplift belt of the central Tarim basin in western China to be composed of detachment folds flanked by megascopic-scale kink-bands. Previous principal fold models for the Bachu uplift belt incorporated components of large-scale thrust faulting, such as the imbricate fault-related fold model and the high-angle, reverse-faulted detachment fold model. Based on our observations in the outcrops and on two-dimensional seismic profiles, we interpret the first-order structures in the region as kink-band style detachment folds that accommodate regional shortening; thrust faulting can be a second-order deformation style occurring on the limbs of the detachment folds or at the cores of some folds to accommodate the further strain of these folds. The belt mainly consists of detachment folds overlying a ductile decollement layer. The crests of the detachment folds are bounded by large-scale kink-bands, which are zones of angularly folded strata. These low signal-to-noise, low-reflectivity zones observed on seismic profiles across the Bachu belt are poorly imaged sections, which resulted from steeply dipping bedding in the kink-bands. The substantial width (beyond 200 m) of these low-reflectivity zones, their sub-parallel edges in cross section, their orientations at a high angle to layering (between 50 and 60 degrees), and their conjugate geometry support a kink-band interpretation. The kink-band interpretation model is based on the Maximum Effective Moment criterion for continuous deformation, rather than the Mohr-Coulomb criterion for brittle fracture. 
Seismic modeling is done to identify the characteristics and nature of seismic waves within the kink-band and its fold structure, which supplies further evidence for the kink-band interpretation in the region.

  11. User's Guide for ENSAERO_FE Parallel Finite Element Solver

    NASA Technical Reports Server (NTRS)

    Eldred, Lloyd B.; Guruswamy, Guru P.

    1999-01-01

    A high fidelity parallel static structural analysis capability is created and interfaced to the multidisciplinary analysis package ENSAERO-MPI of Ames Research Center. This new module replaces ENSAERO's lower fidelity simple finite element and modal modules. Full aircraft structures may be more accurately modeled using the new finite element capability. Parallel computation is performed by breaking the full structure into multiple substructures. This approach is conceptually similar to ENSAERO's multizonal fluid analysis capability. The new substructure code is used to solve the structural finite element equations for each substructure in parallel. NASTRAN/COSMIC is utilized as a front end for this code. Its full library of elements can be used to create an accurate and realistic aircraft model. It is used to create the stiffness matrices for each substructure. The new parallel code then uses an iterative preconditioned conjugate gradient method to solve the global structural equations for the substructure boundary nodes.
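    The iterative preconditioned conjugate gradient method mentioned above can be illustrated with a minimal Jacobi-preconditioned CG solver. This is a generic textbook sketch, not the ENSAERO_FE implementation:

```python
import numpy as np

# Minimal Jacobi-preconditioned conjugate gradient solver for a
# symmetric positive-definite system A x = b, illustrating the class
# of method the abstract describes (not the ENSAERO_FE code).

def pcg(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    M_inv = 1.0 / np.diag(A)          # Jacobi (diagonal) preconditioner
    r = b - A @ x                     # initial residual
    z = M_inv * r                     # preconditioned residual
    p = z.copy()                      # initial search direction
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # conjugate direction update
        rz = rz_new
    return x

# Small stiffness-like SPD system as a smoke test
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(A, b)
```

    In a substructured solver, A would be the interface (boundary-node) operator and each matrix-vector product A @ p would be assembled in parallel from per-substructure contributions.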

  12. Final Technical Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kirkpatrick, R. James

    This document serves as the final report for United States Department of Energy Basic Energy Sciences Grant DE-FG02-08ER15929, “Computational and Spectroscopic Investigations of the Molecular Scale Structure and Dynamics of Geologically Important Fluids and Mineral-Fluid Interfaces” (R. James Kirkpatrick, P.I., A. O. Yazaydin, co-P.I.). The research under this grant was intimately tied to that supported by the parallel grant of the same title at Alfred (DOE DE-FG02-10ER16128; Geoffrey M. Bowers, P.I.).

  13. Parallel and distributed computation for fault-tolerant object recognition

    NASA Technical Reports Server (NTRS)

    Wechsler, Harry

    1988-01-01

    The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.

  14. Structural design using equilibrium programming formulations

    NASA Technical Reports Server (NTRS)

    Scotti, Stephen J.

    1995-01-01

    Solutions to increasingly larger structural optimization problems are desired. However, computational resources are strained to meet this need. New methods will be required to solve increasingly larger problems. The present approaches to solving large-scale problems involve approximations for the constraints of structural optimization problems and/or decomposition of the problem into multiple subproblems that can be solved in parallel. An area of game theory, equilibrium programming (also known as noncooperative game theory), can be used to unify these existing approaches from a theoretical point of view (considering the existence and optimality of solutions), and be used as a framework for the development of new methods for solving large-scale optimization problems. Equilibrium programming theory is described, and existing design techniques such as fully stressed design and constraint approximations are shown to fit within its framework. Two new structural design formulations are also derived. The first new formulation is another approximation technique which is a general updating scheme for the sensitivity derivatives of design constraints. The second new formulation uses a substructure-based decomposition of the structure for analysis and sensitivity calculations. Significant computational benefits of the new formulations compared with a conventional method are demonstrated.

  15. A Parallel Vector Machine for the PM Programming Language

    NASA Astrophysics Data System (ADS)

    Bellerby, Tim

    2016-04-01

    PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. 
Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes, will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.
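    The two mask representations described above, a Boolean field and a list of active vector-cell locations with a switch when active cells become sparse, can be sketched as follows. This is our own NumPy illustration of the idea, not the PM virtual machine's implementation; the names and the density threshold are assumptions.

```python
import numpy as np

# Sketch of masked vector operations with two interchangeable mask
# representations (our own illustration, not the PM implementation):
# a Boolean mask, and an equivalent list of active cell locations.

def apply_masked(op, vec, mask):
    """Apply op to the active cells of vec; inactive cells pass through.
    mask may be a Boolean array or an array/list of active indices."""
    out = vec.copy()
    if isinstance(mask, np.ndarray) and mask.dtype == bool:
        out[mask] = op(vec[mask])       # Boolean-field representation
    else:
        idx = np.asarray(mask)
        out[idx] = op(vec[idx])         # location-list representation
    return out

def to_location_list(mask, threshold=0.1):
    """Switch to a location list once active cells are sparse enough."""
    if mask.mean() < threshold:
        return np.flatnonzero(mask)
    return mask

v = np.arange(8, dtype=float)
m = np.array([True, False, True, False, False, False, False, False])
r1 = apply_masked(lambda x: x + 10, v, m)
r2 = apply_masked(lambda x: x + 10, v, to_location_list(m, threshold=0.5))
```

    Both representations yield identical results; the location list simply wins on storage and scan cost when few cells are active.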

  16. An Analysis of Performance Enhancement Techniques for Overset Grid Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, J. J.; Biswas, R.; Potsdam, M.; Strawn, R. C.; Biegel, Bryan (Technical Monitor)

    2002-01-01

    The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement techniques on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the role of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.

  17. Exploring the protein folding free energy landscape: coupling replica exchange method with P3ME/RESPA algorithm.

    PubMed

    Zhou, Ruhong

    2004-05-01

    A highly parallel replica exchange method (REM) that couples with a newly developed molecular dynamics algorithm particle-particle particle-mesh Ewald (P3ME)/RESPA has been proposed for efficient sampling of protein folding free energy landscape. The algorithm is then applied to two separate protein systems, beta-hairpin and a designed protein Trp-cage. The all-atom OPLSAA force field with an explicit solvent model is used for both protein folding simulations. Up to 64 replicas of solvated protein systems are simulated in parallel over a wide range of temperatures. The combined trajectories in temperature and configurational space allow a replica to overcome free energy barriers present at low temperatures. These large scale simulations reveal detailed results on folding mechanisms, intermediate state structures, thermodynamic properties and the temperature dependences for both protein systems.
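    The temperature-swap step at the heart of the replica exchange method can be sketched with the standard Metropolis acceptance criterion. This is a generic parallel-tempering sketch (names and unit conventions are ours), not the P3ME/RESPA code:

```python
import math
import random

# Minimal sketch of the replica-exchange (parallel tempering) swap
# test: neighbouring temperature replicas exchange configurations with
# the standard Metropolis criterion (illustrative only).

def swap_probability(E_i, E_j, T_i, T_j, kB=1.0):
    """Acceptance probability for swapping replicas i and j:
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    delta = (1.0 / (kB * T_i) - 1.0 / (kB * T_j)) * (E_j - E_i)
    return min(1.0, math.exp(-delta))

def attempt_swap(E_i, E_j, T_i, T_j, rng=random.random):
    """Return True if the swap is accepted."""
    return rng() < swap_probability(E_i, E_j, T_i, T_j)
```

    Swaps that move low-energy configurations to low temperatures are always accepted; unfavourable swaps are accepted with exponentially damped probability, which is what lets replicas cross free energy barriers.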

  18. In Situ Observation of Intermittent Dissipation at Kinetic Scales in the Earth's Magnetosheath

    NASA Astrophysics Data System (ADS)

    Chasapis, Alexandros; Matthaeus, W. H.; Parashar, T. N.; Wan, M.; Haggerty, C. C.; Pollock, C. J.; Giles, B. L.; Paterson, W. R.; Dorelli, J.; Gershman, D. J.; Torbert, R. B.; Russell, C. T.; Lindqvist, P.-A.; Khotyaintsev, Y.; Moore, T. E.; Ergun, R. E.; Burch, J. L.

    2018-03-01

    We present a study of signatures of energy dissipation at kinetic scales in plasma turbulence based on observations by the Magnetospheric Multiscale mission (MMS) in the Earth’s magnetosheath. Using several intervals, and taking advantage of the high-resolution instrumentation on board MMS, we compute and discuss several statistical measures of coherent structures and heating associated with electrons, at previously unattainable scales in space and time. We use the multi-spacecraft Partial Variance of Increments (PVI) technique to study the intermittent structure of the magnetic field. Furthermore, we examine a measure of dissipation and its behavior with respect to the PVI as well as the current density. Additionally, we analyze the evolution of the anisotropic electron temperature and non-Maxwellian features of the particle distribution function. From these diagnostics emerges strong statistical evidence that electrons are preferentially heated in subproton-scale regions of strong electric current density, and this heating is preferentially in the parallel direction relative to the local magnetic field. Accordingly, the conversion of magnetic energy into electron kinetic energy occurs more strongly in regions of stronger current density, a finding consistent with several kinetic plasma simulation studies and hinted at by prior studies using lower resolution Cluster observations.

  19. Teach for America, Relay Graduate School, and the Charter School Networks: The Making of a Parallel Education Structure

    ERIC Educational Resources Information Center

    Mungal, Angus Shiva

    2016-01-01

    In New York City, a partnership between Teach For America (TFA), the New York City Department of Education (NYCDOE), the Relay Graduate School of Education (Relay), and three charter school networks produced a "parallel education structure" within the public school system. Driving the partnership and the parallel education structure are…

  20. Liter-scale production of uniform gas bubbles via parallelization of flow-focusing generators.

    PubMed

    Jeong, Heon-Ho; Yadavali, Sagar; Issadore, David; Lee, Daeyeon

    2017-07-25

    Microscale gas bubbles have demonstrated enormous utility as versatile templates for the synthesis of functional materials in medicine, ultra-lightweight materials and acoustic metamaterials. In many of these applications, high uniformity of the size of the gas bubbles is critical to achieve the desired properties and functionality. While microfluidics have been used with success to create gas bubbles that have a uniformity not achievable using conventional methods, the inherently low volumetric flow rate of microfluidics has limited its use in most applications. Parallelization of liquid droplet generators, in which many droplet generators are incorporated onto a single chip, has shown great promise for the large scale production of monodisperse liquid emulsion droplets. However, the scale-up of monodisperse gas bubbles using such an approach has remained a challenge because of possible coupling between parallel bubble generators and feedback effects from the downstream channels. In this report, we systematically investigate the effect of factors such as viscosity of the continuous phase, capillary number, and gas pressure as well as the channel uniformity on the size distribution of gas bubbles in a parallelized microfluidic device. We show that, by optimizing the flow conditions, a device with 400 parallel flow focusing generators on a footprint of 5 × 5 cm² can be used to generate gas bubbles with a coefficient of variation of less than 5% at a production rate of approximately 1 L h⁻¹. Our results suggest that the optimization of flow conditions using a device with a small number (e.g., 8) of parallel FFGs can facilitate large-scale bubble production.
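    The sub-5% figure above is a coefficient of variation (CV): the standard deviation of the bubble diameters divided by their mean. A quick check, with an illustrative sample of our own (not data from the paper):

```python
import statistics

# The uniformity metric quoted above is the coefficient of variation:
# CV = standard deviation / mean of the bubble diameters.

def coefficient_of_variation(diameters):
    return statistics.stdev(diameters) / statistics.mean(diameters)

# Illustrative sample of bubble diameters in micrometres (made up,
# not measured data): these are well inside the CV < 5% criterion.
sample = [98.0, 99.0, 100.0, 101.0, 102.0]
cv = coefficient_of_variation(sample)
```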

  1. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    PubMed

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.

  2. Improving parallel I/O autotuning with performance modeling

    DOE PAGES

    Behzad, Babak; Byna, Surendra; Wild, Stefan M.; ...

    2014-01-01

    Various layers of the parallel I/O subsystem offer tunable parameters for improving I/O performance on large-scale computers. However, searching through a large parameter space is challenging. We are working towards an autotuning framework for determining the parallel I/O parameters that can achieve good I/O performance for different data write patterns. In this paper, we characterize parallel I/O and discuss the development of predictive models for use in effectively reducing the parameter space. Furthermore, applying our technique on tuning an I/O kernel derived from a large-scale simulation code shows that the search time can be reduced from 12 hours to 2 hours, while achieving 54X I/O performance speedup.

  3. Field of genes: using Apache Kafka as a bioinformatic data repository

    PubMed Central

    Lynch, Richard; Walsh, Paul

    2018-01-01

    Background: Bioinformatic research is increasingly dependent on large-scale datasets, accessed either from private or public repositories. An example of a public repository is the National Center for Biotechnology Information's (NCBI's) Reference Sequence database (RefSeq). These repositories must decide in what form to make their data available. Unstructured data can be put to almost any use but are limited in how access to them can be scaled. Highly structured data offer improved performance for specific algorithms but limit the wider usefulness of the data. We present an alternative: lightly structured data stored in Apache Kafka in a way that is amenable to parallel access and streamed processing, including subsequent transformations into more highly structured representations. We contend that this approach could provide a flexible and powerful nexus of bioinformatic data, bridging the gap between low structure on one hand, and high performance and scale on the other. To demonstrate this, we present a proof-of-concept version of NCBI's RefSeq database using this technology. We measure the performance and scalability characteristics of this alternative with respect to flat files. Results: The proof of concept scales almost linearly as more compute nodes are added, outperforming the standard approach using files. Conclusions: Apache Kafka merits consideration as a fast and more scalable but general-purpose way to store and retrieve bioinformatic data, for public, centralized reference datasets such as RefSeq and for private clinical and experimental data. PMID:29635394
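    The parallel-access property described above comes from Kafka's partitioned log: keyed records are hashed across partitions, and one consumer per partition reads its slice independently. The toy below simulates that idea with plain in-memory lists and a thread pool; it is not the Apache Kafka client API, and the hash function and record shapes are our own stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy in-memory stand-in for a partitioned log: keyed records are
# hashed across partitions (as Kafka does for keyed messages), and
# one worker per partition processes its slice in parallel.
# Illustrative only; not the Apache Kafka API.

N_PARTITIONS = 4

def partition_for(key):
    """Stable, deterministic partition assignment for a record key."""
    return sum(key.encode()) % N_PARTITIONS

def publish(records):
    """Append (key, value) records to their partitions."""
    log = [[] for _ in range(N_PARTITIONS)]
    for key, value in records:
        log[partition_for(key)].append((key, value))
    return log

def consume(partition):
    """Per-partition worker: here it just totals sequence lengths."""
    return sum(len(value) for _, value in partition)

# Made-up records standing in for lightly structured sequence data
records = [(f"seq{i}", "ACGT" * i) for i in range(1, 9)]
log = publish(records)
with ThreadPoolExecutor(max_workers=N_PARTITIONS) as pool:
    totals = list(pool.map(consume, log))
```

    Because partitions are independent, adding consumers (up to the partition count) scales throughput roughly linearly, which is the behaviour the proof of concept measures.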

  4. Orogen-transverse tectonic window in the Eastern Himalayan fold belt: A superposed buckling model

    NASA Astrophysics Data System (ADS)

    Bose, Santanu; Mandal, Nibir; Acharyya, S. K.; Ghosh, Subhajit; Saha, Puspendu

    2014-09-01

    The Eastern Lesser Himalayan fold-thrust belt is punctuated by a row of orogen-transverse domal tectonic windows. To evaluate their origin, a variety of thrust-stack models have been proposed, assuming that the crustal shortening occurred dominantly by brittle deformations. However, the Rangit Window (RW) in the Darjeeling-Sikkim Himalaya (DSH) shows unequivocal structural imprints of ductile deformations of multiple episodes. Based on new structural maps, coupled with outcrop-scale field observations, we recognize at least four major episodes of folding in the litho-tectonic units of DSH. The last episode has produced regionally orogen-transverse upright folds (F4), the interference of which with the third-generation (F3) orogen-parallel folds has shaped the large-scale structural patterns in DSH. We propose a new genetic model for the RW, invoking the mechanics of superposed buckling in the mechanically stratified litho-tectonic systems. We substantiate this superposed buckling model with results obtained from analogue experiments. The model explains contrasting F3-F4 interferences in the Lesser Himalayan Sequence (LHS). The lower-order (terrain-scale) folds have undergone superposed buckling in Mode 1, producing large-scale domes and basins, whereas the RW occurs as a relatively higher-order dome nested in the first-order Tista Dome. The Gondwana and the Proterozoic rocks within the RW underwent superposed buckling in Modes 3 and 4, leading to Type 2 fold interferences, as evident from their structural patterns.

  5. Portable parallel stochastic optimization for the design of aeropropulsion components

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Rhodes, G. S.

    1994-01-01

    This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as a review of portable, parallel programming environments. The second effort was to implement the MSO methodology for a problem using the portable parallel programming environment Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be well-applied towards large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel.
Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed-Civil Transport, and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.

  6. OpenSWPC: an open-source integrated parallel simulation code for modeling seismic wave propagation in 3D heterogeneous viscoelastic media

    NASA Astrophysics Data System (ADS)

    Maeda, Takuto; Takemura, Shunsuke; Furumura, Takashi

    2017-07-01

    We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method at local-to-regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer for the absorbing boundary condition. A hybrid-style programming model using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and great portability, allowing excellent performance from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC), are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.
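    The finite difference method underlying such codes can be illustrated with the simplest case: an explicit scheme for the 1D scalar wave equation. This bare-bones sketch (our own, not the OpenSWPC algorithm, which works with 3D viscoelastic staggered grids) shows the core update stencil and the CFL stability constraint:

```python
import math

# Minimal explicit finite-difference scheme for the 1D wave equation
# u_tt = c^2 u_xx with fixed (zero) ends. A bare-bones sketch of the
# method family OpenSWPC generalises, not the OpenSWPC code itself.

def step_wave(u_prev, u_curr, c, dx, dt):
    """Advance one time step with the standard three-point stencil.
    Stability requires the CFL condition c*dt/dx <= 1."""
    r2 = (c * dt / dx) ** 2
    u_next = u_curr[:]
    for i in range(1, len(u_curr) - 1):
        u_next[i] = (2 * u_curr[i] - u_prev[i]
                     + r2 * (u_curr[i + 1] - 2 * u_curr[i] + u_curr[i - 1]))
    return u_next

# Standing-mode initial condition on x in [0, 1], CFL number 0.5
n, dx, dt, c = 101, 0.01, 0.005, 1.0
u0 = [math.sin(math.pi * i * dx) for i in range(n)]
u1 = u0[:]   # zero initial velocity (first-order start)
for _ in range(20):
    u0, u1 = u1, step_wave(u0, u1, c, dx, dt)
```

    Real codes replace this stencil with staggered-grid velocity-stress updates, add attenuation terms, and absorb outgoing waves with a perfectly matched layer instead of fixed ends.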

  7. Enhancing Scalability and Efficiency of the TOUGH2_MP for Linux Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Keni; Wu, Yu-Shu

    2006-04-17

    TOUGH2_MP, the parallel version of the TOUGH2 code, has been enhanced by implementing more efficient communication schemes. This enhancement is achieved by reducing the number of small messages and the volume of large ones. The message exchange speed is further improved by using non-blocking communication for both linear and nonlinear iterations. In addition, we have modified the AZTEC parallel linear-equation solver to use non-blocking communication. Through improved code structuring and bug fixes, the new version of the code is now more stable, while demonstrating similar or even better nonlinear iteration convergence than the original TOUGH2 code. As a result, the new version of TOUGH2_MP is significantly more efficient. In this paper, the scalability and efficiency of the parallel code are demonstrated by solving two large-scale problems. The testing results indicate that the speedup of the code may depend on both problem size and complexity. In general, the code has excellent scalability in memory requirement as well as computing time.

  8. Droplet impact on regular micro-grooved surfaces

    NASA Astrophysics Data System (ADS)

    Hu, Hai-Bao; Huang, Su-He; Chen, Li-Bin

    2013-08-01

    We have investigated experimentally the process of a droplet impacting a regular micro-grooved surface. The target surfaces, fabricated on a polished copper plate using Quasi-LIGA molding technology, are patterned with micro-scale spokes radiating from the center, concentric circles, and parallel lines. The dynamic behavior of water droplets impacting these structured surfaces is examined using a high-speed camera, including the drop impact processes, the maximum spreading diameters, and the lengths and numbers of fingers at different Weber numbers. The experimental results show that the spreading process is arrested on all target surfaces at low impact velocity. At higher impact velocity, spreading proceeds in the direction parallel to the micro-grooves but is arrested in the direction perpendicular to them. In addition, the lengths of the fingers increase markedly, and they may even be ejected as tiny droplets along the groove direction; at the same time, the drop recoil velocity is reduced by micro-grooves parallel to the spreading direction, but not by micro-grooves perpendicular to it.

  9. PROTO-PLASM: parallel language for adaptive and scalable modelling of biosystems.

    PubMed

    Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto

    2008-09-13

    This paper discusses the design goals and the first developments of PROTO-PLASM, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the PROTO-PLASM platform is still in its infancy. Its computational framework--language, model library, integrated development environment and parallel engine--intends to provide patient-specific computational modelling and simulation of organs and biosystems, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. PROTO-PLASM may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a PROTO-PLASM program. Here we exemplify the basic functionalities of PROTO-PLASM by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions.

  10. Proto-Plasm: parallel language for adaptive and scalable modelling of biosystems

    PubMed Central

    Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto

    2008-01-01

    This paper discusses the design goals and the first developments of Proto-Plasm, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the Proto-Plasm platform is still in its infancy. Its computational framework—language, model library, integrated development environment and parallel engine—intends to provide patient-specific computational modelling and simulation of organs and biosystems, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. Proto-Plasm may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a Proto-Plasm program. Here we exemplify the basic functionalities of Proto-Plasm by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions. PMID:18559320

  11. A parallel reaction-transport model applied to cement hydration and microstructure development

    NASA Astrophysics Data System (ADS)

    Bullard, Jeffrey W.; Enjolras, Edith; George, William L.; Satterfield, Steven G.; Terrill, Judith E.

    2010-03-01

    A recently described stochastic reaction-transport model on three-dimensional lattices is parallelized and is used to simulate the time-dependent structural and chemical evolution in multicomponent reactive systems. The model, called HydratiCA, uses probabilistic rules to simulate the kinetics of diffusion, homogeneous reactions and heterogeneous phenomena such as solid nucleation, growth and dissolution in complex three-dimensional systems. The algorithms require information only from each lattice site and its immediate neighbors, and this localization enables the parallelized model to exhibit near-linear scaling up to several hundred processors. Although applicable to a wide range of material systems, including sedimentary rock beds, reacting colloids and biochemical systems, validation is performed here on two minerals that are commonly found in Portland cement paste, calcium hydroxide and ettringite, by comparing their simulated dissolution or precipitation rates far from equilibrium to standard rate equations, and also by comparing simulated equilibrium states to thermodynamic calculations, as a function of temperature and pH. Finally, we demonstrate how HydratiCA can be used to investigate microstructure characteristics, such as spatial correlations between different condensed phases, in more complex microstructures.
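The locality property described above can be illustrated with a minimal, self-contained sketch (all names are hypothetical; this is not the HydratiCA code): particles on a lattice hop to a random nearest neighbor with some probability, so each site's update needs only its immediate neighborhood, and total mass is conserved, which is what makes domain decomposition across processors straightforward.

```python
import random

def diffusion_step(lattice, p_hop, rng):
    """One stochastic diffusion step on a periodic 1D lattice.

    Each particle independently hops to a random nearest neighbor with
    probability p_hop. The update at each site uses only the site and its
    immediate neighbors, the locality that enables near-linear parallel
    scaling by domain decomposition."""
    n = len(lattice)
    new = [0] * n
    for i, count in enumerate(lattice):
        for _ in range(count):
            if rng.random() < p_hop:
                j = (i + rng.choice((-1, 1))) % n  # hop left or right
            else:
                j = i                              # stay put
            new[j] += 1
    return new

rng = random.Random(42)
lattice = [0] * 20
lattice[10] = 1000                 # all particles start at one site
for _ in range(50):
    lattice = diffusion_step(lattice, 0.5, rng)
print(sum(lattice))                # 1000 -- particle number is conserved
```

The same rule structure extends to reaction events (create or destroy particles at a site with a rate-dependent probability), which is how kinetics enter this class of model.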

  12. The Parallel System for Integrating Impact Models and Sectors (pSIMS)

    NASA Technical Reports Server (NTRS)

    Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian

    2014-01-01

    We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.
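Item (e), aggregating gridded datatypes to arbitrary spatial scales, reduces to a keyed group-average over a cell-to-region mapping. A hedged sketch with invented names (not the pSIMS API):

```python
def aggregate(grid_values, cell_to_region):
    """Aggregate per-cell values to arbitrary regions (e.g. administrative
    or environmental demarcations) by averaging the grid cells assigned
    to each region."""
    totals, counts = {}, {}
    for cell, value in grid_values.items():
        region = cell_to_region[cell]
        totals[region] = totals.get(region, 0.0) + value
        counts[region] = counts.get(region, 0) + 1
    return {r: totals[r] / counts[r] for r in totals}

# Four grid cells mapped onto two regions.
values = {(0, 0): 2.0, (0, 1): 4.0, (1, 0): 10.0, (1, 1): 20.0}
regions = {(0, 0): "A", (0, 1): "A", (1, 0): "B", (1, 1): "B"}
print(aggregate(values, regions))   # {'A': 3.0, 'B': 15.0}
```

Area-weighted or sum aggregations follow the same pattern with a different accumulator.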

  13. A mixed parallel strategy for the solution of coupled multi-scale problems at finite strains

    NASA Astrophysics Data System (ADS)

    Lopes, I. A. Rodrigues; Pires, F. M. Andrade; Reis, F. J. P.

    2018-02-01

    A mixed parallel strategy for the solution of homogenization-based multi-scale constitutive problems undergoing finite strains is proposed. The approach aims to reduce the computational time and memory requirements of non-linear coupled simulations that use finite element discretization at both scales (FE^2). In the first level of the algorithm, a non-conforming domain decomposition technique, based on the FETI method combined with a mortar discretization at the interface of macroscopic subdomains, is employed. A master-slave scheme, which distributes tasks by macroscopic element and adopts dynamic scheduling, is then used for each macroscopic subdomain composing the second level of the algorithm. This strategy allows the parallelization of FE^2 simulations in computers with either shared memory or distributed memory architectures. The proposed strategy preserves the quadratic rates of asymptotic convergence that characterize the Newton-Raphson scheme. Several examples are presented to demonstrate the robustness and efficiency of the proposed parallel strategy.
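The second level of the algorithm, a master-slave scheme with dynamic scheduling, can be sketched with a shared work queue: workers pull macroscopic-element tasks as they finish, so expensive (strongly non-linear) elements do not stall a static partition. This is an illustrative toy with invented names, not the paper's MPI implementation:

```python
import queue
import threading

def run_master_slave(tasks, n_workers, solve):
    """Distribute per-element tasks dynamically: each worker repeatedly
    takes the next pending task from a shared queue until it is empty."""
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                t = work.get_nowait()
            except queue.Empty:
                return                 # no tasks left; worker exits
            r = solve(t)               # e.g. solve one microscopic RVE problem
            with lock:
                results[t] = r

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

# Hypothetical stand-in for the per-element solve.
out = run_master_slave(range(8), 3, lambda e: e * e)
print(out[7])   # 49
```

Dynamic scheduling trades a little queue contention for load balance, which pays off when per-element cost varies strongly, as it does in FE^2 problems.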

  14. Design Sketches For Optical Crossbar Switches Intended For Large-Scale Parallel Processing Applications

    NASA Astrophysics Data System (ADS)

    Hartmann, Alfred; Redfield, Steve

    1989-04-01

    This paper discusses the design of large-scale (1000 x 1000) optical crossbar switching networks for use in parallel processing supercomputers. Alternative design sketches for an optical crossbar switching network are presented using free-space optical transmission with either a beam spreading/masking model or a beam steering model for internodal communications. The performance of alternative multiple access channel communication protocols (unslotted and slotted ALOHA and carrier sense multiple access, CSMA) is compared with the performance of the classic arbitrated-bus crossbar of conventional electronic parallel computing. These comparisons indicate an almost inverse relationship between ease of implementation and speed of operation. Practical issues of optical system design are addressed, and an optically addressed, composite spatial light modulator design is presented that can be fabricated at arbitrarily large scale. The wide range of switch architecture, communications protocol, optical systems design, device fabrication, and system performance problems presented by these design sketches poses a serious challenge to practical exploitation of highly parallel optical interconnects in advanced computer designs.

  15. Imprint of non-linear effects on HI intensity mapping on large scales

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Umeh, Obinna, E-mail: umeobinna@gmail.com

    Intensity mapping of the HI brightness temperature provides a unique way of tracing the large-scale structure of the Universe up to the largest possible scales. This is achieved by using low angular resolution radio telescopes to detect emission lines from cosmic neutral hydrogen in the post-reionization Universe. We use general relativistic perturbation theory techniques to derive, for the first time, the full expression for the HI brightness temperature up to third order in perturbation theory without making any plane-parallel approximation. We use this result and the renormalization prescription for biased tracers to study the impact of nonlinear effects on the power spectrum of the HI brightness temperature in both real and redshift space. We show how mode coupling at nonlinear order, due to nonlinear bias parameters and redshift space distortion terms, modulates the power spectrum on large scales. This large-scale modulation may be understood as arising from the effective bias parameter and effective shot noise.

  16. Imprint of non-linear effects on HI intensity mapping on large scales

    NASA Astrophysics Data System (ADS)

    Umeh, Obinna

    2017-06-01

    Intensity mapping of the HI brightness temperature provides a unique way of tracing the large-scale structure of the Universe up to the largest possible scales. This is achieved by using low angular resolution radio telescopes to detect emission lines from cosmic neutral hydrogen in the post-reionization Universe. We use general relativistic perturbation theory techniques to derive, for the first time, the full expression for the HI brightness temperature up to third order in perturbation theory without making any plane-parallel approximation. We use this result and the renormalization prescription for biased tracers to study the impact of nonlinear effects on the power spectrum of the HI brightness temperature in both real and redshift space. We show how mode coupling at nonlinear order, due to nonlinear bias parameters and redshift space distortion terms, modulates the power spectrum on large scales. This large-scale modulation may be understood as arising from the effective bias parameter and effective shot noise.

  17. The development and validation of the Physical Appearance Comparison Scale-Revised (PACS-R).

    PubMed

    Schaefer, Lauren M; Thompson, J Kevin

    2014-04-01

    The Physical Appearance Comparison Scale (PACS; Thompson, Heinberg, & Tantleff, 1991) was revised to assess appearance comparisons relevant to women and men in a wide variety of contexts. The revised scale (Physical Appearance Comparison Scale-Revised, PACS-R) was administered to 1176 college females. In Study 1, exploratory factor analysis and parallel analysis using one half of the sample suggested a single factor structure for the PACS-R. Study 2 utilized the remaining half of the sample to conduct confirmatory factor analysis, item analysis, and to examine the convergent validity of the scale. These analyses resulted in an 11-item measure that demonstrated excellent internal consistency and convergent validity with measures of body satisfaction, eating pathology, sociocultural influences on appearance, and self-esteem. Regression analyses demonstrated the utility of the PACS-R in predicting body satisfaction and eating pathology. Overall, results indicate that the PACS-R is a reliable and valid tool for assessing appearance comparison tendencies in women. Copyright © 2014. Published by Elsevier Ltd.

  18. Fault-zone structure and weakening processes in basin-scale reverse faults: The Moonlight Fault Zone, South Island, New Zealand

    NASA Astrophysics Data System (ADS)

    Alder, S.; Smith, S. A. F.; Scott, J. M.

    2016-10-01

    The >200 km long Moonlight Fault Zone (MFZ) in southern New Zealand was an Oligocene basin-bounding normal fault zone that reactivated in the Miocene as a high-angle reverse fault (present dip angle 65°-75°). Regional exhumation in the last c. 5 Ma has resulted in deep exposures of the MFZ that present an opportunity to study the structure and deformation processes that were active in a basin-scale reverse fault at basement depths. Syn-rift sediments are preserved only as thin fault-bound slivers. The hanging wall and footwall of the MFZ are mainly greenschist facies quartzofeldspathic schists that have a steeply-dipping (55°-75°) foliation subparallel to the main fault trace. In more fissile lithologies (e.g. greyschists), hanging-wall deformation occurred by the development of foliation-parallel breccia layers up to a few centimetres thick. Greyschists in the footwall deformed mainly by folding and formation of tabular, foliation-parallel breccias up to 1 m wide. Where the hanging-wall contains more competent lithologies (e.g. greenschist facies metabasite) it is laced with networks of pseudotachylyte that formed parallel to the host rock foliation in a damage zone extending up to 500 m from the main fault trace. The fault core contains an up to 20 m thick sequence of breccias, cataclasites and foliated cataclasites preserving evidence for the progressive development of interconnected networks of (partly authigenic) chlorite and muscovite. Deformation in the fault core occurred by cataclasis of quartz and albite, frictional sliding of chlorite and muscovite grains, and dissolution-precipitation. Combined with published friction and permeability data, our observations suggest that: 1) host rock lithology and anisotropy were the primary controls on the structure of the MFZ at basement depths and 2) high-angle reverse slip was facilitated by the low frictional strength of fault core materials. 
Restriction of pseudotachylyte networks to the hanging-wall of the MFZ further suggests that the wide, phyllosilicate-rich fault core acted as an efficient hydrological barrier, resulting in a relatively hydrous footwall and fault core but a relatively dry hanging-wall.

  19. Performance of parallel computation using CUDA for solving the one-dimensional elasticity equations

    NASA Astrophysics Data System (ADS)

    Darmawan, J. B. B.; Mungkasi, S.

    2017-01-01

    In this paper, we investigate the performance of parallel computation in solving the one-dimensional elasticity equations. Elasticity equations are widely used in engineering science, and solving them quickly and efficiently is desirable. We therefore propose the use of parallel computation, implemented with NVIDIA's CUDA. Our research results show that parallel computation using CUDA has a great advantage and is powerful when the computation is of large scale.
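For context, the 1D elasticity equations in velocity-stress form can be advanced with an explicit finite-difference update in which every interior grid point is computed independently of the others, exactly the data parallelism that maps to one CUDA thread per point. A serial sketch under assumed centered differencing (not the authors' kernel):

```python
def elasticity_step(v, s, rho, E, dt, dx):
    """One explicit step of the 1D velocity-stress elasticity system
        dv/dt = (1/rho) * ds/dx,    ds/dt = E * dv/dx
    using centered differences. Each interior point updates independently,
    so a GPU can assign one thread per grid point."""
    n = len(v)
    v_new = list(v)
    for i in range(1, n - 1):
        v_new[i] = v[i] + dt / rho * (s[i + 1] - s[i - 1]) / (2 * dx)
    s_new = list(s)
    for i in range(1, n - 1):
        s_new[i] = s[i] + dt * E * (v_new[i + 1] - v_new[i - 1]) / (2 * dx)
    return v_new, s_new

# Sanity check: a uniform state is an equilibrium, so nothing changes.
v, s = [0.0] * 16, [1.0] * 16
v2, s2 = elasticity_step(v, s, rho=1.0, E=2.0, dt=0.01, dx=0.1)
print(v2 == v and s2 == s)   # True
```

On a GPU, the two interior loops become kernels launched over the grid; boundary handling and a stability-limited dt are the remaining pieces of a full solver.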

  20. Orientations of Pre-existing Structures along the Scarp of the Bilila-Mtakataka Fault in the Central Malawi Rift.

    NASA Astrophysics Data System (ADS)

    Elifritz, E. A.; Johnson, S.; Beresh, S. C. M.; Mendez, K.; Mynatt, W. G.; Mayle, M.; Laó-Dávila, D. A.; Atekwana, E. A.; Chindandali, P. R. N.; Chisenga, C.; Gondwe, S.; Mkumbwa, M.; Kalindekafe, L.; Kalaguluka, D.; Salima, J.

    2017-12-01

    The NW-SE Bilila-Mtakataka Fault is suggested to be 100 km in length and is located in the Malawi Rift, a portion of the magma-poor Western Branch of the East African Rift System. The fault is exposed south of Lake Malawi and occurs close to the epicenter of the 1989 magnitude 6.2 Salima earthquake. Moreover, it traverses rocks with inherited Precambrian fabrics that may control the modern rifting process. The effect of the orientation of the pre-existing fabric on the formation of this potentially seismogenic fault has not been well studied. In this project, we measured the older foliations, dikes, and joints, in addition to younger faults and striations, to understand how the active faulting of the Bilila-Mtakataka Fault is affected by the older fabric. The fault is divided into 5 segments and 4 linkage zones. All four linkage zones were studied in detail, and a Brunton compass was used to determine the orientations of structures. The linkage zone between segments 1 and 2 occurs between a regional WNW-ESE joint and the border fault, which is identified by a zig-zag pattern in SRTM data. Precambrian gneiss is cut by oblique steeply-dipping faults in this area. Striations and layer offsets suggest both right-lateral and normal components. This segment strikes NE-SW, in contrast with the NW-SE average strike of the entire fault. The foliations, faults, dikes, and joints measured in this area strike NE-SW, running parallel to the segment. The three southern linkage zones all strike NW-SE, and the linkage zone between segments 3 and 4 has a steep dip angle. Dip angles of structures vary from segment to segment, with a wide range of values. Nonetheless, all four linkage zones show structures striking parallel to their segment direction. The results show that pre-existing meso-scale and regional structures and faults strike parallel to the fault scarp. The parallelism of the structures suggests that they serve as planes of weakness, controlling the localization of extension expressed as the border fault. Thus, further studies of the Precambrian foliation in the subsurface are necessary to characterize the fault where it is unexposed at depth.

  1. Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

    NASA Technical Reports Server (NTRS)

    Hsieh, Shang-Hsien

    1993-01-01

    The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.

  2. Determination of accurate 1H positions of an alanine tripeptide with anti-parallel and parallel β-sheet structures by high resolution 1H solid state NMR and GIPAW chemical shift calculation.

    PubMed

    Yazawa, Koji; Suzuki, Furitsu; Nishiyama, Yusuke; Ohata, Takuya; Aoki, Akihiro; Nishimura, Katsuyuki; Kaji, Hironori; Shimizu, Tadashi; Asakura, Tetsuo

    2012-11-25

    The accurate 1H positions of an alanine tripeptide, A3, with anti-parallel and parallel β-sheet structures could be determined by highly resolved 1H DQMAS solid-state NMR spectra and 1H chemical shift calculations with the gauge-including projector augmented wave (GIPAW) method.

  3. Scaling device for photographic images

    NASA Technical Reports Server (NTRS)

    Rivera, Jorge E. (Inventor); Youngquist, Robert C. (Inventor); Cox, Robert B. (Inventor); Haskell, William D. (Inventor); Stevenson, Charles G. (Inventor)

    2005-01-01

    A scaling device projects a known optical pattern into the field of view of a camera, which can be employed as a reference scale in a resulting photograph of a remote object, for example. The device comprises an optical beam projector that projects two or more spaced, parallel optical beams onto a surface of a remotely located object to be photographed. The resulting beam spots or lines on the object are spaced from one another by a known, predetermined distance. As a result, the size of other objects or features in the photograph can be determined through comparison of their size to the known distance between the beam spots. Preferably, the device is a small, battery-powered device that can be attached to a camera and employs one or more laser light sources and associated optics to generate the parallel light beams. In a first embodiment of the invention, a single laser light source is employed, but multiple parallel beams are generated thereby through use of beam splitting optics. In another embodiment, multiple individual laser light sources are employed that are mounted in the device parallel to one another to generate the multiple parallel beams.
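The measurement principle is simple proportionality: the two projected laser spots are a known real-world distance apart, so their separation in the image fixes the metres-per-pixel scale at that surface. A small sketch (hypothetical names; not the patented device's firmware):

```python
def measure(feature_pixels, spot_pixels, spot_distance_m):
    """Infer a real-world size from a photograph: the projected laser spots
    are spot_distance_m apart in reality and spot_pixels apart in the image,
    which calibrates the image scale at the object's surface."""
    metres_per_pixel = spot_distance_m / spot_pixels
    return feature_pixels * metres_per_pixel

# Spots 0.10 m apart appear 200 px apart; a feature spans 340 px.
print(measure(340, 200, 0.10))   # about 0.17 m
```

The calibration holds only at the distance where the spots land, which is why the parallel-beam geometry matters: parallel beams keep the spot spacing constant regardless of range.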

  4. Visual analysis of inter-process communication for large-scale parallel computing.

    PubMed

    Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

    2009-01-01

    In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes, which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt chart with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.

  5. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    NASA Astrophysics Data System (ADS)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we propose a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPUs). We then design an imaging-point parallel strategy to achieve optimal parallel computing performance, and adopt an asynchronous double-buffering scheme for multiple streams to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies for computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
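The double-buffering idea can be sketched host-side: while one buffer is being computed on, the next chunk is loaded into the other buffer on a background thread, so transfer and compute overlap instead of serializing. A toy sketch with invented names (the paper's CUDA-streams version is not reproduced):

```python
import threading

def pipeline(chunks, load, compute):
    """Double buffering: overlap 'loading' chunk i+1 with 'computing'
    chunk i by alternating between two buffers."""
    buffers = [None, None]
    results = []

    def prefetch(i):
        # Load the next chunk into the buffer not currently being computed.
        buffers[(i + 1) % 2] = load(chunks[i + 1])

    buffers[0] = load(chunks[0])
    for i in range(len(chunks)):
        t = None
        if i + 1 < len(chunks):
            t = threading.Thread(target=prefetch, args=(i,))
            t.start()                                # transfer in the background...
        results.append(compute(buffers[i % 2]))      # ...while we compute
        if t:
            t.join()                                 # next buffer is ready
    return results

out = pipeline([1, 2, 3, 4], load=lambda c: c * 10, compute=lambda b: b + 1)
print(out)   # [11, 21, 31, 41]
```

With CUDA this same pattern is usually expressed with two streams and pinned host memory, so the memcpy for chunk i+1 runs concurrently with the kernel for chunk i.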

  6. Exploring structural colour in uni- and multi-coloured butterfly wings and Ag+ uptake by scales

    NASA Astrophysics Data System (ADS)

    Aideo, Swati N.; Haloi, Rajib; Mohanta, Dambarudhar

    2017-09-01

    We discuss the origin of the structural colour of different butterfly wings in the light of the typical built-in microstructural arrangement of scales, which are comprised of a chitin-melanin layer and air-gaps. Three specimens of butterfly wings, namely Papilio Liomedon (black), Catopsilia Pyranthe (light green) and Vanessa Cardui (multi-coloured), were chosen, and diffuse reflectance characteristics were acquired for normal incidence of p-polarized light. Moreover, the time-dependent uptake of Ag+ into the scales has led to swelling and spreading of the chitinous ridges and ribs, with the revelation of micro-beads in the Catopsilia Pyranthe specimen. The reduction of the number of air-gaps between any two parallel ridges is attributed to the merging of adjacent gaps possessing a common boundary. The availability of Ag at the centre of a chosen ridge, for every wing type, follows an exponentially growing trend, ~e^(0.36t). Precise inclusion of nanoscale metals into natural photonic systems would provide new insight, while applying principles of photonics and plasmonics simultaneously.

  7. Combined aerodynamic and structural dynamic problem emulating routines (CASPER): Theory and implementation

    NASA Technical Reports Server (NTRS)

    Jones, William H.

    1985-01-01

    The Combined Aerodynamic and Structural Dynamic Problem Emulating Routines (CASPER) is a collection of data-base modification computer routines that can be used to simulate Navier-Stokes flow through realistic, time-varying internal flow fields. The Navier-Stokes equation used involves calculations in all three dimensions and retains all viscous terms. The only term neglected in the current implementation is gravitation. The solution approach is of an iterative, time-marching nature. Calculations are based on Lagrangian aerodynamic elements (aeroelements). It is assumed that the relationships between a particular aeroelement and its five nearest neighbor aeroelements are sufficient to make a valid simulation of Navier-Stokes flow on a small scale and that the collection of all small-scale simulations makes a valid simulation of a large-scale flow. In keeping with these assumptions, it must be noted that CASPER produces an imitation or simulation of Navier-Stokes flow rather than a strict numerical solution of the Navier-Stokes equation. CASPER is written to operate under the Parallel, Asynchronous Executive (PAX), which is described in a separate report.

  8. A mathematical model of the structure and evolution of small scale discrete auroral arcs

    NASA Technical Reports Server (NTRS)

    Seyler, C. E.

    1990-01-01

    A three dimensional fluid model which includes the dispersive effect of electron inertia is used to study the nonlinear macroscopic plasma dynamics of small scale discrete auroral arcs within the auroral acceleration zone and ionosphere. The motion of the Alfven wave source relative to the magnetospheric and ionospheric plasma forms an oblique Alfven wave which is reflected from the topside ionosphere by the negative density gradient. The superposition of the incident and reflected wave can be described by a steady state analytical solution of the model equations with the appropriate boundary conditions. This two dimensional discrete auroral arc equilibrium provides a simple explanation of auroral acceleration associated with the parallel electric field. Three dimensional fully nonlinear numerical simulations indicate that the equilibrium arc configuration evolves three dimensionally through collisionless tearing and reconnection of the current layer. The interaction of the perturbed flow and the transverse magnetic field produces complex transverse structure that may be the origin of the folds and curls observed to be associated with small scale discrete arcs.

  9. Kolmogorov-Kraichnan Scaling in the Inverse Energy Cascade of Two-Dimensional Plasma Turbulence

    NASA Astrophysics Data System (ADS)

    Antar, G. Y.

    2003-08-01

    Turbulence in plasmas that are magnetically confined, such as tokamaks or linear devices, is two dimensional or at least quasi two dimensional due to the strong magnetic field, which leads to extreme elongation of the fluctuations, if any, in the direction parallel to the magnetic field. These plasmas are also compressible fluid flows obeying the compressible Navier-Stokes equations. This Letter presents the first comprehensive scaling of the structure functions of the density and velocity fields up to 10th order in the PISCES linear plasma device and up to 6th order in the Mega-Ampère Spherical Tokamak (MAST). In the two devices, it is found that the scaling of the turbulent fields is in good agreement with the prediction of the Kolmogorov-Kraichnan theory for two-dimensional turbulence in the energy cascade subrange.

  10. The Forest Method as a New Parallel Tree Method with the Sectional Voronoi Tessellation

    NASA Astrophysics Data System (ADS)

    Yahagi, Hideki; Mori, Masao; Yoshii, Yuzuru

    1999-09-01

    We have developed a new parallel tree method which will be called the forest method hereafter. This new method uses the sectional Voronoi tessellation (SVT) for the domain decomposition. The SVT decomposes a whole space into polyhedra and allows their flat borders to move by assigning different weights. The forest method determines these weights based on the load balancing among processors by means of the overload diffusion (OLD). Moreover, since all the borders are flat, before receiving the data from other processors, each processor can collect enough data to calculate the gravity force with precision. Both the SVT and the OLD are coded in a highly vectorizable manner to accommodate vector parallel processors. The parallel code based on the forest method with the Message Passing Interface is run on various platforms so that a wide portability is guaranteed. Extensive calculations with 15 processors of Fujitsu VPP300/16R indicate that the code can calculate the gravity force exerted on 10⁵ particles per second for an ideal dark halo. This code is found to enable an N-body simulation with 10⁷ or more particles over a wide dynamic range and is therefore a very powerful tool for the study of galaxy formation and large-scale structure in the universe.
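
    The weighted flat-border decomposition can be illustrated with a toy power-diagram assignment (a schematic numpy sketch under assumed simplifications, not the forest-method code): each particle belongs to the processor i minimizing |x - c_i|² - w_i, so all cell borders are planes, and a diffusion-like weight update mimics the overload diffusion load balancing.

```python
import numpy as np

def svt_assign(points, centers, weights):
    """Power-diagram ownership: particle x goes to argmin_i |x - c_i|^2 - w_i.
    Every cell border is a flat plane, as for the SVT of the forest method."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2 - weights[None, :], axis=1)

rng = np.random.default_rng(1)
pts = rng.random((10_000, 3))                            # particles in a unit cube
centers = np.array([[0.25, 0.5, 0.5], [0.60, 0.5, 0.5]])  # two "processors"
w = np.zeros(2)
target = len(pts) / 2

# Toy overload diffusion: grow the cell of whichever processor is underloaded
for _ in range(50):
    counts = np.bincount(svt_assign(pts, centers, w), minlength=2)
    w += 3e-5 * (target - counts)

counts = np.bincount(svt_assign(pts, centers, w), minlength=2)
```

    Raising a processor's weight enlarges its polyhedral cell, so iterating this update shifts the flat borders until the particle counts balance.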

  11. Parallel Narrative Structure in Paul Harding's "Tinkers"

    ERIC Educational Resources Information Center

    Çirakli, Mustafa Zeki

    2014-01-01

    The present paper explores the implications of parallel narrative structure in Paul Harding's "Tinkers" (2009). Besides primarily recounting the two sets of parallel narratives, "Tinkers" also comprises seemingly unrelated fragments such as excerpts from clock repair manuals and diaries. The main stories, however, told…

  12. Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Simulations of the Molecular Crystal alphaRDX

    DTIC Science & Technology

    2013-08-01

    This work models dislocations in the energetic molecular crystal alpha-RDX using the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) molecular dynamics code. The SB potential for HMX/RDX is used to describe dispersion and electrostatic interactions; constants for the SB potential are given in table 1 of the report.

  13. Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S; Seal, Sudip K

    2011-01-01

    In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene/P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes, exceeding several hundreds of millions of individuals in the largest cases, are successfully exercised to verify model scalability.
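
    The reverse-computation idea can be sketched with a toy epidemic counter (an illustrative Python sketch, not the authors' simulator): every forward event handler has a cheap exact inverse, so a rollback replays the inverses in reverse order instead of restoring a full state snapshot.

```python
class Region:
    """Toy epidemic region for a discrete-event sketch of reverse computation."""

    def __init__(self, infected=1, susceptible=999):
        self.infected = infected
        self.susceptible = susceptible
        self.processed = []               # forward events kept for rollback

    def infect(self, n):                  # forward event handler
        self.susceptible -= n
        self.infected += n
        self.processed.append(("infect", n))

    def reverse_infect(self, n):          # exact inverse of the forward event
        self.susceptible += n
        self.infected -= n

    def rollback(self, k):
        """Undo the last k events by running their inverses in reverse order,
        instead of restoring a saved state snapshot."""
        for _ in range(k):
            kind, n = self.processed.pop()
            if kind == "infect":
                self.reverse_infect(n)

region = Region()
for n in (5, 3, 7):
    region.infect(n)
region.rollback(2)    # a straggler message arrived: optimistically undo two events
```

    In an optimistic simulator, only events whose handlers lose information (e.g. random draws) would need the small amount of incremental state saving mentioned above; the rest roll back by pure reversal.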

  14. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the solution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the solution of a large sparse system of linear equations which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, the latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
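
    The successive mono-frequency strategy can be caricatured as nested gradient descent (a schematic scalar sketch under toy assumptions; the misfit gradient here is a hypothetical stand-in for the adjoint-state gradient the code builds from MUMPS wavefield solutions):

```python
def misfit_grad(model, freq, observed):
    """Hypothetical single-frequency misfit gradient: a scalar stand-in for the
    adjoint-state gradient computed from the wavefield solutions."""
    predicted = model * freq
    return freq * (predicted - observed(freq))

def multiscale_inversion(freqs, observed, steps=200, lr=0.01):
    """Successive mono-frequency inversion: invert low frequencies first and
    let each result seed the inversion of the next, higher frequency."""
    m = 0.0
    for f in freqs:
        for _ in range(steps):
            m -= lr * misfit_grad(m, f, observed)
    return m

true_model = 2.5
observed = lambda f: true_model * f            # noise-free synthetic data
m = multiscale_inversion([1.0, 2.0, 4.0], observed)
```

    Proceeding from low to high frequency is what mitigates cycle skipping in real full-waveform inversion; the simultaneous multifrequency variant would instead sum the gradients over all frequencies in each step.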

  15. NASA Workshop on Computational Structural Mechanics 1987, part 1

    NASA Technical Reports Server (NTRS)

    Sykes, Nancy P. (Editor)

    1989-01-01

    Topics in Computational Structural Mechanics (CSM) are reviewed. CSM parallel structural methods, a transputer finite element solver, architectures for multiprocessor computers, and parallel eigenvalue extraction are among the topics discussed.

  16. Multiscale analysis of structure development in expanded starch snacks

    NASA Astrophysics Data System (ADS)

    van der Sman, R. G. M.; Broeze, J.

    2014-11-01

    In this paper we perform a multiscale analysis of the food structuring process of the expansion of starchy snack foods like keropok, which obtains a solid foam structure. In particular, we want to investigate the validity of the hypothesis of Kokini and coworkers, that expansion is optimal at the moisture content where the glass transition and the boiling line intersect. In our analysis we make use of several tools, (1) time scale analysis from the field of physical transport phenomena, (2) the scale separation map (SSM) developed within a multiscale simulation framework of complex automata, (3) the supplemented state diagram (SSD), depicting phase transition and glass transition lines, and (4) a multiscale simulation model for the bubble expansion. Results of the time scale analysis are plotted in the SSD, and give insight into the dominant physical processes involved in expansion. Furthermore, the results of the time scale analysis are used to construct the SSM, which has aided us in the construction of the multiscale simulation model. Simulation results are plotted in the SSD. This clearly shows that the hypothesis of Kokini is qualitatively true, but has to be refined. Our results show that bubble expansion is optimal at the moisture content where the boiling line for a gas pressure of 4 bars intersects the isoviscosity line of the critical viscosity 10⁶ Pa·s, which runs parallel to the glass transition line.

  17. Normal block faulting in the Airport Graben, Managua pull-apart rift, Nicaragua: gravity and magnetic constraints

    NASA Astrophysics Data System (ADS)

    Campos-Enriquez, J. O.; Zambrana Arias, X.; Keppie, D.; Ramón Márquez, V.

    2012-12-01

    Regional scale models have been proposed for the Nicaraguan depression: 1) parallel rifting of the depression (and volcanic front) due to roll back of the underlying subducted Cocos plate; 2) right-lateral strike-slip faulting parallel to the depression and locally offset by pull-apart basins; 3) right-lateral strike-slip faulting parallel to the depression and offset by left-lateral transverse or bookshelf faults. At an intermediate scale, Funk et al. (2011) interpret the depression as half-graben type structures. The E-W Airport graben lies in the southeastern part of the Managua graben (Nicaragua), across which the active Central American volcanic arc is dextrally offset, possibly the result of a subducted transform fault where the subduction angle changes. The Managua graben lies within the late Quaternary Nicaragua depression produced by backarc rifting during roll back of the Middle American Trench. The Managua graben formed as a pull-apart rift associated with dextral bookshelf faulting during dextral shear between the forearc and arc and is the locus of two historical, large earthquakes that destroyed the city of Managua. In order to assess future earthquake risk, four E-W gravity and magnetic profiles were undertaken across the Airport graben, which is bounded by the Cofradia and Airport fault zones to the east and west, respectively, to determine its structure. These data indicated the presence of a series of normal faults bounding down-thrown and up-thrown fault blocks and a listric normal fault, the Sabana Grande Fault. The models imply that this area has been subjected to tectonic extension. These faults appear to be part of the bookshelf suite and will probably be the locus of future earthquakes, which could destroy the airport and the surrounding part of Managua. Three regional SW-NE gravity profiles running from the Pacific Ocean to the Caribbean Sea indicate a change in crustal structure: from north to south the crust thins.
According to these regional crustal models, the offset observed in the Volcanic Front around Lake Nicaragua is associated with a weakness zone related to: 1) this N-S change in crustal structure, 2) the subduction angle of the Cocos plate, and 3) the distance to the Middle America Trench (i.e. the location of the mantle wedge). As mentioned above, a subducted transform fault might have given rise to this crustal discontinuity.

  18. ParallelStructure: A R Package to Distribute Parallel Runs of the Population Genetics Program STRUCTURE on Multi-Core Computers

    PubMed Central

    Besnier, Francois; Glover, Kevin A.

    2013-01-01

    This software package provides an R-based framework to make use of multi-core computers when running analyses in the population genetics program STRUCTURE. It is especially aimed at those users of STRUCTURE dealing with numerous and repeated data analyses, who could take advantage of an efficient script to automatically distribute STRUCTURE jobs among multiple processors. It also provides functions to divide analyses among combinations of populations within a single data set without the need to manually produce multiple projects, as is currently the case in STRUCTURE. The package consists of two main functions: MPI_structure() and parallel_structure() as well as an example data file. We compared the performance in computing time for this example data on two computer architectures and showed that the use of the present functions can result in several-fold improvements in terms of computation time. ParallelStructure is freely available at https://r-forge.r-project.org/projects/parallstructure/. PMID:23923012
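
    The general pattern of distributing many independent STRUCTURE runs across cores looks like the following (a generic Python sketch, not the package's R code; run_job is a hypothetical stand-in for a wrapper that would shell out to the STRUCTURE binary):

```python
from concurrent.futures import ThreadPoolExecutor
import itertools

def run_job(params):
    """Hypothetical stand-in for one STRUCTURE run; a real wrapper would invoke
    the STRUCTURE binary via subprocess with these settings."""
    k, rep = params
    return (k, rep, float(1000 * k + rep))    # placeholder for a run result

# All (K, replicate) combinations: K = 1..4 with three replicate runs each
jobs = list(itertools.product(range(1, 5), range(3)))
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_job, jobs))
```

    Threads suffice in this pattern because each real job would be an external process; the several-fold speedups reported above come from exactly this kind of embarrassingly parallel distribution of repeated runs.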

  19. Armours for soft bodies: how far can bioinspiration take us?

    PubMed

    White, Zachary W; Vernerey, Franck J

    2018-05-15

    The development of armour is as old as the dawn of civilization. Early man looked to natural structures to harvest or replicate for protection, leaning on millennia of evolutionary developments in natural protection. Since the advent of more modern weaponry, armour development has seemingly been driven more by materials research than by bio-inspiration. However, parallels can still be drawn between modern bullet-protective armours and natural defensive structures. Soft armour for handgun and fragmentation threats can be likened to mammalian skin, and similarly, hard armour can be compared with exoskeletons and turtle shells. Via bio-inspiration, it may be possible to develop structures not previously researched for ballistic protection. This review will cover current modern ballistic protective structures focusing on energy dissipation and absorption methods, and their natural analogues. As all armour is a compromise between weight, flexibility and protection, the imbricated structure of scaled skin will be presented as a better balance between these factors.

  20. Scalable hierarchical PDE sampler for generating spatially correlated random fields using nonmatching meshes: Scalable hierarchical PDE sampler using nonmatching meshes

    DOE PAGES

    Osborn, Sarah; Zulian, Patrick; Benson, Thomas; ...

    2018-01-30

    This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on an embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. Here, we demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·10⁹ unknowns.
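
    The telescoping idea behind MLMC can be shown with a scalar toy (an illustrative sketch, not the paper's solver): E[Q_L] = E[Q_0] + Σ_l E[Q_l − Q_{l−1}], where the fine and coarse terms of each correction share the same random draw, and here a 2⁻ˡ bias term stands in for the discretization error of a PDE solve on mesh level l.

```python
import numpy as np

rng = np.random.default_rng(0)

def coupled_sample(level):
    """One coupled MLMC sample: fine- and coarse-level approximations of the
    same realization.  The 2**-level term is a toy discretization bias standing
    in for a PDE solve on mesh level `level`."""
    x = rng.standard_normal()
    q_fine = x * x + 2.0 ** (-level)
    q_coarse = (x * x + 2.0 ** (-(level - 1))) if level > 0 else 0.0
    return q_fine, q_coarse

def mlmc(levels, n_per_level):
    """Telescoping MLMC estimator E[Q_0] + sum_l E[Q_l - Q_{l-1}]."""
    est = 0.0
    for level, n in zip(levels, n_per_level):
        pairs = [coupled_sample(level) for _ in range(n)]
        est += np.mean([f - c for f, c in pairs])
    return est

# True quantity is E[x^2] = 1; only the finest level's 2**-5 bias remains,
# and most samples are taken on the cheap coarse levels.
estimate = mlmc(levels=range(6), n_per_level=[4000, 2000, 1000, 500, 250, 125])
```

    The decreasing sample counts per level reflect the MLMC cost balance: the correction variances shrink with level, so few expensive fine-level solves are needed.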

  2. Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies

    PubMed Central

    Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang

    2008-01-01

    Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the EPISNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
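
    The flavor of an additive × additive epistasis test can be illustrated with a two-locus genotype-mean contrast (a simplified numpy sketch, not the EPISNP implementation of the extended Kempthorne model): the A×A contrast vanishes when genotype means are purely additive across loci and picks up any interaction term.

```python
import numpy as np

# Contrast weights for the additive-by-additive (A x A) term on a 3 x 3 table
# of two-locus genotype means (genotypes coded 0, 1, 2 copies of one allele).
aa_weights = np.outer([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]) / 4.0

# Purely additive genotype means: mean(i, j) = a_i + b_j -> no epistasis
means_additive = np.add.outer([0.0, 1.0, 2.0], [0.0, 0.5, 1.0])
# Add an interaction confined to the double-homozygote cell (2, 2)
means_epistatic = means_additive.copy()
means_epistatic[2, 2] += 2.0

aa_null = float(np.sum(aa_weights * means_additive))    # vanishes for additivity
aa_alt = float(np.sum(aa_weights * means_epistatic))    # nonzero interaction
```

    A full test would divide such a contrast by its standard error estimated from the per-genotype sample sizes and residual variance; the other four epistasis effects use different weight matrices over the same table.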

  3. Martian plate tectonics

    NASA Astrophysics Data System (ADS)

    Sleep, N. H.

    1994-03-01

    The northern lowlands of Mars have been produced by plate tectonics. Preexisting old thick highland crust was subducted, while seafloor spreading produced thin lowland crust during late Noachian and Early Hesperian time. In the preferred reconstruction, a breakup margin extended north of Cimmeria Terra between Daedalia Planum and Isidis Planitia where the highland-lowland transition is relatively simple. South dipping subduction occurred beneath Arabia Terra and east dipping subduction beneath Tharsis Montes and Tempe Terra. Lineations associated with Gordii Dorsum are attributed to ridge-parallel structures, while Phlegra Montes and Scandia Colles are interpreted as transfer-parallel structures or ridge-fault-fault triple junction tracks. Other than for these few features, there is little topographic roughness in the lowlands. Seafloor spreading, if it occurred, must have been relatively rapid. Quantitative estimates of spreading rate are obtained by considering the physics of seafloor spreading in the lower (approx. 0.4 g) gravity of Mars, the absence of vertical scarps from age differences across fracture zones, and the smooth axial topography. Crustal thickness at a given potential temperature in the mantle source region scales inversely with gravity. Thus, the velocity of the rough-smooth transition for axial topography also scales inversely with gravity. Plate reorganizations where young crust becomes difficult to subduct are another constraint on spreading age. Plate tectonics, if it occurred, dominated the thermal and stress history of the planet. A geochemical implication is that the lower gravity of Mars allows deeper hydrothermal circulation through cracks and hence more hydration of oceanic crust so that more water is easily subducted than on the Earth.
Age and structural relationships from photogeology as well as median wavelength gravity anomalies across the now dead breakup and subduction margins are the data most likely to test and modify hypotheses about Mars plate tectonics.

  4. CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: Statistical interior properties of globular proteins

    NASA Astrophysics Data System (ADS)

    Jiang, Zhou-Ting; Zhang, Lin-Xi; Sun, Ting-Ting; Wu, Tai-Quan

    2009-10-01

    The formation of long-range contacts deeply affects the three-dimensional structure of globular proteins. Since the 20 types of amino acids and the 4 categories of globular proteins differ in their ability to form long-range contacts, the corresponding statistical properties are thoroughly discussed in this paper. Two parameters, NC and ND, are defined to delimit the valid residues in detail. The relationship between hydrophobicity scales and the valid residue percentage of each amino acid is given in the present work, and linear functions are shown in our statistical results. It is concluded that the hydrophobicity scale defined by chemical derivatives of the amino acids and the nonpolar phase of large unilamellar vesicle membranes is the most effective technique for characterising the hydrophobic behavior of amino acid residues. Meanwhile, the residue percentage Pi and the sequential residue length Li of a given protein i are calculated under different conditions. The statistical results show that the average values of Pi and Li of all-α proteins are the lowest among these 4 classes of globular proteins, indicating that all-α proteins are hardly capable of forming long-range contacts one by one along their linear amino acid sequences. All-β proteins have a higher tendency to construct long-range contacts along their primary sequences, related to the secondary configurations, i.e. the parallel and anti-parallel configurations of β sheets. The investigation of the interior properties of globular proteins connects the three-dimensional structure with the primary sequence data and secondary configurations, and helps us to understand protein structure and the folding process.

  5. Automatic Energy Schemes for High Performance Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sundriyal, Vaibhav

    Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers affect significantly their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, network interconnect, such as Infiniband, may be exploited to maximize energy savings while the application performance loss and frequency switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy saving strategies on a per-call basis. Next, it targets point-to-point communications to group them into phases and apply frequency scaling to them to save energy by exploiting the architectural and communication stalls. Finally, it proposes an automatic runtime system which combines both collective and point-to-point communications into phases, and applies throttling to them apart from DVFS to maximize energy savings. The experimental results are presented for NAS parallel benchmark problems as well as for the realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with a substantially low performance loss on the given platform.

  6. Constituent order and semantic parallelism in online comprehension: eye-tracking evidence from German.

    PubMed

    Knoeferle, Pia; Crocker, Matthew W

    2009-12-01

    Reading times for the second conjunct of and-coordinated clauses are faster when the second conjunct parallels the first conjunct in its syntactic or semantic (animacy) structure than when its structure differs (Frazier, Munn, & Clifton, 2000; Frazier, Taft, Roeper, & Clifton, 1984). What remains unclear, however, is the time course of parallelism effects, their scope, and the kinds of linguistic information to which they are sensitive. Findings from the first two eye-tracking experiments revealed incremental constituent order parallelism across the board: both during structural disambiguation (Experiment 1) and in sentences with unambiguously case-marked constituent order (Experiment 2), as well as for both marked and unmarked constituent orders (Experiments 1 and 2). Findings from Experiment 3 revealed effects of both constituent order and subtle semantic (noun phrase similarity) parallelism. Together our findings provide evidence for an across-the-board account of parallelism for processing and-coordinated clauses, in which both constituent order and semantic aspects of representations contribute towards incremental parallelism effects. We discuss our findings in the context of existing findings on parallelism and priming, as well as mechanisms of sentence processing.

  7. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.

  8. The distinct character of anisotropy and intermittency in inertial and kinetic range solar wind plasma turbulence

    NASA Astrophysics Data System (ADS)

    Kiyani, Khurom; Chapman, Sandra; Osman, Kareem; Sahraoui, Fouad; Hnat, Bogdan

    2014-05-01

    The anisotropic nature of the scaling properties of solar wind magnetic turbulence fluctuations is investigated scale by scale using high cadence in situ magnetic field measurements from the Cluster, ACE and STEREO spacecraft missions in both fast and slow quiet solar wind conditions. The data span five decades in scales from the inertial range to the electron Larmor radius. We find a clear transition in scaling behaviour between the inertial and kinetic range of scales, which provides a direct, quantitative constraint on the physical processes that mediate the cascade of energy through these scales. In the inertial (magnetohydrodynamic) range the statistical properties of turbulent fluctuations are known to be anisotropic, both in the vector components of the magnetic field fluctuations (variance anisotropy) and in the spatial scales of these fluctuations (wavevector or k-anisotropy). We show for the first time that, when measuring parallel to the local magnetic field direction, the full statistical signature of the magnetic and Elsasser field fluctuations is that of a non-Gaussian globally scale-invariant process. This is distinct from the classic multi-exponent statistics observed when the local magnetic field is perpendicular to the flow direction. These observations suggest the weakness, or absence, of a parallel magnetofluid turbulence energy cascade. In contrast to the inertial range, there is a successive increase toward isotropy between parallel and transverse power at scales below the ion Larmor radius, with isotropy being achieved at the electron Larmor radius. Computing higher-order statistics, we show that the full statistical signature of both parallel and perpendicular fluctuations at scales below the ion Larmor radius is that of an isotropic globally scale-invariant non-Gaussian process. Lastly, we perform a survey of multiple intervals of quiet solar wind sampled under different plasma conditions (fast, slow wind; plasma beta etc.) 
and find that the above results on the scaling transition between inertial and kinetic range scales are qualitatively robust, and that quantitatively, there is a spread in the values of the scaling exponents.

  9. The determination of interplanetary magnetic field polarities around sector boundaries using E greater than 2 keV electrons

    NASA Technical Reports Server (NTRS)

    Kahler, S.; Lin, R. P.

    1994-01-01

    The determination of the polarities of interplanetary magnetic fields (whether the field direction is outward from or inward toward the sun) has been based on a comparison of observed field directions with the nominal Parker spiral angle. These polarities can be mapped back to the solar source field polarities. This technique fails when field directions deviate substantially from the Parker angle or when fields are substantially kinked. We introduce a simple new technique to determine the polarities of interplanetary fields using E greater than 2 keV interplanetary electrons which stream along field lines away from the sun. Those electrons usually show distinct unidirectional pitch-angle anisotropies either parallel or anti-parallel to the field. Since the electron flow direction is known to be outward from the sun, the anisotropies parallel to the field indicate outward-pointing, positive-polarity fields, and those anti-parallel indicate inward-pointing, negative-polarity fields. We use data from the UC Berkeley electron experiment on the International Sun Earth Explorer 3 (ISEE-3) spacecraft to compare the field polarities deduced from the electron data, Pe (outward or inward), with the polarities inferred from field directions, Pd, around two sector boundaries in 1979. We show examples of large (greater than 100 deg) changes in azimuthal field direction Phi over short (less than 1 hr) time scales, some with and some without reversals in Pe. The latter cases indicate that such large directional changes can occur in unipolar structures. On the other hand, we found an example of a change in Pe during which the rotation in Phi was less than 30 deg, indicating polarity changes in nearly unidirectional structures. The field directions are poor guides to the polarities in these cases.
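
    The inference rule itself is a one-liner (a schematic sketch with made-up vectors, not the mission analysis code): since the E greater than 2 keV electrons stream away from the sun along the field, the sign of B · v_e gives the polarity directly, independent of the Parker spiral angle.

```python
import numpy as np

def field_polarity(b, v_electron):
    """Sector polarity from suprathermal electron flow: electrons stream
    anti-sunward along the field, so flow parallel to B means B points
    outward (+1) and anti-parallel means inward (-1)."""
    return 1 if float(np.dot(b, v_electron)) > 0 else -1

b_outward = np.array([0.7, -0.7, 0.0])   # made-up, roughly spiral-aligned field
v_beam = np.array([0.9, -0.4, 0.1])      # made-up anti-sunward electron beam
polarity = field_polarity(b_outward, v_beam)
```

    This is why the technique survives large rotations of the field direction: kinks change B but not the anti-sunward sense of the electron streaming.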

  10. Self-assembled three-dimensional chiral colloidal architecture.

    PubMed

    Ben Zion, Matan Yah; He, Xiaojin; Maass, Corinna C; Sha, Ruojie; Seeman, Nadrian C; Chaikin, Paul M

    2017-11-03

    Although stereochemistry has been a central focus of the molecular sciences since Pasteur, its province has previously been restricted to the nanometric scale. We have programmed the self-assembly of micron-sized colloidal clusters with structural information stemming from a nanometric arrangement. This was done by combining DNA nanotechnology with colloidal science. Using the functional flexibility of DNA origami in conjunction with the structural rigidity of colloidal particles, we demonstrate the parallel self-assembly of three-dimensional microconstructs, evincing highly specific geometry that includes control over position, dihedral angles, and cluster chirality. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  11. Postimpact deformation associated with the late Eocene Chesapeake Bay impact structure in southeastern Virginia

    USGS Publications Warehouse

    Johnson, G.H.; Kruse, S.E.; Vaughn, A.W.; Lucey, J.K.; Hobbs, C. H.; Powars, D.S.

    1998-01-01

Upper Cenozoic strata covering the Chesapeake Bay impact structure in southeastern Virginia record intermittent differential movement around its buried rim. Miocene strata in a graben detected by seismic surveys on the York River exhibit variable thickness and are deformed above the crater rim. Fan-like interformational and intraformational angular unconformities within Pliocene-Pleistocene strata, which strike parallel to the crater rim and dip 2-3° away from the crater center, indicate that deformation and deposition were synchronous. Concentric, large-scale crossbedded bioclastic sand bodies of Pliocene age within ~20 km of the buried crater rim formed on offshore shoals, presumably as subsiding listric slump blocks rotated near the crater rim.

  12. A 45-ns molecular dynamics simulation of hemoglobin in water by vectorizing and parallelizing COSMOS90 on the earth simulator: dynamics of tertiary and quaternary structures.

    PubMed

    Saito, Minoru; Okazaki, Isao

    2007-04-30

Molecular dynamics (MD) simulations of human adult hemoglobin (HbA) were carried out for 45 ns in water with all degrees of freedom, including bond stretching, and without any artificial constraints. To perform such large-scale simulations, one of the authors (M.S.) accelerated his own software COSMOS90 on the Earth Simulator by vectorization and parallelization. The dynamical features of HbA were investigated by evaluating root-mean-square deviations from the initial X-ray structure (an oxy T-state hemoglobin with PDB code 1GZX) and root-mean-square fluctuations around the average structure from the simulation trajectories. The four subunits (alpha(1), alpha(2), beta(1), and beta(2)) of HbA maintained structures close to their respective X-ray structures during the simulations even though no constraints were applied to HbA. Dimers alpha(1)beta(1) and alpha(2)beta(2) also maintained structures close to their respective X-ray structures while moving relative to each other like two stacks of dumbbells. The distance between the two dimers (alpha(1)beta(1) and alpha(2)beta(2)) increased by 2 A (7.4%) in the initial 15 ns and then fluctuated stably about that distance with a standard deviation of 0.2 A. The relative orientation of the two dimers fluctuated between the initial X-ray angle -100 degrees and about -105 degrees with intervals of a few tens of nanoseconds.
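The two structural measures used here can be sketched in a few lines; `rmsd` and `rmsf` below are the generic textbook definitions (assuming pre-superposed coordinates), not the COSMOS90 implementation:

```python
import numpy as np

def rmsd(coords, ref):
    """Root-mean-square deviation between two conformations; assumes the
    structures are already superposed (as after fitting a trajectory)."""
    diff = np.asarray(coords) - np.asarray(ref)
    return float(np.sqrt((diff * diff).sum() / len(ref)))

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the trajectory-average
    structure, for a trajectory of shape (n_frames, n_atoms, 3)."""
    traj = np.asarray(trajectory, dtype=float)
    mean = traj.mean(axis=0)
    return np.sqrt(((traj - mean) ** 2).sum(axis=2).mean(axis=0))
```

RMSD against the 1GZX structure tracks drift of the fold; RMSF over the trajectory distinguishes rigid cores from mobile loops.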

  13. A fully parallel in time and space algorithm for simulating the electrical activity of a neural tissue.

    PubMed

    Bedez, Mathieu; Belhachmi, Zakaria; Haeberlé, Olivier; Greget, Renaud; Moussaoui, Saliha; Bouteiller, Jean-Marie; Bischoff, Serge

    2016-01-15

The resolution of a model describing the electrical activity of neural tissue and its propagation within this tissue is highly time-consuming and requires substantial computing power to achieve good results. In this study, we present a method to solve a model describing electrical propagation in neuronal tissue using the parareal algorithm, coupled with spatial parallelization using CUDA on a graphics processing unit (GPU). We applied the method to different dimensions of the geometry of our model (1-D, 2-D and 3-D). The GPU results are compared with simulations from a multi-core processor cluster using the message-passing interface (MPI), where the spatial scale was parallelized in order to reach a computation time comparable to that of the presented GPU method. A gain of a factor of 100 in computational time between the sequential results and those obtained using the GPU was observed in the case of the 3-D geometry. Given the structure of the GPU, this factor increases with the fineness of the geometry used in the computation. To the best of our knowledge, this is the first time such a method has been used in neuroscience. Parallelization in time coupled with GPU parallelization in space drastically reduces computational time while retaining a fine resolution of the model describing the propagation of the electrical signal in a neuronal tissue. Copyright © 2015 Elsevier B.V. All rights reserved.
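The time-parallel idea can be illustrated with a minimal serial emulation of the parareal iteration; the coarse and fine propagators below (one vs. many Euler steps per time slice) are illustrative stand-ins for the paper's neural-tissue solvers:

```python
def parareal(f, y0, t0, t1, n_slices, n_iter, fine_steps=100):
    """Serial emulation of parareal for dy/dt = f(t, y): a cheap coarse
    propagator G corrects a fine propagator F whose slice solves are the
    part that runs in parallel (on GPU in the paper)."""
    dt = (t1 - t0) / n_slices

    def euler(y, t, h, steps):
        for _ in range(steps):
            y = y + h * f(t, y)
            t += h
        return y

    def G(y, t):  # coarse: a single Euler step per slice
        return euler(y, t, dt, 1)

    def F(y, t):  # fine: many Euler steps per slice (parallel in practice)
        return euler(y, t, dt / fine_steps, fine_steps)

    # initial coarse sweep over all slices
    U = [y0]
    for n in range(n_slices):
        U.append(G(U[-1], t0 + n * dt))
    # parareal correction iterations
    for _ in range(n_iter):
        Fu = [F(U[n], t0 + n * dt) for n in range(n_slices)]  # parallel region
        Unew = [y0]
        for n in range(n_slices):
            t = t0 + n * dt
            Unew.append(G(Unew[-1], t) + Fu[n] - G(U[n], t))
        U = Unew
    return U
```

After at most `n_slices` iterations the scheme reproduces the sequential fine solution exactly, which is why the wall-clock gain depends on converging in far fewer iterations.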

  14. Parallel Clustering Algorithm for Large-Scale Biological Data Sets

    PubMed Central

    Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

    2014-01-01

Background The recent explosion of biological data brings a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The shared-memory architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate scheme of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. Results A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246
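The similarity-matrix construction step lends itself to a simple block-parallel sketch. This version uses a thread pool as a stand-in for the paper's shared-memory architecture, with the negative squared Euclidean distance conventionally used by affinity propagation:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def similarity_rows(X, rows):
    """Negative squared Euclidean similarities for one block of rows
    (the similarity measure commonly used with affinity propagation)."""
    block = X[rows]
    d2 = ((block[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return rows, -d2

def build_similarity(X, n_workers=4):
    """Partition rows across workers; each worker fills only its own block,
    so no communication is needed until the matrix is assembled."""
    n = len(X)
    S = np.empty((n, n))
    chunks = np.array_split(np.arange(n), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for rows, block in pool.map(lambda r: similarity_rows(X, r), chunks):
            S[rows] = block
    return S
```

The row-block partition mirrors the data-partition idea in the abstract: each worker owns a contiguous slice of the output, so the reduction is embarrassingly parallel.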

  15. Integrated nanoscale tools for interrogating living cells

    NASA Astrophysics Data System (ADS)

    Jorgolli, Marsela

The development of next-generation, nanoscale technologies that interface with biological systems will pave the way towards new understanding of such complex systems. Nanowires -- one-dimensional nanoscale structures -- have shown unique potential as an ideal physical interface to biological systems. Herein, we focus on the development of nanowire-based devices that can enable a wide variety of biological studies. First, we built upon standard nanofabrication techniques to optimize nanowire devices, resulting in perfectly ordered arrays of both opaque (silicon) and transparent (silicon dioxide) nanowires with user-defined structural profiles, densities, and overall patterns, as well as high sample consistency and large-scale production. The high-precision and well-controlled fabrication method, in conjunction with additional technologies, laid the foundation for the generation of highly specialized platforms for imaging, electrochemical interrogation, and molecular biology. Next, we utilized nanowires as the fundamental structure in the development of integrated nanoelectronic platforms to directly interrogate the electrical activity of biological systems. Initially, we generated a scalable intracellular electrode platform based on vertical nanowires that allows for parallel electrical interfacing to multiple mammalian neurons. Our prototype device consisted of 16 individually addressable stimulation/recording sites, each containing an array of 9 electrically active silicon nanowires. We showed that these vertical nanowire electrode arrays could intracellularly record and stimulate neuronal activity in dissociated cultures of rat cortical neurons, similar to patch clamp electrodes. In addition, we used our intracellular electrode platform to measure multiple individual synaptic connections, which enables the reconstruction of the functional connectivity maps of neuronal circuits. 
In order to expand and improve the capabilities of this functional prototype device, we designed and fabricated a new hybrid chip that combines a front-side nanowire-based interface for neuronal recording with back-side complementary metal oxide semiconductor (CMOS) circuits for on-chip multiplexing, voltage control for stimulation, signal amplification, and signal processing. Individual chips contain 1024 stimulation/recording sites, enabling large-scale interfacing of neuronal networks with single-cell resolution. Through electrical and electrochemical characterization of the devices, we demonstrated their enhanced functionality at a massively parallel scale. In our initial cell experiments, we achieved intracellular stimulation and recording of changes in the membrane potential in a variety of cells, including HEK293T cells, cardiomyocytes, and rat cortical neurons. This demonstrated the device's capability for single-cell-resolution recording/stimulation, which, when extended to a large number of neurons in a massively parallel fashion, will enable the functional mapping of complex neuronal networks.

  16. Particle Acceleration, Magnetic Field Generation, and Emission in Relativistic Pair Jets

    NASA Technical Reports Server (NTRS)

    Nishikawa, K. I.; Hardee, P.; Hededal, C. B.; Richardson, G.; Sol, H.; Preece, R.; Fishman, G. J.

    2004-01-01

Shock acceleration is a ubiquitous phenomenon in astrophysical plasmas. Plasma waves and their associated instabilities (e.g., Buneman, Weibel and other two-stream instabilities) created in collisionless shocks are responsible for particle (electron, positron, and ion) acceleration. Using a 3-D relativistic electromagnetic particle (REMP) code, we have investigated particle acceleration associated with a relativistic jet front propagating into an ambient plasma. We find that the growth times depend on the Lorentz factors of the jets: jets with larger Lorentz factors grow more slowly. Simulations show that the Weibel instability created in the collisionless shock front accelerates jet and ambient particles both perpendicular and parallel to the jet propagation direction. The small-scale magnetic field structure generated by the Weibel instability is appropriate to the generation of "jitter" radiation from deflected electrons (positrons), as opposed to synchrotron radiation. The jitter radiation resulting from small-scale magnetic field structures may be important for understanding the complex time structure and spectral evolution observed in gamma-ray bursts or other astrophysical sources containing relativistic jets and relativistic collisionless shocks.

  17. Implementation of a flexible and scalable particle-in-cell method for massively parallel computations in the mantle convection code ASPECT

    NASA Astrophysics Data System (ADS)

    Gassmöller, Rene; Bangerth, Wolfgang

    2016-04-01

Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, such as the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility in the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle transfer between parallel subdomains utilizing existing communication patterns from the finite-element mesh, and the use of established parallel output algorithms like the HDF5 library. 
Finally, we show some relevant particle application cases, compare our implementation to a modern advection-field approach, and demonstrate under which conditions which method is more efficient. We implemented the presented methods in ASPECT (aspect.dealii.org), a freely available open-source community code for geodynamic simulations. The structure of the particle code is highly modular, and segregated from the PDE solver, and can thus be easily transferred to other programs, or adapted for various application cases.
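The advection component of such a particle-in-cell scheme can be sketched generically; this RK2 (midpoint) step in a prescribed velocity field is illustrative only, not ASPECT's actual integrator:

```python
import numpy as np

def advect_rk2(positions, velocity, dt):
    """One RK2 (midpoint) advection step for an (n, dim) array of tracer
    particles in a velocity field v(x); in a finite-element code, v would
    be interpolated from the mesh solution at the particle locations."""
    v1 = velocity(positions)
    mid = positions + 0.5 * dt * v1
    return positions + dt * velocity(mid)
```

In a real adaptive-mesh code, each such step is followed by reassigning particles to cells and transferring those that crossed subdomain boundaries, which is exactly the communication the abstract discusses.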

  18. Extreme-Scale De Novo Genome Assembly

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme-scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different parts of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
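The core k-mer logic behind such assemblers can be illustrated with a toy single-node sketch; `assemble_contig` below greedily extends the most frequent k-mer while the next base is unambiguous, and bears no resemblance in scale to HipMer's distributed hash tables:

```python
from collections import defaultdict

def assemble_contig(reads, k):
    """Toy de Bruijn-style walk: count k-mers from the reads, seed with the
    most frequent one, and extend rightward while exactly one base continues
    the current (k-1)-mer prefix."""
    counts = defaultdict(int)
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
    contig = max(counts, key=counts.get)  # seed with the most frequent k-mer
    limit = sum(map(len, reads))          # guard against cycles
    while True:
        prefix = contig[-(k - 1):]
        ext = [b for b in "ACGT" if prefix + b in counts]
        if len(ext) != 1 or len(contig) > limit:
            break
        contig += ext[0]
    return contig
```

Real assemblers handle sequencing errors, repeats (the ambiguous-extension case here simply stops), and genome-scale k-mer tables partitioned across thousands of nodes.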

  19. A hybrid parallel framework for the cellular Potts model simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Yi; He, Kejing; Dong, Shoubin

    2009-01-01

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximate, and cannot be used for large-scale, complex 3D simulations. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation (~10^8 sites) of the complex collective behavior of numerous cells (~10^6).
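The shared-memory side can be illustrated with a checkerboard decomposition, shown serially here on an Ising-like energy (a simplification; the actual CPM Hamiltonian and copy-attempt rules are more involved). Sites of one checkerboard color have disjoint neighborhoods, so each half-sweep could be parallelized with OpenMP without update conflicts:

```python
import math
import random

def checkerboard_sweep(lattice, beta, rng):
    """One Metropolis sweep over a periodic 2-D spin lattice, done as two
    checkerboard half-sweeps; within one color class the neighborhoods are
    disjoint, so each half-sweep is safely parallelizable."""
    n = len(lattice)
    for color in (0, 1):
        for i in range(n):          # this loop nest is the parallel region
            for j in range(n):
                if (i + j) % 2 != color:
                    continue
                s = lattice[i][j]
                nb = (lattice[(i + 1) % n][j] + lattice[(i - 1) % n][j]
                      + lattice[i][(j + 1) % n] + lattice[i][(j - 1) % n])
                dE = 2.0 * s * nb   # energy change of flipping s
                if dE <= 0 or rng.random() < math.exp(-beta * dE):
                    lattice[i][j] = -s
    return lattice
```

In the hybrid framework described above, sweeps like this run under OpenMP on each node while MPI handles the PDE, division, and reaction steps across nodes.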

  20. High-Resolution Numerical Simulation and Analysis of Mach Reflection Structures in Detonation Waves in Low-Pressure H 2 –O 2 –Ar Mixtures: A Summary of Results Obtained with the Adaptive Mesh Refinement Framework AMROC

    DOE PAGES

    Deiterding, Ralf

    2011-01-01

Numerical simulation can be key to the understanding of the multidimensional nature of transient detonation waves. However, the accurate approximation of realistic detonations is demanding, as a wide range of scales needs to be resolved. This paper describes a successful solution strategy that utilizes logically rectangular dynamically adaptive meshes. The hydrodynamic transport scheme and the treatment of the nonequilibrium reaction terms are sketched. A ghost fluid approach is integrated into the method to allow for embedded geometrically complex boundaries. Large-scale parallel simulations of unstable detonation structures of Chapman-Jouguet detonations in low-pressure hydrogen-oxygen-argon mixtures demonstrate the efficiency of the described techniques in practice. In particular, computations of regular cellular structures in two and three space dimensions and their development under transient conditions, that is, under diffraction and for propagation through bends, are presented. Some of the observed patterns are classified by shock polar analysis, and a diagram of the transition boundaries between possible Mach reflection structures is constructed.

  1. SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn-Sham calculations at high temperature

    NASA Astrophysics Data System (ADS)

    Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; Pask, John E.

    2018-03-01

    We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for O(N) Kohn-Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw-Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw-Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near perfect O(N) scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.
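The underlying Clenshaw-Curtis quadrature can be sketched for an ordinary integral on [-1, 1]; this generic weight formula (n even) illustrates the rule itself, not the spatially localized bilinear-form machinery of SQDFT:

```python
import math

def clenshaw_curtis(f, n):
    """Clenshaw-Curtis quadrature of f over [-1, 1] using the n+1 Chebyshev
    points cos(pi*k/n), with weights from the standard cosine-series formula
    (n assumed even)."""
    w = []
    for k in range(n + 1):
        s = 0.0
        for j in range(1, n // 2 + 1):
            b = 1.0 if 2 * j == n else 2.0          # last term is halved
            s += b * math.cos(2.0 * j * k * math.pi / n) / (4.0 * j * j - 1.0)
        c = 1.0 if k in (0, n) else 2.0             # endpoint weights halved
        w.append((c / n) * (1.0 - s))
    nodes = [math.cos(math.pi * k / n) for k in range(n + 1)]
    return sum(wk * f(xk) for wk, xk in zip(w, nodes))
```

For smooth integrands the rule converges spectrally, which is the property SQDFT exploits when approximating bilinear forms with modest quadrature orders.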

  2. [Value of gray scale analysis for the assessment of ultrasound detected structures in the area of the abdomen].

    PubMed

    Wildgrube, H J; Dehwald, H

    1990-01-01

The characteristics of the echo structure constitute an important criterion for the appraisal of sonograms. Since every pixel usually represents one out of 64 gray values, it should be possible to use the density as an objective parameter of the echo structure. In this study, the echogenicity of the pancreas was examined. The density of the pancreas became higher with increasing accumulation of fatty connective tissue or as a result of air in the intestine. In 42 people with varying degrees of obesity, the echo structure was compared with the gray scale distribution of the lumen of the gallbladder, the aorta and the water-filled stomach. The results indicated that the increasing echo density is attributable to reflection and scatter of the ultrasound in adjacent regions. The presence of air gave rise to the same effect. On the basis of standardized investigations at 15-minute intervals, the density and the visual index under the influence of a quick-acting simethicone preparation (Lefax) were compared. The density decreased significantly within 30 to 45 minutes, in parallel with the reduction of superimposition interference due to air. The present investigations confirm the relevance of gray scale analysis for objective confirmation of sonographic structures. However, they make it evident that the echo pattern is quantifiable only under standardized conditions and when the projection plane is largely occupied. Misleading mixed values are measured in marginal zones and in superimpositions.

  3. Laser-induced extreme magnetic field in nanorod targets

    NASA Astrophysics Data System (ADS)

    Lécz, Zsolt; Andreev, Alexander

    2018-03-01

The application of nano-structured target surfaces in laser-solid interaction has attracted significant attention in the last few years. Their ability to absorb significantly more laser energy promises a possible route for advancing the currently established laser ion acceleration concepts. However, it is crucial to have a better understanding of field evolution and electron dynamics during laser-matter interactions before the employment of such exotic targets. This paper focuses on the magnetic field generation in nano-forest targets consisting of parallel nanorods grown on plane surfaces. A general scaling law for the self-generated quasi-static magnetic field amplitude is given, and it is shown that amplitudes up to 1 MT are achievable with current technology. Analytical results are supported by three-dimensional particle-in-cell simulations. Non-parallel arrangements of nanorods have also been considered, which result in the generation of donut-shaped azimuthal magnetic fields in a larger volume.

  4. Acceleration of the Particle Swarm Optimization for Peierls-Nabarro modeling of dislocations in conventional and high-entropy alloys

    NASA Astrophysics Data System (ADS)

    Pei, Zongrui; Eisenbach, Markus

    2017-06-01

Dislocations are among the most important defects in determining the mechanical properties of both conventional alloys and high-entropy alloys. The Peierls-Nabarro model supplies an efficient pathway to their geometries and mobility. The difficulty in solving the integro-differential Peierls-Nabarro equation is how to effectively avoid local minima in the energy landscape of a dislocation core. Among the methods available to optimize dislocation core structures, we choose Particle Swarm Optimization, an algorithm that simulates the social behavior of organisms. By employing more particles (a bigger swarm) and more iterative steps (allowing them to explore for a longer time), local minima can be effectively avoided, though at greater computational cost. The advantage of this algorithm is that it is readily parallelized on modern high-performance computing architectures. We demonstrate that the performance of our parallelized algorithm scales linearly with the number of employed cores.
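A minimal PSO loop of the kind described might look like the following (serial here; the per-particle objective evaluations are what parallelize). The inertia and attraction coefficients are standard textbook values, assumed rather than taken from the paper:

```python
import random

def pso(f, bounds, n_particles=30, n_iter=200, seed=0):
    """Minimal particle swarm optimization: each particle is pulled toward
    its personal best and the swarm's global best; bigger swarms and more
    iterations help escape local minima at higher computational cost."""
    rng = random.Random(seed)
    lo, hi = bounds
    dim = len(lo)
    X = [[rng.uniform(lo[d], hi[d]) for d in range(dim)]
         for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]
    pval = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = pbest[g][:], pval[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, social coefficients
    for _ in range(n_iter):
        for i in range(n_particles):  # evaluations parallelize across cores
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + c1 * rng.random() * (pbest[i][d] - X[i][d])
                           + c2 * rng.random() * (gbest[d] - X[i][d]))
                X[i][d] += V[i][d]
            v = f(X[i])
            if v < pval[i]:
                pbest[i], pval[i] = X[i][:], v
                if v < gval:
                    gbest, gval = X[i][:], v
    return gbest, gval
```

Because each particle's objective evaluation is independent within an iteration, the wall time scales with `n_particles / n_cores`, which is the linear scaling the abstract reports.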

  5. Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Biswas, Rupak

    2003-01-01

The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high-performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.

  6. Raytracing and Direct-Drive Targets

    NASA Astrophysics Data System (ADS)

    Schmitt, Andrew J.; Bates, Jason; Fyfe, David; Eimerl, David

    2013-10-01

    Accurate simulation of the effects of laser imprinting and drive asymmetries in directly driven targets requires the ability to distinguish between raytrace noise and the intensity structure produced by the spatial and temporal incoherence of optical smoothing. We have developed and implemented a smoother raytrace algorithm for our mpi-parallel radiation hydrodynamics code, FAST3D. The underlying approach is to connect the rays into either sheets (in 2D) or volume-enclosing chunks (in 3D) so that the absorbed energy distribution continuously covers the propagation area illuminated by the laser. We will describe the status and show the different scalings encountered in 2D and 3D problems as the computational size, parallelization strategy, and number of rays is varied. Finally, we show results using the method in current NIKE experimental target simulations and in proposed symmetric and polar direct-drive target designs. Supported by US DoE/NNSA.

  7. Sub-volcanic slope influencing the development of major structures at volcanoes during strike-slip faulting

    NASA Astrophysics Data System (ADS)

    Andrade, Daniel; van Wyk de Vries, Benjamin; Robin, Claude

    2014-05-01

Volcano-basement interactions can deeply determine the structural development of volcanoes, basically through the propagation of stress and strain fields from the basement into the volcanic edifice, and vice versa. An extensively studied case of such interactions is the propagation of a strike-slip fault through a volcanic edifice, which gives rise to a strong tendency for major volcanic construction and destruction events to occur in a sub-parallel trend with respect to the strike of the fault. In previous studies, however, both scaled and natural prototypes have always assumed that the surfaces on which volcanoes stand (i.e. the sub-volcanic slopes) are horizontal. The scaled experiments presented here show that the dip angle and dip direction of the sub-volcanic slope can systematically and significantly change the deformation patterns developed by the volcanic edifice during strike-slip faulting. When the dip direction of the sub-volcanic slope and the strike of the fault are nearly parallel, the deformation develops and concentrates preferentially on the down-slope side of the volcanic cone. In the medium to long term, this again implies a tendency for major volcanic structures to grow in a sub-parallel trend with respect to the strike of the fault, but with one preferred direction: that of the dip direction. In the experiments, the dip direction of the sub-volcanic slope was set progressively oblique, up to perpendicular, with respect to the strike of the fault by: 1) rotating in the same sense as the strike-slip fault, or 2) rotating in the opposite sense. In both cases, the down-slope side of the volcanic cone still concentrates the deformation, but the deformed sectors progressively rotate, which results in a structural development (construction and destruction) of the edifice that is clearly oblique with respect to the strike of the fault. 
Imbabura volcano (Ecuador) is traversed by the strike-slip El Angel-Río Ambi fault, whose sense of movement (left- or right-lateral) has not yet been clearly established. Additionally, Imbabura was constructed on the NW, medium to lower flank of the neighboring Cubilche volcano. The application of the experimental results presented above to the case of Imbabura helps to explain the particular structure of this volcano, which displays a complex history of construction and destruction events. Moreover, the experiments strongly suggest that the El Angel-Río Ambi fault is left-lateral.

  8. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.

  9. Enceladus Jet Orientations: Effects of Surface Structure

    NASA Astrophysics Data System (ADS)

    Helfenstein, P.; Porco, C.; DiNino, D.

    2013-12-01

    Jetting activity across the South Polar Terrain (SPT) of Enceladus is now known to erupt directly from tiger-stripe rifts and associated fracture systems. However, details of the vent conduit geometry are hidden below the icy surface. The three-dimensional orientations of the erupting jets may provide important clues. Porco et al. (2013, Lunar Planet. Sci. Conf. 44th, p.1775) surveyed jet locations and orientations as imaged at high resolution (< 1.3 km/pixel) by Cassini ISS from 2005 through May 2012. Ninety-eight (98) jets were identified either on the main trunks or branches of the 4 tiger-stripes. The azimuth angles of the jets are seen to vary across the SPT. Here, we use histogram analysis of the survey data to test if the jet azimuths are influenced by their placement relative to surface morphology and tectonic structures. Azimuths are measured positive counterclockwise with zero pointing along the fracture in the direction of the sub-Saturn hemisphere, and rosette histograms were binned in 30° increments. Overall, the jet azimuths are not random and only about 11% of them are co-aligned with the tiger stripe valley. There are preferred diagonal orientations between 105°-165° and again between 255°-345°. These trends are dominant along the Damascus and Baghdad tiger-stripes where more than half of the jets are found. Histograms for Cairo and Alexandria show less-distinct trends, fewer jets being measured there, but combining data from both suggests a different pattern of preferred orientations; from 45°-75° and 265°-280°. Many possible factors could affect the orientations of jets, for example, the conduit shape, the presence of obstacles like narrow medial ridges called 'shark-fins' along tiger-stripe valleys, the possibility that jets may breach the surface at some point other than the center of a tiger-stripe, and the presence of structural fabrics or mechanical weaknesses, such as patterns of cross-cutting fractures. 
The dominance of diagonally crossing azimuths for Damascus and Baghdad suggest that cross-cutting fractures may significantly control jet orientations. At the 100 m/pixel scale of our Enceladus basemap at least 24% of the jets have azimuth orientations that point along or parallel to nearby fractures or fabrics of parallel fractures that approach or intersect the tiger stripe. Structural control of jet orientations by local tectonism is especially suggested by a systematic pattern of jet orientations at the distal end of Damascus Sulcus where it bifurcates into a northern and a southern branch, respectively. The five most distal jets along the northern branch are nearly parallel and point northward while the three most distal jets along the southern branch are also nearly parallel, but they point in the opposite direction. Additional work is needed to show the extent to which jet orientations may be affected at smaller scales by quasi-parallel systems of cross-cutting gossamer fractures or by curving axial discontinuities along the tiger stripes (cf. Helfenstein et al. 2011, http://encfg.ciclops.org/reg/uploads/20110425220109_helfenstein_enceladus_workshop_2011.pdf).
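The azimuth binning described above (30° increments for rosette histograms) reduces to a few lines; the function below is an illustrative reconstruction, not the authors' analysis code:

```python
def rosette_histogram(azimuths_deg, bin_width=30):
    """Bin azimuth angles (degrees, measured positive counterclockwise from
    the local fracture direction) into fixed-width sectors for a rosette
    diagram; angles are wrapped into [0, 360)."""
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for a in azimuths_deg:
        counts[int(a % 360) // bin_width] += 1
    return counts
```

Preferred orientations such as the reported 105°-165° band then appear as dominant sectors in the resulting counts.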

  10. Casimir force in O(n) systems with a diffuse interface.

    PubMed

    Dantchev, Daniel; Grüneberg, Daniel

    2009-04-01

We study the behavior of the Casimir force in O(n) systems with a diffuse interface and slab geometry ∞^(d-1) × L, where 2 < d < 4, in the n → ∞ limit of O(n) models with antiperiodic boundary conditions applied along the finite dimension L of the film. We observe that the Casimir amplitude Δ_Casimir(d | J⊥, J∥) of the anisotropic d-dimensional system is related to that of the isotropic system Δ_Casimir(d) via Δ_Casimir(d | J⊥, J∥) = (J⊥/J∥)^((d-1)/2) Δ_Casimir(d). For d = 3 we derive the exact Casimir amplitude Δ_Casimir(3 | J⊥, J∥) = [Cl₂(π/3)/3 − ζ(3)/(6π)] (J⊥/J∥), as well as the exact scaling functions of the Casimir force and of the helicity modulus Υ(T, L). We obtain that β_c Υ(T_c, L) = (2/π²)[Cl₂(π/3)/3 + 7ζ(3)/(30π)] (J⊥/J∥) L⁻¹, where T_c is the critical temperature of the bulk system. We find that the contributions to the excess free energy due to the existence of a diffuse interface result in a repulsive Casimir force in the whole temperature region.
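
The d = 3 amplitude combines the Clausen function Cl₂(π/3) and Apéry's constant ζ(3); a quick numeric check by direct series summation (pure Python, taking the isotropic case so the J⊥/J∥ anisotropy factor is 1):

```python
import math

def clausen2(theta, terms=200_000):
    """Clausen function Cl2(theta) = sum_{n>=1} sin(n*theta)/n^2,
    by slow direct summation (fine for a sanity check)."""
    return sum(math.sin(n * theta) / n**2 for n in range(1, terms + 1))

def zeta3(terms=200_000):
    """Apery's constant zeta(3) = sum_{n>=1} 1/n^3."""
    return sum(1.0 / n**3 for n in range(1, terms + 1))

cl2 = clausen2(math.pi / 3)          # ~ 1.014942
z3 = zeta3()                         # ~ 1.202057
# d = 3 Casimir amplitude, isotropic couplings (J_perp = J_par):
delta3 = cl2 / 3 - z3 / (6 * math.pi)
print(round(delta3, 5))              # ~ 0.27454
```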

  11. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

    DOE PAGES

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

    2016-09-18

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  12. A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.

    This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.

  13. Visualization Co-Processing of a CFD Simulation

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi

    1999-01-01

OVERFLOW, a widely used CFD simulation code, is combined with a visualization system, pV3, to experiment with an environment for simulation/visualization co-processing on an SGI Origin 2000 (O2K) computer system. The shared-memory version of the solver is used, with the O2K 'pfa' preprocessor invoked to automatically discover parallelism in the source code. No other explicit parallelism is enabled. In order to study the scaling and performance of the visualization co-processing system, sample runs are made with different processor groups in the range of 1 to 254 processors. The data exchange between the visualization system and the simulation system is rapid enough for user interactivity when the problem size is small. This shared-memory version of OVERFLOW, with minimal parallelization, does not scale well to an increasing number of available processors. The visualization task takes about 18 to 30% of the total processing time and does not appear to be a major contributor to the poor scaling. Improper load balancing and inter-processor communication overhead are contributors to this poor performance. Work is in progress aimed at obtaining improved parallel performance of the solver and removing the limitations of serial data transfer to pV3 by examining various parallelization/communication strategies, including the use of explicit message passing.
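
The poor scaling reported is what Amdahl's law predicts when only part of the run is parallelized: speedup on n processors is bounded by 1/((1-p) + p/n), where p is the parallel fraction. The 80% fraction below is purely illustrative, not a measurement from OVERFLOW:

```python
def amdahl_speedup(p, n):
    """Amdahl's law: speedup on n processors when a fraction p of the
    work is parallelizable and the rest runs serially."""
    return 1.0 / ((1.0 - p) + p / n)

# Illustrative: with only 80% of the code parallel, 254 processors help little.
for n in (1, 8, 64, 254):
    print(n, round(amdahl_speedup(0.80, n), 2))
```

Even with infinitely many processors, the speedup here saturates at 1/(1-p) = 5, which is the qualitative behavior the abstract describes.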

  14. Formation Of Nano Layered Lamellar Structure In a Processed γ-TiAl Based Alloy

    NASA Astrophysics Data System (ADS)

    Heshmati-Manesh, S.; Shakoorian, H.; Armaki, H. Ghassemi; Ahmadabadi, M. Nili

    2009-06-01

In this research, the microstructure of an intermetallic alloy based on γ-TiAl has been investigated by optical and transmission electron microscopy. Samples of Ti-47Al-2Cr alloy were subjected to either a cyclic heat treatment or a thermomechanical treatment with the aim of microstructural refinement. In both cases it was found that a very fine lamellar structure with an interlamellar spacing on the nanometre scale is formed. Upon cyclic heat treatment, nano-layers of the α2 and γ ordered intermetallic phases were either formed during the rapid-cooling cycle, in competition with massive structure formation, or formed as a secondary lamellar structure during the final stages of cyclic heat treatment. TEM observations of hot-forged specimens with an initial lamellar structure also revealed that micro-twins form during deformation within the lamellar structure, with twinning planes parallel to the lamellar interfaces. Concurrent dynamic recrystallisation results in a nano-layered structure with an interlamellar spacing of less than 100 nm.

  15. Parallel Mutual Information Based Construction of Genome-Scale Networks on the Intel® Xeon Phi™ Coprocessor.

    PubMed

    Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas

    2015-01-01

    Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can, require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first ever inference of a plant whole genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.
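
TINGe's core idea, mutual information between expression profiles plus permutation testing for significance, can be sketched at toy scale. The equal-width binning and the data below are simplifications for illustration, not the paper's estimator or dataset:

```python
import math
import random

def mutual_information(x, y, bins=4):
    """Mutual information (nats) between two equal-length samples,
    estimated with equal-width binning."""
    def digitize(v):
        lo, hi = min(v), max(v)
        w = (hi - lo) / bins or 1.0
        return [min(int((t - lo) / w), bins - 1) for t in v]
    bx, by = digitize(x), digitize(y)
    n = len(x)
    pxy, px, py = {}, {}, {}
    for a, b in zip(bx, by):
        pxy[(a, b)] = pxy.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def permutation_pvalue(x, y, trials=200, seed=1):
    """Permutation test: fraction of shuffled pairings whose MI
    reaches the observed MI."""
    rng = random.Random(seed)
    obs = mutual_information(x, y)
    hits = sum(1 for _ in range(trials)
               if mutual_information(x, sorted(y, key=lambda _: rng.random())) >= obs)
    return hits / trials

x = [i / 50 for i in range(50)]
y = [v * v for v in x]                  # strongly dependent toy "genes"
print(permutation_pvalue(x, y) < 0.05)  # dependence survives the permutation test
```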

  16. Formation of collisionless shocks in magnetized plasma interaction with kinetic-scale obstacles

    DOE PAGES

    Cruz, F.; Alves, E. P.; Bamford, R. A.; ...

    2017-02-06

We investigate the formation of collisionless magnetized shocks triggered by the interaction between magnetized plasma flows and miniature-sized (order of plasma kinetic-scales) magnetic obstacles resorting to massively parallel, full particle-in-cell simulations, including the electron kinetics. The critical obstacle size to generate a compressed plasma region ahead of these objects is determined by independently varying the magnitude of the dipolar magnetic moment and the plasma magnetization. Here we find that the effective size of the obstacle depends on the relative orientation between the dipolar and plasma internal magnetic fields, and we show that this may be critical to form a shock in small-scale structures. We also study the microphysics of the magnetopause in different magnetic field configurations in 2D and compare the results with full 3D simulations. Finally, we evaluate the parameter range where such miniature magnetized shocks can be explored in laboratory experiments.

  17. Inhomogeneous cosmology and backreaction: Current status and future prospects

    NASA Astrophysics Data System (ADS)

    Bolejko, Krzysztof; Korzyński, Mikołaj

Astronomical observations reveal hierarchical structures in the universe, from galaxies, groups of galaxies, clusters and superclusters, to filaments and voids. On the largest scales, it seems that some kind of statistical homogeneity can be observed. As a result, modern cosmological models are based on spatially homogeneous and isotropic solutions of the Einstein equations, and the evolution of the universe is approximated by the Friedmann equations. In parallel to standard homogeneous cosmology, the field of inhomogeneous cosmology and backreaction is being developed. This field investigates whether small-scale inhomogeneities can, via nonlinear effects, backreact and alter the properties of the universe on its largest scales, leading to a non-Friedmannian evolution. This paper presents the current status of inhomogeneous cosmology and backreaction, and discusses the future prospects of the field; the discussion of prospects is based on a survey of 50 academics working in inhomogeneous cosmology.
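
The backreaction mechanism has a standard formal expression: Buchert's spatial averaging of the Einstein equations for irrotational dust yields Friedmann-like equations for a domain-dependent scale factor a_D, with an extra kinematical backreaction term Q_D:

```latex
% Buchert's averaged equations (irrotational dust), averaging domain D:
3\left(\frac{\dot{a}_{\mathcal D}}{a_{\mathcal D}}\right)^{2}
  = 8\pi G\,\langle\rho\rangle_{\mathcal D}
    - \tfrac{1}{2}\langle\mathcal R\rangle_{\mathcal D}
    - \tfrac{1}{2}\mathcal Q_{\mathcal D},
\qquad
3\,\frac{\ddot{a}_{\mathcal D}}{a_{\mathcal D}}
  = -4\pi G\,\langle\rho\rangle_{\mathcal D} + \mathcal Q_{\mathcal D},

% kinematical backreaction: variance of the expansion rate minus shear
\mathcal Q_{\mathcal D}
  = \tfrac{2}{3}\left(\langle\theta^{2}\rangle_{\mathcal D}
      - \langle\theta\rangle_{\mathcal D}^{2}\right)
    - 2\langle\sigma^{2}\rangle_{\mathcal D}.
```

When Q_D vanishes, standard Friedmann evolution is recovered; a nonzero Q_D is precisely the departure from Friedmannian evolution that the abstract refers to.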

  18. Equilibrium structure of the plasma sheet boundary layer-lobe interface

    NASA Technical Reports Server (NTRS)

    Romero, H.; Ganguli, G.; Palmadesso, P.; Dusenbery, P. B.

    1990-01-01

    Observations are presented which show that plasma parameters vary on a scale length smaller than the ion gyroradius at the interface between the plasma sheet boundary layer and the lobe. The Vlasov equation is used to investigate the properties of such a boundary layer. The existence, at the interface, of a density gradient whose scale length is smaller than the ion gyroradius implies that an electrostatic potential is established in order to maintain quasi-neutrality. Strongly sheared (scale lengths smaller than the ion gyroradius) perpendicular and parallel (to the ambient magnetic field) electron flows develop whose peak velocities are on the order of the electron thermal speed and which carry a net current. The free energy of the sheared flows can give rise to a broadband spectrum of electrostatic instabilities starting near the electron plasma frequency and extending below the lower hybrid frequency.

  19. Exploring the Ability of a Coarse-grained Potential to Describe the Stress-strain Response of Glassy Polystyrene

    DTIC Science & Technology

    2012-10-01

using the open-source code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) (http://lammps.sandia.gov) (23). The commercial...parameters are proprietary and cannot be ported to the LAMMPS simulation code. In our molecular dynamics simulations at the atomistic resolution, we...IBI iterative Boltzmann inversion LAMMPS Large-scale Atomic/Molecular Massively Parallel Simulator MAPS Materials Processes and Simulations MS

  20. A Comparison of Hybrid Reynolds Averaged Navier Stokes/Large Eddy Simulation (RANS/LES) and Unsteady RANS Predictions of Separated Flow for a Variable Speed Power Turbine Blade Operating with Low Inlet Turbulence Levels

    DTIC Science & Technology

    2017-10-01

    Facility is a large-scale cascade that allows detailed flow field surveys and blade surface measurements.10–12 The facility has a continuous run ...structured grids at 2 flow conditions, cruise and takeoff, of the VSPT blade . Computations were run in parallel on a Department of Defense...RANS/LES) and Unsteady RANS Predictions of Separated Flow for a Variable-Speed Power- Turbine Blade Operating with Low Inlet Turbulence Levels

  1. Parallel Computing for Probabilistic Response Analysis of High Temperature Composites

    NASA Technical Reports Server (NTRS)

    Sues, R. H.; Lua, Y. J.; Smith, M. D.

    1994-01-01

    The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.

  2. Hierarchical Parallelism in Finite Difference Analysis of Heat Conduction

    NASA Technical Reports Server (NTRS)

    Padovan, Joseph; Krishna, Lala; Gute, Douglas

    1997-01-01

    Based on the concept of hierarchical parallelism, this research effort resulted in highly efficient parallel solution strategies for very large scale heat conduction problems. Overall, the method of hierarchical parallelism involves the partitioning of thermal models into several substructured levels wherein an optimal balance into various associated bandwidths is achieved. The details are described in this report. Overall, the report is organized into two parts. Part 1 describes the parallel modelling methodology and associated multilevel direct, iterative and mixed solution schemes. Part 2 establishes both the formal and computational properties of the scheme.

  3. Fold-structure analysis of Paleozoic rocks in the Variscan Harz Mountains (Lautenthal, Central Germany) based on laser scanning and 3D modelling

    NASA Astrophysics Data System (ADS)

    Wagner, Bianca; Leiss, Bernd; Stöpler, Ralf; Zahnow, Fabian

    2017-04-01

    Folded Paleozoic sedimentary rocks of Upper Devonian to Lower Carboniferous age are very well exposed in the abandoned chert quarry of Lautenthal in the western Harz Mountains. The outcrop represents typical structures of the Rhenohercynian thrust-and-fold belt of the Variscan orogen and therefore allows quantitative studies for the understanding of, e.g., fold mechanisms and the amount of shortening. The sequence is composed of alternating beds of cherts, shales and tuffites, which show varying thicknesses and undulation and thinning-out of certain layers. Irregularly occurring lenses of greywackes are interpreted as sedimentary intrusions. The compressive deformation style is expressed by different similar and parallel fold structures at varying scales, as well as by small-scale reverse faults and triangle structures. Accurate mapping of the outcrop in the classical way is very challenging due to distant and unconnected outcrop parts with differing elevations and orientations. Furthermore, visibility is limited because of nearby trees, diffuse vegetation cover and the lack of a total view. Therefore, we used a FARO 120 3D laser scanner and a Trimble GNSS device to generate a referenced, true-to-scale point cloud of the complete quarry. Based on the point cloud, a geometric 3D model of prominent horizons and structural features of various sizes was constructed. Thereafter, we analyzed the structures in terms of orientation and deformation mechanisms. Finally, we applied a retrodeformation algorithm to the model to restore the original sedimentary sequence and to calculate shortening, including the amount of pressure solution. Only digital mapping allows such a time-saving, accurate and, especially, complete 3D survey of this excellent study object. We demonstrated that such 3D models enable spatial correlations with other complex structures cropping out in the area. Moreover, we confirmed that structural upscaling to the 100 to 1000 m scale is much easier and much more instructive than it would have been with classical methods.

  4. Measurement of large parallel and perpendicular electric fields on electron spatial scales in the terrestrial bow shock.

    PubMed

    Bale, S D; Mozer, F S

    2007-05-18

    Large parallel (

  5. Skeletonization of gray-scale images by gray weighted distance transform

    NASA Astrophysics Data System (ADS)

    Qian, Kai; Cao, Siqi; Bhattacharya, Prabir

    1997-07-01

In pattern recognition, thinning algorithms are often a useful tool to represent a digital pattern by means of a skeletonized image, consisting of a set of one-pixel-width lines that highlight the significant features of the pattern. There has been interest in applying thinning directly to gray-scale images, motivated by the desire to process images characterized by meaningful information distributed over different levels of gray intensity. In this paper, a new algorithm is presented which can skeletonize both black-and-white and gray pictures. This algorithm is based on the gray-weighted distance transformation; it can be used to process any gray-scale picture whose intensities are not uniformly distributed, and it preserves the topology of the original picture. The process includes a preliminary phase of investigation of the 'hollows' in the gray-scale image; these hollows are considered or not as topological constraints for the skeleton structure, depending on their statistically significant depth. The algorithm can also be executed on a parallel machine, as all the operations are local. Some examples are discussed to illustrate the algorithm.
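
A gray-weighted distance transform of the kind this algorithm builds on can be sketched as Dijkstra-style propagation in which the cost of a path accumulates the gray values of the pixels it visits. This is a generic sketch under that assumption, not the paper's exact definition:

```python
import heapq

def gray_weighted_distance(img, seeds):
    """Gray-weighted distance transform on a 2D grid: the cost of a path
    is the sum of the gray values of the pixels it passes through
    (4-connectivity), propagated from the seed pixels via Dijkstra."""
    h, w = len(img), len(img[0])
    dist = [[float("inf")] * w for _ in range(h)]
    heap = []
    for r, c in seeds:
        dist[r][c] = img[r][c]
        heapq.heappush(heap, (img[r][c], r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue                      # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and d + img[nr][nc] < dist[nr][nc]:
                dist[nr][nc] = d + img[nr][nc]
                heapq.heappush(heap, (dist[nr][nc], nr, nc))
    return dist

img = [[1, 9, 1],
       [1, 9, 1],
       [1, 1, 1]]
# The cheap path from the top-left corner to the top-right goes around
# the bright central ridge rather than over it:
print(gray_weighted_distance(img, [(0, 0)])[0][2])
```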

  6. Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schulz, Roland; Lindner, Benjamin; Petridis, Loukas

    2009-01-01

A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors, other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three-million- and five-million-atom biological systems scale well up to 30k cores, producing 30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.
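
The reaction-field method scales well because each pair interaction is strictly short-ranged: beyond a cutoff the medium is treated as a dielectric continuum, and the potential is shifted so it vanishes at the cutoff. A common textbook form of the RF pair energy, in reduced units (this is a generic sketch, not the paper's code):

```python
def reaction_field_energy(qi, qj, r, r_cut, eps_rf):
    """Pair Coulomb energy with a reaction-field correction, reduced units
    (Coulomb constant = 1).  k_rf bends the potential so that energy and
    force go smoothly to zero at r_cut; eps_rf is the continuum permittivity
    assumed beyond the cutoff."""
    k_rf = (eps_rf - 1.0) / ((2.0 * eps_rf + 1.0) * r_cut**3)
    c_rf = 1.0 / r_cut + k_rf * r_cut**2   # shift so V(r_cut) = 0
    if r >= r_cut:
        return 0.0
    return qi * qj * (1.0 / r + k_rf * r**2 - c_rf)

# Opposite charges well inside the cutoff attract (negative energy);
# at the cutoff the energy has been shifted to zero:
print(reaction_field_energy(1.0, -1.0, 0.5, 1.2, 78.0) < 0.0)
```

Because every interaction dies at r_cut, no global (all-to-all) communication is needed, which is what permits the scaling behavior described above.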

  7. Scaling of Multimillion-Atom Biological Molecular Dynamics Simulation on a Petascale Supercomputer.

    PubMed

    Schulz, Roland; Lindner, Benjamin; Petridis, Loukas; Smith, Jeremy C

    2009-10-13

    A strategy is described for a fast all-atom molecular dynamics simulation of multimillion-atom biological systems on massively parallel supercomputers. The strategy is developed using benchmark systems of particular interest to bioenergy research, comprising models of cellulose and lignocellulosic biomass in an aqueous solution. The approach involves using the reaction field (RF) method for the computation of long-range electrostatic interactions, which permits efficient scaling on many thousands of cores. Although the range of applicability of the RF method for biomolecular systems remains to be demonstrated, for the benchmark systems the use of the RF produces molecular dipole moments, Kirkwood G factors, other structural properties, and mean-square fluctuations in excellent agreement with those obtained with the commonly used Particle Mesh Ewald method. With RF, three million- and five million-atom biological systems scale well up to ∼30k cores, producing ∼30 ns/day. Atomistic simulations of very large systems for time scales approaching the microsecond would, therefore, appear now to be within reach.

  8. Fast Detection of Material Deformation through Structural Dissimilarity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ushizima, Daniela; Perciano, Talita; Parkinson, Dilworth

    2015-10-29

Designing materials that are resistant to extreme temperatures and brittleness relies on assessing structural dynamics of samples. Algorithms are critically important to characterize material deformation under stress conditions. Here, we report on our design of coarse-grain parallel algorithms for image quality assessment based on structural information and on crack detection of gigabyte-scale experimental datasets. We show how key steps can be decomposed into distinct processing flows, one based on structural similarity (SSIM) quality measure, and another on spectral content. These algorithms act upon image blocks that fit into memory, and can execute independently. We discuss the scientific relevance of the problem, key developments, and decomposition of complementary tasks into separate executions. We show how to apply SSIM to detect material degradation, and illustrate how this metric can be allied to spectral analysis for structure probing, while using tiled multi-resolution pyramids stored in HDF5 chunked multi-dimensional arrays. Results show that the proposed experimental data representation supports an average compression rate of 10X, and data compression scales linearly with the data size. We also illustrate how to correlate SSIM to crack formation, and how to use our numerical schemes to enable fast detection of deformation from 3D datasets evolving in time.
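
The SSIM measure has a simple closed form per window: luminance, contrast, and structure terms combined into one ratio. A minimal single-window sketch (the production algorithms above work on tiled image blocks; the constants are the usual stabilizers for 8-bit data):

```python
def ssim(x, y, c1=6.5025, c2=58.5225):
    """Structural similarity between two equal-size gray image blocks,
    flattened to lists; c1 = (0.01*255)^2 and c2 = (0.03*255)^2 are the
    standard stabilizing constants for 8-bit dynamic range."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                       # means
    vx = sum((a - mx) ** 2 for a in x) / n                # variances
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

block = [52, 55, 61, 59, 70, 61, 64, 73, 84]   # a 3x3 block, flattened
print(round(ssim(block, block), 3))            # identical blocks -> 1.0
print(ssim(block, [b + 40 for b in block]) < 1.0)  # degraded block scores lower
```

A deformation detector as described above would flag regions whose SSIM against a reference frame drops below a threshold.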

  9. Snow Micro-Structure Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Micah Johnson, Andrew Slaughter

PIKA is a MOOSE-based application for modeling micro-structure evolution of seasonal snow. The model will be useful for environmental, atmospheric, and climate scientists. Possible applications include application to energy balance models, ice sheet modeling, and avalanche forecasting. The model implements physics from published, peer-reviewed articles. The main purpose is to foster university and laboratory collaboration to build a larger multi-scale snow model using MOOSE. The main feature of the code is that it is implemented using the MOOSE framework, thus making features such as multiphysics coupling, adaptive mesh refinement, and parallel scalability native to the application. PIKA implements three equations: the phase-field equation for tracking the evolution of the ice-air interface within seasonal snow at the grain-scale; the heat equation for computing the temperature of both the ice and air within the snow; and the mass transport equation for monitoring the diffusion of water vapor in the pore space of the snow.
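
PIKA solves these equations with MOOSE's finite elements; purely as a toy illustration of the heat-equation component, here is a 1-D explicit finite-difference step with fixed-temperature boundaries (all values hypothetical):

```python
def heat_step(T, alpha, dx, dt):
    """One explicit finite-difference step of the 1-D heat equation
    dT/dt = alpha * d2T/dx2, with fixed temperatures at both ends."""
    r = alpha * dt / dx**2
    assert r <= 0.5, "explicit scheme stability limit violated"
    return [T[0]] + [T[i] + r * (T[i-1] - 2 * T[i] + T[i+1])
                     for i in range(1, len(T) - 1)] + [T[-1]]

# Hypothetical snow column: warm ground at the bottom, cold air at the top.
T = [0.0] + [-10.0] * 8 + [-20.0]
for _ in range(200):
    T = heat_step(T, alpha=1.0, dx=1.0, dt=0.4)
print(T[0] > T[5] > T[-1])   # profile relaxes toward a linear gradient
```

In the real application this temperature field would be coupled to the phase-field and vapor-transport equations through MOOSE's multiphysics machinery.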

  10. Development of an automated large-scale protein-crystallization and monitoring system for high-throughput protein-structure analyses.

    PubMed

    Hiraki, Masahiko; Kato, Ryuichi; Nagai, Minoru; Satoh, Tadashi; Hirano, Satoshi; Ihara, Kentaro; Kudo, Norio; Nagae, Masamichi; Kobayashi, Masanori; Inoue, Michio; Uejima, Tamami; Oda, Shunichiro; Chavas, Leonard M G; Akutsu, Masato; Yamada, Yusuke; Kawasaki, Masato; Matsugaki, Naohiro; Igarashi, Noriyuki; Suzuki, Mamoru; Wakatsuki, Soichi

    2006-09-01

    Protein crystallization remains one of the bottlenecks in crystallographic analysis of macromolecules. An automated large-scale protein-crystallization system named PXS has been developed consisting of the following subsystems, which proceed in parallel under unified control software: dispensing precipitants and protein solutions, sealing crystallization plates, carrying robot, incubators, observation system and image-storage server. A sitting-drop crystallization plate specialized for PXS has also been designed and developed. PXS can set up 7680 drops for vapour diffusion per hour, which includes time for replenishing supplies such as disposable tips and crystallization plates. Images of the crystallization drops are automatically recorded according to a preprogrammed schedule and can be viewed by users remotely using web-based browser software. A number of protein crystals were successfully produced and several protein structures could be determined directly from crystals grown by PXS. In other cases, X-ray quality crystals were obtained by further optimization by manual screening based on the conditions found by PXS.

  11. Cloud Computing for Protein-Ligand Binding Site Comparison

    PubMed Central

    2013-01-01

    The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery. PMID:23762824

  12. Cloud computing for protein-ligand binding site comparison.

    PubMed

    Hung, Che-Lun; Hua, Guan-Jie

    2013-01-01

The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross-reactivity and toxicity. The well-known and commonly used software SMAP has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high-availability, high-performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
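
The MapReduce decomposition used for pairwise binding-site comparison can be sketched in miniature. The site names and the similarity score below are invented stand-ins; SMAP's real scoring compares 3D pocket geometry:

```python
from collections import defaultdict
from itertools import combinations

def map_phase(pairs, score):
    """Map: for every compared pair, emit (site, similarity) records."""
    for a, b in pairs:
        s = score(a, b)
        yield a, s
        yield b, s

def reduce_phase(mapped):
    """Reduce: keep the best similarity seen for each binding site."""
    best = defaultdict(float)
    for site, s in mapped:
        best[site] = max(best[site], s)
    return dict(best)

# Hypothetical binding-site identifiers and a toy Jaccard score on their names:
sites = ["1abc_A", "1abd_A", "2xyz_B"]
score = lambda a, b: len(set(a) & set(b)) / len(set(a) | set(b))
result = reduce_phase(map_phase(combinations(sites, 2), score))
print(result["1abc_A"] > result["2xyz_B"])   # the two similar sites pair up
```

In Hadoop the map and reduce phases run on different workers; the point of the decomposition is that every pair comparison is independent.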

  13. TomoMiner and TomoMinerCloud: A software platform for large-scale subtomogram structural analysis

    PubMed Central

    Frazier, Zachary; Xu, Min; Alber, Frank

    2017-01-01

Cryo-electron tomography (cryoET) captures the 3D electron density distribution of macromolecular complexes in a close-to-native state. With the rapid advance of cryoET acquisition technologies, it is possible to generate large numbers (>100,000) of subtomograms, each containing a macromolecular complex. Often, these subtomograms represent a heterogeneous sample due to variations in the structure and composition of a complex in its in situ form, or because the particles are a mixture of different complexes. In this case subtomograms must be classified. However, classification of large numbers of subtomograms is a time-intensive task and often a limiting bottleneck. This paper introduces an open-source software platform, TomoMiner, for large-scale subtomogram classification, template matching, subtomogram averaging, and alignment. Its scalable and robust parallel processing allows efficient classification of tens to hundreds of thousands of subtomograms. Additionally, TomoMiner provides a pre-configured TomoMinerCloud computing service permitting users without sufficient computing resources instant access to TomoMiner's high-performance features. PMID:28552576

  14. A nonrecursive order N preconditioned conjugate gradient: Range space formulation of MDOF dynamics

    NASA Technical Reports Server (NTRS)

    Kurdila, Andrew J.

    1990-01-01

    While excellent progress has been made in deriving algorithms that are efficient for certain combinations of system topologies and concurrent multiprocessing hardware, several issues must be resolved to incorporate transient simulation in the control design process for large space structures. Specifically, strategies must be developed that are applicable to systems with numerous degrees of freedom. In addition, the algorithms must have a growth potential in that they must also be amenable to implementation on forthcoming parallel system architectures. For mechanical system simulation, this fact implies that algorithms are required that induce parallelism on a fine scale, suitable for the emerging class of highly parallel processors; and transient simulation methods must be automatically load balancing for a wider collection of system topologies and hardware configurations. These problems are addressed by employing a combination range space/preconditioned conjugate gradient formulation of multi-degree-of-freedom dynamics. The method described has several advantages. In a sequential computing environment, the method has the features that: by employing regular ordering of the system connectivity graph, an extremely efficient preconditioner can be derived from the 'range space metric', as opposed to the system coefficient matrix; because of the effectiveness of the preconditioner, preliminary studies indicate that the method can achieve performance rates that depend linearly upon the number of substructures, hence the title 'Order N'; and the method is non-assembling. Furthermore, the approach is promising as a potential parallel processing algorithm in that the method exhibits a fine parallel granularity suitable for a wide collection of combinations of physical system topologies/computer architectures; and the method is easily load balanced among processors, and does not rely upon system topology to induce parallelism.
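
The preconditioned conjugate gradient at the heart of this formulation has a compact generic form. The sketch below uses a Jacobi (diagonal) preconditioner as a simple stand-in for the paper's 'range space metric' preconditioner, on a small SPD system:

```python
def pcg(A, b, precond, tol=1e-10, max_iter=50):
    """Preconditioned conjugate gradient for a symmetric positive-definite
    system A x = b (dense lists); the preconditioner is applied as
    z = precond(r)."""
    n = len(b)
    mv = lambda M, v: [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = [0.0] * n
    r = b[:]                      # residual for the zero initial guess
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = mv(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            break
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
jacobi = lambda r: [r[i] / A[i][i] for i in range(len(r))]  # diagonal preconditioner
x = pcg(A, b, jacobi)
print([round(v, 6) for v in x])   # solves 4x+y=1, x+3y=2
```

The parallel-granularity argument in the abstract comes from the fact that every operation here (matrix-vector product, dot products, vector updates) decomposes over substructures.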

  15. An intercalation-locked parallel-stranded DNA tetraplex

    DOE PAGES

    Tripathi, S.; Zhang, D.; Paukstelis, P. J.

    2015-01-27

DNA has proved to be an excellent material for nanoscale construction because complementary DNA duplexes are programmable and structurally predictable. However, in the absence of Watson–Crick pairings, DNA can be structurally more diverse. Here, we describe the crystal structures of d(ACTCGGATGAT) and the brominated derivative, d(AC BrUCGGA BrUGAT). These oligonucleotides form parallel-stranded duplexes with a crystallographically equivalent strand, resulting in the first examples of DNA crystal structures that contain four different symmetric homo base pairs. Two of the parallel-stranded duplexes are coaxially stacked in opposite directions and locked together to form a tetraplex through intercalation of the 5'-most A–A base pairs between adjacent G–G pairs in the partner duplex. The intercalation region is a new type of DNA tertiary structural motif with similarities to the i-motif. 1H–1H nuclear magnetic resonance and native gel electrophoresis confirmed the formation of a parallel-stranded duplex in solution. Finally, we modified specific nucleotide positions and added d(GAY) motifs to oligonucleotides and were readily able to obtain similar crystals. This suggests that this parallel-stranded DNA structure may be useful in the rational design of DNA crystals and nanostructures.

  16. Performance Evaluation of Parallel Branch and Bound Search with the Intel iPSC (Intel Personal SuperComputer) Hypercube Computer.

    DTIC Science & Technology

    1986-12-01

    (Excerpt from the report's table of contents: III. Analysis of Parallel Design; Parallel Abstract Data Types; Abstract Data Type; Parallel ADT; Data-Structure Design; Object-Oriented Design.)

  17. Cooperative storage of shared files in a parallel computing system with dynamic block size

    DOEpatents

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-11-10

    Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
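    The block-size rule given in the abstract (total data divided by the number of parallel processes) and the resulting per-process exchange amounts can be sketched as follows; the function names are illustrative, not from the patent.

```python
def dynamic_block_size(total_bytes, num_procs):
    # Example rule from the abstract: total amount of data to be
    # stored divided by the number of parallel processes.
    return total_bytes // num_procs

def exchange_plan(local_sizes):
    # Surplus (+) or deficit (-) of each process relative to the
    # dynamically determined block size; surpluses are shipped to
    # deficit processes before the aligned blocks are written.
    block = dynamic_block_size(sum(local_sizes), len(local_sizes))
    return [size - block for size in local_sizes]
```

    For three processes holding 10, 30, and 20 units, the block size is 20 and the plan is [-10, 10, 0]: the second process sends 10 units to the first, after which each writes one 20-unit block to the file system.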

  18. Acoustic Emission Measurement with Fiber Bragg Gratings for Structure Health Monitoring

    NASA Technical Reports Server (NTRS)

    Banks, Curtis E.; Walker, James L.; Russell, Sam; Roth, Don; Mabry, Nehemiah; Wilson, Melissa

    2010-01-01

    Structural health monitoring (SHM) is a way of detecting and assessing damage to large-scale structures. Sensors used in SHM for aerospace structures provide real time data on new and propagating damage. One type of sensor that is typically used is an acoustic emission (AE) sensor that detects the acoustic emissions given off from a material cracking or breaking. The use of fiber Bragg grating (FBG) sensors to provide acoustic emission data for damage detection is studied. In this research, FBG sensors are used to detect acoustic emissions of a material during a tensile test. FBG sensors were placed as a strain sensor (oriented parallel to applied force) and as an AE sensor (oriented perpendicular to applied force). A traditional AE transducer was used to collect AE data to compare with the FBG data. Preliminary results show that AE with FBGs can be a viable alternative to traditional AE sensors.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deng,Y.; Liu, J.; Zheng, Q.

    Entry of SARS coronavirus into its target cell requires large-scale structural transitions in the viral spike (S) glycoprotein in order to induce fusion of the virus and cell membranes. Here we describe the identification and crystal structures of four distinct α-helical domains derived from the highly conserved heptad-repeat (HR) regions of the S2 fusion subunit. The four domains are an antiparallel four-stranded coiled coil, a parallel trimeric coiled coil, a four-helix bundle, and a six-helix bundle that is likely the final fusogenic form of the protein. When considered together, the structural and thermodynamic features of the four domains suggest a possible mechanism whereby the HR regions, initially sequestered in the native S glycoprotein spike, are released and refold sequentially to promote membrane fusion. Our results provide a structural framework for understanding the control of membrane fusion and should guide efforts to intervene in the SARS coronavirus entry process.

  20. Coherent Structures and Extreme Events in Rotating Multiphase Turbulent Flows

    NASA Astrophysics Data System (ADS)

    Biferale, L.; Bonaccorso, F.; Mazzitelli, I. M.; van Hinsberg, M. A. T.; Lanotte, A. S.; Musacchio, S.; Perlekar, P.; Toschi, F.

    2016-10-01

    By using direct numerical simulations (DNS) at unprecedented resolution, we study turbulence under rotation in the presence of simultaneous direct and inverse cascades. The accumulation of energy at large scale leads to the formation of vertical coherent regions with high vorticity oriented along the rotation axis. By seeding the flow with millions of inertial particles, we quantify—for the first time—the effects of those coherent vertical structures on the preferential concentration of light and heavy particles. Furthermore, we quantitatively show that extreme fluctuations, leading to deviations from normally distributed statistics, result from the entangled interaction of the vertical structures with the turbulent background. Finally, we present the first-ever measurement of the relative importance between Stokes drag, Coriolis force, and centripetal force along the trajectories of inertial particles. We discover that vortical coherent structures lead to unexpected diffusion properties for heavy and light particles in the directions parallel and perpendicular to the rotation axis.

  1. Parallel computation of fluid-structural interactions using high resolution upwind schemes

    NASA Astrophysics Data System (ADS)

    Hu, Zongjun

    An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low diffusion E-CUSP scheme, the Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using the MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are proved to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady state separation flow patterns and their unsteady oscillation characteristics. The leading edge vortex shedding is the mechanism behind the unsteady characteristics of the high incidence separated flows. The separation flow characteristics are affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary condition is first calculated under a medium frequency and a low incidence. The full scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles.
The end wall influence and the blade stability are studied and compared under different frequencies and incidence angles. This work is the first application of the Zha CUSP schemes to moving grid systems and to 2D and 3D calculations, the first use of the implicit Gauss-Seidel iteration with dual time stepping on moving grid systems, and the first full-scale calculation of the NASA flutter cascade.

  2. Production of yarns composed of oriented nanofibers for ophthalmological implants

    NASA Astrophysics Data System (ADS)

    Shynkarenko, A.; Klapstova, A.; Krotov, A.; Moucka, M.; Lukas, D.

    2017-10-01

    Parallelized nanofibrous structures are commonly used in the medical sector, especially for ophthalmological implants. In this research, a self-fabricated device is tested for improved collection and twisting of parallel nanofibers. Previously, manual techniques were used to collect the nanofibers before applying twist, whereas our device allows different parameters to be optimized to obtain parallel nanofibers and to apply further twisting. The device thus brings automation to the production of parallel fibrous structures for medical applications.

  3. Alignments of Dark Matter Halos with Large-scale Tidal Fields: Mass and Redshift Dependence

    NASA Astrophysics Data System (ADS)

    Chen, Sijie; Wang, Huiyuan; Mo, H. J.; Shi, Jingjing

    2016-07-01

    Large-scale tidal fields estimated directly from the distribution of dark matter halos are used to investigate how halo shapes and spin vectors are aligned with the cosmic web. The major, intermediate, and minor axes of halos are aligned with the corresponding tidal axes, and halo spin axes tend to be parallel with the intermediate axes and perpendicular to the major axes of the tidal field. The strengths of these alignments generally increase with halo mass and redshift, but the dependence is only on the peak height, ν ≡ δ_c/σ(M_h, z). The scaling relations of the alignment strengths with the value of ν indicate that the alignment strengths remain roughly constant when the structures within which the halos reside are still in a quasi-linear regime, but decrease as nonlinear evolution becomes more important. We also calculate the alignments in projection so that our results can be compared directly with observations. Finally, we investigate the alignments of tidal tensors on large scales, and use the results to understand alignments of halo pairs separated at various distances. Our results suggest that the coherent structure of the tidal field is the underlying reason for the alignments of halos and galaxies seen in numerical simulations and in observations.

  4. Spiking network simulation code for petascale computers.

    PubMed

    Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

    2014-01-01

    Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.

  5. Constraints on muscle performance provide a novel explanation for the scaling of posture in terrestrial animals.

    PubMed

    Usherwood, James R

    2013-08-23

    Larger terrestrial animals tend to support their weight with more upright limbs. This makes structural sense, reducing the loading on muscles and bones, which is disproportionately challenging in larger animals. However, it does not account for why smaller animals are more crouched; instead, they could enjoy relatively more slender supporting structures or higher safety factors. Here, an alternative account for the scaling of posture is proposed, with close parallels to the scaling of jump performance. If the costs of locomotion are related to the volume of active muscle, and the active muscle volume required depends on both the work and the power demanded during the push-off phase of each step (not just the net positive work), then the disproportional scaling of requirements for work and push-off power are revealing. Larger animals require relatively greater active muscle volumes for dynamically similar gaits (e.g. top walking speed)-which may present an ultimate constraint to the size of running animals. Further, just as for jumping, animals with shorter legs and briefer push-off periods are challenged to provide the power (not the work) required for push-off. This can be ameliorated by having relatively long push-off periods, potentially accounting for the crouched stance of small animals.

  6. Constraints on muscle performance provide a novel explanation for the scaling of posture in terrestrial animals

    PubMed Central

    Usherwood, James R.

    2013-01-01

    Larger terrestrial animals tend to support their weight with more upright limbs. This makes structural sense, reducing the loading on muscles and bones, which is disproportionately challenging in larger animals. However, it does not account for why smaller animals are more crouched; instead, they could enjoy relatively more slender supporting structures or higher safety factors. Here, an alternative account for the scaling of posture is proposed, with close parallels to the scaling of jump performance. If the costs of locomotion are related to the volume of active muscle, and the active muscle volume required depends on both the work and the power demanded during the push-off phase of each step (not just the net positive work), then the disproportional scaling of requirements for work and push-off power are revealing. Larger animals require relatively greater active muscle volumes for dynamically similar gaits (e.g. top walking speed)—which may present an ultimate constraint to the size of running animals. Further, just as for jumping, animals with shorter legs and briefer push-off periods are challenged to provide the power (not the work) required for push-off. This can be ameliorated by having relatively long push-off periods, potentially accounting for the crouched stance of small animals. PMID:23825086

  7. Accelerating large scale Kohn-Sham density functional theory calculations with semi-local functionals and hybrid functionals

    NASA Astrophysics Data System (ADS)

    Lin, Lin

    The computational cost of standard Kohn-Sham density functional theory (KSDFT) calculations scales cubically with respect to the system size, which limits its use in large scale applications. In recent years, we have developed an alternative procedure called the pole expansion and selected inversion (PEXSI) method. The PEXSI method solves KSDFT without computing any eigenvalues or eigenvectors, and directly evaluates physical quantities including electron density, energy, atomic force, density of states, and local density of states. The overall algorithm scales at most quadratically for all materials including insulators, semiconductors and the difficult metallic systems. The PEXSI method can be efficiently parallelized over 10,000 - 100,000 processors on high performance machines. The PEXSI method has been integrated into a number of community electronic structure software packages such as ATK, BigDFT, CP2K, DGDFT, FHI-aims and SIESTA, and has been used in a number of applications with 2D materials beyond 10,000 atoms. The PEXSI method works for LDA, GGA and meta-GGA functionals. The mathematical structure of hybrid functional KSDFT calculations is significantly different. I will also discuss recent progress on using the adaptive compressed exchange method for accelerating hybrid functional calculations. DOE SciDAC Program, DOE CAMERA Program, LBNL LDRD, Sloan Fellowship.

  8. Spiking network simulation code for petascale computers

    PubMed Central

    Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M.; Plesser, Hans E.; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz

    2014-01-01

    Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today. PMID:25346682

  9. High-speed prediction of crystal structures for organic molecules

    NASA Astrophysics Data System (ADS)

    Obata, Shigeaki; Goto, Hitoshi

    2015-02-01

    We developed a master-worker parallel algorithm that allocates crystal structure optimization tasks to distributed compute nodes, in order to improve the performance of crystal structure prediction simulations. The performance experiments were carried out on the TUT-ADSIM supercomputer system (HITACHI HA8000-tc/HT210). The experimental results show that our parallel algorithm achieves speed-ups of 214 and 179 times using 256 processor cores for the crystal structure optimizations in predictions of the crystal structures of 3-aza-bicyclo(3.3.1)nonane-2,4-dione and 2-diazo-3,5-cyclohexadiene-1-one, respectively. We expect that this parallel algorithm can reduce the computational cost of any crystal structure prediction.
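    The master-worker pattern described can be sketched as follows. The 'optimization' here is a hypothetical toy energy minimization standing in for a crystal structure optimization, and a thread pool stands in for the distributed compute nodes of the actual system.

```python
from concurrent.futures import ThreadPoolExecutor

def optimize_structure(task_id):
    # Hypothetical stand-in for one crystal structure optimization:
    # gradient descent on the toy energy E(x) = (x - 3)^2.
    x = float(task_id)
    for _ in range(200):
        x -= 0.1 * 2.0 * (x - 3.0)
    return task_id, x

def master(num_tasks, num_workers):
    # The master hands independent optimization tasks to workers;
    # on a real cluster the pool would be distributed compute nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return dict(pool.map(optimize_structure, range(num_tasks)))
```

    Because the tasks are independent, idle workers can be given new starting structures as soon as they finish, which is what makes the task allocation scale across many cores.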

  10. Brief Self-Efficacy Scales for use in Weight-Loss Trials: Preliminary Evidence of Validity

    PubMed Central

    Wilson, Kathryn E.; Harden, Samantha M.; Almeida, Fabio A.; You, Wen; Hill, Jennie L.; Goessl, Cody; Estabrooks, Paul A.

    2015-01-01

    Self-efficacy is a commonly included cognitive variable in weight-loss trials, but there is little uniformity in its measurement. Weight-loss trials frequently focus on physical activity (PA) and eating behavior, as well as weight loss, but no survey is available that offers reliable measurement of self-efficacy as it relates to each of these targeted outcomes. The purpose of this study was to test the psychometric properties of brief, pragmatic self-efficacy scales specific to PA, healthful eating and weight-loss (4 items each). An adult sample (n=1790) from 28 worksites enrolled in a worksite weight-loss program completed the self-efficacy scale, as well as measures of PA, dietary fat intake, and weight, at baseline, 6-, and 12-months. The hypothesized factor structure was tested through confirmatory factor analysis, which supported the expected factor structure for three latent self-efficacy factors, specific to PA, healthful eating, and weight-loss. Measurement equivalence/invariance between relevant demographic groups, and over time was also supported. Parallel growth processes in self-efficacy factors and outcomes (PA, fat intake, and weight) support the predictive validity of score interpretations. Overall, this initial series of psychometric analyses supports the interpretation that scores on these scales reflect self-efficacy for PA, healthful eating, and weight-loss. The use of this instrument in large-scale weight-loss trials is encouraged. PMID:26619093

  11. Parallel goal-oriented adaptive finite element modeling for 3D electromagnetic exploration

    NASA Astrophysics Data System (ADS)

    Zhang, Y.; Key, K.; Ovall, J.; Holst, M.

    2014-12-01

    We present a parallel goal-oriented adaptive finite element method for accurate and efficient electromagnetic (EM) modeling of complex 3D structures. An unstructured tetrahedral mesh allows this approach to accommodate arbitrarily complex 3D conductivity variations and a priori known boundaries. The total electric field is approximated by the lowest order linear curl-conforming shape functions and the discretized finite element equations are solved by a sparse LU factorization. Accuracy of the finite element solution is achieved through adaptive mesh refinement that is performed iteratively until the solution converges to the desired accuracy tolerance. Refinement is guided by a goal-oriented error estimator that uses a dual-weighted residual method to optimize the mesh for accurate EM responses at the locations of the EM receivers. As a result, the mesh refinement is highly efficient since it only targets the elements where the inaccuracy of the solution corrupts the response at the possibly distant locations of the EM receivers. We compare the accuracy and efficiency of two approaches for estimating the primary residual error required at the core of this method: one uses local element and inter-element residuals and the other relies on solving a global residual system using a hierarchical basis. For computational efficiency our method follows the Bank-Holst algorithm for parallelization, where solutions are computed in subdomains of the original model. To resolve the load-balancing problem, this approach applies a spectral bisection method to divide the entire model into subdomains that have approximately equal error and the same number of receivers. The finite element solutions are then computed in parallel with each subdomain carrying out goal-oriented adaptive mesh refinement independently. 
We validate the newly developed algorithm by comparison with controlled-source EM solutions for 1D layered models and with 2D results from our earlier 2D goal oriented adaptive refinement code named MARE2DEM. We demonstrate the performance and parallel scaling of this algorithm on a medium-scale computing cluster with a marine controlled-source EM example that includes a 3D array of receivers located over a 3D model that includes significant seafloor bathymetry variations and a heterogeneous subsurface.
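    The iterative goal-oriented refinement loop described above can be sketched generically. The solver, error estimator, and refinement routine are hypothetical callables, and the bulk (Dörfler) marking shown is a standard strategy rather than the authors' exact dual-weighted residual criterion.

```python
def adaptive_refine(mesh, solve, estimate_error, refine, tol, max_iters=100):
    # Solve, estimate per-element error at the quantities of interest,
    # refine the elements carrying most of the error, and repeat until
    # the estimated error meets the tolerance.
    for _ in range(max_iters):
        solution = solve(mesh)
        errors = estimate_error(mesh, solution)
        if sum(errors) < tol:
            return mesh, solution
        # Bulk (Doerfler) marking: refine elements covering half the error.
        order = sorted(range(len(errors)), key=lambda i: -errors[i])
        marked, acc, target = [], 0.0, 0.5 * sum(errors)
        for i in order:
            marked.append(i)
            acc += errors[i]
            if acc >= target:
                break
        mesh = refine(mesh, marked)
    return mesh, solve(mesh)
```

    Because only the elements whose error pollutes the receiver responses are marked, the mesh grows where it matters and the rest stays coarse, which is the efficiency argument made in the abstract.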

  12. Traffic Simulations on Parallel Computers Using Domain Decomposition Techniques

    DOT National Transportation Integrated Search

    1995-01-01

    Large scale simulations of Intelligent Transportation Systems (ITS) can only be achieved by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow the performance of traffic...

  13. Mesh-free data transfer algorithms for partitioned multiphysics problems: Conservation, accuracy, and parallelism

    DOE PAGES

    Slattery, Stuart R.

    2015-12-02

    In this study we analyze and extend mesh-free algorithms for three-dimensional data transfer problems in partitioned multiphysics simulations. We first provide a direct comparison between a mesh-based weighted residual method using the common-refinement scheme and two mesh-free algorithms leveraging compactly supported radial basis functions: one using a spline interpolation and one using a moving least square reconstruction. Through the comparison we assess both the conservation and accuracy of the data transfer obtained from each of the methods. We do so for a varying set of geometries with and without curvature and sharp features and for functions with and without smoothness and with varying gradients. Our results show that the mesh-based and mesh-free algorithms are complementary with cases where each was demonstrated to perform better than the other. We then focus on the mesh-free methods by developing a set of algorithms to parallelize them based on sparse linear algebra techniques. This includes a discussion of fast parallel radius searching in point clouds and restructuring the interpolation algorithms to leverage data structures and linear algebra services designed for large distributed computing environments. The scalability of our new algorithms is demonstrated on a leadership class computing facility using a set of basic scaling studies. Finally, these scaling studies show that for problems with reasonable load balance, our new algorithms for both spline interpolation and moving least square reconstruction demonstrate both strong and weak scalability using more than 100,000 MPI processes with billions of degrees of freedom in the data transfer operation.

  14. Parallel steady state studies on a milliliter scale accelerate fed-batch bioprocess design for recombinant protein production with Escherichia coli.

    PubMed

    Schmideder, Andreas; Cremer, Johannes H; Weuster-Botz, Dirk

    2016-11-01

    In general, fed-batch processes are applied for recombinant protein production with Escherichia coli (E. coli). However, state-of-the-art methods for identifying suitable reaction conditions suffer from severe drawbacks, i.e. direct transfer of process information from parallel batch studies is often defective and sequential fed-batch studies are time-consuming and cost-intensive. In this study, continuously operated stirred-tank reactors on a milliliter scale were applied to identify suitable reaction conditions for fed-batch processes. Isopropyl β-d-1-thiogalactopyranoside (IPTG) induction strategies were varied in parallel-operated stirred-tank bioreactors to study the effects on the continuous production of the recombinant protein photoactivatable mCherry (PAmCherry) with E. coli. Best-performing induction strategies were transferred from the continuous processes on a milliliter scale to liter scale fed-batch processes. Inducing recombinant protein expression by dynamically increasing the IPTG concentration to 100 µM led to an increase in the product concentration of 21% (8.4 g L⁻¹) compared to an implemented high-performance production process with the most frequently applied induction strategy of a single addition of 1000 µM IPTG. Thus, identifying feasible reaction conditions for fed-batch processes in parallel continuous studies on a milliliter scale was shown to be a powerful, novel method to accelerate bioprocess design in a cost-reducing manner. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1426-1435, 2016. © 2016 American Institute of Chemical Engineers.

  15. A comparative study of serial and parallel aeroelastic computations of wings

    NASA Technical Reports Server (NTRS)

    Byun, Chansup; Guruswamy, Guru P.

    1994-01-01

    A procedure for computing the aeroelasticity of wings on parallel multiple-instruction, multiple-data (MIMD) computers is presented. In this procedure, fluids are modeled using Euler equations, and structures are modeled using modal or finite element equations. The procedure is designed in such a way that each discipline can be developed and maintained independently by using a domain decomposition approach. In the present parallel procedure, each computational domain is scalable. A parallel integration scheme is used to compute aeroelastic responses by solving fluid and structural equations concurrently. The computational efficiency issues of parallel integration of both fluid and structural equations are investigated in detail. This approach, which reduces the total computational time by a factor of almost 2, is demonstrated for a typical aeroelastic wing by using various numbers of processors on the Intel iPSC/860.

  16. Experimental evidence and structural modeling of nonstoichiometric (010) surfaces coexisting in hydroxyapatite nano-crystals.

    PubMed

    Ospina, C A; Terra, J; Ramirez, A J; Farina, M; Ellis, D E; Rossi, A M

    2012-01-01

    High-resolution transmission electron microscopy (HRTEM) and ab initio quantum-mechanical calculations of electronic structure were combined to investigate the structure of the hydroxyapatite (HA) (010) surface, which plays an important role in HA interactions with biological media. HA was synthesized by in vitro precipitation at 37°C. HRTEM images revealed thin elongated rod nanoparticles with preferential growth along the [001] direction and terminations parallel to the (010) plane. The focal series reconstruction (FSR) technique was applied to develop an atomic-scale structural model of the high-resolution images. The HRTEM simulations identified the coexistence of two structurally distinct terminations for (010) surfaces: a rather flat Ca(II)-terminated surface and a zig-zag structure with open OH channels. Density functional theory (DFT) was applied in a periodic slab plane-wave pseudopotential approach to refine details of atomic coordination and bond lengths of Ca(I) and Ca(II) sites in hydrated HA (010) surfaces, starting from the HRTEM model. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Candel, A.; Kabel, A.; Lee, L.

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).

  18. Linux Kernel Co-Scheduling and Bulk Synchronous Parallelism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, Terry R

    2012-01-01

    This paper describes a kernel scheduling algorithm that is based on coscheduling principles and that is intended for parallel applications running on 1000 cores or more. Experimental results for a Linux implementation on a Cray XT5 machine are presented. The results indicate that Linux is a suitable operating system for this new scheduling scheme, and that this design provides a dramatic improvement in scaling performance for synchronizing collective operations at scale.

  19. Temporal and spatial influences incur reconfiguration of Arctic heathland soil bacterial community structure.

    PubMed

    Hill, Richard; Saetnan, Eli R; Scullion, John; Gwynn-Jones, Dylan; Ostle, Nick; Edwards, Arwyn

    2016-06-01

    Microbial responses to Arctic climate change could radically alter the stability of major stores of soil carbon. However, the sensitivity of plot-scale experiments simulating climate change effects on Arctic heathland soils to potential confounding effects of spatial and temporal changes in soil microbial communities is unknown. Here, the variation in heathland soil bacterial communities at two survey sites in Sweden between spring and summer 2013, at scales of 0-1 m and 1-100 m, and between sites (> 100 m) was investigated in parallel using 16S rRNA gene T-RFLP and amplicon sequencing. T-RFLP did not reveal spatial structuring of communities at scales < 100 m in any site or season. However, temporal changes were striking. Amplicon sequencing corroborated shifts from r- to K-selected taxon-dominated communities, influencing in silico predictions of functional potential. Network analyses reveal temporal keystone taxa, with a spring betaproteobacterial sub-network centred upon a Burkholderia operational taxonomic unit (OTU) and a reconfiguration to a summer sub-network centred upon an alphaproteobacterial OTU. Although spatial structuring effects may not confound comparison between plot-scale treatments, temporal change is a significant influence. Moreover, the prominence of two temporally exclusive keystone taxa suggests that the stability of Arctic heathland soil bacterial communities could be disproportionally influenced by seasonal perturbations affecting individual taxa. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  20. PREMER: a Tool to Infer Biological Networks.

    PubMed

    Villaverde, Alejandro F; Becker, Kolja; Banga, Julio R

    2017-10-04

    Inferring the structure of unknown cellular networks is a main challenge in computational biology. Data-driven approaches based on information theory can determine the existence of interactions among network nodes automatically. However, the elucidation of certain features - such as distinguishing between direct and indirect interactions or determining the direction of a causal link - requires estimating information-theoretic quantities in a multidimensional space. This can be a computationally demanding task, which acts as a bottleneck for the application of elaborate algorithms to large-scale network inference problems. The computational cost of such calculations can be alleviated by the use of compiled programs and parallelization. To this end we have developed PREMER (Parallel Reverse Engineering with Mutual information & Entropy Reduction), a software toolbox that can run in parallel and sequential environments. It uses information theoretic criteria to recover network topology and determine the strength and causality of interactions, and allows incorporating prior knowledge, imputing missing data, and correcting outliers. PREMER is a free, open source software tool that does not require any commercial software. Its core algorithms are programmed in FORTRAN 90 and implement OpenMP directives. It has user interfaces in Python and MATLAB/Octave, and runs on Windows, Linux and OSX (https://sites.google.com/site/premertoolbox/).
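    PREMER's core algorithms are FORTRAN 90 with OpenMP, but the information-theoretic idea behind automatic interaction detection can be sketched in a few lines of Python. The sketch below is a minimal, hypothetical illustration (not PREMER's actual entropy-reduction algorithm, and all names and the threshold are assumptions): it estimates pairwise mutual information from histograms and keeps variable pairs whose MI exceeds a threshold.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of mutual information I(X;Y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)       # marginal of X (column)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of Y (row)
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def infer_edges(data, threshold=0.2, bins=8):
    """Score every variable pair by MI; keep pairs above the threshold."""
    n_vars = data.shape[1]
    edges = []
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if mutual_information(data[:, i], data[:, j], bins) > threshold:
                edges.append((i, j))
    return edges

rng = np.random.default_rng(0)
a = rng.normal(size=2000)
b = a + 0.1 * rng.normal(size=2000)   # strongly coupled to a
c = rng.normal(size=2000)             # independent of both
edges = infer_edges(np.column_stack([a, b, c]))
```

Real tools add the harder parts this sketch omits: multidimensional entropies to separate direct from indirect interactions, and causality measures to orient edges.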

  1. Characterization of the seismically imaged Tuscarora fold system and implications for layer parallel shortening in the Pennsylvania salient

    NASA Astrophysics Data System (ADS)

    Mount, Van S.; Wilkins, Scott; Comiskey, Cody S.

    2017-12-01

    The Tuscarora fold system (TFS) is located in the Pennsylvania salient in the foreland of the Valley and Ridge province. The TFS is imaged in high quality 3D seismic data and comprises a system of small-scale folds within relatively flat-lying Lower Silurian Tuscarora Formation strata. We characterize the TFS structures and infer layer parallel shortening (LPS) directions and magnitudes associated with deformation during the Alleghany Orogeny. Previously reported LPS data in our study area are from shallow Devonian and Carboniferous strata (based on outcrop and core analyses) above the shallowest of three major detachments recognized in the region. Seismic data allows us to characterize LPS at depth in strata beneath the shallow detachment. Our LPS data (orientations and inferred magnitudes) are consistent with the shallow data leading us to surmise that LPS during Alleghanian deformation fanned around the salient and was distributed throughout the stratigraphic section - and not isolated to strata above the shallow detachment. We propose that a NW-SE oriented Alleghanian maximum principal stress was perturbed by deep structure associated with the non-linear margin of Laurentia resulting in fanning of shortening directions within the salient.

  2. Multispacecraft study of shock-flux rope interaction

    NASA Astrophysics Data System (ADS)

    Blanco-Cano, Xochitl; Burgess, David; Sundberg, Torbjorn; Kajdic, Primoz

    2017-04-01

    Interplanetary (IP) shocks can be driven in the solar wind by fast coronal mass ejections. These shocks can accelerate particles near the Sun and throughout the heliosphere, and are associated with solar energetic particle (SEP) and energetic storm particle (ESP) events. IP shocks can interact with structures in the solar wind and with planetary magnetospheres. In this study we show how the properties of an IP shock change when it interacts with a medium-scale flux rope (FR)-like structure. We use data measurements from CLUSTER, WIND and ACE. These three spacecraft observed the shock-FR interaction at different stages of its evolution. We find that the shock-FR interaction locally changes the shock geometry, affecting ion injection processes and the upstream and downstream regions. While WIND and ACE observed a quasi-perpendicular shock, CLUSTER crossed a quasi-parallel shock and a foreshock with a variety of ion distributions. The complexity of the ion foreshock can be explained by the dynamics of the shock transitioning from quasi-perpendicular to quasi-parallel, and by the geometry of the magnetic field around the flux rope. Interactions such as the one we discuss can occur often along extended IP shock fronts, hence their importance for a better understanding of shock acceleration.

  3. Method for fabricating high aspect ratio structures in perovskite material

    DOEpatents

    Karapetrov, Goran T.; Kwok, Wai-Kwong; Crabtree, George W.; Iavarone, Maria

    2003-10-28

    A method of fabricating high aspect ratio ceramic structures in which a selected portion of perovskite or perovskite-like crystalline material is exposed to a high energy ion beam for a time sufficient to cause the crystalline material contacted by the ion beam to have substantially parallel columnar defects. Then selected portions of the material having substantially parallel columnar defects are etched, leaving material with and without substantially parallel columnar defects in a predetermined shape having high aspect ratios of not less than 2 to 1. Etching is accomplished by optical or PMMA lithography. There is also disclosed a structure of a ceramic which is superconducting at a temperature in the range of from about 10 K to about 90 K with substantially parallel columnar defects, in which the smallest lateral dimension of the structure is less than about 5 microns, and the thickness of the structure is greater than 2 times the smallest lateral dimension of the structure.

  4. Applications of Parallel Computation in Micro-Mechanics and Finite Element Method

    NASA Technical Reports Server (NTRS)

    Tan, Hui-Qian

    1996-01-01

    This project discusses the application of parallel computation to material analysis. Briefly, we analyze a material by element-wise computations. We call an element a cell here; a cell is divided into a number of subelements called subcells, and all subcells in a cell have an identical structure. The detailed structure is given later in this paper. The problem is clearly "well-structured", so a SIMD machine is a good choice. In this paper we explore the potential of SIMD machines for finite element computation by developing appropriate algorithms on MasPar, a SIMD parallel machine. In section 2, the architecture of MasPar is discussed, along with a brief review of the parallel programming language MPL. In section 3, some general parallel algorithms that may be useful to the project are proposed, and, in combination with these algorithms, some features of MPL are discussed in more detail. In section 4, the computational structure of the cell/subcell model is given and the idea behind the parallel algorithm for the model is demonstrated. Finally, section 5 gives a summary.

  5. A New View on Origin, Role and Manipulation of Large Scales in Turbulent Boundary Layers

    NASA Technical Reports Server (NTRS)

    Corke, T. C.; Nagib, H. M.; Guezennec, Y. G.

    1982-01-01

    The potential of passive 'manipulators' for altering the large scale turbulent structures in boundary layers was investigated. Utilizing smoke wire visualization and multisensor probes, the experiment verified that the outer scales could be suppressed by simple arrangements of parallel plates. As a result of suppressing the outer scales in turbulent layers, a decrease in the streamwise growth of the boundary layer thickness was achieved and was coupled with a 30 percent decrease in the local wall friction coefficient. After accounting for the drag on the manipulator plates, the net drag reduction reached a value of 20 percent within 55 boundary layer thicknesses downstream of the device. No evidence for the reoccurrence of the outer scales was present at this streamwise distance, thereby suggesting that further reductions in the net drag are attainable. The frequency of occurrence of the wall events is simultaneously dependent on the two parameters Re_δ2 and Re_x. As a result of being able to independently control the inner and outer boundary layer characteristics with these manipulators, a different view of these layers emerged.

  6. Gyrokinetic Simulations of Transport Scaling and Structure

    NASA Astrophysics Data System (ADS)

    Hahm, Taik Soo

    2001-10-01

    There is accumulating evidence from global gyrokinetic particle simulations with profile variations and from experimental fluctuation measurements that microturbulence, whose time-averaged eddy size scales with the ion gyroradius, can cause ion thermal transport that deviates from gyro-Bohm scaling. The physics here can be best addressed by large scale (rho* = rho_i/a = 0.001) full torus gyrokinetic particle-in-cell turbulence simulations using our massively parallel, general geometry gyrokinetic toroidal code with field-aligned mesh. Simulation results from device-size scans for realistic parameters show that the "wave transport" mechanism is not the dominant contribution to this Bohm-like transport and that transport is mostly diffusive, driven by microscopic scale fluctuations in the presence of self-generated zonal flows. In this work, we analyze the turbulence and zonal flow statistics from simulations and compare them to nonlinear theoretical predictions, including the radial decorrelation of transport events by zonal flows and the resulting probability distribution function (PDF). In particular, possible deviation of the characteristic radial size of transport processes from the time-averaged radial size of the density fluctuation eddies will be critically examined.

  7. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    NASA Astrophysics Data System (ADS)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  8. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  9. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    PubMed Central

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520
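    The strong-classifier assembly described in the abstract above can be illustrated with a minimal AdaBoost sketch in Python. For brevity it uses one-feature threshold stumps as weak learners in place of the paper's 15 BP neural networks, and omits the MapReduce distribution of training; all names here are illustrative assumptions, not the authors' code.

```python
import numpy as np

def train_adaboost(X, y, n_rounds=15):
    """AdaBoost ensemble of weak learners; here the weak learners are
    one-feature threshold stumps standing in for BP networks.
    Labels y must be -1 or +1."""
    n = len(y)
    w = np.full(n, 1.0 / n)                # sample weights
    ensemble = []                          # (alpha, feature, thresh, sign)
    for _ in range(n_rounds):
        best = None
        for f in range(X.shape[1]):        # exhaustive stump search
            for t in np.unique(X[:, f]):
                for s in (1.0, -1.0):
                    pred = s * np.sign(X[:, f] - t + 1e-12)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, t, s, pred)
        err, f, t, s, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)   # weak learner's vote
        w *= np.exp(-alpha * y * pred)          # up-weight mistakes
        w /= w.sum()
        ensemble.append((alpha, f, t, s))
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(a * s * np.sign(X[:, f] - t + 1e-12)
                for a, f, t, s in ensemble)
    return np.sign(score)

# Tiny hypothetical dataset: one feature, true split near 1.5.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
ensemble = train_adaboost(X, y, n_rounds=5)
```

In the paper's setting, each Map task would train or evaluate weak classifiers on a data shard and the Reduce step would combine their weighted votes; the combination rule is the same alpha-weighted sign shown in predict.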

  10. Scale dependence of the alignment between strain rate and rotation in turbulent shear flow

    NASA Astrophysics Data System (ADS)

    Fiscaletti, D.; Elsinga, G. E.; Attili, A.; Bisetti, F.; Buxton, O. R. H.

    2016-10-01

    The scale dependence of the statistical alignment tendencies of the eigenvectors e_i of the strain-rate tensor with the vorticity vector ω is examined in the self-preserving region of a planar turbulent mixing layer. Data from a direct numerical simulation are filtered at various length scales and the probability density functions of the magnitude of the alignment cosines between the two unit vectors, |e_i · ω̂|, are examined. It is observed that the alignment tendencies are insensitive to the concurrent large-scale velocity fluctuations, but are quantitatively affected by the nature of the concurrent large-scale velocity-gradient fluctuations. It is confirmed that the small-scale (local) vorticity vector is preferentially aligned in parallel with the large-scale (background) extensive strain-rate eigenvector e_1, in contrast to the global tendency for ω to be aligned in parallel with the intermediate strain-rate eigenvector [Hamlington et al., Phys. Fluids 20, 111703 (2008), 10.1063/1.3021055]. When only data from regions of the flow that exhibit strong swirling are included, the so-called high-enstrophy worms, the alignment tendencies are exaggerated with respect to the global picture. These findings support the notion that the production of enstrophy, responsible for a net cascade of turbulent kinetic energy from large scales to small scales, is driven by vorticity stretching due to the preferential parallel alignment between ω and the nonlocal e_1, and that the strongly swirling worms are kinematically significant to this process.
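    The alignment statistics described above reduce, per sample point, to a small linear-algebra computation. The following Python sketch (an illustration with assumed names, not the authors' analysis code) builds the strain-rate tensor and vorticity vector from a velocity-gradient tensor and returns the alignment cosines |e_i · ω̂|; the simple-shear example recovers the classical global result that ω aligns with the intermediate eigenvector.

```python
import numpy as np

def alignment_cosines(A):
    """Given a velocity-gradient tensor A_ij = du_i/dx_j, return |cos|
    of the angles between the unit vorticity vector and the strain-rate
    eigenvectors, ordered by descending eigenvalue
    (e1 = extensive, e2 = intermediate, e3 = compressive)."""
    S = 0.5 * (A + A.T)                    # strain-rate tensor
    omega = np.array([A[2, 1] - A[1, 2],   # vorticity = curl(u)
                      A[0, 2] - A[2, 0],
                      A[1, 0] - A[0, 1]])
    omega_hat = omega / np.linalg.norm(omega)
    vals, vecs = np.linalg.eigh(S)         # eigenvalues ascending
    order = np.argsort(vals)[::-1]         # reorder: e1 >= e2 >= e3
    return np.abs(vecs[:, order].T @ omega_hat)

# Simple shear du_x/dy = 1: vorticity is along z, while the extensive
# and compressive strain eigenvectors lie at 45 degrees in the x-y
# plane, so omega aligns exactly with the intermediate eigenvector e2.
A = np.zeros((3, 3))
A[0, 1] = 1.0
cos = alignment_cosines(A)
```

In a filtered-DNS analysis the same function would be evaluated with e_i from the coarse-grained gradient and ω̂ from the local one, then histogrammed into the PDFs of |e_i · ω̂|.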

  11. Understanding protein evolution: from protein physics to Darwinian selection.

    PubMed

    Zeldovich, Konstantin B; Shakhnovich, Eugene I

    2008-01-01

    Efforts in whole-genome sequencing and structural proteomics start to provide a global view of the protein universe, the set of existing protein structures and sequences. However, approaches based on the selection of individual sequences have not been entirely successful at the quantitative description of the distribution of structures and sequences in the protein universe because evolutionary pressure acts on the entire organism, rather than on a particular molecule. In parallel to this line of study, studies in population genetics and phenomenological molecular evolution established a mathematical framework to describe the changes in genome sequences in populations of organisms over time. Here, we review both microscopic (physics-based) and macroscopic (organism-level) models of protein-sequence evolution and demonstrate that bridging the two scales provides the most complete description of the protein universe starting from clearly defined, testable, and physiologically relevant assumptions.

  12. Parallel Geospatial Data Management for Multi-Scale Environmental Data Analysis on GPUs

    NASA Astrophysics Data System (ADS)

    Wang, D.; Zhang, J.; Wei, Y.

    2013-12-01

    As the spatial and temporal resolutions of Earth observatory data and Earth system simulation outputs get higher, in-situ and/or post-processing of such large amounts of geospatial data increasingly becomes a bottleneck in scientific inquiries of Earth systems and their human impacts. Existing geospatial techniques based on outdated computing models (e.g., serial algorithms and disk-resident systems), as implemented in many commercial and open source packages, are incapable of processing large-scale geospatial data at the desired level of performance. In this study, we have developed a set of parallel data structures and algorithms capable of utilizing the massively data parallel computing power available on commodity Graphics Processing Units (GPUs) for a popular geospatial technique called Zonal Statistics. Given two input datasets, one representing measurements (e.g., temperature or precipitation) and the other representing polygonal zones (e.g., ecological or administrative zones), Zonal Statistics computes major statistics (or complete distribution histograms) of the measurements in all zones. Our technique has four steps, and each step can be mapped to GPU hardware by identifying its inherent data parallelism. First, a raster is divided into blocks and per-block histograms are derived. Second, the Minimum Bounding Rectangles (MBRs) of polygons are computed and spatially matched with raster blocks; matched polygon-block pairs are tested, and blocks that are either inside or intersect with polygons are identified. Third, per-block histograms are aggregated to polygons for blocks that are completely within polygons. Finally, for blocks that intersect with polygon boundaries, all the raster cells within the blocks are examined using a point-in-polygon test, and cells that are within polygons are used to update the corresponding histograms.
As the task becomes I/O bound after applying spatial indexing and GPU hardware acceleration, we have developed a GPU-based data compression technique by reusing our previous work on Bitplane Quadtree (BPQ-Tree) based indexing of binary bitmaps. Results have shown that our GPU-based parallel Zonal Statistics technique on 3000+ US counties over 20+ billion NASA SRTM 30-meter-resolution Digital Elevation Model (DEM) raster cells has achieved impressive end-to-end runtimes: 101 seconds and 46 seconds on a low-end workstation equipped with an Nvidia GTX Titan GPU using cold and hot cache, respectively; 60-70 seconds using a single OLCF TITAN computing node; and 10-15 seconds using 8 nodes. Our experimental results clearly show the potential of using high-end computing facilities for large-scale geospatial processing.
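    As a point of reference for the GPU pipeline above, the end result of Zonal Statistics is simple to state: aggregate a measurement raster over zone labels. A serial numpy baseline (an illustrative sketch with assumed names, not the authors' GPU code, and using pre-rasterized zone labels in place of the polygon layer and point-in-polygon tests) looks like:

```python
import numpy as np

def zonal_statistics(values, zones, n_zones):
    """Serial baseline for Zonal Statistics: aggregate a measurement
    raster over a zone-label raster of the same shape. The GPU
    pipeline in the paper reaches the same result via per-block
    histograms, MBR matching, and point-in-polygon tests."""
    out = {}
    for z in range(n_zones):
        cells = values[zones == z]          # all cells in zone z
        out[z] = {"count": int(cells.size),
                  "mean": float(cells.mean()),
                  "min": float(cells.min()),
                  "max": float(cells.max())}
    return out

# Hypothetical 4x4 elevation raster split into two vertical zones.
values = np.arange(16, dtype=float).reshape(4, 4)
zones = np.array([[0, 0, 1, 1]] * 4)
stats = zonal_statistics(values, zones, n_zones=2)
```

The paper's contribution is doing this at the scale of billions of cells, where the boolean mask above becomes the bottleneck and block-level decomposition plus GPU parallelism pay off.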

  13. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large-scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  14. Microbial community structures in foaming and nonfoaming full-scale wastewater treatment plants.

    PubMed

    de los Reyes, Francis L; Rothauszky, Dagmar; Raskin, Lutgarde

    2002-01-01

    A survey of full-scale activated-sludge plants in Illinois revealed that filamentous foaming is a widespread problem in the state, and that the causes and consequences of foaming control strategies are not fully understood. To link microbial community structure to foam occurrence, microbial populations in eight foaming and nine nonfoaming full-scale activated-sludge systems were quantified using oligonucleotide hybridization probes targeting the ribosomal RNA (rRNA) of the mycolata; Gordonia spp.; Gordonia amarae; "Candidatus Microthrix parvicella"; the alpha-, beta-, and gamma-subclasses of the Proteobacteria, and members of the Cytophaga-Flavobacteria. Parallel measurements of microbial population abundance using hybridization of extracted RNA and fluorescence in situ hybridization (FISH) showed that the levels of mycolata, particularly Gordonia spp., were higher in most foaming systems compared with nonfoaming systems. Fluorescence in situ hybridization and microscopy suggested the involvement of "Candidatus Microthrix parvicella" and Skermania piniformis in foam formation in other plants. Finally, high numbers of "Candidatus Microthrix parvicella" were detected by FISH in foam and mixed liquor samples of one plant, whereas the corresponding levels of rRNA were low. This finding implies that inactive "Candidatus Microthrix parvicella" cells (i.e., cells with low rRNA levels) can cause foaming.

  15. Monocrystalline Heusler Co2FeSi alloy glass-coated microwires: Fabrication and magneto-structural characterization

    NASA Astrophysics Data System (ADS)

    Galdun, L.; Ryba, T.; Prida, V. M.; Zhukova, V.; Zhukov, A.; Diko, P.; Kavečanský, V.; Vargova, Z.; Varga, R.

    2018-05-01

    Large-scale production of a single crystalline phase Heusler Co2FeSi alloy microwire is reported. The long microwire (∼1 km), with a metallic nucleus diameter of about 2 μm, is characterized by a well oriented monocrystalline structure (B2 phase, with lattice parameter a = 5.615 Å). Moreover, the crystallographic direction [1 0 1] is parallel to the wire's axis along the entire length. Additionally, the wire exhibits a high Curie temperature (Tc > 800 K) and a well-defined magnetic anisotropy mainly governed by shape. Electrical resistivity measurements reveal the exponential suppression of electron-magnon scattering, which provides strong evidence of the half-metallic behaviour of this material in the low temperature range.

  16. Dark halos formed via dissipationless collapse. I - Shapes and alignment of angular momentum

    NASA Astrophysics Data System (ADS)

    Warren, Michael S.; Quinn, Peter J.; Salmon, John K.; Zurek, Wojciech H.

    1992-11-01

    We use N-body simulations on highly parallel supercomputers to study the structure of Galactic dark matter halos. The systems form by gravitational collapse from scale-free and more general Gaussian initial density perturbations in an expanding 400 Mpc-cubed spherical slice of an Einstein-deSitter universe. We analyze the structure and kinematics of about 100 of the largest relaxed halos in each of 10 separate simulations. A typical halo is a triaxial spheroid which tends to be more often prolate than oblate. These shapes are maintained by anisotropic velocity dispersion rather than by angular momentum. Nevertheless, there is a significant tendency for the total angular momentum vector to be aligned with the minor axis of the density distribution.
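    The triaxial shape classification described above is conventionally derived from the eigenvalues of the particle distribution's second-moment (shape) tensor: a halo is prolate when one axis is much longer than the other two, oblate when one axis is much shorter. A minimal Python sketch (illustrative, with assumed names; not the authors' analysis code):

```python
import numpy as np

def halo_shape(pos):
    """Axis ratios b/a and c/a (a >= b >= c) of a particle cloud from
    the eigenvalues of its second-moment tensor; prolate halos have
    a >> b ~ c, oblate halos a ~ b >> c."""
    x = pos - pos.mean(axis=0)             # centre the particles
    M = (x.T @ x) / len(x)                 # 3x3 shape tensor
    vals = np.sort(np.linalg.eigvalsh(M))[::-1]
    a, b, c = np.sqrt(vals)                # principal axis lengths
    return b / a, c / a

rng = np.random.default_rng(1)
# Hypothetical prolate cloud: one long axis, two short ones (4:1:1).
pos = rng.normal(size=(20000, 3)) * np.array([4.0, 1.0, 1.0])
ba, ca = halo_shape(pos)
```

For the 4:1:1 cloud both ratios come out near 0.25, i.e. clearly prolate; the angular-momentum/minor-axis alignment in the abstract would be checked by comparing the eigenvector of the smallest eigenvalue with the total angular momentum vector.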

  17. Development of a Distributed Parallel Computing Framework to Facilitate Regional/Global Gridded Crop Modeling with Various Scenarios

    NASA Astrophysics Data System (ADS)

    Jang, W.; Engda, T. A.; Neff, J. C.; Herrick, J.

    2017-12-01

    Many crop models are increasingly used to evaluate crop yields at regional and global scales. However, implementation of these models across large areas using fine-scale grids is limited by computational time requirements. In order to facilitate global gridded crop modeling with various scenarios (i.e., different crops, management schedules, fertilizer, and irrigation) using the Environmental Policy Integrated Climate (EPIC) model, we developed a distributed parallel computing framework in Python. Our local desktop with 14 cores (28 threads) was used to test the framework in Iringa, Tanzania, which has 406,839 grid cells. High-resolution soil data, SoilGrids (250 x 250 m), and climate data, AgMERRA (0.25 x 0.25 deg), were also used as input data for the gridded EPIC model. The framework includes a master file for parallel computing, an input database, input data formatters, EPIC model execution, and output analyzers. Through the master file, the EPIC simulation is divided into jobs across a user-defined number of CPU threads. Using the EPIC input data formatters, the raw database is converted into EPIC input data, which is passed to the EPIC simulation jobs. Then, 28 EPIC jobs run simultaneously, and only the result files of interest are parsed and passed to the output analyzers. We applied various scenarios with seven different slopes and twenty-four fertilizer ranges. Parallelized input generators create the different scenarios as a list for distributed parallel computing. After all simulations are completed, parallelized output analyzers are used to analyze all outputs according to the different scenarios. This saves significant computing time and resources, making it possible to conduct gridded modeling at regional to global scales with high-resolution data. 
For example, serial processing for the Iringa test case would require 113 hours, while using the framework developed in this study requires only approximately 6 hours, a nearly 95% reduction in computing time.
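    The job-splitting pattern described above can be sketched with Python's standard library. The sketch below is a hypothetical stand-in, not the authors' framework: run_epic_job is a placeholder for formatting inputs, invoking the EPIC executable, and parsing outputs, and a thread pool is used because each real job would shell out to an external binary (releasing the GIL), which makes threads a reasonable stand-in for worker processes.

```python
from itertools import product
from multiprocessing.pool import ThreadPool

def run_epic_job(job):
    """Stand-in for one EPIC run: the real framework writes formatted
    input files for one grid cell/scenario, invokes the EPIC
    executable, and parses the result files of interest."""
    cell_id, slope, fert = job
    return (cell_id, slope, fert, slope * fert)   # placeholder result

def simulate(cells, slopes, fert_rates, workers=4):
    # Cartesian product of grid cells x scenarios = the job list,
    # mirroring the framework's parallelized input generators.
    jobs = list(product(cells, slopes, fert_rates))
    with ThreadPool(workers) as pool:
        return pool.map(run_epic_job, jobs)

# 50 hypothetical grid cells, 3 slope classes, 2 fertilizer rates.
results = simulate(range(50), slopes=[1, 2, 3], fert_rates=[10, 20])
```

Scaling the worker count to the 28 hardware threads of the paper's desktop is just a parameter change; the 95% wall-time reduction reported above comes from keeping all threads busy on independent cell/scenario jobs.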

  18. SCORPIO: A Scalable Two-Phase Parallel I/O Library With Application To A Large Scale Subsurface Simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; Sripathi, Vamsi; Mills, Richard T

    2013-01-01

    Inefficient parallel I/O is known to be a major bottleneck among scientific applications employed on supercomputers as the number of processor cores grows into the thousands. Our prior experience indicated that parallel I/O libraries such as HDF5 that rely on MPI-IO do not scale well beyond 10K processor cores, especially on parallel file systems (like Lustre) with a single point of resource contention. Our previous optimization efforts for a massively parallel multi-phase and multi-component subsurface simulator (PFLOTRAN) led to a two-phase I/O approach at the application level, where a set of designated processes participate in the I/O process by splitting the I/O operation into a communication phase and a disk I/O phase. The designated I/O processes are created by splitting the MPI global communicator into multiple sub-communicators. The root process in each sub-communicator is responsible for performing the I/O operations for the entire group and then distributing the data to the rest of the group. This approach resulted in over 25X speedup in HDF I/O read performance and 3X speedup in write performance for PFLOTRAN at over 100K processor cores on the ORNL Jaguar supercomputer. This research describes the design and development of a general purpose parallel I/O library, SCORPIO (SCalable block-ORiented Parallel I/O), that incorporates our optimized two-phase I/O approach. The library provides a simplified higher level abstraction to the user, sitting atop existing parallel I/O libraries (such as HDF5), and implements optimized I/O access patterns that can scale to larger numbers of processors. Performance results with standard benchmark problems and PFLOTRAN indicate that our library is able to maintain the same speedups as before, with the added flexibility of being applicable to a wider range of I/O intensive applications.
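    The two-phase grouping described above is easy to sketch without MPI. The pure-Python function below (an illustration with assumed names, not SCORPIO's API) reproduces the rank-grouping logic: ranks are partitioned into sub-communicator groups, and the lowest rank in each group is the designated I/O root that performs disk I/O for the whole group and scatters/gathers data to the rest.

```python
def make_io_groups(n_ranks, n_io_groups):
    """Partition MPI ranks into I/O groups. In real MPI code this is
    MPI_Comm_split with color = rank // group_size; the root of each
    sub-communicator does the disk I/O, turning n_ranks concurrent
    file accesses into n_io_groups accesses."""
    group_size = (n_ranks + n_io_groups - 1) // n_io_groups  # ceil
    groups = {}
    for rank in range(n_ranks):
        color = rank // group_size           # sub-communicator id
        groups.setdefault(color, []).append(rank)
    # Designated I/O process = lowest rank in each group.
    roots = {color: min(members) for color, members in groups.items()}
    return groups, roots

groups, roots = make_io_groups(n_ranks=10, n_io_groups=3)
```

With 100K ranks and, say, 1K I/O groups, the file system sees only the 1K roots during the disk phase, which is the contention reduction behind the reported 25X read speedup.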

  19. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

    1993-01-01

    A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  20. A Parallel Population Genomic and Hydrodynamic Approach to Fishery Management of Highly-Dispersive Marine Invertebrates: The Case of the Fijian Black-Lip Pearl Oyster Pinctada margaritifera.

    PubMed

    Lal, Monal M; Southgate, Paul C; Jerry, Dean R; Bosserelle, Cyprien; Zenger, Kyall R

    2016-01-01

    Fishery management and conservation of marine species increasingly relies on genetic data to delineate biologically relevant stock boundaries. Unfortunately for high gene flow species which may display low, but statistically significant population structure, there is no clear consensus on the level of differentiation required to resolve distinct stocks. The use of fine-scale neutral and adaptive variation, considered together with environmental data can offer additional insights to this problem. Genome-wide genetic data (4,123 SNPs), together with an independent hydrodynamic particle dispersal model were used to inform farm and fishery management in the Fijian black-lip pearl oyster Pinctada margaritifera, where comprehensive fishery management is lacking, and the sustainability of exploitation uncertain. Weak fine-scale patterns of population structure were detected, indicative of broad-scale panmixia among wild oysters, while a hatchery-sourced farmed population exhibited a higher degree of genetic divergence (Fst = 0.0850-0.102). This hatchery-produced population had also experienced a bottleneck (NeLD = 5.1; 95% C.I. = [5.1-5.3]); compared to infinite NeLD estimates for all wild oysters. Simulation of larval transport pathways confirmed the existence of broad-scale mixture by surface ocean currents, correlating well with fine-scale patterns of population structuring. Fst outlier tests failed to detect large numbers of loci supportive of selection, with 2-5 directional outlier SNPs identified (average Fst = 0.116). The lack of biologically significant population genetic structure, absence of evidence for local adaptation and larval dispersal simulation, all indicate the existence of a single genetic stock of P. margaritifera in the Fiji Islands. 
This approach using independent genomic and oceanographic tools has allowed fundamental insights into stock structure in this species, with transferability to other highly-dispersive marine taxa for their conservation and management.
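
    The Fst values reported above can be illustrated with a small sketch. This is a minimal, hypothetical example (not the paper's pipeline) computing Hudson's Fst estimator from per-SNP allele frequencies in two populations, combined as a ratio of averages across SNPs:

```python
import numpy as np

def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst estimator from per-SNP allele frequencies p1, p2
    in two populations with sample sizes n1, n2, combined across SNPs
    as a ratio of averages (numerator sum over denominator sum)."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    # per-SNP numerator: squared frequency difference, corrected for sampling
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    # per-SNP denominator: between-population heterozygosity
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num.sum() / den.sum()
```

Identical frequencies give an estimate near zero, while strongly diverged frequencies give a value approaching one, matching the qualitative contrast between the panmictic wild oysters and the hatchery population.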

  1. SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn–Sham calculations at high temperature

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj

    We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for $\mathscr{O}(N)$ Kohn–Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw–Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw–Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near-perfect $\mathscr{O}(N)$ scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.

  2. SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn–Sham calculations at high temperature

    DOE PAGES

    Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; ...

    2017-12-07

    We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for $\mathscr{O}(N)$ Kohn–Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw–Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw–Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near-perfect $\mathscr{O}(N)$ scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.
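
    The Clenshaw–Curtis quadrature at the heart of the SQ approach can be illustrated in one dimension. This is a minimal sketch of the standard node/weight construction on [-1, 1] (following Trefethen's well-known clencurt recipe, not the SQDFT code itself):

```python
import numpy as np

def clencurt(n):
    """Clenshaw-Curtis nodes x_k = cos(pi*k/n) and weights on [-1, 1]."""
    theta = np.pi * np.arange(n + 1) / n
    x = np.cos(theta)
    w = np.zeros(n + 1)
    ii = np.arange(1, n)          # interior node indices
    v = np.ones(n - 1)
    if n % 2 == 0:
        w[0] = w[n] = 1.0 / (n ** 2 - 1)
        for k in range(1, n // 2):
            v -= 2.0 * np.cos(2 * k * theta[ii]) / (4 * k ** 2 - 1)
        v -= np.cos(n * theta[ii]) / (n ** 2 - 1)
    else:
        w[0] = w[n] = 1.0 / n ** 2
        for k in range(1, (n - 1) // 2 + 1):
            v -= 2.0 * np.cos(2 * k * theta[ii]) / (4 * k ** 2 - 1)
    w[ii] = 2.0 * v / n
    return x, w

# integrate e^x over [-1, 1]; smooth integrands converge spectrally
x, w = clencurt(16)
approx = np.dot(w, np.exp(x))
```

For smooth functions the error decays spectrally with n, which is the property the SQ method exploits when approximating bilinear forms by localized quadrature.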

  3. Nanoscale heterogeneity at the aqueous electrolyte-electrode interface

    NASA Astrophysics Data System (ADS)

    Limmer, David T.; Willard, Adam P.

    2015-01-01

    Using molecular dynamics simulations, we reveal emergent properties of hydrated electrode interfaces that, while molecular in origin, are integral to the behavior of the system across long time scales and large length scales. Specifically, we describe the impact of a disordered and slowly evolving adsorbed layer of water on the molecular structure and dynamics of the electrolyte solution adjacent to it. Generically, we find that densities and mobilities of both water and dissolved ions are spatially heterogeneous in the plane parallel to the electrode over nanosecond timescales. These and other recent results are analyzed in the context of available experimental literature from surface science and electrochemistry. We speculate on the implications of this emerging microscopic picture for the catalytic proficiency of hydrated electrodes, offering a new direction for study in heterogeneous catalysis at the nanoscale.

  4. BioPig: Developing Cloud Computing Applications for Next-Generation Sequence Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhatia, Karan; Wang, Zhong

    Next-generation sequencing is producing ever larger data sizes, with a growth rate outpacing Moore's Law. The data deluge has made many of the current sequence analysis tools obsolete because they do not scale with data. Here we present BioPig, a collection of cloud computing tools to scale data analysis and management. Pig is a flexible data scripting language that uses Apache's Hadoop data structure and MapReduce framework to process very large data files in parallel and combine the results. BioPig extends Pig with sequence analysis capability. We will show the performance of BioPig on a variety of bioinformatics tasks, including screening sequence contaminants, Illumina QA/QC, and gene discovery from metagenome data sets, using the Rumen metagenome as an example.
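
    The map-reduce flow that Pig scripts compile down to can be sketched in miniature. This is a hypothetical, in-memory analogue in plain Python (not Pig or Hadoop) that "maps" each read to k-mer counts and "reduces" by merging the partial counts:

```python
from collections import Counter
from functools import reduce

def kmers(read, k=4):
    """All overlapping k-mers of a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

reads = ["ACGTACGT", "CGTACGTA"]

# "map" phase: each read independently becomes a bag of k-mer counts
mapped = [Counter(kmers(r)) for r in reads]

# "reduce" phase: merge partial counts into a global tally
total = reduce(lambda a, b: a + b, mapped, Counter())
```

In a real Hadoop deployment the map tasks run in parallel across the cluster and the framework shuffles and merges the partial counts; the in-memory version only shows the data flow.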

  5. Skeletonization with hollow detection on gray image by gray weighted distance transform

    NASA Astrophysics Data System (ADS)

    Bhattacharya, Prabir; Qian, Kai; Cao, Siqi; Qian, Yi

    1998-10-01

    A skeletonization algorithm that can process non-uniformly distributed gray-scale images with hollows is presented. The algorithm is based on the Gray Weighted Distance Transformation. The process includes a preliminary phase of investigating the hollows in the gray-scale image; these hollows are treated as topological constraints for the skeleton structure depending on whether their depth is statistically significant. We then extract the resulting skeleton, which carries meaningful information for understanding the object in the image. This improved algorithm can overcome the possible misinterpretation of some complicated images in the extracted skeleton, especially images with asymmetric hollows and asymmetric features. The algorithm can be executed on a parallel machine, as all operations are local. Some examples are discussed to illustrate the algorithm.
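
    The Gray Weighted Distance Transformation underlying the algorithm can be viewed as a shortest-path computation over the pixel grid. This is a minimal Dijkstra-style sketch under the assumption that stepping onto a pixel costs its gray value; the paper's specific variant (hollow detection, skeleton extraction) is omitted:

```python
import heapq
import numpy as np

def gray_weighted_distance_transform(img, seeds):
    """Cheapest 4-connected path cost from any seed pixel, where
    entering a pixel costs its gray value (Dijkstra on the grid)."""
    h, w = img.shape
    dist = np.full((h, w), np.inf)
    pq = []
    for (r, c) in seeds:
        dist[r, c] = img[r, c]
        heapq.heappush(pq, (dist[r, c], r, c))
    while pq:
        d, r, c = heapq.heappop(pq)
        if d > dist[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + img[nr, nc]
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(pq, (nd, nr, nc))
    return dist
```

On a uniform image this reduces to an ordinary distance transform; on non-uniform images low-gray "valleys" attract the cheapest paths, which is what makes the transform sensitive to hollows.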

  6. Synthesis of Efficient Structures for Concurrent Computation.

    DTIC Science & Technology

    1983-10-01

    Only fragments of this report's scanned table of contents and figure list survive: a formal presentation of techniques called virtualization and aggregation (cited as [King-83]); sections on census functions ("Trees perform broadcast..."), user-assisted aggregation, and parallel structures; and figures titled "Simple Parallel Structure for Broadcasting" and "Internal Structure of a Prefix Computation Network".

  7. Unstable flow structures in the Blasius boundary layer.

    PubMed

    Wedin, H; Bottaro, A; Hanifi, A; Zampogna, G

    2014-04-01

    Finite amplitude coherent structures with a reflection symmetry in the spanwise direction of a parallel boundary layer flow are reported together with a preliminary analysis of their stability. The search for the solutions is based on the self-sustaining process originally described by Waleffe (Phys. Fluids 9, 883 (1997)). This requires adding a body force to the Navier-Stokes equations; to locate a relevant nonlinear solution it is necessary to perform a continuation in the nonlinear regime and parameter space in order to render the body force of vanishing amplitude. Some of the computed states display a spanwise spacing between streaks of the same length scale as turbulent flow structures observed in experiments (S.K. Robinson, Ann. Rev. Fluid Mech. 23, 601 (1991)), and are found to be situated within the buffer layer. The exact coherent structures are unstable to small amplitude perturbations and thus may be part of a set of unstable nonlinear states of possible use in describing the transition to turbulence. The nonlinear solutions survive down to a displacement thickness Reynolds number Re* = 496, displaying a 4-vortex structure and an amplitude of the streamwise root-mean-square velocity of 6% of the free-stream velocity. At this Re* the exact coherent structure bifurcates supercritically, and this is the point where the laminar Blasius flow starts to cohabit the phase space with alternative simple exact solutions of the Navier-Stokes equations.

  8. Conceptual design of a hybrid parallel mechanism for mask exchanging of TMT

    NASA Astrophysics Data System (ADS)

    Wang, Jianping; Zhou, Hongfei; Li, Kexuan; Zhou, Zengxiang; Zhai, Chao

    2015-10-01

    The mask exchange system is an important part of the Multi-Object Broadband Imaging Echellette (MOBIE) on the Thirty Meter Telescope (TMT). To solve the problem of the mask exchange system's stiffness changing with the gravity vector in the MOBIE, a hybrid parallel mechanism design method was introduced. Combining the high stiffness and precision of a parallel structure with the large range of motion of a serial structure, a conceptual design of a hybrid parallel mask exchange system based on a 3-RPS parallel mechanism is presented. According to the position requirements of the MOBIE, a SolidWorks structural model of the hybrid parallel mask exchange robot was established, and an appropriate installation position that does not interfere with the related components and light path in the MOBIE of TMT was identified. Simulation results in SolidWorks suggest that the 3-RPS parallel platform has good stiffness in different gravity vector directions. Furthermore, through kinematic analysis, the inverse kinematics of the 3-RPS parallel platform was solved and the mathematical relationship between the attitude angle of the moving platform and the angles of the ball hinges on the moving platform was established, in order to analyze the attitude adjustment capability of the hybrid parallel mask exchange robot. The proposed conceptual design offers guidance for the design of the mask exchange system of the MOBIE on TMT.
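
    Inverse kinematics of a parallel platform of this kind reduces to computing each leg vector from the platform pose. This is a generic sketch with hypothetical attachment-point geometry, not the paper's 3-RPS formulation (which imposes additional revolute-joint constraints on the allowed poses):

```python
import numpy as np

def leg_lengths(base_pts, plat_pts, pos, rpy):
    """Generic parallel-platform inverse kinematics: given the moving
    platform position `pos` and roll-pitch-yaw `rpy`, return the
    required length of each leg connecting base_pts[i] to plat_pts[i]."""
    r, p, y = rpy
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(r), -np.sin(r)],
                   [0, np.sin(r), np.cos(r)]])
    Ry = np.array([[np.cos(p), 0, np.sin(p)],
                   [0, 1, 0],
                   [-np.sin(p), 0, np.cos(p)]])
    Rz = np.array([[np.cos(y), -np.sin(y), 0],
                   [np.sin(y), np.cos(y), 0],
                   [0, 0, 1]])
    R = Rz @ Ry @ Rx
    # leg vector = platform attachment (rotated, translated) - base attachment
    return np.linalg.norm(pos + plat_pts @ R.T - base_pts, axis=1)
```

With congruent base and platform triangles and zero rotation, every leg length equals the platform height, which is a quick sanity check on the geometry.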

  9. When the lowest energy does not induce native structures: parallel minimization of multi-energy values by hybridizing searching intelligences.

    PubMed

    Lü, Qiang; Xia, Xiao-Yan; Chen, Rong; Miao, Da-Jun; Chen, Sha-Sha; Quan, Li-Jun; Li, Hai-Ou

    2012-01-01

    Protein structure prediction (PSP), which is usually modeled as a computational optimization problem, remains one of the biggest challenges in computational biology. PSP faces two difficult obstacles: the inaccurate energy function problem and the searching problem. Even if the lowest energy is luckily found by the searching procedure, the correct protein structure is not guaranteed to be obtained. A general parallel metaheuristic approach is presented to tackle these two problems. Multiple energy functions are employed to simultaneously guide the parallel searching threads. Searching trajectories are controlled by the parameters of the heuristic algorithms, and the parallel approach allows these parameters to be perturbed while the searching threads run in parallel, with each thread searching for the lowest energy value determined by an individual energy function. By hybridizing the intelligences of parallel ant colonies and Monte Carlo Metropolis search, this paper demonstrates an implementation of our parallel approach for PSP. 16 classical instances were tested to show that the parallel approach is competitive for solving the PSP problem. This parallel approach combines various sources of both searching intelligence and energy functions, and thus predicts protein conformations with good quality jointly determined by all the parallel searching threads and energy functions. It provides a framework to combine the different searching intelligences embedded in heuristic algorithms. It also constructs a container to hybridize different not-so-accurate objective functions, which are usually derived from domain expertise.
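
    The core idea of parallel searching threads, each guided by its own imperfect energy function, can be sketched on a toy 1-D landscape. This is a minimal, hypothetical illustration using Metropolis search only (the paper additionally hybridizes ant colony intelligence), with two made-up energy functions sharing a minimum near x = 1:

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def metropolis(energy, x0, steps, temp, seed):
    """One searching thread: Metropolis walk guided by a single energy function."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        cand = x + rng.gauss(0, 0.5)
        ce = energy(cand)
        # accept downhill moves always, uphill with Boltzmann probability
        if ce < e or rng.random() < math.exp(-(ce - e) / temp):
            x, e = cand, ce
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# two imperfect "energy functions" for the same hypothetical landscape
energies = [lambda x: (x - 1) ** 2,
            lambda x: abs(x - 1) + 0.1 * x * x]

with ThreadPoolExecutor() as ex:
    results = list(ex.map(lambda a: metropolis(*a),
                          [(e, 0.0, 2000, 0.5, i) for i, e in enumerate(energies)]))

# the prediction is jointly determined by all threads: keep the best
best = min(results, key=lambda r: r[1])
```

The real method additionally perturbs the heuristic parameters while the threads run; here each thread simply keeps its own seeded random stream for reproducibility.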

  10. When the Lowest Energy Does Not Induce Native Structures: Parallel Minimization of Multi-Energy Values by Hybridizing Searching Intelligences

    PubMed Central

    Lü, Qiang; Xia, Xiao-Yan; Chen, Rong; Miao, Da-Jun; Chen, Sha-Sha; Quan, Li-Jun; Li, Hai-Ou

    2012-01-01

    Background Protein structure prediction (PSP), which is usually modeled as a computational optimization problem, remains one of the biggest challenges in computational biology. PSP faces two difficult obstacles: the inaccurate energy function problem and the searching problem. Even if the lowest energy is luckily found by the searching procedure, the correct protein structure is not guaranteed to be obtained. Results A general parallel metaheuristic approach is presented to tackle these two problems. Multiple energy functions are employed to simultaneously guide the parallel searching threads. Searching trajectories are controlled by the parameters of the heuristic algorithms, and the parallel approach allows these parameters to be perturbed while the searching threads run in parallel, with each thread searching for the lowest energy value determined by an individual energy function. By hybridizing the intelligences of parallel ant colonies and Monte Carlo Metropolis search, this paper demonstrates an implementation of our parallel approach for PSP. 16 classical instances were tested to show that the parallel approach is competitive for solving the PSP problem. Conclusions This parallel approach combines various sources of both searching intelligence and energy functions, and thus predicts protein conformations with good quality jointly determined by all the parallel searching threads and energy functions. It provides a framework to combine the different searching intelligences embedded in heuristic algorithms. It also constructs a container to hybridize different not-so-accurate objective functions, which are usually derived from domain expertise. PMID:23028708

  11. Classical test theory and Rasch analysis validation of the Upper Limb Functional Index in subjects with upper limb musculoskeletal disorders.

    PubMed

    Bravini, Elisabetta; Franchignoni, Franco; Giordano, Andrea; Sartorio, Francesco; Ferriero, Giorgio; Vercelli, Stefano; Foti, Calogero

    2015-01-01

    To perform a comprehensive analysis of the psychometric properties and dimensionality of the Upper Limb Functional Index (ULFI) using both classical test theory and Rasch analysis (RA). Prospective, single-group observational design. Freestanding rehabilitation center. Convenience sample of Italian-speaking subjects with upper limb musculoskeletal disorders (N=174). Not applicable. The Italian version of the ULFI. Data were analyzed using parallel analysis, exploratory factor analysis, and RA for evaluating dimensionality, functioning of rating scale categories, item fit, hierarchy of item difficulties, and reliability indices. Parallel analysis revealed 2 factors explaining 32.5% and 10.7% of the response variance. RA confirmed the failure of the unidimensionality assumption, and 6 items out of the 25 misfitted the Rasch model. When the analysis was rerun excluding the misfitting items, the scale showed acceptable fit values, loading meaningfully onto a single factor. Item separation reliability and person separation reliability were .98 and .89, respectively. Cronbach alpha was .92. RA revealed weaknesses of the scale concerning dimensionality and internal construct validity. However, a set of 19 ULFI items defined through the statistical process demonstrated a unidimensional structure, good psychometric properties, and clinical meaningfulness. These findings represent a useful starting point for further analyses of the tool (based on modern psychometric approaches and confirmatory factor analysis) in larger samples, including different patient populations and nationalities. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
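
    The parallel analysis step used above for dimensionality assessment can be sketched directly. This is a minimal implementation of Horn's procedure, assuming normal random comparison data and a 95th-percentile retention criterion (the study's exact settings are not stated):

```python
import numpy as np

def parallel_analysis(data, n_iter=200, seed=0):
    """Horn's parallel analysis: retain factors whose observed
    correlation-matrix eigenvalues exceed the 95th percentile of
    eigenvalues obtained from same-shaped random normal data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    rand = np.empty((n_iter, p))
    for i in range(n_iter):
        sim = rng.standard_normal((n, p))
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    thresh = np.percentile(rand, 95, axis=0)
    return int(np.sum(obs > thresh)), obs, thresh
```

Comparing observed eigenvalues against a random-data baseline guards against the well-known tendency of the "eigenvalue > 1" rule to over-extract factors.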

  12. Three-dimensionally preserved integument reveals hydrodynamic adaptations in the extinct marine lizard Ectenosaurus (Reptilia, Mosasauridae).

    PubMed

    Lindgren, Johan; Everhart, Michael J; Caldwell, Michael W

    2011-01-01

    The physical properties of water and the environment it presents to its inhabitants provide stringent constraints and selection pressures affecting aquatic adaptation and evolution. Mosasaurs (a group of secondarily aquatic reptiles that occupied a broad array of predatory niches in the Cretaceous marine ecosystems about 98-65 million years ago) have traditionally been considered as anguilliform locomotors capable only of generating short bursts of speed during brief ambush pursuits. Here we report on an exceptionally preserved, long-snouted mosasaur (Ectenosaurus clidastoides) from the Santonian (Upper Cretaceous) part of the Smoky Hill Chalk Member of the Niobrara Formation in western Kansas, USA, that contains phosphatized remains of the integument displaying both depth and structure. The small, ovoid neck and/or anterior trunk scales exhibit a longitudinal central keel, and are obliquely arrayed into an alternating pattern where neighboring scales overlap one another. Supportive sculpturing in the form of two parallel, longitudinal ridges on the inner scale surface and a complex system of multiple, superimposed layers of straight, cross-woven helical fiber bundles in the underlying dermis, may have served to minimize surface deformation and frictional drag during locomotion. Additional parallel fiber bundles oriented at acute angles to the long axis of the animal presumably provided stiffness in the lateral plane. These features suggest that the anterior torso of Ectenosaurus was held somewhat rigid during swimming, thereby limiting propulsive movements to the posterior body and tail.

  13. Structural Health Monitoring on Turbine Engines Using Microwave Blade Tip Clearance Sensors

    NASA Technical Reports Server (NTRS)

    Woike, Mark; Abdul-Aziz, Ali; Clem, Michelle

    2014-01-01

    The ability to monitor the structural health of the rotating components, especially in the hot sections of turbine engines, is of major interest to the aero community in improving engine safety and reliability. The use of instrumentation for these applications remains very challenging. It requires sensors and techniques that are highly accurate, are able to operate in a high temperature environment, and can detect minute changes and hidden flaws before catastrophic events occur. The National Aeronautics and Space Administration (NASA) has taken a lead role in the investigation of new sensor technologies and techniques for the in situ structural health monitoring of gas turbine engines. As part of this effort, microwave sensor technology has been investigated as a means of making high temperature non-contact blade tip clearance, blade tip timing, and blade vibration measurements for use in gas turbine engines. This paper presents a summary of key results and findings obtained from the evaluation of two different types of microwave sensors that have been investigated for possible use in structural health monitoring applications. The first is a microwave blade tip clearance sensor that has been evaluated on a large scale Axial Vane Fan, a subscale Turbofan, and more recently on sub-scale turbine engine like disks. The second is a novel microwave based blade vibration sensor that was also used in parallel with the microwave blade tip clearance sensors on the experiments with the sub-scale turbine engine disks.

  14. Structural health monitoring on turbine engines using microwave blade tip clearance sensors

    NASA Astrophysics Data System (ADS)

    Woike, Mark; Abdul-Aziz, Ali; Clem, Michelle

    2014-04-01

    The ability to monitor the structural health of the rotating components, especially in the hot sections of turbine engines, is of major interest to the aero community in improving engine safety and reliability. The use of instrumentation for these applications remains very challenging. It requires sensors and techniques that are highly accurate, are able to operate in a high temperature environment, and can detect minute changes and hidden flaws before catastrophic events occur. The National Aeronautics and Space Administration (NASA) has taken a lead role in the investigation of new sensor technologies and techniques for the in situ structural health monitoring of gas turbine engines. As part of this effort, microwave sensor technology has been investigated as a means of making high temperature non-contact blade tip clearance, blade tip timing, and blade vibration measurements for use in gas turbine engines. This paper presents a summary of key results and findings obtained from the evaluation of two different types of microwave sensors that have been investigated for possible use in structural health monitoring applications. The first is a microwave blade tip clearance sensor that has been evaluated on a large scale Axial Vane Fan, a subscale Turbofan, and more recently on sub-scale turbine engine like disks. The second is a novel microwave based blade vibration sensor that was also used in parallel with the microwave blade tip clearance sensors on the same experiments with the sub-scale turbine engine disks.

  15. Large-scale variation in subsurface stream biofilms: a cross-regional comparison of metabolic function and community similarity.

    PubMed

    Findlay, S; Sinsabaugh, R L

    2006-10-01

    We examined bacterial metabolic activity and community similarity in shallow subsurface stream sediments distributed across three regions of the eastern United States to assess whether there were parallel changes in functional and structural attributes at this large scale. Bacterial growth, oxygen consumption, and a suite of extracellular enzyme activities were assayed to describe functional variability. Community similarity was assessed using randomly amplified polymorphic DNA (RAPD) patterns. There were significant differences in streamwater chemistry, metabolic activity, and bacterial growth among regions, with, for instance, twofold higher bacterial production in streams near Baltimore, MD, compared to Hubbard Brook, NH. Five of eight extracellular enzymes showed significant differences among regions. Cluster analyses of individual streams by metabolic variables showed clear groups, with significant differences among groups in the representation of sites from different regions. Clustering of sites based on RAPD banding resulted in groups with generally less internal similarity, although there were still differences in the distribution of regional sites. There was a marginally significant (p = 0.09) association between patterns based on functional and structural variables. There were statistically significant but weak (r² ≈ 30%) associations between landcover and measures of both structure and function. These patterns imply a large-scale organization of biofilm communities; this structure may be imposed by factors such as landcover and covariates such as nutrient concentrations, which are known to also cause differences in the macrobiota of stream ecosystems.

  16. Strategies for Large Scale Implementation of a Multiscale, Multiprocess Integrated Hydrologic Model

    NASA Astrophysics Data System (ADS)

    Kumar, M.; Duffy, C.

    2006-05-01

    Distributed models simulate hydrologic state variables in space and time while taking into account heterogeneities in terrain, surface and subsurface properties, and meteorological forcings. The computational cost and complexity of these models increase with the tendency to accurately simulate a large number of interacting physical processes at fine spatio-temporal resolution in a large basin. A hydrologic model run on a coarse spatial discretization of the watershed with a limited number of physical processes needs less computation, but this negatively affects the accuracy of model results and restricts the physical realism of the problem. It is therefore imperative to have an integrated modeling strategy (a) that can be applied universally at various scales in order to study the tradeoffs between computational complexity (determined by spatio-temporal resolution), accuracy, and predictive uncertainty in relation to various approximations of physical processes; (b) that can be applied at adaptively different spatial scales in the same domain by taking into account the local heterogeneity of topography and hydrogeologic variables; and (c) that is flexible enough to incorporate different numbers and approximations of process equations depending on model purpose and computational constraints. An efficient implementation of this strategy becomes all the more important for the Great Salt Lake river basin, which is relatively large (~89,000 sq. km) and complex in terms of hydrologic and geomorphic conditions. The types and time scales of hydrologic processes that dominate different parts of the basin also differ: part of the snowmelt runoff generated in the Uinta Mountains infiltrates and contributes as base flow to the Great Salt Lake over a time scale of decades to centuries. The adaptive strategy helps capture the steep topographic and climatic gradient along the Wasatch Front.
    Here we present this modeling strategy along with an associated hydrologic modeling framework that facilitates a seamless, computationally efficient, and accurate integration of the process model with the data model. The flexibility of this framework enables multiscale, multiresolution, adaptive refinement/de-refinement and nested modeling simulations with minimal computational burden. However, performing these simulations, and calibrating these models over a large basin at high spatio-temporal resolution, is computationally intensive and requires increasing computing power. With the advent of parallel processing architectures, high performance can be achieved by parallelizing existing serial integrated-hydrologic-model code, running the same model simulation on a network of many processors and thereby reducing the time needed to obtain a solution. The paper also discusses the implementation of the integrated model on parallel processors, including the mapping of the problem onto a multiprocessor environment, methods to incorporate coupling between hydrologic processes using interprocessor communication models, model data structures, and parallel numerical algorithms to obtain high performance.

  17. Hybrid Parallelization of Adaptive MHD-Kinetic Module in Multi-Scale Fluid-Kinetic Simulation Suite

    DOE PAGES

    Borovikov, Sergey; Heerikhuisen, Jacob; Pogorelov, Nikolai

    2013-04-01

    The Multi-Scale Fluid-Kinetic Simulation Suite has a computational tool set for solving partially ionized flows. In this paper we focus on recent developments of the kinetic module which solves the Boltzmann equation using the Monte-Carlo method. The module has been recently redesigned to utilize intra-node hybrid parallelization. We describe in detail the redesign process, implementation issues, and modifications made to the code. Finally, we conduct a performance analysis.

  18. Formation of Electrostatic Potential Drops in the Auroral Zone

    NASA Technical Reports Server (NTRS)

    Schriver, D.; Ashour-Abdalla, M.; Richard, R. L.

    2001-01-01

    In order to examine the self-consistent formation of large-scale quasi-static parallel electric fields in the auroral zone on a micro/meso scale, a particle-in-cell simulation has been developed. The code resolves electron Debye length scales so that electron micro-processes are included, and a variable grid scheme is used such that the overall length scale of the simulation is of the order of an Earth radius along the magnetic field. The simulation is electrostatic and includes the magnetic mirror force, as well as two types of plasmas, a cold dense ionospheric plasma and a warm tenuous magnetospheric plasma. In order to study the formation of parallel electric fields in the auroral zone, different magnetospheric ion and electron inflow boundary conditions are used to drive the system. It has been found that for conditions in the primary (upward) current region an upward directed quasi-static electric field can form across the system due to magnetic mirroring of the magnetospheric ions and electrons at different altitudes. For conditions in the return (downward) current region it is shown that a quasi-static parallel electric field in the opposite sense of that in the primary current region is formed, i.e., the parallel electric field is directed earthward. The conditions for how these different electric fields can be formed are discussed using satellite observations and numerical simulations.

  19. The Comparability of Three Wechsler Adult Intelligence Scales in a College Sample.

    ERIC Educational Resources Information Center

    Quereshi, M. Y.; Ostrowski, Michael J.

    1985-01-01

    Administered three Wechsler adult intelligence scales to 72 undergraduates and tested the equality of means, variances, and covariances, utilizing subtest scaled scores and IQs. Results indicated that the three scales were not parallel. Generally, the subtest scaled scores exhibited less similarity across the three scales than the IQ estimates.…

  20. Are trait-scaling relationships invariant across contrasting elevations in the widely distributed treeline species Nothofagus pumilio?

    PubMed

    Fajardo, Alex

    2016-05-01

    The study of scaling examines the relative dimensions of diverse organismal traits. Understanding whether global scaling patterns are paralleled within species is key to identifying causal factors of universal scaling. I examined whether the foliage-stem (Corner's rules), the leaf size-number, and the leaf mass-leaf area scaling relationships remained invariant and isometric with elevation in a widely distributed treeline species in the southern Chilean Andes. Mean leaf area, leaf mass, leafing intensity, and twig cross-sectional area were determined for 1-2 twigs of 8-15 Nothofagus pumilio individuals across four elevations (including treeline elevation) and four locations (from central Chile at 36°S to Tierra del Fuego at 54°S). Mixed effects models were fitted to test whether the interaction term between traits and elevation was nonsignificant (invariant). The leaf-twig cross-sectional area and the leaf mass-leaf area scaling relationships were isometric (slope = 1) and remained invariant with elevation, whereas the leaf size-number (i.e., leafing intensity) scaling was allometric (slope ≠ -1) and showed no variation with elevation. Leaf area and leaf number were consistently negatively correlated across elevation. The scaling relationships examined in the current study parallel those seen across species. It is plausible that the explanation of intraspecific scaling relationships, as trait combinations favored by natural selection, is the same as that invoked to explain across-species patterns. Thus, it is very likely that the global interspecific Corner's rules and other leaf-leaf scaling relationships emerge as the aggregate of largely parallel intraspecific patterns. © 2016 Botanical Society of America.
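
    Isometry tests of this kind typically ask whether the slope of a log-log trait regression differs from 1. The study fitted mixed-effects models; this is a simpler OLS sketch for illustration only, with simulated trait data standing in for real measurements:

```python
import numpy as np

def loglog_slope(x, y):
    """OLS slope of log(y) on log(x) with an approximate 95% CI.
    Isometry (slope = 1) is plausible when the CI covers 1."""
    lx, ly = np.log(x), np.log(y)
    n = len(lx)
    b, a = np.polyfit(lx, ly, 1)          # slope, intercept
    resid = ly - (a + b * lx)
    se = np.sqrt(np.sum(resid ** 2) / (n - 2) / np.sum((lx - lx.mean()) ** 2))
    return b, (b - 1.96 * se, b + 1.96 * se)

# simulated isometric data: y proportional to x with lognormal noise
rng = np.random.default_rng(0)
x = np.exp(rng.standard_normal(200))
y = x * np.exp(0.05 * rng.standard_normal(200))
b, (lo, hi) = loglog_slope(x, y)
```

Allometry studies often prefer standardized major axis (SMA) regression over OLS when both traits are measured with error; OLS is used here only to keep the sketch short.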
