Parallel computations and control of adaptive structures
NASA Technical Reports Server (NTRS)
Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)
1991-01-01
The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction of the state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer, and the impact of the approach on applications in disciplines other than the aerospace industry is assessed.
An object-oriented approach for parallel self adaptive mesh refinement on block structured grids
NASA Technical Reports Server (NTRS)
Lemke, Max; Witsch, Kristian; Quinlan, Daniel
1993-01-01
Self-adaptive mesh refinement dynamically matches the computational demands of a solver for partial differential equations to the activity in the application's domain. In this paper we present two C++ class libraries, P++ and AMR++, which significantly simplify the development of sophisticated adaptive mesh refinement codes on (massively) parallel distributed memory architectures. The development is based on our previous research in this area. The C++ class libraries provide abstractions to separate the issues of developing parallel adaptive mesh refinement applications into those of parallelism, abstracted by P++, and adaptive mesh refinement, abstracted by AMR++. P++ is a parallel array class library to permit efficient development of architecture independent codes for structured grid applications, and AMR++ provides support for self-adaptive mesh refinement on block-structured grids of rectangular non-overlapping blocks. Using these libraries, the application programmers' work is greatly simplified to primarily specifying the serial single grid application and obtaining the parallel and self-adaptive mesh refinement code with minimal effort. Initial results for simple singular perturbation problems solved by self-adaptive multilevel techniques (FAC, AFAC), being implemented on the basis of prototypes of the P++/AMR++ environment, are presented. Singular perturbation problems frequently arise in large applications, e.g. in the area of computational fluid dynamics. They usually have solutions with layers which require adaptive mesh refinement and fast basic solvers in order to be resolved efficiently.
PARAMESH: A Parallel Adaptive Mesh Refinement Community Toolkit
NASA Technical Reports Server (NTRS)
MacNeice, Peter; Olson, Kevin M.; Mobarry, Clark; deFainchtein, Rosalinda; Packer, Charles
1999-01-01
In this paper, we describe a community toolkit which is designed to provide parallel support with adaptive mesh capability for a large and important class of computational models, those using structured, logically cartesian meshes. The package of Fortran 90 subroutines, called PARAMESH, is designed to provide an application developer with an easy route to extend an existing serial code which uses a logically cartesian structured mesh into a parallel code with adaptive mesh refinement. Alternatively, in its simplest use, and with minimal effort, it can operate as a domain decomposition tool for users who want to parallelize their serial codes, but who do not wish to use adaptivity. The package can provide them with an incremental evolutionary path for their code, converting it first to uniformly refined parallel code, and then later if they so desire, adding adaptivity.
Parallel Adaptive Mesh Refinement Library
NASA Technical Reports Server (NTRS)
MacNeice, Peter; Olson, Kevin
2005-01-01
Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.
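As a rough illustration of the block tree described in this abstract, the following Python sketch builds a 2D quad-tree in which every node is a square block and refining a block creates four children at twice the resolution. The names `Block`, `refine`, and `leaves` are hypothetical and are not PARAMESH's Fortran 90 interface.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One node of the quad-tree: a square sub-grid block (hypothetical type)."""
    x0: float
    y0: float
    size: float
    level: int = 0
    children: list = field(default_factory=list)

    def refine(self):
        """Split this block into four children, each half as wide."""
        if self.children:
            return  # already refined
        half = self.size / 2
        for dy in (0, 1):
            for dx in (0, 1):
                self.children.append(
                    Block(self.x0 + dx * half, self.y0 + dy * half,
                          half, self.level + 1)
                )

    def leaves(self):
        """Yield the unrefined blocks that currently cover the domain."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()


root = Block(0.0, 0.0, 1.0)
root.refine()
root.children[0].refine()  # refine only the lower-left quadrant again
leaf_blocks = list(root.leaves())
```

The leaf blocks tile the unit square without overlap, with finer blocks only where refinement was requested; in three dimensions the same idea gives eight children per block.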
Parallel adaptive wavelet collocation method for PDEs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nejadmalayeri, Alireza, E-mail: Alireza.Nejadmalayeri@gmail.com; Vezolainen, Alexei, E-mail: Alexei.Vezolainen@Colorado.edu; Brown-Dymkoski, Eric, E-mail: Eric.Browndymkoski@Colorado.edu
2015-10-01
A parallel adaptive wavelet collocation method for solving a large class of Partial Differential Equations is presented. The parallelization is achieved by developing an asynchronous parallel wavelet transform, which allows one to perform parallel wavelet transform and derivative calculations with only one data synchronization at the highest level of resolution. The data are stored using tree-like structure with tree roots starting at a priori defined level of resolution. Both static and dynamic domain partitioning approaches are developed. For the dynamic domain partitioning, trees are considered to be the minimum quanta of data to be migrated between the processes. This allows fully automated and efficient handling of non-simply connected partitioning of a computational domain. Dynamic load balancing is achieved via domain repartitioning during the grid adaptation step and reassigning trees to the appropriate processes to ensure approximately the same number of grid points on each process. The parallel efficiency of the approach is discussed based on parallel adaptive wavelet-based Coherent Vortex Simulations of homogeneous turbulence with linear forcing at effective non-adaptive resolutions up to 2048^3 using as many as 2048 CPU cores.
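The load-balancing idea, moving whole trees between processes until each holds roughly the same number of grid points, can be sketched with a simple greedy heuristic. This is an assumed stand-in and not the partitioning algorithm of the paper: it ignores spatial locality, and `assign_trees` and the point counts are hypothetical.

```python
import heapq


def assign_trees(points_per_tree, n_procs):
    """Greedy assignment: give the largest remaining tree to the least-loaded process."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, process id)
    heapq.heapify(heap)
    owner = {}
    for tree, npts in sorted(points_per_tree.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(heap)
        owner[tree] = proc  # the tree moves as one indivisible unit
        heapq.heappush(heap, (load + npts, proc))
    loads = [0] * n_procs
    for tree, proc in owner.items():
        loads[proc] += points_per_tree[tree]
    return owner, loads


# Hypothetical grid-point counts per tree after a grid adaptation step.
trees = {"t0": 900, "t1": 400, "t2": 350, "t3": 300, "t4": 50}
owner, loads = assign_trees(trees, 2)
```

Because trees are never split, the balance can only be as good as the tree sizes allow, which is why the abstract speaks of approximately equal numbers of grid points.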
Adapting high-level language programs for parallel processing using data flow
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1988-01-01
EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.
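To illustrate how a data flow graph exposes large-grained parallelism between subprogram calls, the sketch below groups calls into stages whose members do not depend on one another. It is written in Python rather than EASY-FLOW, whose syntax is not shown in the abstract; the call names and the `deps` table are invented for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical subprogram calls mapped to the calls whose outputs they consume.
deps = {
    "read_input": set(),
    "filter_a": {"read_input"},
    "filter_b": {"read_input"},
    "merge": {"filter_a", "filter_b"},
}


def parallel_stages(deps):
    """Return a list of stages; calls within one stage can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all calls whose inputs are available
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(deps)
```

Here `filter_a` and `filter_b` land in the same stage because neither consumes the other's output, which is the kind of subprogram-level concurrency the abstract describes.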
Parallel Anisotropic Tetrahedral Adaptation
NASA Technical Reports Server (NTRS)
Park, Michael A.; Darmofal, David L.
2008-01-01
An adaptive method that robustly produces high aspect ratio tetrahedra to a general 3D metric specification without introducing hybrid semi-structured regions is presented. The elemental operators and higher-level logic are described with their respective domain-decomposed parallelizations. An anisotropic tetrahedral grid adaptation scheme is demonstrated for 1000-1 stretching for a simple cube geometry. This form of adaptation is applicable to more complex domain boundaries via a cut-cell approach as demonstrated by a parallel 3D supersonic simulation of a complex fighter aircraft. To avoid the assumptions and approximations required to form a metric to specify adaptation, an approach is introduced that directly evaluates interpolation error. The grid is adapted to reduce and equidistribute this interpolation error calculation without the use of an intervening anisotropic metric. Direct interpolation error adaptation is illustrated for 1D and 3D domains.
A parallel adaptive mesh refinement algorithm
NASA Technical Reports Server (NTRS)
Quirk, James J.; Hanebutte, Ulf R.
1993-01-01
Over recent years, Adaptive Mesh Refinement (AMR) algorithms which dynamically match the local resolution of the computational grid to the numerical solution being sought have emerged as powerful tools for solving problems that contain disparate length and time scales. In particular, several workers have demonstrated the effectiveness of employing an adaptive, block-structured hierarchical grid system for simulations of complex shock wave phenomena. Unfortunately, from the parallel algorithm developer's viewpoint, this class of scheme is quite involved; these schemes cannot be distilled down to a small kernel upon which various parallelizing strategies may be tested. However, because of their block-structured nature such schemes are inherently parallel, so all is not lost. In this paper we describe the method by which Quirk's AMR algorithm has been parallelized. This method is built upon just a few simple message passing routines and so it may be implemented across a broad class of MIMD machines. Moreover, the method of parallelization is such that the original serial code is left virtually intact, and so we are left with just a single product to support. The importance of this fact should not be underestimated given the size and complexity of the original algorithm.
Parallel architectures for iterative methods on adaptive, block structured grids
NASA Technical Reports Server (NTRS)
Gannon, D.; Vanrosendale, J.
1983-01-01
A parallel computer architecture well suited to the solution of partial differential equations in complicated geometries is proposed. Algorithms for partial differential equations contain a great deal of parallelism. But this parallelism can be difficult to exploit, particularly on complex problems. One approach to extraction of this parallelism is the use of special purpose architectures tuned to a given problem class. The architecture proposed here is tuned to boundary value problems on complex domains. An adaptive elliptic algorithm which maps effectively onto the proposed architecture is considered in detail. Two levels of parallelism are exploited by the proposed architecture. First, by making use of the freedom one has in grid generation, one can construct grids which are locally regular, permitting a one to one mapping of grids to systolic style processor arrays, at least over small regions. All local parallelism can be extracted by this approach. Second, though there may be no regular global structure to the grids constructed, there will still be parallelism at this level. One approach to finding and exploiting this parallelism is to use an architecture having a number of processor clusters connected by a switching network. The use of such a network creates a highly flexible architecture which automatically configures to the problem being solved.
What is adaptive about adaptive decision making? A parallel constraint satisfaction account.
Glöckner, Andreas; Hilbig, Benjamin E; Jekel, Marc
2014-12-01
There is broad consensus that human cognition is adaptive. However, the vital question of how exactly this adaptivity is achieved has remained largely open. Herein, we contrast two frameworks which account for adaptive decision making, namely broad and general single-mechanism accounts vs. multi-strategy accounts. We propose and fully specify a single-mechanism model for decision making based on parallel constraint satisfaction processes (PCS-DM) and contrast it theoretically and empirically against a multi-strategy account. To achieve sufficiently sensitive tests, we rely on a multiple-measure methodology including choice, reaction time, and confidence data as well as eye-tracking. Results show that manipulating the environmental structure produces clear adaptive shifts in choice patterns, as both frameworks would predict. However, results on the process level (reaction time, confidence), in information acquisition (eye-tracking), and from cross-predicting choice consistently corroborate single-mechanism accounts in general, and the proposed parallel constraint satisfaction model for decision making in particular. Copyright © 2014 Elsevier B.V. All rights reserved.
A new parallelization scheme for adaptive mesh refinement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loffler, Frank; Cao, Zhoujian; Brandt, Steven R.; ...
2016-05-06
Here, we present a new method for parallelization of adaptive mesh refinement called Concurrent Structured Adaptive Mesh Refinement (CSAMR). This new method offers the lower computational cost (i.e. wall time x processor count) of subcycling in time, but with the runtime performance (i.e. smaller wall time) of evolving all levels at once using the time step of the finest level (which does more work than subcycling but has less parallelism). We demonstrate our algorithm's effectiveness using an adaptive mesh refinement code, AMSS-NCKU, and show performance on Blue Waters and other high performance clusters. For the class of problem considered in this paper, our algorithm achieves a speedup of 1.7-1.9 when the processor count for a given AMR run is doubled, consistent with our theoretical predictions.
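The cost difference between the two time-stepping strategies named in this abstract can be made concrete by counting cell updates per coarse time step. The sketch below assumes a refinement factor of 2 in time between levels and made-up cell counts; it illustrates the cost argument only and is not the CSAMR algorithm itself.

```python
def updates_subcycling(cells_per_level, ratio=2):
    """With subcycling, level l takes ratio**l substeps per coarse step."""
    return sum(n * ratio ** level for level, n in enumerate(cells_per_level))


def updates_uniform_step(cells_per_level, ratio=2):
    """Without subcycling, every level advances with the finest level's time step."""
    finest = len(cells_per_level) - 1
    return sum(n * ratio ** finest for n in cells_per_level)


cells = [1000, 800, 600]                 # hypothetical cells on levels 0, 1, 2
work_sub = updates_subcycling(cells)     # 1000*1 + 800*2 + 600*4
work_uni = updates_uniform_step(cells)   # (1000 + 800 + 600) * 4
```

For these assumed counts, evolving all levels with the finest time step performs nearly twice as many cell updates as subcycling, which is the extra work the abstract refers to.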
The effect of selection environment on the probability of parallel evolution.
Bailey, Susan F; Rodrigue, Nicolas; Kassen, Rees
2015-06-01
Across the great diversity of life, there are many compelling examples of parallel and convergent evolution, that is, similar evolutionary changes arising in independently evolving populations. Parallel evolution is often taken to be strong evidence of adaptation occurring in populations that are highly constrained in their genetic variation. Theoretical models suggest a few potential factors driving the probability of parallel evolution, but experimental tests are needed. In this study, we quantify the degree of parallel evolution in 15 replicate populations of Pseudomonas fluorescens evolved in five different environments that varied in resource type and arrangement. We identified repeat changes across multiple levels of biological organization from phenotype, to gene, to nucleotide, and tested the impact of 1) selection environment, 2) the degree of adaptation, and 3) the degree of heterogeneity in the environment on the degree of parallel evolution at the gene-level. We saw, as expected, that parallel evolution occurred more often between populations evolved in the same environment; however, the extent of parallel evolution varied widely. The degree of adaptation did not significantly explain variation in the extent of parallelism in our system but number of available beneficial mutations correlated negatively with parallel evolution. In addition, degree of parallel evolution was significantly higher in populations evolved in a spatially structured, multiresource environment, suggesting that environmental heterogeneity may be an important factor constraining adaptation. Overall, our results stress the importance of environment in driving parallel evolutionary changes and point to a number of avenues for future work for understanding when evolution is predictable. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
apGA: An adaptive parallel genetic algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liepins, G.E.; Baluja, S.
1991-01-01
We develop apGA, a parallel variant of the standard generational GA, that combines aggressive search with perpetual novelty, yet is able to preserve enough genetic structure to optimally solve variably scaled, non-uniform block deceptive and hierarchical deceptive problems. apGA combines elitism, adaptive mutation, adaptive exponential scaling, and temporal memory. We present empirical results for six classes of problems, including the DeJong test suite. Although we have not investigated hybrids, we note that apGA could be incorporated into other recent GA variants such as GENITOR, CHC, and the recombination stage of mGA. 12 refs., 2 figs., 2 tabs.
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.; Plaskacz, Edward J.
1989-01-01
The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine, is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14.
Modeling Cooperative Threads to Project GPU Performance for Adaptive Parallelism
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meng, Jiayuan; Uram, Thomas; Morozov, Vitali A.
Most accelerators, such as graphics processing units (GPUs) and vector processors, are particularly suitable for accelerating massively parallel workloads. On the other hand, conventional workloads are developed for multi-core parallelism, which often scale to only a few dozen OpenMP threads. When hardware threads significantly outnumber the degree of parallelism in the outer loop, programmers are challenged with efficient hardware utilization. A common solution is to further exploit the parallelism hidden deep in the code structure. Such parallelism is less structured: parallel and sequential loops may be imperfectly nested within each other, neighboring inner loops may exhibit different concurrency patterns (e.g. Reduction vs. Forall), yet have to be parallelized in the same parallel section. Many input-dependent transformations have to be explored. A programmer often employs a larger group of hardware threads to cooperatively walk through a smaller outer loop partition and adaptively exploit any encountered parallelism. This process is time-consuming and error-prone, yet the risk of gaining little or no performance remains high for such workloads. To reduce risk and guide implementation, we propose a technique to model workloads with limited parallelism that can automatically explore and evaluate transformations involving cooperative threads. Eventually, our framework projects the best achievable performance and the most promising transformations without implementing GPU code or using physical hardware. We envision our technique to be integrated into future compilers or optimization frameworks for autotuning.
Carpet: Adaptive Mesh Refinement for the Cactus Framework
NASA Astrophysics Data System (ADS)
Schnetter, Erik; Hawley, Scott; Hawke, Ian
2016-11-01
Carpet is an adaptive mesh refinement and multi-patch driver for the Cactus Framework (ascl:1102.013). Cactus is a software framework for solving time-dependent partial differential equations on block-structured grids, and Carpet acts as driver layer providing adaptive mesh refinement, multi-patch capability, as well as parallelization and efficient I/O.
A novel parallel pipeline structure of VP9 decoder
NASA Astrophysics Data System (ADS)
Qin, Huabiao; Chen, Wu; Yi, Sijun; Tan, Yunfei; Yi, Huan
2018-04-01
To improve the efficiency of VP9 decoder, a novel parallel pipeline structure of VP9 decoder is presented in this paper. According to the decoding workflow, VP9 decoder can be divided into sub-modules which include entropy decoding, inverse quantization, inverse transform, intra prediction, inter prediction, deblocking and pixel adaptive compensation. By analyzing the computing time of each module, hotspot modules are located and the causes of low efficiency of VP9 decoder can be found. Then, a novel pipeline decoder structure is designed by using mixed parallel decoding methods of data division and function division. The experimental results show that this structure can greatly improve the decoding efficiency of VP9.
Adaptive mesh refinement and load balancing based on multi-level block-structured Cartesian mesh
NASA Astrophysics Data System (ADS)
Misaka, Takashi; Sasaki, Daisuke; Obayashi, Shigeru
2017-11-01
We developed a framework for a distributed-memory parallel computer that enables dynamic data management for adaptive mesh refinement and load balancing. We employed the simple data structure of the building cube method (BCM), where a computational domain is divided into multi-level cubic domains and each cube has the same number of grid points inside, realising a multi-level block-structured Cartesian mesh. Solution adaptive mesh refinement, which works efficiently with the help of the dynamic load balancing, was implemented by dividing cubes based on mesh refinement criteria. The framework was investigated with the Laplace equation in terms of adaptive mesh refinement, load balancing and the parallel efficiency. It was then applied to the incompressible Navier-Stokes equations to simulate a turbulent flow around a sphere. We considered wall-adaptive cube refinement where a non-dimensional wall distance y+ near the sphere is used as a criterion for mesh refinement. The result showed that the load imbalance due to y+ adaptive mesh refinement was corrected by the present approach. To utilise the BCM framework more effectively, we also tested cube-wise algorithm switching, where explicit and implicit time integration schemes are switched depending on the local Courant-Friedrichs-Lewy (CFL) condition in each cube.
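A minimal sketch of cube-wise scheme switching, assuming the local CFL number is computed as u·Δt/Δx and that the explicit scheme is used only where this number stays at or below a threshold. The threshold of 1.0, the `Cube` type, and the sample values are assumptions for illustration; the abstract does not state the authors' criterion.

```python
from dataclasses import dataclass


@dataclass
class Cube:
    max_speed: float  # largest velocity magnitude inside the cube
    dx: float         # grid spacing inside the cube


def choose_scheme(cube, dt, cfl_limit=1.0):
    """Pick a time integrator from the cube's local CFL number (assumed rule)."""
    cfl = cube.max_speed * dt / cube.dx
    return "explicit" if cfl <= cfl_limit else "implicit"


# Coarse cube, fine cube in fast flow, fine cube in slow flow (hypothetical values).
cubes = [Cube(1.0, 0.1), Cube(1.0, 0.0125), Cube(0.2, 0.0125)]
schemes = [choose_scheme(c, dt=0.05) for c in cubes]
```

With a single global time step, only the finely resolved cube in fast flow exceeds the limit and falls back to the implicit scheme, so the cheaper explicit update can be kept elsewhere.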
Parallel Processing of Adaptive Meshes with Load Balancing
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2001-01-01
Many scientific applications involve grids that lack a uniform underlying structure. These applications are often also dynamic in nature in that the grid structure significantly changes between successive phases of execution. In parallel computing environments, mesh adaptation of unstructured grids through selective refinement/coarsening has proven to be an effective approach. However, achieving load balance while minimizing interprocessor communication and redistribution costs is a difficult problem. Traditional dynamic load balancers are mostly inadequate because they lack a global view of system loads across processors. In this paper, we propose a novel and general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication topology, and compare its performance with a successful global load balancing environment, called PLUM, specifically created to handle adaptive unstructured applications. Our experimental results on an IBM SP2 demonstrate that the SBN-based load balancer achieves lower redistribution costs than that under PLUM by overlapping processing and data migration.
NASA Astrophysics Data System (ADS)
Destefano, Anthony; Heerikhuisen, Jacob
2015-04-01
Fully 3D particle simulations can be expensive in both computation and memory, especially when high resolution grid cells are required. The problem becomes further complicated when parallelization is needed. In this work we focus on computational methods to solve these difficulties. Hilbert curves are used to map the 3D particle space to the 1D contiguous memory space. This method of organization allows for minimized cache misses on the GPU as well as a sorted structure that is equivalent to an octal tree data structure. This type of sorted structure is attractive for use in adaptive mesh implementations due to the logarithmic search time. Implementations using the Message Passing Interface (MPI) library and NVIDIA's parallel computing platform CUDA will be compared, as MPI is commonly used on server nodes with many CPUs. We will also compare static grid structures with adaptive mesh structures. The physical test bed will be simulating heavy interstellar atoms interacting with a background plasma, the heliosphere, simulated from fully consistent coupled MHD/kinetic particle code. It is known that charge exchange is an important factor in space plasmas; specifically, it modifies the structure of the heliosphere itself. We would like to thank the Alabama Supercomputer Authority for the use of their computational resources.
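The mapping from grid cells to a 1D Hilbert index can be sketched as follows. For brevity this uses the well-known 2D form of the curve rather than the 3D mapping described in the abstract, and the function name `hilbert_index` and the sample cells are hypothetical.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) on an n-by-n grid (n a power of two) to its
    position along the 2D Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Sort hypothetical particle cells so that neighbours in space sit close in memory.
cells = [(3, 0), (0, 0), (2, 2), (1, 0), (0, 3)]
ordered = sorted(cells, key=lambda c: hilbert_index(4, *c))
```

Because consecutive indices always belong to adjacent cells, sorting particles by this key keeps spatial neighbours close together in memory, which is the locality property the abstract relies on to reduce cache misses.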
Adaptive Identification by Systolic Arrays.
1987-12-01
BIBLIOGRAPHY: Anton, Howard, Elementary Linear Algebra, John Wiley & Sons, 1984. Cristi, Roberto, A Parallel Structure for Adaptive Pole Placement... II. SYSTEM IDENTIFICATION METHODS: A. LINEAR SYSTEM MODELING; B. SOLUTION OF SYSTEMS OF LINEAR EQUATIONS; C. QR DECOMPOSITION; D. RECURSIVE LEAST SQUARES; E. BLOCK
NASA Astrophysics Data System (ADS)
Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain
2014-11-01
Multi-block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid GPU/CPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module. Head of HPC and Release Management, Numeca International.
Innovative Language-Based & Object-Oriented Structured AMR Using Fortran 90 and OpenMP
NASA Technical Reports Server (NTRS)
Norton, C.; Balsara, D.
1999-01-01
Parallel adaptive mesh refinement (AMR) is an important numerical technique that leads to the efficient solution of many physical and engineering problems. In this paper, we describe how AMR programming can be performed in an object-oriented way using the modern aspects of Fortran 90 combined with the parallelization features of OpenMP.
Analysis, preliminary design and simulation systems for control-structure interaction problems
NASA Technical Reports Server (NTRS)
Park, K. C.; Alvin, Kenneth F.
1991-01-01
Software aspects of control-structure interaction (CSI) analysis are discussed. The following subject areas are covered: (1) implementation of a partitioned algorithm for simulation of large CSI problems; (2) second-order discrete Kalman filtering equations for CSI simulations; and (3) parallel computations and control of adaptive structures.
Numerical simulation of h-adaptive immersed boundary method for freely falling disks
NASA Astrophysics Data System (ADS)
Zhang, Pan; Xia, Zhenhua; Cai, Qingdong
2018-05-01
In this work, a freely falling disk with aspect ratio 1/10 is directly simulated by using an adaptive numerical model implemented on a parallel computation framework JASMIN. The adaptive numerical model is a combination of the h-adaptive mesh refinement technique and the implicit immersed boundary method (IBM). Our numerical results agree well with the experimental results in all of the six degrees of freedom of the disk. Furthermore, very similar vortex structures observed in the experiment were also obtained.
Longitudinal trends in climate drive flowering time clines in North American Arabidopsis thaliana.
Samis, Karen E; Murren, Courtney J; Bossdorf, Oliver; Donohue, Kathleen; Fenster, Charles B; Malmberg, Russell L; Purugganan, Michael D; Stinchcombe, John R
2012-06-01
Introduced species frequently show geographic differentiation, and when differentiation mirrors the ancestral range, it is often taken as evidence of adaptive evolution. The mouse-ear cress (Arabidopsis thaliana) was introduced to North America from Eurasia 150-200 years ago, providing an opportunity to study parallel adaptation in a genetic model organism. Here, we test for clinal variation in flowering time using 199 North American (NA) accessions of A. thaliana, and evaluate the contributions of major flowering time genes FRI, FLC, and PHYC as well as potential ecological mechanisms underlying differentiation. We find evidence for substantial within population genetic variation in quantitative traits and flowering time, and putatively adaptive longitudinal differentiation, despite low levels of variation at FRI, FLC, and PHYC and genome-wide reductions in population structure relative to Eurasian (EA) samples. The observed longitudinal cline in flowering time in North America is parallel to an EA cline, robust to the effects of population structure, and associated with geographic variation in winter precipitation and temperature. We detected major effects of FRI on quantitative traits associated with reproductive fitness, although the haplotype associated with higher fitness remains rare in North America. Collectively, our results suggest the evolution of parallel flowering time clines through novel genetic mechanisms.
High Performance Fortran for Aerospace Applications
NASA Technical Reports Server (NTRS)
Mehrotra, Piyush; Zima, Hans; Bushnell, Dennis M. (Technical Monitor)
2000-01-01
This paper focuses on the use of High Performance Fortran (HPF) for important classes of algorithms employed in aerospace applications. HPF is a set of Fortran extensions designed to provide users with a high-level interface for programming data parallel scientific applications, while delegating to the compiler/runtime system the task of generating explicitly parallel message-passing programs. We begin by providing a short overview of the HPF language. This is followed by a detailed discussion of the efficient use of HPF for applications involving multiple structured grids such as multiblock and adaptive mesh refinement (AMR) codes as well as unstructured grid codes. We focus on the data structures and computational structures used in these codes and on the high-level strategies that can be expressed in HPF to optimally exploit the parallelism in these algorithms.
NASA Astrophysics Data System (ADS)
Gassmöller, Rene; Bangerth, Wolfgang
2016-04-01
Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. 
Finally, we show some relevant particle application cases, compare our implementation to a modern advection-field approach, and demonstrate under which conditions which method is more efficient. We implemented the presented methods in ASPECT (aspect.dealii.org), a freely available open-source community code for geodynamic simulations. The structure of the particle code is highly modular, and segregated from the PDE solver, and can thus be easily transferred to other programs, or adapted for various application cases.
Parallel implementation of an adaptive and parameter-free N-body integrator
NASA Astrophysics Data System (ADS)
Pruett, C. David; Ingham, William H.; Herman, Ralph D.
2011-05-01
Previously, Pruett et al. (2003) [3] described an N-body integrator of arbitrarily high order M with an asymptotic operation count of O(MN). The algorithm's structure lends itself readily to data parallelization, which we document and demonstrate here in the integration of point-mass systems subject to Newtonian gravitation. High order is shown to benefit parallel efficiency. The resulting N-body integrator is robust, parameter-free, highly accurate, and adaptive in both time-step and order. Moreover, it exhibits linear speedup on distributed parallel processors, provided that each processor is assigned at least a handful of bodies. Program summary: Program title: PNB.f90; Catalogue identifier: AEIK_v1_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIK_v1_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html; No. of lines in distributed program, including test data, etc.: 3052; No. of bytes in distributed program, including test data, etc.: 68 600; Distribution format: tar.gz; Programming language: Fortran 90 and OpenMPI; Computer: All shared or distributed memory parallel processors; Operating system: Unix/Linux; Has the code been vectorized or parallelized?: The code has been parallelized but has not been explicitly vectorized; RAM: Dependent upon N; Classification: 4.3, 4.12, 6.5; Nature of problem: High accuracy numerical evaluation of trajectories of N point masses each subject to Newtonian gravitation; Solution method: Parallel and adaptive extrapolation in time via power series of arbitrary degree; Running time: 5.1 s for the demo program supplied with the package.
Craciun, Stefan; Brockmeier, Austin J; George, Alan D; Lam, Herman; Príncipe, José C
2011-01-01
Methods for decoding movements from neural spike counts using adaptive filters often rely on minimizing the mean-squared error. However, for non-Gaussian distribution of errors, this approach is not optimal for performance. Therefore, rather than using probabilistic modeling, we propose an alternate non-parametric approach. In order to extract more structure from the input signal (neuronal spike counts) we propose using minimum error entropy (MEE), an information-theoretic approach that minimizes the error entropy as part of an iterative cost function. However, the disadvantage of using MEE as the cost function for adaptive filters is the increase in computational complexity. In this paper we present a comparison between the decoding performance of the analytic Wiener filter and a linear filter trained with MEE, which is then mapped to a parallel architecture in reconfigurable hardware tailored to the computational needs of the MEE filter. We observe considerable speedup from the hardware design. The adaptation of filter weights for the multiple-input, multiple-output linear filters, necessary in motor decoding, is a highly parallelizable algorithm. It can be decomposed into many independent computational blocks with a parallel architecture readily mapped to a field-programmable gate array (FPGA) and scales to large numbers of neurons. By pipelining and parallelizing independent computations in the algorithm, the proposed parallel architecture has sublinear increases in execution time with respect to both window size and filter order.
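The error-entropy cost mentioned in this abstract can be illustrated with a small sketch. It assumes one common estimator, which the abstract does not specify: Renyi's quadratic entropy computed from a Gaussian Parzen window over all pairs of errors. The function names and the kernel width `sigma` are hypothetical. The double sum is the source of the added computational cost, and because every pair is independent it is also what makes the computation easy to spread over parallel hardware.

```python
import math


def gaussian(x, sigma):
    """Gaussian kernel with standard deviation sigma, evaluated at x."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def information_potential(errors, sigma=1.0):
    """Parzen estimate of the information potential: mean kernel value over all error pairs.

    Assumption: Gaussian kernel; the pairwise kernel width is sigma * sqrt(2).
    """
    n = len(errors)
    s = sigma * math.sqrt(2.0)
    total = sum(gaussian(ei - ej, s) for ei in errors for ej in errors)
    return total / (n * n)


def quadratic_error_entropy(errors, sigma=1.0):
    """Renyi quadratic entropy of the errors; minimizing it maximizes the information potential."""
    return -math.log(information_potential(errors, sigma))
```

Under this estimator, tightly clustered errors give a lower entropy than widely spread errors, which is the property an MEE-trained filter exploits.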
A Robust and Scalable Software Library for Parallel Adaptive Refinement on Unstructured Meshes
NASA Technical Reports Server (NTRS)
Lou, John Z.; Norton, Charles D.; Cwik, Thomas A.
1999-01-01
The design and implementation of Pyramid, a software library for performing parallel adaptive mesh refinement (PAMR) on unstructured meshes, is described. This software library can be easily used in a variety of unstructured parallel computational applications, including parallel finite element, parallel finite volume, and parallel visualization applications using triangular or tetrahedral meshes. The library contains a suite of well-designed and efficiently implemented modules that perform operations in a typical PAMR process. Among these are mesh quality control during successive parallel adaptive refinement (typically guided by a local-error estimator), parallel load-balancing, and parallel mesh partitioning using the ParMeTiS partitioner. The Pyramid library is implemented in Fortran 90 with an interface to the Message-Passing Interface (MPI) library, supporting code efficiency, modularity, and portability. An EM waveguide filter application, adaptively refined using the Pyramid library, is illustrated.
Decentralized Control of Scheduling in Distributed Systems.
1983-03-18
the job scheduling algorithm adapts to the changing busyness of the various hosts in the system. The environment in which the job scheduling entities...resources and processes that constitute the node and a set of interfaces for accessing these processes and resources. The structure of a node could change ...parallel. Chang [CHNG82] has also described some algorithms for detecting properties of general graphs by traversing paths in a graph in parallel. One of
Towards a large-scale scalable adaptive heart model using shallow tree meshes
NASA Astrophysics Data System (ADS)
Krause, Dorian; Dickopf, Thomas; Potse, Mark; Krause, Rolf
2015-10-01
Electrophysiological heart models are sophisticated computational tools that place high demands on the computing hardware due to the high spatial resolution required to capture the steep depolarization front. To address this challenge, we present a novel adaptive scheme for resolving the depolarization front accurately using adaptivity in space. Our adaptive scheme is based on locally structured meshes. These tensor meshes in space are organized in a parallel forest of trees, which allows us to resolve complicated geometries and to realize high variations in the local mesh sizes with a minimal memory footprint in the adaptive scheme. We discuss both a non-conforming mortar element approximation and a conforming finite element space and present an efficient technique for the assembly of the respective stiffness matrices using matrix representations of the inclusion operators into the product space on the so-called shallow tree meshes. We analyzed the parallel performance and scalability for a two-dimensional ventricle slice as well as for a full large-scale heart model. Our results demonstrate that the method has good performance and high accuracy.
ADAPTIVE TETRAHEDRAL GRID REFINEMENT AND COARSENING IN MESSAGE-PASSING ENVIRONMENTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hallberg, J.; Stagg, A.
2000-10-01
A grid refinement and coarsening scheme has been developed for tetrahedral and triangular grid-based calculations in message-passing environments. The element adaption scheme is based on an edge bisection of elements marked for refinement by an appropriate error indicator. Hash-table/linked-list data structures are used to store nodal and element information. The grid along inter-processor boundaries is refined and coarsened consistently with the update of these data structures via MPI calls. The parallel adaption scheme has been applied to the solution of a transient, three-dimensional, nonlinear, groundwater flow problem. Timings indicate efficiency of the grid refinement process relative to the flow solver calculations.
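The role of the hash table in edge bisection can be shown with a serial, two-dimensional sketch. It is an illustration under stated assumptions, not the authors' scheme: the function name `bisect_triangles`, the longest-edge rule, and the data layout are all hypothetical, and no MPI communication or coarsening is included. The dictionary keyed by the sorted node pair of an edge ensures that two elements sharing an edge reuse the same midpoint node.

```python
def bisect_triangles(nodes, triangles, marked):
    """Split each marked triangle in two by bisecting its longest edge.

    nodes: list of (x, y) tuples, extended in place with new midpoint nodes.
    triangles: list of (a, b, c) node-index tuples.
    marked: set of triangle indices flagged by some error indicator.
    Note: unmarked neighbours are left untouched, so hanging nodes can remain;
    a production scheme would propagate refinement to keep the mesh conforming.
    """
    midpoint_of = {}  # hash table: sorted edge (i, j) -> index of its midpoint node

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_of:
            (xi, yi), (xj, yj) = nodes[i], nodes[j]
            nodes.append(((xi + xj) / 2.0, (yi + yj) / 2.0))
            midpoint_of[key] = len(nodes) - 1
        return midpoint_of[key]

    def length_squared(i, j):
        (xi, yi), (xj, yj) = nodes[i], nodes[j]
        return (xi - xj) ** 2 + (yi - yj) ** 2

    refined = []
    for index, (a, b, c) in enumerate(triangles):
        if index not in marked:
            refined.append((a, b, c))
            continue
        # Rotate the triangle so that (p, q) is its longest edge; orientation is preserved.
        rotations = [(a, b, c), (b, c, a), (c, a, b)]
        p, q, r = max(rotations, key=lambda t: length_squared(t[0], t[1]))
        m = midpoint(p, q)
        refined.append((p, m, r))
        refined.append((m, q, r))
    return refined
```

In a message-passing setting, the same edge key could serve as the identifier exchanged between processors that share a boundary edge, although how the original code does this is not described in the abstract.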
Mechanisms mediating parallel action monitoring in fronto-striatal circuits.
Beste, Christian; Ness, Vanessa; Lukas, Carsten; Hoffmann, Rainer; Stüwe, Sven; Falkenstein, Michael; Saft, Carsten
2012-08-01
Flexible response adaptation and the control of conflicting information play a pivotal role in daily life. Yet, little is known about the neuronal mechanisms mediating parallel control of these processes. We examined these mechanisms using a multi-methodological approach that integrated data from event-related potentials (ERPs) with structural MRI data and source localisation using sLORETA. Moreover, we calculated evoked wavelet oscillations. We applied this multi-methodological approach in healthy subjects and patients in a prodromal phase of a major basal ganglia disorder (i.e., Huntington's disease), to directly focus on fronto-striatal networks. Behavioural data indicated that especially the parallel execution of conflict monitoring and flexible response adaptation was modulated across the examined cohorts. When the two processes do not coincide, a high integrity of fronto-striatal loops seems to be dispensable. The neurophysiological data suggest that conflict monitoring (reflected by the N2 ERP) and working memory processes (reflected by the P3 ERP) differentially contribute to this pattern of results. Flexible response adaptation under the constraint of high conflict processing affected the N2 and P3 ERP, as well as their delta frequency band oscillations. Yet, modulatory effects were strongest for the N2 ERP and evoked wavelet oscillations in this time range. The N2 ERPs were localized in the anterior cingulate cortex (BA32, BA24). Modulations of the P3 ERP were localized in parietal areas (BA7). In addition, MRI-determined caudate head volume predicted modulations in conflict monitoring, but not working memory processes. The results show how parallel conflict monitoring and flexible adaptation of action are mediated via fronto-striatal networks.
While both response monitoring and working memory processes seem to play a role, response selection processes and ACC-basal ganglia networks in particular seem to be the driving force in mediating parallel conflict monitoring and flexible adaptation of actions. Copyright © 2012 Elsevier Inc. All rights reserved.
Adaptive parallel logic networks
NASA Technical Reports Server (NTRS)
Martinez, Tony R.; Vidal, Jacques J.
1988-01-01
Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.
The SMART MIL-STD-1553 bus adapter hardware manual
NASA Technical Reports Server (NTRS)
Ton, T. T.
1981-01-01
The SMART Multiplexer Interface Adapter (SMIA), a complete system interface for the message structure of MIL-STD-1553, is described. It provides buffering and storage for transmitted and received data and handles all the necessary handshaking to interface between a parallel 8-bit data bus and a MIL-STD serial bit stream. The bus adapter is configured as either a bus controller or a remote terminal interface. It is coupled directly to the multiplex bus, or stub coupled through an additional isolation transformer located at the connection point. Fault isolation resistors provide short circuit protection.
Xu, Liangliang; Xu, Nengxiong
2017-01-01
This paper focuses on designing and implementing parallel adaptive inverse distance weighting (AIDW) interpolation algorithms by using the graphics processing unit (GPU). The AIDW is an improved version of the standard IDW, which can adaptively determine the power parameter according to the data points’ spatial distribution pattern and achieve more accurate predictions than those predicted by IDW. In this paper, we first present two versions of the GPU-accelerated AIDW, i.e. the naive version without profiting from the shared memory and the tiled version taking advantage of the shared memory. We also implement the naive version and the tiled version using two data layouts, structure of arrays and array of aligned structures, on both single and double precision. We then evaluate the performance of parallel AIDW by comparing it with its corresponding serial algorithm on three different machines equipped with the GPUs GT730M, M5000 and K40c. The experimental results indicate that: (i) there is no significant difference in the computational efficiency when different data layouts are employed; (ii) the tiled version is always slightly faster than the naive version; and (iii) on single precision the achieved speed-up can be up to 763 (on the GPU M5000), while on double precision the obtained highest speed-up is 197 (on the GPU K40c). To benefit the community, all source code and testing data related to the presented parallel AIDW algorithm are publicly available. PMID:28989754
Mei, Gang; Xu, Liangliang; Xu, Nengxiong
2017-09-01
This paper focuses on designing and implementing parallel adaptive inverse distance weighting (AIDW) interpolation algorithms by using the graphics processing unit (GPU). The AIDW is an improved version of the standard IDW, which can adaptively determine the power parameter according to the data points' spatial distribution pattern and achieve more accurate predictions than those predicted by IDW. In this paper, we first present two versions of the GPU-accelerated AIDW, i.e. the naive version without profiting from the shared memory and the tiled version taking advantage of the shared memory. We also implement the naive version and the tiled version using two data layouts, structure of arrays and array of aligned structures, on both single and double precision. We then evaluate the performance of parallel AIDW by comparing it with its corresponding serial algorithm on three different machines equipped with the GPUs GT730M, M5000 and K40c. The experimental results indicate that: (i) there is no significant difference in the computational efficiency when different data layouts are employed; (ii) the tiled version is always slightly faster than the naive version; and (iii) on single precision the achieved speed-up can be up to 763 (on the GPU M5000), while on double precision the obtained highest speed-up is 197 (on the GPU K40c). To benefit the community, all source code and testing data related to the presented parallel AIDW algorithm are publicly available.
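For orientation, the standard IDW predictor that AIDW builds on can be sketched as follows. This is a serial illustration with a fixed `power` argument; the adaptive rule that AIDW uses to choose the power from the local point pattern is not reproduced here, and the function name `idw_predict` is hypothetical. Each query point is handled independently of the others, which is why the method maps naturally to one GPU thread per prediction.

```python
import math


def idw_predict(points, values, query, power=2.0):
    """Inverse distance weighted estimate at `query` from scattered 2D data.

    points: list of (x, y) data locations; values: the data values at those points.
    power: distance-decay exponent (fixed here; AIDW would pick it per query point).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), z in zip(points, values):
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return z  # query coincides with a data point: return its value exactly
        w = 1.0 / d ** power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total
```

With two data points of value 1 at (0, 0) and 3 at (2, 0), the estimate at the midpoint (1, 0) is 2, since both weights are equal.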
The Feasibility of Adaptive Unstructured Computations On Petaflops Systems
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Oliker, Leonid; Heber, Gerd; Gao, Guang; Saini, Subhash (Technical Monitor)
1999-01-01
This viewgraph presentation covers the advantages of mesh adaptation, unstructured grids, and dynamic load balancing. It illustrates parallel adaptive communications, and explains PLUM (Parallel dynamic load balancing for adaptive unstructured meshes), and PSAW (Proper Self Avoiding Walks).
Multifractal Internet Traffic Model and Active Queue Management
2003-01-01
dropped by the Adaptive RED, ssthresh decreases from 64KB to 4KB and the new congestion window cwnd is decreased from 8KB to 1KB (Tahoe). The situation...method to predict the queuing behavior of FIFO and RED queues. In order to satisfy a given delay and jitter requirement for real time connections, and to...5.2 Vulnerability of Adaptive RED to Web-mice; 5.3 A Parallel Virtual Queues Structure
Many-to-one form-to-function mapping weakens parallel morphological evolution.
Thompson, Cole J; Ahmed, Newaz I; Veen, Thor; Peichel, Catherine L; Hendry, Andrew P; Bolnick, Daniel I; Stuart, Yoel E
2017-11-01
Evolutionary ecologists aim to explain and predict evolutionary change under different selective regimes. Theory suggests that such evolutionary prediction should be more difficult for biomechanical systems in which different trait combinations generate the same functional output: "many-to-one mapping." Many-to-one mapping of phenotype to function enables multiple morphological solutions to meet the same adaptive challenges. Therefore, many-to-one mapping should undermine parallel morphological evolution, and hence evolutionary predictability, even when selection pressures are shared among populations. Studying 16 replicate pairs of lake- and stream-adapted threespine stickleback (Gasterosteus aculeatus), we quantified three parts of the teleost feeding apparatus and used biomechanical models to calculate their expected functional outputs. The three feeding structures differed in their form-to-function relationship from one-to-one (lower jaw lever ratio) to increasingly many-to-one (buccal suction index, opercular 4-bar linkage). We tested for (1) weaker linear correlations between phenotype and calculated function, and (2) less parallel evolution across lake-stream pairs, in the many-to-one systems relative to the one-to-one system. We confirm both predictions, thus supporting the theoretical expectation that increasing many-to-one mapping undermines parallel evolution. Therefore, sole consideration of morphological variation within and among populations might not serve as a proxy for functional variation when multiple adaptive trait combinations exist. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
Exploring types of play in an adapted robotics program for children with disabilities.
Lindsay, Sally; Lam, Ashley
2018-04-01
Play is an important occupation in a child's development. Children with disabilities often have fewer opportunities to engage in meaningful play than typically developing children. The purpose of this study was to explore the types of play (i.e., solitary, parallel and co-operative) within an adapted robotics program for children with disabilities aged 6-8 years. This study draws on detailed observations of each of the six robotics workshops and interviews with 53 participants (21 children, 21 parents and 11 programme staff). Our findings showed that four children engaged in solitary play, where all but one showed signs of moving towards parallel play. Six children demonstrated parallel play during all workshops. The remainder of the children had mixed play types (solitary, parallel and/or co-operative) throughout the robotics workshops. We observed more parallel and co-operative, and less solitary play as the programme progressed. Ten different children displayed co-operative behaviours throughout the workshops. The interviews highlighted how staff supported children's engagement in the programme. Meanwhile, parents reported on their child's development of play skills. An adapted LEGO® robotics program has potential to develop the play skills of children with disabilities in moving from solitary towards more parallel and co-operative play. Implications for rehabilitation: Educators and clinicians working with children who have disabilities should consider the potential of LEGO® robotics programs for developing their play skills. Clinicians should consider how the extent of their involvement in prompting and facilitating children's engagement and play within a robotics program may influence their ability to interact with their peers. Educators and clinicians should incorporate both structured and unstructured free-play elements within a robotics program to facilitate children's social development.
Barabanova, S V; Artiukhina, Z E; Ovchinnikova, K T; Abramova, T V; Kazakova, T B; Khavinson, V Kh; Malinin, V V; Korneva, E A
2007-02-01
The objective of this work was to perform a parallel analysis of activation of the rat anterior hypothalamus cells, as judged by c-Fos protein expression, and of the expression of interleukin-2 (IL-2) under different influences, i.e., mild stress (handling) and adaptation to it, and intranasal administration of saline and the peptides Vilon (Lys-Glu) and Epithalon (Ala-Glu-Asp-Gly). Changes in the counts of cells positive for c-Fos and IL-2 proteins were studied in structures of the lateral (LHA) area, anterior (AHN), supraoptic (SO) and paraventricular (PVH) nuclei of Wistar rat hypothalamus. The numbers of interleukin-2-positive and c-Fos-positive cells were calculated. The findings were: a negative correlation between the activation of cells and the amount of IL-2 in the cells in the hypothalamic structures under study, and the specific patterns of changes in the counts of cells positive for c-Fos and IL-2 under stress and adaptation to stress.
A multi-block adaptive solving technique based on lattice Boltzmann method
NASA Astrophysics Data System (ADS)
Zhang, Yang; Xie, Jiahua; Li, Xiaoyue; Ma, Zhenghai; Zou, Jianfeng; Zheng, Yao
2018-05-01
In this paper, we develop a parallel adaptive CFD algorithm by combining the multi-block Lattice Boltzmann Method (LBM) with Adaptive Mesh Refinement (AMR). The mesh refinement criterion of this algorithm is based on the density, velocity and vortices of the flow field. The refined grid boundary is obtained by extending outward half a ghost cell from the coarse grid boundary, which makes the adaptive mesh more compact and the boundary treatment more convenient. Two numerical examples, the backward step flow separation and the unsteady flow around a circular cylinder, show that the algorithm captures the vortex structure of the cold flow field accurately.
Genomics of parallel adaptation at two timescales in Drosophila
Begun, David J.
2017-01-01
Two interesting unanswered questions are the extent to which both the broad patterns and genetic details of adaptive divergence are repeatable across species, and the timescales over which parallel adaptation may be observed. Drosophila melanogaster is a key model system for population and evolutionary genomics. Findings from genetics and genomics suggest that recent adaptation to latitudinal environmental variation (on the timescale of hundreds or thousands of years) associated with Out-of-Africa colonization plays an important role in maintaining biological variation in the species. Additionally, studies of interspecific differences between D. melanogaster and its sister species D. simulans have revealed that a substantial proportion of proteins and amino acid residues exhibit adaptive divergence on a timescale of roughly a few million years. Here we use population genomic approaches to attack the problem of parallelism between D. melanogaster and a highly diverged congener, D. hydei, on two timescales. D. hydei, a member of the repleta group of Drosophila, is similar to D. melanogaster in that it too appears to be a recently cosmopolitan species and recent colonizer of high latitude environments. We observed parallelism both for genes exhibiting latitudinal allele frequency differentiation within species and for genes exhibiting recurrent adaptive protein divergence between species. Greater parallelism was observed for long-term adaptive protein evolution and this parallelism includes not only the specific genes/proteins that exhibit adaptive evolution, but extends even to the magnitudes of the selective effects on interspecific protein differences. Thus, despite the roughly 50 million years of time separating D. melanogaster and D. hydei, and despite their considerably divergent biology, they exhibit substantial parallelism, suggesting the existence of a fundamental predictability of adaptive evolution in the genus. PMID:28968391
Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Oliker, Leonid; Sohn, Andrew
1996-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for efficiently computing unsteady problems to resolve solution features of interest. Unfortunately, this causes load imbalance among processors on a parallel machine. This paper describes the parallel implementation of a tetrahedral mesh adaption scheme and a new global load balancing method. A heuristic remapping algorithm is presented that assigns partitions to processors such that the redistribution cost is minimized. Results indicate that the parallel performance of the mesh adaption code depends on the nature of the adaption region and show a 35.5X speedup on 64 processors of an SP2 when 35% of the mesh is randomly adapted. For large-scale scientific computations, our load balancing strategy gives almost a sixfold reduction in solver execution times over non-balanced loads. Furthermore, our heuristic remapper yields processor assignments that are less than 3% off the optimal solutions but requires only 1% of the computational time.
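Two quantities from this abstract lend themselves to a short worked sketch. The first is parallel efficiency, which for the reported 35.5X speedup on 64 processors is 35.5 / 64, roughly 55%. The second is the idea of remapping new partitions to processors so that little data has to move. The greedy rule below is only one plausible heuristic and is not claimed to be the authors' algorithm; the `similarity` matrix and both function names are hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / processors


def greedy_remap(similarity):
    """Assign each new partition to a distinct processor, keeping as much data in place as possible.

    similarity[p][j]: amount of data of new partition j already resident on processor p.
    Returns a list `assignment` where assignment[j] is the processor chosen for partition j.
    Assumption: a square matrix, one partition per processor, largest entries served first.
    """
    n = len(similarity)
    entries = sorted(
        ((similarity[p][j], p, j) for p in range(n) for j in range(n)),
        reverse=True,
    )
    assignment = [None] * n
    used_processors = set()
    for _, p, j in entries:
        if assignment[j] is None and p not in used_processors:
            assignment[j] = p
            used_processors.add(p)
    return assignment
```

For the matrix `[[1, 9], [2, 8]]`, the greedy rule places partition 1 on processor 0 and partition 0 on processor 1, retaining 11 units of data in place instead of 9 for the identity mapping.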
NASA Astrophysics Data System (ADS)
Li, Xinhua; Song, Zhenyu; Zhan, Yongjie; Wu, Qiongzhi
2009-12-01
Since the system capacity is severely limited, reducing the multiple access interference (MAI) is necessary in the multiuser direct-sequence code division multiple access (DS-CDMA) system used in the data-transfer link of telecommunication terminals. In this paper, after reviewing various multiuser detection schemes, we adopt an adaptive multistage parallel interference cancellation (PIC) structure in the demodulator, based on the least mean square (LMS) algorithm, to eliminate the MAI. Neither a training sequence nor a pilot signal is needed in the proposed scheme, and its implementation complexity can be greatly reduced by an approximate LMS algorithm. The algorithm and its FPGA implementation are then derived. Simulation results show that the proposed adaptive PIC can outperform some of the existing interference cancellation methods in AWGN channels. The hardware setup of the multiuser demodulator is described, and the experimental results based on it agree with the simulation results, showing large performance gains over the conventional single-user demodulator.
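The LMS update at the heart of such an adaptive structure can be sketched in a few lines. This is the generic textbook form with an explicit reference signal `desired`; the abstract states that the proposed scheme needs no training sequence, so in that setting the reference would have to come from elsewhere (for example from tentative decisions), which is not modelled here. The function name and step size `mu` are hypothetical.

```python
def lms_step(weights, x, desired, mu):
    """One least-mean-square update.

    weights: current filter weights; x: input vector of the same length.
    desired: reference value for this sample; mu: step size.
    Returns the updated weights and the a priori error.
    """
    output = sum(w * xi for w, xi in zip(weights, x))
    error = desired - output
    updated = [w + mu * error * xi for w, xi in zip(weights, x)]
    return updated, error
```

Each weight is updated from the same scalar error and its own input sample, so the per-weight updates are independent of one another, which suits a parallel hardware implementation.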
Acoustooptic linear algebra processors - Architectures, algorithms, and applications
NASA Technical Reports Server (NTRS)
Casasent, D.
1984-01-01
Architectures, algorithms, and applications for systolic processors are described with attention to the realization of parallel algorithms on various optical systolic array processors. Systolic processors for matrices with special structure and matrices of general structure, and the realization of matrix-vector, matrix-matrix, and triple-matrix products and such architectures are described. Parallel algorithms for direct and indirect solutions to systems of linear algebraic equations and their implementation on optical systolic processors are detailed with attention to the pipelining and flow of data and operations. Parallel algorithms and their optical realization for LU and QR matrix decomposition are specifically detailed. These represent the fundamental operations necessary in the implementation of least squares, eigenvalue, and SVD solutions. Specific applications (e.g., the solution of partial differential equations, adaptive noise cancellation, and optimal control) are described to typify the use of matrix processors in modern advanced signal processing.
Self-organization in neural networks - Applications in structural optimization
NASA Technical Reports Server (NTRS)
Hajela, Prabhat; Fu, B.; Berke, Laszlo
1993-01-01
The present paper discusses the applicability of ART (Adaptive Resonance Theory) networks, and the Hopfield and Elastic networks, in problems of structural analysis and design. A characteristic of these network architectures is the ability to classify patterns presented as inputs into specific categories. The categories may themselves represent distinct procedural solution strategies. The paper shows how this property can be adapted in the structural analysis and design problem. A second application is the use of Hopfield and Elastic networks in optimization problems. Of particular interest are problems characterized by the presence of discrete and integer design variables. The parallel computing architecture that is typical of neural networks is shown to be effective in such problems. Results of preliminary implementations in structural design problems are also included in the paper.
Parallel goal-oriented adaptive finite element modeling for 3D electromagnetic exploration
NASA Astrophysics Data System (ADS)
Zhang, Y.; Key, K.; Ovall, J.; Holst, M.
2014-12-01
We present a parallel goal-oriented adaptive finite element method for accurate and efficient electromagnetic (EM) modeling of complex 3D structures. An unstructured tetrahedral mesh allows this approach to accommodate arbitrarily complex 3D conductivity variations and a priori known boundaries. The total electric field is approximated by the lowest order linear curl-conforming shape functions and the discretized finite element equations are solved by a sparse LU factorization. Accuracy of the finite element solution is achieved through adaptive mesh refinement that is performed iteratively until the solution converges to the desired accuracy tolerance. Refinement is guided by a goal-oriented error estimator that uses a dual-weighted residual method to optimize the mesh for accurate EM responses at the locations of the EM receivers. As a result, the mesh refinement is highly efficient since it only targets the elements where the inaccuracy of the solution corrupts the response at the possibly distant locations of the EM receivers. We compare the accuracy and efficiency of two approaches for estimating the primary residual error required at the core of this method: one uses local element and inter-element residuals and the other relies on solving a global residual system using a hierarchical basis. For computational efficiency our method follows the Bank-Holst algorithm for parallelization, where solutions are computed in subdomains of the original model. To resolve the load-balancing problem, this approach applies a spectral bisection method to divide the entire model into subdomains that have approximately equal error and the same number of receivers. The finite element solutions are then computed in parallel with each subdomain carrying out goal-oriented adaptive mesh refinement independently. 
We validate the newly developed algorithm by comparison with controlled-source EM solutions for 1D layered models and with 2D results from our earlier 2D goal-oriented adaptive refinement code, MARE2DEM. We demonstrate the performance and parallel scaling of this algorithm on a medium-scale computing cluster with a marine controlled-source EM example that includes a 3D array of receivers located over a 3D model with significant seafloor bathymetry variations and a heterogeneous subsurface.
Method for six-legged robot stepping on obstacles by indirect force estimation
NASA Astrophysics Data System (ADS)
Xu, Yilin; Gao, Feng; Pan, Yang; Chai, Xun
2016-07-01
Adaptive gaits for legged robots often require force sensors installed on the foot tips; however, impact, temperature, or humidity can affect or even damage those sensors. Efforts have been made to realize indirect force estimation on legged robots using leg structures based on planar mechanisms. Robot Octopus III is a six-legged robot using spatial parallel mechanism (UP-2UPS) legs. This paper proposes a novel method to realize indirect force estimation on a walking robot based on a spatial parallel mechanism. The direct kinematics model and the inverse kinematics model are established. The force Jacobian matrix is derived based on the kinematics model. Thus, the indirect force estimation model is established. Then, the relation between the output torques of the three motors installed on one leg and the external force exerted on the foot tip is described. Furthermore, an adaptive tripod static gait is designed. The robot alters its leg trajectory to step on obstacles by using the proposed adaptive gait. Both the indirect force estimation model and the adaptive gait are implemented and optimized in a real-time control system. An experiment is carried out to validate the indirect force estimation model. The adaptive gait is tested in another experiment. Experimental results show that the robot can successfully step on a 0.2 m-high obstacle. This paper proposes a novel method to overcome obstacles for the six-legged robot using spatial parallel mechanism legs and to avoid installing electric force sensors in the harsh environment of the robot's foot tips.
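The statics behind indirect force estimation can be illustrated with a minimal sketch. It assumes the standard virtual-work relation tau = J^T F, where J maps the three motor rates of one leg to the foot-tip velocity, so the foot-tip force follows from the motor torques by solving a 3x3 linear system. The Jacobian entries below are made up for illustration and are not those of Octopus III; the paper's force Jacobian may be defined differently.

```python
def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("Jacobian is singular at this configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def transpose(a):
    return [list(col) for col in zip(*a)]


def estimate_foot_force(jacobian, torques):
    """Estimate the foot-tip force F from motor torques, assuming tau = J^T F."""
    return solve3(transpose(jacobian), torques)


# Hypothetical leg Jacobian at one configuration (illustrative values only).
J = [[0.30, 0.00, 0.10],
     [0.00, 0.25, 0.05],
     [0.10, 0.00, 0.40]]
F_true = [5.0, -2.0, 30.0]  # assumed external force on the foot tip, in N
# Torques the motors would have to produce to balance F_true: tau = J^T F.
tau = [sum(J[k][i] * F_true[k] for k in range(3)) for i in range(3)]
F_est = estimate_foot_force(J, tau)
```

In practice the torques would come from motor current measurements, and the estimate degrades near singular configurations, which is why the solver raises an error when the pivot vanishes.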
DOE Office of Scientific and Technical Information (OSTI.GOV)
2017-05-17
PeleC is an adaptive-mesh compressible hydrodynamics code for reacting flows. It solves the compressible Navier-Stokes equations with multispecies transport in a block-structured framework. The resulting algorithm is well suited for flows with localized resolution requirements and is robust to discontinuities. User-controllable refinement criteria have the potential to result in extremely small numerical dissipation and dispersion, making this code appropriate for both research and applied usage. The code is built on the AMReX library, which facilitates hierarchical parallelism and manages distributed-memory parallelism. PeleC algorithms are implemented to express shared-memory parallelism.
The Basal Ganglia and Adaptive Motor Control
NASA Astrophysics Data System (ADS)
Graybiel, Ann M.; Aosaki, Toshihiko; Flaherty, Alice W.; Kimura, Minoru
1994-09-01
The basal ganglia are neural structures within the motor and cognitive control circuits in the mammalian forebrain and are interconnected with the neocortex by multiple loops. Dysfunction in these parallel loops caused by damage to the striatum results in major defects in voluntary movement, exemplified in Parkinson's disease and Huntington's disease. These parallel loops have a distributed modular architecture resembling local expert architectures of computational learning models. During sensorimotor learning, such distributed networks may be coordinated by widely spaced striatal interneurons that acquire response properties on the basis of experienced reward.
Multithreaded Model for Dynamic Load Balancing Parallel Adaptive PDE Computations
NASA Technical Reports Server (NTRS)
Chrisochoides, Nikos
1995-01-01
We present a multithreaded model for the dynamic load-balancing of numerical, adaptive computations required for the solution of Partial Differential Equations (PDE's) on multiprocessors. Multithreading is used as a means of exploring concurrency at the processor level in order to tolerate synchronization costs inherent to traditional (non-threaded) parallel adaptive PDE solvers. Our preliminary analysis for parallel, adaptive PDE solvers indicates that multithreading can be used as a mechanism to mask overheads required for the dynamic balancing of processor workloads with computations required for the actual numerical solution of the PDE's. Also, multithreading can simplify the implementation of dynamic load-balancing algorithms, a task that is very difficult for traditional data parallel adaptive PDE computations. Unfortunately, multithreading does not always simplify program complexity, often makes code reuse difficult, and increases software complexity.
Issues in the digital implementation of control compensators. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Moroney, P.
1979-01-01
Techniques developed for the finite-precision implementation of digital filters were used, adapted, and extended for digital feedback compensators, with particular emphasis on steady state, linear-quadratic-Gaussian compensators. Topics covered include: (1) the linear-quadratic-Gaussian problem; (2) compensator structures; (3) architectural issues: serialism, parallelism, and pipelining; (4) finite wordlength effects: quantization noise, quantizing the coefficients, and limit cycles; and (5) the optimization of structures.
NASA Astrophysics Data System (ADS)
Zhang, Quan; Li, Chaodong; Zhang, Jiantao; Zhang, Jianhui
2017-12-01
This paper addresses the dynamic model and active vibration control of a rigid-flexible parallel manipulator with three smart links actuated by three linear ultrasonic motors. To suppress the vibration of the three flexible intermediate links under high speed and acceleration, multiple lead zirconate titanate (PZT) sensors and actuators are mounted in a collocated configuration on each link, forming a smart structure capable of self-sensing and self-actuating. The dynamic characteristics and equations of the flexible link incorporating the PZT sensors and actuators are analyzed and formulated. A smooth adaptive sliding mode based active vibration control is proposed to suppress the vibration of the smart links, and the first and second modes of the three links are targeted for suppression in modal space to avoid the spillover phenomenon. Simulations and experiments are carried out to validate the effectiveness of the smart structures and the proposed control laws. Experimental results show that the vibration of the first mode around 92 Hz and the second mode around 240 Hz of the three smart links is reduced by 64.98%, 59.47%, 62.28% and 45.80%, 36.79%, 33.33%, respectively, which further verifies the multi-mode vibration control ability of the smooth adaptive sliding mode control law.
Fast adaptive composite grid methods on distributed parallel architectures
NASA Technical Reports Server (NTRS)
Lemke, Max; Quinlan, Daniel
1992-01-01
The fast adaptive composite (FAC) grid method is compared with the asynchronous fast adaptive composite method (AFAC) under a variety of conditions, including vectorization and parallelization. Results are given for distributed memory multiprocessor architectures (SUPRENUM, Intel iPSC/2 and iPSC/860). It is shown that the good performance of AFAC and its superiority over FAC in a parallel environment is a property of the algorithm and not dependent on peculiarities of any machine.
Synchronization Of Parallel Discrete Event Simulations
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S.
1992-01-01
Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Parallel Tetrahedral Mesh Adaptation with Dynamic Load Balancing
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Gabow, Harold N.
1999-01-01
The ability to dynamically adapt an unstructured grid is a powerful tool for efficiently solving computational problems with evolving physical features. In this paper, we report on our experience parallelizing an edge-based adaptation scheme, called 3D_TAG, using message passing. Results show excellent speedup when a realistic helicopter rotor mesh is randomly refined. However, performance deteriorates when the mesh is refined using a solution-based error indicator since mesh adaptation for practical problems occurs in a localized region, creating a severe load imbalance. To address this problem, we have developed PLUM, a global dynamic load balancing framework for adaptive numerical computations. Even though PLUM primarily balances processor workloads for the solution phase, it reduces the load imbalance problem within mesh adaptation by repartitioning the mesh after targeting edges for refinement but before the actual subdivision. This dramatically improves the performance of parallel 3D_TAG since refinement occurs in a more load balanced fashion. We also present optimal and heuristic algorithms that, when applied to the default mapping of a parallel repartitioner, significantly reduce the data redistribution overhead. Finally, portability is examined by comparing performance on three state-of-the-art parallel machines.
Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Oliker, Leonid; Sohn, Andrew
1996-01-01
Dynamic mesh adaptation on unstructured grids is a powerful tool for efficiently computing unsteady problems to resolve solution features of interest. Unfortunately, this causes load imbalances among processors on a parallel machine. This paper describes the parallel implementation of a tetrahedral mesh adaption scheme and a new global load balancing method. A heuristic remapping algorithm is presented that assigns partitions to processors such that the redistribution cost is minimized. Results indicate that the parallel performance of the mesh adaption code depends on the nature of the adaption region and show a 35.5X speedup on 64 processors of an SP2 when 35 percent of the mesh is randomly adapted. For large scale scientific computations, our load balancing strategy gives an almost sixfold reduction in solver execution times over non-balanced loads. Furthermore, our heuristic remapper yields processor assignments that are less than 3 percent off the optimal solutions, but requires only 1 percent of the computational time.
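The remapping idea can be sketched with a simple greedy assignment. The sketch assumes a square similarity matrix in which entry [p][j] is the amount of data of new partition j that already resides on processor p; it repeatedly takes the largest remaining entry whose processor and partition are both still free. This is a generic greedy heuristic written for illustration, not necessarily the algorithm used in the paper, and the function names are hypothetical.

```python
def greedy_remap(similarity):
    """Map each new partition to one processor, keeping as much data in place as possible.

    similarity[p][j] is the amount of data of new partition j already on processor p.
    Returns a dict {partition: processor}.
    """
    n = len(similarity)
    entries = sorted(
        ((similarity[p][j], p, j) for p in range(n) for j in range(n)),
        key=lambda e: -e[0],  # largest overlap first
    )
    proc_used, part_to_proc = set(), {}
    for _amount, p, j in entries:
        if p not in proc_used and j not in part_to_proc:
            part_to_proc[j] = p
            proc_used.add(p)
    return part_to_proc


def data_moved(similarity, part_to_proc):
    """Total data that must be redistributed under the given mapping."""
    total = sum(sum(row) for row in similarity)
    kept = sum(similarity[p][j] for j, p in part_to_proc.items())
    return total - kept


# Illustrative 3-processor example (made-up numbers).
S = [[10, 50, 0],
     [40, 5, 20],
     [15, 0, 60]]
mapping = greedy_remap(S)
moved = data_moved(S, mapping)
```

An optimal mapping could instead be found by solving the underlying assignment problem exactly; the appeal of a greedy pass is that it needs only a sort over the matrix entries.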
Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics
NASA Technical Reports Server (NTRS)
Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak
2015-01-01
This paper presents large-scale MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta scheme and a spatially fifth-order accurate WENO-5Z scheme were used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion computational cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks containing boundaries. Limits to scaling beyond 32k cores are identified, and targeted code optimizations are discussed.
Dynamic grid refinement for partial differential equations on parallel computers
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids to provide adaptive resolution and fast solution of PDEs. An asynchronous version of FAC, called AFAC, that completely eliminates the bottleneck to parallelism is presented. This paper describes the advantage that this algorithm has in adaptive refinement for moving singularities on multiprocessor computers. This work is applicable to the parallel solution of two- and three-dimensional shock tracking problems.
Global magnetosphere simulations using constrained-transport Hall-MHD with CWENO reconstruction
NASA Astrophysics Data System (ADS)
Lin, L.; Germaschewski, K.; Maynard, K. M.; Abbott, S.; Bhattacharjee, A.; Raeder, J.
2013-12-01
We present a new CWENO (Centrally-Weighted Essentially Non-Oscillatory) reconstruction based MHD solver for the OpenGGCM global magnetosphere code. The solver was built using libMRC, a library for creating efficient parallel PDE solvers on structured grids. The use of libMRC gives us access to its core functionality of providing an automated code generation framework which takes a user provided PDE right hand side in symbolic form to generate an efficient, computer architecture specific, parallel code. libMRC also supports block-structured adaptive mesh refinement and implicit-time stepping through integration with the PETSc library. We validate the new CWENO Hall-MHD solver against existing solvers both in standard test problems as well as in global magnetosphere simulations.
Adding dynamic rules to self-organizing fuzzy systems
NASA Technical Reports Server (NTRS)
Buhusi, Catalin V.
1992-01-01
This paper develops a Dynamic Self-Organizing Fuzzy System (DSOFS) capable of adding, removing, and/or adapting the fuzzy rules and the fuzzy reference sets. The DSOFS background consists of a self-organizing neural structure with neuron relocation features which will develop a map of the input-output behavior. The relocation algorithm extends the topological ordering concept. Fuzzy rules (neurons) are dynamically added or released while the neural structure learns the pattern. The DSOFS advantages are the automatic synthesis and the possibility of parallel implementation. A high adaptation speed and a reduced number of neurons is needed in order to keep errors under some limits. The computer simulation results are presented in a nonlinear systems modelling application.
Learning, memory, and the role of neural network architecture.
Hermundstad, Ann M; Brown, Kevin S; Bassett, Danielle S; Carlson, Jean M
2011-06-01
The performance of information processing systems, from artificial neural networks to natural neuronal ensembles, depends heavily on the underlying system architecture. In this study, we compare the performance of parallel and layered network architectures during sequential tasks that require both acquisition and retention of information, thereby identifying tradeoffs between learning and memory processes. During the task of supervised, sequential function approximation, networks produce and adapt representations of external information. Performance is evaluated by statistically analyzing the error in these representations while varying the initial network state, the structure of the external information, and the time given to learn the information. We link performance to complexity in network architecture by characterizing local error landscape curvature. We find that variations in error landscape structure give rise to tradeoffs in performance; these include the ability of the network to maximize accuracy versus minimize inaccuracy and produce specific versus generalizable representations of information. Parallel networks generate smooth error landscapes with deep, narrow minima, enabling them to find highly specific representations given sufficient time. While accurate, however, these representations are difficult to generalize. In contrast, layered networks generate rough error landscapes with a variety of local minima, allowing them to quickly find coarse representations. Although less accurate, these representations are easily adaptable. The presence of measurable performance tradeoffs in both layered and parallel networks has implications for understanding the behavior of a wide variety of natural and artificial learning systems.
Multiscale Simulations of Magnetic Island Coalescence
NASA Technical Reports Server (NTRS)
Dorelli, John C.
2010-01-01
We describe a new interactive parallel Adaptive Mesh Refinement (AMR) framework written in the Python programming language. This new framework, PyAMR, hides the details of parallel AMR data structures and algorithms (e.g., domain decomposition, grid partition, and inter-process communication), allowing the user to focus on the development of algorithms for advancing the solution of a system of partial differential equations on a single uniform mesh. We demonstrate the use of PyAMR by simulating the pairwise coalescence of magnetic islands using the resistive Hall MHD equations. Techniques for coupling different physics models on different levels of the AMR grid hierarchy are discussed.
A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rajbhandari, Samyam; Kim, Jinsung; Krishnamoorthy, Sriram
This paper describes the design and implementation of a layered domain-specific compiler to support MADNESS (Multiresolution ADaptive Numerical Environment for Scientific Simulation). MADNESS is a high-level software environment for the solution of integral and differential equations in many dimensions, using adaptive and fast harmonic analysis methods with guaranteed precision. MADNESS uses k-d trees to represent spatial functions and implements operators like addition, multiplication, differentiation, and integration on the numerical representation of functions. The MADNESS runtime system provides global namespace support and a task-based execution model including futures. MADNESS is currently deployed on massively parallel supercomputers and has enabled many science advances. Due to the highly irregular and statically unpredictable structure of the k-d trees representing the spatial functions encountered in MADNESS applications, only purely runtime approaches to optimization have previously been implemented in the MADNESS framework. This paper describes a layered domain-specific compiler developed to address some performance bottlenecks in MADNESS. The newly developed static compile-time optimizations, in conjunction with the MADNESS runtime support, enable significant performance improvement for the MADNESS framework.
Barrett, Tristam; Feola, Giuseppe; Khusnitdinova, Marina; Krylova, Viktoria
2017-01-01
The convergence of climate change and post-Soviet socio-economic and institutional transformations has been underexplored so far, as have the consequences of such convergence on crop agriculture in Central Asia. This paper provides a place-based analysis of constraints and opportunities for adaptation to climate change, with a specific focus on water use, in two districts in southeast Kazakhstan. Data were collected by 2 multi-stakeholder participatory workshops, 21 semi-structured in-depth interviews, and secondary statistical data. The present-day agricultural system is characterised by enduring Soviet-era management structures, but without state inputs that previously sustained agricultural productivity. Low margins of profitability on many privatised farms mean that attempts to implement integrated water management have produced water users associations unable to maintain and upgrade a deteriorating irrigation infrastructure. Although actors engage in tactical adaptation measures, necessary structural adaptation of the irrigation system remains difficult without significant public or private investments. Market-based water management models have been translated ambiguously to this region, which fails to encourage efficient water use and hinders adaptation to water stress. In addition, a mutual interdependence of informal networks and formal institutions characterises both state governance and everyday life in Kazakhstan. Such interdependence simultaneously facilitates operational and tactical adaptation, but hinders structural adaptation, as informal networks exist as a parallel system that achieves substantive outcomes while perpetuating the inertia and incapacity of the state bureaucracy. This article has relevance for critical understanding of integrated water management in practice and adaptation to climate change in post-Soviet institutional settings more broadly.
Optimal Design of Passive Power Filters Based on Pseudo-parallel Genetic Algorithm
NASA Astrophysics Data System (ADS)
Li, Pei; Li, Hongbo; Gao, Nannan; Niu, Lin; Guo, Liangfeng; Pei, Ying; Zhang, Yanyan; Xu, Minmin; Chen, Kerui
2017-05-01
The economic costs together with filter efficiency are taken as targets to optimize the parameters of a passive filter. Furthermore, a method combining the pseudo-parallel genetic algorithm with the adaptive genetic algorithm is adopted in this paper. In the early stages, the pseudo-parallel genetic algorithm is introduced to increase the population diversity, and the adaptive genetic algorithm is used in the late stages to reduce the workload. At the same time, the migration rate of the pseudo-parallel genetic algorithm is improved to change adaptively with population diversity. Simulation results show that the filter designed by the proposed method has a better filtering effect at lower economic cost and can be used in engineering.
NASA Astrophysics Data System (ADS)
Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens
2017-10-01
The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi:http://dx.doi.org/10.17632/7ztzj63r68.1 Licencing provisions: Apache-2.0 Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++ Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
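The parallel tempering step can be illustrated with a minimal sketch. It uses the standard replica-exchange acceptance rule min(1, exp((beta_i - beta_j)(E_i - E_j))), where E is the negative log of the target density and beta is the inverse temperature of a chain. The double-well energy, the fixed temperature ladder, and the fixed Gaussian proposal are assumptions made for illustration; Fast-NPS adapts its proposal kernel, temperature spacing, and number of chains, which this sketch does not attempt.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging the states of two chains."""
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def energy(x):
    """Illustrative double-well energy with minima at x = -1 and x = +1."""
    return 20.0 * (x * x - 1.0) ** 2


def parallel_tempering(betas, n_steps, step=0.3, seed=0):
    """Run one chain per inverse temperature; return samples of the first (coldest) chain."""
    rng = random.Random(seed)
    states = [-1.0] * len(betas)
    cold_samples = []
    for _ in range(n_steps):
        # Local Metropolis update in every chain.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            delta = energy(proposal) - energy(states[k])
            if delta <= 0 or rng.random() < math.exp(-beta * delta):
                states[k] = proposal
        # Propose a swap between one randomly chosen pair of neighbouring chains.
        k = rng.randrange(len(betas) - 1)
        p = swap_probability(betas[k], betas[k + 1],
                             energy(states[k]), energy(states[k + 1]))
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples


samples = parallel_tempering(betas=[1.0, 0.5, 0.25, 0.1], n_steps=5000)
```

The hot chains (small beta) cross the barrier between the two wells easily, and accepted swaps pass those states down to the cold chain, which is how the scheme handles multimodal targets.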
Applying Parallel Adaptive Methods with GeoFEST/PYRAMID to Simulate Earth Surface Crustal Dynamics
NASA Technical Reports Server (NTRS)
Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Glasscoe, Margaret; Donnellan, Andrea; Li, Peggy
2006-01-01
This viewgraph presentation reviews the use of Adaptive Mesh Refinement (AMR) in simulating the Crustal Dynamics of Earth's Surface. AMR simultaneously improves solution quality, time to solution, and computer memory requirements when compared to generating/running on a globally fine mesh. The use of AMR in simulating the dynamics of the Earth's Surface is spurred by future proposed NASA missions, such as InSAR for Earth surface deformation and other measurements. These missions will require support for large-scale adaptive numerical methods using AMR to model observations. AMR was chosen because it has been successful in computational fluid dynamics for predictive simulation of complex flows around complex structures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.
1997-12-31
The aim of the work performed is to develop a 3D parallel program for numerical calculation of gas dynamics problem with heat conductivity on distributed memory computational systems (CS), satisfying the condition of numerical result independence from the number of processors involved. Two basically different approaches to the structure of massive parallel computations have been developed. The first approach uses the 3D data matrix decomposition reconstructed at temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on using a 3D data matrix decomposition not reconstructed during a temporal cycle. The program was developed on 8-processor CS MP-3 made in VNIIEF and was adapted to a massive parallel CS Meiko-2 in LLNL by joint efforts of VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different number of processors up to 256 and the efficiency of parallelization has been evaluated in dependence on processor number and their parameters.
Islam, Mohammad Tariqul; Tanvir Ahmed, Sk.; Zabir, Ishmam; Shahnaz, Celia
2018-01-01
The photoplethysmographic (PPG) signal is gaining popularity for monitoring heart rate in wearable devices because of the simple construction and low cost of the sensor. The task becomes very difficult due to the presence of various motion artefacts. In this study, an algorithm based on a cascade and parallel combination (CPC) of adaptive filters is proposed in order to reduce the effect of motion artefacts. First, preliminary noise reduction is performed by averaging two-channel PPG signals. Next, in order to reduce the effect of motion artefacts, a cascaded filter structure consisting of three cascaded adaptive filter blocks is developed, where three-channel accelerometer signals are used as references to motion artefacts. To further reduce the effect of noise, a scheme based on a convex combination of two such cascaded adaptive noise cancelers is introduced, where two widely used adaptive filters, namely recursive least squares and least mean squares filters, are employed. Heart rates are estimated from the noise-reduced PPG signal in the spectral domain. Finally, an efficient heart rate tracking algorithm is designed based on the nature of the heart rate variability. The performance of the proposed CPC method is tested on a widely used public database. It is found that the proposed method offers very low estimation error and smooth heart rate tracking with a simple algorithmic approach. PMID:29515812
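The adaptive noise cancellation idea can be sketched with a least mean squares (LMS) filter. In this minimal sketch a single reference channel stands in for the accelerometer signals, the synthetic "PPG" is a plain sinusoid, and two LMS cancelers with different step sizes are blended with a fixed convex weight. All of these are simplifying assumptions: the paper cascades three filter blocks over three accelerometer channels and combines an RLS-based and an LMS-based canceler, and its combination weight is not specified here.

```python
import math
import random


def lms_cancel(primary, reference, n_taps=4, mu=0.005):
    """Subtract the LMS estimate of reference-correlated noise from the primary signal."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # current noise estimate
        e = d - y                                   # error signal = cleaned sample
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned


def convex_combine(a, b, lam=0.5):
    """Blend two canceler outputs with a fixed weight lam in [0, 1]."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]


def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


# Synthetic data: a slow sinusoid corrupted by filtered reference noise.
rng = random.Random(1)
n = 4000
clean = [math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
ref = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = [0.8 * ref[k] + 0.3 * (ref[k - 1] if k > 0 else 0.0) for k in range(n)]
primary = [c + v for c, v in zip(clean, noise)]

slow = lms_cancel(primary, ref, mu=0.005)
fast = lms_cancel(primary, ref, mu=0.01)
combined = convex_combine(slow, fast, lam=0.5)

# Compare error power after the filters have had time to converge.
before = mse(primary[2000:], clean[2000:])
after = mse(combined[2000:], clean[2000:])
```

Comparing `before` and `after` on the second half of the record shows how much of the reference-correlated noise is removed once the weights have converged.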
Armour, Brianna L; Barnes, Steve R; Moen, Spencer O; Smith, Eric; Raymond, Amy C; Fairman, James W; Stewart, Lance J; Staker, Bart L; Begley, Darren W; Edwards, Thomas E; Lorimer, Donald D
2013-06-28
Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year (1). Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans (2). Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Disease (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains.
NASA Astrophysics Data System (ADS)
Penner, Joyce E.; Andronova, Natalia; Oehmke, Robert C.; Brown, Jonathan; Stout, Quentin F.; Jablonowski, Christiane; van Leer, Bram; Powell, Kenneth G.; Herzog, Michael
2007-07-01
One of the most important advances needed in global climate models is the development of atmospheric General Circulation Models (GCMs) that can reliably treat convection. Such GCMs require high resolution in local convectively active regions, both in the horizontal and vertical directions. During previous research we have developed an Adaptive Mesh Refinement (AMR) dynamical core that can adapt its grid resolution horizontally. Our approach utilizes a finite volume numerical representation of the partial differential equations with floating Lagrangian vertical coordinates and requires resolving dynamical processes on small spatial scales. For the latter it uses a newly developed general-purpose library, which facilitates 3D block-structured AMR on spherical grids. The library manages neighbor information as the blocks adapt, and handles the parallel communication and load balancing, freeing the user to concentrate on the scientific modeling aspects of their code. In particular, this library defines and manages adaptive blocks on the sphere, provides user interfaces for interpolation routines and supports the communication and load-balancing aspects for parallel applications. We have successfully tested the library in a 2-D (longitude-latitude) implementation. During the past year, we have extended the library to treat adaptive mesh refinement in the vertical direction. Preliminary results are discussed. This research project is characterized by an interdisciplinary approach involving atmospheric science, computer science and mathematical/numerical aspects. The work is done in close collaboration between the Atmospheric Science, Computer Science and Aerospace Engineering Departments at the University of Michigan and NOAA GFDL.
NASA Technical Reports Server (NTRS)
Barnard, Stephen T.; Simon, Horst; Lasinski, T. A. (Technical Monitor)
1994-01-01
The design of a parallel implementation of multilevel recursive spectral bisection is described. The goal is to implement a code that is fast enough to enable dynamic repartitioning of adaptive meshes.
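A single level of spectral bisection can be sketched as follows. The sketch computes an approximation of the Fiedler vector (the eigenvector of the graph Laplacian belonging to the second-smallest eigenvalue) by power iteration on a shifted Laplacian with the constant vector projected out, then splits the vertices at the median. The multilevel coarsening, the recursion over the two halves, and the parallel implementation described in the abstract are all omitted, and the function names are hypothetical.

```python
import math


def fiedler_vector(adj, iters=500):
    """Approximate the Fiedler vector of a connected graph given as adjacency lists."""
    n = len(adj)
    deg = [len(nb) for nb in adj]
    shift = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue
    x = [float(i) for i in range(n)]  # deterministic, non-constant start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]  # project out the constant eigenvector
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        # Apply (shift * I - L), whose dominant remaining eigenvector is the Fiedler vector.
        x = [shift * x[i] - (deg[i] * x[i] - sum(x[j] for j in adj[i]))
             for i in range(n)]
    mean = sum(x) / n
    return [v - mean for v in x]


def spectral_bisect(adj):
    """Split the vertices into two equal-sized halves at the median Fiedler value."""
    x = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: x[i])
    half = len(adj) // 2
    return set(order[:half]), set(order[half:])


def cut_edges(adj, part):
    """Count edges with exactly one endpoint in part."""
    return sum(1 for i in part for j in adj[i] if j not in part)


# Two triangles joined by the single edge 2-3.
graph = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
left, right = spectral_bisect(graph)
cut = cut_edges(graph, left)
```

Applying the same split recursively to each half yields 4, 8, and more partitions; production codes use Lanczos-type eigensolvers rather than plain power iteration.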
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
Engel, Philipp; Salzburger, Walter; Liesch, Marius; Chang, Chao-Chin; Maruyama, Soichi; Lanz, Christa; Calteau, Alexandra; Lajus, Aurélie; Médigue, Claudine; Schuster, Stephan C; Dehio, Christoph
2011-02-10
Adaptive radiation is the rapid origination of multiple species from a single ancestor as the result of concurrent adaptation to disparate environments. This fundamental evolutionary process is considered to be responsible for the genesis of a great portion of the diversity of life. Bacteria have evolved enormous biological diversity by exploiting an exceptional range of environments, yet diversification of bacteria via adaptive radiation has been documented in a few cases only and the underlying molecular mechanisms are largely unknown. Here we show a compelling example of adaptive radiation in pathogenic bacteria and reveal their genetic basis. Our evolutionary genomic analyses of the α-proteobacterial genus Bartonella uncover two parallel adaptive radiations within these host-restricted mammalian pathogens. We identify a horizontally-acquired protein secretion system, which has evolved to target specific bacterial effector proteins into host cells as the evolutionary key innovation triggering these parallel adaptive radiations. We show that the functional versatility and adaptive potential of the VirB type IV secretion system (T4SS), and thereby translocated Bartonella effector proteins (Beps), evolved in parallel in the two lineages prior to their radiations. Independent chromosomal fixation of the virB operon and consecutive rounds of lineage-specific bep gene duplications followed by their functional diversification characterize these parallel evolutionary trajectories. Whereas most Beps maintained their ancestral domain constitution, strikingly, a novel type of effector protein emerged convergently in both lineages. This resulted in similar arrays of host cell-targeted effector proteins in the two lineages of Bartonella as the basis of their independent radiation. The parallel molecular evolution of the VirB/Bep system displays a striking example of a key innovation involved in independent adaptive processes and the emergence of bacterial pathogens. 
Furthermore, our study highlights the remarkable evolvability of T4SSs and their effector proteins, explaining their broad application in bacterial interactions with the environment.
Engel, Philipp; Salzburger, Walter; Liesch, Marius; Chang, Chao-Chin; Maruyama, Soichi; Lanz, Christa; Calteau, Alexandra; Lajus, Aurélie; Médigue, Claudine; Schuster, Stephan C.; Dehio, Christoph
2011-01-01
Adaptive radiation is the rapid origination of multiple species from a single ancestor as the result of concurrent adaptation to disparate environments. This fundamental evolutionary process is considered to be responsible for the genesis of a great portion of the diversity of life. Bacteria have evolved enormous biological diversity by exploiting an exceptional range of environments, yet diversification of bacteria via adaptive radiation has been documented in a few cases only and the underlying molecular mechanisms are largely unknown. Here we show a compelling example of adaptive radiation in pathogenic bacteria and reveal their genetic basis. Our evolutionary genomic analyses of the α-proteobacterial genus Bartonella uncover two parallel adaptive radiations within these host-restricted mammalian pathogens. We identify a horizontally-acquired protein secretion system, which has evolved to target specific bacterial effector proteins into host cells as the evolutionary key innovation triggering these parallel adaptive radiations. We show that the functional versatility and adaptive potential of the VirB type IV secretion system (T4SS), and thereby translocated Bartonella effector proteins (Beps), evolved in parallel in the two lineages prior to their radiations. Independent chromosomal fixation of the virB operon and consecutive rounds of lineage-specific bep gene duplications followed by their functional diversification characterize these parallel evolutionary trajectories. Whereas most Beps maintained their ancestral domain constitution, strikingly, a novel type of effector protein emerged convergently in both lineages. This resulted in similar arrays of host cell-targeted effector proteins in the two lineages of Bartonella as the basis of their independent radiation. The parallel molecular evolution of the VirB/Bep system displays a striking example of a key innovation involved in independent adaptive processes and the emergence of bacterial pathogens. 
Furthermore, our study highlights the remarkable evolvability of T4SSs and their effector proteins, explaining their broad application in bacterial interactions with the environment. PMID:21347280
NASA Technical Reports Server (NTRS)
Ross, Muriel D.
2003-01-01
In a letter to Robert Hooke, written on 5 February, 1675, Isaac Newton wrote "If I have seen further than certain other men it is by standing upon the shoulders of giants." In his context, Newton was referring to the work of Galileo and Kepler, who preceded him. However, every field has its own giants, those men and women who went before us and, often with few tools at their disposal, uncovered the facts that enabled later researchers to advance knowledge in a particular area. This review traces the history of the evolution of views from early giants in the field of vestibular research to modern concepts of vestibular organ organization and function. Emphasis will be placed on the mammalian maculae as peripheral processors of linear accelerations acting on the head. This review shows that early, correct findings were sometimes unfortunately disregarded, impeding later investigations into the structure and function of the vestibular organs. The central themes are that the macular organs are highly complex, dynamic, adaptive, distributed parallel processors of information, and that historical references can help us to understand our own place in advancing knowledge about their complicated structure and functions.
Haptic adaptation to slant: No transfer between exploration modes
van Dam, Loes C. J.; Plaisier, Myrthe A.; Glowania, Catharina; Ernst, Marc O.
2016-01-01
Human touch is an inherently active sense: to estimate an object’s shape humans often move their hand across its surface. This way the object is sampled both in a serial (sampling different parts of the object across time) and parallel fashion (sampling using different parts of the hand simultaneously). Both the serial (moving a single finger) and parallel (static contact with the entire hand) exploration modes provide reliable and similar global shape information, suggesting the possibility that this information is shared early in the sensory cortex. In contrast, we here show the opposite. Using an adaptation-and-transfer paradigm, a change in haptic perception was induced by slant-adaptation using either the serial or parallel exploration mode. A unified shape-based coding would predict that this would equally affect perception using other exploration modes. However, we found that adaptation-induced perceptual changes did not transfer between exploration modes. Instead, serial and parallel exploration components adapted simultaneously, but to different kinaesthetic aspects of exploration behaviour rather than object-shape per se. These results indicate that a potential combination of information from different exploration modes can only occur at down-stream cortical processing stages, at which adaptation is no longer effective. PMID:27698392
Thorpe, Roger S; Barlow, Axel; Malhotra, Anita; Surget-Groba, Yann
2015-03-01
Global warming will impact species in a number of ways, and it is important to know the extent to which natural populations can adapt to anthropogenic climate change by natural selection. Parallel microevolution within separate species can demonstrate natural selection, but several studies of homoplasy have not yet revealed examples of widespread parallel evolution in a generic radiation. Taking into account primary phylogeographic divisions, we investigate numerous quantitative traits (size, shape, scalation, colour pattern and hue) in anole radiations from the mountainous Lesser Antillean islands. Adaptation to climatic differences can lead to very pronounced differences between spatially close populations with all studied traits showing some evidence of parallel evolution. Traits from shape, scalation, pattern and hue (particularly the latter) show widespread evolutionary parallels within these species in response to altitudinal climate variation greater than extreme anthropogenic climate change predicted for 2080. This gives strong evidence of the ability to adapt to climate variation by natural selection throughout this radiation. As anoles can evolve very rapidly, it suggests anthropogenic climate change is likely to be less of a conservation threat than other factors, such as habitat loss and invasive species, in this, Lesser Antillean, biodiversity hot spot. © 2015 John Wiley & Sons Ltd.
Particle-in-cell simulations on graphic processing units
NASA Astrophysics Data System (ADS)
Ren, C.; Zhou, X.; Li, J.; Huang, M. C.; Zhao, Y.
2014-10-01
We will show our recent progress in using GPUs to accelerate the PIC code OSIRIS [Fonseca et al. LNCS 2331, 342 (2002)]. The OSIRIS parallel structure is retained and the computation-intensive kernels are shipped to GPUs. Algorithms for the kernels are adapted for the GPU, including high-order charge-conserving current deposition schemes with little branching and parallel particle sorting [Kong et al., JCP 230, 1676 (2011)]. These algorithms make efficient use of the GPU shared memory. This work was supported by U.S. Department of Energy under Grant No. DE-FC02-04ER54789 and by NSF under Grant No. PHY-1314734.
Error estimation and adaptive mesh refinement for parallel analysis of shell structures
NASA Technical Reports Server (NTRS)
Keating, Scott C.; Felippa, Carlos A.; Park, K. C.
1994-01-01
The formulation and application of element-level, element-independent error indicators is investigated. This research culminates in the development of an error indicator formulation which is derived based on the projection of element deformation onto the intrinsic element displacement modes. The qualifier 'element-level' means that no information from adjacent elements is used for error estimation. This property is ideally suited for obtaining error values and driving adaptive mesh refinements on parallel computers where access to neighboring elements residing on different processors may incur significant overhead. In addition such estimators are insensitive to the presence of physical interfaces and junctures. An error indicator qualifies as 'element-independent' when only visible quantities such as element stiffness and nodal displacements are used to quantify error. Error evaluation at the element level and element independence for the error indicator are highly desired properties for computing error in production-level finite element codes. Four element-level error indicators have been constructed. Two of the indicators are based on variational formulation of the element stiffness and are element-dependent. Their derivations are retained for developmental purposes. The second two indicators mimic and exceed the first two in performance but require no special formulation of the element stiffness mesh refinement which we demonstrate for two dimensional plane stress problems. The parallelizing of substructures and adaptive mesh refinement is discussed and the final error indicator using two-dimensional plane-stress and three-dimensional shell problems is demonstrated.
Periodic activations of behaviours and emotional adaptation in behaviour-based robotics
NASA Astrophysics Data System (ADS)
Burattini, Ernesto; Rossi, Silvia
2010-09-01
The possible modulatory influence of motivations and emotions is of great interest in designing robotic adaptive systems. In this paper, an attempt is made to connect the concept of periodic behaviour activations to emotional modulation, in order to link the variability of behaviours to the circumstances in which they are activated. The impact of emotion is studied, described as timed controlled structures, on simple but conflicting reactive behaviours. Through this approach it is shown that the introduction of such asynchronies in the robot control system may lead to an adaptation in the emergent behaviour without having an explicit action selection mechanism. The emergent behaviours of a simple robot designed with both a parallel and a hierarchical architecture are evaluated and compared.
Parallel fast multipole boundary element method applied to computational homogenization
NASA Astrophysics Data System (ADS)
Ptaszny, Jacek
2018-01-01
In the present work, a fast multipole boundary element method (FMBEM) and a parallel computer code for the 3D elasticity problem are developed and applied to the computational homogenization of a solid containing spherical voids. The system of equations is solved by using the GMRES iterative solver. The boundary of the body is discretized by using quadrilateral serendipity elements with adaptive numerical integration. Operations related to a single GMRES iteration, performed by traversing the corresponding tree structure upwards and downwards, are parallelized by using the OpenMP standard. The assignment of tasks to threads is based on the assumption that the tree nodes at which the moment transformations are initialized can be partitioned into disjoint sets of equal or approximately equal size and assigned to the threads. The achieved speedup as a function of the number of threads is examined.
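The task assignment described above, disjoint node sets of equal or nearly equal size, one per thread, can be sketched as follows. This is only an illustrative Python stand-in for what an OpenMP static schedule would do; `partition_nodes` and its arguments are hypothetical names, not taken from the paper's code.

```python
def partition_nodes(nodes, num_threads):
    """Split a list of tree nodes into num_threads disjoint, contiguous sets
    whose sizes differ by at most one (hypothetical helper for illustration)."""
    base, extra = divmod(len(nodes), num_threads)
    parts, start = [], 0
    for t in range(num_threads):
        # The first `extra` threads take one additional node each.
        size = base + (1 if t < extra else 0)
        parts.append(nodes[start:start + size])
        start += size
    return parts

# Ten nodes over four threads give set sizes 3, 3, 2, 2.
print(partition_nodes(list(range(10)), 4))
```

Equal-sized sets balance the work only if every node costs about the same, which is the assumption the abstract states.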
Simple adaptive control system design for a quadrotor with an internal PFC
NASA Astrophysics Data System (ADS)
Mizumoto, Ikuro; Nakamura, Takuto; Kumon, Makoto; Takagi, Taro
2014-12-01
The paper deals with an adaptive control system design problem for a four-rotor helicopter, or quadrotor. A simple adaptive control design scheme with a parallel feedforward compensator (PFC) in the internal loop of the considered quadrotor will be proposed based on the backstepping strategy. As is well known, the backstepping control strategy is one of the advanced control strategies for nonlinear systems. However, the control algorithm will become complex if the system has higher order relative degrees. We will show that one can skip some design steps of the backstepping method by introducing a PFC in the inner loop of the considered quadrotor, so that the structure of the obtained controller will be simplified and a high gain based adaptive feedback control system will be designed. The effectiveness of the proposed method will be confirmed through numerical simulations.
Predictive wind turbine simulation with an adaptive lattice Boltzmann method for moving boundaries
NASA Astrophysics Data System (ADS)
Deiterding, Ralf; Wood, Stephen L.
2016-09-01
Operating horizontal axis wind turbines create large-scale turbulent wake structures that affect the power output of downwind turbines considerably. The computational prediction of this phenomenon is challenging as efficient low dissipation schemes are necessary that represent the vorticity production by the moving structures accurately and that are able to transport wakes without significant artificial decay over distances of several rotor diameters. We have developed a parallel adaptive lattice Boltzmann method for large eddy simulation of turbulent weakly compressible flows with embedded moving structures that considers these requirements rather naturally and enables first principle simulations of wake-turbine interaction phenomena at reasonable computational costs. The paper describes the employed computational techniques and presents validation simulations for the Mexnext benchmark experiments as well as simulations of the wake propagation in the Scaled Wind Farm Technology (SWIFT) array consisting of three Vestas V27 turbines in triangular arrangement.
DGDFT: A massively parallel method for large scale density functional theory calculations.
Hu, Wei; Lin, Lin; Yang, Chao
2015-09-28
We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.
A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes
NASA Technical Reports Server (NTRS)
Heber, Gerd; Biswas, Rupak; Gao, Guang R.
1999-01-01
Classical mesh partitioning algorithms were designed for rather static situations, and their straightforward application in a dynamical framework may lead to unsatisfactory results, e.g., excessive data migration among processors. Furthermore, special attention should be paid to their amenability to parallelization. In this paper, a novel parallel method for the dynamic partitioning of adaptive unstructured meshes is described. It is based on a linear representation of the mesh using self-avoiding walks.
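One reason a linear representation of the mesh is attractive is that partitioning reduces to cutting a one-dimensional sequence into contiguous pieces. The sketch below assumes the walk is already available as a list of per-element weights in visiting order; `split_walk` and the midpoint rule it uses are assumptions for illustration and are not the algorithm from the paper.

```python
def split_walk(weights, num_parts):
    """Cut a linear ordering of mesh elements into contiguous parts of
    roughly equal total weight (assumes all weights are positive)."""
    total = sum(weights)
    parts = [[] for _ in range(num_parts)]
    running = 0.0
    for idx, w in enumerate(weights):
        # The midpoint of the element's weight interval decides its owner;
        # midpoints increase along the walk, so every part stays contiguous.
        mid = running + w / 2.0
        owner = min(int(mid * num_parts / total), num_parts - 1)
        parts[owner].append(idx)
        running += w
    return parts

# Refined elements (weight 2) near the end shift the cut point.
print(split_walk([1, 1, 1, 1, 2, 2], 2))
```

After the mesh adapts, rerunning the split on the updated weights moves only the elements near the cut points, which is what limits data migration in this kind of scheme.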
Computation of free energy profiles with parallel adaptive dynamics
NASA Astrophysics Data System (ADS)
Lelièvre, Tony; Rousset, Mathias; Stoltz, Gabriel
2007-04-01
We propose a formulation of an adaptive computation of free energy differences, in the adaptive biasing force or nonequilibrium metadynamics spirit, using conditional distributions of samples of configurations which evolve in time. This allows us to present a truly unifying framework for these methods, and to prove convergence results for certain classes of algorithms. From a numerical viewpoint, a parallel implementation of these methods is very natural, the replicas interacting through the reconstructed free energy. We demonstrate how to improve this parallel implementation by resorting to some selection mechanism on the replicas. This is illustrated by computations on a model system of conformational changes.
NASA Technical Reports Server (NTRS)
Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak
2014-01-01
This paper presents one-of-a-kind MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting of the telescope. A temporally fourth-order Runge-Kutta, and spatially fifth-order WENO-5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32,000 cores and 4 billion cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregularities caused by the highly complex geometry. Limits to scaling beyond 32K cores are identified, and targeted code optimizations are discussed.
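Load balancing driven by measured per-block execution time can be illustrated with a simple greedy heuristic: hand the most expensive remaining block to the currently least-loaded rank. This is a generic longest-processing-time sketch under assumed names (`balance_blocks`, `block_times`); the abstract does not say which balancing algorithm the authors use.

```python
import heapq

def balance_blocks(block_times, num_ranks):
    """Assign AMR blocks to ranks using measured execution times.

    block_times: dict mapping a block id to its measured time (hypothetical input).
    Returns a dict mapping each rank to the list of block ids it owns.
    """
    heap = [(0.0, rank) for rank in range(num_ranks)]  # (current load, rank)
    heapq.heapify(heap)
    assignment = {rank: [] for rank in range(num_ranks)}
    # Place the most expensive blocks first, each on the least-loaded rank.
    for block, cost in sorted(block_times.items(), key=lambda kv: kv[1], reverse=True):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(block)
        heapq.heappush(heap, (load + cost, rank))
    return assignment

# Four blocks with unequal costs end up with both ranks carrying 5.0 units.
print(balance_blocks({"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}, 2))
```

Using measured time rather than cell count as the cost is what lets such a scheme account for blocks made more expensive by cut cells or complex geometry.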
NASA Astrophysics Data System (ADS)
Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.
2015-03-01
We present Π4U, an extensible framework, for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models, that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.
Dual-thread parallel control strategy for ophthalmic adaptive optics.
Yu, Yongxin; Zhang, Yuhua
To improve ophthalmic adaptive optics speed and compensate for ocular wavefront aberration of high temporal frequency, the adaptive optics wavefront correction has been implemented with a control scheme including 2 parallel threads; one is dedicated to wavefront detection and the other conducts wavefront reconstruction and compensation. With a custom Shack-Hartmann wavefront sensor that measures the ocular wave aberration with 193 subapertures across the pupil, adaptive optics has achieved a closed loop updating frequency up to 110 Hz, and demonstrated robust compensation for ocular wave aberration up to 50 Hz in an adaptive optics scanning laser ophthalmoscope.
Dual-thread parallel control strategy for ophthalmic adaptive optics
Yu, Yongxin; Zhang, Yuhua
2015-01-01
To improve ophthalmic adaptive optics speed and compensate for ocular wavefront aberration of high temporal frequency, the adaptive optics wavefront correction has been implemented with a control scheme including 2 parallel threads; one is dedicated to wavefront detection and the other conducts wavefront reconstruction and compensation. With a custom Shack-Hartmann wavefront sensor that measures the ocular wave aberration with 193 subapertures across the pupil, adaptive optics has achieved a closed loop updating frequency up to 110 Hz, and demonstrated robust compensation for ocular wave aberration up to 50 Hz in an adaptive optics scanning laser ophthalmoscope. PMID:25866498
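The two-thread arrangement, one thread detecting the wavefront while the other reconstructs and compensates, follows a producer-consumer pattern. The sketch below shows that pattern only; the arithmetic inside each stage is a placeholder, and `run_pipeline` and its frame format are assumptions rather than the authors' control software.

```python
import queue
import threading

def run_pipeline(frames):
    """Run detection and correction in two threads linked by a small queue."""
    handoff = queue.Queue(maxsize=1)  # at most one measurement waiting
    corrections = []

    def detect():
        for frame in frames:
            # Placeholder for sensor readout and slope measurement.
            slopes = [2.0 * v for v in frame]
            handoff.put(slopes)
        handoff.put(None)  # sentinel: no more frames

    def correct():
        while True:
            slopes = handoff.get()
            if slopes is None:
                break
            # Placeholder for reconstruction and mirror commands.
            corrections.append([-s for s in slopes])

    threads = [threading.Thread(target=detect), threading.Thread(target=correct)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return corrections

print(run_pipeline([[1.0, 2.0], [3.0]]))
```

With the stages overlapped, the loop rate is set by the slower stage rather than by the sum of both, which is the kind of gain a two-thread scheme aims for.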
Kinnison, Michael T.
2017-01-01
Phenotypic plasticity is often an adaptation of organisms to cope with temporally or spatially heterogeneous landscapes. Like other adaptations, one would predict that different species, populations, or sexes might thus show some degree of parallel evolution of plasticity, in the form of parallel reaction norms, when exposed to analogous environmental gradients. Indeed, one might even expect parallelism of plasticity to repeatedly evolve in multiple traits responding to the same gradient, resulting in integrated parallelism of plasticity. In this study, we experimentally tested for parallel patterns of predator-mediated plasticity of size, shape, and behavior of 2 species and sexes of mosquitofish. Examination of behavioral trials indicated that the 2 species showed unique patterns of behavioral plasticity, whereas the 2 sexes in each species showed parallel responses. Fish shape showed parallel patterns of plasticity for both sexes and species, albeit males showed evidence of unique plasticity related to reproductive anatomy. Moreover, patterns of shape plasticity due to predator exposure were broadly parallel to what has been depicted for predator-mediated population divergence in other studies (slender bodies, expanded caudal regions, ventrally located eyes, and reduced male gonopodia). We did not find evidence of phenotypic plasticity in fish size for either species or sex. Hence, our findings support broadly integrated parallelism of plasticity for sexes within species and less integrated parallelism for species. We interpret these findings with respect to their potential broader implications for the interacting roles of adaptation and constraint in the evolutionary origins of parallelism of plasticity in general. PMID:29491997
Su, Hao; Dickstein-Fischer, Laurie; Harrington, Kevin; Fu, Qiushi; Lu, Weina; Huang, Haibo; Cole, Gregory; Fischer, Gregory S
2010-01-01
This paper presents the development of a new prismatic actuation approach and its application in human-safe humanoid head design. To reduce actuator output impedance and mitigate unexpected external shock, the prismatic actuation method uses cables to drive a piston with a preloaded spring. By leveraging the advantages of parallel manipulators and cable-driven mechanisms, the developed neck has a parallel manipulator embodiment with two cable-driven limbs embedded with preloaded springs and one passive limb. The eye mechanism is adapted for a low-cost webcam with a succinct "ball-in-socket" structure. Based on human head anatomy and biomimetics, the neck has 3 degree-of-freedom (DOF) motion: pan, tilt and one decoupled roll, while each eye has independent pan and synchronous tilt motion (3 DOF eyes). A Kalman filter based face tracking algorithm is implemented to interact with the human. This neck and eye structure is translatable to other human-safe humanoid robots. The robot's appearance reflects a non-threatening image of a penguin, which can be translated into a possible therapeutic intervention for children with Autism Spectrum Disorders.
Moen, Spencer O.; Smith, Eric; Raymond, Amy C.; Fairman, James W.; Stewart, Lance J.; Staker, Bart L.; Begley, Darren W.; Edwards, Thomas E.; Lorimer, Donald D.
2013-01-01
Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year 1. Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans 2. Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Disease (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains. PMID:23851357
A proposal for amending administrative law to facilitate adaptive management
NASA Astrophysics Data System (ADS)
Craig, Robin K.; Ruhl, J. B.; Brown, Eleanor D.; Williams, Byron K.
2017-07-01
In this article we examine how federal agencies use adaptive management. In order for federal agencies to implement adaptive management more successfully, administrative law must adapt to adaptive management, and we propose changes in administrative law that will help to steer the current process out of a dead end. Adaptive management is a form of structured decision making that is widely used in natural resources management. It involves specific steps integrated in an iterative process for adjusting management actions as new information becomes available. Theoretical requirements for adaptive management notwithstanding, federal agency decision making is subject to the requirements of the federal Administrative Procedure Act, and state agencies are subject to the states’ parallel statutes. We argue that conventional administrative law has unnecessarily shackled effective use of adaptive management. We show that through a specialized ‘adaptive management track’ of administrative procedures, the core values of administrative law—especially public participation, judicial review, and finality— can be implemented in ways that allow for more effective adaptive management. We present and explain draft model legislation (the Model Adaptive Management Procedure Act) that would create such a track for the specific types of agency decision making that could benefit from adaptive management.
A proposal for amending administrative law to facilitate adaptive management
Craig, Robin K.; Ruhl, J.B.; Brown, Eleanor D.; Williams, Byron K.
2017-01-01
In this article we examine how federal agencies use adaptive management. In order for federal agencies to implement adaptive management more successfully, administrative law must adapt to adaptive management, and we propose changes in administrative law that will help to steer the current process out of a dead end. Adaptive management is a form of structured decision making that is widely used in natural resources management. It involves specific steps integrated in an iterative process for adjusting management actions as new information becomes available. Theoretical requirements for adaptive management notwithstanding, federal agency decision making is subject to the requirements of the federal Administrative Procedure Act, and state agencies are subject to the states' parallel statutes. We argue that conventional administrative law has unnecessarily shackled effective use of adaptive management. We show that through a specialized 'adaptive management track' of administrative procedures, the core values of administrative law—especially public participation, judicial review, and finality— can be implemented in ways that allow for more effective adaptive management. We present and explain draft model legislation (the Model Adaptive Management Procedure Act) that would create such a track for the specific types of agency decision making that could benefit from adaptive management.
Tile-based Level of Detail for the Parallel Age
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niski, K; Cohen, J D
Today's PCs incorporate multiple CPUs and GPUs and are easily arranged in clusters for high-performance, interactive graphics. We present an approach based on hierarchical, screen-space tiles to parallelizing rendering with level of detail. Adapt tiles, render tiles, and machine tiles are associated with CPUs, GPUs, and PCs, respectively, to efficiently parallelize the workload with good resource utilization. Adaptive tile sizes provide load balancing while our level of detail system allows total and independent management of the load on CPUs and GPUs. We demonstrate our approach on parallel configurations consisting of both single PCs and a cluster of PCs.
AdiosStMan: Parallelizing Casacore Table Data System using Adaptive IO System
NASA Astrophysics Data System (ADS)
Wang, R.; Harris, C.; Wicenec, A.
2016-07-01
In this paper, we investigate the Casacore Table Data System (CTDS) used in the casacore and CASA libraries, and methods to parallelize it. CTDS provides a storage manager plugin mechanism for third-party developers to design and implement their own CTDS storage managers. Having this in mind, we looked into various storage backend techniques that can possibly enable parallel I/O for CTDS by implementing new storage managers. After carrying out benchmarks showing the excellent parallel I/O throughput of the Adaptive IO System (ADIOS), we implemented an ADIOS-based parallel CTDS storage manager. We then applied the CASA MSTransform frequency split task to verify the ADIOS Storage Manager. We also ran a series of performance tests to examine the I/O throughput in a massively parallel scenario.
Cameron, Chris; Ewara, Emmanuel; Wilson, Florence R; Varu, Abhishek; Dyrda, Peter; Hutton, Brian; Ingham, Michael
2017-11-01
Adaptive trial designs present a methodological challenge when performing network meta-analysis (NMA), as data from such adaptive trial designs differ from conventional parallel design randomized controlled trials (RCTs). We aim to illustrate the importance of considering study design when conducting an NMA. Three NMAs comparing anti-tumor necrosis factor drugs for ulcerative colitis were compared and the analyses replicated using Bayesian NMA. The NMA comprised 3 RCTs comparing 4 treatments (adalimumab 40 mg, golimumab 50 mg, golimumab 100 mg, infliximab 5 mg/kg) and placebo. We investigated the impact of incorporating differences in the study design among the 3 RCTs and presented 3 alternative methods on how to convert outcome data derived from one form of adaptive design to more conventional parallel RCTs. Combining RCT results without considering variations in study design resulted in effect estimates that were biased against golimumab. In contrast, using the 3 alternative methods to convert outcome data from one form of adaptive design to a format more consistent with conventional parallel RCTs facilitated more transparent consideration of differences in study design. This approach is more likely to yield appropriate estimates of comparative efficacy when conducting an NMA, which includes treatments that use an alternative study design. RCTs based on adaptive study designs should not be combined with traditional parallel RCT designs in NMA. We have presented potential approaches to convert data from one form of adaptive design to more conventional parallel RCTs to facilitate transparent and less-biased comparisons.
Beneath the Surface: Intelligence Preparation of the Battlespace for Counterterrorism
2004-11-01
consisting of those sub-systems existing below ground to include subways, sewers, utility structures and others.161 Although 155 Three reasons adapted...activities that provide for governance and basic human needs. Roads, subways, waterways, railroads and sea and airports are a few of the elements of the...recruiting, financing, and service (medicine, food, education) delivery operations. Finally, the concept of avenues has parallels in cyberspace and
Simple adaptive control system design for a quadrotor with an internal PFC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mizumoto, Ikuro; Nakamura, Takuto; Kumon, Makoto
2014-12-10
The paper deals with an adaptive control system design problem for a four rotor helicopter or quadrotor. A simple adaptive control design scheme with a parallel feedforward compensator (PFC) in the internal loop of the considered quadrotor will be proposed based on the backstepping strategy. As is well known, the backstepping control strategy is one of the advanced control strategies for nonlinear systems. However, the control algorithm will become complex if the system has higher order relative degrees. We will show that one can skip some design steps of the backstepping method by introducing a PFC in the inner loop of the considered quadrotor, so that the structure of the obtained controller will be simplified and a high gain based adaptive feedback control system will be designed. The effectiveness of the proposed method will be confirmed through numerical simulations.
An adaptive front tracking technique for three-dimensional transient flows
NASA Astrophysics Data System (ADS)
Galaktionov, O. S.; Anderson, P. D.; Peters, G. W. M.; van de Vosse, F. N.
2000-01-01
An adaptive technique, based on both surface stretching and surface curvature analysis for tracking strongly deforming fluid volumes in three-dimensional flows is presented. The efficiency and accuracy of the technique are demonstrated for two- and three-dimensional flow simulations. For the two-dimensional test example, the results are compared with results obtained using a different tracking approach based on the advection of a passive scalar. Although for both techniques roughly the same structures are found, the resolution for the front tracking technique is much higher. In the three-dimensional test example, a spherical blob is tracked in a chaotic mixing flow. For this problem, the accuracy of the adaptive tracking is demonstrated by the volume conservation for the advected blob. Adaptive front tracking is suitable for simulation of the initial stages of fluid mixing, where the interfacial area can grow exponentially with time. The efficiency of the algorithm significantly benefits from parallelization of the code.
Six degree-of-freedom scanning supports and manipulators based on parallel robots
NASA Astrophysics Data System (ADS)
Comin, Fabio
1995-02-01
The exploitation of third generation SR sources heavily relies on accurate and stable positioning and scanning of samples and optical elements. In some cases, active feedback is also necessary. Normally, these tasks are carried out by serial addition of individual components, each of them providing a well-defined excursion path. In contrast, the exploitation of the concept of parallel robots, structures forming a closed kinematic chain, permits us to follow any given trajectory in the six-dimensional space with a large increase in accuracy and stiffness. At ESRF, the parallel robot architecture conceived some tens of years ago for flight simulators has been adapted to both actively align and operate optical elements of considerable weight and position small samples in ultrahigh vacuum. The performance of these devices gives results far superior to the initial specification and a variety of drive mechanisms are being developed to fit the different needs of the ESRF beamlines.
IOPA: I/O-aware parallelism adaption for parallel programs
Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei
2017-01-01
With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads. PMID:28278236
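The idea of adapting the number of I/O threads to what the storage system can sustain can be sketched as a simple hill-climbing loop. This is only an illustration under stated assumptions, not IOPA's actual algorithm or programming interface: the function `tune_io_threads`, the doubling probe, and the synthetic throughput model `synthetic_disk` are all hypothetical.

```python
from typing import Callable


def tune_io_threads(measure: Callable[[int], float],
                    start: int = 1,
                    max_threads: int = 64) -> int:
    """Hill-climb on the thread count until throughput stops improving.

    `measure(n)` is a hypothetical callback that returns the observed
    I/O throughput (e.g. MB/s) when n threads read or write at once.
    """
    best_n = start
    best_rate = measure(start)
    n = start
    while n < max_threads:
        n = min(n * 2, max_threads)      # probe a larger thread count
        rate = measure(n)
        if rate <= best_rate:            # the I/O subsystem is saturated
            break
        best_n, best_rate = n, rate
    return best_n


def synthetic_disk(n: int) -> float:
    """Toy model: throughput scales up to 8 threads, then contention hurts."""
    return 100.0 * min(n, 8) - 5.0 * max(0, n - 8)


chosen = tune_io_threads(synthetic_disk)
```

With the toy model above, the loop settles on 8 threads, the point where adding more threads would reduce throughput. A real controller would measure throughput at run time and re-tune as the workload changes.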
Parallel kinematic mechanisms for distributed actuation of future structures
NASA Astrophysics Data System (ADS)
Lai, G.; Plummer, A. R.; Cleaver, D. J.; Zhou, H.
2016-09-01
Future machines will require distributed actuation integrated with load-bearing structures, so that they are lighter, move faster, use less energy, and are more adaptable. Good examples are shape-changing aircraft wings which can adapt precisely to the ideal aerodynamic form for current flying conditions, and light but powerful robotic manipulators which can interact safely with human co-workers. A 'tensegrity structure' is a good candidate for this application due to its potentially excellent stiffness and strength-to-weight ratio and a multi-element structure into which actuators could be embedded. This paper presents results of an analysis of an example practical actuated tensegrity structure consisting of 3 ‘unit cells’. A numerical method is used to determine the stability of the structure with varying actuator length, showing how four actuators can be used to control movement in three degrees of freedom as well as simultaneously maintaining the structural pre-load. An experimental prototype has been built, in which 4 pneumatic artificial muscles (PAMs) are embedded in one unit cell. The PAMs are controlled antagonistically, by high speed switching of on-off valves, to achieve control of position and structure pre-load. Experimental and simulation results are presented, and future prospects for the approach are discussed.
NASA Astrophysics Data System (ADS)
Ferrando, N.; Gosálvez, M. A.; Cerdá, J.; Gadea, R.; Sato, K.
2011-03-01
Presently, dynamic surface-based models are required to contain increasingly larger numbers of points and to propagate them over longer time periods. For large numbers of surface points, the octree data structure can be used as a balance between low memory occupation and relatively rapid access to the stored data. For evolution rules that depend on neighborhood states, extended simulation periods can be obtained by using simplified atomistic propagation models, such as the Cellular Automata (CA). This method, however, has an intrinsic parallel updating nature and the corresponding simulations are highly inefficient when performed on classical Central Processing Units (CPUs), which are designed for the sequential execution of tasks. In this paper, a series of guidelines is presented for the efficient adaptation of octree-based, CA simulations of complex, evolving surfaces into massively parallel computing hardware. A Graphics Processing Unit (GPU) is used as a cost-efficient example of the parallel architectures. For the actual simulations, we consider the surface propagation during anisotropic wet chemical etching of silicon as a computationally challenging process with a wide-spread use in microengineering applications. A continuous CA model that is intrinsically parallel in nature is used for the time evolution. Our study strongly indicates that parallel computations of dynamically evolving surfaces simulated using CA methods are significantly benefited by the incorporation of octrees as support data structures, substantially decreasing the overall computational time and memory usage.
Jeukens, Julie; Bittner, David; Knudsen, Rune; Bernatchez, Louis
2009-01-01
In the past 40 years, there has been increasing acceptance that variation in levels of gene expression represents a major source of evolutionary novelty. Gene expression divergence is therefore likely to be involved in the emergence of incipient species, namely, in a context of adaptive radiation. In the lake whitefish species complex (Coregonus clupeaformis), previous microarray experiments have led to the identification of candidate genes potentially implicated in the parallel evolution of the limnetic dwarf lake whitefish, which is highly distinct from the benthic normal lake whitefish in life history, morphology, metabolism, and behavior, and yet diverged from it only approximately 15,000 years before present. The aim of the present study was to address transcriptional divergence for six candidate genes among lake whitefish and European whitefish (Coregonus lavaretus) species pairs, as well as lake cisco (Coregonus artedi) and vendace (Coregonus albula). The main goal was to test the hypothesis that parallel phenotypic adaptation toward the use of the limnetic niche in coregonine fishes is accompanied by parallelism in candidate gene transcription as measured by quantitative real-time polymerase chain reaction. Results obtained for three candidate genes, whereby parallelism in expression was observed across all whitefish species pairs, provide strong support for the hypothesis that divergent natural selection plays an important role in the adaptive radiation of whitefish species. However, this parallelism in expression did not extend to cisco and vendace, thereby infirming transcriptional convergence between limnetic whitefish species and their limnetic congeners for these genes. As recently proposed (Lynch 2007a. The evolution of genetic networks by non-adaptive processes. Nat Rev Genet. 8:803-813), these results may suggest that convergent phenotypic evolution can result from nonadaptive shaping of genome architecture in independently evolved coregonine lineages.
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0
NASA Technical Reports Server (NTRS)
Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine
2004-01-01
We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
Wavelet Transforms in Parallel Image Processing
1994-01-27
NUMBER OF PAGES Object Segmentation, Texture Segmentation, Image Compression, Image 137 Halftoning, Neural Network, Parallel Algorithms, 2D and 3D...Vector Quantization of Wavelet Transform Coefficients ........ ............................. 57 B.1.f Adaptive Image Halftoning based on Wavelet...application has been directed to the adaptive image halftoning. The gray information at a pixel, including its gray value and gradient, is represented by
ALEGRA -- A massively parallel h-adaptive code for solid dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Summers, R.M.; Wong, M.K.; Boucheron, E.A.
1997-12-31
ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.
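The definition of h-adaptivity given above (subdividing elements to shrink the characteristic element size where the error is large) can be illustrated in one dimension. This is a minimal sketch and not ALEGRA code: the function `refine`, the interval representation of elements, and the error indicator `toy_indicator` are hypothetical choices made for the example.

```python
from typing import Callable, List, Tuple

Element = Tuple[float, float]  # a 1D element stored as (left, right)


def refine(mesh: List[Element],
           error: Callable[[Element], float],
           tol: float) -> List[Element]:
    """Split every element whose error indicator exceeds `tol` in half."""
    refined: List[Element] = []
    for a, b in mesh:
        if error((a, b)) > tol:
            mid = 0.5 * (a + b)
            refined.extend([(a, mid), (mid, b)])   # two child elements
        else:
            refined.append((a, b))                 # keep the element as is
    return refined


def toy_indicator(elem: Element) -> float:
    """Hypothetical indicator: element size, weighted up near x = 0."""
    a, b = elem
    return (b - a) / (0.1 + a)


mesh = [(0.0, 0.5), (0.5, 1.0)]
mesh = refine(mesh, toy_indicator, tol=1.0)
```

After one pass only the element next to x = 0 is split, so resolution is added where the indicator is large while the rest of the mesh stays coarse. Repeating the pass until no element exceeds the tolerance gives a graded mesh.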
Vertical Scan (V-SCAN) for 3-D Grid Adaptive Mesh Refinement for an atmospheric Model Dynamical Core
NASA Astrophysics Data System (ADS)
Andronova, N. G.; Vandenberg, D.; Oehmke, R.; Stout, Q. F.; Penner, J. E.
2009-12-01
One of the major building blocks of a rigorous representation of cloud evolution in global atmospheric models is a parallel adaptive grid MPI-based communication library (an Adaptive Blocks for Locally Cartesian Topologies library -- ABLCarT), which manages the block-structured data layout, handles ghost cell updates among neighboring blocks and splits a block as refinements occur. The library has several modules that provide a layer of abstraction for adaptive refinement: blocks, which contain individual cells of user data; shells - the global geometry for the problem, including a sphere, reduced sphere, and now a 3D sphere; a load balancer for placement of blocks onto processors; and a communication support layer which encapsulates all data movement. A major performance concern with adaptive mesh refinement is how to represent calculations that need to be sequenced in a particular order along a direction, such as calculating integrals along a specific path (e.g. atmospheric pressure or geopotential in the vertical dimension). This concern is compounded if the blocks have varying levels of refinement, or are scattered across different processors, as can be the case in parallel computing. In this paper we describe an implementation in ABLCarT of a vertical scan operation, which allows computing along vertical paths in the correct order across blocks transparent to their resolution and processor location. We test this functionality on a 2D and a 3D advection problem, which tests the performance of the model’s dynamics (transport) and physics (sources and sinks) for different model resolutions needed for inclusion of cloud formation.
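The ordering problem behind a vertical scan can be illustrated with a serial sketch that accumulates a hydrostatic pressure integral down a column of blocks with different resolutions. This is not the ABLCarT interface: the `Block` class, the `vertical_scan` function, and the constant-density test column are hypothetical, and a parallel version would pass the running total between processors instead of looping in one process.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    """Hypothetical column block; cells are ordered from top to bottom."""
    block_id: str
    z_top: float              # height of the block's upper face
    dz: float                 # cell thickness (encodes refinement level)
    density: List[float]      # one value per cell


def vertical_scan(blocks: List[Block], g: float = 9.81) -> Dict[str, List[float]]:
    """Accumulate pressure downward across a column of blocks.

    Blocks are visited from highest to lowest regardless of the order
    in which they are supplied; the running total is carried from one
    block into the next.
    """
    pressure: Dict[str, List[float]] = {}
    carry = 0.0                                   # pressure at the model top
    for blk in sorted(blocks, key=lambda b: b.z_top, reverse=True):
        column = []
        for rho in blk.density:
            carry += rho * g * blk.dz             # weight of this cell
            column.append(carry)                  # pressure at cell bottom
        pressure[blk.block_id] = column
    return pressure


column = [
    Block("coarse_low", z_top=2.0, dz=1.0, density=[1.0, 1.0]),
    Block("fine_high", z_top=4.0, dz=0.5, density=[1.0, 1.0, 1.0, 1.0]),
]
result = vertical_scan(column, g=10.0)
```

The blocks are deliberately listed out of order and at different refinement levels; the scan still produces a monotone pressure profile because the carry is applied in top-to-bottom order.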
Adaptive multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model
NASA Astrophysics Data System (ADS)
Navarro, Cristóbal A.; Huang, Wei; Deng, Youjin
2016-08-01
This work presents an adaptive multi-GPU Exchange Monte Carlo approach for the simulation of the 3D Random Field Ising Model (RFIM). The design is based on a two-level parallelization. The first level, spin-level parallelism, maps the parallel computation as optimal 3D thread-blocks that simulate blocks of spins in shared memory with minimal halo surface, assuming a constant block volume. The second level, replica-level parallelism, uses multi-GPU computation to handle the simulation of an ensemble of replicas. CUDA's concurrent kernel execution feature is used in order to fill the occupancy of each GPU with many replicas, providing a performance boost that is most noticeable at the smallest values of L. In addition to the two-level parallel design, the work proposes an adaptive multi-GPU approach that dynamically builds a proper temperature set free of exchange bottlenecks. The strategy is based on mid-point insertions at the temperature gaps where the exchange rate is most compromised. The extra work generated by the insertions is balanced across the GPUs independently of where the mid-point insertions were performed. Performance results show that spin-level performance is approximately two orders of magnitude faster than a single-core CPU version and one order of magnitude faster than a parallel multi-core CPU version running on 16 cores. Multi-GPU performance is highly convenient under a weak scaling setting, reaching up to 99% efficiency as long as the number of GPUs and L increase together. The combination of the adaptive approach with the parallel multi-GPU design has extended our possibilities of simulation to sizes of L = 32, 64 for a workstation with two GPUs. Sizes beyond L = 64 can eventually be studied using larger multi-GPU systems.
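The mid-point insertion strategy described above can be sketched in a few lines. The sketch assumes that an exchange (swap acceptance) rate has already been measured for every pair of adjacent temperatures and that the arithmetic mean is used as the mid-point; both are assumptions for illustration, and the function `insert_midpoints` is hypothetical rather than taken from the paper.

```python
from typing import List


def insert_midpoints(temps: List[float],
                     exchange_rates: List[float],
                     n_insert: int = 1) -> List[float]:
    """Return a new sorted temperature set with mid-points added.

    `exchange_rates[i]` is the measured swap acceptance rate between
    temps[i] and temps[i + 1]. A mid-point is inserted into each of
    the `n_insert` gaps with the lowest rate.
    """
    if len(exchange_rates) != len(temps) - 1:
        raise ValueError("need one exchange rate per adjacent pair")
    # Indices of the gaps, worst exchange rate first.
    worst = sorted(range(len(exchange_rates)),
                   key=lambda i: exchange_rates[i])[:n_insert]
    new_temps = list(temps)
    for i in worst:
        new_temps.append(0.5 * (temps[i] + temps[i + 1]))
    return sorted(new_temps)


temps = [1.0, 2.0, 4.0, 8.0]
rates = [0.40, 0.05, 0.30]     # the 2.0 to 4.0 gap is the bottleneck
refined = insert_midpoints(temps, rates)
```

Here the gap between 2.0 and 4.0 has the lowest exchange rate, so a replica at 3.0 is added. In an adaptive run this step would be repeated after re-measuring the rates, and the new replicas would then be distributed across the GPUs.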
Parallel Evolution of Cold Tolerance within Drosophila melanogaster
Braun, Dylan T.; Lack, Justin B.
2017-01-01
Drosophila melanogaster originated in tropical Africa before expanding into strikingly different temperate climates in Eurasia and beyond. Here, we find elevated cold tolerance in three distinct geographic regions: beyond the well-studied non-African case, we show that populations from the highlands of Ethiopia and South Africa have significantly increased cold tolerance as well. We observe greater cold tolerance in outbred versus inbred flies, but only in populations with higher inversion frequencies. Each cold-adapted population shows lower inversion frequencies than a closely-related warm-adapted population, suggesting that inversion frequencies may decrease with altitude in addition to latitude. Using the FST-based “Population Branch Excess” statistic (PBE), we found only limited evidence for parallel genetic differentiation at the scale of ∼4 kb windows, specifically between Ethiopian and South African cold-adapted populations. And yet, when we looked for single nucleotide polymorphisms (SNPs) with codirectional frequency change in two or three cold-adapted populations, strong genomic enrichments were observed from all comparisons. These findings could reflect an important role for selection on standing genetic variation leading to “soft sweeps”. One SNP showed sufficient codirectional frequency change in all cold-adapted populations to achieve experiment-wide significance: an intronic variant in the synaptic gene Prosap. Another codirectional outlier SNP, at senseless-2, had a strong association with our cold trait measurements, but in the opposite direction as predicted. More generally, proteins involved in neurotransmission were enriched as potential targets of parallel adaptation. The ability to study cold tolerance evolution in a parallel framework will enhance this classic study system for climate adaptation. PMID:27777283
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter; ...
2016-06-30
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
Nature of the water/aromatic parallel alignment interactions.
Mitoraj, Mariusz P; Janjić, Goran V; Medaković, Vesna B; Veljković, Dušan Ž; Michalak, Artur; Zarić, Snežana D; Milčić, Miloš K
2015-01-30
The water/aromatic parallel alignment interactions are interactions where the water molecule or one of its O-H bonds is parallel to the aromatic ring plane. The calculated energies of the interactions are significant, up to ΔE_CCSD(T)(limit) = -2.45 kcal mol⁻¹ at large horizontal displacement, outside the benzene ring and CH bond region. These interactions are stronger than CH···O water/benzene interactions, but weaker than OH···π interactions. To investigate the nature of water/aromatic parallel alignment interactions, energy decomposition methods, symmetry-adapted perturbation theory, and extended transition state-natural orbitals for chemical valence (NOCV), were used. The calculations have shown that, for the complexes at large horizontal displacements, the major contribution to the interaction energy comes from electrostatic interactions between monomers, and for the complexes at small horizontal displacements, dispersion interactions are the dominant binding force. The NOCV-based analysis has shown that in structures with strong interaction energies charge transfer of the type π → σ*(O-H) between the monomers also exists. © 2014 Wiley Periodicals, Inc.
Cerebellarlike corrective model inference engine for manipulation tasks.
Luque, Niceto Rafael; Garrido, Jesús Alberto; Carrillo, Richard Rafael; Coenen, Olivier J-M D; Ros, Eduardo
2011-10-01
This paper presents how a simple cerebellumlike architecture can infer corrective models in the framework of a control task when manipulating objects that significantly affect the dynamics model of the system. The main motivation of this paper is to evaluate a simplified bio-mimetic approach in the framework of a manipulation task. More concretely, the paper focuses on how the model inference process takes place within a feedforward control loop based on the cerebellar structure and on how these internal models are built up by means of biologically plausible synaptic adaptation mechanisms. This kind of investigation may provide clues on how biology achieves accurate control of non-stiff-joint robot with low-power actuators which involve controlling systems with high inertial components. This paper studies how a basic temporal-correlation kernel including long-term depression (LTD) and a constant long-term potentiation (LTP) at parallel fiber-Purkinje cell synapses can effectively infer corrective models. We evaluate how this spike-timing-dependent plasticity correlates sensorimotor activity arriving through the parallel fibers with teaching signals (dependent on error estimates) arriving through the climbing fibers from the inferior olive. This paper addresses the study of how these LTD and LTP components need to be well balanced with each other to achieve accurate learning. This is of interest to evaluate the relevant role of homeostatic mechanisms in biological systems where adaptation occurs in a distributed manner. Furthermore, we illustrate how the temporal-correlation kernel can also work in the presence of transmission delays in sensorimotor pathways. We use a cerebellumlike spiking neural network which stores the corrective models as well-structured weight patterns distributed among the parallel fibers to Purkinje cell connections.
Multicoil resonance-based parallel array for smart wireless power delivery.
Mirbozorgi, S A; Sawan, M; Gosselin, B
2013-01-01
This paper presents a novel resonance-based multicoil structure as a smart power surface to wirelessly power up apparatus like mobile, animal headstage, implanted devices, etc. The proposed powering system is based on a 4-coil resonance-based inductive link, the resonance coil of which is formed by an array of several paralleled coils as a smart power transmitter. The power transmitter employs simple circuit connections and includes only one power driver circuit per multicoil resonance-based array, which enables higher power transfer efficiency and power delivery to the load. The power transmitted by the driver circuit is proportional to the load seen by the individual coil in the array. Thus, the transmitted power scales with respect to the load of the electric/electronic system to power up, and does not divide equally over all the parallel coils that form the array. Instead, only the loaded coils of the parallel array transmit a significant part of the total transmitted power to the receiver. Such adaptive behavior enables superior power, size and cost efficiency than other solutions since it does not need to use complex detection circuitry to find the location of the load. The performance of the proposed structure is verified by measurement results. Natural load detection and covering an area 4 times bigger than conventional topologies with a power transfer efficiency of 55% are the novelties of the presented paper.
Li, Le-Bao; Sun, Ling-Ling; Zhang, Sheng-Zhou; Yang, Qing-Quan
2015-09-01
A new control approach for speed tracking and synchronization of multiple motors is developed, by incorporating an adaptive sliding mode control (ASMC) technique into a ring coupling synchronization control structure. This control approach can stabilize speed tracking of each motor and synchronize its motion with other motors' motion so that speed tracking errors and synchronization errors converge to zero. Moreover, an adaptive law is exploited to estimate the unknown bound of uncertainty, which is obtained in the sense of Lyapunov stability theorem to minimize the control effort and attenuate chattering. Performance comparisons with parallel control, relative coupling control and conventional PI control are investigated on a four-motor synchronization control system. Extensive simulation results show the effectiveness of the proposed control scheme. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Parallel, adaptive finite element methods for conservation laws
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Devine, Karen D.; Flaherty, Joseph E.
1994-01-01
We construct parallel finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. A posteriori estimates of spatial errors are obtained by a p-refinement technique using superconvergence at Radau points. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We compare results using different limiting schemes and demonstrate parallel efficiency through computations on an NCUBE/2 hypercube. We also present results using adaptive h- and p-refinement to reduce the computational cost of the method.
Load Balancing Unstructured Adaptive Grids for CFD Problems
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Oliker, Leonid
1996-01-01
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. A dynamic load balancing method is presented that balances the workload across all processors with a global view. After each parallel tetrahedral mesh adaption, the method first determines if the new mesh is sufficiently unbalanced to warrant a repartitioning. If so, the adapted mesh is repartitioned, with new partitions assigned to processors so that the redistribution cost is minimized. The new partitions are accepted only if the remapping cost is compensated by the improved load balance. Results indicate that this strategy is effective for large-scale scientific computations on distributed-memory multiprocessors.
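The two decisions described in this abstract (whether to repartition, and whether to accept the new partitions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the imbalance metric (heaviest load divided by average load), the threshold value, and the simple cost model are all introduced here for the example, and the function names are hypothetical.

```python
def imbalance_factor(loads):
    """Heaviest processor load divided by the average load (1.0 means perfectly balanced)."""
    average = sum(loads) / len(loads)
    return max(loads) / average


def should_repartition(loads, threshold=1.1):
    """Trigger repartitioning only if the imbalance exceeds an assumed threshold."""
    return imbalance_factor(loads) > threshold


def accept_new_partition(old_loads, new_loads, remap_cost, time_per_unit_load, solver_steps):
    """Accept the new partitions only if the expected solver-time gain exceeds the remapping cost.

    Assumed cost model: each solver step takes as long as the most heavily
    loaded processor needs, so the gain is the reduction in maximum load
    multiplied by the time per unit load and the number of steps before
    the next adaption.
    """
    gain = (max(old_loads) - max(new_loads)) * time_per_unit_load * solver_steps
    return gain > remap_cost


if __name__ == "__main__":
    old = [100, 100, 100, 180]  # element counts per processor after adaption (made up)
    new = [120, 120, 120, 120]  # counts after a candidate repartitioning
    if should_repartition(old):
        print(accept_new_partition(old, new, remap_cost=100, time_per_unit_load=1, solver_steps=5))
```

Separating the trigger from the acceptance test mirrors the two-stage structure in the abstract: a cheap check first, then a cost comparison only when repartitioning is actually considered.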
Kwon, Ronald Y; Meays, Diana R; Meilan, Alexander S; Jones, Jeremiah; Miramontes, Rosa; Kardos, Natalie; Yeh, Jiunn-Chern; Frangos, John A
2012-01-01
Interstitial fluid flow (IFF) is a potent regulatory signal in bone. During mechanical loading, IFF is generated through two distinct mechanisms that result in spatially distinct flow profiles: poroelastic interactions within the lacunar-canalicular system, and intramedullary pressurization. While the former generates IFF primarily within the lacunar-canalicular network, the latter generates significant flow at the endosteal surface as well as within the tissue. This gives rise to the intriguing possibility that loading-induced IFF may differentially activate osteocytes or surface-residing cells depending on the generating mechanism, and that sensation of IFF generated via intramedullary pressurization may be mediated by a non-osteocytic bone cell population. To begin to explore this possibility, we used the Dmp1-HBEGF inducible osteocyte ablation mouse model and a microfluidic system for modulating intramedullary pressure (ImP) to assess whether structural adaptation to ImP-driven IFF is altered by partial osteocyte depletion. Canalicular convective velocities during pressurization were estimated through the use of fluorescence recovery after photobleaching and computational modeling. Following osteocyte ablation, transgenic mice exhibited severe losses in bone structure and altered responses to hindlimb suspension in a compartment-specific manner. In pressure-loaded limbs, transgenic mice displayed similar or significantly enhanced structural adaptation to ImP-driven IFF, particularly in the trabecular compartment, despite up to ∼50% of trabecular lacunae being uninhabited following ablation. Interestingly, regression analysis revealed that relative gains in bone structure in pressure-loaded limbs were correlated with reductions in bone structure in unpressurized control limbs, suggesting that adaptation to ImP-driven IFF was potentiated by increases in osteoclastic activity and/or reductions in osteoblastic activity incurred independently of pressure loading. Collectively, these studies indicate that structural adaptation to ImP-driven IFF can proceed unimpeded following a significant depletion in osteocytes, consistent with the potential existence of a non-osteocytic bone cell population that senses ImP-driven IFF independently of, and potentially in parallel with, osteocytic sensation of poroelasticity-derived IFF.
Banerjee, Amartya S.; Lin, Lin; Hu, Wei; ...
2016-10-21
The Discontinuous Galerkin (DG) electronic structure method employs an adaptive local basis (ALB) set to solve the Kohn-Sham equations of density functional theory in a discontinuous Galerkin framework. The adaptive local basis is generated on-the-fly to capture the local material physics and can systematically attain chemical accuracy with only a few tens of degrees of freedom per atom. A central issue for large-scale calculations, however, is the computation of the electron density (and subsequently, ground state properties) from the discretized Hamiltonian in an efficient and scalable manner. We show in this work how Chebyshev polynomial filtered subspace iteration (CheFSI) can be used to address this issue and push the envelope in large-scale materials simulations in a discontinuous Galerkin framework. We describe how the subspace filtering steps can be performed in an efficient and scalable manner using a two-dimensional parallelization scheme, thanks to the orthogonality of the DG basis set and block-sparse structure of the DG Hamiltonian matrix. The on-the-fly nature of the ALB functions requires additional care in carrying out the subspace iterations. We demonstrate the parallel scalability of the DG-CheFSI approach in calculations of large-scale two-dimensional graphene sheets and bulk three-dimensional lithium-ion electrolyte systems. In conclusion, employing 55 296 computational cores, the time per self-consistent field iteration for a sample of the bulk 3D electrolyte containing 8586 atoms is 90 s, and the time for a graphene sheet containing 11 520 atoms is 75 s.
Direct Machining of Low-Loss THz Waveguide Components With an RF Choke.
Lewis, Samantha M; Nanni, Emilio A; Temkin, Richard J
2014-12-01
We present results for the successful fabrication of low-loss THz metallic waveguide components using direct machining with a CNC end mill. The approach uses a split-block machining process with the addition of an RF choke running parallel to the waveguide. The choke greatly reduces coupling to the parasitic mode of the parallel-plate waveguide produced by the split-block. This method has demonstrated loss as low as 0.2 dB/cm at 280 GHz for a copper WR-3 waveguide. It has also been used in the fabrication of 3 and 10 dB directional couplers in brass, demonstrating excellent agreement with design simulations from 240-260 GHz. The method may be adapted to structures with features on the order of 200 μm.
Big data mining analysis method based on cloud computing
NASA Astrophysics Data System (ADS)
Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao
2017-08-01
In the era of information explosion, the very large scale, discreteness, and unstructured or semi-structured nature of big data have gone far beyond what traditional data management methods can handle. With the arrival of the cloud computing era, cloud computing provides a new technical approach to mining massive data, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs an association rule mining algorithm based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
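The abstract does not spell out the algorithm, so the following is only a sketch of how the support-counting step of association rule mining is commonly expressed in a map/reduce style. It runs in a single process to show the data flow; the function names, the sample transactions, and the Apriori-style counting of k-item subsets are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the emitted counts per itemset, as a reducer would do per key."""
    counts = defaultdict(int)
    for itemset, value in pairs:
        counts[itemset] += value
    return dict(counts)


def frequent_itemsets(transactions, k, min_support):
    """Return the k-itemsets that occur in at least min_support transactions."""
    emitted = []
    for transaction in transactions:
        # In a real MapReduce job each mapper would handle one split of the data.
        emitted.extend(map_phase(transaction, k))
    counts = reduce_phase(emitted)
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    baskets = [["milk", "bread"], ["milk", "bread", "eggs"], ["bread", "eggs"]]
    print(frequent_itemsets(baskets, k=2, min_support=2))
```

Because each transaction is mapped independently and counts are combined only by key, the map calls can run on separate workers, which is the property a MapReduce-based speed-up relies on.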
Extent of QTL Reuse During Repeated Phenotypic Divergence of Sympatric Threespine Stickleback.
Conte, Gina L; Arnegard, Matthew E; Best, Jacob; Chan, Yingguang Frank; Jones, Felicity C; Kingsley, David M; Schluter, Dolph; Peichel, Catherine L
2015-11-01
How predictable is the genetic basis of phenotypic adaptation? Answering this question begins by estimating the repeatability of adaptation at the genetic level. Here, we provide a comprehensive estimate of the repeatability of the genetic basis of adaptive phenotypic evolution in a natural system. We used quantitative trait locus (QTL) mapping to discover genomic regions controlling a large number of morphological traits that have diverged in parallel between pairs of threespine stickleback (Gasterosteus aculeatus species complex) in Paxton and Priest lakes, British Columbia. We found that nearly half of QTL affected the same traits in the same direction in both species pairs. Another 40% influenced a parallel phenotypic trait in one lake but not the other. The remaining 10% of QTL had phenotypic effects in opposite directions in the two species pairs. Similarity in the proportional contributions of all QTL to parallel trait differences was about 0.4. Surprisingly, QTL reuse was unrelated to phenotypic effect size. Our results indicate that repeated use of the same genomic regions is a pervasive feature of parallel phenotypic adaptation, at least in sticklebacks. Identifying the causes of this pattern would aid prediction of the genetic basis of phenotypic evolution. Copyright © 2015 by the Genetics Society of America.
Coelho, V N; Coelho, I M; Souza, M J F; Oliveira, T A; Cota, L P; Haddad, M N; Mladenovic, N; Silva, R C P; Guimarães, F G
2016-01-01
This article presents an Evolution Strategy (ES)-based algorithm, designed to self-adapt its mutation operators, guiding the search into the solution space using a Self-Adaptive Reduced Variable Neighborhood Search procedure. In view of the specific local search operators for each individual, the proposed population-based approach also fits into the context of the Memetic Algorithms. The proposed variant uses the Greedy Randomized Adaptive Search Procedure with different greedy parameters for generating its initial population, providing an interesting exploration-exploitation balance. To validate the proposal, this framework is applied to solve three different NP-hard combinatorial optimization problems: an Open-Pit-Mining Operational Planning Problem with dynamic allocation of trucks, an Unrelated Parallel Machine Scheduling Problem with Setup Times, and the calibration of a hybrid fuzzy model for Short-Term Load Forecasting. Computational results point out the convergence of the proposed model and highlight its ability to combine the application of move operations from distinct neighborhood structures during the optimization. The results gathered and reported in this article represent collective evidence of the performance of the method in challenging combinatorial optimization problems from different application domains. The proposed evolution strategy demonstrates an ability to adapt the strength of the mutation disturbance during the generations of its evolution process. The effectiveness of the proposal motivates the application of this novel evolutionary framework for solving other combinatorial optimization problems.
Parallel processing in the honeybee olfactory pathway: structure, function, and evolution.
Rössler, Wolfgang; Brill, Martin F
2013-11-01
Animals face highly complex and dynamic olfactory stimuli in their natural environments, which require fast and reliable olfactory processing. Parallel processing is a common principle of sensory systems supporting this task, for example in visual and auditory systems, but its role in olfaction remained unclear. Studies in the honeybee focused on a dual olfactory pathway. Two sets of projection neurons connect glomeruli in two antennal-lobe hemilobes via lateral and medial tracts in opposite sequence with the mushroom bodies and lateral horn. Comparative studies suggest that this dual-tract circuit represents a unique adaptation in Hymenoptera. Imaging studies indicate that glomeruli in both hemilobes receive redundant sensory input. Recent simultaneous multi-unit recordings from projection neurons of both tracts revealed widely overlapping response profiles strongly indicating parallel olfactory processing. Whereas lateral-tract neurons respond fast with broad (generalistic) profiles, medial-tract neurons are odorant specific and respond slower. In analogy to "what-" and "where" subsystems in visual pathways, this suggests two parallel olfactory subsystems providing "what-" (quality) and "when" (temporal) information. Temporal response properties may support across-tract coincidence coding in higher centers. Parallel olfactory processing likely enhances perception of complex odorant mixtures to decode the diverse and dynamic olfactory world of a social insect.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, Cameron W.; Granzow, Brian; Diamond, Gerrett; ...
2017-01-01
Unstructured mesh methods, like finite elements and finite volumes, support the effective analysis of complex physical behaviors modeled by partial differential equations over general three-dimensional domains. The most reliable and efficient methods apply adaptive procedures with a-posteriori error estimators that indicate where and how the mesh is to be modified. Although adaptive meshes can have two to three orders of magnitude fewer elements than a more uniform mesh for the same level of accuracy, there are many complex simulations where the meshes required are so large that they can only be solved on massively parallel systems.
Genetic adaptations of the plateau zokor in high-elevation burrows.
Shao, Yong; Li, Jin-Xiu; Ge, Ri-Li; Zhong, Li; Irwin, David M; Murphy, Robert W; Zhang, Ya-Ping
2015-11-25
The plateau zokor (Myospalax baileyi) spends its entire life underground in sealed burrows. Confronting limited oxygen, high carbon dioxide concentrations, and complete darkness, it epitomizes successful physiological adaptation. Here, we employ transcriptome sequencing to explore the genetic underpinnings of its adaptations to this unique habitat. Compared to Rattus norvegicus, genes belonging to GO categories related to energy metabolism (e.g. mitochondrion and fatty acid beta-oxidation) underwent accelerated evolution in the plateau zokor. Furthermore, positively selected genes were significantly enriched in the gene categories involved in ATPase activity, blood vessel development and respiratory gaseous exchange, functional categories that are relevant to adaptation to high altitudes. Among the 787 genes with evidence of parallel evolution, and thus identified as candidate genes, several GO categories (e.g. response to hypoxia, oxygen homeostasis and erythrocyte homeostasis) are significantly enriched. Among these candidates are two genes, EPAS1 and AJUBA, involved in the response to hypoxia, in which the parallel-evolved sites are at positions that are highly conserved in sequence alignments from multiple species. Thus, accelerated evolution of GO categories, positive selection and parallel evolution at the molecular level provide evidence for parsing the genetic adaptations of the plateau zokor to living in high-elevation burrows.
Payen, Celia; Di Rienzi, Sara C; Ong, Giang T; Pogachar, Jamie L; Sanchez, Joseph C; Sunshine, Anna B; Raghuraman, M K; Brewer, Bonita J; Dunham, Maitreya J
2014-03-20
Population adaptation to strong selection can occur through the sequential or parallel accumulation of competing beneficial mutations. The dynamics, diversity, and rate of fixation of beneficial mutations within and between populations are still poorly understood. To study how the mutational landscape varies across populations during adaptation, we performed experimental evolution on seven parallel populations of Saccharomyces cerevisiae continuously cultured in limiting sulfate medium. By combining quantitative polymerase chain reaction, array comparative genomic hybridization, restriction digestion and contour-clamped homogeneous electric field gel electrophoresis, and whole-genome sequencing, we followed the trajectory of evolution to determine the identity and fate of beneficial mutations. During a period of 200 generations, the yeast populations displayed parallel evolutionary dynamics that were driven by the coexistence of independent beneficial mutations. Selective amplifications rapidly evolved under this selection pressure, in particular common inverted amplifications containing the sulfate transporter gene SUL1. Compared with single clones, detailed analysis of the populations uncovers a greater complexity whereby multiple subpopulations arise and compete despite a strong selection. The most common evolutionary adaptation to strong selection in these populations grown in sulfate limitation is determined by clonal interference, with adaptive variants both persisting and replacing one another. PMID:24368781
Tobler, Ray; Hermisson, Joachim; Schlötterer, Christian
2015-01-01
Thermal stress is a pervasive selective agent in natural populations that impacts organismal growth, survival, and reproduction. Drosophila melanogaster exhibits a variety of putatively adaptive phenotypic responses to thermal stress in natural and experimental settings; however, accompanying assessments of fitness are typically lacking. Here, we quantify changes in fitness and known thermal tolerance traits in replicated experimental D. melanogaster populations following more than 40 generations of evolution to either cyclic cold or hot temperatures. By evaluating fitness for both evolved populations alongside a reconstituted starting population, we show that the evolved populations were the best adapted within their respective thermal environments. More strikingly, the evolved populations exhibited increased fitness in both environments and improved resistance to both acute heat and cold stress. This unexpected parallel response appeared to be an adaptation to the rapid temperature changes that drove the cycling thermal regimes, as parallel fitness changes were not observed when tested in a constant thermal environment. Our results add to a small, but growing group of studies that demonstrate the importance of fluctuating temperature changes for thermal adaptation and highlight the need for additional work in this area. PMID:26080903
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
NASA Technical Reports Server (NTRS)
Tsvetov, Y. P.; Razin, S. I.; Rychko, A. V.
1980-01-01
The effect of 2 and 4 week hypokinesia regimens on the hypothalamo-pituitary-adrenal system (HPAS) was investigated in 110 inbred mice. Progressive exhaustion and pathological reorganization of the HPAS morphofunctional structures was revealed. On the basis of established facts of interlineary and interspecies differences in the HPAS response, it is suggested that the animal body response reaction to the long term effects of hypokinesia depends largely on its HPAS resistance and the values of this system's defensive adaptation potential.
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)
2002-01-01
The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature detection based refinement parameters. The multigrid method is extended to time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.
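For reference, the reported speed-up can be converted into a parallel efficiency using the standard definition (speed-up divided by processor count). The abstract itself does not quote an efficiency figure, and the helper name below is illustrative only.

```python
def parallel_efficiency(speedup, processors):
    """Standard parallel efficiency: achieved speed-up divided by the number of processors."""
    return speedup / processors


if __name__ == "__main__":
    # Figures quoted in the abstract: speed-up of 435 on 512 processors.
    print(f"{parallel_efficiency(435, 512):.1%}")
```

Under this definition, a speed-up of 435 on 512 processors corresponds to an efficiency of roughly 85%.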
Impact of ethnicity on cardiac adaptation to exercise.
Sheikh, Nabeel; Sharma, Sanjay
2014-04-01
The increasing globalization of sport has resulted in athletes from a wide range of ethnicities emerging onto the world stage. Fuelled by the untimely death of a number of young professional athletes, data generated from the parallel increase in preparticipation cardiovascular evaluation has indicated that ethnicity has a substantial influence on cardiac adaptation to exercise. From this perspective, the group most intensively studied comprises athletes of African or Afro-Caribbean ethnicity (black athletes), an ever-increasing number of whom are competing at the highest levels of sport and who often exhibit profound electrical and structural cardiac changes in response to exercise. Data on other ethnic cohorts are emerging, but remain incomplete. This Review describes our current knowledge on the impact of ethnicity on cardiac adaptation to exercise, starting with white athletes in whom the physiological electrical and structural changes--collectively termed the 'athlete's heart'--were first described. Discussion of the differences in the cardiac changes between ethnicities, with a focus on black athletes, and of the challenges that these variations can produce for the evaluating physician is also provided. The impact of ethnically mediated changes on preparticipation cardiovascular evaluation is highlighted, particularly with respect to false positive results, and potential genetic mechanisms underlying racial differences in cardiac adaptation to exercise are described.
Chromosome inversions and ecological plasticity in the main African malaria mosquitoes
Ayala, Diego; Acevedo, Pelayo; Pombi, Marco; Dia, Ibrahima; Boccolini, Daniela; Costantini, Carlo; Simard, Frédéric; Fontenille, Didier
2017-01-01
Chromosome inversions have fascinated the scientific community, mainly because of their role in the rapid adaption of different taxa to changing environments. However, the ecological traits linked to chromosome inversions have been poorly studied. Here, we investigated the roles played by 23 chromosome inversions in the adaptation of the four major African malaria mosquitoes to local environments in Africa. We studied their distribution patterns by using spatially explicit modeling and characterized the ecogeographical determinants of each inversion range. We then performed hierarchical clustering and constrained ordination analyses to assess the spatial and ecological similarities among inversions. Our results show that most inversions are environmentally structured, suggesting that they are actively involved in processes of local adaptation. Some inversions exhibited similar geographical patterns and ecological requirements among the four mosquito species, providing evidence for parallel evolution. Conversely, common inversion polymorphisms between sibling species displayed divergent ecological patterns, suggesting that they might have a different adaptive role in each species. These results are in agreement with the finding that chromosomal inversions play a role in Anopheles ecotypic adaptation. This study establishes a strong ecological basis for future genome-based analyses to elucidate the genetic mechanisms of local adaptation in these four mosquitoes. PMID:28071788
NASA Astrophysics Data System (ADS)
Ji, X.; Shen, C.
2017-12-01
Flood inundation presents substantial societal hazards and also changes biogeochemistry for systems like the Amazon. It is often expensive to simulate high-resolution flood inundation and propagation in a long-term watershed-scale model. Due to the Courant-Friedrichs-Lewy (CFL) restriction, high resolution and large local flow velocity both demand prohibitively small time steps even for parallel codes. Here we develop a parallel surface-subsurface process-based model enhanced by multi-resolution meshes that are adaptively switched on or off. The high-resolution overland flow meshes are enabled only when the flood wave invades the floodplains. This model applies a semi-implicit, semi-Lagrangian (SISL) scheme to solve the dynamic wave equations, and with the assistance of the multi-mesh method, it also adaptively applies the dynamic wave equations only in areas of deep inundation. Therefore, the model achieves a balance between accuracy and computational cost.
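The CFL restriction mentioned in this abstract can be made concrete with a short sketch. The abstract names the restriction but gives no formula, so the shallow-water form used here (time step bounded by cell size divided by flow speed plus gravity wave speed) is an assumption, as are the function name and the sample values. It illustrates why an explicit scheme needs small steps, which is the cost the authors' SISL scheme and adaptive meshes aim to reduce.

```python
import math

GRAVITY = 9.81  # gravitational acceleration in m/s^2


def cfl_time_step(cell_size, velocity, depth, courant_number=1.0):
    """Largest stable explicit time step under an assumed shallow-water CFL condition.

    dt <= C * dx / (|u| + sqrt(g * h))
    """
    wave_speed = abs(velocity) + math.sqrt(GRAVITY * depth)
    return courant_number * cell_size / wave_speed


if __name__ == "__main__":
    coarse = cfl_time_step(cell_size=100.0, velocity=2.0, depth=1.0)
    fine = cfl_time_step(cell_size=10.0, velocity=2.0, depth=1.0)
    print(coarse, fine)  # the finer mesh allows a step ten times smaller
```

Both effects named in the abstract appear directly in the formula: a smaller cell size and a larger local velocity each shrink the admissible time step.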
Durham extremely large telescope adaptive optics simulation platform.
Basden, Alastair; Butterley, Timothy; Myers, Richard; Wilson, Richard
2007-03-01
Adaptive optics systems are essential on all large telescopes for which image quality is important. These are complex systems with many design parameters requiring optimization before good performance can be achieved. The simulation of adaptive optics systems is therefore necessary to categorize the expected performance. We describe an adaptive optics simulation platform, developed at Durham University, which can be used to simulate adaptive optics systems on the largest proposed future extremely large telescopes as well as on current systems. This platform is modular, object oriented, and has the benefit of hardware application acceleration that can be used to improve the simulation performance, essential for ensuring that the run time of a given simulation is acceptable. The simulation platform described here can be highly parallelized using parallelization techniques suited for adaptive optics simulation, while still offering the user complete control while the simulation is running. The results from the simulation of a ground layer adaptive optics system are provided as an example to demonstrate the flexibility of this simulation platform.
Young, Kyle A.; Snoeks, Jos; Seehausen, Ole
2009-01-01
Background: Deterministic evolution, phylogenetic contingency and evolutionary chance each can influence patterns of morphological diversification during adaptive radiation. In comparative studies of replicate radiations, convergence in a common morphospace implicates determinism, whereas non-convergence suggests the importance of contingency or chance. Methodology/Principal Findings: The endemic cichlid fish assemblages of the three African great lakes have evolved similar sets of ecomorphs but show evidence of non-convergence when compared in a common morphospace, suggesting the importance of contingency and/or chance. We then analyzed the morphological diversity of each assemblage independently and compared their axes of diversification in the unconstrained global morphospace. We find that despite differences in phylogenetic composition, invasion history, and ecological setting, the three assemblages are diversifying along parallel axes through morphospace and have nearly identical variance-covariance structures among morphological elements. Conclusions/Significance: By demonstrating that replicate adaptive radiations are diverging along parallel axes, we have shown that non-convergence in the common morphospace is associated with convergence in the global morphospace. Applying these complementary analyses to future comparative studies will improve our understanding of the relationship between morphological convergence and non-convergence, and the roles of contingency, chance and determinism in driving morphological diversification. PMID:19270732
Parallel evolutionary computation in bioinformatics applications.
Pinho, Jorge; Sobral, João Luis; Rocha, Miguel
2013-05-01
A large number of optimization problems within the field of Bioinformatics require methods able to handle their inherent complexity (e.g. NP-hard problems) and also demand increased computational effort. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of their efficient execution on a wide range of parallel architectures. The proposed approach focuses on ease of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows users to easily configure their environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Sargent, Jeff Scott
1988-01-01
A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. This algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.
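The idea behind Heuristic Cell-Coloring in the abstract above (grouping cells that do not interact so they can be moved in parallel) can be illustrated with a generic greedy graph coloring. This is only a sketch under stated assumptions, not the report's algorithm: the abstract does not define "noninteracting", so the code assumes two cells interact when they share a net, and the function names `color_cells` and `independent_sets` are hypothetical.

```python
from collections import defaultdict


def color_cells(nets):
    """Greedy coloring: cells that share a net receive different colors.

    ASSUMPTION: "interacting" means "connected by a common net"; the
    abstract does not state the actual criterion.
    """
    neighbors = defaultdict(set)
    for net in nets:
        for cell in net:
            # Every other cell on the same net is treated as a neighbor.
            neighbors[cell].update(c for c in net if c != cell)

    colors = {}
    for cell in sorted(neighbors):
        used = {colors[n] for n in neighbors[cell] if n in colors}
        color = 0
        while color in used:  # smallest color not used by a neighbor
            color += 1
        colors[cell] = color
    return colors


def independent_sets(colors):
    """Group cells by color; each group holds mutually noninteracting cells."""
    groups = defaultdict(list)
    for cell, color in colors.items():
        groups[color].append(cell)
    return [sorted(groups[c]) for c in sorted(groups)]


# Hypothetical netlist: each inner list is one net connecting some cells.
nets = [["a", "b"], ["b", "c"], ["c", "d"]]
colors = color_cells(nets)
sets_ = independent_sets(colors)
```

Under this assumption, all cells in one color group could be moved in the same parallel step, because no net cost term depends on two of them at once.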
Houssaye, Alexandra; Lindgren, Johan; Pellegrini, Rodrigo; Lee, Andrew H.; Germain, Damien; Polcyn, Michael J.
2013-01-01
Background During their evolution in the Late Cretaceous, mosasauroids attained a worldwide distribution, accompanied by a marked increase in body size and open ocean adaptations. This transition from land-dwellers to highly marine-adapted forms is readily apparent not only at the gross anatomic level but also in their inner bone architecture, which underwent profound modifications. Methodology/Principal Findings The present contribution describes, both qualitatively and quantitatively, the internal organization (microanatomy) and tissue types and characteristics (histology) of propodial and epipodial bones in one lineage of mosasauroids; i.e., the subfamily Mosasaurinae. By using microanatomical and histological data from limb bones in combination with recently acquired knowledge on the inner structure of ribs and vertebrae, and through comparisons with extant squamates and semi-aquatic to fully marine amniotes, we infer possible implications on mosasaurine evolution, aquatic adaptation, growth rates, and basal metabolic rates. Notably, we observe the occurrence of an unusual type of parallel-fibered bone, with large and randomly shaped osteocyte lacunae (otherwise typical of fibrous bone) and particular microanatomical features in Dallasaurus, which displays, rather than a spongious inner organization, bone mass increase in its humeri and a tubular organization in its femora and ribs. Conclusions/Significance The dominance of an unusual type of parallel-fibered bone suggests growth rates and, by extension, basal metabolic rates intermediate between that of the extant leatherback turtle, Dermochelys, and those suggested for plesiosaur and ichthyosaur reptiles. Moreover, the microanatomical features of the relatively primitive genus Dallasaurus differ from those of more derived mosasaurines, indicating an intermediate stage of adaptation for a marine existence. 
The more complete image of the various microanatomical trends observed in mosasaurine skeletal elements supports the evolutionary convergence between this lineage of secondarily aquatically adapted squamates and cetaceans in the ecological transition from a coastal to a pelagic lifestyle. PMID:24146919
Houssaye, Alexandra; Lindgren, Johan; Pellegrini, Rodrigo; Lee, Andrew H; Germain, Damien; Polcyn, Michael J
2013-01-01
During their evolution in the Late Cretaceous, mosasauroids attained a worldwide distribution, accompanied by a marked increase in body size and open ocean adaptations. This transition from land-dwellers to highly marine-adapted forms is readily apparent not only at the gross anatomic level but also in their inner bone architecture, which underwent profound modifications. The present contribution describes, both qualitatively and quantitatively, the internal organization (microanatomy) and tissue types and characteristics (histology) of propodial and epipodial bones in one lineage of mosasauroids; i.e., the subfamily Mosasaurinae. By using microanatomical and histological data from limb bones in combination with recently acquired knowledge on the inner structure of ribs and vertebrae, and through comparisons with extant squamates and semi-aquatic to fully marine amniotes, we infer possible implications on mosasaurine evolution, aquatic adaptation, growth rates, and basal metabolic rates. Notably, we observe the occurrence of an unusual type of parallel-fibered bone, with large and randomly shaped osteocyte lacunae (otherwise typical of fibrous bone) and particular microanatomical features in Dallasaurus, which displays, rather than a spongious inner organization, bone mass increase in its humeri and a tubular organization in its femora and ribs. The dominance of an unusual type of parallel-fibered bone suggests growth rates and, by extension, basal metabolic rates intermediate between that of the extant leatherback turtle, Dermochelys, and those suggested for plesiosaur and ichthyosaur reptiles. Moreover, the microanatomical features of the relatively primitive genus Dallasaurus differ from those of more derived mosasaurines, indicating an intermediate stage of adaptation for a marine existence. 
The more complete image of the various microanatomical trends observed in mosasaurine skeletal elements supports the evolutionary convergence between this lineage of secondarily aquatically adapted squamates and cetaceans in the ecological transition from a coastal to a pelagic lifestyle.
NASA Astrophysics Data System (ADS)
Lei, H.; Lu, Z.; Vesselinov, V. V.; Ye, M.
2017-12-01
Simultaneous identification of both the zonation structure of aquifer heterogeneity and the hydrogeological parameters associated with these zones is challenging, especially for complex subsurface heterogeneity fields. In this study, a new approach, based on the combination of the level set method and a parallel genetic algorithm is proposed. Starting with an initial guess for the zonation field (including both zonation structure and the hydraulic properties of each zone), the level set method ensures that material interfaces are evolved through the inverse process such that the total residual between the simulated and observed state variables (hydraulic head) always decreases, which means that the inversion result depends on the initial guess field and the minimization process might fail if it encounters a local minimum. To find the global minimum, the genetic algorithm (GA) is utilized to explore the parameters that define initial guess fields, and the minimal total residual corresponding to each initial guess field is considered as the fitness function value in the GA. Due to the expensive evaluation of the fitness function, a parallel GA is adapted in combination with a simulated annealing algorithm. The new approach has been applied to several synthetic cases in both steady-state and transient flow fields, including a case with real flow conditions at the chromium contaminant site at the Los Alamos National Laboratory. The results show that this approach is capable of identifying the arbitrary zonation structures of aquifer heterogeneity and the hydrogeological parameters associated with these zones effectively.
NASA Technical Reports Server (NTRS)
Johnson, C. R., Jr.; Balas, M. J.
1980-01-01
A novel interconnection of distributed parameter system (DPS) identification and adaptive filtering is presented, which culminates in a common statement of coupled autoregressive, moving-average expansion or parallel infinite impulse response configuration adaptive parameterization. The common restricted complexity filter objectives are seen as similar to the reduced-order requirements of the DPS expansion description. The interconnection presents the possibility of an exchange of problem formulations and solution approaches not yet easily addressed in the common finite dimensional lumped-parameter system context. It is concluded that the shared problems raised are nevertheless many and difficult.
Parallelization of Unsteady Adaptive Mesh Refinement for Unstructured Navier-Stokes Solvers
NASA Technical Reports Server (NTRS)
Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.
2014-01-01
This paper explores the implementation of the MPI parallelization in a Navier-Stokes solver using adaptive mesh refinement. Viscous and inviscid test problems are considered for the purpose of benchmarking, as are implicit and explicit time advancement methods. The main test problem for comparison includes effects from boundary layers and other viscous features and requires a large number of grid points for accurate computation. Experimental validation against double cone experiments in hypersonic flow is shown. The adaptive mesh refinement shows promise for a staple test problem in the hypersonic community. Extension to more advanced techniques for more complicated flows is described.
2012-05-22
tabulation of the reduced space is performed using the In Situ Adaptive Tabulation (ISAT) algorithm. In addition, we use x2f mpi – a Fortran library...for parallel vector-valued function evaluation (used with ISAT in this context) – to efficiently redistribute the chemistry workload among the...Constrained-Equilibrium (RCCE) method, and tabulation of the reduced space is performed using the In Situ Adaptive Tabulation (ISAT) algorithm. In addition
Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fijany, A.; Milman, M.; Redding, D.
1994-12-31
In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optic system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated as Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other Fast Poisson Solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized, architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
NASA Astrophysics Data System (ADS)
Samaké, Abdoulaye; Rampal, Pierre; Bouillon, Sylvain; Ólason, Einar
2017-12-01
We present a parallel implementation framework for a new dynamic/thermodynamic sea-ice model, called neXtSIM, based on the Elasto-Brittle rheology and using an adaptive mesh. The spatial discretisation of the model is done using the finite-element method. The temporal discretisation is semi-implicit and the advection is achieved using either a pure Lagrangian scheme or an Arbitrary Lagrangian Eulerian scheme (ALE). The parallel implementation presented here focuses on the distributed-memory approach using the message-passing library MPI. The efficiency and the scalability of the parallel algorithms are illustrated by the numerical experiments performed using up to 500 processor cores of a cluster computing system. The performance obtained by the proposed parallel implementation of the neXtSIM code is shown to be sufficient to perform simulations for state-of-the-art sea-ice forecasting and geophysical process studies over geographical domains of several million square kilometers, such as the Arctic region.
Deiterding, Ralf
2011-01-01
Numerical simulation can be key to the understanding of the multidimensional nature of transient detonation waves. However, the accurate approximation of realistic detonations is demanding as a wide range of scales needs to be resolved. This paper describes a successful solution strategy that utilizes logically rectangular dynamically adaptive meshes. The hydrodynamic transport scheme and the treatment of the nonequilibrium reaction terms are sketched. A ghost fluid approach is integrated into the method to allow for embedded geometrically complex boundaries. Large-scale parallel simulations of unstable detonation structures of Chapman-Jouguet detonations in low-pressure hydrogen-oxygen-argon mixtures demonstrate the efficiency of the described techniques in practice. In particular, computations of regular cellular structures in two and three space dimensions and their development under transient conditions, that is, under diffraction and for propagation through bends are presented. Some of the observed patterns are classified by shock polar analysis, and a diagram of the transition boundaries between possible Mach reflection structures is constructed.
Jueterbock, A; Franssen, S U; Bergmann, N; Gu, J; Coyer, J A; Reusch, T B H; Bornberg-Bauer, E; Olsen, J L
2016-11-01
Populations distributed across a broad thermal cline are instrumental in addressing adaptation to increasing temperatures under global warming. Using a space-for-time substitution design, we tested for parallel adaptation to warm temperatures along two independent thermal clines in Zostera marina, the most widely distributed seagrass in the temperate Northern Hemisphere. A North-South pair of populations was sampled along the European and North American coasts and exposed to a simulated heatwave in a common-garden mesocosm. Transcriptomic responses under control, heat stress and recovery were recorded in 99 RNAseq libraries with ~13 000 uniquely annotated, expressed genes. We corrected for phylogenetic differentiation among populations to discriminate neutral from adaptive differentiation. The two southern populations recovered faster from heat stress and showed parallel transcriptomic differentiation, as compared with northern populations. Among 2389 differentially expressed genes, 21 exceeded neutral expectations and were likely involved in parallel adaptation to warm temperatures. However, the strongest differentiation following phylogenetic correction was between the three Atlantic populations and the Mediterranean population with 128 of 4711 differentially expressed genes exceeding neutral expectations. Although adaptation to warm temperatures is expected to reduce sensitivity to heatwaves, the continued resistance of seagrass to further anthropogenic stresses may be impaired by heat-induced downregulation of genes related to photosynthesis, pathogen defence and stress tolerance. © 2016 John Wiley & Sons Ltd.
Unstructured Adaptive Grid Computations on an Array of SMPs
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Pramanick, Ira; Sohn, Andrew; Simon, Horst D.
1996-01-01
Dynamic load balancing is necessary for parallel adaptive methods to solve unsteady CFD problems on unstructured grids. In this paper, we present such a dynamic load balancing framework, called JOVE. Results on a four-POWERnode POWER CHALLENGEarray demonstrated that load balancing gives significant performance improvements over no load balancing for such adaptive computations. The parallel speedup of JOVE, implemented using MPI on the POWER CHALLENGEarray, was significant, being as high as 31 for 32 processors. An implementation of JOVE that exploits 'an array of SMPs' architecture was also studied; this hybrid JOVE outperformed flat JOVE by up to 28% on the meshes and adaption models tested. With large, realistic meshes and actual flow-solver and adaption phases incorporated into JOVE, hybrid JOVE can be expected to yield significant advantage over flat JOVE, especially as the number of processors is increased, thus demonstrating the scalability of an array of SMPs architecture.
Parallel implementation of an adaptive scheme for 3D unstructured grids on the SP2
NASA Technical Reports Server (NTRS)
Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10 percent of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all the mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
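The speedups quoted in the abstract above can be restated as parallel efficiency, defined in the usual way as speedup divided by processor count. The short sketch below only does that arithmetic on the three reported figures; the function name and the scenario labels are illustrative and do not come from the report.

```python
def parallel_efficiency(speedup, processors):
    """Standard definition: fraction of ideal (linear) speedup achieved."""
    return speedup / processors


PROCESSORS = 64  # processor count stated in the abstract

# Speedups as reported in the abstract; the labels are our own shorthand.
reported_speedups = {
    "random refinement": 47.0,
    "localized refinement": 7.7,
    "localized, repartitioned before adaption": 43.6,
}

efficiencies = {
    case: parallel_efficiency(s, PROCESSORS)
    for case, s in reported_speedups.items()
}

for case, eff in efficiencies.items():
    print(f"{case}: {eff:.1%}")
```

Read this way, repartitioning before adaption raises efficiency in the localized case from roughly 12% to roughly 68%, close to the roughly 73% obtained with random refinement.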
Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Strawn, Roger C.
1996-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10% of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
Tobler, Ray; Hermisson, Joachim; Schlötterer, Christian
2015-07-01
Thermal stress is a pervasive selective agent in natural populations that impacts organismal growth, survival, and reproduction. Drosophila melanogaster exhibits a variety of putatively adaptive phenotypic responses to thermal stress in natural and experimental settings; however, accompanying assessments of fitness are typically lacking. Here, we quantify changes in fitness and known thermal tolerance traits in replicated experimental D. melanogaster populations following more than 40 generations of evolution to either cyclic cold or hot temperatures. By evaluating fitness for both evolved populations alongside a reconstituted starting population, we show that the evolved populations were the best adapted within their respective thermal environments. More strikingly, the evolved populations exhibited increased fitness in both environments and improved resistance to both acute heat and cold stress. This unexpected parallel response appeared to be an adaptation to the rapid temperature changes that drove the cycling thermal regimes, as parallel fitness changes were not observed when tested in a constant thermal environment. Our results add to a small, but growing group of studies that demonstrate the importance of fluctuating temperature changes for thermal adaptation and highlight the need for additional work in this area. © 2015 The Author(s). Evolution published by Wiley Periodicals, Inc. on behalf of The Society for the Study of Evolution.
TU-AB-202-05: GPU-Based 4D Deformable Image Registration Using Adaptive Tetrahedral Mesh Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhong, Z; Zhuang, L; Gu, X
Purpose: Deformable image registration (DIR) has been employed today as an automated and effective segmentation method to transfer tumor or organ contours from the planning image to daily images, instead of manual segmentation. However, the computational time and accuracy of current DIR approaches are still insufficient for online adaptive radiation therapy (ART), which requires real-time and high-quality image segmentation, especially in large datasets of 4D-CT images. The objective of this work is to propose a new DIR algorithm, with fast computational speed and high accuracy, by using adaptive feature-based tetrahedral meshing and GPU-based parallelization. Methods: The first step is to generate the adaptive tetrahedral mesh based on the image features of a reference phase of 4D-CT, so that the deformation can be well captured and accurately diffused from the mesh vertices to voxels of the image volume. Subsequently, the deformation vector fields (DVF) and other phases of 4D-CT can be obtained by matching each phase of the target 4D-CT images with the corresponding deformed reference phase. The proposed 4D DIR method is implemented on GPU, which significantly increases the computational efficiency due to its parallel computing ability. Results: A 4D NCAT digital phantom was used to test the efficiency and accuracy of our method. Both the image and DVF results show that the fine structures and shapes of lung are well preserved, and the tumor position is well captured, i.e., 3D distance error is 1.14 mm. Compared to the previous voxel-based CPU implementation of DIR, such as demons, the proposed method is about 160x faster for registering a 10-phase 4D-CT with a phase dimension of 256×256×150. Conclusion: The proposed 4D DIR method uses feature-based mesh and GPU-based parallelism, which demonstrates the capability to compute both high-quality image and motion results, with significant improvement on the computational speed.
Automatic Data Distribution for CFD Applications on Structured Grids
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Yan, Jerry
1999-01-01
Data distribution is an important step in the implementation of any parallel algorithm. The data distribution determines data traffic and utilization of the interconnection network, and affects the overall code efficiency. In recent years a number of data distribution methods have been developed and used in real programs for improving data traffic. We use some of the methods for translating data dependence and affinity relations into data distribution directives. We describe an automatic data alignment and placement tool (ADAPT) which implements these methods and show its results for some CFD codes (NPB and ARC3D). Algorithms for program analysis and derivation of data distribution implemented in ADAPT are efficient three-pass algorithms. Most algorithms have linear complexity, with the exception of some graph algorithms having complexity O(n^4) in the worst case.
Adaptive multi-resolution 3D Hartree-Fock-Bogoliubov solver for nuclear structure
NASA Astrophysics Data System (ADS)
Pei, J. C.; Fann, G. I.; Harrison, R. J.; Nazarewicz, W.; Shi, Yue; Thornton, S.
2014-08-01
Background: Complex many-body systems, such as triaxial and reflection-asymmetric nuclei, weakly bound halo states, cluster configurations, nuclear fragments produced in heavy-ion fusion reactions, cold Fermi gases, and pasta phases in neutron star crust, are all characterized by large sizes and complex topologies in which many geometrical symmetries characteristic of ground-state configurations are broken. A tool of choice to study such complex forms of matter is an adaptive multi-resolution wavelet analysis. This method has generated much excitement since it provides a common framework linking many diversified methodologies across different fields, including signal processing, data compression, harmonic analysis and operator theory, fractals, and quantum field theory. Purpose: To describe complex superfluid many-fermion systems, we introduce an adaptive pseudospectral method for solving self-consistent equations of nuclear density functional theory in three dimensions, without symmetry restrictions. Methods: The numerical method is based on the multi-resolution and computational harmonic analysis techniques with a multi-wavelet basis. The application of state-of-the-art parallel programming techniques include sophisticated object-oriented templates which parse the high-level code into distributed parallel tasks with a multi-thread task queue scheduler for each multi-core node. The internode communications are asynchronous. The algorithm is variational and is capable of solving coupled complex-geometric systems of equations adaptively, with functional and boundary constraints, in a finite spatial domain of very large size, limited by existing parallel computer memory. For smooth functions, user-defined finite precision is guaranteed. 
Results: The new adaptive multi-resolution Hartree-Fock-Bogoliubov (HFB) solver madness-hfb is benchmarked against a two-dimensional coordinate-space solver hfb-ax that is based on the B-spline technique and a three-dimensional solver hfodd that is based on the harmonic-oscillator basis expansion. Several examples are considered, including the self-consistent HFB problem for spin-polarized trapped cold fermions and the Skyrme-Hartree-Fock (+BCS) problem for triaxial deformed nuclei. Conclusions: The new madness-hfb framework has many attractive features when applied to nuclear and atomic problems involving many-particle superfluid systems. Of particular interest are weakly bound nuclear configurations close to particle drip lines, strongly elongated and dinuclear configurations such as those present in fission and heavy-ion fusion, and exotic pasta phases that appear in neutron star crust.
NASA Astrophysics Data System (ADS)
Popov, Igor; Sukov, Sergey
2018-02-01
A modification of the adaptive artificial viscosity (AAV) method is considered. This modification is based on a one-stage time approximation and is adapted to the calculation of gasdynamics problems on unstructured grids with an arbitrary type of grid elements. The proposed numerical method has simplified logic, better performance and parallel efficiency compared to the implementation of the original AAV method. Computer experiments demonstrate the robustness of the method and its convergence to the difference solution.
NASA Technical Reports Server (NTRS)
Noor, A. K. (Editor); Hayduk, R. J. (Editor)
1985-01-01
Among the topics discussed are developments in structural engineering hardware and software, computation for fracture mechanics, trends in numerical analysis and parallel algorithms, mechanics of materials, advances in finite element methods, composite materials and structures, determinations of random motion and dynamic response, optimization theory, automotive tire modeling methods and contact problems, the damping and control of aircraft structures, and advanced structural applications. Specific topics covered include structural design expert systems, the evaluation of finite element system architectures, systolic arrays for finite element analyses, nonlinear finite element computations, hierarchical boundary elements, adaptive substructuring techniques in elastoplastic finite element analyses, automatic tracking of crack propagation, a theory of rate-dependent plasticity, the torsional stability of nonlinear eccentric structures, a computation method for fluid-structure interaction, the seismic analysis of three-dimensional soil-structure interaction, a stress analysis for a composite sandwich panel, toughness criterion identification for unidirectional composite laminates, the modeling of submerged cable dynamics, and damping synthesis for flexible spacecraft structures.
Advances in Patch-Based Adaptive Mesh Refinement Scalability
Gunney, Brian T.N.; Anderson, Robert W.
2015-12-18
Patch-based structured adaptive mesh refinement (SAMR) is widely used for high-resolution simulations. Combined with modern supercomputers, it could provide simulations of unprecedented size and resolution. A persistent challenge for this combination has been managing dynamically adaptive meshes on more and more MPI tasks. The distributed mesh management scheme in SAMRAI has made some progress on SAMR scalability, but early algorithms still had trouble scaling past the regime of 10^5 MPI tasks. This work provides two critical SAMR regridding algorithms, which are integrated into that scheme to ensure efficiency of the whole. The clustering algorithm is an extension of the tile-clustering approach, making it more flexible and efficient in both clustering and parallelism. The partitioner is a new algorithm designed to prevent the network congestion experienced by its predecessor. We evaluated performance using weak- and strong-scaling benchmarks designed to be difficult for dynamic adaptivity. Results show good scaling on up to 1.5M cores and 2M MPI tasks. Detailed timing diagnostics suggest scaling would continue well past that.
Advances in Patch-Based Adaptive Mesh Refinement Scalability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gunney, Brian T.N.; Anderson, Robert W.
Patch-based structured adaptive mesh refinement (SAMR) is widely used for high-resolution simulations. Combined with modern supercomputers, it could provide simulations of unprecedented size and resolution. A persistent challenge for this combination has been managing dynamically adaptive meshes on more and more MPI tasks. The distributed mesh management scheme in SAMRAI has made some progress on SAMR scalability, but early algorithms still had trouble scaling past the regime of 10^5 MPI tasks. This work provides two critical SAMR regridding algorithms, which are integrated into that scheme to ensure efficiency of the whole. The clustering algorithm is an extension of the tile-clustering approach, making it more flexible and efficient in both clustering and parallelism. The partitioner is a new algorithm designed to prevent the network congestion experienced by its predecessor. We evaluated performance using weak- and strong-scaling benchmarks designed to be difficult for dynamic adaptivity. Results show good scaling on up to 1.5M cores and 2M MPI tasks. Detailed timing diagnostics suggest scaling would continue well past that.
Climbing with adhesion: from bioinspiration to biounderstanding
Cutkosky, Mark R.
2015-01-01
Bioinspiration is an increasingly popular design paradigm, especially as robots venture out of the laboratory and into the world. Animals are adept at coping with the variability that the world imposes. With advances in scientific tools for understanding biological structures in detail, we are increasingly able to identify design features that account for animals' robust performance. In parallel, advances in fabrication methods and materials are allowing us to engineer artificial structures with similar properties. The resulting robots become useful platforms for testing hypotheses about which principles are most important. Taking gecko-inspired climbing as an example, we show that the process of extracting principles from animals and adapting them to robots provides insights for both robotics and biology. PMID:26464786
Methylphenidate administration determines enduring changes in neuroglial network in rats.
Cavaliere, Carlo; Cirillo, Giovanni; Bianco, Maria Rosaria; Adriani, Walter; De Simone, Antonietta; Leo, Damiana; Perrone-Capano, Carla; Papa, Michele
2012-01-01
Repeated exposure to psychostimulant drugs induces complex molecular and structural modifications in discrete brain regions of the meso-cortico-limbic system. This structural remodeling is thought to underlie neurobehavioral adaptive responses. Administration to adolescent rats of methylphenidate (MPH), commonly used in attention deficit and hyperactivity disorder (ADHD), triggers alterations of reward-based behavior paralleled by persistent and plastic synaptic changes of neuronal and glial markers within key areas of the reward circuits. By immunohistochemistry, we observe a marked increase of glial fibrillary acidic protein (GFAP) and neuronal nitric oxide synthase (nNOS) expression and a down-regulation of glial glutamate transporter GLAST in dorso-lateral and ventro-medial striatum. Using electron microscopy, we find in the prefrontal cortex a significant reduction of the synaptic active zone length, paralleled by an increase of dendritic spines. We demonstrate that in limbic areas the MPH-induced reactive astrocytosis affects the glial glutamatergic uptake system that in turn could determine glutamate receptor sensitization. These processes could be sustained by NO production and synaptic rearrangement and contribute to MPH neuroglial induced rewiring. Copyright © 2011. Published by Elsevier B.V.
Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics
NASA Technical Reports Server (NTRS)
Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak
2016-01-01
This paper presents large-scale MPI-parallel computational fluid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady flow field inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta and spatially fifth-order accurate WENO-5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh refinement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion computational cells show excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks containing boundaries. Limits to scaling beyond 32k cores are identified, and targeted code optimizations are discussed.
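The abstract does not say which balancing algorithm is used, so the following is only a minimal sketch of one common way to balance on measured per-block execution time: the greedy longest-processing-time heuristic. The function name `balance_blocks` and the block identifiers are hypothetical and are not taken from the paper.

```python
import heapq


def balance_blocks(block_times, n_ranks):
    """Assign AMR blocks to ranks using the greedy longest-processing-time rule.

    block_times maps a (hypothetical) block id to its measured execution time.
    This is an illustrative heuristic, not the scheme used in the paper.
    """
    # Min-heap of (accumulated time, rank id); the least-loaded rank is on top.
    heap = [(0.0, rank) for rank in range(n_ranks)]
    heapq.heapify(heap)
    assignment = {rank: [] for rank in range(n_ranks)}
    # Place the most expensive blocks first so they are spread out early.
    for block_id, cost in sorted(block_times.items(), key=lambda kv: -kv[1]):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(block_id)
        heapq.heappush(heap, (load + cost, rank))
    # Total measured time that each rank ends up with.
    loads = {rank: sum(block_times[b] for b in blocks)
             for rank, blocks in assignment.items()}
    return assignment, loads
```

Blocks that contain immersed boundaries would simply report larger times and therefore be placed first, which is the reason for sorting by cost before assigning.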
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array
NASA Astrophysics Data System (ADS)
Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul
2008-04-01
This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
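The work farm pattern described in this abstract (a parallel set of workers fed by one input stream and merged into one output stream) can be sketched roughly as follows. Python threads stand in for the MPPA's processor objects and self-synchronizing channels; the name `work_farm` is hypothetical and nothing here reflects the platform's actual programming interface.

```python
from concurrent.futures import ThreadPoolExecutor


def work_farm(worker, input_stream, n_workers):
    """Fan items out to n_workers identical workers and merge the results.

    A homogeneous work farm: every worker runs the same function.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map preserves input order, so the merged results form a
        # single ordered output stream even though workers run concurrently.
        return list(pool.map(worker, input_stream))
```

For example, `work_farm(lambda block: block * block, range(6), 3)` squares six items on three workers and returns them in their original order.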
Enabling Object Storage via shims for Grid Middleware
NASA Astrophysics Data System (ADS)
Cadellin Skipsey, Samuel; De Witt, Shaun; Dewhurst, Alastair; Britton, David; Roy, Gareth; Crooks, David
2015-12-01
The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called “Cloud” storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for “POSIX-like” filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present evaluation of, and first-deployment experiences with, (for example) Xrootd-Ceph interfaces for direct object-store access, as part of an initiative within GridPP[1] hosted at RAL. Additionally, we discuss the tradeoffs and experience of developing plugins for the currently-popular Ceph parallel distributed filesystem for the GFAL2 access layer, at Glasgow.
PROTO-PLASM: parallel language for adaptive and scalable modelling of biosystems.
Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto
2008-09-13
This paper discusses the design goals and the first developments of PROTO-PLASM, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the PROTO-PLASM platform is still in its infancy. Its computational framework--language, model library, integrated development environment and parallel engine--intends to provide patient-specific computational modelling and simulation of organs and biosystem, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. PROTO-PLASM may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a PROTO-PLASM program. Here we exemplify the basic functionalities of PROTO-PLASM, by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions.
Proto-Plasm: parallel language for adaptive and scalable modelling of biosystems
Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto
2008-01-01
This paper discusses the design goals and the first developments of Proto-Plasm, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the Proto-Plasm platform is still in its infancy. Its computational framework—language, model library, integrated development environment and parallel engine—intends to provide patient-specific computational modelling and simulation of organs and biosystem, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. Proto-Plasm may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a Proto-Plasm program. Here we exemplify the basic functionalities of Proto-Plasm, by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions. PMID:18559320
Narayanaswamy, Arunachalam; Dwarakapuram, Saritha; Bjornsson, Christopher S; Cutler, Barbara M; Shain, William; Roysam, Badrinath
2010-03-01
This paper presents robust 3-D algorithms to segment vasculature that is imaged by labeling laminae, rather than the lumenal volume. The signal is weak, sparse, noisy, nonuniform, low-contrast, and exhibits gaps and spectral artifacts, so adaptive thresholding and Hessian filtering based methods are not effective. The structure deviates from a tubular geometry, so tracing algorithms are not effective. We propose a four step approach. The first step detects candidate voxels using a robust hypothesis test based on a model that assumes Poisson noise and locally planar geometry. The second step performs an adaptive region growth to extract weakly labeled and fine vessels while rejecting spectral artifacts. To enable interactive visualization and estimation of features such as statistical confidence, local curvature, local thickness, and local normal, we perform the third step. In the third step, we construct an accurate mesh representation using marching tetrahedra, volume-preserving smoothing, and adaptive decimation algorithms. To enable topological analysis and efficient validation, we describe a method to estimate vessel centerlines using a ray casting and vote accumulation algorithm which forms the final step of our algorithm. Our algorithm lends itself to parallel processing, and yielded an 8 x speedup on a graphics processor (GPU). On synthetic data, our meshes had average error per face (EPF) values of (0.1-1.6) voxels per mesh face for peak signal-to-noise ratios from (110-28 dB). Separately, the error from decimating the mesh to less than 1% of its original size, the EPF was less than 1 voxel/face. When validated on real datasets, the average recall and precision values were found to be 94.66% and 94.84%, respectively.
Adaptation of the Long-Lived Monocarpic Perennial Saxifraga longifolia to High Altitude
Morales, Melanie; Fleta-Soriano, Eva; Garcia, Maria B.
2016-01-01
Global change is exerting a major effect on plant communities, altering their potential capacity for adaptation. Here, we aimed at unveiling mechanisms of adaptation to high altitude in an endemic long-lived monocarpic, Saxifraga longifolia, by combining demographic and physiological approaches. Plants from three altitudes (570, 1100, and 2100 m above sea level [a.s.l.]) were investigated in terms of leaf water and pigment contents, and activation of stress defense mechanisms. The influence of plant size on physiological performance and mortality was also investigated. Levels of photoprotective molecules (α-tocopherol, carotenoids, and anthocyanins) increased in response to high altitude (1100 relative to 570 m a.s.l.), which was paralleled by reduced soil and leaf water contents and increased ABA levels. The more demanding effect of high altitude on photoprotection was, however, partly abolished at very high altitudes (2100 m a.s.l.) due to improved soil water contents, with the exception of α-tocopherol accumulation. α-Tocopherol levels increased progressively at increasing altitudes, which paralleled with reductions in lipid peroxidation, thus suggesting plants from the highest altitude effectively withstood high light stress. Furthermore, mortality of juveniles was highest at the intermediate population, suggesting that drought stress was the main environmental driver of mortality of juveniles in this rocky plant species. Population structure and vital rates in the high population evidenced lower recruitment and mortality in juveniles, activation of clonal growth, and absence of plant size-dependent mortality. We conclude that, despite S. longifolia has evolved complex mechanisms of adaptation to altitude at the cellular, whole-plant and population levels, drought events may drive increased mortality in the framework of global change. PMID:27440756
Adaptation of the Long-Lived Monocarpic Perennial Saxifraga longifolia to High Altitude.
Munné-Bosch, Sergi; Cotado, Alba; Morales, Melanie; Fleta-Soriano, Eva; Villellas, Jesús; Garcia, Maria B
2016-10-01
Global change is exerting a major effect on plant communities, altering their potential capacity for adaptation. Here, we aimed at unveiling mechanisms of adaptation to high altitude in an endemic long-lived monocarpic, Saxifraga longifolia, by combining demographic and physiological approaches. Plants from three altitudes (570, 1100, and 2100 m above sea level [a.s.l.]) were investigated in terms of leaf water and pigment contents, and activation of stress defense mechanisms. The influence of plant size on physiological performance and mortality was also investigated. Levels of photoprotective molecules (α-tocopherol, carotenoids, and anthocyanins) increased in response to high altitude (1100 relative to 570 m a.s.l.), which was paralleled by reduced soil and leaf water contents and increased ABA levels. The more demanding effect of high altitude on photoprotection was, however, partly abolished at very high altitudes (2100 m a.s.l.) due to improved soil water contents, with the exception of α-tocopherol accumulation. α-Tocopherol levels increased progressively at increasing altitudes, which paralleled with reductions in lipid peroxidation, thus suggesting plants from the highest altitude effectively withstood high light stress. Furthermore, mortality of juveniles was highest at the intermediate population, suggesting that drought stress was the main environmental driver of mortality of juveniles in this rocky plant species. Population structure and vital rates in the high population evidenced lower recruitment and mortality in juveniles, activation of clonal growth, and absence of plant size-dependent mortality. We conclude that, despite S. longifolia has evolved complex mechanisms of adaptation to altitude at the cellular, whole-plant and population levels, drought events may drive increased mortality in the framework of global change. © 2016 American Society of Plant Biologists. All Rights Reserved.
NASA Astrophysics Data System (ADS)
Raeli, Alice; Bergmann, Michel; Iollo, Angelo
2018-02-01
We consider problems governed by a linear elliptic equation with varying coefficients across internal interfaces. The solution and its normal derivative can undergo significant variations through these internal boundaries. We present a compact finite-difference scheme on a tree-based adaptive grid that can be efficiently solved using a natively parallel data structure. The main idea is to optimize the truncation error of the discretization scheme as a function of the local grid configuration to achieve second-order accuracy. Numerical illustrations are presented in two and three-dimensional configurations.
Fast Particle Methods for Multiscale Phenomena Simulations
NASA Technical Reports Server (NTRS)
Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew
2000-01-01
We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in: (i) high Reynolds number unsteady vortical flows, (ii) particle-laden and interfacial flows, (iii) molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods make them a multidisciplinary computational tool capable of bridging the gap between micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena. The current challenges in these simulations are in: [i] the proper formulation of particle methods in the molecular and continuous level for the discretization of the governing equations; [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation; [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration; [iv] the parallelization of processes such as tree traversal and grid-particle interpolations. We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as: the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFTs and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to transcend among seemingly unrelated areas of research.
NASA Astrophysics Data System (ADS)
Rybakin, B.; Bogatencov, P.; Secrieru, G.; Iliuha, N.
2013-10-01
The paper deals with a parallel algorithm for calculations on multiprocessor computers and GPU accelerators. Results are presented for the interaction of shock waves with a low-density bubble and for gas flow under gravity. The algorithm combines high-resolution shock capturing, the second-order accuracy of TVD schemes, and low numerical diffusion in the advection scheme. Many complex problems of continuum mechanics are solved numerically on structured or unstructured grids. Improving the accuracy of the calculations requires a sufficiently fine grid (with a small cell size), which has the drawback of substantially increasing computation time. For complex problems it is therefore reasonable to use Adaptive Mesh Refinement (AMR): the grid is refined only in the areas of interest, e.g. where shock waves are generated or where complex geometry or other such features exist. Thus, the computing time is greatly reduced. In addition, execution on the resulting sequence of nested, successively finer grids can be parallelized. The proposed algorithm is based on the AMR method. Using AMR can significantly improve the resolution of the difference grid in areas of high interest and, at the same time, accelerate the calculation of multi-dimensional problems. Parallel algorithms for the analyzed difference models were implemented for calculations on graphics processors using CUDA technology [1].
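The refinement idea in this abstract (refine only where features such as shocks occur) can be illustrated with a one-dimensional sketch. The jump-based criterion, the threshold, and the function names `flag_cells` and `refine` are assumptions chosen for illustration; the paper's actual refinement criterion and its CUDA implementation are not described in the abstract.

```python
def flag_cells(values, threshold):
    """Return indices of cells that sit next to a jump larger than threshold."""
    flagged = set()
    for i in range(len(values) - 1):
        if abs(values[i + 1] - values[i]) > threshold:
            # Flag both cells on either side of the steep gradient.
            flagged.update((i, i + 1))
    return sorted(flagged)


def refine(values, flagged):
    """Split each flagged cell into two children with the parent's value.

    A real AMR code would also record each cell's refinement level and size;
    this sketch keeps only the cell values.
    """
    flagged = set(flagged)
    fine = []
    for i, v in enumerate(values):
        fine.extend([v, v] if i in flagged else [v])
    return fine
```

For a density profile with a single discontinuity, only the two cells at the jump are split, so the cell count grows by two rather than doubling.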
NASA Astrophysics Data System (ADS)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.
2017-01-01
As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial-application to a large-scale parallel-application and the performance that was achieved.
3D CSEM inversion based on goal-oriented adaptive finite element method
NASA Astrophysics Data System (ADS)
Zhang, Y.; Key, K.
2016-12-01
We present a parallel 3D frequency domain controlled-source electromagnetic inversion code named MARE3DEM. Non-linear inversion of observed data is performed with the Occam variant of regularized Gauss-Newton optimization. The forward operator is based on the goal-oriented finite element method that efficiently calculates the responses and sensitivity kernels in parallel using a data decomposition scheme where independent modeling tasks contain different frequencies and subsets of the transmitters and receivers. To accommodate complex 3D conductivity variation with high flexibility and precision, we adopt the dual-grid approach where the forward mesh conforms to the inversion parameter grid and is adaptively refined until the forward solution converges to the desired accuracy. This dual-grid approach is memory efficient, since the inverse parameter grid remains independent from fine meshing generated around the transmitter and receivers by the adaptive finite element method. Besides, the unstructured inverse mesh efficiently handles multiple scale structures and allows for fine-scale model parameters within the region of interest. Our mesh generation engine keeps track of the refinement hierarchy so that the map of conductivity and sensitivity kernel between the forward and inverse mesh is retained. We employ the adjoint-reciprocity method to calculate the sensitivity kernels which establish a linear relationship between changes in the conductivity model and changes in the modeled responses. Our code uses a direct solver for the linear systems, so the adjoint problem is efficiently computed by re-using the factorization from the primary problem. Further computational efficiency and scalability are obtained in the regularized Gauss-Newton portion of the inversion using parallel dense matrix-matrix multiplication and matrix factorization routines implemented with the ScaLAPACK library.
We show the scalability, reliability and the potential of the algorithm to deal with complex geological scenarios by applying it to the inversion of synthetic marine controlled source EM data generated for a complex 3D offshore model with significant seafloor topography.
Kwon, Ronald Y.; Meays, Diana R.; Meilan, Alexander S.; Jones, Jeremiah; Miramontes, Rosa; Kardos, Natalie; Yeh, Jiunn-Chern; Frangos, John A.
2012-01-01
Interstitial fluid flow (IFF) is a potent regulatory signal in bone. During mechanical loading, IFF is generated through two distinct mechanisms that result in spatially distinct flow profiles: poroelastic interactions within the lacunar-canalicular system, and intramedullary pressurization. While the former generates IFF primarily within the lacunar-canalicular network, the latter generates significant flow at the endosteal surface as well as within the tissue. This gives rise to the intriguing possibility that loading-induced IFF may differentially activate osteocytes or surface-residing cells depending on the generating mechanism, and that sensation of IFF generated via intramedullary pressurization may be mediated by a non-osteocytic bone cell population. To begin to explore this possibility, we used the Dmp1-HBEGF inducible osteocyte ablation mouse model and a microfluidic system for modulating intramedullary pressure (ImP) to assess whether structural adaptation to ImP-driven IFF is altered by partial osteocyte depletion. Canalicular convective velocities during pressurization were estimated through the use of fluorescence recovery after photobleaching and computational modeling. Following osteocyte ablation, transgenic mice exhibited severe losses in bone structure and altered responses to hindlimb suspension in a compartment-specific manner. In pressure-loaded limbs, transgenic mice displayed similar or significantly enhanced structural adaptation to ImP-driven IFF, particularly in the trabecular compartment, despite up to ∼50% of trabecular lacunae being uninhabited following ablation. Interestingly, regression analysis revealed relative gains in bone structure in pressure-loaded limbs were correlated with reductions in bone structure in unpressurized control limbs, suggesting that adaptation to ImP-driven IFF was potentiated by increases in osteoclastic activity and/or reductions in osteoblastic activity incurred independently of pressure loading.
Collectively, these studies indicate that structural adaptation to ImP-driven IFF can proceed unimpeded following a significant depletion in osteocytes, consistent with the potential existence of a non-osteocytic bone cell population that senses ImP-driven IFF independently and potentially parallel to osteocytic sensation of poroelasticity-derived IFF. PMID:22413015
High-Throughput, Adaptive FFT Architecture for FPGA-Based Spaceborne Data Processors
NASA Technical Reports Server (NTRS)
NguyenKobayashi, Kayla; Zheng, Jason X.; He, Yutao; Shah, Biren N.
2011-01-01
Exponential growth in microelectronics technology such as field-programmable gate arrays (FPGAs) has enabled high-performance spaceborne instruments with increasing onboard data processing capabilities. As a commonly used digital signal processing (DSP) building block, the fast Fourier transform (FFT) has been of great interest in onboard data processing applications, which need to strike a reasonable balance between high performance (throughput, block size, etc.) and low resource usage (power, silicon footprint, etc.). It is also desirable that a single design can be reused and adapted into instruments with different requirements. The Multi-Pass Wide Kernel FFT (MPWK-FFT) architecture was developed, in which the high-throughput benefits of the parallel FFT structure and the low resource usage of Singleton's single butterfly method are exploited. The result is a wide-kernel, multipass, adaptive FFT architecture. The 32K-point MPWK-FFT architecture includes 32 radix-2 butterflies, 64 FIFOs to store the real inputs, 64 FIFOs to store the imaginary inputs, complex twiddle factor storage, and FIFO logic to route the outputs to the correct FIFO. The inputs are stored in sequential fashion into the FIFOs, and the outputs of each butterfly are sequentially written first into the even FIFO, then the odd FIFO. Because of the order of the outputs written into the FIFOs, the depth of the even FIFOs (768 each) is 1.5 times that of the odd FIFOs (512 each). The total memory needed for data storage, assuming that each sample is 36 bits, is 2.95 Mbits. The twiddle factors are stored in internal ROM inside the FPGA for fast access time. The total memory size to store the twiddle factors is 589.9 Kbits. This FFT structure combines the benefits of high throughput from the parallel FFT kernels and low resource usage from the multi-pass FFT kernels with desired adaptability.
Space instrument missions that need onboard FFT capabilities such as the proposed DESDynI, SWOT (Surface Water Ocean Topography), and Europa sounding radar missions would greatly benefit from this technology with significant reductions in non-recurring cost and risk.
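The memory figures quoted in this abstract can be reproduced with a short calculation. Two assumptions are made here that the abstract does not state outright: the 128 FIFOs are split evenly between even and odd, and the twiddle ROM holds N/2 factors of 36 bits each. Under those assumptions the totals match the quoted 2.95 Mbits and 589.9 Kbits (decimal units).

```python
BITS_PER_SAMPLE = 36            # stated in the abstract
N_POINTS = 32 * 1024            # 32K-point FFT
N_FIFOS = 64 + 64               # real-input FIFOs + imaginary-input FIFOs
EVEN_DEPTH, ODD_DEPTH = 768, 512

# Assumption: half of the FIFOs are "even" and half are "odd".
data_bits = (N_FIFOS // 2) * (EVEN_DEPTH + ODD_DEPTH) * BITS_PER_SAMPLE

# Assumption: N/2 twiddle factors, each stored in 36 bits.
twiddle_bits = (N_POINTS // 2) * BITS_PER_SAMPLE

print(f"data storage:    {data_bits / 1e6:.2f} Mbits")
print(f"twiddle storage: {twiddle_bits / 1e3:.1f} Kbits")
```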
Susoy, V; Herrmann, M
2014-05-01
Host-symbiont systems are of particular interest to evolutionary biology because they allow testable inferences of diversification processes while also providing both a historical basis and an ecological context for studies of adaptation. Our investigations of bark beetle symbionts, predatory nematodes of the genus Micoletzkya, have revealed remarkable diversity of the group along with a high level of host specificity. Cophylogenetic analyses suggest that evolution of the nematodes was largely influenced by the evolutionary history of beetles. The diversification of the symbionts, however, could not be attributed to parallel divergence alone; our results indicate that adaptive radiation of the nematodes was shaped by preferential host shifts among closely related beetles along with codivergence. Whereas ecological and geographic isolation have played a major role in the diversification of Micoletzkya at shallow phylogenetic depths, adaptations towards related hosts have played a role in shaping cophylogenetic structure at a larger evolutionary scale. © 2014 The Authors. Journal of Evolutionary Biology © 2014 European Society For Evolutionary Biology.
An intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces.
Ying, Xiang; Xin, Shi-Qing; Sun, Qian; He, Ying
2013-09-01
Poisson disk sampling has excellent spatial and spectral properties, and plays an important role in a variety of visual computing. Although many promising algorithms have been proposed for multidimensional sampling in Euclidean space, very few studies have been reported with regard to the problem of generating Poisson disks on surfaces due to the complicated nature of the surface. This paper presents an intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces. In sharp contrast to the conventional parallel approaches, our method neither partitions the given surface into small patches nor uses any spatial data structure to maintain the voids in the sampling domain. Instead, our approach assigns each sample candidate a random and unique priority that is unbiased with regard to the distribution. Hence, multiple threads can process the candidates simultaneously and resolve conflicts by checking the given priority values. Our algorithm guarantees that the generated Poisson disks are uniformly and randomly distributed without bias. It is worth noting that our method is intrinsic and independent of the embedding space. This intrinsic feature allows us to generate Poisson disk patterns on arbitrary surfaces in R^n. To our knowledge, this is the first intrinsic, parallel, and accurate algorithm for surface Poisson disk sampling. Furthermore, by manipulating the spatially varying density function, we can obtain adaptive sampling easily.
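The priority-based conflict resolution described in this abstract can be sketched as below. This is a serial emulation of the parallel rounds, and it uses planar Euclidean distance as a stand-in for the geodesic distance an intrinsic surface method would need; the function name `priority_poisson_disk` and the round structure are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def priority_poisson_disk(candidates, radius, seed=0):
    """Select candidates that are pairwise at least `radius` apart."""
    rng = random.Random(seed)
    # A random permutation gives every candidate a unique, unbiased priority
    # (a smaller rank means a higher priority).
    order = list(range(len(candidates)))
    rng.shuffle(order)
    priority = {idx: rank for rank, idx in enumerate(order)}

    def conflict(i, j):
        return math.dist(candidates[i], candidates[j]) < radius

    active = set(range(len(candidates)))
    accepted = []
    while active:
        # Each candidate's test reads only the state from the previous round,
        # so in a parallel version one thread per candidate could run it.
        winners = {i for i in active
                   if all(priority[i] < priority[j]
                          for j in active if j != i and conflict(i, j))}
        accepted.extend(sorted(winners))
        # Drop the winners and every candidate that a winner's disk covers.
        active = {i for i in active
                  if i not in winners
                  and not any(conflict(i, w) for w in winners)}
    return [candidates[i] for i in accepted]
```

Two conflicting candidates can never both win a round, because each would need a higher priority than the other, and the highest-priority active candidate always wins, so the loop terminates.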
DOE Office of Scientific and Technical Information (OSTI.GOV)
Onishi, Yasuo
Four Japan Atomic Energy Agency (JAEA) researchers visited Pacific Northwest National Laboratory (PNNL) for seven working days and evaluated the suitability and adaptability of FLESCOT to a JAEA supercomputer system to effectively simulate cesium behavior in dam reservoirs, river mouths, and coastal areas in Fukushima contaminated by the Fukushima Daiichi nuclear accident. PNNL showed the following to JAEA visitors during the seven-working-day period: FLESCOT source code; User’s manual; FLESCOT description – Program structure – Algorithm – Solver – Boundary condition handling – Data definition – Input and output methods – How to run. During the visit, JAEA had access to FLESCOT to run with an input data set to evaluate the capacity and feasibility of adapting it to a JAEA supercomputer with massively parallel processors. As a part of this evaluation, PNNL ran FLESCOT for sample cases of the contaminant migration simulation to further describe FLESCOT in action. JAEA and PNNL researchers also evaluated time spent for each subroutine of FLESCOT, and the JAEA researcher implemented some initial parallelization schemes in FLESCOT. Based on this code evaluation, JAEA and PNNL determined that FLESCOT is: applicable to Fukushima lakes/dam reservoirs, river mouth areas, and coastal water; and feasible to parallelize for the JAEA supercomputer. In addition, PNNL and JAEA researchers discussed molecular modeling approaches on cesium adsorption mechanisms to enhance the JAEA molecular modeling activities. PNNL and JAEA also discussed specific collaboration of molecular and computational modeling activities.
Clinal variation at phenology-related genes in spruce: parallel evolution in FTL2 and Gigantea?
Chen, Jun; Tsuda, Yoshiaki; Stocks, Michael; Källman, Thomas; Xu, Nannan; Kärkkäinen, Katri; Huotari, Tea; Semerikov, Vladimir L; Vendramin, Giovanni G; Lascoux, Martin
2014-07-01
Parallel clines in different species, or in different geographical regions of the same species, are an important source of information on the genetic basis of local adaptation. We recently detected latitudinal clines in SNPs frequencies and gene expression of candidate genes for growth cessation in Scandinavian populations of Norway spruce (Picea abies). Here we test whether the same clines are also present in Siberian spruce (P. obovata), a close relative of Norway spruce with a different Quaternary history. We sequenced nine candidate genes and 27 control loci and genotyped 14 SSR loci in six populations of P. obovata located along the Yenisei river from latitude 56°N to latitude 67°N. In contrast to Scandinavian Norway spruce that both departs from the standard neutral model (SNM) and shows a clear population structure, Siberian spruce populations along the Yenisei do not depart from the SNM and are genetically unstructured. Nonetheless, as in Norway spruce, growth cessation is significantly clinal. Polymorphisms in photoperiodic (FTL2) and circadian clock (Gigantea, GI, PRR3) genes also show significant clinal variation and/or evidence of local selection. In GI, one of the variants is the same as in Norway spruce. Finally, a strong cline in gene expression is observed for FTL2, but not for GI. These results, together with recent physiological studies, confirm the key role played by FTL2 and circadian clock genes in the control of growth cessation in spruce species and suggest the presence of parallel adaptation in these two species. Copyright © 2014 by the Genetics Society of America.
Kanarska, Yuliya; Walton, Otis
2015-11-30
Fluid-granular flows are common phenomena in nature and industry. Here, an efficient computational technique based on the distributed Lagrange multiplier method is utilized to simulate complex fluid-granular flows. Each particle is explicitly resolved on an Eulerian grid as a separate domain, using solid volume fractions. The fluid equations are solved through the entire computational domain; however, Lagrange multiplier constraints are applied inside the particle domain such that the fluid within any volume associated with a solid particle moves as an incompressible rigid body. The particle–particle interactions are implemented using explicit force-displacement interactions for frictional inelastic particles similar to the DEM method with some modifications using the volume of an overlapping region as an input to the contact forces. Here, a parallel implementation of the method is based on the SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) library.
CSM parallel structural methods research
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1989-01-01
Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.
Automated three-component synthesis of a library of γ-lactams
Fenster, Erik; Hill, David; Reiser, Oliver
2012-01-01
Summary A three-component method for the synthesis of γ-lactams from commercially available maleimides, aldehydes, and amines was adapted to parallel library synthesis. Improvements to the chemistry over previous efforts include the optimization of the method to a one-pot process, the management of by-products and excess reagents, the development of an automated parallel sequence, and the adaption of the method to permit the preparation of enantiomerically enriched products. These efforts culminated in the preparation of a library of 169 γ-lactams. PMID:23209515
LMC: Logarithmantic Monte Carlo
NASA Astrophysics Data System (ADS)
Mantz, Adam B.
2017-06-01
LMC is a Markov Chain Monte Carlo engine in Python that implements adaptive Metropolis-Hastings and slice sampling, as well as the affine-invariant method of Goodman & Weare, in a flexible framework. It can be used for simple problems, but the main use case is problems where expensive likelihood evaluations are provided by less flexible third-party software, which benefit from parallelization across many nodes at the sampling level. The parallel/adaptive methods use communication through MPI, or alternatively by writing/reading files, and mostly follow the approaches pioneered by CosmoMC (ascl:1106.025).
Parallel adaptive discontinuous Galerkin approximation for thin layer avalanche modeling
NASA Astrophysics Data System (ADS)
Patra, A. K.; Nichita, C. C.; Bauer, A. C.; Pitman, E. B.; Bursik, M.; Sheridan, M. F.
2006-08-01
This paper describes the development of highly accurate adaptive discontinuous Galerkin schemes for the solution of the equations arising from a thin layer type model of debris flows. Such flows have wide applicability in the analysis of avalanches induced by many natural calamities, e.g. volcanoes, earthquakes, etc. These schemes are coupled with special parallel solution methodologies to produce a simulation tool capable of very high-order numerical accuracy. The methodology successfully replicates cold rock avalanches at Mount Rainier, Washington and hot volcanic particulate flows at Colima Volcano, Mexico.
Unstructured grids on SIMD torus machines
NASA Technical Reports Server (NTRS)
Bjorstad, Petter E.; Schreiber, Robert
1994-01-01
Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.
NASA Astrophysics Data System (ADS)
Benichou, Jennifer I. C.; van Heijst, Jeroen W. J.; Glanville, Jacob; Louzoun, Yoram
2017-08-01
T and B cell receptor (TCR and BCR) complementarity determining region 3 (CDR3) genetic diversity is produced through multiple diversification and selection stages. Potential holes in the CDR3 repertoire were argued to be linked to immunodeficiencies and diseases. In contrast with BCRs, TCRs have practically no Dβ germline genetic diversity, and the question emerges as to whether they can produce a diverse CDR3 repertoire. In order to address the genetic diversity of the adaptive immune system, appropriate quantitative measures for diversity and large-scale sequencing are required. Such a diversity method should incorporate the complex diversification mechanisms of the adaptive immune response and the BCR and TCR loci structure. We combined large-scale sequencing and diversity measures to show that TCRs have a near maximal CDR3 genetic diversity. Specifically, TCRs have a larger junctional and V germline diversity, which starts more 5′ in Vβ than BCRs. Selection decreases the TCR repertoire diversity, but does not affect the BCR repertoire. As a result, the TCR repertoire is as diverse as the BCR repertoire, with a biased CDR3 length toward short TCRs and long BCRs. These differences suggest parallel converging evolutionary tracks to reach the required diversity to avoid holes in the CDR3 repertoire.
Kimmel, Charles B.; Cresko, William A.; Phillips, Patrick C.; Ullmann, Bonnie; Currey, Mark; von Hippel, Frank; Kristjánsson, Bjarni K.; Gelmond, Ofer; McGuigan, Katrina
2014-01-01
Evolution of similar phenotypes in independent populations is often taken as evidence of adaptation to the same fitness optimum. However, the genetic architecture of traits might cause evolution to proceed more often toward particular phenotypes, and less often toward others, independently of the adaptive value of the traits. Freshwater populations of Alaskan threespine stickleback have repeatedly evolved the same distinctive opercle shape after divergence from an oceanic ancestor. Here we demonstrate that this pattern of parallel evolution is widespread, distinguishing oceanic and freshwater populations across the Pacific Coast of North America and Iceland. We test whether this parallel evolution reflects genetic bias by estimating the additive genetic variance–covariance matrix (G) of opercle shape in an Alaskan oceanic (putative ancestral) population. We find significant additive genetic variance for opercle shape and that G has the potential to be biasing, because of the existence of regions of phenotypic space with low additive genetic variation. However, evolution did not occur along major eigenvectors of G, rather it occurred repeatedly in the same directions of high evolvability. We conclude that the parallel opercle evolution is most likely due to selection during adaptation to freshwater habitats, rather than due to biasing effects of opercle genetic architecture. PMID:22276538
Advanced Development for Space Robotics With Emphasis on Fault Tolerance Technology
NASA Technical Reports Server (NTRS)
Tesar, Delbert
1997-01-01
This report describes work developing fault tolerant redundant robotic architectures and adaptive control strategies for robotic manipulator systems which can dynamically accommodate drastic robot manipulator mechanism, sensor or control failures and maintain stable end-point trajectory control with minimum disturbance. Kinematic designs of redundant, modular, reconfigurable arms for fault tolerance were pursued at a fundamental level. The approach developed robotic testbeds to evaluate disturbance responses of fault tolerant concepts in robotic mechanisms and controllers. The development was implemented in various fault tolerant mechanism testbeds including duality in the joint servo motor modules, parallel and serial structural architectures, and dual arms. All have real-time adaptive controller technologies to react to mechanism or controller disturbances (failures) to perform real-time reconfiguration to continue the task operations. The developments fall into three main areas: hardware, software, and theoretical.
NASA Technical Reports Server (NTRS)
Feng, Hui-Yu; VanderWijngaart, Rob; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2001-01-01
We describe the design of a new method for the measurement of the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. The method involves the solution of a stylized heat transfer problem on an unstructured, adaptive grid. A Spectral Element Method (SEM) with an adaptive, nonconforming mesh is selected to discretize the transport equation. The relatively high order of the SEM lowers the fraction of wall clock time spent on inter-processor communication, which eases the load balancing task and allows us to concentrate on the memory accesses. The benchmark is designed to be three-dimensional. Parallelization and load balance issues of a reference implementation will be described in detail in future reports.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.
As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial application to a large-scale parallel application and the performance that was achieved.
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; ...
2016-09-29
As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. Finally, this paper details the process by which Alpgen was adapted from a single-processor serial application to a large-scale parallel application and the performance that was achieved.
Adaptive Nulling for the Terrestrial Planet Finder Interferometer
NASA Technical Reports Server (NTRS)
Peters, Robert D.; Lay, Oliver P.; Jeganathan, Muthu; Hirai, Akiko
2006-01-01
A description of adaptive nulling for the Terrestrial Planet Finder Interferometer (TPF-I) is presented. The topics include: 1) Nulling in TPF-I; 2) Why Do Adaptive Nulling; 3) Parallel High-Order Compensator Design; 4) Phase and Amplitude Control; 5) Development Activities; 6) Requirements; 7) Simplified Experimental Setup; 8) Intensity Correction; and 9) Intensity Dispersion Stability. A short summary is also given on adaptive nulling for the TPF-I.
Contrast adaptation in cat visual cortex is not mediated by GABA.
DeBruyn, E J; Bonds, A B
1986-09-24
The possible involvement of gamma-aminobutyric acid (GABA) in contrast adaptation in single cells in area 17 of the cat was investigated. Iontophoretic application of N-methyl bicuculline increased cell responses, but had no effect on the magnitude of adaptation. These results suggest that contrast adaptation is the result of inhibition through a parallel pathway, but that GABA does not mediate this process.
PLUM: Parallel Load Balancing for Unstructured Adaptive Meshes. Degree awarded by Colorado Univ.
NASA Technical Reports Server (NTRS)
Oliker, Leonid
1998-01-01
Dynamic mesh adaption on unstructured grids is a powerful tool for computing large-scale problems that require grid modifications to efficiently resolve solution features. By locally refining and coarsening the mesh to capture physical phenomena of interest, such procedures make standard computational methods more cost effective. Unfortunately, an efficient parallel implementation of these adaptive methods is rather difficult to achieve, primarily due to the load imbalance created by the dynamically-changing nonuniform grid. This requires significant communication at runtime, leading to idle processors and adversely affecting the total execution time. Nonetheless, it is generally thought that unstructured adaptive-grid techniques will constitute a significant fraction of future high-performance supercomputing. Various dynamic load balancing methods have been reported to date; however, most of them either lack a global view of loads across processors or do not apply their techniques to realistic large-scale applications.
Phase reconstruction using compressive two-step parallel phase-shifting digital holography
NASA Astrophysics Data System (ADS)
Ramachandran, Prakash; Alex, Zachariah C.; Nelleri, Anith
2018-04-01
The linear relationship between the sample complex object wave and its approximated complex Fresnel field obtained using single shot parallel phase-shifting digital holograms (PPSDH) is used in compressive sensing framework and an accurate phase reconstruction is demonstrated. It is shown that the accuracy of phase reconstruction of this method is better than that of compressive sensing adapted single exposure inline holography (SEOL) method. It is derived that the measurement model of PPSDH method retains both the real and imaginary parts of the Fresnel field but with an approximation noise and the measurement model of SEOL retains only the real part exactly equal to the real part of the complex Fresnel field and its imaginary part is completely not available. Numerical simulation is performed for CS adapted PPSDH and CS adapted SEOL and it is demonstrated that the phase reconstruction is accurate for CS adapted PPSDH and can be used for single shot digital holographic reconstruction.
PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Saini, Subhash (Technical Monitor)
1998-01-01
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper presents the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. A data redistribution model is also presented that predicts the remapping cost on the SP2. This model is required to determine whether the gain from a balanced workload distribution offsets the cost of data movement. Results presented in this paper demonstrate that PLUM is an effective dynamic load balancing strategy which remains viable on a large number of processors.
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids (global and local) to provide adaptive resolution and fast solution of PDEs. Like all such methods, it offers parallelism by using possibly many disconnected patches per level, but is hindered by the need to handle these levels sequentially. The finest levels must therefore wait for processing to be essentially completed on all the coarser ones. A recently developed asynchronous version of FAC, called AFAC, completely eliminates this bottleneck to parallelism. This paper describes timing results for AFAC, coupled with a simple load balancing scheme, applied to the solution of elliptic PDEs on an Intel iPSC hypercube. These tests include performance of certain processes necessary in adaptive methods, including moving grids and changing refinement. A companion paper reports on numerical and analytical results for estimating convergence factors of AFAC applied to very large scale examples.
A Model for Speedup of Parallel Programs
1997-01-01
Sanjeev K. Setia. The interaction between memory allocation and adaptive partitioning in message-passing multicomputers. In IPPS Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89-99, 1995. [15] Sanjeev K. Setia and Satish K. Tripathi. A comparative analysis of static
Developing parallel GeoFEST(P) using the PYRAMID AMR library
NASA Technical Reports Server (NTRS)
Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Tisdale, Robert E.
2004-01-01
The PYRAMID parallel unstructured adaptive mesh refinement (AMR) library has been coupled with the GeoFEST geophysical finite element simulation tool to support parallel active tectonics simulations. Specifically, we have demonstrated modeling of coseismic and postseismic surface displacement due to a simulated earthquake for the Landers system of interacting faults in Southern California. The new software demonstrated a 25-times resolution improvement and a 4-times reduction in time to solution over the sequential baseline milestone case. Simulations on workstations using a few tens of thousands of stress displacement finite elements can now be expanded to multiple millions of elements with greater than 98% scaled efficiency on various parallel platforms over many hundreds of processors. Our most recent work has demonstrated that we can dynamically adapt the computational grid as stress grows on a fault. In this paper, we will describe the major issues and challenges associated with coupling these two programs to create GeoFEST(P). Performance and visualization results will also be described.
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
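For readers unfamiliar with the equation this abstract refers to, a commonly used form is sketched below. The notation is assumed here and is not taken from the abstract: \phi is the arrival time, f a positive scalar speed function, M a symmetric positive-definite speed tensor, and \Omega the domain, with \phi fixed (typically to zero) on a source set. With M = f^2 I the second equation reduces to the first.

```latex
% Isotropic eikonal equation: arrival time \phi, scalar speed f > 0
\|\nabla \phi(\mathbf{x})\| = \frac{1}{f(\mathbf{x})}, \qquad \mathbf{x} \in \Omega
% Anisotropic generalization with a symmetric positive-definite tensor M
\sqrt{\nabla \phi(\mathbf{x})^{\mathsf{T}} \, M(\mathbf{x}) \, \nabla \phi(\mathbf{x})} = 1, \qquad \mathbf{x} \in \Omega
```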
Flexbar 3.0 - SIMD and multicore parallelization.
Roehr, Johannes T; Dieterich, Christoph; Reinert, Knut
2017-09-15
High-throughput sequencing machines can process many samples in a single run. For Illumina systems, sequencing reads are barcoded with an additional DNA tag that is contained in the respective sequencing adapters. The recognition of barcode and adapter sequences is hence commonly needed for the analysis of next-generation sequencing data. Flexbar performs demultiplexing based on barcodes and adapter trimming for such data. The massive amounts of data generated on modern sequencing machines demand that this preprocessing is done as efficiently as possible. We present Flexbar 3.0, the successor of the popular program Flexbar. It employs now twofold parallelism: multi-threading and additionally SIMD vectorization. Both types of parallelism are used to speed-up the computation of pair-wise sequence alignments, which are used for the detection of barcodes and adapters. Furthermore, new features were included to cover a wide range of applications. We evaluated the performance of Flexbar based on a simulated sequencing dataset. Our program outcompetes other tools in terms of speed and is among the best tools in the presented quality benchmark. https://github.com/seqan/flexbar. johannes.roehr@fu-berlin.de or knut.reinert@fu-berlin.de. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Adomavicius, G.
2000-01-01
Preliminary verification and validation of an efficient Euler solver for adaptively refined Cartesian meshes with embedded boundaries is presented. The parallel, multilevel method makes use of a new on-the-fly parallel domain decomposition strategy based upon the use of space-filling curves, and automatically generates a sequence of coarse meshes for processing by the multigrid smoother. The coarse mesh generation algorithm produces grids which completely cover the computational domain at every level in the mesh hierarchy. A series of examples on realistically complex three-dimensional configurations demonstrate that this new coarsening algorithm reliably achieves mesh coarsening ratios in excess of 7 on adaptively refined meshes. Numerical investigations of the scheme's local truncation error demonstrate an achieved order of accuracy between 1.82 and 1.88. Convergence results for the multigrid scheme are presented for both subsonic and transonic test cases and demonstrate W-cycle multigrid convergence rates between 0.84 and 0.94. Preliminary parallel scalability tests on both simple wing and complex complete aircraft geometries show a computational speedup of 52 on 64 processors using the run-time mesh partitioner.
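The idea of partitioning a Cartesian mesh along a space-filling curve can be illustrated with a minimal sketch. It assumes a Morton (Z-order) curve on a uniform two-dimensional grid, which is a simplification chosen for brevity; the abstract does not say which curve is used, and the function names `morton_key` and `partition_cells` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of integer cell coordinates (x, y) into a Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key


def partition_cells(cells, n_parts):
    """Sort cells along the curve and cut the ordering into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton_key(c[0], c[1]))
    base, extra = divmod(len(ordered), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)  # spread the remainder evenly
        parts.append(ordered[start:start + size])
        start += size
    return parts


# Toy example: a 4 x 4 grid split across 4 hypothetical processors.
cells = [(x, y) for x in range(4) for y in range(4)]
parts = partition_cells(cells, 4)
```

Because cells that are close on the curve tend to be close in space, each contiguous chunk forms a compact subdomain; in this toy case each of the four parts is one 2 x 2 quadrant of the grid.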
An efficient data structure for three-dimensional vertex based finite volume method
NASA Astrophysics Data System (ADS)
Akkurt, Semih; Sahin, Mehmet
2017-11-01
A vertex based three-dimensional finite volume algorithm has been developed using an edge based data structure. The mesh data structure of the given algorithm is similar to ones that exist in the literature. However, the data structures are redesigned and simplified in order to fit the requirements of the vertex based finite volume method. In order to increase the cache efficiency, the data access patterns for the vertex based finite volume method are investigated and the data are packed/allocated so that they are close to each other in memory. The present data structure is not limited to tetrahedra; arbitrary polyhedra are also supported in the mesh without any additional effort. Furthermore, the present data structure also supports adaptive refinement and coarsening. For the implicit and parallel implementation of the FVM algorithm, PETSc and MPI libraries are employed. The performance and accuracy of the present algorithm are tested on classical benchmark problems by comparing its CPU time with that of open source algorithms.
Prosodic Structure as a Parallel to Musical Structure
Heffner, Christopher C.; Slevc, L. Robert
2015-01-01
What structural properties do language and music share? Although early speculation identified a wide variety of possibilities, the literature has largely focused on the parallels between musical structure and syntactic structure. Here, we argue that parallels between musical structure and prosodic structure deserve more attention. We review the evidence for a link between musical and prosodic structure and find it to be strong. In fact, certain elements of prosodic structure may provide a parsimonious comparison with musical structure without sacrificing empirical findings related to the parallels between language and music. We then develop several predictions related to such a hypothesis. PMID:26733930
Tobias, Joseph A; Seddon, Nathalie
2009-12-01
Natural selection is known to produce convergent phenotypes through mimicry or ecological adaptation. It has also been proposed that social selection--i.e., selection exerted by social competition--may drive convergent evolution in signals mediating interspecific communication, yet this idea remains controversial. Here, we use color spectrophotometry, acoustic analyses, and playback experiments to assess the hypothesis of adaptive signal convergence in two competing nonsister taxa, Hypocnemis peruviana and H. subflava (Aves: Thamnophilidae). We show that the structure of territorial songs in males overlaps in sympatry, with some evidence of convergent character displacement. Conversely, nonterritorial vocal and visual signals in males are strikingly diagnostic, in line with 6.8% divergence in mtDNA sequences. The same pattern of variation applies to females. Finally, we show that songs in both sexes elicit strong territorial responses within and between species, whereas songs of a third, allopatric and more closely related species (H. striata) are structurally divergent and elicit weaker responses. Taken together, our results provide compelling evidence that social selection can act across species boundaries to drive convergent or parallel evolution in taxa competing for space and resources.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Tumeo, Antonino; Ferrandi, Fabrizio
Emerging applications such as data mining, bioinformatics, knowledge discovery, and social network analysis are irregular. They use data structures based on pointers or linked lists, such as graphs, unbalanced trees or unstructured grids, which generate unpredictable memory accesses. These data structures usually are large, but difficult to partition. These applications mostly are memory bandwidth bounded and have high synchronization intensity. However, they also have large amounts of inherent dynamic parallelism, because they potentially perform a task for each of the elements they are exploring. Several efforts are looking at accelerating these applications on hybrid architectures, which integrate general purpose processors with reconfigurable devices. Some solutions, which demonstrated significant speedups, include custom-hand tuned accelerators or even full processor architectures on the reconfigurable logic. In this paper we present an approach for the automatic synthesis of accelerators from C, targeted at irregular applications. In contrast to typical High Level Synthesis paradigms, which construct a centralized Finite State Machine, our approach generates dynamically scheduled hardware components. While parallelism exploitation in typical HLS-generated accelerators is usually bound within a single execution flow, our solution allows concurrently running multiple execution flows, thus also exploiting the coarser grain task parallelism of irregular applications. Our approach supports multiple, multi-ported and distributed memories, and atomic memory operations. Its main objective is parallelizing as many memory operations as possible, independently from their execution time, to maximize the memory bandwidth utilization. This significantly differs from current HLS flows, which usually consider a single memory port and require precise scheduling of memory operations.
A key innovation of our approach is the generation of a memory interface controller, which dynamically maps concurrent memory accesses to multiple ports. We present a case study on a typical irregular kernel, Graph Breadth First search (BFS), exploring different tradeoffs in terms of parallelism and number of memories.
Development of a scalable generic platform for adaptive optics real time control
NASA Astrophysics Data System (ADS)
Surendran, Avinash; Burse, Mahesh P.; Ramaprakash, A. N.; Parihar, Padmakar
2015-06-01
The main objective of the present project is to explore the viability of an adaptive optics control system based exclusively on Field Programmable Gate Arrays (FPGAs), making strong use of their parallel processing capability. In an Adaptive Optics (AO) system, the generation of the Deformable Mirror (DM) control voltages from the Wavefront Sensor (WFS) measurements is usually through the multiplication of the wavefront slopes with a predetermined reconstructor matrix. The ability to access several hundred hard multipliers and memories concurrently in an FPGA allows performance far beyond that of a modern CPU or GPU for tasks with a well-defined structure such as Adaptive Optics control. The target of the current project is to generate a signal for real-time wavefront correction from the signals coming from a Wavefront Sensor, wherein the system would be flexible enough to accommodate all the current Wavefront Sensing techniques and also the different methods which are used for wavefront compensation. The system should also accommodate different data transmission protocols (like Ethernet, USB, IEEE 1394 etc.) for transmitting data to and from the FPGA device, thus providing a more flexible platform for Adaptive Optics control. Preliminary simulation results for the formulation of the platform and a design of a fully scalable slope computer are presented.
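The reconstruction step described in this abstract, multiplying the wavefront slopes by a predetermined reconstructor matrix to obtain the DM commands, is a plain matrix-vector product. The sketch below shows it in Python for illustration only; the function name `reconstruct_commands` and the toy matrix values are hypothetical, and nothing here reflects the FPGA implementation itself.

```python
def reconstruct_commands(reconstructor, slopes):
    """Return DM commands as reconstructor (n_actuators x n_slopes) times slopes."""
    if any(len(row) != len(slopes) for row in reconstructor):
        raise ValueError("each reconstructor row must match the slope vector length")
    # Each actuator command is an independent dot product, which is why the
    # operation maps naturally onto many multipliers working in parallel.
    return [sum(r * s for r, s in zip(row, slopes)) for row in reconstructor]


# Hypothetical toy system: 2 actuators driven by 4 slope measurements.
R = [[0.5, 0.0, -0.5, 0.0],
     [0.0, 0.25, 0.0, -0.25]]
slopes = [2.0, 4.0, 1.0, 0.0]
commands = reconstruct_commands(R, slopes)
```

Each row of the matrix can be evaluated without reference to any other row, so the number of concurrent multiply-accumulate units, rather than the algorithm, sets the achievable latency.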
MODA A Framework for Memory Centric Performance Characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, Sunil; Su, Chun-Yi; White, Amanda M.
2012-06-29
In the age of massive parallelism, the focus of performance analysis has switched from the processor and related structures to the memory and I/O resources. Adapting to this new reality, a performance analysis tool has to provide a way to analyze resource usage to pinpoint existing and potential problems in a given application. This paper provides an overview of the Memory Observant Data Analysis (MODA) tool, a memory-centric tool first implemented on the Cray XMT supercomputer. Throughout the paper, MODA's capabilities have been showcased with experiments done on matrix multiply and Graph-500 application codes.
NASA Technical Reports Server (NTRS)
Starks, Scott; Abdel-Hafeez, Saleh; Usevitch, Bryan
1997-01-01
This paper discusses the implementation of a fuzzy logic system using an ASIC design approach. The approach is based upon combining the inherent advantages of symmetric triangular membership functions and fuzzy singleton sets to obtain a novel structure for fuzzy logic system application development. The resulting structure utilizes a fuzzy static RAM to store the rule-base and the end-points of the triangular membership functions. This provides advantages over other approaches in which all sampled values of membership functions for all universes must be stored. The fuzzy coprocessor structure implements the fuzzification and defuzzification processes through a two-stage parallel pipeline architecture which is capable of executing complex fuzzy computations in less than 0.55 µs with an accuracy of more than 95%, thus making it suitable for a wide range of applications. Using the approach presented in this paper, a fuzzy logic rule-base can be directly downloaded via a host processor to an on-chip rule-base memory with a size of 64 words. The fuzzy coprocessor's design supports up to 49 rules for seven fuzzy membership functions associated with each of the chip's two input variables. This feature allows designers to create fuzzy logic systems without the need for additional on-board memory. Finally, the paper reports on simulation studies that were conducted for several adaptive filter applications using the least mean squared adaptive algorithm for adjusting the knowledge rule-base.
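To make the combination of symmetric triangular membership functions and singleton consequents concrete, the following Python sketch evaluates a two-input system with seven membership functions per input, giving the 49 rules mentioned in the abstract. Several details are assumptions not stated in the source: the evenly spaced centers, the `min` operator for combining the two inputs, the weighted-average defuzzification, and the contents of the rule table. All names are hypothetical.

```python
def triangle(x, center, half_width):
    """Symmetric triangular membership: 1 at the center, 0 beyond half_width."""
    return max(0.0, 1.0 - abs(x - center) / half_width)


def infer(x1, x2, centers, half_width, rule_table):
    """Fuzzify both inputs, fire the rules, and defuzzify over singletons.

    Assumptions: rule strength is min(m1, m2), and the crisp output is the
    weighted average of the singleton consequents stored in rule_table.
    """
    num = 0.0
    den = 0.0
    for i, c1 in enumerate(centers):
        m1 = triangle(x1, c1, half_width)
        for j, c2 in enumerate(centers):
            w = min(m1, triangle(x2, c2, half_width))
            num += w * rule_table[i][j]
            den += w
    return num / den if den else 0.0


# Hypothetical setup: 7 membership functions per input, hence 7 * 7 = 49 rules.
centers = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
half_width = 1.0
rule_table = [[c1 + c2 for c2 in centers] for c1 in centers]  # illustrative singletons
output = infer(0.5, 0.0, centers, half_width, rule_table)
```

Only a center and a half-width need to be stored per membership function, which reflects the storage advantage the abstract attributes to keeping end-points rather than sampled membership values.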
NASA Technical Reports Server (NTRS)
Farhat, Charbel
1998-01-01
In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.
Genetic diversification of chemokine CXCL16 and its receptor CXCR6 in primates.
Xu, Feifei; He, Dan; Liu, Jiabin; Ni, Qingyong; Lyu, Yongqing; Xiong, Shiqiu; Li, Yan
2018-08-01
Chemokine CXCL16 and its receptor CXCR6 are associated with a series of physiological and pathological processes in cooperative and stand-alone fashions. To shed insight into their versatile nature, we studied genetic variations of CXCL16 and CXCR6 in primates. Evolutionary analyses revealed that these genes underwent a similar evolutionary fate. Both genes experienced adaptive diversification with the phylogenetic division of cercopithecoids (Old World monkeys) and hominoids (humans, great apes, and gibbons) from their common ancestor. In contrast, they were conserved in the periods preceding and following the dividing process. In terms of the adaptive diversification between cercopithecoids and hominoids, the adaptive genetic changes have occurred in the mucin-like and chemokine domains of CXCL16 and the N-terminus and transmembrane helixes of CXCR6. In combination with currently available structural and functional information for CXCL16 and CXCR6, the parallels between the evolutionary footprints and the co-occurrence of adaptive diversification at some evolutionary stage suggest that interplay could exist between the diversification-related amino acid sites, or between the domains on which the identified sites are located, in physiological processes such as chemotaxis and/or cell adhesion. Copyright © 2018 Elsevier Ltd. All rights reserved.
Adaptive Environment for Supercompiling with Optimized Parallelism (AESOP)
2011-09-01
DATES COVERED (From - To) September 2011 Final 09 March 2009 – 31 July 2011 4 . TITLE AND SUBTITLE ADAPTIVE ENVIRONMENT FOR SUPERCOMPILING WITH... 4 2.1 System characterization loop...Integration Points for AESOP .......................................................................................10 4 . LLVM and the AESOP Compiler
Adaptive implicit-explicit and parallel element-by-element iteration schemes
NASA Technical Reports Server (NTRS)
Tezduyar, T. E.; Liou, J.; Nguyen, T.; Poole, S.
1989-01-01
Adaptive implicit-explicit (AIE) and grouped element-by-element (GEBE) iteration schemes are presented for the finite element solution of large-scale problems in computational mechanics and physics. The AIE approach is based on the dynamic arrangement of the elements into differently treated groups. The GEBE procedure, which is a way of rewriting the EBE formulation to make its parallel processing potential and implementation clearer, is based on the static arrangement of the elements into groups with no inter-element coupling within each group. The numerical tests performed demonstrate savings in CPU time and memory.
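The GEBE idea of statically arranging elements into groups with no inter-element coupling within each group can be sketched as a greedy grouping pass. This is a minimal illustration in Python, not the authors' procedure: it assumes two elements are coupled exactly when they share a node, and the function name and the 1D chain mesh are hypothetical.

```python
def group_elements(elements):
    """Greedily place elements into groups whose members share no nodes.

    elements: list of node-index tuples, one tuple per element.
    Returns a list of groups, each a list of element indices.
    """
    groups = []       # element indices per group
    group_nodes = []  # nodes already claimed by each group
    for idx, nodes in enumerate(elements):
        node_set = set(nodes)
        for g, used in enumerate(group_nodes):
            if used.isdisjoint(node_set):
                groups[g].append(idx)
                used |= node_set  # in-place update of the group's node set
                break
        else:
            # No existing group can take this element, so open a new one.
            groups.append([idx])
            group_nodes.append(set(node_set))
    return groups


# Hypothetical 1D chain of four two-node elements.
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
groups = group_elements(chain)
```

Elements within one group touch disjoint sets of nodes, so their contributions can be processed concurrently without write conflicts, and the groups themselves are processed in sequence.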
Intelligent flight control systems
NASA Technical Reports Server (NTRS)
Stengel, Robert F.
1993-01-01
The capabilities of flight control systems can be enhanced by designing them to emulate functions of natural intelligence. Intelligent control functions fall in three categories. Declarative actions involve decision-making, providing models for system monitoring, goal planning, and system/scenario identification. Procedural actions concern skilled behavior and have parallels in guidance, navigation, and adaptation. Reflexive actions are spontaneous, inner-loop responses for control and estimation. Intelligent flight control systems learn knowledge of the aircraft and its mission and adapt to changes in the flight environment. Cognitive models form an efficient basis for integrating 'outer-loop/inner-loop' control functions and for developing robust parallel-processing algorithms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arumugam, Kamesh
Efficient parallel implementation of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. It requires exploiting the data-parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of these applications employ irregular algorithms which exhibit data-dependent control-flow and irregular memory accesses. Furthermore, these applications are often iterative with dependency between steps, which makes it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particle beam dynamics is one such application where the distribution of work and memory access pattern at each time step is irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications has focused on optimizing the irregular, data-dependent memory accesses and control-flow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-flow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps.
In this dissertation, we present novel machine learning based optimization techniques to address the parallel implementation challenges of such irregular applications on different HPC architectures. In particular, we use supervised learning to predict the computation structure and use it to address the control-flow and memory access irregularities in the parallel implementation of such applications on GPUs, Xeon Phis, and heterogeneous architectures composed of multi-core CPUs with GPUs or Xeon Phis. We use numerical simulation of charged particle beam dynamics as a motivating example throughout the dissertation to present our new approach, though the techniques should be equally applicable to a wide range of irregular applications. The machine learning approach presented here uses predictive analytics and forecasting techniques to adaptively model and track the irregular memory access pattern at each time step of the simulation to anticipate the future memory access pattern. Access pattern forecasts can then be used to formulate optimization decisions during application execution which improve the performance of the application at a future time step based on the observations from earlier time steps. In heterogeneous architectures, forecasts can also be used to improve the memory performance and resource utilization of all the processing units to deliver good aggregate performance. We used these optimization techniques and anticipation strategy to design a cache-aware, memory efficient parallel algorithm to address the irregularities in the parallel implementation of charged particle beam dynamics simulation on different HPC architectures. Experimental results using a diverse mix of HPC architectures show that our anticipation strategy is effective in maximizing data reuse, ensuring workload balance, minimizing branch and memory divergence, and improving resource utilization.
Similar traits, different genes? Examining convergent evolution in related weedy rice populations
USDA-ARS's Scientific Manuscript database
Convergent phenotypic evolution may or may not be associated with parallel genotypic evolution. Agricultural weeds have repeatedly been selected for weed-adaptive traits such as rapid growth, increased seed dispersal and dormancy, thus providing an ideal system for the study of parallel evolution. H...
Scaling Semantic Graph Databases in Size and Performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morari, Alessandro; Castellana, Vito G.; Villa, Oreste
In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by employing only graph methods at all the levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms. These features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, however, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grained data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL to data parallel C compiler, a library of parallel graph methods and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions and show how we solved the challenges posed by irregular behaviors. We present the results of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.
Synaptic plasticity in a cerebellum-like structure depends on temporal order
NASA Astrophysics Data System (ADS)
Bell, Curtis C.; Han, Victor Z.; Sugawara, Yoshiko; Grant, Kirsty
1997-05-01
Cerebellum-like structures in fish appear to act as adaptive sensory processors, in which learned predictions about sensory input are generated and subtracted from actual sensory input, allowing unpredicted inputs to stand out [1-3]. Pairing sensory input with centrally originating predictive signals, such as corollary discharge signals linked to motor commands, results in neural responses to the predictive signals alone that are 'negative images' of the previously paired sensory responses. Adding these 'negative images' to actual sensory inputs minimizes the neural response to predictable sensory features. At the cellular level, sensory input is relayed to the basal region of Purkinje-like cells, whereas predictive signals are relayed by parallel fibres to the apical dendrites of the same cells [4]. The generation of negative images could be explained by plasticity at parallel fibre synapses [5-7]. We show here that such plasticity exists in the electrosensory lobe of mormyrid electric fish and that it has the necessary properties for such a model: it is reversible, anti-hebbian (excitatory postsynaptic potentials (EPSPs) are depressed after pairing with a postsynaptic spike) and tightly dependent on the sequence of pre- and postsynaptic events, with depression occurring only if the postsynaptic spike follows EPSP onset within 60 ms.
McGlothlin, Joel W; Chuckalovcak, John P; Janes, Daniel E; Edwards, Scott V; Feldman, Chris R; Brodie, Edmund D; Pfrender, Michael E; Brodie, Edmund D
2014-11-01
Members of a gene family expressed in a single species often experience common selection pressures. Consequently, the molecular basis of complex adaptations may be expected to involve parallel evolutionary changes in multiple paralogs. Here, we use bacterial artificial chromosome library scans to investigate the evolution of the voltage-gated sodium channel (Nav) family in the garter snake Thamnophis sirtalis, a predator of highly toxic Taricha newts. Newts possess tetrodotoxin (TTX), which blocks Nav's, arresting action potentials in nerves and muscle. Some Thamnophis populations have evolved resistance to extremely high levels of TTX. Previous work has identified amino acid sites in the skeletal muscle sodium channel Nav1.4 that confer resistance to TTX and vary across populations. We identify parallel evolution of TTX resistance in two additional Nav paralogs, Nav1.6 and 1.7, which are known to be expressed in the peripheral nervous system and should thus be exposed to ingested TTX. Each paralog contains at least one TTX-resistant substitution identical to a substitution previously identified in Nav1.4. These sites are fixed across populations, suggesting that the resistant peripheral nerves antedate resistant muscle. In contrast, three sodium channels expressed solely in the central nervous system (Nav1.1-1.3) showed no evidence of TTX resistance, consistent with protection from toxins by the blood-brain barrier. We also report the exon-intron structure of six Nav paralogs, the first such analysis for snake genes. Our results demonstrate that the molecular basis of adaptation may be both repeatable across members of a gene family and predictable based on functional considerations. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The JCSG high-throughput structural biology pipeline.
Elsliger, Marc André; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wooley, John; Wüthrich, Kurt; Wilson, Ian A
2010-10-01
The Joint Center for Structural Genomics high-throughput structural biology pipeline has delivered more than 1000 structures to the community over the past ten years. The JCSG has made a significant contribution to the overall goal of the NIH Protein Structure Initiative (PSI) of expanding structural coverage of the protein universe, as well as making substantial inroads into structural coverage of an entire organism. Targets are processed through an extensive combination of bioinformatics and biophysical analyses to efficiently characterize and optimize each target prior to selection for structure determination. The pipeline uses parallel processing methods at almost every step in the process and can adapt to a wide range of protein targets from bacterial to human. The construction, expansion and optimization of the JCSG gene-to-structure pipeline over the years have resulted in many technological and methodological advances and developments. The vast number of targets and the enormous amounts of associated data processed through the multiple stages of the experimental pipeline required the development of a variety of valuable resources that, wherever feasible, have been converted to free-access web-based tools and applications.
The role of parallelism in the real-time processing of anaphora.
Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P
2012-06-01
Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution.
The role of parallelism in the real-time processing of anaphora
Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P.
2012-01-01
Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution. PMID:23741080
Efficacy of the SU(3) scheme for ab initio large-scale calculations beyond the lightest nuclei
Dytrych, T.; Maris, P.; Launey, K. D.; ...
2016-06-22
We report on the computational characteristics of ab initio nuclear structure calculations in a symmetry-adapted no-core shell model (SA-NCSM) framework. We examine the computational complexity of the current implementation of the SA-NCSM approach, dubbed LSU3shell, by analyzing ab initio results for 6Li and 12C in large harmonic oscillator model spaces and SU(3)-selected subspaces. We demonstrate LSU3shell's strong-scaling properties achieved with highly-parallel methods for computing the many-body matrix elements. Results compare favorably with complete model space calculations and significant memory savings are achieved in physically important applications. In particular, a well-chosen symmetry-adapted basis affords memory savings in calculations of states with a fixed total angular momentum in large model spaces while exactly preserving translational invariance.
Efficacy of the SU(3) scheme for ab initio large-scale calculations beyond the lightest nuclei
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dytrych, T.; Maris, Pieter; Launey, K. D.
2016-06-09
We report on the computational characteristics of ab initio nuclear structure calculations in a symmetry-adapted no-core shell model (SA-NCSM) framework. We examine the computational complexity of the current implementation of the SA-NCSM approach, dubbed LSU3shell, by analyzing ab initio results for 6Li and 12C in large harmonic oscillator model spaces and SU(3)-selected subspaces. We demonstrate LSU3shell's strong-scaling properties achieved with highly-parallel methods for computing the many-body matrix elements. Results compare favorably with complete model space calculations and significant memory savings are achieved in physically important applications. In particular, a well-chosen symmetry-adapted basis affords memory savings in calculations of states with a fixed total angular momentum in large model spaces while exactly preserving translational invariance.
Seismic analysis of parallel structures coupled by lead extrusion dampers
NASA Astrophysics Data System (ADS)
Patel, C. C.
2017-06-01
In this paper, the response behaviors of two parallel structures coupled by Lead Extrusion Dampers (LEDs) under various earthquake ground motion excitations are investigated. The equation of motion for the two parallel, multi-degree-of-freedom (MDOF) structures connected by LEDs is formulated. To explore the viability of LEDs for controlling the responses of parallel coupled structures, namely displacement, acceleration and shear force, the numerical study is done in two parts: (1) two parallel MDOF structures connected with LEDs having the same damping in all the dampers and (2) two parallel MDOF structures connected with LEDs having different damping. A parametric study is conducted to investigate the optimum damping of the dampers. Moreover, to limit the cost of the dampers, the study is conducted with only 50% of the total dampers at optimal locations, instead of placing dampers at every floor level. Results show that when LEDs connect parallel structures of different fundamental frequencies, the earthquake-induced responses of either structure can be effectively reduced. Further, it is not necessary to connect the two structures at all floors; fewer dampers at appropriate locations can significantly reduce the earthquake response of the coupled system, thus significantly reducing the cost of the dampers.
Sachetto Oliveira, Rafael; Martins Rocha, Bernardo; Burgarelli, Denise; Meira, Wagner; Constantinides, Christakis; Weber Dos Santos, Rodrigo
2018-02-01
The use of computer models as a tool for the study and understanding of the complex phenomena of cardiac electrophysiology has attained increased importance. At the same time, the increased complexity of the biophysical processes translates into complex computational and mathematical models. To speed up cardiac simulations and to allow more precise and realistic uses, two different techniques have been traditionally exploited: parallel computing and sophisticated numerical methods. In this work, we combine a modern parallel computing technique based on multicore and graphics processing units (GPUs) and a sophisticated numerical method based on a new space-time adaptive algorithm. We evaluate each technique alone and in different combinations: multicore and GPU; multicore and GPU and space adaptivity; multicore and GPU and space adaptivity and time adaptivity. All the techniques and combinations were evaluated under different scenarios: 3D simulations on slabs, 3D simulations on a ventricular mouse mesh (i.e., complex geometry), sinus-rhythm, and arrhythmic conditions. Our results suggest that multicore and GPU accelerate the simulations by an approximate factor of 33×, whereas the speedups attained by the space-time adaptive algorithms were approximately 48×. Nevertheless, by combining all the techniques, we obtained speedups that ranged between 165× and 498×. The tested methods were able to reduce the execution time of a simulation by more than 498× for a complex cellular model in a slab geometry and by 165× in a realistic heart geometry simulating spiral waves. The proposed methods will allow faster and more realistic simulations in a feasible time with no significant loss of accuracy. Copyright © 2017 John Wiley & Sons, Ltd.
Visualization of Octree Adaptive Mesh Refinement (AMR) in Astrophysical Simulations
NASA Astrophysics Data System (ADS)
Labadens, M.; Chapon, D.; Pomaréde, D.; Teyssier, R.
2012-09-01
Computer simulations are important in current cosmological research. Those simulations run in parallel on thousands of processors and produce huge amounts of data. Adaptive mesh refinement is used to reduce the computing cost while keeping good numerical accuracy in regions of interest. RAMSES is a cosmological code developed by the Commissariat à l'énergie atomique et aux énergies alternatives (English: Atomic Energy and Alternative Energies Commission) which uses Octree adaptive mesh refinement. Compared to grid based AMR, the Octree AMR has the advantage of fitting the adaptive resolution of the grid very precisely to the local problem complexity. However, this specific octree data type needs specific software to be visualized, as generic visualization tools work on Cartesian grid data types. This is why the PYMSES software has also been developed by our team. It relies on the Python scripting language to ensure modular and easy access for exploring those specific data. In order to take advantage of the High Performance Computer which runs the RAMSES simulation, it also uses MPI and multiprocessing to run some parallel code. We would like to present our PYMSES software in more detail, with some performance benchmarks. PYMSES currently has two visualization techniques which work directly on the AMR. The first one is a splatting technique, and the second one is a custom ray tracing technique. Both have their own advantages and drawbacks. We have also compared two parallel programming techniques: the Python multiprocessing library versus the use of MPI. The load balancing strategy has to be smartly defined in order to achieve a good speedup in our computation. Results obtained with this software are illustrated in the context of a massive, 9000-processor parallel simulation of a Milky Way-like galaxy.
Progress on the development of FullWave, a Hot and Cold Plasma Parallel Full Wave Code
NASA Astrophysics Data System (ADS)
Spencer, J. Andrew; Svidzinski, Vladimir; Zhao, Liangji; Kim, Jin-Soo
2017-10-01
FullWave is being developed at FAR-TECH, Inc. to simulate RF waves in hot inhomogeneous magnetized plasmas without making small orbit approximations. FullWave is based on a meshless formulation in configuration space on non-uniform clouds of computational points (CCP) adapted to better resolve plasma resonances, antenna structures and complex boundaries. The linear frequency domain wave equation is formulated using two approaches: for cold plasmas the local cold plasma dielectric tensor is used (resolving resonances by particle collisions), while for hot plasmas the conductivity kernel is calculated. The details of FullWave and some preliminary results will be presented, including: 1) a monitor function based on analytic solutions of the cold-plasma dispersion relation; 2) an adaptive CCP based on the monitor function; 3) construction of the finite differences for approximation of derivatives on adaptive CCP; 4) results of 2-D full wave simulations in the cold plasma model in tokamak geometry using the formulated approach for ECRH, ICRH and Lower Hybrid range of frequencies. Work is supported by the U.S. DOE SBIR program.
Linde, Jennifer A; Stringer, Deborah; Simms, Leonard J; Clark, Lee Anna
2013-08-01
The Schedule for Nonadaptive and Adaptive Personality-Youth Version (SNAP-Y) is a new, reliable self-report questionnaire that assesses 15 personality traits relevant to both normal-range personality and the alternative DSM-5 model for personality disorder. Community adolescents, 12 to 18 years old (N = 364), completed the SNAP-Y; 347 also completed the Big Five Inventory-Adolescent, 144 provided 2-week retest data, and 128 others completed the Minnesota Multiphasic Personality Inventory-Adolescent. Outpatient adolescents (N = 103) completed the SNAP-Y, and 97 also completed the Minnesota Multiphasic Personality Inventory-Adolescent. The SNAP-Y demonstrated strong psychometric properties, and structural, convergent, discriminant, and external validities. Consistent with the continuity of personality, results paralleled those in adult and college samples using the adult Schedule for Nonadaptive and Adaptive Personality-Second Edition (SNAP-2), from which the SNAP-Y derives and which has established validity in personality-trait assessment across the normal-abnormal continuum. The SNAP-Y thus provides a new, clinically useful instrument to assess personality traits and personality pathology in adolescents.
NASA Astrophysics Data System (ADS)
Omidi, Parsa; Diop, Mamadou; Carson, Jeffrey; Nasiriavanaki, Mohammadreza
2017-03-01
Linear-array-based photoacoustic computed tomography is a popular methodology for deep and high resolution imaging. However, issues such as phase aberration, side-lobe effects, and propagation limitations deteriorate the resolution. The effect of phase aberration due to acoustic attenuation and the assumption of a constant speed of sound (SoS) can be reduced by applying an adaptive weighting method such as the coherence factor (CF). Utilizing an adaptive beamforming algorithm such as the minimum variance (MV) can improve the resolution at the focal point by eliminating the side-lobes. Moreover, invisibility of directional objects emitting parallel to the detection plane, such as vessels and other absorbing structures stretched in the direction perpendicular to the detection plane, can degrade resolution. In this study, we propose a full-view array level weighting algorithm in which different weights are assigned to different positions of the linear array based on an orientation algorithm which uses the histogram of oriented gradient (HOG). Simulation results obtained from a synthetic phantom show the superior performance of the proposed method over the existing reconstruction methods.
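As background for the coherence factor (CF) weighting mentioned in this abstract, here is a minimal Python sketch using the commonly cited definition, CF = |sum of s_i|^2 / (N * sum of |s_i|^2), applied to real-valued channel samples for one pixel. It assumes the per-channel delays have already been applied, and it does not reproduce the HOG-based array weighting proposed by the authors. The function names are hypothetical.

```python
def coherence_factor(channel_samples):
    """Ratio of coherent to incoherent energy across channels, in [0, 1].

    channel_samples: delayed real-valued samples, one per array element,
    all corresponding to the same image pixel (an assumption of this sketch).
    """
    n = len(channel_samples)
    incoherent = n * sum(s * s for s in channel_samples)
    if incoherent == 0:
        return 0.0  # no signal on any channel
    coherent = sum(channel_samples) ** 2
    return coherent / incoherent


def cf_weighted_das(channel_samples):
    """Delay-and-sum output for one pixel, scaled by its coherence factor."""
    return coherence_factor(channel_samples) * sum(channel_samples)


aligned = coherence_factor([1.0, 1.0, 1.0, 1.0])       # channels agree
cancelling = coherence_factor([1.0, -1.0, 1.0, -1.0])  # channels cancel
```

Pixels where the channels agree keep their delay-and-sum amplitude, while pixels where the channels disagree are suppressed, which is the mechanism by which CF weighting reduces aberration-related clutter.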
Yu, Yinan; Diamantaras, Konstantinos I; McKelvey, Tomas; Kung, Sun-Yuan
2018-02-01
In kernel-based classification models, given limited computational power and storage capacity, operations over the full kernel matrix become prohibitive. In this paper, we propose a new supervised learning framework using kernel models for sequential data processing. The framework is based on two components that both aim at enhancing the classification capability with a subset selection scheme. The first part is a subspace projection technique in the reproducing kernel Hilbert space using a CLAss-specific Subspace Kernel representation for kernel approximation. In the second part, we propose a novel structural risk minimization algorithm called the adaptive margin slack minimization to iteratively improve the classification accuracy by an adaptive data selection. We motivate each part separately, and then integrate them into learning frameworks for large scale data. We propose two such frameworks: the memory efficient sequential processing for sequential data processing and the parallelized sequential processing for distributed computing with sequential data acquisition. We test our methods on several benchmark data sets and compare them with state-of-the-art techniques to verify the validity of the proposed techniques.
Knowledge representation into Ada parallel processing
NASA Technical Reports Server (NTRS)
Masotto, Tom; Babikyan, Carol; Harper, Richard
1990-01-01
The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated - a portion of the adaptive tactical navigator and a real time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, results of performance analyses showing speedup due to parallelism and initial efficiency improvements are detailed and further areas for performance improvements are suggested.
Attention and apparent motion.
Horowitz, T; Treisman, A
1994-01-01
Two dissociations between short- and long-range motion in visual search are reported. Previous research has shown parallel processing for short-range motion and apparently serial processing for long-range motion. This finding has been replicated and it has also been found that search for short-range targets can be impaired both by using bicontrast stimuli, and by prior adaptation to the target direction of motion. Neither factor impaired search in long-range motion displays. Adaptation actually facilitated search with long-range displays, which is attributed to response-level effects. A feature-integration account of apparent motion is proposed. In this theory, short-range motion depends on specialized motion feature detectors operating in parallel across the display, but subject to selective adaptation, whereas attention is needed to link successive elements when they appear at greater separations, or across opposite contrasts.
Zhang, Zhongcai; Wu, Yuqiang; Huang, Jinming
2016-11-01
The antiswing control and accurate positioning are simultaneously investigated for underactuated crane systems in the presence of two parallel payloads on the trolley and rail length limitation. The equations of motion for the crane system in question are established via the Euler-Lagrange equation. An adaptive control strategy is proposed with the help of system energy function and energy shaping technique. Stability analysis shows that under the designed adaptive controller, the payload swings can be suppressed ultimately and the trolley can be regulated to the destination while not exceeding the pre-specified boundaries. Simulation results are provided to show the satisfactory control performances of the presented control method in terms of working efficiency as well as robustness with respect to external disturbances. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Performance Analysis and Portability of the PLUM Load Balancing System
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak; Gabow, Harold N.
1998-01-01
The ability to dynamically adapt an unstructured mesh is a powerful tool for solving computational problems with evolving physical features; however, an efficient parallel implementation is rather difficult. To address this problem, we have developed PLUM, an automatic portable framework for performing adaptive numerical computations in a message-passing environment. PLUM requires that all data be globally redistributed after each mesh adaption to achieve load balance. We present an algorithm for minimizing this remapping overhead by guaranteeing an optimal processor reassignment. We also show that the data redistribution cost can be significantly reduced by applying our heuristic processor reassignment algorithm to the default mapping of the parallel partitioner. Portability is examined by comparing performance on a SP2, an Origin2000, and a T3E. Results show that PLUM can be successfully ported to different platforms without any code modifications.
3D magnetospheric parallel hybrid multi-grid method applied to planet–plasma interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leclercq, L., E-mail: ludivine.leclercq@latmos.ipsl.fr; Modolo, R., E-mail: ronan.modolo@latmos.ipsl.fr; Leblanc, F.
2016-03-15
We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet–plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as an inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid subsequently entering a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfven wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, T; de Supinski, B R; Schulz, M
Good load balance is crucial on very large parallel systems, but the most sophisticated algorithms introduce dynamic imbalances through adaptation in domain decomposition or use of adaptive solvers. To observe and diagnose imbalance, developers need system-wide, temporally-ordered measurements from full-scale runs. This potentially requires data collection from multiple code regions on all processors over the entire execution. Doing this instrumentation naively can, in combination with the application itself, exceed available I/O bandwidth and storage capacity, and can induce severe behavioral perturbations. We present and evaluate a novel technique for scalable, low-error load balance measurement. This uses a parallel wavelet transform and other parallel encoding methods. We show that our technique collects and reconstructs system-wide measurements with low error. Compression time scales sublinearly with system size and data volume is several orders of magnitude smaller than the raw data. The overhead is low enough for online use in a production environment.
Parallel Signal Processing and System Simulation using aCe
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2003-01-01
Recently, networked and cluster computation have become very popular for both signal processing and system simulation. A new language is ideally suited for parallel signal processing applications and system simulation since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, the new C based parallel language (ace C) for architecture-adaptive programming allows programmers to implement algorithms and system simulation applications on parallel architectures by providing them with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we will focus on some fundamental features of ace C and present a signal processing application (FFT).
Boessen, Ruud; van der Baan, Frederieke; Groenwold, Rolf; Egberts, Antoine; Klungel, Olaf; Grobbee, Diederick; Knol, Mirjam; Roes, Kit
2013-01-01
Two-stage clinical trial designs may be efficient in pharmacogenetics research when there is some but inconclusive evidence of effect modification by a genomic marker. Two-stage designs allow stopping early for efficacy or futility and can offer the additional opportunity to enrich the study population to a specific patient subgroup after an interim analysis. This study compared sample size requirements for fixed parallel group, group sequential, and adaptive selection designs with equal overall power and control of the family-wise type I error rate. The designs were evaluated across scenarios that defined the effect sizes in the marker positive and marker negative subgroups and the prevalence of marker positive patients in the overall study population. Effect sizes were chosen to reflect realistic planning scenarios, where at least some effect is present in the marker negative subgroup. In addition, scenarios were considered in which the assumed 'true' subgroup effects (i.e., the postulated effects) differed from those hypothesized at the planning stage. As expected, both two-stage designs generally required fewer patients than a fixed parallel group design, and the advantage increased as the difference between subgroups increased. The adaptive selection design added little further reduction in sample size, as compared with the group sequential design, when the postulated effect sizes were equal to those hypothesized at the planning stage. However, when the postulated effects deviated strongly in favor of enrichment, the comparative advantage of the adaptive selection design increased, which precisely reflects the adaptive nature of the design. Copyright © 2013 John Wiley & Sons, Ltd.
Unstructured Adaptive Meshes: Bad for Your Memory?
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Feng, Hui-Yu; VanderWijngaart, Rob
2003-01-01
This viewgraph presentation explores the need for a NASA Advanced Supercomputing (NAS) parallel benchmark for problems with irregular dynamical memory access. This benchmark is important and necessary because: 1) Problems with localized error source benefit from adaptive nonuniform meshes; 2) Certain machines perform poorly on such problems; 3) Parallel implementation may provide further performance improvement but is difficult. Some examples of problems which use irregular dynamical memory access include: 1) Heat transfer problem; 2) Heat source term; 3) Spectral element method; 4) Base functions; 5) Elemental discrete equations; 6) Global discrete equations. Nonconforming Mesh and Mortar Element Method are covered in greater detail in this presentation.
Host shifts result in parallel genetic changes when viruses evolve in closely related species
Day, Jonathan P.; Smith, Sophia C. L.; Houslay, Thomas M.; Tagliaferri, Lucia
2018-01-01
Host shifts, where a pathogen invades and establishes in a new host species, are a major source of emerging infectious diseases. They frequently occur between related host species and often rely on the pathogen evolving adaptations that increase their fitness in the novel host species. To investigate genetic changes in novel hosts, we experimentally evolved replicate lineages of an RNA virus (Drosophila C Virus) in 19 different species of Drosophilidae and deep sequenced the viral genomes. We found a strong pattern of parallel evolution, where viral lineages from the same host were genetically more similar to each other than to lineages from other host species. When we compared viruses that had evolved in different host species, we found that parallel genetic changes were more likely to occur if the two host species were closely related. This suggests that when a virus adapts to one host it might also become better adapted to closely related host species. This may explain in part why host shifts tend to occur between related species, and may mean that when a new pathogen appears in a given species, closely related species may become vulnerable to the new disease. PMID:29649296
DeFaveri, Jacquelin; Shikano, Takahito; Shimada, Yukinori; Goto, Akira; Merilä, Juha
2011-06-01
Examples of parallel evolution of phenotypic traits have been repeatedly demonstrated in threespine sticklebacks (Gasterosteus aculeatus) across their global distribution. Using these as a model, we performed a targeted genome scan--focusing on physiologically important genes potentially related to freshwater adaptation--to identify genetic signatures of parallel physiological evolution on a global scale. To this end, 50 microsatellite loci, including 26 loci within or close to (<6 kb) physiologically important genes, were screened in paired marine and freshwater populations from six locations across the Northern Hemisphere. Signatures of directional selection were detected in 24 loci, including 17 physiologically important genes, in at least one location. Although no loci showed consistent signatures of selection in all divergent population pairs, several outliers were common in multiple locations. In particular, seven physiologically important genes, as well as reference ectodysplasin gene (EDA), showed signatures of selection in three or more locations. Hence, although these results give some evidence for consistent parallel molecular evolution in response to freshwater colonization, they suggest that different evolutionary pathways may underlie physiological adaptation to freshwater habitats within the global distribution of the threespine stickleback. © 2011 The Author(s). Evolution© 2011 The Society for the Study of Evolution.
Efficient Delaunay Tessellation through K-D Tree Decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Peterka, Tom
Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using k-d tree compared with regular grid decomposition. Moreover, in the unbalanced data sets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
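The idea of using a k-d tree to distribute points evenly can be sketched serially as follows. This is only an illustration: it assumes an exact median split along cycling axes, which may differ from either of the two split-point strategies the abstract compares, and the function name is hypothetical.

```python
def kd_decompose(points, depth, axis=0):
    """Split points into up to 2**depth blocks of near-equal size.

    points: list of equal-length coordinate tuples.
    depth:  number of recursive median splits.
    axis:   coordinate used for the first split; axes then cycle.
    """
    if depth == 0 or len(points) <= 1:
        return [points]
    dim = len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2          # exact median split (an assumption)
    nxt = (axis + 1) % dim
    return (kd_decompose(ordered[:mid], depth - 1, nxt)
            + kd_decompose(ordered[mid:], depth - 1, nxt))
```

Even when the input is strongly clustered, every block receives the same number of points, whereas a regular grid would put almost all of them in one cell.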
An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH
NASA Astrophysics Data System (ADS)
Lee, D.; Gopal, S.; Mohapatra, P.
2012-07-01
We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
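The "Jacobian-free" part of a JFNK method rests on one approximation: the Krylov solver only needs products of the Jacobian with a vector, and these can be estimated from two residual evaluations, J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below illustrates that step only. The function name, the fixed ε, and the plain-list vectors are assumptions made for illustration and do not reflect FLASH's implementation.

```python
def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) @ v by a forward finite difference.

    F:   callable mapping a list of floats to a list of floats.
    u:   current state.
    v:   direction vector supplied by the Krylov solver.
    eps: perturbation size (a fixed value here; real codes scale it).
    """
    Fu = F(u)
    u_pert = [ui + eps * vi for ui, vi in zip(u, v)]
    Fp = F(u_pert)
    return [(a - b) / eps for a, b in zip(Fp, Fu)]
```

No Jacobian matrix is ever formed or stored, which is what makes the approach attractive for large stiff systems.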
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted
1990-01-01
Techniques are discussed for the implementation and improvement of vectorization and concurrency in nonlinear explicit structural finite element codes. In explicit integration methods, the computation of the element internal force vector consumes the bulk of the computer time. The program can be efficiently vectorized by subdividing the elements into blocks and executing all computations in vector mode. The structuring of elements into blocks also provides a convenient way to implement concurrency by creating tasks which can be assigned to available processors for evaluation. The techniques were implemented in a 3-D nonlinear program with one-point quadrature shell elements. Concurrency and vectorization were first implemented in a single time step version of the program. Techniques were developed to minimize processor idle time and to select the optimal vector length. A comparison of run times between the program executed in scalar, serial mode and the fully vectorized code executed concurrently using eight processors shows speed-ups of over 25. Conjugate gradient methods for solving nonlinear algebraic equations are also readily adapted to a parallel environment. A new technique for improving convergence properties of conjugate gradients in nonlinear problems is developed in conjunction with other techniques such as diagonal scaling. A significant reduction in the number of iterations required for convergence is shown for a statically loaded rigid bar suspended by three equally spaced springs.
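Conjugate gradients combined with diagonal scaling can be sketched for the linear case as below. This is the standard Jacobi-preconditioned CG for a symmetric positive definite system, written serially with plain lists. It is not the new nonlinear convergence technique described in the abstract, and the function name is hypothetical.

```python
def pcg_diagonal(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b (A symmetric positive definite) by CG with
    diagonal (Jacobi) scaling as the preconditioner."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x, x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [inv_diag[i] * r[i] for i in range(n)]    # scaled residual
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break
        # The matrix-vector product dominates the cost and is the part
        # that vectorizes and parallelizes naturally.
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```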
DOE Office of Scientific and Technical Information (OSTI.GOV)
Michael J. Bockelie
2002-01-04
This DOE SBIR Phase II final report summarizes research that has been performed to develop a parallel adaptive tool for modeling steady, two phase turbulent reacting flow. The target applications for the new tool are full scale, fossil-fuel fired boilers and furnaces such as those used in the electric utility industry, chemical process industry and mineral/metal process industry. The types of analyses to be performed on these systems are engineering calculations to evaluate the impact on overall furnace performance due to operational, process or equipment changes. To develop a Computational Fluid Dynamics (CFD) model of an industrial scale furnace requires a carefully designed grid that will capture all of the large and small scale features of the flowfield. Industrial systems are quite large, usually measured in tens of feet, but contain numerous burners, air injection ports, flames and localized behavior with dimensions that are measured in inches or fractions of inches. To create an accurate computational model of such systems requires capturing length scales within the flow field that span several orders of magnitude. In addition, to create an industrially useful model, the grid cannot contain too many grid points - the model must be able to execute on an inexpensive desktop PC in a matter of days. An adaptive mesh provides a convenient means to create a grid that can capture fine flow field detail within a very large domain with a "reasonable" number of grid points. However, the use of an adaptive mesh requires the development of a new flow solver. To create the new simulation tool, we have combined existing reacting CFD modeling software with new software based on emerging block structured Adaptive Mesh Refinement (AMR) technologies developed at Lawrence Berkeley National Laboratory (LBNL).
Specifically, we combined: -physical models, modeling expertise, and software from existing combustion simulation codes used by Reaction Engineering International; -mesh adaption, data management, and parallelization software and technology being developed by users of the BoxLib library at LBNL; and -solution methods for problems formulated on block structured grids that were being developed in collaboration with technical staff members at the University of Utah Center for High Performance Computing (CHPC) and at LBNL. The combustion modeling software used by Reaction Engineering International represents an investment of over fifty man-years of development, conducted over a period of twenty years. Thus, it was impractical to achieve our objective by starting from scratch. The research program resulted in an adaptive grid, reacting CFD flow solver that can be used only on limited problems. In current form the code is appropriate for use on academic problems with simplified geometries. The new solver is not sufficiently robust or sufficiently general to be used in a "production mode" for industrial applications. The principal difficulty lies with the multi-level solver technology. The use of multi-level solvers on adaptive grids with embedded boundaries is not yet a mature field and there are many issues that remain to be resolved. From the lessons learned in this SBIR program, we have started work on a new flow solver with an AMR capability. The new code is based on a conventional cell-by-cell mesh refinement strategy used in unstructured grid solvers that employ hexahedral cells. The new solver employs several of the concepts and solution strategies developed within this research program. The formulation of the composite grid problem for the new solver has been designed to avoid the embedded boundary complications encountered in this SBIR project.
This follow-on effort will result in a reacting flow CFD solver with localized mesh capability that can be used to perform engineering calculations on industrial problems in a production mode.
A parallel finite element simulator for ion transport through three-dimensional ion channel systems.
Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo
2013-09-15
A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I-V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can now be included in the PNP model, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL.
It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. Copyright © 2013 Wiley Periodicals, Inc.
A model of the magnetosheath magnetic field during magnetic clouds
NASA Astrophysics Data System (ADS)
Turc, L.; Fontaine, D.; Savoini, P.; Kilpua, E. K. J.
2014-02-01
Magnetic clouds (MCs) are huge interplanetary structures which originate from the Sun and have a paramount importance in driving magnetospheric storms. Before reaching the magnetosphere, MCs interact with the Earth's bow shock. This may alter their structure and therefore modify their expected geoeffectivity. We develop a simple 3-D model of the magnetosheath adapted to MCs conditions. This model is the first to describe the interaction of MCs with the bow shock and their propagation inside the magnetosheath. We find that when the MC encounters the Earth centrally and with its axis perpendicular to the Sun-Earth line, the MC's magnetic structure remains mostly unchanged from the solar wind to the magnetosheath. In this case, the entire dayside magnetosheath is located downstream of a quasi-perpendicular bow shock. When the MC is encountered far from its centre, or when its axis has a large tilt towards the ecliptic plane, the MC's structure downstream of the bow shock differs significantly from that upstream. Moreover, the MC's structure also differs from one region of the magnetosheath to another and these differences vary with time and space as the MC passes by. In these cases, the bow shock configuration is mainly quasi-parallel. Strong magnetic field asymmetries arise in the magnetosheath; the sign of the magnetic field north-south component may change from the solar wind to some parts of the magnetosheath. We stress the importance of the Bx component. We estimate the regions where the magnetosheath and magnetospheric magnetic fields are anti-parallel at the magnetopause (i.e. favourable to reconnection). We find that the location of anti-parallel fields varies with time as the MCs move past Earth's environment, and that they may be situated near the subsolar region even for an initially northward magnetic field upstream of the bow shock. 
Our results point out the major role played by the bow shock configuration in modifying or keeping the structure of the MCs unchanged. Note that this model is not restricted to MCs, it can be used to describe the magnetosheath magnetic field under an arbitrary slowly varying interplanetary magnetic field.
Self-Avoiding Walks Over Adaptive Triangular Grids
NASA Technical Reports Server (NTRS)
Heber, Gerd; Biswas, Rupak; Gao, Guang R.; Saini, Subhash (Technical Monitor)
1999-01-01
Space-filling curves are a popular approach, based on a geometric embedding, for linearizing computational meshes. We present a new O(n log n) combinatorial algorithm for constructing a self-avoiding walk through a two-dimensional mesh containing n triangles. We show that for hierarchical adaptive meshes, the algorithm can be locally adapted and easily parallelized by taking advantage of the regularity of the refinement rules. The proposed approach should be very useful in the runtime partitioning and load balancing of adaptive unstructured grids.
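For context, the geometric space-filling-curve linearization that this abstract takes as its starting point can be sketched with a Morton (Z-order) key. This illustrates the conventional approach only, not the combinatorial self-avoiding walk the authors construct. The function name and the use of integer cell indices are assumptions.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative integer cell indices.

    Sorting cells by this key visits them along a Z-order curve, so
    cells that are close in 2D tend to be close in the 1D ordering.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)        # x bits at even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)    # y bits at odd positions
    return key
```

Cutting the sorted list of keys into contiguous chunks of equal length gives a simple partition of the mesh cells among processors.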
Electrically controlled polymeric gel actuators
Adolf, Douglas B.; Shahinpoor, Mohsen; Segalman, Daniel J.; Witkowski, Walter R.
1993-01-01
Electrically controlled polymeric gel actuators or synthetic muscles are capable of undergoing substantial expansion and contraction when subjected to changing pH environments, temperature, or solvent. The actuators employ compliant containers for the gels and their solvents. The gels employed may be cylindrical electromechanical gel fibers such as polyacrylamide fibers or a mixture of poly vinyl alcohol-polyacrylic acid arranged in a parallel aggregate and contained in an electrolytic solvent bath such as salt water. The invention includes smart, electrically activated devices exploiting this phenomenon. These devices are capable of being manipulated via active computer control as large displacement actuators for use in adaptive structures such as robots.
Electrically controlled polymeric gel actuators
Adolf, D.B.; Shahinpoor, M.; Segalman, D.J.; Witkowski, W.R.
1993-10-05
Electrically controlled polymeric gel actuators or synthetic muscles are described that are capable of undergoing substantial expansion and contraction when subjected to changing pH environments, temperature, or solvent. The actuators employ compliant containers for the gels and their solvents. The gels employed may be cylindrical electromechanical gel fibers such as polyacrylamide fibers or a mixture of poly vinyl alcohol-polyacrylic acid arranged in a parallel aggregate and contained in an electrolytic solvent bath such as salt water. The invention includes smart, electrically activated devices exploiting this phenomenon. These devices are capable of being manipulated via active computer control as large displacement actuators for use in adaptive structures such as robots. 11 figures.
Enhancing data locality by using terminal propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrickson, B.; Leland, R.; Van Driessche, R.
1995-12-31
Terminal propagation is a method developed in the circuit placement community for adding constraints to graph partitioning problems. This paper adapts and expands this idea, and applies it to the problem of partitioning data structures among the processors of a parallel computer. We show how the constraints in terminal propagation can be used to encourage partitions in which messages are communicated only between architecturally near processors. We then show how these constraints can be handled in two important partitioning algorithms, spectral bisection and multilevel-KL. We compare the quality of partitions generated by these algorithms to each other and to partitions generated by more familiar techniques.
NASA Astrophysics Data System (ADS)
Pozzi, Paolo; Wilding, Dean; Soloviev, Oleg; Vdovin, Gleb; Verhaegen, Michel
2018-02-01
In this work, we present a new confocal laser scanning microscope capable of performing sensorless wavefront optimization in real time. The device is a parallelized laser scanning microscope in which the excitation light is structured in a lattice of spots by a spatial light modulator, while a deformable mirror provides aberration correction and scanning. A binary DMD is positioned in an image plane of the detection optical path, acting as a dynamic array of reflective confocal pinholes, imaged by a high-performance CMOS camera. A second camera detects images of the light rejected by the pinholes for sensorless aberration correction.
Partitioning problems in parallel, pipelined and distributed computing
NASA Technical Reports Server (NTRS)
Bokhari, S.
1985-01-01
The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple satellite system: partitioning multiple chain structured parallel programs, multiple arbitrarily structured serial programs and single tree structured parallel programs. In addition, the problems of partitioning chain structured parallel programs across chain connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.
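A much-simplified version of the chain partitioning problem can be sketched as follows: assign a chain of modules to p processors in contiguous blocks so that the heaviest block (the bottleneck) is as light as possible. This sketch uses a plain dynamic program and ignores communication costs. It is not the Sum-Bottleneck path algorithm of the paper, and the function name is hypothetical.

```python
def chain_partition_bottleneck(weights, p):
    """Minimal bottleneck load when a chain of modules with the given
    execution weights is cut into at most p contiguous blocks."""
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    inf = float("inf")
    # best[k][i]: minimal bottleneck for the first i modules on k processors
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0
    for k in range(1, p + 1):
        for i in range(n + 1):
            for j in range(i + 1):
                # Processor k receives modules j..i-1 (possibly none).
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i] = cand
    return best[p][n]
```

The triple loop costs O(p n²) time, which is adequate for illustration but far from the efficiency the abstract claims for its own algorithm.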
Parallel Algorithm Solves Coupled Differential Equations
NASA Technical Reports Server (NTRS)
Hayashi, A.
1987-01-01
Numerical methods adapted to concurrent processing. Algorithm solves set of coupled partial differential equations by numerical integration. Adapted to run on hypercube computer, algorithm separates problem into smaller problems solved concurrently. Increase in computing speed with concurrent processing over that achievable with conventional sequential processing appreciable, especially for large problems.
Research in Parallel Algorithms and Software for Computational Aerosciences
NASA Technical Reports Server (NTRS)
Domel, Neal D.
1996-01-01
Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Wu, Xiaoping; Akgün, Can; Vaughan, J Thomas; Andersen, Peter; Strupp, John; Uğurbil, Kâmil; Van de Moortele, Pierre-François
2010-07-01
Parallel excitation holds strong promise to mitigate the impact of large transmit B1 (B1+) distortion at very high magnetic field. Accelerated RF pulses, however, inherently tend to require larger RF peak power, which may result in a substantial increase in Specific Absorption Rate (SAR) in tissues, a constant concern for patient safety at very high field. In this study, we demonstrate adapted rate RF pulse design allowing for SAR reduction while preserving excitation target accuracy. Compared with other proposed implementations of adapted rate RF pulses, our approach is compatible with any k-space trajectory, does not require an analytical expression of the gradient waveform and can be used for large flip angle excitation. We demonstrate our method with numerical simulations based on electromagnetic modeling and we include an experimental verification of transmit pattern accuracy on an 8 transmit channel 9.4 T system.
A framework for grand scale parallelization of the combined finite discrete element method in 2d
NASA Astrophysics Data System (ADS)
Lei, Z.; Rougier, E.; Knight, E. E.; Munjiza, A.
2014-09-01
Within the context of rock mechanics, the Combined Finite-Discrete Element Method (FDEM) has been applied to many complex industrial problems such as block caving, deep mining techniques (tunneling, pillar strength, etc.), rock blasting, seismic wave propagation, packing problems, dam stability, rock slope stability, rock mass strength characterization problems, etc. The reality is that most of these were accomplished in a 2D and/or single processor realm. In this work a hardware independent FDEM parallelization framework has been developed using the Virtual Parallel Machine for FDEM (V-FDEM). With V-FDEM, parallel FDEM software can be adapted to different parallel architecture systems ranging from just a few to thousands of cores.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Terawaki, Shin-ichi, E-mail: terawaki@gunma-u.ac.jp; SPring-8 Center, RIKEN, 1-1-1 Koto, Sayo-cho, Sayo-gun, Hyogo 679-5148; Yoshikane, Asuka
Bicaudal-D1 (BICD1) is an α-helical coiled-coil protein mediating the attachment of specific cargo to cytoplasmic dynein. It plays an essential role in minus end-directed intracellular transport along microtubules. The third C-terminal coiled-coil region of BICD1 (BICD1 CC3) has an important role in cargo sorting, including intracellular vesicles associating with the small GTPase Rab6 and the nuclear pore complex Ran binding protein 2 (RanBP2), and inhibiting the association with cytoplasmic dynein by binding to the first N-terminal coiled-coil region (CC1). The crystal structure of BICD1 CC3 revealed a parallel homodimeric coiled-coil with asymmetry and complementary knobs-into-holes interactions, differing from Drosophila BicD CC3. Furthermore, our binding study indicated that BICD1 CC3 possesses a binding surface for two distinct cargos, Rab6 and RanBP2, and that the CC1-binding site overlaps with the Rab6-binding site. These findings suggest a molecular basis for cargo recognition and autoinhibition of BICD proteins during dynein-dependent intracellular retrograde transport. - Highlights: • BICD1 CC3 is a parallel homodimeric coiled-coil with axial asymmetry. • The coiled-coil packing of BICD1 CC3 is adapted to the equivalent heptad position. • BICD1 CC3 has distinct binding sites for two classes of cargo, Rab6 and RanBP2. • The CC1-binding site of BICD1 CC3 overlaps with the Rab6-binding site.
Pinzon-Morales, Ruben-Dario; Hirata, Yutaka
2014-01-01
To acquire and maintain precise movement controls over a lifespan, changes in the physical and physiological characteristics of muscles must be compensated for adaptively. The cerebellum plays a crucial role in such adaptation. Changes in muscle characteristics are not always symmetrical. For example, it is unlikely that muscles that bend and straighten a joint will change to the same degree. Thus, different (i.e., asymmetrical) adaptation is required for bending and straightening motions. To date, little is known about the role of the cerebellum in asymmetrical adaptation. Here, we investigate the cerebellar mechanisms required for asymmetrical adaptation using a bi-hemispheric cerebellar neuronal network model (biCNN). The bi-hemispheric structure is inspired by the observation that lesioning one hemisphere reduces motor performance asymmetrically. The biCNN model was constructed to run in real-time and used to control an unstable two-wheeled balancing robot. The load of the robot and its environment were modified to create asymmetrical perturbations. Plasticity at parallel fiber-Purkinje cell synapses in the biCNN model was driven by error signal in the climbing fiber (cf) input. This cf input was configured to increase and decrease its firing rate from its spontaneous firing rate (approximately 1 Hz) with sensory errors in the preferred and non-preferred direction of each hemisphere, as demonstrated in the monkey cerebellum. Our results showed that asymmetrical conditions were successfully handled by the biCNN model, in contrast to a single hemisphere model or a classical non-adaptive proportional and derivative controller. Further, the spontaneous activity of the cf, while relatively small, was critical for balancing the contribution of each cerebellar hemisphere to the overall motor command sent to the robot. Eliminating the spontaneous activity compromised the asymmetrical learning capabilities of the biCNN model. 
Thus, we conclude that a bi-hemispheric structure and adequate spontaneous activity of cf inputs are critical for cerebellar asymmetrical motor learning. PMID:25414644
Parallel basal ganglia circuits for voluntary and automatic behaviour to reach rewards
Hikosaka, Okihide
2015-01-01
The basal ganglia control body movements, value processing and decision-making. Many studies have shown that the inputs and outputs of each basal ganglia structure are topographically organized, which suggests that the basal ganglia consist of separate circuits that serve distinct functions. A notable example is the circuits that originate from the rostral (head) and caudal (tail) regions of the caudate nucleus, both of which target the superior colliculus. These two caudate regions encode the reward values of visual objects differently: flexible (short-term) values by the caudate head and stable (long-term) values by the caudate tail. These value signals in the caudate guide the orienting of gaze differently: voluntary saccades by the caudate head circuit and automatic saccades by the caudate tail circuit. Moreover, separate groups of dopamine neurons innervate the caudate head and tail and may selectively guide the flexible and stable learning/memory in the caudate regions. Studies focusing on manual handling of objects also suggest that rostrocaudally separated circuits in the basal ganglia control the action differently. These results suggest that the basal ganglia contain parallel circuits for two steps of goal-directed behaviour: finding valuable objects and manipulating the valuable objects. These parallel circuits may underlie voluntary behaviour and automatic skills, enabling animals (including humans) to adapt to both volatile and stable environments. This understanding of the functions and mechanisms of the basal ganglia parallel circuits may inform the differential diagnosis and treatment of basal ganglia disorders. PMID:25981958
Efficient parallelization for AMR MHD multiphysics calculations; implementation in AstroBEAR
NASA Astrophysics Data System (ADS)
Carroll-Nellenback, Jonathan J.; Shroyer, Brandon; Frank, Adam; Ding, Chen
2013-03-01
Current adaptive mesh refinement (AMR) simulations require algorithms that are highly parallelized and manage memory efficiently. As compute engines grow larger, AMR simulations will require algorithms that achieve new levels of efficient parallelization and memory management. We have attempted to employ new techniques to achieve both of these goals. Patch or grid based AMR often employs ghost cells to decouple the hyperbolic advances of each grid on a given refinement level. This decoupling allows each grid to be advanced independently. In AstroBEAR we utilize this independence by threading the grid advances on each level with preference going to the finer level grids. This allows for global load balancing instead of level by level load balancing and allows for greater parallelization across both physical space and AMR level. Threading of level advances can also improve performance by interleaving communication with computation, especially in deep simulations with many levels of refinement. While we see improvements of up to 30% on deep simulations run on a few cores, the speedup is typically more modest (5-20%) for larger scale simulations. To improve memory management we have employed a distributed tree algorithm that requires processors to only store and communicate local sections of the AMR tree structure with neighboring processors. Using this distributed approach we are able to get reasonable scaling efficiency (>80%) out to 12288 cores and up to 8 levels of AMR - independent of the use of threading.
Porting plasma physics simulation codes to modern computing architectures using the
NASA Astrophysics Data System (ADS)
Germaschewski, Kai; Abbott, Stephen
2015-11-01
Available computing power has continued to grow exponentially even after single-core performance saturated in the last decade. The increase has since been driven by more parallelism, both using more cores and having more parallelism in each core, e.g. in GPUs and Intel Xeon Phi. Adapting existing plasma physics codes is challenging, in particular as there is no single programming model that covers current and future architectures. We will introduce the open-source
Dowdy, David W; Pai, Madhukar
2012-11-01
Epidemiology occupies a unique role as a knowledge-generating scientific discipline with roots in the knowledge translation of public health practice. As our fund of incompletely-translated knowledge expands and as budgets for health research contract, epidemiology must rediscover and adapt its historical skill set in knowledge translation. The existing incentive structures of academic epidemiology - designed largely for knowledge generation - are ill-equipped to train and develop epidemiologists as knowledge translators. A useful heuristic is the epidemiologist as Accountable Health Advocate (AHA) who enables society to judge the value of research, develops new methods to translate existing knowledge into improved health, and actively engages with policymakers and society. Changes to incentive structures could include novel funding streams (and review), alternative publication practices, and parallel frameworks for professional advancement and promotion.
Three-beam aerosol backscatter correlation lidar for wind profiling
NASA Astrophysics Data System (ADS)
Prasad, Narasimha S.; Radhakrishnan Mylapore, Anand
2017-03-01
The development of a three-beam aerosol backscatter correlation (ABC) light detection and ranging (lidar) to measure wind characteristics for wake vortex and plume tracking applications is discussed. This is a direct detection elastic lidar that uses three laser transceivers, operating at 1030-nm wavelength with ~10-kHz pulse repetition frequency and nanosecond-class pulse widths, to directly obtain three components of wind velocities. By tracking the motion of aerosol structures along and between three near-parallel laser beams, three-component wind speed profiles along the field-of-view of laser beams are obtained. With three 8-in. transceiver modules, placed in a near-parallel configuration on a two-axis pan-tilt scanner, the lidar measures wind speeds up to 2 km away. Optical flow algorithms have been adapted to obtain the movement of aerosol structures between the beams. Aerosol density fluctuations are cross-correlated between successive scans to obtain the displacements of the aerosol features along the three axes. Using the range resolved elastic backscatter data from each laser beam, which is scanned over the volume of interest, a three-dimensional map of aerosol density can be generated in a short time span. The performance of the ABC wind lidar prototype, validated using sonic anemometer measurements, is discussed.
Mühlebach, Anneke; Adam, Joachim; Schön, Uwe
2011-11-01
Automated medicinal chemistry (parallel chemistry) has become an integral part of the drug-discovery process in almost every large pharmaceutical company. Parallel array synthesis of individual organic compounds has been used extensively to generate diverse structural libraries to support different phases of the drug-discovery process, such as hit-to-lead, lead finding, or lead optimization. In order to guarantee effective project support, efficiency in the production of compound libraries has been maximized. As a consequence, throughput in chromatographic purification and analysis has also been adapted. As a recent trend, more laboratories are preparing smaller, yet more focused libraries with ever-increasing demands on quality, i.e. optimal purity and unambiguous confirmation of identity. This paper presents an automated approach that combines effective purification and structural confirmation of a lead optimization library created by microwave-assisted organic synthesis. The results of complementary analytical techniques such as UHPLC-HRMS and NMR are not only considered but also merged for fast and easy decision making, providing optimal quality of compound stock. In comparison with the previous procedures, throughput times are at least four times faster, while compound consumption could be decreased more than threefold. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Knowledge-Based Approach for Item Exposure Control in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Doong, Shing H.
2009-01-01
The purpose of this study is to investigate a functional relation between item exposure parameters (IEPs) and item parameters (IPs) over parallel pools. This functional relation is approximated by a well-known tool in machine learning. Let P and Q be parallel item pools and suppose IEPs for P have been obtained via a Sympson and Hetter-type…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koniges, A.E.
The author describes the new T3D parallel computer at NERSC. The adaptive mesh ICF3D code is one of the current applications being ported and developed for use on the T3D. It has been stressed in other papers in these proceedings that the development environment and tools available on the parallel computer are similar to any planned for the future, including networks of workstations.
The island dynamics model on parallel quadtree grids
NASA Astrophysics Data System (ADS)
Mistani, Pouria; Guittet, Arthur; Bochkov, Daniil; Schneider, Joshua; Margetis, Dionisios; Ratsch, Christian; Gibou, Frederic
2018-05-01
We introduce an approach for simulating epitaxial growth by use of an island dynamics model on a forest of quadtree grids, and in a parallel environment. To this end, we use a parallel framework introduced in the context of the level-set method. This framework utilizes: discretizations that achieve a second-order accurate level-set method on non-graded adaptive Cartesian grids for solving the associated free boundary value problem for surface diffusion; and an established library for the partitioning of the grid. We consider the cases with: irreversible aggregation, which amounts to applying Dirichlet boundary conditions at the island boundary; and an asymmetric (Ehrlich-Schwoebel) energy barrier for attachment/detachment of atoms at the island boundary, which entails the use of a Robin boundary condition. We provide the scaling analyses performed on the Stampede supercomputer and numerical examples that illustrate the capability of our methodology to efficiently simulate different aspects of epitaxial growth. The combination of adaptivity and parallelism in our approach enables simulations that are several orders of magnitude faster than those reported in the recent literature and, thus, provides a viable framework for the systematic study of mound formation on crystal surfaces.
Parallel design of JPEG-LS encoder on graphics processing units
NASA Astrophysics Data System (ADS)
Duan, Hao; Fang, Yong; Huang, Bormin
2012-01-01
With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
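The parallel prefix sum named among the CUDA techniques above can be sketched as follows. This is a sequential Python simulation of a Hillis-Steele scan, not the authors' CUDA kernel. Using the scan to find where each independently encoded block starts in the output stream is an assumption about one plausible use; the abstract does not say how the scan is applied. Both function names are hypothetical.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum, simulated one pass at a time."""
    data = list(values)
    stride = 1
    while stride < len(data):
        # Every update in a pass reads only the previous pass, so on a GPU
        # each index could be handled by its own thread.
        data = [data[i] + data[i - stride] if i >= stride else data[i]
                for i in range(len(data))]
        stride *= 2
    return data


def block_offsets(compressed_sizes):
    """Exclusive scan: assumed start offset of each block in the output."""
    if not compressed_sizes:
        return []
    scanned = inclusive_scan(compressed_sizes)
    return [0] + scanned[:-1]
```

The scan needs about log2(n) passes instead of n sequential additions, which is why it suits a block-parallel encoder whose blocks produce variable-length output.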
Some Considerations in Maintaining Adaptive Test Item Pools.
ERIC Educational Resources Information Center
Stocking, Martha L.
The construction of parallel editions of conventional tests for purposes of test security while maintaining score comparability has always been a recognized and difficult problem in psychometrics and test construction. The introduction of new modes of test construction, e.g., adaptive testing, changes the nature of the problem, but does not make…
Electrooptical adaptive switching network for the hypercube computer
NASA Technical Reports Server (NTRS)
Chow, E.; Peterson, J.
1988-01-01
An all-optical network design for the hyperswitch network using regular free-space interconnects between electronic processor nodes is presented. The adaptive routing model used is described, and an adaptive routing control example is presented. The design demonstrates that existing electrooptical techniques are sufficient for implementing efficient parallel architectures without the need for more complex means of implementing arbitrary interconnection schemes. The electrooptical hyperswitch network significantly improves the communication performance of the hypercube computer.
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.
1991-01-01
Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
Approximation algorithms for scheduling unrelated parallel machines with release dates
NASA Astrophysics Data System (ADS)
Avdeenko, T. V.; Mesentsev, Y. A.; Estraykh, I. V.
2017-01-01
In this paper we propose approaches to optimal scheduling of unrelated parallel machines with release dates. One approach is based on a dynamic programming scheme modified with adaptive narrowing of the search domain to ensure its computational effectiveness. We discuss the complexity of synthesizing exact schedules and compare it with approximate, close-to-optimal solutions. We also explain how the algorithm works using the example of two unrelated parallel machines and five jobs with release dates. Performance results that show the efficiency of the proposed approach are given.
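For a problem as small as the two-machine, five-job example mentioned above, an exhaustive baseline is easy to write and useful for checking faster methods. The sketch below is not the dynamic programming scheme of the paper. It assumes the objective is the makespan (the abstract does not name the objective), and the function name and data layout are illustrative.

```python
from itertools import product


def best_schedule(release, proc_times):
    """Exhaustive search over job-to-machine assignments.

    release[j]       -- release date of job j
    proc_times[m][j] -- processing time of job j on machine m (unrelated machines)
    Returns (makespan, assignment), where assignment[j] is the machine of job j.
    """
    n_machines, n_jobs = len(proc_times), len(release)
    best = (float("inf"), None)
    for assignment in product(range(n_machines), repeat=n_jobs):
        makespan = 0
        for m in range(n_machines):
            t = 0
            jobs = [j for j in range(n_jobs) if assignment[j] == m]
            # On a single machine, processing jobs in order of release date
            # minimizes the time at which the last job completes.
            for j in sorted(jobs, key=lambda j: release[j]):
                t = max(t, release[j]) + proc_times[m][j]
            makespan = max(makespan, t)
        best = min(best, (makespan, assignment))
    return best
```

The search visits every one of the `n_machines ** n_jobs` assignments, so it is practical only for toy instances such as `best_schedule([0, 1, 2, 3, 4], [[3, 2, 4, 1, 2], [2, 4, 1, 3, 3]])`, where the data are made up for illustration.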
NASA Astrophysics Data System (ADS)
Sardinha-Lourenço, A.; Andrade-Campos, A.; Antunes, A.; Oliveira, M. S.
2018-03-01
Recent research on short-term water demand forecasting has shown that models using univariate time series based on historical data are useful and can be combined with other prediction methods to reduce errors. Water demand in drinking water distribution networks is repetitive in nature, which, under similar meteorological conditions and for similar consumers, allows the development of a heuristic forecast model that, in turn, combined with other autoregressive models, can provide reliable forecasts. In this study, a parallel adaptive weighting strategy for forecasting water consumption over the next 24-48 h, using univariate time series of potable water consumption, is proposed. Two Portuguese potable water distribution networks are used as case studies where the only input data are the consumption of water and the national calendar. For the development of the strategy, the Autoregressive Integrated Moving Average (ARIMA) method and a short-term forecast heuristic algorithm are used. Simulations with the model showed that, when using a parallel adaptive weighting strategy, the prediction error can be reduced by 15.96% and the average error by 9.20%. This reduction is important in the control and management of water supply systems. The proposed methodology can be extended to other forecast methods, especially when multiple forecast models are available.
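The idea of running two forecasters in parallel and weighting them adaptively can be illustrated with a minimal sketch. The abstract does not give the weighting rule, so the inverse-error weighting below is an assumption, as are the function names; `errors_a` and `errors_b` stand for the recent forecast errors of, say, the ARIMA model and the heuristic model.

```python
def adaptive_weights(errors_a, errors_b, eps=1e-9):
    """Weights inversely proportional to each model's recent mean absolute error."""
    mae_a = sum(abs(e) for e in errors_a) / len(errors_a)
    mae_b = sum(abs(e) for e in errors_b) / len(errors_b)
    inv_a = 1.0 / (mae_a + eps)   # eps avoids division by zero for a perfect model
    inv_b = 1.0 / (mae_b + eps)
    w_a = inv_a / (inv_a + inv_b)
    return w_a, 1.0 - w_a


def combine(forecast_a, forecast_b, w_a, w_b):
    """Weighted combination of two forecasts over the same horizon."""
    return [w_a * a + w_b * b for a, b in zip(forecast_a, forecast_b)]
```

Recomputing the weights over a sliding window of recent errors lets the combination lean toward whichever model has been more accurate lately.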
Capabilities of Fully Parallelized MHD Stability Code MARS
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2016-10-01
Results of full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. A parallel version of MARS, named PMARS, has recently been developed at FAR-TECH. Parallelized MARS is an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, implemented in MARS. Parallelization of the code included parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse vector iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the MARS algorithm using parallel libraries and procedures. Parallelized MARS is capable of calculating eigenmodes with significantly increased spatial resolution: up to 5,000 adapted radial grid points with up to 500 poloidal harmonics. Such resolution is sufficient for simulation of kink, tearing and peeling-ballooning instabilities with physically relevant parameters. Work is supported by the U.S. DOE SBIR program.
Fully Parallel MHD Stability Analysis Tool
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2015-11-01
Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Results of MARS parallelization and of the development of a new fixed-boundary equilibrium code adapted for MARS input will be reported. Work is supported by the U.S. DOE SBIR program.
Vaughan, T. J.; Haugh, M. G.; McNamara, L. M.
2013-01-01
Bone continuously adapts its internal structure to accommodate the functional demands of its mechanical environment and strain-induced flow of interstitial fluid is believed to be the primary mediator of mechanical stimuli to bone cells in vivo. In vitro investigations have shown that bone cells produce important biochemical signals in response to fluid flow applied using parallel-plate flow chamber (PPFC) systems. However, the exact mechanical stimulus experienced by the cells within these systems remains unclear. Fully understanding this behaviour represents a most challenging multi-physics problem involving the interaction between deformable cellular structures and adjacent fluid flows. In this study, we use a fluid–structure interaction computational approach to investigate the nature of the mechanical stimulus being applied to a single osteoblast cell under fluid flow within a PPFC system. The analysis decouples the contribution of pressure and shear stress on cellular deformation and for the first time highlights that cell strain under flow is dominated by the pressure in the PPFC system rather than the applied shear stress. Furthermore, it was found that strains imparted on the cell membrane were relatively low whereas significant strain amplification occurred at the cell–substrate interface. These results suggest that strain transfer through focal attachments at the base of the cell is the primary mediator of mechanical signals to the cell under flow in a PPFC system. Such information is vital in order to correctly interpret biological responses of bone cells under in vitro stimulation and elucidate the mechanisms associated with mechanotransduction in vivo. PMID:23365189
Lindgren, Johan; Everhart, Michael J; Caldwell, Michael W
2011-01-01
The physical properties of water and the environment it presents to its inhabitants provide stringent constraints and selection pressures affecting aquatic adaptation and evolution. Mosasaurs (a group of secondarily aquatic reptiles that occupied a broad array of predatory niches in the Cretaceous marine ecosystems about 98-65 million years ago) have traditionally been considered as anguilliform locomotors capable only of generating short bursts of speed during brief ambush pursuits. Here we report on an exceptionally preserved, long-snouted mosasaur (Ectenosaurus clidastoides) from the Santonian (Upper Cretaceous) part of the Smoky Hill Chalk Member of the Niobrara Formation in western Kansas, USA, that contains phosphatized remains of the integument displaying both depth and structure. The small, ovoid neck and/or anterior trunk scales exhibit a longitudinal central keel, and are obliquely arrayed into an alternating pattern where neighboring scales overlap one another. Supportive sculpturing in the form of two parallel, longitudinal ridges on the inner scale surface and a complex system of multiple, superimposed layers of straight, cross-woven helical fiber bundles in the underlying dermis, may have served to minimize surface deformation and frictional drag during locomotion. Additional parallel fiber bundles oriented at acute angles to the long axis of the animal presumably provided stiffness in the lateral plane. These features suggest that the anterior torso of Ectenosaurus was held somewhat rigid during swimming, thereby limiting propulsive movements to the posterior body and tail.
Improving GPU-accelerated adaptive IDW interpolation algorithm using fast kNN search.
Mei, Gang; Xu, Nengxiong; Xu, Liangliang
2016-01-01
This paper presents an efficient parallel Adaptive Inverse Distance Weighting (AIDW) interpolation algorithm for modern Graphics Processing Units (GPUs). The presented algorithm improves on our previous GPU-accelerated AIDW algorithm by adopting a fast k-nearest neighbors (kNN) search. AIDW needs to find several nearest neighboring data points for each interpolated point in order to adaptively determine the power parameter; the desired prediction value of the interpolated point is then obtained by weighted interpolation using that power parameter. In this work, we develop a fast kNN search approach based on a space-partitioning data structure, the even grid, to improve the previous GPU-accelerated AIDW algorithm. The improved algorithm is composed of two stages: kNN search and weighted interpolation. To evaluate the performance of the improved algorithm, we perform five groups of experimental tests. The experimental results indicate: (1) the improved algorithm can achieve a speedup of up to 1017 over the corresponding serial algorithm; (2) the improved algorithm is at least two times faster than our previous GPU-accelerated AIDW algorithm; and (3) the utilization of fast kNN search can significantly improve the computational efficiency of the entire GPU-accelerated AIDW algorithm.
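The two stages described above (grid-based kNN search, then weighted interpolation) can be sketched serially as follows. This is a simplified illustration, not the paper's GPU implementation: the function names are hypothetical, and the adaptive choice of the power parameter is replaced by a fixed `power` argument, which makes this plain IDW over the k nearest neighbors.

```python
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bin (x, y, value) points into square cells of side `cell`."""
    grid = defaultdict(list)
    for idx, (x, y, _) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid

def knn(points, grid, cell, qx, qy, k):
    """Return the k nearest (distance, index) pairs, searching ring by ring."""
    k = min(k, len(points))
    if k == 0:
        return []
    cx, cy = math.floor(qx / cell), math.floor(qy / cell)
    found, ring = [], 0
    while True:
        # Visit only the cells on the perimeter of the current ring.
        for ix in range(cx - ring, cx + ring + 1):
            for iy in range(cy - ring, cy + ring + 1):
                if max(abs(ix - cx), abs(iy - cy)) != ring:
                    continue
                for idx in grid.get((ix, iy), ()):
                    px, py, _ = points[idx]
                    found.append((math.hypot(px - qx, py - qy), idx))
        found.sort()
        # Unvisited points lie farther than ring * cell from the query,
        # so the current k best are final once they fit inside that radius.
        if len(found) >= k and found[k - 1][0] <= ring * cell:
            return found[:k]
        ring += 1

def idw(points, grid, cell, qx, qy, k=4, power=2.0):
    """Inverse-distance-weighted estimate from the k nearest neighbors."""
    num = den = 0.0
    for dist, idx in knn(points, grid, cell, qx, qy, k):
        if dist == 0.0:
            return points[idx][2]  # query coincides with a data point
        weight = 1.0 / dist ** power
        num += weight * points[idx][2]
        den += weight
    return num / den
```

Each query point is independent of the others, which is what makes the per-point search and interpolation a natural fit for one GPU thread per interpolated point.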
A parallel time integrator for noisy nonlinear oscillatory systems
NASA Astrophysics Data System (ADS)
Subber, Waad; Sarkar, Abhijit
2018-06-01
In this paper, we adapt a parallel time integration scheme to track the trajectories of noisy non-linear dynamical systems. Specifically, we formulate a parallel algorithm to generate the sample path of a nonlinear oscillator defined by stochastic differential equations (SDEs) using the so-called parareal method for ordinary differential equations (ODEs). The presence of the Wiener process in SDEs causes difficulties in the direct application of any numerical integration technique for ODEs, including the parareal algorithm. The parallel implementation of the algorithm involves two SDE solvers, namely a fine-level scheme to integrate the system in parallel and a coarse-level scheme to generate and correct the required initial conditions to start the fine-level integrators. For the numerical illustration, a randomly excited Duffing oscillator is investigated in order to study the performance of the stochastic parallel algorithm with respect to a range of system parameters. The distributed implementation of the algorithm exploits the Message Passing Interface (MPI).
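The coarse/fine structure described above can be illustrated with the basic deterministic parareal iteration. The sketch below is an assumption-laden simplification: it treats an ODE rather than an SDE, uses explicit Euler for both propagators, and runs the per-slice fine solves in a serial loop where a distributed code would assign one slice per process. All names and step counts are hypothetical.

```python
import math

def euler(f, y, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with explicit Euler."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y

def parareal(f, y0, t_end, slices, iterations, coarse_steps=1, fine_steps=100):
    """Return the solution at the slice boundaries after `iterations` sweeps."""
    ts = [t_end * n / slices for n in range(slices + 1)]

    def coarse(y, n):
        return euler(f, y, ts[n], ts[n + 1], coarse_steps)

    def fine(y, n):
        return euler(f, y, ts[n], ts[n + 1], fine_steps)

    # Initial guess: one cheap serial sweep with the coarse propagator.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # These solves are independent per slice: the parallel part.
        fine_old = [fine(u[n], n) for n in range(slices)]
        coarse_old = [coarse(u[n], n) for n in range(slices)]
        # Serial correction sweep: G(new) + F(old) - G(old).
        new = [y0]
        for n in range(slices):
            new.append(coarse(new[n], n) + fine_old[n] - coarse_old[n])
        u = new
    return u

def serial_fine(f, y0, t_end, slices, fine_steps=100):
    """Reference: the fine propagator applied slice after slice."""
    ts = [t_end * n / slices for n in range(slices + 1)]
    u = [y0]
    for n in range(slices):
        u.append(euler(f, u[n], ts[n], ts[n + 1], fine_steps))
    return u
```

After as many iterations as there are time slices, the parareal result coincides with the serial fine solution up to rounding, so any speedup depends on converging in far fewer iterations than slices.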
A parallel orbital-updating based plane-wave basis method for electronic structure calculations
NASA Astrophysics Data System (ADS)
Pan, Yan; Dai, Xiaoying; de Gironcoli, Stefano; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui
2017-11-01
Motivated by the recently proposed parallel orbital-updating approach in real space method [1], we propose a parallel orbital-updating based plane-wave basis method for electronic structure calculations, for solving the corresponding eigenvalue problems. In addition, we propose two new modified parallel orbital-updating methods. Compared to the traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large scale calculations on modern supercomputers.
Song, Xiaoxia; Anderson, Timothy; Beutler, Larry E; Sun, Shijin; Wu, Guohong; Kimpara, Satoko
2015-01-01
This study aimed to develop a culturally adapted version of the Systematic Treatment Selection-Innerlife (STS) in China. A total of 300 nonclinical participants collected from Mainland China and 240 nonclinical US participants were drawn from archival data. A Chinese version of the STS was developed, using translation and back-translation procedures. After confirmatory factor analysis (CFA) of the original STS subscales failed on both samples, exploratory factor analysis (EFA) was used to assess whether a simple structure would emerge on these STS treatment items. Parallel analysis and minimum average partial were used to determine the number of factors to retain. Three cross-cultural factors were found in this study: Internalized Distress, Externalized Distress and Interpersonal Relations. This suggests that, regardless of whether one is in the presumably different cultural contexts of the USA or China, psychological distress is expressed through a few basic channels of internalized distress, externalized distress, and interpersonal relations; different manifestations in the different cultures are also discussed.
The carotid rete and artiodactyl success.
Mitchell, G; Lust, A
2008-08-23
Since the Eocene, the diversity of artiodactyls has increased while that of perissodactyls has decreased. Reasons given for this contrasting pattern are that the evolution of a ruminant digestive tract and improved locomotion in artiodactyls were adaptively advantageous in the highly seasonal post-Eocene climate. We suggest that evolution of a carotid rete, a structure highly developed in artiodactyls but absent in perissodactyls, was at least as important. The rete confers an ability to regulate brain temperature independently of body temperature. The net effect is that in hot ambient conditions artiodactyls are able to conserve energy and water, and in cold ambient conditions they are able to conserve body temperature. In perissodactyls, brain and body temperature change in parallel and thermoregulation requires abundant food and water to warm/cool the body. Consequently, perissodactyls occupy habitats of low seasonality and rich in food and water, such as tropical forests. Conversely, the increased thermoregulatory flexibility of artiodactyls has facilitated invasion of new adaptive zones ranging from the Arctic Circle to deserts and tropical savannahs.
NASA Astrophysics Data System (ADS)
Hayasaki, Yoshio
2017-02-01
Femtosecond laser processing is a promising tool for fabricating novel and useful structures on the surfaces of and inside materials. An enormous number of pulse irradiation points will be required for fabricating actual structures at millimeter scale, and therefore the throughput of femtosecond laser processing must be improved for practical adoption of this technique. One promising method to improve throughput is parallel pulse generation based on a computer-generated hologram (CGH) displayed on a spatial light modulator (SLM), a technique called holographic femtosecond laser processing. The holographic method has advantages such as high throughput, high light use efficiency, and variable, instantaneous, 3D patterning. Furthermore, the use of an SLM makes it possible to correct unknown imperfections of the optical system and inhomogeneity in a sample using in-system optimization of the CGH. The CGH can also adaptively compensate for dynamic, unpredictable mechanical movements, air and liquid disturbances, and shape variation and deformation of the target sample, as well as provide adaptive wavefront control for environmental changes. Therefore, it is a powerful tool for the fabrication of biological cells and tissues, because they have free-form, variable, and deformable structures. In this paper, we present the principle and the experimental setup of holographic femtosecond laser processing, and an effective way of processing biological samples. We demonstrate the femtosecond laser processing of biological materials and the processing properties.
Rosenthal, Eric I; Holt, Amanda L; Sweeney, Alison M
2017-05-01
The largest habitat by volume on Earth is the oceanic midwater, which is also one of the least understood in terms of animal ecology. The organisms here exhibit a spectacular array of optical adaptations for living in a visual void that have only barely begun to be described. We describe a complex pattern of broadband scattering from the skin of Argyropelecus sp., a hatchetfish found in the mesopelagic zone of the world's oceans. Hatchetfish skin superficially resembles the unpolished side of aluminium foil, but on closer inspection contains a complex composite array of subwavelength-scale dielectric structures. The superficial layer of this array contains dielectric stacks that are rectangular in cross-section, while the deeper layer contains dielectric bundles that are elliptical in cross-section; the cells in both layers have their longest dimension running parallel to the dorsal-ventral axis of the fish. Using the finite-difference time-domain approach and photographic radiometry, we explored the structural origins of this scattering behaviour and its environmental consequences. When the fish's flank is illuminated from an arbitrary incident angle, a portion of the scattered light exits in an arc parallel to the fish's anterior-posterior axis. Simultaneously, some incident light is also scattered downwards through the complex birefringent skin structure and exits from the ventral photophores. We show that this complex scattering pattern will provide camouflage simultaneously against the horizontal radially symmetric solar radiance in this habitat, and the predatory bioluminescent searchlights that are common here. The structure also directs light incident on the flank of the fish into the downwelling, silhouette-hiding counter-illumination of the ventral photophores. © 2017 The Authors.
Rosenthal, Eric I.; Holt, Amanda L.
2017-01-01
The largest habitat by volume on Earth is the oceanic midwater, which is also one of the least understood in terms of animal ecology. The organisms here exhibit a spectacular array of optical adaptations for living in a visual void that have only barely begun to be described. We describe a complex pattern of broadband scattering from the skin of Argyropelecus sp., a hatchetfish found in the mesopelagic zone of the world's oceans. Hatchetfish skin superficially resembles the unpolished side of aluminium foil, but on closer inspection contains a complex composite array of subwavelength-scale dielectric structures. The superficial layer of this array contains dielectric stacks that are rectangular in cross-section, while the deeper layer contains dielectric bundles that are elliptical in cross-section; the cells in both layers have their longest dimension running parallel to the dorsal–ventral axis of the fish. Using the finite-difference time-domain approach and photographic radiometry, we explored the structural origins of this scattering behaviour and its environmental consequences. When the fish's flank is illuminated from an arbitrary incident angle, a portion of the scattered light exits in an arc parallel to the fish's anterior–posterior axis. Simultaneously, some incident light is also scattered downwards through the complex birefringent skin structure and exits from the ventral photophores. We show that this complex scattering pattern will provide camouflage simultaneously against the horizontal radially symmetric solar radiance in this habitat, and the predatory bioluminescent searchlights that are common here. The structure also directs light incident on the flank of the fish into the downwelling, silhouette-hiding counter-illumination of the ventral photophores. PMID:28468923
Not different, Just Better: The Adaptive Evolution of an Enzyme
2015-12-20
Form Approved OMB No. 0704-0188. Report date 20-12-2015; dates covered 1-Oct-2011 to 30...is precisely regulated by allostery and the adaptation of allostery is unknown, and 3) multiple experiments by others have demonstrated that adaptive...mutations in the same gene, but replicate populations, functionally parallel? • Aim 3) Expression, purification and functional analysis of evolved pyruvate
Specification and Analysis of Parallel Machine Architecture
1990-03-17
Parallel Machine Architecture C.V. Ramamoorthy Computer Science Division Dept. of Electrical Engineering and Computer Science University of California...capacity. (4) Adaptive: The overhead in resolution of deadlocks, etc. should be in proportion to their frequency. (5) Avoid rollbacks: Rollbacks can be...snapshots of system state graphically at a rate proportional to simulation time. Some of the examples are as follows: (1) When the simulation clock of
Compiler and Runtime Support for Programming in Adaptive Parallel Environments
1998-10-15
no other job is waiting for resources, and use a smaller number of processors when other jobs need resources. Setia et al. [15, 20] have shown that such...15] Vijay K. Naik, Sanjeev Setia, and Mark Squillante. Performance analysis of job scheduling policies in parallel supercomputing environments. In...on networks of heterogeneous workstations. Technical Report CSE-94-012, Oregon Graduate Institute of Science and Technology, 1994. [20] Sanjeev Setia
Automatic Adaptation of Tunable Distributed Applications
2001-01-01
size, weight, and battery life, with a single CPU, less memory, smaller hard disk, and lower bandwidth network connectivity. The power of PDAs is...wireless, and bluetooth [32] facilities; thus achieving different rates of data transmission. 1 With the trend of “write once, run everywhere...applications, a single component can execute on multiple processors (or machines) in parallel. These parallel applications, written in a specialized language
Domain Adaptation of Translation Models for Multilingual Applications
2009-04-01
expansion effect that corpus (or dictionary) based translation introduces - however, this effect is maintained even with monolingual query expansion [12...every day; bilingual web pages are harvested as parallel corpora as the quantity of non-English data on the web increases; online dictionaries of...approach is to customize translation models to a domain, by automatically selecting the resources (dictionaries, parallel corpora) that are best for
Missileborne Artificial Vision System (MAVIS)
NASA Technical Reports Server (NTRS)
Andes, David K.; Witham, James C.; Miles, Michael D.
1994-01-01
Several years ago when INTEL and China Lake designed the ETANN chip, analog VLSI appeared to be the only way to do high density neural computing. In the last five years, however, digital parallel processing chips capable of performing neural computation functions have evolved to the point of rough equality with analog chips in system level computational density. The Naval Air Warfare Center, China Lake, has developed a real time, hardware and software system designed to implement and evaluate biologically inspired retinal and cortical models. The hardware is based on the Adaptive Solutions Inc. massively parallel CNAPS system COHO boards. Each COHO board is a standard size 6U VME card featuring 256 fixed point, RISC processors running at 20 MHz in a SIMD configuration. Each COHO board has a companion board built to support a real time VSB interface to an imaging seeker, a NTSC camera, and to other COHO boards. The system is designed to have multiple SIMD machines each performing different corticomorphic functions. The system level software has been developed which allows a high level description of corticomorphic structures to be translated into the native microcode of the CNAPS chips. Corticomorphic structures are those neural structures with a form similar to that of the retina, the lateral geniculate nucleus, or the visual cortex. This real time hardware system is designed to be shrunk into a volume compatible with air launched tactical missiles. Initial versions of the software and hardware have been completed and are in the early stages of integration with a missile seeker.
User's Guide for ENSAERO_FE Parallel Finite Element Solver
NASA Technical Reports Server (NTRS)
Eldred, Lloyd B.; Guruswamy, Guru P.
1999-01-01
A high fidelity parallel static structural analysis capability is created and interfaced to the multidisciplinary analysis package ENSAERO-MPI of Ames Research Center. This new module replaces ENSAERO's lower fidelity simple finite element and modal modules. Full aircraft structures may be more accurately modeled using the new finite element capability. Parallel computation is performed by breaking the full structure into multiple substructures. This approach is conceptually similar to ENSAERO's multizonal fluid analysis capability. The new substructure code is used to solve the structural finite element equations for each substructure in parallel. NASTRAN/COSMIC is utilized as a front end for this code. Its full library of elements can be used to create an accurate and realistic aircraft model. It is used to create the stiffness matrices for each substructure. The new parallel code then uses an iterative preconditioned conjugate gradient method to solve the global structural equations for the substructure boundary nodes.
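For readers unfamiliar with the solver named in the abstract, a generic preconditioned conjugate gradient iteration looks roughly like the sketch below. It is not the ENSAERO_FE code: the Jacobi (diagonal) preconditioner, the dense list-of-lists matrix, and the function name are all assumptions made for illustration, and the substructure boundary reduction is not shown.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (Jacobi-preconditioned CG)."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner
    x = [0.0] * n
    r = list(b)                                   # residual b - A x with x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

The method only needs matrix-vector products and dot products, which is why it suits a substructured setting: each processor can apply its own substructure's contribution and the results are summed.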
Parallel 3D Mortar Element Method for Adaptive Nonconforming Meshes
NASA Technical Reports Server (NTRS)
Feng, Huiyu; Mavriplis, Catherine; VanderWijngaart, Rob; Biswas, Rupak
2004-01-01
High order methods are frequently used in computational simulation for their high accuracy. An efficient way to avoid unnecessary computation in smooth regions of the solution is to use adaptive meshes which employ fine grids only in areas where they are needed. Nonconforming spectral elements allow the grid to be flexibly adjusted to satisfy the computational accuracy requirements. The method is suitable for computational simulations of unsteady problems with very disparate length scales or unsteady moving features, such as heat transfer, fluid dynamics or flame combustion. In this work, we select the Mortar Element Method (MEM) to handle the non-conforming interfaces between elements. A new technique is introduced to efficiently implement MEM in 3-D nonconforming meshes. By introducing an "intermediate mortar", the proposed method decomposes the projection between 3-D elements and mortars into two steps. In each step, projection matrices derived in 2-D are used. The two-step method avoids explicitly forming/deriving large projection matrices for 3-D meshes, and also helps to simplify the implementation. This new technique can be used for both h- and p-type adaptation. This method is applied to an unsteady 3-D moving heat source problem. With our new MEM implementation, mesh adaptation is able to efficiently refine the grid near the heat source and coarsen the grid once the heat source passes. The savings in computational work resulting from the dynamic mesh adaptation are demonstrated by the reduction in the number of elements used and CPU time spent. MEM and mesh adaptation, respectively, bring irregularity and dynamics to the computer memory access pattern. Hence, they provide a good way to gauge the performance of computer systems when running scientific applications whose memory access patterns are irregular and unpredictable.
We select a 3-D moving heat source problem as the Unstructured Adaptive (UA) grid benchmark, a new component of the NAS Parallel Benchmarks (NPB). In this paper, we present some interesting performance results of our OpenMP parallel implementation on different architectures such as the SGI Origin2000, SGI Altix, and Cray MTA-2.
Execution time supports for adaptive scientific algorithms on distributed memory machines
NASA Technical Reports Server (NTRS)
Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey
1990-01-01
Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives an appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
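The idea of deriving a communication pattern at runtime from global indices can be mimicked in a few lines. The sketch below is a single-process simulation, not the PARTI API: the function names, the block distribution, and the representation of each processor's memory as a Python list are all hypothetical. A first pass builds a schedule from the requested global indices, and the gather and scatter passes then reuse that schedule.

```python
def build_schedule(needed, block):
    """Group requested global indices by owning processor.

    Assumes a block distribution: global index g lives on processor
    g // block at local offset g % block. Each schedule entry records
    (position in the request list, local offset on the owner).
    """
    schedule = {}
    for pos, g in enumerate(needed):
        owner, offset = divmod(g, block)
        schedule.setdefault(owner, []).append((pos, offset))
    return schedule

def gather(local_arrays, schedule, count):
    """Fetch the scheduled off-processor values into a local buffer."""
    out = [None] * count
    for owner, requests in schedule.items():
        # In a real runtime this loop body would be one message per owner.
        for pos, offset in requests:
            out[pos] = local_arrays[owner][offset]
    return out

def scatter_add(local_arrays, schedule, values):
    """Accumulate buffered contributions back onto their owners."""
    for owner, requests in schedule.items():
        for pos, offset in requests:
            local_arrays[owner][offset] += values[pos]
```

Because the schedule is separate from the data movement, it can be built once and reused for every iteration of a loop whose index pattern does not change.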
FALCON: A distributed scheduler for MIMD architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grimshaw, A.S.; Vivas, V.E. Jr.
1991-01-01
This paper describes FALCON (Fully Automatic Load COordinator for Networks), the scheduler for the Mentat parallel processing system. FALCON has a modular structure and is designed for systems that use a task scheduling mechanism. FALCON is distributed, stable, supports system heterogeneities, and employs a sender-initiated adaptive load sharing policy with static task assignment. FALCON is parameterizable and is implemented in Mentat, a working distributed system. We present the design and implementation of FALCON as well as a brief introduction to those features of the Mentat run-time system that influence FALCON. Performance measures under different scheduler configurations are also presented and analyzed with respect to the system parameters. 36 refs., 8 figs.
RFQ device for accelerating particles
Shepard, Kenneth W.; Delayen, Jean R.
1995-01-01
A superconducting radio frequency quadrupole (RFQ) device includes four spaced elongated, linear, tubular rods disposed parallel to a charged particle beam axis, with each rod supported by two spaced tubular posts oriented radially with respect to the beam axis. The rod and post geometry of the device has four-fold rotation symmetry, lowers the frequency of the quadrupole mode below that of the dipole mode, and provides large dipole-quadrupole mode isolation to accommodate a range of mechanical tolerances. The simplicity of the geometry of the structure, which can be formed by joining eight simple T-sections, provides a high degree of mechanical stability, is insensitive to mechanical displacement, and is particularly adapted for fabrication with superconducting materials such as niobium.
Giordano, Daniela; Boron, Ignacio; Abbruzzetti, Stefania; Van Leuven, Wendy; Nicoletti, Francesco P.; Forti, Flavio; Bruno, Stefano; Cheng, C-H. Christina; Moens, Luc; di Prisco, Guido; Nadra, Alejandro D.; Estrin, Darío; Smulevich, Giulietta; Dewilde, Sylvia; Viappiani, Cristiano; Verde, Cinzia
2012-01-01
The Antarctic icefish Chaenocephalus aceratus lacks the globins common to most vertebrates, hemoglobin and myoglobin, but has retained neuroglobin in the brain. This conserved globin has been cloned, over-expressed and purified. To highlight similarities and differences, the structural features of the neuroglobin of this colourless-blooded fish were compared with those of the well characterised human neuroglobin as well as with the neuroglobin from the retina of the red blooded, hemoglobin and myoglobin-containing, closely related Antarctic notothenioid Dissostichus mawsoni. A detailed structural and functional analysis of the two Antarctic fish neuroglobins was carried out by UV-visible and Resonance Raman spectroscopies, molecular dynamics simulations and laser-flash photolysis. Similar to the human protein, Antarctic fish neuroglobins can reversibly bind oxygen and CO in the Fe2+ form, and show six-coordination by distal His in the absence of exogenous ligands. A very large and structured internal cavity, with discrete docking sites, was identified in the modelled three-dimensional structures of the Antarctic neuroglobins. Estimate of the free-energy barriers from laser-flash photolysis and Implicit Ligand Sampling showed that the cavities are accessible from the solvent in both proteins. Comparison of structural and functional properties suggests that the two Antarctic fish neuroglobins most likely preserved and possibly improved the function recently proposed for human neuroglobin in ligand multichemistry. Despite subtle differences, the adaptation of Antarctic fish neuroglobins does not seem to parallel the dramatic adaptation of the oxygen carrying globins, hemoglobin and myoglobin, in the same organisms. PMID:23226490
HALOS: fast, autonomous, holographic adaptive optics
NASA Astrophysics Data System (ADS)
Andersen, Geoff P.; Gelsinger-Austin, Paul; Gaddipati, Ravi; Gaddipati, Phani; Ghebremichael, Fassil
2014-08-01
We present progress on our holographic adaptive laser optics system (HALOS): a compact, closed-loop aberration correction system that uses a multiplexed hologram to deconvolve the phase aberrations in an input beam. The wavefront characterization is based on simple, parallel measurements of the intensity of fixed focal spots and does not require any complex calculations. As such, the system does not require a computer and is thus much cheaper and less complex than conventional approaches. We present details of a fully functional, closed-loop prototype incorporating a 32-element MEMS mirror, operating at a bandwidth of over 10 kHz. Additionally, since the all-optical sensing is made in parallel, the speed is independent of actuator number, running at the same bandwidth for one actuator as for a million.
Parallel Programming Strategies for Irregular Adaptive Applications
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2001-01-01
Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance for such computations. In this work, we examine two typical irregular adaptive applications, Dynamic Remeshing and N-Body, under competing programming methodologies and across various parallel architectures. The Dynamic Remeshing application simulates flow over an airfoil, and refines localized regions of the underlying unstructured mesh. The N-Body experiment models two neighboring Plummer galaxies that are about to undergo a merger. Both problems demonstrate dramatic changes in processor workloads and interprocessor communication with time; thus, dynamic load balancing is a required component.
Santangelo, James S; Johnson, Marc T J; Ness, Rob W
2018-05-16
Urban environments offer the opportunity to study the role of adaptive and non-adaptive evolutionary processes on an unprecedented scale. While the presence of parallel clines in heritable phenotypic traits is often considered strong evidence for the role of natural selection, non-adaptive evolutionary processes can also generate clines, and this may be more likely when traits have a non-additive genetic basis due to epistasis. In this paper, we use spatially explicit simulations modelled according to the cyanogenesis (hydrogen cyanide, HCN) polymorphism in white clover (Trifolium repens) to examine the formation of phenotypic clines along urbanization gradients under varying levels of drift, gene flow and selection. HCN results from an epistatic interaction between two Mendelian-inherited loci. Our results demonstrate that the genetic architecture of this trait makes natural populations susceptible to decreases in HCN frequencies via drift. Gradients in the strength of drift across a landscape resulted in phenotypic clines with lower frequencies of HCN in strongly drifting populations, giving the misleading appearance of deterministic adaptive changes in the phenotype. Studies of heritable phenotypic change in urban populations should generate null models of phenotypic evolution based on the genetic architecture underlying focal traits prior to invoking selection's role in generating adaptive differentiation. © 2018 The Author(s).
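The claim that drift alone can lower HCN frequency can be made concrete with a toy calculation. The sketch below is not the authors' spatially explicit simulation. It assumes that a plant produces HCN only if it carries at least one functional (dominant) allele at each of two unlinked loci, that genotypes are in Hardy-Weinberg proportions, and that each locus drifts independently in isolated populations with no selection or gene flow; the function names and parameter values are hypothetical.

```python
import random

def hcn_frequency(p_a, p_b):
    """Expected fraction of HCN-producing plants given the functional-allele
    frequencies at the two loci (assumes at least one functional allele is
    needed at BOTH loci, Hardy-Weinberg proportions, no linkage)."""
    return (1.0 - (1.0 - p_a) ** 2) * (1.0 - (1.0 - p_b) ** 2)

def drift(p, n_diploid, generations, rng):
    """Wright-Fisher drift: resample 2N allele copies every generation."""
    copies = 2 * n_diploid
    for _ in range(generations):
        p = sum(1 for _ in range(copies) if rng.random() < p) / copies
    return p

def mean_hcn_after_drift(p_a, p_b, n_diploid, generations, replicates, seed=0):
    """Average HCN frequency over independent replicate populations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        total += hcn_frequency(drift(p_a, n_diploid, generations, rng),
                               drift(p_b, n_diploid, generations, rng))
    return total / replicates
```

Drift leaves the expected allele frequencies unchanged, but because the phenotype frequency is a concave function of each allele frequency, spreading the allele frequencies out lowers the average HCN frequency. Comparing `hcn_frequency(0.5, 0.5)` with `mean_hcn_after_drift(0.5, 0.5, 10, 50, 200)` shows the effect for small populations.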
Feiner, Nathalie
2016-10-12
Transposable elements (TEs) are DNA sequences that can insert elsewhere in the genome and modify genome structure and gene regulation. The role of TEs in evolution is contentious. One hypothesis posits that TE activity generates genomic incompatibilities that can cause reproductive isolation between incipient species. This predicts that TEs will accumulate during speciation events. Here, I tested the prediction that extant lineages with a relatively high rate of speciation have a high number of TEs in their genomes. I sequenced and analysed the TE content of a marker genomic region (Hox clusters) in Anolis lizards, a classic case of an adaptive radiation. Unlike other vertebrates, including closely related lizards, Anolis lizards have high numbers of TEs in their Hox clusters, genomic regions that regulate development of the morphological adaptations that characterize habitat specialists in these lizards. Following a burst of TE activity in the lineage leading to extant Anolis, TEs have continued to accumulate during or after speciation events, resulting in a positive relationship between TE density and lineage speciation rate. These results are consistent with the prediction that TE activity contributes to adaptive radiation by promoting speciation. Although there was no evidence that TE density per se is associated with ecological morphology, the activity of TEs in Hox clusters could have been a rich source for phenotypic variation that may have facilitated the rapid parallel morphological adaptation to microhabitats seen in extant Anolis lizards. © 2016 The Author(s).
2016-01-01
Transposable elements (TEs) are DNA sequences that can insert elsewhere in the genome and modify genome structure and gene regulation. The role of TEs in evolution is contentious. One hypothesis posits that TE activity generates genomic incompatibilities that can cause reproductive isolation between incipient species. This predicts that TEs will accumulate during speciation events. Here, I tested the prediction that extant lineages with a relatively high rate of speciation have a high number of TEs in their genomes. I sequenced and analysed the TE content of a marker genomic region (Hox clusters) in Anolis lizards, a classic case of an adaptive radiation. Unlike other vertebrates, including closely related lizards, Anolis lizards have high numbers of TEs in their Hox clusters, genomic regions that regulate development of the morphological adaptations that characterize habitat specialists in these lizards. Following a burst of TE activity in the lineage leading to extant Anolis, TEs have continued to accumulate during or after speciation events, resulting in a positive relationship between TE density and lineage speciation rate. These results are consistent with the prediction that TE activity contributes to adaptive radiation by promoting speciation. Although there was no evidence that TE density per se is associated with ecological morphology, the activity of TEs in Hox clusters could have been a rich source for phenotypic variation that may have facilitated the rapid parallel morphological adaptation to microhabitats seen in extant Anolis lizards. PMID:27733546
ERIC Educational Resources Information Center
Mungal, Angus Shiva
2016-01-01
In New York City, a partnership between Teach For America (TFA), the New York City Department of Education (NYCDOE), the Relay Graduate School of Education (Relay), and three charter school networks produced a "parallel education structure" within the public school system. Driving the partnership and the parallel education structure are…
Adaptive optics parallel spectral domain optical coherence tomography for imaging the living retina
NASA Astrophysics Data System (ADS)
Zhang, Yan; Rha, Jungtae; Jonnal, Ravi S.; Miller, Donald T.
2005-06-01
Although optical coherence tomography (OCT) can axially resolve and detect reflections from individual cells, there are no reports of imaging cells in the living human retina using OCT. To supplement the axial resolution and sensitivity of OCT with the necessary lateral resolution and speed, we developed a novel spectral domain OCT (SD-OCT) camera based on a free-space parallel illumination architecture and equipped with adaptive optics (AO). Conventional flood illumination, also with AO, was integrated into the camera and provided confirmation of the focus position in the retina with an accuracy of ±10.3 μm. Short bursts of narrow B-scans (100x560 μm) of the living retina were subsequently acquired at 500 Hz during dynamic compensation (up to 14 Hz) that successfully corrected the most significant ocular aberrations across a dilated 6 mm pupil. Camera sensitivity (up to 94 dB) was sufficient for observing reflections from essentially all neural layers of the retina. Signal-to-noise of the detected reflection from the photoreceptor layer was highly sensitive to the level of ocular aberrations and defocus, with changes of 11.4 and 13.1 dB (single pass) observed when the ocular aberrations (astigmatism, 3rd order and higher) were corrected and when the focus was shifted by 200 μm (0.54 diopters) in the retina, respectively. The 3D resolution of the B-scans (3.0x3.0x5.7 μm) is the highest reported to date in the living human eye and was sufficient to observe the interface between the inner and outer segments of individual photoreceptor cells, resolved in both lateral and axial dimensions. However, high contrast speckle, which is intrinsic to OCT, was present throughout the AO parallel SD-OCT B-scans and obstructed correlating retinal reflections to cell-sized retinal structures.
Principles, Techniques, and Applications of Tissue Microfluidics
NASA Technical Reports Server (NTRS)
Wade, Lawrence A.; Kartalov, Emil P.; Shibata, Darryl; Taylor, Clive
2011-01-01
The principle of tissue microfluidics and its resultant techniques has been applied to cell analysis. Building microfluidics to suit a particular tissue sample would allow the rapid, reliable, inexpensive, highly parallelized, selective extraction of chosen regions of tissue for purposes of further biochemical analysis. Furthermore, the applicability of the techniques ranges beyond the described pathology application. For example, they would also allow the posing and successful answering of new sets of questions in many areas of fundamental research. The proposed integration of microfluidic techniques and tissue slice samples is called "tissue microfluidics" because it molds the microfluidic architectures in accordance with each particular structure of each specific tissue sample. Thus, microfluidics can be built around the tissues, following the tissue structure, or alternatively, the microfluidics can be adapted to the specific geometry of particular tissues. By contrast, the traditional approach is that microfluidic devices are structured in accordance with engineering considerations, while the biological components in applied devices are forced to comply with these engineering presets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reynolds, John; Jankovsky, Zachary; Metzroth, Kyle G
2018-04-04
The purpose of the ADAPT code is to generate Dynamic Event Trees (DET) using a user-specified set of simulators. ADAPT can utilize any simulation tool which meets a minimal set of requirements. ADAPT is based on the concept of DET which uses explicit modeling of the deterministic dynamic processes that take place during a nuclear reactor plant system (or other complex system) evolution along with stochastic modeling. When DET are used to model various aspects of Probabilistic Risk Assessment (PRA), all accident progression scenarios starting from an initiating event are considered simultaneously. The DET branching occurs at user-specified times and/or when an action is required by the system and/or the operator. These outcomes then decide how the dynamic system variables will evolve in time for each DET branch. Since two different outcomes at a DET branching may lead to completely different paths for system evolution, the next branching for these paths may occur not only at separate times, but can be based on different branching criteria. The computational infrastructure allows for flexibility in ADAPT to link with different system simulation codes, parallel processing of the scenarios under consideration, on-line scenario management (initiation as well as termination), analysis of results, and user-friendly graphical capabilities. The ADAPT system is designed for a distributed computing environment; the scheduler can track multiple concurrent branches simultaneously. The scheduler is modularized so that the DET branching strategy can be modified (e.g. biasing towards the worst-case scenario/event). Independent database systems store data from the simulation tasks and the DET structure so that the event tree can be constructed and analyzed later. ADAPT is provided with a user-friendly client which can easily sort through and display the results of an experiment, precluding the need for the user to manually inspect individual simulator runs.
Schmidt, James R
2013-01-01
The present work introduces a computational model, the Parallel Episodic Processing (PEP) model, which demonstrates that contingency learning achieved via simple storage and retrieval of episodic memories can explain the item-specific proportion congruency (ISPC) effect in the colour-word Stroop paradigm. The current work also presents a new experimental procedure to more directly dissociate contingency biases from conflict adaptation (i.e., proportion congruency). This was done with three different types of incongruent words that allow a comparison of: (a) high versus low contingency while keeping proportion congruency constant, and (b) high versus low proportion congruency while keeping contingency constant. Results demonstrated a significant contingency effect, but no effect of proportion congruency. It was further shown that the proportion congruency associated with the colour does not matter, either. Thus, the results quite directly demonstrate that ISPC effects are not due to conflict adaptation, but instead to contingency learning biases.
NASA Technical Reports Server (NTRS)
Hsieh, Shang-Hsien
1993-01-01
The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.
Yazawa, Koji; Suzuki, Furitsu; Nishiyama, Yusuke; Ohata, Takuya; Aoki, Akihiro; Nishimura, Katsuyuki; Kaji, Hironori; Shimizu, Tadashi; Asakura, Tetsuo
2012-11-25
The accurate (1)H positions of alanine tripeptide, A(3), with anti-parallel and parallel β-sheet structures could be determined by highly resolved (1)H DQMAS solid-state NMR spectra and (1)H chemical shift calculation with gauge-including projector augmented wave calculations.
A High Order, Locally-Adaptive Method for the Navier-Stokes Equations
NASA Astrophysics Data System (ADS)
Chan, Daniel
1998-11-01
I have extended the FOSLS method of Cai, Manteuffel and McCormick (1997) and implemented it within the framework of a spectral element formulation using Legendre polynomial basis functions. The FOSLS method solves the Navier-Stokes equations as a system of coupled first-order equations and provides the ellipticity that is needed for fast iterative matrix solvers like multigrid to operate efficiently. Each element is treated as an object and its properties are self-contained. Only C^0 continuity is imposed across element interfaces; this design allows local grid refinement and coarsening without the burden of an elaborate data structure, since only information along element boundaries is needed. With the FORTRAN 90 programming environment, I can maintain a high computational efficiency by employing a hybrid parallel processing model. OpenMP directives provide loop-level parallelism within a shared-memory SMP, and the MPI protocol allows the distribution of elements across a cluster of SMPs connected via a commodity network. This talk will provide timing results and a comparison with a second order finite difference method.
NASA Astrophysics Data System (ADS)
Vnukov, A. A.; Shershnev, M. B.
2018-01-01
The aim of this work is the software implementation of three image scaling algorithms using parallel computations, as well as the development of an application with a graphical user interface for the Windows operating system to demonstrate the operation of algorithms and to study the relationship between system performance, algorithm execution time and the degree of parallelization of computations. Three methods of interpolation were studied, formalized and adapted to scale images. The result of the work is a program for scaling images by different methods. Comparison of the quality of scaling by different methods is given.
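As an illustration of how an interpolation-based scaling method can be parallelized, the sketch below scales a small grayscale image with bilinear interpolation and computes each output row as an independent task. The abstract does not name its three interpolation methods, so the choice of bilinear interpolation, the corner-aligned coordinate mapping, and the names `bilinear_row` and `scale_image` are all assumptions made for this example, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def bilinear_row(src, y_out, new_h, new_w):
    """Compute one output row by bilinear interpolation (corner-aligned mapping)."""
    h, w = len(src), len(src[0])
    fy = y_out * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
    y0 = int(fy)
    y1 = min(y0 + 1, h - 1)
    ty = fy - y0
    row = []
    for x_out in range(new_w):
        fx = x_out * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
        x0 = int(fx)
        x1 = min(x0 + 1, w - 1)
        tx = fx - x0
        # Blend horizontally on the two neighbouring source rows, then vertically.
        top = src[y0][x0] * (1 - tx) + src[y0][x1] * tx
        bottom = src[y1][x0] * (1 - tx) + src[y1][x1] * tx
        row.append(top * (1 - ty) + bottom * ty)
    return row

def scale_image(src, new_h, new_w, n_workers=4):
    """Scale a 2D list of pixel values; output rows are independent tasks."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda y: bilinear_row(src, y, new_h, new_w), range(new_h)))

if __name__ == "__main__":
    image = [[0, 10], [20, 30]]
    for row in scale_image(image, 3, 3):
        print(row)
```

Rows are used as the unit of work because no output row depends on another. In CPython, threads will not speed up this pure-Python arithmetic, so the worker count here only shows the decomposition; a process pool or a compiled kernel would be needed to measure how execution time varies with the degree of parallelization.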
Parallel Narrative Structure in Paul Harding's "Tinkers"
ERIC Educational Resources Information Center
Çirakli, Mustafa Zeki
2014-01-01
The present paper explores the implications of parallel narrative structure in Paul Harding's "Tinkers" (2009). Besides primarily recounting the two sets of parallel narratives, "Tinkers" also comprises seemingly unrelated fragments such as excerpts from clock repair manuals and diaries. The main stories, however, told…
Managing a big ground-based astronomy project: the Thirty Meter Telescope (TMT) project
NASA Astrophysics Data System (ADS)
Sanders, Gary H.
2008-07-01
TMT is a big science project and its scale is greater than previous ground-based optical/infrared telescope projects. This paper will describe the ideal "linear" project and how the TMT project departs from that ideal. The paper will describe the needed adaptations to successfully manage real world complexities. The progression from science requirements to a reference design, the development of a product-oriented Work Breakdown Structure (WBS) and an organization that parallels the WBS, the implementation of system engineering, requirements definition and the progression through Conceptual Design to Preliminary Design will be summarized. The development of a detailed cost estimate structured by the WBS, and the methodology of risk analysis to estimate contingency fund requirements will be summarized. Designing the project schedule defines the construction plan and, together with the cost model, provides the basis for executing the project guided by an earned value performance measurement system.
Hexapole-compensated magneto-optical trap on a mesoscopic atom chip
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joellenbeck, S.; Mahnke, J.; Randoll, R.
2011-04-15
Magneto-optical traps on atom chips are usually restricted to small atomic samples due to a limited capture volume caused primarily by distorted field configurations. Here we present a magneto-optical trap based on a millimeter-sized wire structure which generates a magnetic field with minimized distortions. Together with the loading from a high-flux two-dimensional magneto-optical trap, we achieve a loading rate of 8.4x10^10 atoms/s and a maximum number of 8.7x10^9 captured atoms. The wire structure is placed outside of the vacuum to enable a further adaptation to new scientific objectives. Since all magnetic fields are applied locally without the need for external bias fields, the presented setup will facilitate parallel generation of Bose-Einstein condensates on a conveyor belt with a cycle rate above 1 Hz.
NASA Astrophysics Data System (ADS)
Schwing, Alan Michael
For computational fluid dynamics, the governing equations are solved on a discretized domain of nodes, faces, and cells. The quality of the grid or mesh can be a driving source of error in the results. While refinement studies can help guide the creation of a mesh, grid quality is largely determined by user expertise and understanding of the flow physics. Adaptive mesh refinement is a technique for enriching the mesh during a simulation based on metrics for error, impact on important parameters, or location of important flow features. This can offload from the user some of the difficult and ambiguous decisions necessary when discretizing the domain. This work explores the implementation of adaptive mesh refinement in an implicit, unstructured, finite-volume solver. Consideration is made for applying modern computational techniques in the presence of hanging nodes and refined cells. The approach is developed to be independent of the flow solver in order to provide a path for augmenting existing codes. It is designed to be applicable to unsteady simulations, and refinement and coarsening of the grid do not impact the conservatism of the underlying numerics. The effect on high-order numerical fluxes of fourth and sixth order is explored. Provided the criteria for refinement are appropriately selected, solutions obtained using adapted meshes have no additional error when compared to results obtained on traditional, unadapted meshes. In order to leverage the large-scale computational resources common today, the methods are parallelized using MPI. Parallel performance is considered for several test problems in order to assess scalability of both adapted and unadapted grids. Dynamic repartitioning of the mesh during refinement is crucial for load balancing an evolving grid. Development of the methods outlined here depends on a dual-memory approach that is described in detail. 
Validation of the solver developed here against a number of motivating problems shows favorable comparisons across a range of regimes. Unsteady and steady applications are considered in both subsonic and supersonic flows. Inviscid and viscous simulations achieve similar results at a much reduced cost when employing dynamic mesh adaptation. Several techniques for guiding adaptation are compared. Detailed analysis of statistics from the instrumented solver enables understanding of the costs associated with adaptation. Adaptive mesh refinement shows promise for the test cases presented here. It can be considerably faster than using conventional grids and provides accurate results. The procedures for adapting the grid are light-weight enough to not require significant computational time and yield significant reductions in grid size.
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.
2016-01-01
MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
NASA Astrophysics Data System (ADS)
Olano, C. A.
2009-11-01
Context: Using certain simplifications, Kompaneets derived a partial differential equation that states the local geometrical and kinematical conditions that each surface element of a shock wave, created by a point blast in a stratified gaseous medium, must satisfy. Kompaneets could solve his equation analytically for the case of a wave propagating in an exponentially stratified medium, obtaining the form of the shock front at progressive evolutionary stages. Complete analytical solutions of the Kompaneets equation for shock wave motion in further plane-parallel stratified media were not found, except for radially stratified media. Aims: We aim to analytically solve the Kompaneets equation for the motion of a shock wave in different plane-parallel stratified media that can reflect a wide variety of astrophysical contexts. We were particularly interested in solving the Kompaneets equation for a strong explosion in the interstellar medium of the Galactic disk, in which, due to intense winds and explosions of stars, gigantic gaseous structures known as superbubbles and supershells are formed. Methods: Using the Kompaneets approximation, we derived a pair of equations that we call adapted Kompaneets equations, that govern the propagation of a shock wave in a stratified medium and that permit us to obtain solutions in parametric form. The solutions provided by the system of adapted Kompaneets equations are equivalent to those of the Kompaneets equation. We solved the adapted Kompaneets equations for shock wave propagation in a generic stratified medium by means of a power-series method. Results: Using the series solution for a shock wave in a generic medium, we obtained the series solutions for four specific media whose respective density distributions in the direction perpendicular to the stratification plane are of an exponential, power-law type (one with exponent k=-1 and the other with k =-2) and a quadratic hyperbolic-secant. 
From these series solutions, we deduced exact solutions for the four media in terms of elementary functions. The exact solution for shock wave propagation in a medium of quadratic hyperbolic-secant density distribution is well suited to describing the growth of superbubbles in the Galactic disk. Member of the Carrera del Investigador Científico del CONICET, Argentina.
Maxillary distraction osteogenesis in cleft lip and palate patients with skeletal anchorage.
Minami, Katsuhiro; Mori, Yoshihide; Tae-Geon, Kwon; Shimizu, Hidetaka; Ohtani, Miyuki; Yura, Yoshiaki
2007-03-01
Maxillary distraction osteogenesis with the rigid external distraction (RED) system has been used to treat cleft lip and palate (CLP) patients with severe maxillary hypoplasia. We introduce maxillary distraction osteogenesis for CLP patients with skeletal anchorage adapted on a stereolithographic model. Six CLP patients with maxillary deficiency, treated according to our CLP treatment protocol, underwent maxillary distraction osteogenesis. In all patients, computed tomography (CT) images were recorded preoperatively, and the data were transferred to a workstation. Three-dimensional skeletal structures were reconstructed from the CT data sets, and a stereolithographic model was produced. On the stereolithographic model, miniplates were adapted to the surface of the maxilla beside the piriform aperture. The operation involved a high Le Fort I osteotomy with pterygomaxillary disjunction. Miniplates were fixed to the maxillary segment with three or four screws and used for anchorage of the RED system. Retraction of the maxillary segment was initiated after 1 week. The stereolithographic models were accurate enough for adapting the miniplates that there was no need to readjust the plates during surgery. Postoperative cephalometric analysis showed that the direction of the retraction was almost parallel to the palatal plane, and dental compensation did not occur. We performed maxillary distraction osteogenesis with skeletal anchorage adapted on the stereolithographic models. Excellent esthetic outcome and skeletal advancement were achieved without dentoalveolar compensations.
NASA Workshop on Computational Structural Mechanics 1987, part 1
NASA Technical Reports Server (NTRS)
Sykes, Nancy P. (Editor)
1989-01-01
Topics in Computational Structural Mechanics (CSM) are reviewed. CSM parallel structural methods, a transputer finite element solver, architectures for multiprocessor computers, and parallel eigenvalue extraction are among the topics discussed.
ERIC Educational Resources Information Center
Chief of Naval Education and Training Support, Pensacola, FL.
This individualized learning module on parallel circuits is one in a series of modules for a course in basic electricity and electronics. The course is one of a number of military-developed curriculum packages selected for adaptation to vocational instructional and curriculum development in a civilian setting. Four lessons are included in the…
GPU accelerated cell-based adaptive mesh refinement on unstructured quadrilateral grid
NASA Astrophysics Data System (ADS)
Luo, Xisheng; Wang, Luying; Ran, Wei; Qin, Fenghua
2016-10-01
A GPU accelerated inviscid flow solver is developed on an unstructured quadrilateral grid in the present work. For the first time, the cell-based adaptive mesh refinement (AMR) is fully implemented on GPU for the unstructured quadrilateral grid, which greatly reduces the frequency of data exchange between GPU and CPU. Specifically, the AMR is processed with atomic operations to parallelize list operations, and null memory recycling is realized to improve the efficiency of memory utilization. It is found that results obtained by GPUs agree very well with the exact or experimental results in literature. An acceleration ratio of 4 is obtained between the parallel code running on the old GPU GT9800 and the serial code running on E3-1230 V2. With the optimization of configuring a larger L1 cache and adopting Shared Memory based atomic operations on the newer GPU C2050, an acceleration ratio of 20 is achieved. The parallelized cell-based AMR processes have achieved 2x speedup on GT9800 and 18x on Tesla C2050, which demonstrates that parallel running of the cell-based AMR method on GPU is feasible and efficient. Our results also indicate that the new development of GPU architecture benefits the fluid dynamics computing significantly.
Multiprocessor smalltalk: Implementation, performance, and analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pallas, J.I.
1990-01-01
Multiprocessor Smalltalk demonstrates the value of object-oriented programming on a multiprocessor. Its implementation and analysis shed light on three areas: concurrent programming in an object-oriented language without special extensions, implementation techniques for adapting to multiprocessors, and performance factors in the resulting system. Adding parallelism to Smalltalk code is easy, because programs already use control abstractions like iterators. Smalltalk's basic control and concurrency primitives (lambda expressions, processes and semaphores) can be used to build parallel control abstractions, including parallel iterators, parallel objects, atomic objects, and futures. Language extensions for concurrency are not required. This implementation demonstrates that it is possible to build an efficient parallel object-oriented programming system and illustrates techniques for doing so. Three modification tools (serialization, replication, and reorganization) adapted the Berkeley Smalltalk interpreter to the Firefly multiprocessor. Multiprocessor Smalltalk's performance shows that the combination of multiprocessing and object-oriented programming can be effective: speedups (relative to the original serial version) exceed 2.0 for five processors on all the benchmarks; the median efficiency is 48%. Analysis shows both where performance is lost and how to improve and generalize the experimental results. Changes in the interpreter to support concurrency add at most 12% overhead; better access to per-process variables could eliminate much of that. Changes in the user code to express concurrency add as much as 70% overhead; this overhead could be reduced to 54% if blocks (lambda expressions) were reentrant. Performance is also lost when the program cannot keep all five processors busy.
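The idea that futures and parallel iterators can be assembled from lambda expressions, processes, and semaphores alone can be sketched in Python, with threads standing in for Smalltalk processes. This is an analogy under stated assumptions rather than the Multiprocessor Smalltalk code: the `Future` class and the `parallel_collect` function are hypothetical names introduced for this example.

```python
import threading

class Future:
    """A value computed in a background thread; value() blocks until it is ready."""

    def __init__(self, block):
        self._ready = threading.Semaphore(0)  # signalled once the result exists
        self._result = None
        self._error = None
        threading.Thread(target=self._run, args=(block,), daemon=True).start()

    def _run(self, block):
        try:
            self._result = block()
        except Exception as exc:  # keep the error so the consumer sees it
            self._error = exc
        finally:
            self._ready.release()

    def value(self):
        self._ready.acquire()  # wait for the producer
        self._ready.release()  # let later callers through as well
        if self._error is not None:
            raise self._error
        return self._result

def parallel_collect(items, block):
    """Parallel iterator: apply block to every item, one future per item."""
    futures = [Future(lambda item=item: block(item)) for item in items]
    return [f.value() for f in futures]

if __name__ == "__main__":
    print(parallel_collect([1, 2, 3, 4], lambda x: x * x))
```

The semaphore starts at zero, so a consumer that asks for the value early simply waits. Nothing beyond a closure, a thread, and a semaphore is needed, which mirrors the abstract's claim that no language extensions are required.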
Gaut, Brandon S
2015-07-01
In this commentary, I make inferences about the level of repeatability and constraint in the evolutionary process, based on two sets of replicated experiments. The first experiment is crop domestication, which has been replicated across many different species. I focus on results of whole-genome scans for genes selected during domestication and ask whether genes are, in fact, selected in parallel across different domestication events. If genes are selected in parallel, it implies that the number of genetic solutions to the challenge of domestication is constrained. However, I find no evidence for parallel selection events either between species (maize vs. rice) or within species (two domestication events within beans). These results suggest that there are few constraints on genetic adaptation, but conclusions must be tempered by several complicating factors, particularly the lack of explicit design standards for selection screens. The second experiment involves the evolution of Escherichia coli to thermal stress. Unlike domestication, this highly replicated experiment detected a limited set of genes that appear prone to modification during adaptation to thermal stress. However, the number of potentially beneficial mutations within these genes is large, such that adaptation is constrained at the genic level but much less so at the nucleotide level. Based on these two experiments, I make the general conclusion that evolution is remarkably flexible, despite the presence of epistatic interactions that constrain evolutionary trajectories. I also posit that evolution is so rapid that we should establish a Speciation Prize, to be awarded to the first researcher who demonstrates speciation with a sexual organism in the laboratory.
Accelerating large-scale protein structure alignments with graphics processing units
2012-01-01
Background: Large-scale protein structure alignment, an indispensable tool for structural bioinformatics, poses a tremendous challenge to computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings: We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions: ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs. PMID:22357132
Velotta, Jonathan P.; Wegrzyn, Jill L.; Ginzburg, Samuel; Kang, Lin; Czesny, Sergiusz J.; O'Neill, Rachel J.; McCormick, Stephen; Michalak, Pawel; Schultz, Eric T.
2017-01-01
Comparative approaches in physiological genomics offer an opportunity to understand the functional importance of genes involved in niche exploitation. We used populations of Alewife (Alosa pseudoharengus) to explore the transcriptional mechanisms that underlie adaptation to fresh water. Ancestrally anadromous Alewives have recently formed multiple, independently derived, landlocked populations, which exhibit reduced tolerance of saltwater and enhanced tolerance of fresh water. Using RNA-seq, we compared transcriptional responses of an anadromous Alewife population to two landlocked populations after acclimation to fresh (0 ppt) and saltwater (35 ppt). Our results suggest that the gill transcriptome has evolved in primarily discordant ways between independent landlocked populations and their anadromous ancestor. By contrast, evolved shifts in the transcription of a small suite of well-characterized osmoregulatory genes exhibited a strong degree of parallelism. In particular, transcription of genes that regulate gill ion exchange has diverged in accordance with functional predictions: freshwater ion-uptake genes (most notably, the ‘freshwater paralog’ of Na+/K+-ATPase α-subunit) were more highly expressed in landlocked forms, whereas genes that regulate saltwater ion secretion (e.g. the ‘saltwater paralog’ of NKAα) exhibited a blunted response to saltwater. Parallel divergence of ion transport gene expression is associated with shifts in salinity tolerance limits among landlocked forms, suggesting that changes to the gill's transcriptional response to salinity facilitate freshwater adaptation.
Besnier, Francois; Glover, Kevin A.
2013-01-01
This software package provides an R-based framework to make use of multi-core computers when running analyses in the population genetics program STRUCTURE. It is especially addressed to those users of STRUCTURE dealing with numerous and repeated data analyses, who could take advantage of an efficient script to automatically distribute STRUCTURE jobs among multiple processors. It also provides additional functions to divide analyses among combinations of populations within a single data set without the need to manually produce multiple projects, as is currently the case in STRUCTURE. The package consists of two main functions, MPI_structure() and parallel_structure(), as well as an example data file. We compared the performance in computing time for this example data on two computer architectures and showed that the use of the present functions can result in several-fold improvements in terms of computation time. ParallelStructure is freely available at https://r-forge.r-project.org/projects/parallstructure/. PMID:23923012
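The general pattern described here, enumerating many independent runs and handing them to a pool of workers, can be sketched as follows. This is a Python illustration of the idea only, not the R package's interface: `build_jobs`, `run_job`, and `run_all` are hypothetical names, and `run_job` is a placeholder that returns a label instead of launching the external program.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def build_jobs(pop_sets, k_values, n_reps):
    """Enumerate one job per (population subset, K, replicate) combination."""
    combos = product(pop_sets, k_values, range(1, n_reps + 1))
    return [
        {"id": i, "pops": pops, "k": k, "rep": rep}
        for i, (pops, k, rep) in enumerate(combos, start=1)
    ]

def run_job(job):
    """Hypothetical stand-in for one external analysis run.

    A real driver would start the external program here (for example with
    subprocess); this placeholder only returns a label so the sketch runs.
    """
    pops = "+".join(job["pops"])
    return f"job{job['id']}:pops={pops}:K={job['k']}:rep={job['rep']}"

def run_all(jobs, n_workers=4):
    """Distribute independent jobs over a worker pool; results keep job order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_job, jobs))

if __name__ == "__main__":
    jobs = build_jobs([("A", "B"), ("A", "C")], k_values=[1, 2], n_reps=2)
    for line in run_all(jobs):
        print(line)
```

Threads are sufficient in this sketch on the assumption that each job would spend its time waiting on an external process, so the worker count plays the role of the number of processors to keep busy.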
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chin, George; Marquez, Andres; Choudhury, Sutanay
2012-09-01
Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.
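To make the notion of a triad census concrete, the sketch below counts triads in a small directed graph by brute force. It is deliberately simplified and serial: it groups triads only by their numbers of mutual, asymmetric, and null node pairs (the three-digit M-A-N code) and ignores the directional variants that a full census distinguishes, and it bears no relation to the optimized shared-memory algorithm the abstract describes. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def dyad_type(edges, u, v):
    """Classify a node pair as mutual 'M', asymmetric 'A', or null 'N'."""
    forward, backward = (u, v) in edges, (v, u) in edges
    if forward and backward:
        return "M"
    if forward or backward:
        return "A"
    return "N"

def man_census(nodes, edges):
    """Count triads by their (mutual, asymmetric, null) dyad counts.

    Simplification: directional subtypes of the full census are merged,
    so only the three-digit M-A-N code is recorded for each triad.
    """
    edges = set(edges)
    census = Counter()
    for a, b, c in combinations(sorted(nodes), 3):
        dyads = [
            dyad_type(edges, a, b),
            dyad_type(edges, a, c),
            dyad_type(edges, b, c),
        ]
        code = f"{dyads.count('M')}{dyads.count('A')}{dyads.count('N')}"
        census[code] += 1
    return census

if __name__ == "__main__":
    nodes = [1, 2, 3, 4]
    edges = [(1, 2), (2, 1), (2, 3), (3, 4)]
    print(dict(man_census(nodes, edges)))
```

Enumerating every node triple costs time cubic in the number of nodes, which illustrates why a census of this kind stops scaling on very large graphs. Each triple is classified independently, so the outer loop is also the natural place to expose parallelism.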
Knoeferle, Pia; Crocker, Matthew W
2009-12-01
Reading times for the second conjunct of and-coordinated clauses are faster when the second conjunct parallels the first conjunct in its syntactic or semantic (animacy) structure than when its structure differs (Frazier, Munn, & Clifton, 2000; Frazier, Taft, Roeper, & Clifton, 1984). What remains unclear, however, is the time course of parallelism effects, their scope, and the kinds of linguistic information to which they are sensitive. Findings from the first two eye-tracking experiments revealed incremental constituent order parallelism across the board, both during structural disambiguation (Experiment 1) and in sentences with unambiguously case-marked constituent order (Experiment 2), as well as for both marked and unmarked constituent orders (Experiments 1 and 2). Findings from Experiment 3 revealed effects of both constituent order and subtle semantic (noun phrase similarity) parallelism. Together our findings provide evidence for an across-the-board account of parallelism for processing and-coordinated clauses, in which both constituent order and semantic aspects of representations contribute towards incremental parallelism effects. We discuss our findings in the context of existing findings on parallelism and priming, as well as mechanisms of sentence processing.
Strategies for Large Scale Implementation of a Multiscale, Multiprocess Integrated Hydrologic Model
NASA Astrophysics Data System (ADS)
Kumar, M.; Duffy, C.
2006-05-01
Distributed models simulate hydrologic state variables in space and time while taking into account the heterogeneities in terrain, surface, subsurface properties and meteorological forcings. The computational cost and complexity of these models increase as they attempt to accurately simulate a large number of interacting physical processes at fine spatio-temporal resolution in a large basin. A hydrologic model run on a coarse spatial discretization of the watershed with a limited number of physical processes carries a smaller computational load, but this reduces the accuracy of model results and restricts physical realization of the problem. It is therefore imperative to have an integrated modeling strategy (a) which can be universally applied at various scales in order to study the tradeoffs between computational complexity (determined by spatio-temporal resolution), accuracy and predictive uncertainty in relation to various approximations of physical processes, (b) which can be applied at adaptively different spatial scales in the same domain by taking into account the local heterogeneity of topography and hydrogeologic variables, and (c) which is flexible enough to incorporate different numbers and approximations of process equations depending on model purpose and computational constraints. An efficient implementation of this strategy becomes all the more important for the Great Salt Lake river basin, which is relatively large (~89000 sq. km) and complex in terms of hydrologic and geomorphic conditions. The types and time scales of the hydrologic processes that dominate also differ between parts of the basin. Part of the snowmelt runoff generated in the Uinta Mountains infiltrates and contributes as base flow to the Great Salt Lake over a time scale of decades to centuries. The adaptive strategy helps capture the steep topographic and climatic gradient along the Wasatch front.
Here we present the aforesaid modeling strategy along with an associated hydrologic modeling framework which facilitates a seamless, computationally efficient and accurate integration of the process model with the data model. The flexibility of this framework leads to implementation of multiscale, multiresolution, adaptive refinement/de-refinement and nested modeling simulations with the least computational burden. However, performing these simulations and the related calibration of these models over a large basin at higher spatio-temporal resolutions is computationally intensive and requires increasing computing power. With the advent of parallel processing architectures, high computing performance can be achieved by parallelization of existing serial integrated-hydrologic-model code. This translates to running the same model simulation on a network of a large number of processors, thereby reducing the time needed to obtain a solution. The paper also discusses the implementation of the integrated model on parallel processors. Also discussed are the mapping of the problem onto a multi-processor environment, a method to incorporate coupling between hydrologic processes using interprocessor communication models, the model data structure, and parallel numerical algorithms to obtain high performance.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed
NASA Astrophysics Data System (ADS)
Kim, H.; Choi, Y.; Yang, Y.
In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for an adaptive optics (AO) testbed with a 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of a Shack-Hartmann sensor (SHS) and deformable mirror (DM), a tip-tilt sensor (TTS), a tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM of 277 actuators with Fried geometry, requiring an SPS with high speed parallel computing capability. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instruments (NI) real time computer and five FPGA boards based on the state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabVIEW environment, providing flexibility when applying different algorithms for WFE correction. It also provides a faster programming and debugging environment than conventional ones. One of the five FPGAs is assigned to measure the TTS and calculate control signals for the TTM, while the remaining four are used to receive the SHS signal and calculate the slopes for each subaperture and the correction signal for the DM. With these parallel processing capabilities of the SPS, the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
Displacement and deformation measurement for large structures by camera network
NASA Astrophysics Data System (ADS)
Shang, Yang; Yu, Qifeng; Yang, Zhen; Xu, Zhiqiang; Zhang, Xiaohu
2014-03-01
A displacement and deformation measurement method for large structures by a series-parallel connection camera network is presented. By taking the dynamic monitoring of a large-scale crane in lifting operation as an example, a series-parallel connection camera network is designed, and the displacement and deformation measurement method by using this series-parallel connection camera network is studied. The movement range of the crane body is small, and that of the crane arm is large. The displacement of the crane body, the displacement of the crane arm relative to the body and the deformation of the arm are measured. Compared with a pure series or parallel connection camera network, the designed series-parallel connection camera network can be used to measure not only the movement and displacement of a large structure but also the relative movement and deformation of some interesting parts of the large structure by a relatively simple optical measurement system.
NASA Astrophysics Data System (ADS)
Chan, Chia-Hsin; Tu, Chun-Chuan; Tsai, Wen-Jiin
2017-01-01
High efficiency video coding (HEVC) not only improves the coding efficiency drastically compared to the well-known H.264/AVC but also introduces coding tools for parallel processing, one of which is tiles. Tile partitioning is allowed to be arbitrary in HEVC, but how to decide tile boundaries remains an open issue. An adaptive tile boundary (ATB) method is proposed to select a better tile partitioning to improve load balancing (ATB-LoadB) and coding efficiency (ATB-Gain) with a unified scheme. Experimental results show that, compared to ordinary uniform-space partitioning, the proposed ATB can save up to 17.65% of encoding times in parallel encoding scenarios and can reduce up to 0.8% of total bit rates for coding efficiency.
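To make the load-balancing side of tile partitioning concrete, here is a minimal sketch that places tile column boundaries so that each tile carries roughly the same estimated encoding cost. It is a simple prefix-sum heuristic and not the ATB method described in the abstract; the per-column cost values, the function names and the assumption that costs are positive are all hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def balanced_boundaries(costs, num_tiles):
    """Return column indices where a new tile starts (exclusive ends of tiles).

    costs: estimated encoding cost per column of coding blocks (assumed > 0).
    """
    n = len(costs)
    prefix = list(accumulate(costs))  # prefix[i] = cost of columns 0..i
    total = prefix[-1]
    bounds, prev = [], 0
    for k in range(1, num_tiles):
        target = total * k / num_tiles
        i = bisect_left(prefix, target)  # first column whose prefix reaches target
        # Step back one column if that lands closer to the target.
        if i > 0 and target - prefix[i - 1] <= prefix[i] - target:
            i -= 1
        cut = i + 1
        # Keep every tile non-empty and leave room for the remaining tiles.
        cut = min(max(cut, prev + 1), n - (num_tiles - k))
        bounds.append(cut)
        prev = cut
    return bounds


def tile_loads(costs, bounds):
    """Sum the cost inside each tile defined by the boundaries."""
    edges = [0] + bounds + [len(costs)]
    return [sum(costs[a:b]) for a, b in zip(edges, edges[1:])]


# Hypothetical costs: expensive columns at the picture edges.
costs = [4, 1, 1, 1, 1, 4, 4]
bounds = balanced_boundaries(costs, 3)
loads = tile_loads(costs, bounds)
```

In a parallel encoder the slowest tile determines the wall-clock time, so the quantity of interest here is the largest entry of `loads`.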
NASA Technical Reports Server (NTRS)
Campbell, R. H.; Essick, Ray B.; Johnston, Gary; Kenny, Kevin; Russo, Vince
1987-01-01
Project EOS is studying the problems of building adaptable real-time embedded operating systems for the scientific missions of NASA. Choices (A Class Hierarchical Open Interface for Custom Embedded Systems) is an operating system designed and built by Project EOS to address the following specific issues: the software architecture for adaptable embedded parallel operating systems, the achievement of high-performance and real-time operation, the simplification of interprocess communications, the isolation of operating system mechanisms from one another, and the separation of mechanisms from policy decisions. Choices is written in C++ and runs on a ten processor Encore Multimax. The system is intended for use in constructing specialized computer applications and research on advanced operating system features including fault tolerance and parallelism.
Marine Controlled-Source Electromagnetic 2D Inversion for synthetic models.
NASA Astrophysics Data System (ADS)
Liu, Y.; Li, Y.
2016-12-01
We present a 2D inverse algorithm for frequency domain marine controlled-source electromagnetic (CSEM) data, which is based on the regularized Gauss-Newton approach. As a forward solver, our parallel adaptive finite element forward modeling program is employed. It is a self-adaptive, goal-oriented grid refinement algorithm in which a finite element analysis is performed on a sequence of refined meshes. The mesh refinement process is guided by a dual error estimate weighting to bias refinement towards elements that affect the solution at the EM receiver locations. With the use of the direct solver (MUMPS), we can effectively compute the electromagnetic fields for multi-sources and parametric sensitivities. We also implement the parallel data domain decomposition approach of Key and Ovall (2011), with the goal of being able to compute accurate responses in parallel for complicated models and a full suite of data parameters typical of offshore CSEM surveys. All minimizations are carried out by using the Gauss-Newton algorithm and model perturbations at each iteration step are obtained by using the Inexact Conjugate Gradient iteration method. Synthetic test inversions are presented.
Ecological adaptation of diverse honey bee (Apis mellifera) populations.
Parker, Robert; Melathopoulos, Andony P; White, Rick; Pernal, Stephen F; Guarna, M Marta; Foster, Leonard J
2010-06-15
Honey bees are complex eusocial insects that provide a critical contribution to human agricultural food production. Their natural migration has selected for traits that increase fitness within geographical areas, but in parallel their domestication has selected for traits that enhance productivity and survival under local conditions. Elucidating the biochemical mechanisms of these local adaptive processes is a key goal of evolutionary biology. Proteomics provides tools unique among the major 'omics disciplines for identifying the mechanisms employed by an organism in adapting to environmental challenges. Through proteome profiling of adult honey bee midgut from geographically dispersed, domesticated populations combined with multiple parallel statistical treatments, the data presented here suggest some of the major cellular processes involved in adapting to different climates. These findings provide insight into the molecular underpinnings that may confer an advantage to honey bee populations. Significantly, the major energy-producing pathways of the mitochondria, the organelle most closely involved in heat production, were consistently higher in bees that had adapted to colder climates. In opposition, up-regulation of protein metabolism capacity, from biosynthesis to degradation, had been selected for in bees from warmer climates. Overall, our results present a proteomic interpretation of expression polymorphisms between honey bee ecotypes and provide insight into molecular aspects of local adaptation or selection with consequences for honey bee management and breeding. The implications of our findings extend beyond apiculture as they underscore the need to consider the interdependence of animal populations and their agro-ecological context.
A new parallel-vector finite element analysis software on distributed-memory computers
NASA Technical Reports Server (NTRS)
Qin, Jiangning; Nguyen, Duc T.
1993-01-01
A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed-memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. Block-skyline storage scheme along with vector-unrolling techniques are used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation
NASA Astrophysics Data System (ADS)
Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.
2011-03-01
A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
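The inverse power algorithm mentioned above can be sketched on a toy problem. The sketch assumes the generalized eigenproblem has the form H φ = (1/k) F φ, where the largest k is sought, and uses hypothetical 2x2 matrices solved by Cramer's rule; it is a serial illustration only, not the domain-decomposed solver of the paper.

```python
def matvec(m, v):
    """Dense matrix-vector product for small lists of lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def solve2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]


def inverse_power(H, F, tol=1e-12, max_iter=200):
    """Iterate H phi_new = F phi_old / k and update k from the source ratio."""
    phi, k = [1.0, 1.0], 1.0
    for _ in range(max_iter):
        src = matvec(F, phi)
        new_phi = solve2(H, [s / k for s in src])
        new_src = matvec(F, new_phi)
        new_k = k * sum(new_src) / sum(src)
        converged = abs(new_k - k) < tol
        phi, k = new_phi, new_k
        if converged:
            break
    return k, phi


# Hypothetical two-group-like operators (illustrative values only).
H = [[2.0, 0.0], [-1.0, 1.0]]
F = [[1.0, 0.5], [0.0, 0.0]]
k, phi = inverse_power(H, F)
```

Each pass needs one linear solve with H, which is the step a domain decomposition method distributes across subdomains.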
Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak
1999-01-01
The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.
Zhou, Xian; Chen, Xue
2011-05-09
Digital coherent receivers combine coherent detection with digital signal processing (DSP) to compensate for transmission impairments, and are therefore a promising candidate for future high-speed optical transmission systems. However, the maximum symbol rate supported by such real-time receivers is limited by the processing rate of the hardware. To cope with this difficulty, parallel processing algorithms are imperative. In this paper, we propose a novel parallel digital timing recovery loop (PDTRL) based on our previous work. Furthermore, to increase the dynamic dispersion tolerance range of receivers, we embed a parallel adaptive equalizer in the PDTRL. This parallel joint scheme (PJS) can be used to complete synchronization, equalization and polarization de-multiplexing simultaneously. Finally, we demonstrate that PDTRL and PJS allow the hardware to process a 112 Gbit/s POLMUX-DQPSK signal in the hundreds-of-MHz range. © 2011 Optical Society of America
NASA Astrophysics Data System (ADS)
Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.
2016-10-01
Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted, parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is a big opportunity for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ* , which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm which takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
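For orientation, the operator-splitting iteration referred to above can be written out in its standard textbook form. The source function $S = (1-\epsilon)J + \epsilon B$, with thermal coupling parameter $\epsilon$ and Planck function $B$, is an assumption introduced here for illustration and is not stated in the abstract.

```latex
\begin{align}
  \Lambda &= \Lambda^{*} + \left(\Lambda - \Lambda^{*}\right), \\
  J_{\mathrm{new}} &= \Lambda^{*} S_{\mathrm{new}}
      + \left(\Lambda - \Lambda^{*}\right) S_{\mathrm{old}}, \\
  \left[\,1 - \Lambda^{*}(1-\epsilon)\,\right] J_{\mathrm{new}}
      &= \Lambda S_{\mathrm{old}} - \Lambda^{*}(1-\epsilon)\, J_{\mathrm{old}} .
\end{align}
```

The last line is a linear system whose matrix inherits the narrow-banded structure of $\Lambda^{*}$ and which has to be solved in every iteration step, which is why a solver tailored to that structure pays off.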
Generalized parallel-perspective stereo mosaics from airborne video.
Zhu, Zhigang; Hanson, Allen R; Riseman, Edward M
2004-02-01
In this paper, we present a new method for automatically and efficiently generating stereoscopic mosaics by seamless registration of images collected by a video camera mounted on an airborne platform. Using a parallel-perspective representation, a pair of geometrically registered stereo mosaics can be precisely constructed under quite general motion. A novel parallel ray interpolation for stereo mosaicing (PRISM) approach is proposed to make stereo mosaics seamless in the presence of obvious motion parallax and for rather arbitrary scenes. Parallel-perspective stereo mosaics generated with the PRISM method have better depth resolution than perspective stereo due to the adaptive baseline geometry. Moreover, unlike previous results showing that parallel-perspective stereo has a constant depth error, we conclude that the depth estimation error of stereo mosaics is in fact a linear function of the absolute depths of a scene. Experimental results on long video sequences are given.
A Parallel Workload Model and its Implications for Processor Allocation
1996-11-01
with SEV or AVG, both of which can tolerate c = 0.4-0.6 before their performance deteriorates significantly. On the other hand, Setia [10] has...Sanjeev K. Setia. The interaction between memory allocation and adaptive partitioning in message-passing multicomputers. In IPPS Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89-99, 1995. [11] Sanjeev K. Setia and Satish K. Tripathi. An analysis of several processor
Parallel deterministic neutronics with AMR in 3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clouse, C.; Ferguson, J.; Hendrickson, C.
1997-12-31
AMTRAN, a three dimensional Sn neutronics code with adaptive mesh refinement (AMR), has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block refined AMR is used with linear finite element representations for the fluxes, which allows for a straightforward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.
Parallel Implementation of the Wideband DOA Algorithm on the IBM Cell BE Processor
2010-05-01
Abstract: The Multiple Signal Classification (MUSIC) algorithm is a powerful technique for determining the Direction of Arrival (DOA) of signals...Broadband Engine Processor (Cell BE). The process of adapting the serial based MUSIC algorithm to the Cell BE will be analyzed in terms of parallelism and...using Multiple Signal Classification MUSIC algorithm [4] • Computation of Focus matrix • Computation of number of sources • Separation of Signal
Sullam, Karen E; Rubin, Benjamin E R; Dalton, Christopher M; Kilham, Susan S; Flecker, Alexander S; Russell, Jacob A
2015-07-01
Diverse microbial consortia profoundly influence animal biology, necessitating an understanding of microbiome variation in studies of animal adaptation. Yet, little is known about such variability among fish, in spite of their importance in aquatic ecosystems. The Trinidadian guppy, Poecilia reticulata, is an intriguing candidate to test microbiome-related hypotheses on the drivers and consequences of animal adaptation, given the recent parallel origins of a similar ecotype across streams. To assess the relationships between the microbiome and host adaptation, we used 16S rRNA amplicon sequencing to characterize gut bacteria of two guppy ecotypes with known divergence in diet, life history, physiology and morphology collected from low-predation (LP) and high-predation (HP) habitats in four Trinidadian streams. Guts were populated by several recurring, core bacteria that are related to other fish associates and rarely detected in the environment. Although gut communities of lab-reared guppies differed from those in the wild, microbiome divergence between ecotypes from the same stream was evident under identical rearing conditions, suggesting host genetic divergence can affect associations with gut bacteria. In the field, gut communities varied over time, across streams and between ecotypes in a stream-specific manner. This latter finding, along with PICRUSt predictions of metagenome function, argues against strong parallelism of the gut microbiome in association with LP ecotype evolution. Thus, bacteria cannot be invoked in facilitating the heightened reliance of LP guppies on lower-quality diets. We argue that the macroevolutionary microbiome convergence seen across animals with similar diets may be a signature of secondary microbial shifts arising some time after host-driven adaptation.
Sullam, Karen E; Rubin, Benjamin ER; Dalton, Christopher M; Kilham, Susan S; Flecker, Alexander S; Russell, Jacob A
2015-01-01
Diverse microbial consortia profoundly influence animal biology, necessitating an understanding of microbiome variation in studies of animal adaptation. Yet, little is known about such variability among fish, in spite of their importance in aquatic ecosystems. The Trinidadian guppy, Poecilia reticulata, is an intriguing candidate to test microbiome-related hypotheses on the drivers and consequences of animal adaptation, given the recent parallel origins of a similar ecotype across streams. To assess the relationships between the microbiome and host adaptation, we used 16S rRNA amplicon sequencing to characterize gut bacteria of two guppy ecotypes with known divergence in diet, life history, physiology and morphology collected from low-predation (LP) and high-predation (HP) habitats in four Trinidadian streams. Guts were populated by several recurring, core bacteria that are related to other fish associates and rarely detected in the environment. Although gut communities of lab-reared guppies differed from those in the wild, microbiome divergence between ecotypes from the same stream was evident under identical rearing conditions, suggesting host genetic divergence can affect associations with gut bacteria. In the field, gut communities varied over time, across streams and between ecotypes in a stream-specific manner. This latter finding, along with PICRUSt predictions of metagenome function, argues against strong parallelism of the gut microbiome in association with LP ecotype evolution. Thus, bacteria cannot be invoked in facilitating the heightened reliance of LP guppies on lower-quality diets. We argue that the macroevolutionary microbiome convergence seen across animals with similar diets may be a signature of secondary microbial shifts arising some time after host-driven adaptation. PMID:25575311
Reconfigurable Model Execution in the OpenMDAO Framework
NASA Technical Reports Server (NTRS)
Hwang, John T.
2017-01-01
NASA's OpenMDAO framework facilitates constructing complex models and computing their derivatives for multidisciplinary design optimization. Decomposing a model into components that follow a prescribed interface enables OpenMDAO to assemble multidisciplinary derivatives from the component derivatives using what amounts to the adjoint method, direct method, chain rule, global sensitivity equations, or any combination thereof, using the MAUD architecture. OpenMDAO also handles the distribution of processors among the disciplines by hierarchically grouping the components, and it automates the data transfer between components that are on different processors. These features have made OpenMDAO useful for applications in aircraft design, satellite design, wind turbine design, and aircraft engine design, among others. This paper presents new algorithms for OpenMDAO that enable reconfigurable model execution. This concept refers to dynamically changing, during execution, one or more of: the variable sizes, solution algorithm, parallel load balancing, or set of variables, i.e., adding and removing components, perhaps to switch to a higher-fidelity sub-model. Any component can reconfigure at any point, even when running in parallel with other components, and the reconfiguration algorithm presented here performs the synchronized updates to all other components that are affected. A reconfigurable software framework for multidisciplinary design optimization enables new adaptive solvers, adaptive parallelization, and new applications such as gradient-based optimization with overset flow solvers and adaptive mesh refinement. Benchmarking results demonstrate the time savings for reconfiguration compared to setting up the model again from scratch, which can be significant in large-scale problems.
Additionally, the new reconfigurability feature is applied to a mission profile optimization problem for commercial aircraft where both the parametrization of the mission profile and the time discretization are adaptively refined, resulting in computational savings of roughly 10% and the elimination of oscillations in the optimized altitude profile.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
Parallel altitudinal clines reveal trends in adaptive evolution of genome size in Zea mays
Berg, Jeremy J.; Birchler, James A.; Grote, Mark N.; Lorant, Anne; Quezada, Juvenal
2018-01-01
While the vast majority of genome size variation in plants is due to differences in repetitive sequence, we know little about how selection acts on repeat content in natural populations. Here we investigate parallel changes in intraspecific genome size and repeat content of domesticated maize (Zea mays) landraces and their wild relative teosinte across altitudinal gradients in Mesoamerica and South America. We combine genotyping, low coverage whole-genome sequence data, and flow cytometry to test for evidence of selection on genome size and individual repeat abundance. We find that population structure alone cannot explain the observed variation, implying that clinal patterns of genome size are maintained by natural selection. Our modeling additionally provides evidence of selection on individual heterochromatic knob repeats, likely due to their large individual contribution to genome size. To better understand the phenotypes driving selection on genome size, we conducted a growth chamber experiment using a population of highland teosinte exhibiting extensive variation in genome size. We find weak support for a positive correlation between genome size and cell size, but stronger support for a negative correlation between genome size and the rate of cell production. Reanalyzing published data of cell counts in maize shoot apical meristems, we then identify a negative correlation between cell production rate and flowering time. Together, our data suggest a model in which variation in genome size is driven by natural selection on flowering time across altitudinal clines, connecting intraspecific variation in repetitive sequence to important differences in adaptive phenotypes. PMID:29746459
Implicit schemes and parallel computing in unstructured grid CFD
NASA Technical Reports Server (NTRS)
Venkatakrishnam, V.
1995-01-01
The development of implicit schemes for obtaining steady state solutions to the Euler and Navier-Stokes equations on unstructured grids is outlined. Applications are presented that compare the convergence characteristics of various implicit methods. The development of explicit and implicit schemes to compute unsteady flows on unstructured grids is then discussed, followed by an outline of the issues involved in parallelizing finite volume schemes on unstructured meshes in an MIMD (multiple instruction/multiple data stream) fashion. Techniques for partitioning unstructured grids among processors and for extracting parallelism in explicit and implicit solvers are discussed. Finally, some dynamic load balancing ideas, which are useful in adaptive transient computations, are presented.
Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes
Jones, Terry R.; Watson, Pythagoras C.; Tuel, William; Brenner, Larry; Caffrey, Patrick; Fier, Jeffrey
2010-10-05
In a parallel computing environment comprising a network of SMP nodes each having at least one processor, a parallel-aware co-scheduling method and system for improving the performance and scalability of a dedicated parallel job having synchronizing collective operations. The method and system uses a global co-scheduler and an operating system kernel dispatcher adapted to coordinate interfering system and daemon activities on a node and across nodes to promote intra-node and inter-node overlap of said interfering system and daemon activities as well as intra-node and inter-node overlap of said synchronizing collective operations. In this manner, the impact of random short-lived interruptions, such as timer-decrement processing and periodic daemon activity, on synchronizing collective operations is minimized on large processor-count SPMD bulk-synchronous programming styles.
Design of High Field Solenoids made of High Temperature Superconductors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bartalesi, Antonio; /Pisa U.
2010-12-01
This thesis starts from the analytical mechanical analysis of a superconducting solenoid loaded by self-generated Lorentz forces. A finite element model is then proposed and verified against the analytical results. To study the anisotropic behavior of a coil made of layers of superconductor and insulation, a finite element meso-mechanical model is proposed and designed. The resulting material properties are then used in the main solenoid analysis. In parallel, design work is performed as well: an existing Insert Test Facility (ITF) is adapted and structurally verified to support a coil made of YBa2Cu3O7, a High Temperature Superconductor (HTS). Finally, a technological winding process is proposed and the required tooling is designed.
Twisted, multifilament Nb3Sn superconductive ribbon
NASA Technical Reports Server (NTRS)
Coles, W. D.
1972-01-01
An experimental study of superconductor stabilization has resulted in the successful application of the concepts of filamentary structure and conductor twist to Nb3Sn ribbon. The Nb3Sn is formed in parallel, helical paths, which are continuous around the ribbon. Short lengths (12-18 cm) of 1.27 cm wide superconductive ribbon were produced. The filamentary and twist characteristics are incorporated in the ribbon by means of an inert mask formed on the ribbon surface early in the fabrication process. Diffusion reaction of the niobium and tin is prevented at the filament boundaries. The methods of conductor fabrication and the test results obtained are described. The technology required to adapt the processes for the production of long lengths of ribbon is available.
RFQ device for accelerating particles
Shepard, K.W.; Delayen, J.R.
1995-06-06
A superconducting radio frequency quadrupole (RFQ) device includes four spaced elongated, linear, tubular rods disposed parallel to a charged particle beam axis, with each rod supported by two spaced tubular posts oriented radially with respect to the beam axis. The rod and post geometry of the device has four-fold rotation symmetry, lowers the frequency of the quadrupole mode below that of the dipole mode, and provides large dipole-quadrupole mode isolation to accommodate a range of mechanical tolerances. The simplicity of the geometry of the structure, which can be formed by joining eight simple T-sections, provides a high degree of mechanical stability, is insensitive to mechanical displacement, and is particularly adapted for fabrication with superconducting materials such as niobium. 5 figs.
Programming Probabilistic Structural Analysis for Parallel Processing Computer
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.
1991-01-01
The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
Yokoyama, Masaru; Nomaguchi, Masako; Doi, Naoya; Kanda, Tadahito; Adachi, Akio; Sato, Hironori
2016-01-01
Variable V1/V2 and V3 loops on human immunodeficiency virus type 1 (HIV-1) envelope-gp120 core play key roles in modulating viral competence to recognize two infection receptors, CD4 and chemokine-receptors. However, molecular bases for the modulation largely remain unclear. To address these issues, we constructed structural models for a full-length gp120 in CD4-free and -bound states. The models showed topologies of gp120 surface loop that agree with those in reported structural data. Molecular dynamics simulation showed that in the unliganded state, V1/V2 loop settled into a thermodynamically stable arrangement near V3 loop for conformational masking of V3 tip, a potent neutralization epitope. In the CD4-bound state, however, V1/V2 loop was rearranged near the bound CD4 to support CD4 binding. In parallel, cell-based adaptation in the absence of anti-viral antibody pressures led to the identification of amino acid substitutions that individually enhance viral entry and growth efficiencies in association with reduced sensitivity to CCR5 antagonist TAK-779. Notably, all these substitutions were positioned on the receptors binding surfaces in V1/V2 or V3 loop. In silico structural studies predicted some physical changes of gp120 by substitutions with alterations in viral replication phenotypes. These data suggest that V1/V2 loop is critical for creating a gp120 structure that masks co-receptor binding site compatible with maintenance of viral infectivity, and for tuning a functional balance of gp120 between immune escape ability and infectivity to optimize HIV-1 replication fitness. PMID:26903989
Free-energy landscapes from adaptively biased methods: Application to quantum systems
NASA Astrophysics Data System (ADS)
Calvo, F.
2010-10-01
Several parallel adaptive biasing methods are applied to the calculation of free-energy pathways along reaction coordinates, choosing as a difficult example the double-funnel landscape of the 38-atom Lennard-Jones cluster. In the case of classical statistics, the Wang-Landau and adaptively biased molecular-dynamics (ABMD) methods are both found efficient if multiple walkers and replication and deletion schemes are used. An extension of the ABMD technique to quantum systems, implemented through the path-integral MD framework, is presented and tested on Ne38 against the quantum superposition method.
F-8C adaptive control law refinement and software development
NASA Technical Reports Server (NTRS)
Hartmann, G. L.; Stein, G.
1981-01-01
An explicit adaptive control algorithm based on maximum likelihood estimation of parameters was designed. To avoid iterative calculations, the algorithm uses parallel channels of Kalman filters operating at fixed locations in parameter space. This algorithm was implemented in NASA/DFRC's Remotely Augmented Vehicle (RAV) facility. Real-time sensor outputs (rate gyro, accelerometer, surface position) are telemetered to a ground computer which sends new gain values to an on-board system. Ground test data and flight records were used to establish design values of noise statistics and to verify the ground-based adaptive software.
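As a rough illustration of the parallel-channel idea in the abstract above (not the F-8C flight software), the sketch below runs one scalar Kalman filter per candidate parameter value and selects the candidate whose innovations have the highest accumulated log-likelihood. The first-order model, the noise variances, and the names `kalman_channel`, `best_channel`, and `simulate` are all assumptions made for this example.

```python
import math
import random


def kalman_channel(a, ys, q=1.0, r=0.1):
    """Scalar Kalman filter for x[k+1] = a*x[k] + w, y[k] = x[k] + v.

    Returns the accumulated Gaussian log-likelihood of the innovations,
    which scores how well the fixed parameter value `a` explains the data.
    """
    x, p = 0.0, 1.0  # state estimate and its variance
    loglik = 0.0
    for y in ys:
        # Predict one step ahead with this channel's parameter value.
        x_pred = a * x
        p_pred = a * a * p + q
        # Innovation and its variance.
        nu = y - x_pred
        s = p_pred + r
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + nu * nu / s)
        # Measurement update.
        k = p_pred / s
        x = x_pred + k * nu
        p = (1.0 - k) * p_pred
    return loglik


def best_channel(candidates, ys):
    """Score every channel (they are independent, hence parallelizable)."""
    scores = {a: kalman_channel(a, ys) for a in candidates}
    return max(scores, key=scores.get), scores


def simulate(a_true, n, q=1.0, r=0.1, seed=0):
    """Generate synthetic measurements from the assumed first-order model."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x = a_true * x + rng.gauss(0.0, math.sqrt(q))
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    return ys
```

Because each channel uses a fixed parameter value, no iterative search is needed at run time; the cost is one filter per grid point, and the channels can run side by side.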
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.; ...
2016-01-01
We present MADNESS (multiresolution adaptive numerical environment for scientific simulation) that is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision that are based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
Joost, Stéphane; Vuilleumier, Séverine; Jensen, Jeffrey D; Schoville, Sean; Leempoel, Kevin; Stucki, Sylvie; Widmer, Ivo; Melodelima, Christelle; Rolland, Jonathan; Manel, Stéphanie
2013-07-01
A workshop recently held at the École Polytechnique Fédérale de Lausanne (EPFL, Switzerland) was dedicated to understanding the genetic basis of adaptive change, taking stock of the different approaches developed in theoretical population genetics and landscape genomics and bringing together knowledge accumulated in both research fields. Indeed, an important challenge in theoretical population genetics is to incorporate effects of demographic history and population structure. But important design problems (e.g. focus on populations as units, focus on hard selective sweeps, no hypothesis-based framework in the design of the statistical tests) reduce their capability of detecting adaptive genetic variation. In parallel, landscape genomics offers a solution to several of these problems and provides a number of advantages (e.g. fast computation, landscape heterogeneity integration). But the approach makes several implicit assumptions that should be carefully considered (e.g. selection has had enough time to create a functional relationship between the allele distribution and the environmental variable, or this functional relationship is assumed to be constant). To address the respective strengths and weaknesses mentioned above, the workshop brought together a panel of experts from both disciplines to present their work and discuss the relevance of combining these approaches, possibly resulting in a joint software solution in the future.
NASA Astrophysics Data System (ADS)
Robinson, Tyler D.; Crisp, David
2018-05-01
Solar and thermal radiation are critical aspects of planetary climate, with gradients in radiative energy fluxes driving heating and cooling. Climate models require that radiative transfer tools be versatile, computationally efficient, and accurate. Here, we describe a technique that uses an accurate full-physics radiative transfer model to generate a set of atmospheric radiative quantities which can be used to linearly adapt radiative flux profiles to changes in the atmospheric and surface state: the Linearized Flux Evolution (LiFE) approach. These radiative quantities describe how each model layer in a plane-parallel atmosphere reflects and transmits light, as well as how the layer generates diffuse radiation by thermal emission and by scattering light from the direct solar beam. By computing derivatives of these layer radiative properties with respect to dynamic elements of the atmospheric state, we can then efficiently adapt the flux profiles computed by the full-physics model to new atmospheric states. We validate the LiFE approach, and then apply this approach to Mars, Earth, and Venus, demonstrating the information contained in the layer radiative properties and their derivatives, as well as how the LiFE approach can be used to determine the thermal structure of radiative and radiative-convective equilibrium states in one-dimensional atmospheric models.
Multifocal multiphoton microscopy with adaptive optical correction
NASA Astrophysics Data System (ADS)
Coelho, Simao; Poland, Simon; Krstajic, Nikola; Li, David; Monypenny, James; Walker, Richard; Tyndall, David; Ng, Tony; Henderson, Robert; Ameer-Beg, Simon
2013-02-01
Fluorescence lifetime imaging microscopy (FLIM) is a well-established approach for measuring dynamic signalling events inside living cells, including detection of protein-protein interactions. The improved optical penetration of infrared light compared with linear excitation, owing to lower Rayleigh scattering and low absorption, has provided imaging depths of up to 1 mm in brain tissue, but significant image degradation occurs as samples distort (aberrate) the infrared excitation beam. Multiphoton time-correlated single photon counting (TCSPC) FLIM is a method for obtaining functional, high-resolution images of biological structures. In order to achieve good statistical accuracy, TCSPC typically requires long acquisition times. We report the development of a multifocal multiphoton microscope (MMM), titled MegaFLI. Beam parallelization performed via a 3D Gerchberg-Saxton (GS) algorithm using a Spatial Light Modulator (SLM) increases the TCSPC count rate in proportion to the number of beamlets produced. A weighted 3D GS algorithm is employed to improve homogeneity. An added benefit is the implementation of flexible and adaptive optical correction. Adaptive optics based on Zernike polynomials are used to correct for system-induced aberrations. Here we present results with significant improvement in throughput obtained using a novel complementary metal-oxide-semiconductor (CMOS) 1024 pixel single-photon avalanche diode (SPAD) array, opening the way to truly high-throughput FLIM.
A heuristic re-mapping algorithm reducing inter-level communication in SAMR applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steensland, Johan; Ray, Jaideep
2003-07-01
This paper aims at decreasing execution time for large-scale structured adaptive mesh refinement (SAMR) applications by proposing a new heuristic re-mapping algorithm and experimentally showing its effectiveness in reducing inter-level communication. Tests were done for five different SAMR applications. The overall goal is to engineer a dynamically adaptive meta-partitioner capable of selecting and configuring the most appropriate partitioning strategy at run-time based on current system and application state. Such a meta-partitioner can significantly reduce execution times for general SAMR applications. Computer simulations of physical phenomena are becoming increasingly popular as they constitute an important complement to real-life testing. In many cases, such simulations are based on solving partial differential equations by numerical methods. Adaptive methods are crucial to efficiently utilize computer resources such as memory and CPU. But even with adaption, the simulations are computationally demanding and yield huge data sets. Thus parallelization and the efficient partitioning of data become issues of utmost importance. Adaption causes the workload to change dynamically, calling for dynamic (re-)partitioning to maintain efficient resource utilization. The proposed heuristic algorithm reduced inter-level communication substantially. Since the complexity of the proposed algorithm is low, this decrease comes at a relatively low cost. As a consequence, we draw the conclusion that the proposed re-mapping algorithm would be useful to lower overall execution times for many large SAMR applications. Due to its usefulness and its parameterization, the proposed algorithm would constitute a natural and important component of the meta-partitioner.
Parallel-vector solution of large-scale structural analysis problems on supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1989-01-01
A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
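For readers unfamiliar with the factorization named above, here is a minimal serial sketch of a Choleski (Cholesky) direct solve in plain Python. It is not the Force/FORTRAN solver from the report; the function names `cholesky` and `solve` and the dense list-of-lists storage are assumptions for illustration only.

```python
import math


def cholesky(a):
    """Return lower-triangular L with A = L * L^T for symmetric positive definite A."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        # The entries below the diagonal in column j do not depend on each
        # other, so a vector or parallel machine can compute them together.
        for i in range(j + 1, n):
            L[i][j] = (a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L


def solve(a, b):
    """Solve A x = b by factorization, forward substitution, and back substitution."""
    n = len(a)
    L = cholesky(a)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

For example, `solve([[4.0, 2.0], [2.0, 3.0]], [2.0, 1.0])` gives approximately `[0.5, 0.0]`.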
An intercalation-locked parallel-stranded DNA tetraplex
Tripathi, S.; Zhang, D.; Paukstelis, P. J.
2015-01-27
DNA has proved to be an excellent material for nanoscale construction because complementary DNA duplexes are programmable and structurally predictable. However, in the absence of Watson–Crick pairings, DNA can be structurally more diverse. Here, we describe the crystal structures of d(ACTCGGATGAT) and the brominated derivative, d(AC BrUCGGA BrUGAT). These oligonucleotides form parallel-stranded duplexes with a crystallographically equivalent strand, resulting in the first examples of DNA crystal structures that contain four different symmetric homo base pairs. Two of the parallel-stranded duplexes are coaxially stacked in opposite directions and locked together to form a tetraplex through intercalation of the 5'-most A–A base pairs between adjacent G–G pairs in the partner duplex. The intercalation region is a new type of DNA tertiary structural motif with similarities to the i-motif. 1H–1H nuclear magnetic resonance and native gel electrophoresis confirmed the formation of a parallel-stranded duplex in solution. Finally, we modified specific nucleotide positions and added d(GAY) motifs to oligonucleotides and were readily able to obtain similar crystals. This suggests that this parallel-stranded DNA structure may be useful in the rational design of DNA crystals and nanostructures.
1986-12-01
... III. Analysis of Parallel Design: Parallel Abstract Data ... Types; Abstract Data Type; Parallel ADT ... Data-Structure Design; Object-Oriented Design
Cooperative storage of shared files in a parallel computing system with dynamic block size
Bent, John M.; Faibish, Sorin; Grider, Gary
2015-11-10
Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
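The block-size rule in the abstract above (total data divided by the number of processes, followed by an exchange so that every process holds one block) can be sketched as a small planning function. This is only an illustration under stated assumptions: `plan_blocks` and its greedy sender-to-receiver matching are hypothetical, and a real implementation would perform the transfers with a message-passing library before writing to the file system.

```python
def plan_blocks(bytes_per_process):
    """Compute the dynamic block size and a data-exchange plan.

    bytes_per_process[i] is the amount of data process i generated.
    Returns (block_size, transfers), where each transfer is a tuple
    (sender, receiver, nbytes).
    """
    n = len(bytes_per_process)
    total = sum(bytes_per_process)
    block_size = total // n  # total data divided by the number of processes
    targets = [block_size] * n
    targets[-1] += total - block_size * n  # any remainder goes to the last process
    surplus = [have - want for have, want in zip(bytes_per_process, targets)]

    senders = [i for i in range(n) if surplus[i] > 0]
    receivers = [i for i in range(n) if surplus[i] < 0]
    transfers = []
    s = r = 0
    # Greedily move surplus data from over-full to under-full processes.
    while s < len(senders) and r < len(receivers):
        i, j = senders[s], receivers[r]
        amount = min(surplus[i], -surplus[j])
        transfers.append((i, j, amount))
        surplus[i] -= amount
        surplus[j] += amount
        if surplus[i] == 0:
            s += 1
        if surplus[j] == 0:
            r += 1
    return block_size, transfers
```

With inputs `[10, 2, 6, 6]` the plan is a block size of 6 and a single transfer of 4 bytes from process 0 to process 1, after which every process can write one equally sized block.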
Production of yarns composed of oriented nanofibers for ophthalmological implants
NASA Astrophysics Data System (ADS)
Shynkarenko, A.; Klapstova, A.; Krotov, A.; Moucka, M.; Lukas, D.
2017-10-01
Parallelized nanofibrous structures are commonly used in the medical sector, especially for ophthalmological implants. In this research, a self-fabricated device is tested for improved collection and twisting of parallel nanofibers. Previously, manual techniques were used to collect the nanofibers and then apply twist, whereas in our device different parameters can be optimized to obtain parallel nanofibers, which can then be twisted. The device brings automation to the production of parallel fibrous structures for medical applications.
Vehicular impact absorption system
NASA Technical Reports Server (NTRS)
Knoell, A. C.; Wilson, A. H. (Inventor)
1978-01-01
An improved vehicular impact absorption system characterized by a plurality of aligned crash cushions of substantially cubic configuration is described. Each consists of a plurality of voided aluminum beverage cans arranged in substantial parallelism within a plurality of superimposed tiers and a covering envelope formed of metal hardware cloth. A plurality of cables is extended through the cushions in substantial parallelism with an axis of alignment for the cushions adapted to be anchored at each of the opposite ends thereof.
Hybrid Parallelization of Adaptive MHD-Kinetic Module in Multi-Scale Fluid-Kinetic Simulation Suite
Borovikov, Sergey; Heerikhuisen, Jacob; Pogorelov, Nikolai
2013-04-01
The Multi-Scale Fluid-Kinetic Simulation Suite has a computational tool set for solving partially ionized flows. In this paper we focus on recent developments of the kinetic module which solves the Boltzmann equation using the Monte-Carlo method. The module has been recently redesigned to utilize intra-node hybrid parallelization. We describe in detail the redesign process, implementation issues, and modifications made to the code. Finally, we conduct a performance analysis.
Durner, Bernhard; Ehmann, Thomas; Matysik, Frank-Michael
2018-06-05
The adaption of a parallel-path poly(tetrafluoroethylene) (PTFE) ICP nebulizer to an evaporative light scattering detector (ELSD) was realized. This was done by substituting the originally installed concentric glass nebulizer of the ELSD. The performance of both nebulizers was compared regarding nebulizer temperature, evaporator temperature, flow rate of nebulizing gas and flow rate of mobile phase of different solvents using caffeine and poly(dimethylsiloxane) (PDMS) as analytes. Both nebulizers showed similar performance, but for the parallel-path PTFE nebulizer the performance was considerably better at low LC flow rates and the nebulizer lifetime was substantially increased. In general, for both nebulizers the highest sensitivity was obtained by applying the lowest possible evaporator temperature in combination with the highest possible nebulizer temperature at preferably low gas flow rates. Besides the optimization of detector parameters, response factors for various PDMS oligomers were determined and the dependency of the detector signal on molar mass of the analytes was studied. The significant improvement regarding long-term stability made the modified ELSD much more robust and saved time and money by reducing the maintenance efforts. Thus, especially in polymer HPLC, associated with a complex matrix situation, the PTFE-based parallel-path nebulizer exhibits attractive characteristics for analytical studies of polymers.
Zhao, Li; Wit, Janneke; Svetec, Nicolas; Begun, David J.
2015-01-01
Gene expression variation within species is relatively common; however, the role of natural selection in the maintenance of this variation is poorly understood. Here we investigate low and high latitude populations of Drosophila melanogaster and its sister species, D. simulans, to determine whether the two species show similar patterns of population differentiation, consistent with a role for spatially varying selection in maintaining gene expression variation. We compared at two temperatures the whole male transcriptome of D. melanogaster and D. simulans sampled from Panama City (Panama) and Maine (USA). We observed a significant excess of genes exhibiting differential expression in both species, consistent with parallel adaptation to heterogeneous environments. Moreover, the majority of genes showing parallel expression differentiation showed the same direction of differential expression in the two species and the magnitudes of expression differences between high and low latitude populations were correlated across species, further bolstering the conclusion that parallelism for expression phenotypes results from spatially varying selection. However, the species also exhibited important differences in expression phenotypes. For example, genotype × environment interaction was much more common across the genome in D. melanogaster. Highly differentiated SNPs between low and high latitudes were enriched in the 3' UTRs and CDS of the geographically differently expressed genes in both species, consistent with an important role for cis-acting variants in driving local adaptation for expression-related phenotypes. PMID:25950438
High-speed prediction of crystal structures for organic molecules
NASA Astrophysics Data System (ADS)
Obata, Shigeaki; Goto, Hitoshi
2015-02-01
We developed a master-worker type parallel algorithm for allocating crystal structure optimization tasks to distributed compute nodes, in order to improve the performance of simulations for crystal structure prediction. The performance experiments were carried out on the TUT-ADSIM supercomputer system (HITACHI HA8000-tc/HT210). The experimental results show that our parallel algorithm achieved speed-ups of 214 and 179 times using 256 processor cores on crystal structure optimizations in predictions of crystal structures for 3-aza-bicyclo(3.3.1)nonane-2,4-dione and 2-diazo-3,5-cyclohexadiene-1-one, respectively. We expect that this parallel algorithm can reduce the computational cost of crystal structure predictions in general.
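To put the reported speed-ups in context, the short calculation below converts them to parallel efficiency, taken here in the usual sense of speed-up divided by core count. Only the numbers 214, 179, and 256 come from the abstract; the helper name `parallel_efficiency` is made up for this example.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency: achieved speed-up as a fraction of the core count."""
    return speedup / cores


# Speed-ups and core count as reported in the abstract.
for molecule, speedup in [
    ("3-aza-bicyclo(3.3.1)nonane-2,4-dione", 214),
    ("2-diazo-3,5-cyclohexadiene-1-one", 179),
]:
    print(f"{molecule}: {parallel_efficiency(speedup, 256):.1%}")
# prints 83.6% for the first molecule and 69.9% for the second
```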
SU-F-I-10: Spatially Local Statistics for Adaptive Image Filtering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iliopoulos, AS; Sun, X; Floros, D
Purpose: To facilitate adaptive image filtering operations, addressing spatial variations in both noise and signal. Such issues are prevalent in cone-beam projections, where physical effects such as X-ray scattering result in spatially variant noise, violating common assumptions of homogeneous noise and challenging conventional filtering approaches to signal extraction and noise suppression. Methods: We present a computational mechanism for probing into and quantifying the spatial variance of noise throughout an image. The mechanism builds a pyramid of local statistics at multiple spatial scales; local statistical information at each scale includes (weighted) mean, median, standard deviation, median absolute deviation, as well as histogram or dynamic range after local mean/median shifting. Based on inter-scale differences of local statistics, the spatial scope of distinguishable noise variation is detected in a semi- or un-supervised manner. Additionally, we propose and demonstrate the incorporation of such information in globally parametrized (i.e., non-adaptive) filters, effectively transforming the latter into spatially adaptive filters. The multi-scale mechanism is materialized by efficient algorithms and implemented in parallel CPU/GPU architectures. Results: We demonstrate the impact of local statistics for adaptive image processing and analysis using cone-beam projections of a Catphan phantom, fitted within an annulus to increase X-ray scattering. The effective spatial scope of local statistics calculations is shown to vary throughout the image domain, necessitating multi-scale noise and signal structure analysis. Filtering results with and without spatial filter adaptation are compared visually, illustrating improvements in imaging signal extraction and noise suppression, and in preserving information in low-contrast regions.
Conclusion: Local image statistics can be incorporated in filtering operations to equip them with spatial adaptivity to spatial signal/noise variations. An efficient multi-scale computational mechanism is developed to curtail processing latency. Spatially adaptive filtering may impact subsequent processing tasks such as reconstruction and numerical gradient computations for deformable registration. NIH Grant No. R01-184173.
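A heavily simplified, one-dimensional sketch of the "pyramid of local statistics" idea follows. It computes mean, median, and standard deviation in sliding windows of several half-widths; the window radii, the border truncation, and the names `local_stats` and `stats_pyramid` are assumptions for illustration, and the authors' method works on 2D projections with parallel CPU/GPU code.

```python
import statistics


def local_stats(signal, radius):
    """Per-sample (mean, median, population std) in a sliding window.

    The window has half-width `radius` and is truncated at the borders.
    """
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append((
            statistics.mean(window),
            statistics.median(window),
            statistics.pstdev(window),
        ))
    return out


def stats_pyramid(signal, radii=(1, 2, 4)):
    """Local statistics at several spatial scales, keyed by window radius."""
    return {r: local_stats(signal, r) for r in radii}
```

Comparing the entries for one sample across radii gives a crude view of how the local statistics change with scale, which is the kind of inter-scale difference the abstract refers to.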
NASA Astrophysics Data System (ADS)
Li, Gaohua; Fu, Xiang; Wang, Fuxin
2017-10-01
The low-dissipation high-order accurate hybrid up-winding/central scheme based on fifth-order weighted essentially non-oscillatory (WENO) and sixth-order central schemes, along with the Spalart-Allmaras (SA)-based delayed detached eddy simulation (DDES) turbulence model and flow feature-based adaptive mesh refinement (AMR), is implemented in a dual-mesh overset grid infrastructure with parallel computing capabilities, for the purpose of simulating vortex-dominated unsteady detached wake flows with high spatial resolution. The overset grid assembly (OGA) process based on collection detection theory and an implicit hole-cutting algorithm achieves an automatic coupling of the near-body and off-body solvers, and a trial-and-error method is used to obtain a globally balanced load distribution among the multiple coupled codes. The results for flows over a high-Reynolds-number cylinder and a two-bladed helicopter rotor show that the combination of high-order hybrid scheme, advanced turbulence model, and overset adaptive mesh refinement can effectively enhance the spatial resolution for the simulation of turbulent wake eddies.
Locally adaptive parallel temperature accelerated dynamics method
NASA Astrophysics Data System (ADS)
Shim, Yunsic; Amar, Jacques G.
2010-03-01
The recently developed temperature-accelerated dynamics (TAD) method [M. Sørensen and A.F. Voter, J. Chem. Phys. 112, 9599 (2000)] along with the more recently developed parallel TAD (parTAD) method [Y. Shim et al, Phys. Rev. B 76, 205439 (2007)] allow one to carry out non-equilibrium simulations over extended time and length scales. The basic idea behind TAD is to speed up transitions by carrying out a high-temperature MD simulation and then use the resulting information to obtain event times at the desired low temperature. In a typical implementation, a fixed high temperature T_high is used. However, in general one expects that for each configuration there exists an optimal value of T_high which depends on the particular transition pathways and activation energies for that configuration. Here we present a locally adaptive high-temperature TAD method in which, instead of using a fixed T_high, the high temperature is dynamically adjusted in order to maximize simulation efficiency. Preliminary results of the performance obtained from parTAD simulations of Cu/Cu(100) growth using the locally adaptive T_high method will also be presented.
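The step of turning a high-temperature event time into a low-temperature one can be sketched with the standard TAD extrapolation, which assumes Arrhenius (harmonic transition-state theory) rates with a temperature-independent prefactor: t_low = t_high * exp(E_a * (1/(k_B*T_low) - 1/(k_B*T_high))). The function name and the example values below are hypothetical; the adaptive choice of the high temperature described in the abstract is not modeled.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def extrapolate_event_time(t_high, e_a, temp_high, temp_low):
    """Map an event time observed at temp_high to the equivalent time at temp_low.

    t_high    : time at which the event occurred in the high-temperature run
    e_a       : activation energy of the event in eV
    temp_high : simulation temperature in K
    temp_low  : target temperature in K
    """
    beta_high = 1.0 / (K_B * temp_high)
    beta_low = 1.0 / (K_B * temp_low)
    return t_high * math.exp(e_a * (beta_low - beta_high))


# Hypothetical example: a 0.5 eV event seen after 1 ns at 600 K, mapped to 300 K.
t_low = extrapolate_event_time(1.0e-9, 0.5, 600.0, 300.0)
```

Events with higher activation energies are stretched more by the extrapolation, so the order of events can differ between the two temperatures; this is one reason the best high temperature depends on the barriers present in a given configuration.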
A parallel adaptive quantum genetic algorithm for the controllability of arbitrary networks.
Li, Yuhong; Gong, Guanghong; Li, Ni
2018-01-01
In this paper, we propose a novel algorithm, the parallel adaptive quantum genetic algorithm, which can rapidly determine the minimum set of control nodes of arbitrary networks with both control nodes and state nodes. The corresponding network can be fully controlled with the obtained control scheme. We transformed the network controllability issue into a combinatorial optimization problem based on the Popov-Belevitch-Hautus rank condition. The algorithm was tested on a set of canonical networks and a list of real-world networks. Comparison results demonstrated that the algorithm was better suited to optimizing the controllability of networks, especially larger networks. We subsequently demonstrated that there were links between the optimal control nodes and some statistical characteristics of the networks. The proposed algorithm provides an effective approach to improving the controllability optimization of large networks, or even extra-large networks with hundreds of thousands of nodes.
A comparative study of serial and parallel aeroelastic computations of wings
NASA Technical Reports Server (NTRS)
Byun, Chansup; Guruswamy, Guru P.
1994-01-01
A procedure for computing the aeroelasticity of wings on parallel multiple-instruction, multiple-data (MIMD) computers is presented. In this procedure, fluids are modeled using Euler equations, and structures are modeled using modal or finite element equations. The procedure is designed in such a way that each discipline can be developed and maintained independently by using a domain decomposition approach. In the present parallel procedure, each computational domain is scalable. A parallel integration scheme is used to compute aeroelastic responses by solving fluid and structural equations concurrently. The computational efficiency issues of parallel integration of both fluid and structural equations are investigated in detail. This approach, which reduces the total computational time by a factor of almost 2, is demonstrated for a typical aeroelastic wing by using various numbers of processors on the Intel iPSC/860.
NASA Astrophysics Data System (ADS)
Lin, Mingpei; Xu, Ming; Fu, Xiaoyu
2017-05-01
Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a simulation of collision prediction for space debris of any amount and for any time span can be executed. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on Tesla C2075 of NVIDIA. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation.
NASA Astrophysics Data System (ADS)
Sarti, E.; Zamuner, S.; Cossio, P.; Laio, A.; Seno, F.; Trovato, A.
2013-12-01
In protein structure prediction it is of crucial importance, especially at the refinement stage, to score efficiently large sets of models by selecting the ones that are closest to the native state. We here present a new computational tool, BACHSCORE, that allows its users to rank different structural models of the same protein according to their quality, evaluated by using the BACH++ (Bayesian Analysis Conformation Hunt) scoring function. The original BACH statistical potential was already shown to discriminate with very good reliability the protein native state in large sets of misfolded models of the same protein. BACH++ features a novel upgrade in the solvation potential of the scoring function, now computed by adapting the LCPO (Linear Combination of Pairwise Overlaps) algorithm. This change further enhances the already good performance of the scoring function. BACHSCORE can be accessed directly through the web server: bachserver.pd.infn.it.
Catalogue identifier: AEQD_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEQD_v1_0.html
Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
Licensing provisions: GNU General Public License version 3
No. of lines in distributed program, including test data, etc.: 130159
No. of bytes in distributed program, including test data, etc.: 24 687 455
Distribution format: tar.gz
Programming language: C++.
Computer: Any computer capable of running an executable produced by a g++ compiler (4.6.3 version).
Operating system: Linux, Unix OS-es.
RAM: 1 073 741 824 bytes
Classification: 3.
Nature of problem: Evaluate the quality of a protein structural model, taking into account the possible “a priori” knowledge of a reference primary sequence that may be different from the amino-acid sequence of the model; the native protein structure should be recognized as the best model.
Solution method: The contact potential scores the occurrence of any given type of residue pair in 5 possible contact classes (α-helical contact, parallel β-sheet contact, anti-parallel β-sheet contact, side-chain contact, no contact). The solvation potential scores the occurrence of any residue type in 2 possible environments: buried and solvent exposed. Residue environment is assigned by adapting the LCPO algorithm. Residues present in the reference primary sequence and not present in the model structure contribute to the model score as solvent exposed and as not in contact with any other residue.
Restrictions: Input file format according to the Protein Data Bank standard.
Additional comments: Parameter values used in the scoring function can be found in the file /folder-to-bachscore/BACH/examples/bach_std.par.
Running time: Roughly one minute to score one hundred structures on a desktop PC, depending on their size.
NASA Astrophysics Data System (ADS)
Wan, Tat C.; Kabuka, Mansur R.
1994-05-01
With the tremendous growth in imaging applications and the development of filmless radiology, compression techniques that can achieve high compression ratios with user-specified distortion rates have become necessary. Boundaries and edges in tissue structures are vital for the detection of lesions and tumors, which in turn requires the preservation of edges in the image. The proposed edge preserving image compressor (EPIC) combines lossless compression of edges with neural network compression techniques based on dynamic associative neural networks (DANN), to provide high compression ratios with user-specified distortion rates in an adaptive compression system well-suited to parallel implementations. Improvements to DANN-based training through the use of a variance classifier for controlling a bank of neural networks speed convergence and allow the use of higher compression ratios for 'simple' patterns. The adaptation and generalization capabilities inherent in EPIC also facilitate progressive transmission of images by varying the number of quantization levels used to represent compressed patterns. Average compression ratios of 7.51:1 with an average mean squared error of 0.0147 were achieved.
Full Wave Parallel Code for Modeling RF Fields in Hot Plasmas
NASA Astrophysics Data System (ADS)
Spencer, Joseph; Svidzinski, Vladimir; Evstatiev, Evstati; Galkin, Sergei; Kim, Jin-Soo
2015-11-01
FAR-TECH, Inc. is developing a suite of full wave RF codes in hot plasmas. It is based on a formulation in configuration space with grid adaptation capability. The conductivity kernel (which includes a nonlocal dielectric response) is calculated by integrating the linearized Vlasov equation along unperturbed test particle orbits. For Tokamak applications a 2-D version of the code is being developed. Progress of this work will be reported. This suite of codes has the following advantages over existing spectral codes: 1) It utilizes the localized nature of plasma dielectric response to the RF field and calculates this response numerically without approximations. 2) It uses an adaptive grid to better resolve resonances in plasma and antenna structures. 3) It uses an efficient sparse matrix solver to solve the formulated linear equations. The linear wave equation is formulated using two approaches: for cold plasmas the local cold plasma dielectric tensor is used (resolving resonances by particle collisions), while for hot plasmas the conductivity kernel is calculated. Work is supported by the U.S. DOE SBIR program.
Adaptive mesh fluid simulations on GPU
NASA Astrophysics Data System (ADS)
Wang, Peng; Abel, Tom; Kaehler, Ralf
2010-10-01
We describe an implementation of compressible inviscid fluid solvers with block-structured adaptive mesh refinement on Graphics Processing Units using NVIDIA's CUDA. We show that a class of high resolution shock capturing schemes can be mapped naturally onto this architecture. Using the method of lines approach with the second order total variation diminishing Runge-Kutta time integration scheme, piecewise linear reconstruction, and a Harten-Lax-van Leer Riemann solver, we achieve an overall speedup of approximately 10 times on one graphics card compared to a single core on the host computer. We attain this speedup in uniform grid runs as well as in problems with deep AMR hierarchies. Our framework can readily be applied to more general systems of conservation laws and extended to higher order shock capturing schemes. This is shown directly by implementing a magneto-hydrodynamic solver and comparing its performance to the pure hydrodynamic case. Finally, we also combined our CUDA parallel scheme with MPI to make the code run on GPU clusters. Close to ideal speedup is observed on up to four GPUs.
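The method-of-lines structure named in this abstract (piecewise linear reconstruction, a flux at each cell interface, and a second order TVD Runge-Kutta update) can be sketched in a few lines. This is a simplified serial illustration and not the authors' GPU code: it assumes scalar linear advection with positive speed on a periodic grid, for which the Riemann flux reduces to simple upwinding, and it uses a minmod limiter as one possible slope limiter. All function names are hypothetical.

```python
def minmod(a, b):
    """Slope limiter: zero at extrema, otherwise the smaller-magnitude slope."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def rhs(u, speed, dx):
    """Semi-discrete right-hand side -dF/dx on a periodic grid (speed > 0 assumed)."""
    n = len(u)
    slopes = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    # flux[i] is the flux through the right face of cell i, taken from the
    # left (upwind) reconstructed state because speed > 0.
    flux = [speed * (u[i] + 0.5 * slopes[i]) for i in range(n)]
    return [-(flux[i] - flux[i - 1]) / dx for i in range(n)]


def tvd_rk2_step(u, speed, dx, dt):
    """One second order TVD (strong-stability-preserving) Runge-Kutta step."""
    k1 = rhs(u, speed, dx)
    u1 = [ui + dt * ki for ui, ki in zip(u, k1)]
    k2 = rhs(u1, speed, dx)
    return [0.5 * ui + 0.5 * (u1i + dt * k2i) for ui, u1i, k2i in zip(u, u1, k2)]


def total_variation(u):
    """Periodic total variation, useful for checking the TVD property."""
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))
```

Each cell update depends only on a small fixed stencil of neighbours, which is the property that lets schemes of this type map naturally onto one-thread-per-cell GPU kernels.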
Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P
DOE Office of Scientific and Technical Information (OSTI.GOV)
Candel, A.; Kabel, A.; Lee, L.
In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).
NASA Technical Reports Server (NTRS)
Lee-Rausch, E. M.; Park, M. A.; Jones, W. T.; Hammond, D. P.; Nielsen, E. J.
2005-01-01
This paper demonstrates the extension of error estimation and adaptation methods to parallel computations enabling larger, more realistic aerospace applications and the quantification of discretization errors for complex 3-D solutions. Results were shown for an inviscid sonic-boom prediction about a double-cone configuration and a wing/body segmented leading edge (SLE) configuration where the output function of the adjoint was pressure integrated over a part of the cylinder in the near field. After multiple cycles of error estimation and surface/field adaptation, a significant improvement in the inviscid solution for the sonic boom signature of the double cone was observed. Although the double-cone adaptation was initiated from a very coarse mesh, the near-field pressure signature from the final adapted mesh compared very well with the wind-tunnel data which illustrates that the adjoint-based error estimation and adaptation process requires no a priori refinement of the mesh. Similarly, the near-field pressure signature for the SLE wing/body sonic boom configuration showed a significant improvement from the initial coarse mesh to the final adapted mesh in comparison with the wind tunnel results. Error estimation and field adaptation results were also presented for the viscous transonic drag prediction of the DLR-F6 wing/body configuration, and results were compared to a series of globally refined meshes. Two of these globally refined meshes were used as a starting point for the error estimation and field-adaptation process where the output function for the adjoint was the total drag. The field-adapted results showed an improvement in the prediction of the drag in comparison with the finest globally refined mesh and a reduction in the estimate of the remaining drag error. 
The adjoint-based adaptation parameter showed a need for increased resolution in the surface of the wing/body as well as a need for wake resolution downstream of the fuselage and wing trailing edge in order to achieve the requested drag tolerance. Although further adaptation was required to meet the requested tolerance, no further cycles were computed in order to avoid large discrepancies between the surface mesh spacing and the refined field spacing.
Method for fabricating high aspect ratio structures in perovskite material
Karapetrov, Goran T.; Kwok, Wai-Kwong; Crabtree, George W.; Iavarone, Maria
2003-10-28
A method of fabricating high aspect ratio ceramic structures in which a selected portion of perovskite or perovskite-like crystalline material is exposed to a high energy ion beam for a time sufficient to cause the crystalline material contacted by the ion beam to have substantially parallel columnar defects. Then selected portions of the material having substantially parallel columnar defects are etched, leaving material with and without substantially parallel columnar defects in a predetermined shape having high aspect ratios of not less than 2 to 1. Etching is accomplished by optical or PMMA lithography. There is also disclosed a structure of a ceramic which is superconducting at a temperature in the range of from about 10 K to about 90 K with substantially parallel columnar defects, in which the smallest lateral dimension of the structure is less than about 5 microns, and the thickness of the structure is greater than 2 times the smallest lateral dimension of the structure.
NASA Astrophysics Data System (ADS)
Sheng, Lizeng
The dissertation focuses on one of the major research needs in the area of adaptive/intelligent/smart structures, the development and application of finite element analysis and genetic algorithms for optimal design of large-scale adaptive structures. We first review some basic concepts in finite element method and genetic algorithms, along with the research on smart structures. Then we propose a solution methodology for solving a critical problem in the design of a next generation of large-scale adaptive structures---optimal placements of a large number of actuators to control thermal deformations. After briefly reviewing the three most frequently used general approaches to derive a finite element formulation, the dissertation presents techniques associated with general shell finite element analysis using flat triangular laminated composite elements. The element used here has three nodes and eighteen degrees of freedom and is obtained by combining a triangular membrane element and a triangular plate bending element. The element includes the coupling effect between membrane deformation and bending deformation. The membrane element is derived from the linear strain triangular element using Cook's transformation. The discrete Kirchhoff triangular (DKT) element is used as the plate bending element. For completeness, a complete derivation of the DKT is presented. Geometrically nonlinear finite element formulation is derived for the analysis of adaptive structures under the combined thermal and electrical loads. Next, we solve the optimization problems of placing a large number of piezoelectric actuators to control thermal distortions in a large mirror in the presence of four different thermal loads. We then extend this to a multi-objective optimization problem of determining only one set of piezoelectric actuator locations that can be used to control the deformation in the same mirror under the action of any one of the four thermal loads. 
A series of genetic algorithms, GA Version 1, 2 and 3, were developed to find the optimal locations of piezoelectric actuators from on the order of 10^21 to 10^56 candidate placements. Introducing a variable population approach, we improve the flexibility of the selection operation in genetic algorithms. Incorporating mutation and hill climbing into micro-genetic algorithms, we are able to develop a more efficient genetic algorithm. Through extensive numerical experiments, we find that the design search space for the optimal placements of a large number of actuators is highly multi-modal and that the most distinct nature of genetic algorithms is their robustness. They give results that are random but with only a slight variability. The genetic algorithms can be used to get an adequate solution using a limited number of evaluations. To get the highest quality solution, multiple runs with different random seeds are necessary. The investigation time can be significantly reduced using very coarse-grained parallel computing. Overall, the methodology of using finite element analysis and genetic algorithm optimization provides a robust solution approach for the challenging problem of optimal placements of a large number of actuators in the design of the next generation of adaptive structures.
Applications of Parallel Computation in Micro-Mechanics and Finite Element Method
NASA Technical Reports Server (NTRS)
Tan, Hui-Qian
1996-01-01
This project discusses the application of parallel computation to material analyses. Briefly, we analyze a material through element-level computations. We call an element a cell here. A cell is divided into a number of subelements called subcells, and all subcells in a cell have the identical structure. The detailed structure will be given later in this paper. The problem is clearly "well-structured", so a SIMD machine would be a better choice. In this paper we look into the potential of SIMD machines for finite element computation by developing appropriate algorithms on the MasPar, a SIMD parallel machine. In section 2, the architecture of the MasPar is discussed, together with a brief review of the parallel programming language MPL. In section 3, some general parallel algorithms which might be useful to the project are proposed and, in combination with these algorithms, some features of MPL are discussed in more detail. In section 4, the computational structure of the cell/subcell model is given and the idea behind the design of the parallel algorithm for the model is demonstrated. Finally, in section 5, a summary is given.
THE PLUTO CODE FOR ADAPTIVE MESH COMPUTATIONS IN ASTROPHYSICAL FLUID DYNAMICS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mignone, A.; Tzeferacos, P.; Zanni, C.
We present a description of the adaptive mesh refinement (AMR) implementation of the PLUTO code for solving the equations of classical and special relativistic magnetohydrodynamics (MHD and RMHD). The current release exploits, in addition to the static grid version of the code, the distributed infrastructure of the CHOMBO library for multidimensional parallel computations over block-structured, adaptively refined grids. We employ a conservative finite-volume approach where primary flow quantities are discretized at the cell center in a dimensionally unsplit fashion using the Corner Transport Upwind method. Time stepping relies on a characteristic tracing step where piecewise parabolic method, weighted essentially non-oscillatory, or slope-limited linear interpolation schemes can be handily adopted. A characteristic decomposition-free version of the scheme is also illustrated. The solenoidal condition of the magnetic field is enforced by augmenting the equations with a generalized Lagrange multiplier providing propagation and damping of divergence errors through a mixed hyperbolic/parabolic explicit cleaning step. Among the novel features, we describe an extension of the scheme to include non-ideal dissipative processes, such as viscosity, resistivity, and anisotropic thermal conduction without operator splitting. Finally, we illustrate an efficient treatment of point-local, potentially stiff source terms over hierarchical nested grids by taking advantage of the adaptivity in time. Several multidimensional benchmarks and applications to problems of astrophysical relevance assess the potentiality of the AMR version of PLUTO in resolving flow features separated by large spatial and temporal disparities.
A Parallel Rendering Algorithm for MIMD Architectures
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.; Orloff, Tobias
1991-01-01
Applications such as animation and scientific visualization demand high performance rendering of complex three dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.
NASA Technical Reports Server (NTRS)
Nguyen, Howard; Willacy, Karen; Allen, Mark
2012-01-01
KINETICS is a coupled dynamics and chemistry atmosphere model that is data intensive and computationally demanding. The potential performance gain from using a supercomputer motivates the adaptation from a serial version to a parallelized one. Although the initial parallelization had been done, bottlenecks caused by an abundance of communication calls between processors led to an unfavorable drop in performance. Before starting on the parallel optimization process, a partial overhaul was required because a large emphasis was placed on streamlining the code for user convenience and revising the program to accommodate the new supercomputers at Caltech and JPL. After the first round of optimizations, the partial runtime was reduced by a factor of 23; however, performance gains are dependent on the size of the data, the number of processors requested, and the computer used.
Synthesis of Efficient Structures for Concurrent Computation.
1983-10-01
formal presentation of these techniques, called virtualisation and aggregation, can be found in [King-83]. Census Functions: Trees perform broadcast... Contents listed: Census Functions; User-Assisted Aggregation; Parallel... Figures listed: Figure 6, Simple Parallel Structure for Broadcasting; Figure 7, Internal Structure of a Prefix Computation Network.
Adapter plate assembly for adjustable mounting of objects
Blackburn, R.S.
1986-05-02
An adapter plate and two locking discs are together affixed to an optic table with machine screws or bolts threaded into a fixed array of internally threaded holes provided in the table surface. The adapter plate preferably has two, and preferably parallel, elongated locating slots each freely receiving a portion of one of the locking discs for secure affixation of the adapter plate to the optic table. A plurality of threaded apertures provided in the adapter plate are available to attach optical mounts or other devices onto the adapter plate in an orientation not limited by the disposition of the array of threaded holes in the table surface. An axially aligned but radially offset hole through each locking disc receives a screw that tightens onto the table, such that prior to tightening of the screw the locking disc may rotate and translate within each locating slot of the adapter plate for maximum flexibility of the orientation thereof.
Bi-Objective Optimal Control Modification Adaptive Control for Systems with Input Uncertainty
NASA Technical Reports Server (NTRS)
Nguyen, Nhan T.
2012-01-01
This paper presents a new model-reference adaptive control method based on a bi-objective optimal control formulation for systems with input uncertainty. A parallel predictor model is constructed to relate the predictor error to the estimation error of the control effectiveness matrix. In this work, we develop an optimal control modification adaptive control approach that seeks to minimize a bi-objective linear quadratic cost function of both the tracking error norm and predictor error norm simultaneously. The resulting adaptive laws for the parametric uncertainty and control effectiveness uncertainty are dependent on both the tracking error and predictor error, while the adaptive laws for the feedback gain and command feedforward gain are only dependent on the tracking error. The optimal control modification term provides robustness to the adaptive laws naturally from the optimal control framework. Simulations demonstrate the effectiveness of the proposed adaptive control approach.
Adapter plate assembly for adjustable mounting of objects
Blackburn, Robert S.
1987-01-01
An adapter plate and two locking discs are together affixed to an optic table with machine screws or bolts threaded into a fixed array of internally threaded holes provided in the table surface. The adapter plate preferably has two, and preferably parallel, elongated locating slots each freely receiving a portion of one of the locking discs for secure affixation of the adapter plate to the optic table. A plurality of threaded apertures provided in the adapter plate are available to attach optical mounts or other devices onto the adapter plate in an orientation not limited by the disposition of the array of threaded holes in the table surface. An axially aligned but radially offset hole through each locking disc receives a screw that tightens onto the table, such that prior to tightening of the screw the locking disc may rotate and translate within each locating slot of the adapter plate for maximum flexibility of the orientation thereof.
Cartesian Off-Body Grid Adaption for Viscous Time-Accurate Flow Simulation
NASA Technical Reports Server (NTRS)
Buning, Pieter G.; Pulliam, Thomas H.
2011-01-01
An improved solution adaption capability has been implemented in the OVERFLOW overset grid CFD code. Building on the Cartesian off-body approach inherent in OVERFLOW and the original adaptive refinement method developed by Meakin, the new scheme provides for automated creation of multiple levels of finer Cartesian grids. Refinement can be based on the undivided second-difference of the flow solution variables, or on a specific flow quantity such as vorticity. Coupled with load-balancing and an in-memory solution interpolation procedure, the adaption process provides very good performance for time-accurate simulations on parallel compute platforms. A method of using refined, thin body-fitted grids combined with adaption in the off-body grids is presented, which maximizes the part of the domain subject to adaption. Two- and three-dimensional examples are used to illustrate the effectiveness and performance of the adaption scheme.
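The refinement criterion named in this abstract, the undivided second difference of a solution variable, can be sketched as follows. This is a one-dimensional illustration only and not OVERFLOW's implementation; the function name, the single-threshold rule, and the sample data are assumptions made for the example.

```python
def flag_cells_for_refinement(values, threshold):
    """Flag interior cells whose undivided second difference exceeds threshold.

    "Undivided" means the difference u[i+1] - 2*u[i] + u[i-1] is not divided
    by the grid spacing squared, so the sensor shrinks as the grid is refined
    in smooth regions. Boundary cells are left unflagged in this sketch.
    """
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        second_diff = values[i + 1] - 2.0 * values[i] + values[i - 1]
        flags[i] = abs(second_diff) > threshold
    return flags
```

For a step profile such as [0, 0, 0, 1, 1, 1] with a threshold of 0.5, only the two cells adjacent to the jump are flagged, while a linear profile produces no flags at all.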
TRIP13 is a protein-remodeling AAA+ ATPase that catalyzes MAD2 conformation switching
Ye, Qiaozhen; Rosenberg, Scott C.; Moeller, Arne; ...
2015-04-28
The AAA+ family ATPase TRIP13 is a key regulator of meiotic recombination and the spindle assembly checkpoint, acting on signaling proteins of the conserved HORMA domain family. Here we present the structure of the Caenorhabditis elegans TRIP13 ortholog PCH-2, revealing a new family of AAA+ ATPase protein remodelers. PCH-2 possesses a substrate-recognition domain related to those of the protein remodelers NSF and p97, while its overall hexameric architecture and likely structural mechanism bear close similarities to the bacterial protein unfoldase ClpX. We find that TRIP13, aided by the adapter protein p31(comet), converts the HORMA-family spindle checkpoint protein MAD2 from a signaling-active ‘closed’ conformer to an inactive ‘open’ conformer. We propose that TRIP13 and p31(comet) collaborate to inactivate the spindle assembly checkpoint through MAD2 conformational conversion and disassembly of mitotic checkpoint complexes. A parallel HORMA protein disassembly activity likely underlies TRIP13's critical regulatory functions in meiotic chromosome structure and recombination.
A fast ultrasonic simulation tool based on massively parallel implementations
NASA Astrophysics Data System (ADS)
Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain
2014-02-01
This paper presents a CIVA-optimized ultrasonic inspection simulation tool, which takes advantage of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, and planar rectangular defects or side-drilled holes of small diameter. Validation of the model accuracy and performance measurements are presented.
Conceptual design of a hybrid parallel mechanism for mask exchanging of TMT
NASA Astrophysics Data System (ADS)
Wang, Jianping; Zhou, Hongfei; Li, Kexuan; Zhou, Zengxiang; Zhai, Chao
2015-10-01
The mask exchange system is an important part of the Multi-Object Broadband Imaging Echellette (MOBIE) on the Thirty Meter Telescope (TMT). To address the variation in stiffness of the MOBIE mask exchange system as the gravity vector changes, a hybrid parallel mechanism design method was adopted. By combining the high stiffness and precision of a parallel structure with the large range of motion of a serial structure, a conceptual design of a hybrid parallel mask exchange system based on a 3-RPS parallel mechanism is presented. According to the position requirements of the MOBIE, a SolidWorks structural model of the hybrid parallel mask exchange robot was established, and an installation position that does not interfere with the related components and light path in the MOBIE was analyzed. Simulation results in SolidWorks suggested that the 3-RPS parallel platform had good stiffness in different gravity vector directions. Furthermore, from the mechanism theory, the inverse kinematics solution of the 3-RPS parallel platform was calculated and the mathematical relationship between the attitude angle of the moving platform and the angles of the ball hinges on the moving platform was established, in order to analyze the attitude adjustment ability of the hybrid parallel mask exchange robot. The proposed conceptual design offers guidance for the design of the MOBIE mask exchange system on TMT.
Lü, Qiang; Xia, Xiao-Yan; Chen, Rong; Miao, Da-Jun; Chen, Sha-Sha; Quan, Li-Jun; Li, Hai-Ou
2012-01-01
Background Protein structure prediction (PSP), which is usually modeled as a computational optimization problem, remains one of the biggest challenges in computational biology. PSP encounters two difficult obstacles: the inaccurate energy function problem and the searching problem. Even if the lowest energy is luckily found by the searching procedure, the correct protein structure is not guaranteed to be obtained. Results A general parallel metaheuristic approach is presented to tackle the above two problems. Multiple energy functions are employed to simultaneously guide the parallel searching threads. Searching trajectories are in fact controlled by the parameters of heuristic algorithms. The parallel approach allows the parameters to be perturbed while the searching threads are running in parallel, with each thread searching for the lowest energy value determined by an individual energy function. By hybridizing the intelligences of parallel ant colonies and Monte Carlo Metropolis search, this paper demonstrates an implementation of our parallel approach for PSP. Sixteen classical instances were tested to show that the parallel approach is competitive for solving the PSP problem. Conclusions This parallel approach combines various sources of both searching intelligences and energy functions, and thus predicts protein conformations with good quality jointly determined by all the parallel searching threads and energy functions. It provides a framework to combine different searching intelligences embedded in heuristic algorithms. It also constructs a container to hybridize different not-so-accurate objective functions, which are usually derived from domain expertise. PMID:23028708
Parallel algorithms for simulating continuous time Markov chains
NASA Technical Reports Server (NTRS)
Nicol, David M.; Heidelberger, Philip
1992-01-01
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
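The uniformization idea mentioned in this abstract can be illustrated with a minimal serial sketch. It is not the authors' parallel algorithm; the function names and the example generator matrix are assumptions made for illustration. The chain is driven by a Poisson process whose rate is at least the largest exit rate, and at each tick the state moves according to P = I + Q/rate, which may include "pseudo-events" that leave the state unchanged. Because the tick times do not depend on the state, they can be sampled in advance, which is presumably what makes the technique attractive as a basis for synchronization.

```python
import random

def uniformized_chain(Q):
    """Return (rate, P) with P = I + Q/rate, a discrete-time transition matrix.

    Q is a generator matrix given as a list of rows. Assumes at least one
    state has a nonzero exit rate, so that rate > 0.
    """
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    return rate, P

def simulate(Q, state, t_end, rng):
    """Simulate the chain up to time t_end; returns a list of (time, state)."""
    rate, P = uniformized_chain(Q)
    t = 0.0
    path = [(0.0, state)]
    while True:
        t += rng.expovariate(rate)  # next Poisson tick, independent of state
        if t > t_end:
            return path
        # Jump according to row `state` of P (may be a self-loop pseudo-event).
        state = rng.choices(range(len(Q)), weights=P[state])[0]
        path.append((t, state))

# Hypothetical two-state example.
Q_example = [[-2.0, 2.0], [3.0, -3.0]]
example_path = simulate(Q_example, 0, 5.0, random.Random(0))
```

Each entry of the returned path records a tick of the uniformized process; consecutive entries with the same state correspond to pseudo-events.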
Eroglu, Duygu Yilmaz; Ozmutlu, H Cenk
2014-01-01
We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job sequence- and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum number of relocation operations on the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms.
Parallel evolution of mound-building and grass-feeding in Australian nasute termites.
Arab, Daej A; Namyatova, Anna; Evans, Theodore A; Cameron, Stephen L; Yeates, David K; Ho, Simon Y W; Lo, Nathan
2017-02-01
Termite mounds built by representatives of the family Termitidae are among the most spectacular constructions in the animal kingdom, reaching 6-8 m in height and housing millions of individuals. Although functional aspects of these structures are well studied, their evolutionary origins remain poorly understood. Australian representatives of the termitid subfamily Nasutitermitinae display a wide variety of nesting habits, making them an ideal group for investigating the evolution of mound building. Because they feed on a variety of substrates, they also provide an opportunity to illuminate the evolution of termite diets. Here, we investigate the evolution of termitid mound building and diet, through a comprehensive molecular phylogenetic analysis of Australian Nasutitermitinae. Molecular dating analysis indicates that the subfamily has colonized Australia on three occasions over the past approximately 20 Myr. Ancestral-state reconstruction showed that mound building arose on multiple occasions and from diverse ancestral nesting habits, including arboreal and wood or soil nesting. Grass feeding appears to have evolved from wood feeding via ancestors that fed on both wood and leaf litter. Our results underscore the adaptability of termites to ancient environmental change, and provide novel examples of parallel evolution of extended phenotypes. © 2017 The Author(s).
Massively parallel support for a case-based planning system
NASA Technical Reports Server (NTRS)
Kettler, Brian P.; Hendler, James A.; Anderson, William A.
1993-01-01
Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases.
Development of iterative techniques for the solution of unsteady compressible viscous flows
NASA Technical Reports Server (NTRS)
Hixon, Duane; Sankar, L. N.
1993-01-01
During the past two decades, there has been significant progress in the field of numerical simulation of unsteady compressible viscous flows. At present, a variety of solution techniques exist, such as transonic small disturbance (TSD) analyses, transonic full potential equation-based methods, unsteady Euler solvers, and unsteady Navier-Stokes solvers. These advances have been made possible by developments in three areas: (1) improved numerical algorithms; (2) automation of body-fitted grid generation schemes; and (3) advanced computer architectures with vector processing and massively parallel processing features. In this work, the GMRES scheme has been considered as a candidate for acceleration of a Newton iteration time marching scheme for unsteady 2-D and 3-D compressible viscous flow calculations; preliminary calculations indicate that this will provide up to a 65 percent reduction in computer time requirements over the existing class of explicit and implicit time marching schemes. The proposed method has been tested on structured grids, but is flexible enough for extension to unstructured grids. The described scheme has been tested only on the current generation of vector processor architectures of the Cray Y/MP class, but should be suitable for adaptation to massively parallel machines.
The many shades of prion strain adaptation.
Baskakov, Ilia V
2014-01-01
In several recent studies transmissible prion disease was induced in animals by inoculation with recombinant prion protein amyloid fibrils produced in vitro. Serial transmission of amyloid fibrils gave rise to a new class of prion strains of synthetic origin. Gradual transformation of disease phenotypes and PrP(Sc) properties was observed during serial transmission of synthetic prions, a process that resembled the phenomenon of prion strain adaptation. The current article discusses the remarkable parallels between phenomena of prion strain adaptation that accompanies cross-species transmission and the evolution of synthetic prions occurring within the same host. Two alternative mechanisms underlying prion strain adaptation and synthetic strain evolution are discussed. The current article highlights the complexity of the prion transmission barrier and strain adaptation and proposes that the phenomenon of prion adaptation is more common than previously thought.
A sequential adaptation technique and its application to the Mark 12 IFF system
NASA Astrophysics Data System (ADS)
Bailey, John S.; Mallett, John D.; Sheppard, Duane J.; Warner, F. Neal; Adams, Robert
1986-07-01
Sequential adaptation uses only two sets of receivers, correlators, and A/D converters which are time multiplexed to effect spatial adaptation in a system with (N) adaptive degrees of freedom. This technique can substantially reduce the hardware cost over what is realizable in a parallel architecture. A three channel L-band version of the sequential adapter was built and tested for use with the MARK XII IFF (identify friend or foe) system. In this system the sequentially determined adaptive weights were obtained digitally but implemented at RF. As a result, many of the post RF hardware induced sources of error that normally limit cancellation, such as receiver mismatch, are removed by the feedback property. The result is a system that can yield high levels of cancellation and be readily retrofitted to currently fielded equipment.
Work stealing for GPU-accelerated parallel programs in a global address space framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram
Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a function of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain.
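As a rough illustration of the work-stealing idea described in this abstract, the following sketch simulates generic work stealing in a single thread. It does not model the CPU-GPU or global-address-space aspects of the paper; the function name and the victim-selection rule (steal from the fullest queue) are assumptions. Each worker takes tasks from the tail of its own queue and, when idle, steals from the head of another worker's queue.

```python
from collections import deque

def run_work_stealing(queues):
    """Round-robin simulation of work stealing over a list of deques.

    Returns, for each worker, the list of tasks it ended up executing.
    """
    done = [[] for _ in queues]
    while any(queues):  # a deque is truthy while it still holds tasks
        for w, q in enumerate(queues):
            if q:
                # The owner works from the tail of its own queue.
                done[w].append(q.pop())
            else:
                # An idle worker steals from the head of the fullest queue.
                victim = max(range(len(queues)), key=lambda v: len(queues[v]))
                if queues[victim]:
                    done[w].append(queues[victim].popleft())
    return done

# Hypothetical imbalance: all six tasks start on worker 0.
result = run_work_stealing([deque(range(6)), deque()])
```

Taking from opposite ends of the queue is the usual design choice: the owner and the thief rarely contend for the same task.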
Low, R; Pothérat, A
2015-05-01
We investigate aspects of low-magnetic-Reynolds-number flow between two parallel, perfectly insulating walls in the presence of an imposed magnetic field parallel to the bounding walls. We find a functional basis to describe the flow, well adapted to the problem of finding the attractor dimension and which is also used in subsequent direct numerical simulation of these flows. For given Reynolds and Hartmann numbers, we obtain an upper bound for the dimension of the attractor by means of known bounds on the nonlinear inertial term and this functional basis for the flow. Three distinct flow regimes emerge: a quasi-isotropic three-dimensional (3D) flow, a nonisotropic 3D flow, and a 2D flow. We find the transition curves between these regimes in the space parametrized by Hartmann number Ha and attractor dimension d(att). We find how the attractor dimension scales as a function of Reynolds and Hartmann numbers (Re and Ha) in each regime. We also investigate the thickness of the boundary layer along the bounding wall and find that in all regimes this scales as 1/Re, independently of the value of Ha, unlike Hartmann boundary layers found when the field is normal to the channel. The structure of the set of least dissipative modes is indeed quite different between these two cases but the properties of turbulence far from the walls (smallest scales and number of degrees of freedom) are found to be very similar.
Zhang, S.; Yuen, D.A.; Zhu, A.; Song, S.; George, D.L.
2011-01-01
We parallelized the GeoClaw code on a one-level grid using OpenMP in March 2011 to meet the urgent need of simulating near-shore tsunami waves from Tohoku 2011, and achieved over 75% of the potential speed-up on an eight-core Dell Precision T7500 workstation [1]. After submitting that work to SC11, the International Conference for High Performance Computing, we obtained an unreleased OpenMP version of GeoClaw from David George, who developed the GeoClaw code as part of his Ph.D. thesis. In this paper, we show the complementary characteristics of the two approaches used in parallelizing GeoClaw and the speed-up obtained by combining the advantages of the two individual approaches with adaptive mesh refinement (AMR), demonstrating the capability of running GeoClaw efficiently on many-core systems. We also show a novel simulation of the Tohoku 2011 tsunami waves inundating the Sendai airport and the Fukushima nuclear power plants, in which a finest grid spacing of 20 meters is achieved through a 4-level AMR. This simulation yields quite good predictions of the wave heights and travel times of the tsunami waves. © 2011 IEEE.
Piras, P; Sansalone, G; Teresi, L; Kotsakis, T; Colangelo, P; Loy, A
2012-07-01
The shape and mechanical performance in Talpidae humeri were studied by means of Geometric Morphometrics and Finite Element Analysis, including both extinct and extant taxa. The aim of this study was to test whether the ability to dig, quantified by humerus mechanical performance, was characterized by convergent or parallel adaptations in different clades of complex tunnel digger within Talpidae, that is, Talpinae+Condylura (monophyletic) and some complex tunnel diggers not belonging to this clade. Our results suggest that the pattern underlying Talpidae humerus evolution is evolutionary parallelism. However, this insight changed to true convergence when we tested an alternative phylogeny based on molecular data, with Condylura moved to a more basal phylogenetic position. Shape and performance analyses, as well as specific comparative methods, provided strong evidence that the ability to dig complex tunnels reached a functional optimum in distantly related taxa. This was also confirmed by the lower phenotypic variance in complex tunnel digger taxa, compared to non-complex tunnel diggers. Evolutionary rates of phenotypic change showed a smooth deceleration in correspondence with the most recent common ancestor of the Talpinae+Condylura clade. Copyright © 2012 Wiley Periodicals, Inc.
NASA Technical Reports Server (NTRS)
Weeks, Cindy Lou
1986-01-01
Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
The Refinement-Tree Partition for Parallel Solution of Partial Differential Equations
Mitchell, William F.
1998-01-01
Dynamic load balancing is considered in the context of adaptive multilevel methods for partial differential equations on distributed memory multiprocessors. An approach that periodically repartitions the grid is taken. The important properties of a partitioning algorithm are presented and discussed in this context. A partitioning algorithm based on the refinement tree of the adaptive grid is presented and analyzed in terms of these properties. Theoretical and numerical results are given. PMID:28009355
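The general idea of partitioning along a refinement tree can be sketched as follows. This is a simplified illustration and not the algorithm analyzed in the paper: it assumes unit weight per leaf element, and the tree encoding and function names are hypothetical. Leaves are listed in depth-first order, so that elements refined from the same parent stay adjacent, and the ordered list is cut into contiguous, nearly equal pieces.

```python
def leaves(node):
    """Depth-first list of leaf names; a node is a (name, children) pair."""
    name, children = node
    if not children:
        return [name]
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

def partition(tree, p):
    """Split the depth-first leaf order into p contiguous, near-equal parts."""
    order = leaves(tree)
    n = len(order)
    parts = [[] for _ in range(p)]
    for i, leaf in enumerate(order):
        parts[i * p // n].append(leaf)  # leaf i goes to part floor(i*p/n)
    return parts

# Hypothetical grid: element "a" has been refined into "a1" and "a2".
tree = ("root", [("a", [("a1", []), ("a2", [])]), ("b", []), ("c", [])])
parts = partition(tree, 2)
```

After further refinement, only the leaf list changes, so repartitioning amounts to repeating the traversal and the cut.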
Adaptive-optics optical coherence tomography processing using a graphics processing unit.
Shafer, Brandon A; Kriske, Jeffery E; Kocaoglu, Omer P; Turner, Timothy L; Liu, Zhuolin; Lee, John Jaehwan; Miller, Donald T
2014-01-01
Graphics processing units are increasingly being used for scientific computing for their powerful parallel processing abilities, and moderate price compared to super computers and computing grids. In this paper we have used a general purpose graphics processing unit to process adaptive-optics optical coherence tomography (AOOCT) images in real time. Increasing the processing speed of AOOCT is an essential step in moving the super high resolution technology closer to clinical viability.
Optimizing Input/Output Using Adaptive File System Policies
NASA Technical Reports Server (NTRS)
Madhyastha, Tara M.; Elford, Christopher L.; Reed, Daniel A.
1996-01-01
Parallel input/output characterization studies and experiments with flexible resource management algorithms indicate that adaptivity is crucial to file system performance. In this paper we propose an automatic technique for selecting and refining file system policies based on application access patterns and execution environment. An automatic classification framework allows the file system to select appropriate caching and pre-fetching policies, while performance sensors provide feedback used to tune policy parameters for specific system environments. To illustrate the potential performance improvements possible using adaptive file system policies, we present results from experiments involving classification-based and performance-based steering.
NASA Astrophysics Data System (ADS)
Ali-Bey, Mohamed; Moughamir, Saïd; Manamanni, Noureddine
2011-12-01
In this paper, a simulator of a multi-view shooting system with parallel optical axes and a structurally variable configuration is proposed. The considered system is dedicated to the production of 3D content for auto-stereoscopic visualization. The global shooting/viewing geometrical process, which is the kernel of this shooting system, is detailed, and the different viewing, transformation and capture parameters are then defined. An appropriate perspective projection model is then derived to build the simulator. The simulator is first used to validate the global geometrical process in the case of a static configuration. Next, it is used to show the limitations of a static configuration of this type of shooting system by considering dynamic scenes, and a dynamic scheme is then developed to allow a correct capture of such scenes. After that, the effect of the different geometrical capture parameters on the 3D rendering quality, and whether they need to be adapted, is studied. Finally, some dynamic effects and their repercussions on the 3D rendering quality of dynamic scenes are analyzed using error images and some image quantization tools. Simulation and experimental results are presented throughout the paper to illustrate the different points studied. Some conclusions and perspectives end the paper.
Pham, Hoang Nam; Michalet, Serge; Bodillis, Josselin; Nguyen, Tien Dat; Nguyen, Thi Kieu Oanh; Le, Thi Phuong Quynh; Haddad, Mohamed; Nazaret, Sylvie; Dijoux-Franca, Marie-Geneviève
2017-07-01
Plants adapt to metal stress by modifying their metabolism including the production of secondary metabolites in plant tissues. Such changes may impact the diversity and functions of plant associated microbial communities. Our study aimed to evaluate the influence of metals on the secondary metabolism of plants and the indirect impact on rhizosphere bacterial communities. We then compared the secondary metabolites of the hyperaccumulator Pteris vittata L. collected from a contaminated mining site to a non-contaminated site in Vietnam and identified the discriminant metabolites. Our data showed a significant increase in chlorogenic acid derivatives and A-type procyanidin in plant roots at the contaminated site. We hypothesized that the intensive production of these compounds could be part of the antioxidant defense mechanism in response to metals. In parallel, the structure and diversity of bulk soil and rhizosphere communities was studied using high-throughput sequencing. The results showed strong differences in bacterial composition, characterized by the dominance of Proteobacteria and Nitrospira in the contaminated bulk soil, and the enrichment of some potential human pathogens, i.e., Acinetobacter, Mycobacterium, and Cupriavidus in P. vittata's rhizosphere at the mining site. Overall, metal pollution modified the production of P. vittata secondary metabolites and altered the diversity and structure of bacterial communities. Further investigations are needed to understand whether the plant recruits specific bacteria to adapt to metal stress.
Parallel aeroelastic computations for wing and wing-body configurations
NASA Technical Reports Server (NTRS)
Byun, Chansup
1994-01-01
The objective of this research is to develop computationally efficient methods for solving fluid-structural interaction problems by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures on parallel computers. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.
Akula, Nagaraju; Pattabiraman, Nagarajan
2005-06-01
Membrane proteins play a major role in a number of biological processes such as signaling pathways. The determination of the three-dimensional structure of these proteins is increasingly important for our understanding of their structure-function relationships. Due to the difficulty in isolating membrane proteins for X-ray diffraction studies, computational techniques are being developed to generate the 3D structures of TM domains. Here, we present a systematic search method for the identification of energetically favorable and tightly packed transmembrane parallel alpha-helices. The first step in our systematic search method is the generation of 3D models for pairs of parallel helix bundles with all possible orientations, followed by an energy-based filter to eliminate structures with severe non-bonded contacts. Then, an RMS-based filter was used to cluster these structures into families. Furthermore, these dimers were energy minimized using a molecular mechanics force field. Finally, we identified the tightly packed parallel alpha-helices by using an interface surface area. To validate our search method, we compared our predicted Glycophorin A (GPA) dimer structures with the reported NMR structures. With our search method, we are able to reproduce the NMR structures of GPA with 0.9 Å RMSD. In addition, when the reported mutational data on GxxxG motif interactions are considered, twenty percent of our predicted dimers are within 2.0 Å RMSD. The dimers obtained from our method were used to generate parallel trimeric and tetrameric TM structures of GPA, and we found that GPA might exist only in a dimer form, as reported earlier.
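The RMS-based clustering step described in this abstract could look roughly like the sketch below. The abstract does not give the actual procedure, so everything here is an assumption: the RMSD is computed without superposition, the clustering is a simple greedy "leader" scheme, and the coordinates and cutoff are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points, no superposition."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def cluster(structures, cutoff):
    """Greedy leader clustering into families of structure indices.

    A structure joins the first family whose representative (its first
    member) lies within `cutoff`; otherwise it starts a new family.
    """
    families = []
    for i, s in enumerate(structures):
        for fam in families:
            if rmsd(structures[fam[0]], s) <= cutoff:
                fam.append(i)
                break
        else:
            families.append([i])
    return families

# Hypothetical two-atom "structures" for illustration only.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
c = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
families = cluster([a, c, b], cutoff=0.5)
```

In a real pipeline the structures would normally be superposed before the RMSD is computed; that step is omitted here to keep the sketch short.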
Voice-coil technology for the E-ELT M4 Adaptive Unit
NASA Astrophysics Data System (ADS)
Gallieni, D.; Tintori, M.; Mantegazza, M.; Anaclerio, E.; Crimella, L.; Acerboni, M.; Biasi, R.; Angerer, G.; Andrigettoni, M.; Merler, A.; Veronese, D.; Carel, J.-L.; Marque, G.; Molinari, E.; Tresoldi, D.; Toso, G.; Spanó, P.; Riva, M.; Mazzoleni, R.; Riccardi, A.; Mantegazza, P.; Manetti, M.; Morandini, M.; Vernet, E.; Hubin, N.; Jochum, L.; Madec, P.; Dimmler, M.; Koch, F.
We present our design of the E-ELT M4 Adaptive Unit based on voice-coil driven deformable mirror technology. This technology has been developed by the INAF-Arcetri, Microgate and ADS team over the past 15 years and has been adopted by a number of large ground-based telescopes such as the MMT, LBT, Magellan and, most recently, the VLT in the frame of the Adaptive Telescope Facility project. Our design is based on contactless force actuators made of permanent magnets glued on the back of the deformable mirror and coils mounted on a stiff reference structure. We use capacitive sensors to close a position loop co-located with each actuator. Dedicated high-performance parallel processors are used to implement the local decentralized control at actuator level and a centralized feed-forward computation of all the actuator forces. In our previous systems, this allowed us to achieve dynamic performance well in line with the requirements of the M4 Adaptive Unit (M4AU) case. The actuator density of our design is on the order of 30-mm spacing, for a total of about 6000 actuators on the M4AU, and it allows the fitting error and correction requirements of the E-ELT high-order DM to be fulfilled. Moreover, our contactless technology makes the deformable mirror tolerant of up to 5% actuator failures without compromising the system's ability to reach its specified performance, besides allowing large mechanical tolerances between the reference structure and the deformable mirror. Finally, we present the Demonstration Prototype we are building in the frame of the M4AU Phase B study to measure the optical dynamical performance predicted by our design. This prototype will be fully representative of the M4AU features; in particular, it will address the controllability of two adjacent segments of the 2-mm thick mirror and implement the actuator "brick" modular concept that has been adopted to dramatically improve the maintainability of the final unit.
NASA Astrophysics Data System (ADS)
Hird, J. P.; Twilley, R.; Shelden, J.; Carney, J.; Georgiou, I. Y.; Agre, C.
2016-02-01
In response to the Changing Course Design Competition, a bold, innovative "systems approach" to link the specific needs of the region's ecosystem, economy and community is proposed. "The Giving Delta" plan empowers the Mississippi River's seasonal natural flood pulse to maximize sediment capture in order to build and maintain wetlands, mitigate the effects of climate change and subsidence, and slow the inevitable marine transgression of the Delta. Sediment capture is optimized by a series of sediment retention strategies and passive sediment diversion structures, as well as by establishing a new deep-draft navigation channel connected to the Barataria Bay shoreline littoral zone 40 miles north of the current channel. This paradigm shift from "flood control" to "controlled floods" connects the River's natural flood pulse to the coastal landscape. Using hydraulic residence time in the basin as a design and operational criterion for these controlled and passive structures balances estuarine recovery and system response tolerance, in order to determine the magnitude of the peak flows possible without intolerable salinity suppression in the receiving basins. Seasonal salinity gradients can be established that enable the diversion program to operate in harmony with, and promote, regional fisheries. On an annual basis, fisheries, communities and ecosystems will adapt to seasonally changing conditions. This plan is not designed to completely rebuild the wetlands that have been lost over the last century. Instead, the design encourages wetland adaptation to accelerated sea level rise in the coastal basins. With this plan, the basin ecologies would "self-organize" in parallel with the human settlements' natural ability to adapt and change toward this long-term vision, as a new, consolidated and sustainable Delta emerges. By establishing a framework of implementation over 100 years, incremental adaptation minimizes individual uncertainty and costs within each human generation.
Development of Underwater Laser Scaling Adapter
NASA Astrophysics Data System (ADS)
Bluss, Kaspars
2012-12-01
In this paper the developed laser scaling adapter is presented. The scaling adapter is equipped with a twin laser unit where the two parallel laser beams are projected onto any target giving an exact indication of scale. The body of the laser scaling adapter is made of Teflon, the density of which is approximately two times the water density. The development involved multiple challenges - numerical hydrodynamic calculations for choosing an appropriate shape which would reduce the effects of turbulence, an accurate sealing of the power supply and the laser diodes, and others. The precision is estimated by the partial derivation method. Both experimental and theoretical data conclude the overall precision error to be in the 1% margin. This paper presents the development steps of such an underwater laser scaling adapter for a remotely operated vehicle (ROV).
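As a rough illustration of how two parallel beams give a scale and how a partial-derivative estimate propagates measurement error, the sketch below assumes a known beam separation and pixel measurements taken from an image. The function names, the quadrature sum (which assumes independent errors) and all numbers are hypothetical and are not taken from the paper.

```python
import math


def object_size(beam_sep_mm, dots_px, object_px):
    """Object size in mm, scaled by the laser dot separation seen in the image."""
    # The two dots are beam_sep_mm apart on the target and dots_px apart in the image.
    return beam_sep_mm * object_px / dots_px


def size_uncertainty(beam_sep_mm, dots_px, object_px,
                     d_sep_mm, d_dots_px, d_object_px):
    """First-order error propagation for S = b * o / p (assumed independent errors)."""
    dS_db = object_px / dots_px                       # partial derivative w.r.t. beam separation
    dS_do = beam_sep_mm / dots_px                     # partial derivative w.r.t. object pixels
    dS_dp = -beam_sep_mm * object_px / dots_px ** 2   # partial derivative w.r.t. dot pixels
    return math.sqrt((dS_db * d_sep_mm) ** 2
                     + (dS_do * d_object_px) ** 2
                     + (dS_dp * d_dots_px) ** 2)


if __name__ == "__main__":
    size = object_size(100.0, 200.0, 500.0)
    err = size_uncertainty(100.0, 200.0, 500.0, 0.5, 1.0, 1.0)
    print(f"size = {size:.1f} mm, relative error = {100 * err / size:.2f} %")
```

With these made-up inputs the relative error comes out below 1%, but that figure depends entirely on the assumed measurement uncertainties.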
Research on the adaptive optical control technology based on DSP
NASA Astrophysics Data System (ADS)
Zhang, Xiaolu; Xue, Qiao; Zeng, Fa; Zhao, Junpu; Zheng, Kuixing; Su, Jingqin; Dai, Wanjun
2018-02-01
Adaptive optics is a real-time compensation technique that uses a high-speed support system to correct wavefront errors caused by atmospheric turbulence. However, the randomness and instantaneity of atmospheric change introduce great difficulties to the design of adaptive optical systems. A large number of complex real-time operations lead to large delay, which is an insurmountable problem. To solve this problem, hardware operation and a parallel processing strategy are proposed, and a high-speed adaptive optical control system based on DSP is developed. The hardware counter is used to check the system. The results show that the system can complete a closed-loop control cycle in 7.1 ms and improve the control bandwidth of the adaptive optical system. Using this system, wavefront measurement and closed-loop experiments are carried out, and good results are obtained.
NASA Astrophysics Data System (ADS)
Somavarapu, Dhathri H.
This thesis proposes a new parallel computing genetic algorithm framework for designing fuel-optimal trajectories for interplanetary spacecraft missions. The framework can capture the deep search space of the problem with the use of a fixed chromosome structure and hidden-genes concept, can explore the diverse set of candidate solutions with the use of the adaptive and twin-space crowding techniques, and can execute on any high-performance computing (HPC) platform with the adoption of the portable message passing interface (MPI) standard. The algorithm is implemented in C++ with the use of the MPICH implementation of the MPI standard. The algorithm uses a patched-conic approach with two-body dynamics assumptions. New procedures are developed for determining trajectories in the V-infinity leveraging legs of the flight from the launch and non-launch planets, and in the deep-space maneuver legs of the flight from the launch and non-launch planets. The chromosome structure maintains the time of flight as a free parameter within certain boundaries. The fitness or cost function of the algorithm uses only the mission Delta V and does not include time of flight. The optimization is conducted with two variations for the minimum mission gravity-assist sequence, the 4-gravity-assist and the 3-gravity-assist, with a maximum of 5 gravity-assists allowed in both cases. The optimal trajectories discovered using the framework in both cases demonstrate the success of this framework.
NASA Astrophysics Data System (ADS)
Zatarain Salazar, Jazmin; Reed, Patrick M.; Quinn, Julianne D.; Giuliani, Matteo; Castelletti, Andrea
2017-11-01
Reservoir operations are central to our ability to manage river basin systems serving conflicting multi-sectoral demands under increasingly uncertain futures. These challenges motivate the need for new solution strategies capable of effectively and efficiently discovering the multi-sectoral tradeoffs that are inherent to alternative reservoir operation policies. Evolutionary many-objective direct policy search (EMODPS) is gaining importance in this context due to its capability of addressing multiple objectives and its flexibility in incorporating multiple sources of uncertainties. This simulation-optimization framework has high potential for addressing the complexities of water resources management, and it can benefit from current advances in parallel computing and meta-heuristics. This study contributes a diagnostic assessment of state-of-the-art parallel strategies for the auto-adaptive Borg Multi Objective Evolutionary Algorithm (MOEA) to support EMODPS. Our analysis focuses on the Lower Susquehanna River Basin (LSRB) system where multiple sectoral demands from hydropower production, urban water supply, recreation and environmental flows need to be balanced. Using EMODPS with different parallel configurations of the Borg MOEA, we optimize operating policies over different size ensembles of synthetic streamflows and evaporation rates. As we increase the ensemble size, we increase the statistical fidelity of our objective function evaluations at the cost of higher computational demands. This study demonstrates how to overcome the mathematical and computational barriers associated with capturing uncertainties in stochastic multiobjective reservoir control optimization, where parallel algorithmic search serves to reduce the wall-clock time in discovering high quality representations of key operational tradeoffs. Our results show that emerging self-adaptive parallelization schemes exploiting cooperative search populations are crucial. 
Such strategies provide a promising new set of tools for effectively balancing exploration, uncertainty, and computational demands when using EMODPS.
Accelerated Adaptive MGS Phase Retrieval
NASA Technical Reports Server (NTRS)
Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang
2011-01-01
The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm to allow parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time. The exploit involves performing matrix calculations in nVidia graphic cards. The graphical processor unit (GPU) is hardware that is specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model is used, called CUDA, to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies, to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphic cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.
Noninvasive Medical Diagnostics & Treatment Using Ultrasonics
NASA Technical Reports Server (NTRS)
Bar-Cohen, Y.; Siegel, R.; Grandia, W.
1998-01-01
In parallel to the industrial application of NDE to flaw detection and material property determination, the medical community has successfully adapted such methods to the noninvasive diagnostics and treatment of many conditions and disorders of the human body.
Research in Parallel Algorithms and Software for Computational Aerosciences
DOT National Transportation Integrated Search
1996-04-01
Phase I is complete for the development of a Computational Fluid Dynamics with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed...
A Comparison of Three Programming Models for Adaptive Applications
NASA Technical Reports Server (NTRS)
Shan, Hong-Zhang; Singh, Jaswinder Pal; Oliker, Leonid; Biswa, Rupak; Kwak, Dochan (Technical Monitor)
2000-01-01
We study the performance and programming effort for two major classes of adaptive applications under three leading parallel programming models. We find that all three models can achieve scalable performance on state-of-the-art multiprocessor machines. The basic parallel algorithms needed for different programming models to deliver their best performance are similar, but the implementations differ greatly, far beyond the fact of using explicit messages versus implicit loads/stores. Compared with MPI and SHMEM, CC-SAS (cache-coherent shared address space) provides substantial ease of programming at the conceptual and program orchestration level, which often leads to performance gains. However, it may also suffer from the poor spatial locality of physically distributed shared data on large numbers of processors. Our CC-SAS implementation of the PARMETIS partitioner itself runs faster than in the other two programming models, and generates a more balanced result for our application.
3D Printed, Microgroove Pattern-Driven Generation of Oriented Ligamentous Architectures.
Park, Chan Ho; Kim, Kyoung-Hwa; Lee, Yong-Moo; Giannobile, William V; Seol, Yang-Jo
2017-09-08
Specific orientations of regenerated ligaments are crucially required for mechanoresponsive properties and various biomechanical adaptations, which are the key interplay to support mineralized tissues. Although various 2D platforms or 3D printing systems can guide cellular activities or aligned organizations, it remains a challenge to develop ligament-guided, 3D architectures with the angular controllability for parallel, oblique or perpendicular orientations of cells required for biomechanical support of organs. Here, we show the use of scaffold design by additive manufacturing for specific topographies or angulated microgroove patterns to control cell orientations such as parallel (0°), oblique (45°) and perpendicular (90°) angulations. These results demonstrate that ligament cells displayed highly predictable and controllable orientations along microgroove patterns on 3D biopolymeric scaffolds. Our findings demonstrate that 3D printed topographical approaches can regulate spatiotemporal cell organizations that offer strong potential for adaptation to complex tissue defects to regenerate ligament-bone complexes.
Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Das, Sajal K.; Harvey, Daniel; Oliker, Leonid
1999-01-01
The ability to dynamically adapt an unstructured grid (or mesh) is a powerful tool for solving computational problems with evolving physical features; however, an efficient parallel implementation is rather difficult, particularly from the viewpoint of portability on various multiprocessor platforms. We address this problem by developing PLUM, an automatic and architecture-independent framework for adaptive numerical computations in a message-passing environment. Portability is demonstrated by comparing performance on an SP2, an Origin2000, and a T3E, without any code modifications. We also present a general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication pattern, with the goal of providing a global view of system loads across processors. Experiments on an SP2 and an Origin2000 demonstrate the portability of our approach, which achieves superb load balance at the cost of minimal extra overhead.
Gilgamesh: A Multithreaded Processor-In-Memory Architecture for Petaflops Computing
NASA Technical Reports Server (NTRS)
Sterling, T. L.; Zima, H. P.
2002-01-01
Processor-in-Memory (PIM) architectures avoid the von Neumann bottleneck in conventional machines by integrating high-density DRAM and CMOS logic on the same chip. Parallel systems based on this new technology are expected to provide higher scalability, adaptability, robustness, fault tolerance and lower power consumption than current MPPs or commodity clusters. In this paper we describe the design of Gilgamesh, a PIM-based massively parallel architecture, and elements of its execution model. Gilgamesh extends existing PIM capabilities by incorporating advanced mechanisms for virtualizing tasks and data and providing adaptive resource management for load balancing and latency tolerance. The Gilgamesh execution model is based on macroservers, a middleware layer which supports object-based runtime management of data and threads allowing explicit and dynamic control of locality and load balancing. The paper concludes with a discussion of related research activities and an outlook to future work.
Aluminum integral foams with tailored density profile by adapted blowing agents
NASA Astrophysics Data System (ADS)
Hartmann, Johannes; Fiegl, Tobias; Körner, Carolin
2014-05-01
The goal of the present work is the variation of the structure of aluminum integral foams regarding the thickness of the integral solid skin as well as the density profile. A modified die casting process, namely integral foam molding, is used in which an aluminum melt and blowing agent particles (magnesium hydride MgH2) are injected in a permanent steel mold. The high solidification rates at the cooled walls of the mold lead to the formation of a solid skin. In the inner region, hydrogen is released by thermal decomposition of MgH2 particles. Thus, the pore formation takes place parallel to the continuing solidification of the melt. The thickness of the solid skin and the density profile of the core strongly depend on the interplay between solidification velocity and kinetics of hydrogen release. By varying the melt and blowing agent properties, the structure of integral foams can be systematically changed to meet the requirements of the desired field of application of the produced component.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Micah Johnson, Andrew Slaughter
PIKA is a MOOSE-based application for modeling micro-structure evolution of seasonal snow. The model will be useful for environmental, atmospheric, and climate scientists. Possible applications include application to energy balance models, ice sheet modeling, and avalanche forecasting. The model implements physics from published, peer-reviewed articles. The main purpose is to foster university and laboratory collaboration to build a larger multi-scale snow model using MOOSE. The main feature of the code is that it is implemented using the MOOSE framework, thus making features such as multiphysics coupling, adaptive mesh refinement, and parallel scalability native to the application. PIKA implements three equations: the phase-field equation for tracking the evolution of the ice-air interface within seasonal snow at the grain-scale; the heat equation for computing the temperature of both the ice and air within the snow; and the mass transport equation for monitoring the diffusion of water vapor in the pore space of the snow.
AESOP: A Python Library for Investigating Electrostatics in Protein Interactions.
Harrison, Reed E S; Mohan, Rohith R; Gorham, Ronald D; Kieslich, Chris A; Morikis, Dimitrios
2017-05-09
Electric fields often play a role in guiding the association of protein complexes. Such interactions can be further engineered to accelerate complex association, resulting in protein systems with increased productivity. This is especially true for enzymes where reaction rates are typically diffusion limited. To facilitate quantitative comparisons of electrostatics in protein families and to describe electrostatic contributions of individual amino acids, we previously developed a computational framework called AESOP. We now implement this computational tool in Python with increased usability and the capability of performing calculations in parallel. AESOP utilizes PDB2PQR and Adaptive Poisson-Boltzmann Solver to generate grid-based electrostatic potential files for protein structures provided by the end user. There are methods within AESOP for quantitatively comparing sets of grid-based electrostatic potentials in terms of similarity or generating ensembles of electrostatic potential files for a library of mutants to quantify the effects of perturbations in protein structure and protein-protein association. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
A cascaded neuro-computational model for spoken word recognition
NASA Astrophysics Data System (ADS)
Hoya, Tetsuya; van Leeuwen, Cees
2010-03-01
In human speech recognition, words are analysed at both pre-lexical (i.e., sub-word) and lexical (word) levels. The aim of this paper is to propose a constructive neuro-computational model that incorporates both these levels as cascaded layers of pre-lexical and lexical units. The layered structure enables the system to handle the variability of real speech input. Within the model, receptive fields of the pre-lexical layer consist of radial basis functions; the lexical layer is composed of units that perform pattern matching between their internal template and a series of labels, corresponding to the winning receptive fields in the pre-lexical layer. The model adapts through self-tuning of all units, in combination with the formation of a connectivity structure through unsupervised (first layer) and supervised (higher layers) network growth. Simulation studies show that the model can achieve a level of performance in spoken word recognition similar to that of a benchmark approach using hidden Markov models, while enabling parallel access to word candidates in lexical decision making.
Evolving phenotypic networks in silico.
François, Paul
2014-11-01
Evolved gene networks are constrained by natural selection. Their structures and functions are consequently far from being random, as exemplified by the multiple instances of parallel/convergent evolution. One can thus ask if features of actual gene networks can be recovered from evolutionary first principles. I review a method for in silico evolution of small models of gene networks aiming at performing predefined biological functions. I summarize the current implementation of the algorithm, insisting on the construction of a proper "fitness" function. I illustrate the approach on three examples: biochemical adaptation, ligand discrimination and vertebrate segmentation (somitogenesis). While the structure of the evolved networks is variable, dynamics of our evolved networks are usually constrained and present many similar features to actual gene networks, including properties that were not explicitly selected for. In silico evolution can thus be used to predict biological behaviours without a detailed knowledge of the mapping between genotype and phenotype. Copyright © 2014 The Author. Published by Elsevier Ltd. All rights reserved.
Physics Structure Analysis of Parallel Waves Concept of Physics Teacher Candidate
NASA Astrophysics Data System (ADS)
Sarwi, S.; Supardi, K. I.; Linuwih, S.
2017-04-01
The aim of this research was to identify the structure of parallel conceptions of wave physics held by physics teacher candidates and the factors that influence the formation of those conceptions. The study used a qualitative method with a cross-sectional design. The subjects were five third-semester students of basic physics and six fifth-semester students of the wave course. Data were collected through think-aloud sessions and written tests. Quantitative data were analysed with a descriptive percentage technique. Answers concerning belief and awareness were analysed with an explanatory analysis. Results of the research include: 1) the structure of the concept can be displayed as a map containing the theoretical core, supplements to the theory and phenomena that occur daily; 2) trends of parallel conceptions of wave physics were identified for stationary waves, resonance of sound and the propagation of transverse electromagnetic waves; 3) parallel conceptions were influenced by less comprehensive reading of textbooks and by partial understanding of the knowledge that forms the structure of the theory.
Transition from Longitudinal to Block Structure of Preclinical Courses: Outcomes and Experiences
Marinović, Darko; Hren, Darko; Sambunjak, Dario; Rašić, Ivan; Škegro, Ivan; Marušić, Ana; Marušić, Matko
2009-01-01
Aim To evaluate the transition from a longitudinal to block/modular structure of preclinical courses in a medical school adapting to the process of higher education harmonization in Europe. Methods Average grades and the exam pass rates were compared for 11 preclinical courses before and after the transition from the longitudinal (academic years 1999/2000 to 2001/2002) to block/modular curriculum (academic years 2002/2003 to 2004/2005) at Zagreb University School of Medicine, Croatia. Attitudes of teachers toward the 2 curriculum structures were assessed by a semantic differential scale, and the experiences during the transition were explored in focus groups of students and teachers. Results With the introduction of the block/modular curriculum, average grades mostly increased, except in 3 major courses: Anatomy, Physiology, and Pathology. The proportion of students who passed the exams at first attempt decreased in most courses, but the proportion of students who successfully passed the exam by the end of the summer exam period increased. Teachers generally had more positive attitudes toward the longitudinal (median [C]±interquartile range [Q], 24 ± 16) than block/modular curriculum (C±Q, 38 ± 26) (P = 0.001, Wilcoxon signed rank test). The qualitative inquiry indicated that the dissatisfaction of students and teachers with the block/modular preclinical curriculum was caused by perceived hasty introduction of the reform under pressure and without much adaptation of the teaching program and materials, which reflected negatively on the learning processes and outcomes. Conclusion Any significant alteration in the temporal structure of preclinical courses should be paralleled by a change in the content and teaching methodology, and carefully planned and executed in order to achieve better academic outcomes. PMID:19839073
Parallel grid library for rapid and flexible simulation development
NASA Astrophysics Data System (ADS)
Honkonen, I.; von Alfthan, S.; Sandroos, A.; Janhunen, P.; Palmroth, M.
2013-04-01
We present an easy to use and flexible grid library for developing highly scalable parallel simulations. The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes. Dccrg is free software that anyone can use, study and modify and is available at https://gitorious.org/dccrg. Users are also kindly requested to cite this work when publishing results obtained with dccrg. Catalogue identifier: AEOM_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOM_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: GNU Lesser General Public License version 3 No. of lines in distributed program, including test data, etc.: 54975 No. of bytes in distributed program, including test data, etc.: 974015 Distribution format: tar.gz Programming language: C++. Computer: PC, cluster, supercomputer. Operating system: POSIX. The code has been parallelized using MPI and tested with 1-32768 processes RAM: 10 MB-10 GB per process Classification: 4.12, 4.14, 6.5, 19.3, 19.10, 20. 
External routines: MPI-2 [1], boost [2], Zoltan [3], sfc++ [4] Nature of problem: Grid library supporting arbitrary data in grid cells, parallel adaptive mesh refinement, transparent remote neighbor data updates and load balancing. Solution method: The simulation grid is represented by an adjacency list (graph) with vertices stored into a hash table and edges into contiguous arrays. Message Passing Interface standard is used for parallelization. Cell data is given as a template parameter when instantiating the grid. Restrictions: Logically cartesian grid. Running time: Running time depends on the hardware, problem and the solution method. Small problems can be solved in under a minute and very large problems can take weeks. The examples and tests provided with the package take less than about one minute using default options. In the version of dccrg presented here the speed of adaptive mesh refinement is at most of the order of 10^6 total created cells per second. [1] http://www.mpi-forum.org/. [2] http://www.boost.org/. [3] K. Devine, E. Boman, R. Heaphy, B. Hendrickson, C. Vaughan, Zoltan data management services for parallel dynamic applications, Comput. Sci. Eng. 4 (2002) 90-97. http://dx.doi.org/10.1109/5992.988653. [4] https://gitorious.org/sfc++.
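The storage layout named under "Solution method" (cells in a hash table, neighbor links in contiguous arrays, user-chosen cell data) can be sketched in a few lines. This is a toy, single-process illustration in Python, not dccrg's C++ API; the class `CellGrid`, its methods and the 1D periodic topology are assumptions made only for the example.

```python
from array import array


class CellGrid:
    """Toy 1D periodic grid: cell data in a dict (hash table),
    neighbor ids packed into one contiguous array."""

    def __init__(self, n_cells, make_data):
        # Hash table: cell id -> arbitrary user data, built by the make_data callback.
        self.cells = {cid: make_data(cid) for cid in range(1, n_cells + 1)}
        # Flat edge storage plus a per-cell (start, count) index into it.
        self.neighbors = array("Q")
        self.offsets = {}
        for cid in self.cells:
            left = n_cells if cid == 1 else cid - 1
            right = 1 if cid == n_cells else cid + 1
            self.offsets[cid] = (len(self.neighbors), 2)
            self.neighbors.extend((left, right))

    def neighbors_of(self, cid):
        """Return the neighbor ids of one cell as a list."""
        start, count = self.offsets[cid]
        return list(self.neighbors[start:start + count])


if __name__ == "__main__":
    grid = CellGrid(4, lambda cid: {"density": float(cid)})
    for cid in grid.cells:
        print(cid, grid.neighbors_of(cid), grid.cells[cid])
```

Passing `make_data` as a callback plays the role that the cell-data template parameter plays in the description above: the grid stores whatever the user supplies without knowing its layout.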
NASA Astrophysics Data System (ADS)
Ma, Wenying; Ma, Changwei; Wang, Weimin
2018-03-01
Deformable mirrors (DM) based on microelectromechanical system (MEMS) technology are increasingly being applied in adaptive optics (AO) systems for astronomical telescopes and human eyes. In this paper a MEMS DM with hexagonal actuators is proposed and designed. The relationship between structural design and performance parameters, mainly actuator coupling, is carefully analyzed and calculated. The optimum value of actuator coupling is obtained. A 7-element DM prototype is fabricated using a commercially available standard three-layer polysilicon surface multi-user-MEMS-processes (PolyMUMPs). Some key performances, including surface figure and voltage-displacement curve, are measured with a 3D white light profiler. The measured performances are very consistent with the theoretical values. The proposed DM will benefit the miniaturization of AO systems and lower their cost.
NASA Technical Reports Server (NTRS)
Chung, T. J. (Editor); Karr, Gerald R. (Editor)
1989-01-01
Recent advances in computational fluid dynamics are examined in reviews and reports, with an emphasis on finite-element methods. Sections are devoted to adaptive meshes, atmospheric dynamics, combustion, compressible flows, control-volume finite elements, crystal growth, domain decomposition, EM-field problems, FDM/FEM, and fluid-structure interactions. Consideration is given to free-boundary problems with heat transfer, free surface flow, geophysical flow problems, heat and mass transfer, high-speed flow, incompressible flow, inverse design methods, MHD problems, the mathematics of finite elements, and mesh generation. Also discussed are mixed finite elements, multigrid methods, non-Newtonian fluids, numerical dissipation, parallel vector processing, reservoir simulation, seepage, shallow-water problems, spectral methods, supercomputer architectures, three-dimensional problems, and turbulent flows.
Robotic inspection of fiber reinforced composites using phased array UT
NASA Astrophysics Data System (ADS)
Stetson, Jeffrey T.; De Odorico, Walter
2014-02-01
Ultrasound is the current NDE method of choice to inspect large fiber reinforced airframe structures. Over the last 15 years Cartesian based scanning machines using conventional ultrasound techniques have been employed by all airframe OEMs and their top tier suppliers to perform these inspections. Technical advances in both computing power and commercially available, multi-axis robots now facilitate a new generation of scanning machines. These machines use multiple end effector tools taking full advantage of phased array ultrasound technologies yielding substantial improvements in inspection quality and productivity. This paper outlines the general architecture for these new robotic scanning systems as well as details the variety of ultrasonic techniques available for use with them including advances such as wide area phased array scanning and sound field adaptation for non-flat, non-parallel surfaces.
A versatile diffractive maskless lithography for single-shot and serial microfabrication.
Jenness, Nathan J; Hill, Ryan T; Hucknall, Angus; Chilkoti, Ashutosh; Clark, Robert L
2010-05-24
We demonstrate a diffractive maskless lithographic system that is capable of rapidly performing both serial and single-shot micropatterning. Utilizing the diffractive properties of phase holograms displayed on a spatial light modulator, arbitrary intensity distributions were produced to form two and three dimensional micropatterns/structures in a variety of substrates. A straightforward graphical user interface was implemented to allow users to load templates and change patterning modes within the span of a few minutes. A minimum resolution of approximately 700 nm is demonstrated for both patterning modes, which compares favorably to the 232 nm resolution limit predicted by the Rayleigh criterion. The presented method is rapid and adaptable, allowing for the parallel fabrication of microstructures in photoresist as well as the fabrication of protein microstructures that retain functional activity.
2. View of Mainline elevated structure, parallel to Washington Street, ...
2. View of Mainline elevated structure, parallel to Washington Street, crossing over the Massachusetts Turnpike and the B&A R.R. tracks - looking North. - Boston Elevated Railway, Elevated Mainline, Washington Street, Boston, Suffolk County, MA
StrAuto: automation and parallelization of STRUCTURE analysis.
Chhatre, Vikram E; Emerson, Kevin J
2017-03-24
Population structure inference using the software STRUCTURE has become an integral part of population genetic studies covering a broad spectrum of taxa including humans. The ever-expanding size of genetic data sets poses computational challenges for this analysis. Although at least one tool currently implements parallel computing to reduce computational overload of this analysis, it does not fully automate the use of replicate STRUCTURE analysis runs required for downstream inference of optimal K. There is a pressing need for a tool that can deploy population structure analysis on high performance computing clusters. We present an updated version of the popular Python program StrAuto, to streamline population structure analysis using parallel computing. StrAuto implements a pipeline that combines STRUCTURE analysis with the Evanno ΔK analysis and visualization of results using STRUCTURE HARVESTER. Using benchmarking tests, we demonstrate that StrAuto significantly reduces the computational time needed to perform iterative STRUCTURE analysis by distributing runs over two or more processors. StrAuto is the first tool to integrate STRUCTURE analysis with post-processing using a pipeline approach in addition to implementing parallel computation, a setup ideal for deployment on computing clusters. StrAuto is distributed under the GNU GPL (General Public License) and available to download from http://strauto.popgen.org.
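To make the role of replicate runs in choosing K concrete, here is a minimal sketch of the Evanno ΔK statistic computed from replicate log-likelihoods. It is not StrAuto or STRUCTURE HARVESTER code: the function name and the input values are invented, and the second difference is taken on replicate means, which is one common convention and is assumed here.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """lnp maps K -> list of replicate ln P(D) values (at least two per K).

    Returns {K: delta_K} for every K that has both K-1 and K+1 available.
    """
    means = {k: mean(v) for k, v in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # the second difference needs both neighbours
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(lnp[k])
        delta[k] = second_diff / sd if sd > 0 else float("inf")
    return delta


if __name__ == "__main__":
    # Made-up replicate log-likelihoods for K = 1..4, two replicates each.
    runs = {
        1: [-1000.0, -1002.0],
        2: [-900.0, -902.0],
        3: [-890.0, -894.0],
        4: [-888.0, -890.0],
    }
    dk = evanno_delta_k(runs)
    print(dk, "-> best K:", max(dk, key=dk.get))
```

Because the denominator is the standard deviation across replicates at each K, several replicate runs per K are required, which is what makes distributing runs over processors worthwhile.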
Jeukens, Julie; Bernatchez, Louis
2012-01-01
While gene expression divergence is known to be involved in adaptive phenotypic divergence and speciation, the relative importance of regulatory and structural evolution of genes is poorly understood. A recent next-generation sequencing experiment allowed the identification of candidate genes potentially involved in the ongoing speciation of sympatric dwarf and normal lake whitefish (Coregonus clupeaformis), such as cytosolic malate dehydrogenase (MDH1), which showed both significant expression and sequence divergence. The main goal of this study was to investigate in more detail the signatures of natural selection in the regulatory and coding sequences of MDH1 in lake whitefish and test for parallelism of these signatures with other coregonine species. Sequencing of the two regions in 118 fish from four sympatric pairs of whitefish and two cisco species revealed a total of 35 single nucleotide polymorphisms (SNPs), with more genetic diversity in European compared to North American coregonine species. While the coding region was found to be under purifying selection, an SNP in the proximal promoter exhibited significant allele frequency divergence in a parallel manner among independent sympatric pairs of North American lake whitefish and European whitefish (C. lavaretus). According to transcription factor binding simulation for 22 regulatory haplotypes of MDH1, putative binding profiles were fairly conserved among species, except for the region around this SNP. Moreover, we found evidence for the role of this SNP in the regulation of MDH1 expression level. Overall, these results provide further evidence for the role of natural selection in gene regulation evolution among whitefish species pairs and suggest its possible link with patterns of phenotypic diversity observed in coregonine species. PMID:22408741
NASA Technical Reports Server (NTRS)
Agrawal, Gagan; Sussman, Alan; Saltz, Joel
1993-01-01
Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). A combined runtime and compile-time approach for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion was described. A runtime library which can be used to port these applications on distributed memory machines was designed and implemented. The library is currently implemented on several different systems. To further ease the task of application programmers, methods were developed for integrating this runtime library with compilers for HPF-like parallel programming languages. How this runtime library was integrated with the Fortran 90D compiler being developed at Syracuse University is discussed. Experimental results to demonstrate the efficacy of our approach are presented. A multiblock Navier-Stokes solver template and a multigrid code were experimented with. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler parallelized codes perform within 20 percent of the code parallelized by manually inserting calls to the runtime library.
Metascalable molecular dynamics simulation of nano-mechano-chemistry
NASA Astrophysics Data System (ADS)
Shimojo, F.; Kalia, R. K.; Nakano, A.; Nomura, K.; Vashishta, P.
2008-07-01
We have developed a metascalable (or 'design once, scale on new architectures') parallel application-development framework for first-principles based simulations of nano-mechano-chemical processes on emerging petaflops architectures based on spatiotemporal data locality principles. The framework consists of (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms, (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these scalable algorithms onto hardware. The EDC-STEP-HCD framework exposes and expresses maximal concurrency and data locality, thereby achieving parallel efficiency as high as 0.99 for 1.59-billion-atom reactive force field molecular dynamics (MD) and 17.7-million-atom (1.56 trillion electronic degrees of freedom) quantum mechanical (QM) MD in the framework of the density functional theory (DFT) on adaptive multigrids, in addition to 201-billion-atom nonreactive MD, on 196 608 IBM BlueGene/L processors. We have also used the framework for automated execution of adaptive hybrid DFT/MD simulation on a grid of six supercomputers in the US and Japan, in which the number of processors changed dynamically on demand and tasks were migrated according to unexpected faults. The paper presents the application of the framework to the study of nanoenergetic materials: (1) combustion of an Al/Fe2O3 thermite and (2) shock initiation and reactive nanojets at a void in an energetic crystal.
A Generic Mesh Data Structure with Parallel Applications
ERIC Educational Resources Information Center
Cochran, William Kenneth, Jr.
2009-01-01
High performance, massively-parallel multi-physics simulations are built on efficient mesh data structures. Most data structures are designed from the bottom up, focusing on the implementation of linear algebra routines. In this thesis, we explore a top-down approach to design, evaluating the various needs of many aspects of simulation, not just…
Rochus, Christina Marie; Tortereau, Flavie; Plisson-Petit, Florence; Restoux, Gwendal; Moreno-Romieux, Carole; Tosser-Klopp, Gwenola; Servin, Bertrand
2018-01-23
One of the approaches to detect genetic variants affecting fitness traits is to identify their surrounding genomic signatures of past selection. With established methods for detecting selection signatures and the current and future availability of large datasets, such studies should have the power to not only detect these signatures but also to infer their selective histories. Domesticated animals offer a powerful model for these approaches as they adapted rapidly to environmental and human-mediated constraints in a relatively short time. We investigated this question by studying a large dataset of 542 individuals from 27 domestic sheep populations raised in France, genotyped for more than 500,000 SNPs. Population structure analysis revealed that this set of populations harbours a large part of European sheep diversity in a small geographical area, offering a powerful model for the study of adaptation. Identification of extreme SNP and haplotype frequency differences between populations listed 126 genomic regions likely affected by selection. These signatures revealed selection at loci commonly identified as selection targets in many species ("selection hotspots") including ABCG2, LCORL/NCAPG, MSTN, and coat colour genes such as ASIP, MC1R, MITF, and TYRP1. For one of these regions (ABCG2, LCORL/NCAPG), we could propose a historical scenario leading to the introgression of an adaptive allele into a new genetic background. Among selection signatures, we found clear evidence for parallel selection events in different genetic backgrounds, most likely for different mutations. We confirmed this allelic heterogeneity in one case by resequencing the MC1R gene in three black-faced breeds. Our study illustrates how dense genetic data in multiple populations allows the deciphering of evolutionary history of populations and of their adaptive mutations.
Parallel, Gradient-Based Anisotropic Mesh Adaptation for Re-entry Vehicle Configurations
NASA Technical Reports Server (NTRS)
Bibb, Karen L.; Gnoffo, Peter A.; Park, Michael A.; Jones, William T.
2006-01-01
Two gradient-based adaptation methodologies have been implemented into the Fun3d refine GridEx infrastructure. A spring-analogy adaptation which provides for nodal movement to cluster mesh nodes in the vicinity of strong shocks has been extended for general use within Fun3d, and is demonstrated for a 70° sphere cone at Mach 2. A more general feature-based adaptation metric has been developed for use with the adaptation mechanics available in Fun3d, and is applicable to any unstructured, tetrahedral, flow solver. The basic functionality of general adaptation is explored through a case of flow over the forebody of a 70° sphere cone at Mach 6. A practical application of Mach 10 flow over an Apollo capsule, computed with the Felisa flow solver, is given to compare the adaptive mesh refinement with uniform mesh refinement. The examples of the paper demonstrate that the gradient-based adaptation capability as implemented can give an improvement in solution quality.
A parallel strategy for predicting the secondary structure of polycistronic microRNAs.
Han, Dianwei; Tang, Guiliang; Zhang, Jun
2013-01-01
The biogenesis of a functional microRNA is largely dependent on the secondary structure of the microRNA precursor (pre-miRNA). Recently, it has been shown that microRNAs are present in the genome in the form of polycistronic transcriptional units in plants and animals. It will be important to design efficient computational methods to predict such structures for microRNA discovery and its applications in gene silencing. In this paper, we propose a parallel algorithm based on the master-slave architecture to predict the secondary structure from an input sequence. We conducted some experiments to verify the effectiveness of our parallel algorithm. The experimental results show that our algorithm is able to produce the optimal secondary structure of polycistronic microRNAs.
NASA Astrophysics Data System (ADS)
Quan, Zhe; Wu, Lei
2017-09-01
This article investigates the use of parallel computing for solving the disjunctively constrained knapsack problem. The proposed parallel computing model can be viewed as a cooperative algorithm based on a multi-neighbourhood search. The cooperation system is composed of a team manager and a crowd of team members. The team members aim at applying their own search strategies to explore the solution space. The team manager collects the solutions from the members and shares the best one with them. The performance of the proposed method is evaluated on a group of benchmark data sets. The results obtained are compared to those reached by the best methods from the literature. The results show that the proposed method is able to provide the best solutions in most cases. In order to highlight the robustness of the proposed parallel computing model, a new set of large-scale instances is introduced. Encouraging results have been obtained.
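The manager/member cooperation described in the abstract above can be sketched as follows. This is an illustrative toy, not the authors' algorithm: the two neighbourhoods (add an item, swap an item), the acceptance rule, the use of a thread pool, and all function names and parameters are assumptions made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def profit(sol, profits):
    return sum(profits[i] for i in sol)


def feasible(sol, weights, capacity, conflicts):
    """Capacity constraint plus disjunctive (conflict) constraints."""
    if sum(weights[i] for i in sol) > capacity:
        return False
    return not any(a in sol and b in sol for a, b in conflicts)


def member_search(start, seed, profits, weights, capacity, conflicts, steps=200):
    """One team member: randomized local search from the shared solution."""
    rng = random.Random(seed)  # private generator, so results are repeatable
    current = set(start)
    n = len(profits)
    for _ in range(steps):
        inside = sorted(current)
        outside = [i for i in range(n) if i not in current]
        if not outside:
            break  # every item is already packed
        cand = set(current)
        cand.add(rng.choice(outside))            # "add" neighbourhood
        if inside and rng.random() < 0.5:
            cand.remove(rng.choice(inside))      # turns the move into a swap
        if (feasible(cand, weights, capacity, conflicts)
                and profit(cand, profits) >= profit(current, profits)):
            current = cand
    return frozenset(current)


def cooperative_search(profits, weights, capacity, conflicts,
                       members=4, rounds=5, seed=0):
    """Team manager: collect member solutions, share the best one each round."""
    best = frozenset()
    with ThreadPoolExecutor(max_workers=members) as pool:
        for r in range(rounds):
            futures = [
                pool.submit(member_search, best, seed + r * members + m,
                            profits, weights, capacity, conflicts)
                for m in range(members)
            ]
            for future in futures:
                sol = future.result()
                if profit(sol, profits) > profit(best, profits):
                    best = sol
    return best


# Tiny made-up instance: items 0/1 and items 2/3 may not be packed together.
best = cooperative_search(
    profits=[10, 7, 6, 5, 4],
    weights=[5, 4, 3, 2, 1],
    capacity=8,
    conflicts=[(0, 1), (2, 3)],
)
```

Threads keep the sketch short; for CPU-bound search in Python a process pool or MPI ranks would be the more realistic choice, with the same collect-and-share loop at the manager.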
Parallel software for lattice N = 4 supersymmetric Yang-Mills theory
NASA Astrophysics Data System (ADS)
Schaich, David; DeGrand, Thomas
2015-05-01
We present new parallel software, SUSY LATTICE, for lattice studies of four-dimensional N = 4 supersymmetric Yang-Mills theory with gauge group SU(N). The lattice action is constructed to exactly preserve a single supersymmetry charge at non-zero lattice spacing, up to additional potential terms included to stabilize numerical simulations. The software evolved from the MILC code for lattice QCD, and retains a similar large-scale framework despite the different target theory. Many routines are adapted from an existing serial code (Catterall and Joseph, 2012), which SUSY LATTICE supersedes. This paper provides an overview of the new parallel software, summarizing the lattice system, describing the applications that are currently provided and explaining their basic workflow for non-experts in lattice gauge theory. We discuss the parallel performance of the code, and highlight some notable aspects of the documentation for those interested in contributing to its future development.
Probabilistic structural mechanics research for parallel processing computers
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Martin, William R.
1991-01-01
Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods was hampered by their computationally intense nature. Solution of PSM problems requires repeated analyses of structures that are often large, and exhibit nonlinear and/or dynamic response behavior. These methods are all inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures and innovative control software and solution methodologies are needed to make solution of large scale PSM problems practical.
The transcriptomics of ecological convergence between 2 limnetic coregonine fishes (Salmonidae).
Derome, N; Bernatchez, L
2006-12-01
Species living in comparable habitats often display strikingly similar patterns of specialization, suggesting that natural selection can lead to predictable evolutionary changes. Elucidating the genomic basis underlying such adaptive phenotypic changes is a major goal in evolutionary biology. Increasing evidence indicates that natural selection would first modulate gene regulation during the process of population divergence. Previously, we showed that parallel phenotypic adaptations of the dwarf whitefish (Coregonus clupeaformis) ecotype to the limnetic trophic niche involved parallel transcriptional changes at the same genes involved in muscle contraction and energetic metabolism relative to the sympatric normal ecotype. Here, we tested whether the same genes are also implicated in a limnetic specialist species, the cisco (Coregonus artedi), which is the most likely competitor of dwarf whitefish. Significant upregulation was detected in cisco at the same 6 candidate genes functionally involved in modulating swimming activity, namely 5 variants of a major protein of fast muscle and 1 putative catalytic crystallin enzyme. Moreover, 3 of 5 variants and the same putative catalytic crystallin enzyme were upregulated in cisco relative to the dwarf ecotype, indicating a greater physiological potential of the former for exploiting the limnetic trophic niche. This study provides the first empirical evidence that recent, parallel phenotypic evolution toward the use of the same ecological niche occupied by a specialist competitor involved similar adaptive changes in expression at the same genes. As such, this study provides strong support to the general hypothesis that directional selection acting on gene regulation may promote rapid phenotypic divergence and ultimately speciation.
Conceptualizing and communicating ecological river restoration: Chapter 2
Jacobson, Robert B.; Berkley, Jim
2011-01-01
We present a general conceptual model for communicating aspects of river restoration and management. The model is generic and adaptable to most riverine settings, independent of size. The model has separate categories of natural and social-economic drivers, and management actions are envisioned as modifiers of naturally dynamic systems. The model includes a decision-making structure in which managers, stakeholders, and scientists interact to define management objectives and performance evaluation. The model depicts a stress to the riverine ecosystem as either (1) deviation in the regimes (flow, sediment, temperature, light, biogeochemical, and genetic) by altering the frequency, magnitude, duration, timing, or rate of change of the fluxes or (2) imposition of a hard structural constraint on channel form. Restoration is depicted as naturalization of those regimes or removal of the constraint. The model recognizes the importance of river history in conditioning future responses. Three hierarchical tiers of essential ecosystem characteristics (EECs) illustrate how management actions typically propagate through physical/chemical processes to habitat to biotic responses. Uncertainty and expense in modeling or measuring responses increase in moving from tiers 1 to 3. Social-economic characteristics are shown in a parallel structure that emphasizes the need to quantify trade-offs between ecological and social-economic systems. Performance measures for EECs are also hierarchical, showing that selection of measures depends on participants’ willingness to accept uncertainty. The general form is of an adaptive management loop in which the performance measures are compared to reference conditions or success criteria and the information is fed back into the decision-making process.
Zawadzki, Robert J; Zhang, Pengfei; Zam, Azhar; Miller, Eric B; Goswami, Mayank; Wang, Xinlei; Jonnal, Ravi S; Lee, Sang-Hyuck; Kim, Dae Yu; Flannery, John G; Werner, John S; Burns, Marie E; Pugh, Edward N
2015-06-01
Adaptive optics scanning laser ophthalmoscopy (AO-SLO) has recently been used to achieve exquisite subcellular resolution imaging of the mouse retina. Wavefront sensing-based AO typically restricts the field of view to a few degrees of visual angle. As a consequence the relationship between AO-SLO data and larger scale retinal structures and cellular patterns can be difficult to assess. The retinal vasculature affords a large-scale 3D map on which cells and structures can be located during in vivo imaging. Phase-variance OCT (pv-OCT) can efficiently image the vasculature with near-infrared light in a label-free manner, allowing 3D vascular reconstruction with high precision. We combined widefield pv-OCT and SLO imaging with AO-SLO reflection and fluorescence imaging to localize two types of fluorescent cells within the retinal layers: GFP-expressing microglia, the resident macrophages of the retina, and GFP-expressing cone photoreceptor cells. We describe in detail a reflective afocal AO-SLO retinal imaging system designed for high resolution retinal imaging in mice. The optical performance of this instrument is compared to other state-of-the-art AO-based mouse retinal imaging systems. The spatial and temporal resolution of the new AO instrumentation was characterized with angiography of retinal capillaries, including blood-flow velocity analysis. Depth-resolved AO-SLO fluorescent images of microglia and cone photoreceptors are visualized in parallel with 469 nm and 663 nm reflectance images of the microvasculature and other structures. Additional applications of the new instrumentation are discussed.
High Resolution DNS of Turbulent Flows using an Adaptive, Finite Volume Method
NASA Astrophysics Data System (ADS)
Trebotich, David
2014-11-01
We present a new computational capability for high resolution simulation of incompressible viscous flows. Our approach is based on cut cell methods where an irregular geometry such as a bluff body is intersected with a rectangular Cartesian grid resulting in cut cells near the boundary. In the cut cells we use a conservative discretization based on a discrete form of the divergence theorem to approximate fluxes for elliptic and hyperbolic terms in the Navier-Stokes equations. Away from the boundary the method reduces to a finite difference method. The algorithm is implemented in the Chombo software framework which supports adaptive mesh refinement and massively parallel computations. The code is scalable to 200,000+ processor cores on DOE supercomputers, resulting in DNS studies at unprecedented scale and resolution. For flow past a cylinder in transition (Re = 300) we observe a number of secondary structures in the far wake in 2D where the wake is over 120 cylinder diameters in length. These are compared with the more regularized wake structures in 3D at the same scale. For flow past a sphere (Re = 600) we resolve an arrowhead structure in the velocity in the near wake. The effectiveness of AMR is further highlighted in a simulation of turbulent flow (Re = 6000) in the contraction of an oil well blowout preventer. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Contract Number DE-AC02-05-CH11231.
Adaptive beam shaping for improving the power coupling of a two-Cassegrain-telescope
NASA Astrophysics Data System (ADS)
Ma, Haotong; Hu, Haojun; Xie, Wenke; Zhao, Haichuan; Xu, Xiaojun; Chen, Jinbao
2013-08-01
We demonstrate adaptive beam shaping for improving the power coupling of a two-Cassegrain-telescope system, based on the stochastic parallel gradient descent (SPGD) algorithm and dual phase-only liquid crystal spatial light modulators (LC-SLMs). Adaptive pre-compensation of the wavefront of the projected laser beam at the transmitter telescope is chosen to improve the power coupling efficiency. One phase-only LC-SLM adaptively optimizes the phase distribution of the projected laser beam and the other generates a turbulence phase screen. The intensity distributions of the dark hollow beam after passing through the turbulent atmosphere with and without adaptive beam shaping are analyzed in detail. The influences of propagation distance and aperture size of the Cassegrain telescope on coupling efficiency are investigated theoretically and experimentally. These studies show that the power coupling can be significantly improved by adaptive beam shaping. The technique can be used in optical communication, deep space optical communication and relay mirrors.
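The SPGD algorithm named in the abstract above can be sketched in a few lines. This is a generic two-sided SPGD update on a toy metric, not the authors' experimental loop: the quadratic `metric` stands in for a measured coupling efficiency, and `gain`, `sigma`, `iters`, and the function names are illustrative assumptions.

```python
import random


def spgd(metric, u0, gain=2.0, sigma=0.1, iters=300, seed=0):
    """Maximize metric(u) with stochastic parallel gradient descent.

    Every control channel is perturbed at the same time by +/- sigma,
    the metric is evaluated for both signs of the perturbation, and all
    channels are updated in parallel from the single scalar difference.
    """
    rng = random.Random(seed)
    u = list(u0)
    for _ in range(iters):
        du = [sigma * rng.choice((-1.0, 1.0)) for _ in u]
        j_plus = metric([a + d for a, d in zip(u, du)])
        j_minus = metric([a - d for a, d in zip(u, du)])
        dj = j_plus - j_minus
        # Ascent step: channels whose perturbation raised the metric move further.
        u = [a + gain * dj * d for a, d in zip(u, du)]
    return u


# Toy stand-in for the coupling metric: highest when u equals a hidden target.
target = [0.3, -0.2, 0.5, 0.1]


def toy_metric(u):
    return -sum((a - t) ** 2 for a, t in zip(u, target))


phases = spgd(toy_metric, [0.0, 0.0, 0.0, 0.0])
```

In a hardware loop the controls would be the LC-SLM phase values and the metric a detector reading; only two metric measurements per iteration are needed regardless of the number of channels, which is the appeal of SPGD for high-dimensional wavefront control.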
ERIC Educational Resources Information Center
Patton, Michael Quinn
2008-01-01
Extension and evaluation share some similar challenges, including working with diverse stakeholders, parallel processes for focusing priorities, meeting common standards of excellence, and adapting to globalization, new technologies, and changing times. Evaluations of extension programs have helped clarify how change occurs, especially the…
A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors
Seo, Jiwon; Chen, Yu-Hsuan; De Lorenzo, David S.; Lo, Sherman; Enge, Per; Akos, Dennis; Lee, Jiyun
2011-01-01
Due to their weak received signal power, Global Positioning System (GPS) signals are vulnerable to radio frequency interference. Adaptive beam and null steering of the gain pattern of a GPS antenna array can significantly increase the resistance of GPS sensors to signal interference and jamming. Since adaptive array processing requires intensive computational power, beamsteering GPS receivers were usually implemented using hardware such as field-programmable gate arrays (FPGAs). However, a software implementation using general-purpose processors is much more desirable because of its flexibility and cost effectiveness. This paper presents a GPS software-defined radio (SDR) with adaptive beamsteering capability for anti-jam applications. The GPS SDR design is based on an optimized desktop parallel processing architecture using a quad-core Central Processing Unit (CPU) coupled with a new generation Graphics Processing Unit (GPU) having massively parallel processors. This GPS SDR demonstrates sufficient computational capability to support a four-element antenna array and future GPS L5 signal processing in real time. After providing the details of our design and optimization schemes for future GPU-based GPS SDR developments, the jamming resistance of our GPS SDR under synthetic wideband jamming is presented. Since the GPS SDR uses commercial-off-the-shelf hardware and processors, it can be easily adopted in civil GPS applications requiring anti-jam capabilities. PMID:22164116
SAChES: Scalable Adaptive Chain-Ensemble Sampling.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Swiler, Laura Painton; Ray, Jaideep; Ebeida, Mohamed Salah
We present the development of a parallel Markov Chain Monte Carlo (MCMC) method called SAChES, Scalable Adaptive Chain-Ensemble Sampling. This capability is targeted to Bayesian calibration of computationally expensive simulation models. SAChES involves a hybrid of two methods: Differential Evolution Monte Carlo followed by Adaptive Metropolis. Both methods involve parallel chains. Differential evolution allows one to explore high-dimensional parameter spaces using loosely coupled (i.e., largely asynchronous) chains. Loose coupling allows the use of large chain ensembles, with far more chains than the number of parameters to explore. This reduces per-chain sampling burden, enables high-dimensional inversions and the use of computationally expensive forward models. The large number of chains can also ameliorate the impact of silent errors, which may affect only a few chains. The chain ensemble can also be sampled to provide an initial condition when an aberrant chain is re-spawned. Adaptive Metropolis takes the best points from the differential evolution and efficiently homes in on the posterior density. The multitude of chains in SAChES is leveraged to (1) enable efficient exploration of the parameter space; and (2) ensure robustness to silent errors which may be unavoidable in extreme-scale computational platforms of the future. This report outlines SAChES, describes four papers that are the result of the project, and discusses some additional results.
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method
Grayver, Alexander V.; Kolev, Tzanio V.
2015-11-01
Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barrett, Brian; Brightwell, Ronald B.; Grant, Ryan
This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.
The Portals 4.0 network programming interface.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin
2012-11-01
This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.
Progress in Computational Simulation of Earthquakes
NASA Technical Reports Server (NTRS)
Donnellan, Andrea; Parker, Jay; Lyzenga, Gregory; Judd, Michele; Li, P. Peggy; Norton, Charles; Tisdale, Edwin; Granat, Robert
2006-01-01
GeoFEST(P) is a computer program written for use in the QuakeSim project, which is devoted to development and improvement of means of computational simulation of earthquakes. GeoFEST(P) models interacting earthquake fault systems from the fault-nucleation to the tectonic scale. The development of GeoFEST(P) has involved coupling of two programs: GeoFEST and the Pyramid Adaptive Mesh Refinement Library. GeoFEST is a message-passing-interface-parallel code that utilizes a finite-element technique to simulate evolution of stress, fault slip, and plastic/elastic deformation in realistic materials like those of faulted regions of the crust of the Earth. The products of such simulations are synthetic observable time-dependent surface deformations on time scales from days to decades. The Pyramid Adaptive Mesh Refinement Library is a software library that facilitates the generation of computational meshes for solving physical problems. In an application of GeoFEST(P), a computational grid can be dynamically adapted as stress grows on a fault. Simulations on workstations using a few tens of thousands of stress and displacement finite elements can now be expanded to multiple millions of elements with greater than 98-percent scaled efficiency on many hundreds of parallel processors (see figure).
Schmoll, Hans-Joachim; Arnold, Dirk; de Gramont, Aimery; Ducreux, Michel; Grothey, Axel; O'Dwyer, Peter J; Van Cutsem, Eric; Hermann, Frank; Bosanac, Ivan; Bendahmane, Belguendouz; Mancao, Christoph; Tabernero, Josep
2018-06-01
The old approach of one therapeutic for all patients with mCRC is evolving with a need to target specific molecular aberrations or cell-signalling pathways. Molecular screening approaches and new biomarkers are required to fully characterize tumours, identify patients most likely to benefit, and predict treatment response. MODUL is a signal-seeking trial with a design that is highly adaptable, permitting modification of different treatment cohorts and inclusion of further additional cohorts based on novel evidence on new compounds/combinations that emerge during the study. MODUL is ongoing and its adaptable nature permits timely and efficient recruitment of patients into the most appropriate cohort. Recruitment will take place over approximately 5 years in Europe, Asia, Africa, and South America. The design of MODUL with ongoing parallel/sequential treatment cohorts means that the overall size and duration of the trial can be modified/prolonged based on accumulation of new data. The early success of the current trial suggests that the design may provide definitive leads in a patient-friendly and relatively economical trial structure. Along with other biomarker-driven trials that are currently underway, it is hoped that MODUL will contribute to the continuing evolution of clinical trial design and permit a more 'tailored' approach to the treatment of patients with mCRC.
Maillefert, J F; Kloppenburg, M; Fernandes, L; Punzi, L; Günther, K-P; Martin Mola, E; Lohmander, L S; Pavelka, K; Lopez-Olivo, M A; Dougados, M; Hawker, G A
2009-10-01
To conduct a multi-language translation and cross-cultural adaptation of the Intermittent and Constant OsteoArthritis Pain (ICOAP) questionnaire for hip and knee osteoarthritis (OA). The questionnaires were translated and cross-culturally adapted in parallel, using a common protocol, into the following languages: Czech, Dutch, French (France), German, Italian, Norwegian, Spanish (Castilian), North and Central American Spanish, Swedish. The process was conducted following five steps: (1) independent translation into the target language by two or three persons; (2) consensus meeting to obtain a single preliminary translated version; (3) backward translation by an independent bilingual native English speaker, blinded to the English original version; (4) final version produced by a multidisciplinary consensus committee; (5) pre-testing of the final version with 10-20 target-language-native hip and knee OA patients. The process could be followed and completed in all countries. Only slight differences were identified in the structure of the sentences between the original and the translated versions. A large majority of the patients felt that the questionnaire was easy to understand and complete. Only a few minor criticisms were expressed. Moreover, a majority of patients found the concepts of constant pain and pain that comes and goes to be highly pertinent and were very happy with the distinction. The ICOAP questionnaire is now available for multi-center international studies.
Liu, Rui; Milkie, Daniel E; Kerlin, Aaron; MacLennan, Bryan; Ji, Na
2014-01-27
In traditional zonal wavefront sensing for adaptive optics, after local wavefront gradients are obtained, the entire wavefront can be calculated by assuming that the wavefront is a continuous surface. Such an approach leads to sub-optimal performance in reconstructing wavefronts that are either discontinuous or undersampled by the zonal wavefront sensor. Here, we report a new method to reconstruct the wavefront by directly measuring local wavefront phases in parallel using the multidither coherent optical adaptive technique. This method determines the relative phase of each pupil segment independently, and thus produces an accurate wavefront even for discontinuous wavefronts. We implemented this method in an adaptive optical two-photon fluorescence microscope and demonstrated its superior performance in correcting large or discontinuous aberrations.
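A toy one-dimensional sketch shows why gradient-only reconstruction can miss a discontinuity. It is an illustration under stated assumptions, not the method of the paper: each zone reports a single slope, the reconstruction chains zone-centre heights by trapezoidal integration (which is where continuity is assumed), and the wavefront values are hypothetical.

```python
def reconstruct_from_slopes(slopes, dx=1.0):
    """Integrate per-zone slopes into zone-centre heights, assuming continuity."""
    heights = [0.0]
    for left, right in zip(slopes, slopes[1:]):
        # Trapezoidal step between neighbouring zone centres.
        heights.append(heights[-1] + 0.5 * (left + right) * dx)
    return heights


# Hypothetical segmented wavefront: two flat segments separated by a 0.5-unit step.
true_heights = [0.0, 0.0, 0.5, 0.5]
# Every zone is locally flat, so a gradient sensor reports zero slope everywhere.
measured_slopes = [0.0, 0.0, 0.0, 0.0]

reconstructed = reconstruct_from_slopes(measured_slopes)
missed_step = max(abs(t - r) for t, r in zip(true_heights, reconstructed))
```

The reconstruction comes out flat and the step between segments is lost, whereas a scheme that measures each segment's phase directly would recover `true_heights` without any continuity assumption.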
AFFINE-CORRECTED PARADISE: FREE-BREATHING PATIENT-ADAPTIVE CARDIAC MRI WITH SENSITIVITY ENCODING
Sharif, Behzad; Bresler, Yoram
2013-01-01
We propose a real-time cardiac imaging method with parallel MRI that allows for free breathing during imaging and does not require cardiac or respiratory gating. The method is based on the recently proposed PARADISE (Patient-Adaptive Reconstruction and Acquisition Dynamic Imaging with Sensitivity Encoding) scheme. The new acquisition method adapts the PARADISE k-t space sampling pattern according to an affine model of the respiratory motion. The reconstruction scheme involves multi-channel time-sequential imaging with time-varying channels. All model parameters are adapted to the imaged patient as part of the experiment and drive both data acquisition and cine reconstruction. Simulated cardiac MRI experiments using the realistic NCAT phantom show high quality cine reconstructions and robustness to modeling inaccuracies. PMID:24390159
High-Resolution Adaptive Optics Test-Bed for Vision Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilks, S C; Thompson, C A; Olivier, S S
2001-09-27
We discuss the design and implementation of a low-cost, high-resolution adaptive optics test-bed for vision research. It is well known that high-order aberrations in the human eye reduce optical resolution and limit visual acuity. However, the effects of aberration-free eyesight on vision are only now beginning to be studied using adaptive optics to sense and correct the aberrations in the eye. We are developing a high-resolution adaptive optics system for this purpose using a Hamamatsu Parallel Aligned Nematic Liquid Crystal Spatial Light Modulator. Phase-wrapping is used to extend the effective stroke of the device, and the wavefront sensing and wavefront correction are done at different wavelengths. Issues associated with these techniques will be discussed.
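Phase-wrapping, as mentioned in the abstract above, can be sketched in a few lines. The sketch assumes a modulator whose stroke is exactly one wave (2π radians) at the correction wavelength; the function names and the sample phase values are hypothetical and are not taken from the report.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi


def wrap_phase(phase_rad):
    """Fold a desired phase correction into the range [0, 2*pi)."""
    return phase_rad % TWO_PI


def same_optical_phase(a, b, tol=1e-9):
    """True when two phase values produce the same complex field factor."""
    return abs(cmath.exp(1j * a) - cmath.exp(1j * b)) < tol


# Hypothetical desired corrections, some beyond a single wave of stroke.
desired = [0.0, 1.5 * math.pi, 2.5 * math.pi, -0.5 * math.pi]
applied = [wrap_phase(p) for p in desired]
```

Each wrapped value stays within one wave of stroke yet imposes the same phase factor as the unwrapped request. That equivalence holds only at the wavelength for which one wave equals the device stroke, which is one reason wrapping is wavelength-sensitive.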