Parallel tempering for the traveling salesman problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Percus, Allon; Wang, Richard; Hyman, Jeffrey
We explore the potential of parallel tempering as a combinatorial optimization method, applying it to the traveling salesman problem. We compare simulation results of parallel tempering with a benchmark implementation of simulated annealing, and study how different choices of parameters affect the relative performance of the two methods. We find that a straightforward implementation of parallel tempering can outperform simulated annealing in several crucial respects. When parameters are chosen appropriately, both methods yield close approximations to the actual minimum distance for an instance with 200 nodes. However, parallel tempering yields more consistently accurate results when a series of independent simulations are performed. Our results suggest that parallel tempering might offer a simple but powerful alternative to simulated annealing for combinatorial optimization problems.
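To make the approach concrete: parallel tempering for the TSP runs a Metropolis chain at each temperature and periodically attempts to exchange tours between adjacent temperatures with probability min(1, exp[(1/T_k - 1/T_{k+1})(E_k - E_{k+1})]). The following minimal Python sketch illustrates this; it is not the authors' implementation, and the 2-opt proposal, geometric temperature ladder, and sweep count are illustrative assumptions.

```python
import math
import random

def tour_length(tour, dist):
    """Total length of the closed tour."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt(tour):
    """Propose a 2-opt move: reverse a randomly chosen segment."""
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def parallel_tempering_tsp(dist, temps, sweeps=2000, seed=0):
    """One 2-opt Metropolis attempt per replica per sweep, followed by
    swap attempts between neighboring temperatures."""
    random.seed(seed)
    n = len(dist)
    tours = [random.sample(range(n), n) for _ in temps]
    energy = [tour_length(t, dist) for t in tours]
    best = min(zip(energy, tours))
    for _ in range(sweeps):
        for k, temp in enumerate(temps):
            cand = two_opt(tours[k])
            d_e = tour_length(cand, dist) - energy[k]
            if d_e <= 0 or random.random() < math.exp(-d_e / temp):
                tours[k], energy[k] = cand, energy[k] + d_e
                best = min(best, (energy[k], tours[k]))
        for k in range(len(temps) - 1):
            # swap acceptance: exp[(1/T_k - 1/T_{k+1}) * (E_k - E_{k+1})]
            arg = (1.0 / temps[k] - 1.0 / temps[k + 1]) * (energy[k] - energy[k + 1])
            if arg >= 0 or random.random() < math.exp(arg):
                tours[k], tours[k + 1] = tours[k + 1], tours[k]
                energy[k], energy[k + 1] = energy[k + 1], energy[k]
    return best

# toy usage: 30 random cities, 8-temperature geometric ladder
random.seed(1)
pts = [(random.random(), random.random()) for _ in range(30)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
print(parallel_tempering_tsp(dist, [0.02 * 1.5**k for k in range(8)])[0])
```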
Hyper-Parallel Tempering Monte Carlo Method and Its Applications
NASA Astrophysics Data System (ADS)
Yan, Qiliang; de Pablo, Juan
2000-03-01
A new generalized hyper-parallel tempering Monte Carlo molecular simulation method is presented for the study of complex fluids. The method is particularly useful for simulation of many-molecule complex systems, where rough energy landscapes and inherently long characteristic relaxation times can pose formidable obstacles to effective sampling of relevant regions of configuration space. The method combines several key elements from expanded ensemble formalisms, parallel tempering, open ensemble simulations, configurational bias techniques, and histogram reweighting analysis of results. It is found to significantly accelerate the diffusion of a complex system through phase space. In this presentation, we demonstrate the effectiveness of the new method by implementing it in grand canonical ensembles for a Lennard-Jones fluid, for the restricted primitive model of electrolyte solutions (RPM), and for polymer solutions and blends. Our results indicate that the new algorithm is capable of overcoming the large free energy barriers associated with phase transitions, thereby greatly facilitating the simulation of coexistence properties. It is also shown that the method can be orders of magnitude more efficient than previously available techniques. More importantly, the method is relatively simple and can be incorporated into existing simulation codes with minor effort.
Efficient Simulation of Explicitly Solvated Proteins in the Well-Tempered Ensemble.
Deighan, Michael; Bonomi, Massimiliano; Pfaendtner, Jim
2012-07-10
Herein, we report a significant reduction in the cost of combined parallel tempering and metadynamics simulations (PTMetaD). The efficiency boost is achieved using the recently proposed well-tempered ensemble (WTE) algorithm. We studied the convergence of PTMetaD-WTE conformational sampling and free energy reconstruction for an explicitly solvated 20-residue tryptophan-cage protein (trp-cage). A set of PTMetaD-WTE simulations was compared to a corresponding standard PTMetaD simulation. The properties of PTMetaD-WTE and the convergence of the calculations were compared, and the roles of the number of replicas, total simulation time, and the adjustable WTE parameter γ were studied.
Multisystem altruistic metadynamics—Well-tempered variant
NASA Astrophysics Data System (ADS)
Hošek, Petr; Kříž, Pavel; Toulcová, Daniela; Spiwok, Vojtěch
2017-03-01
The metadynamics method has been widely used to enhance sampling in molecular simulations. Its original form suffers from two major drawbacks: poor convergence in complex (especially biomolecular) systems and its serial nature. The first drawback has been addressed by the introduction of a convergent variant known as well-tempered metadynamics. The second was addressed by the introduction of a parallel multisystem metadynamics referred to as altruistic metadynamics. Here, we combine both approaches into well-tempered altruistic metadynamics. We provide mathematical arguments and trial simulations to show that it accurately predicts free energy surfaces.
Doll, J.; Dupuis, P.; Nyquist, P.
2017-02-08
Parallel tempering, or replica exchange, is a popular method for simulating complex systems. The idea is to run parallel simulations at different temperatures and, at a given swap rate, exchange configurations between the parallel simulations. From the perspective of large deviations it is optimal to let the swap rate tend to infinity, and it is possible to construct a corresponding simulation scheme, known as infinite swapping. In this paper we propose a novel use of large deviations for empirical measures for a more detailed analysis of the infinite swapping limit in the setting of continuous time jump Markov processes. Using the large deviations rate function and associated stochastic control problems, we consider a diagnostic based on temperature assignments, which can be easily computed during a simulation. We show that the convergence of this diagnostic to its a priori known limit is a necessary condition for the convergence of infinite swapping. The rate function is also used to investigate the impact of asymmetries in the underlying potential landscape, and where in the state space poor sampling is most likely to occur.
Study of the temperature configuration of parallel tempering for the traveling salesman problem
NASA Astrophysics Data System (ADS)
Hasegawa, Manabu
The effective temperature configuration of parallel tempering (PT) in finite-time optimization is studied for the solution of the traveling salesman problem. An experimental analysis is conducted to determine the relative importance of the two characteristic temperatures: the specific-heat-peak temperature referred to in the general guidelines, and the effective intermediate temperature identified in a recent study on simulated annealing (SA). The results show that, contrary to conventional belief, operation near the former has no notable significance, whereas operation near the latter plays a crucial role in fulfilling the optimization function of PT. This suggests that PT shares the same origin of effectiveness as SA and SA-related algorithms.
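For context, the "general guidelines" referenced above typically place replicas on a geometric ladder spanning the temperature range of interest, so that swap acceptance between neighbors stays roughly uniform when the specific heat varies slowly. A small sketch with placeholder endpoints (the study's actual configurations are not reproduced here):

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures with a constant ratio between neighbors."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio**k for k in range(n_replicas)]

# e.g. a ladder intended to bracket both characteristic temperatures
print(geometric_ladder(0.1, 10.0, 8))
```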
NASA Astrophysics Data System (ADS)
Fang, Ye; Feng, Sheng; Tam, Ka-Ming; Yun, Zhifeng; Moreno, Juana; Ramanujam, J.; Jarrell, Mark
2014-10-01
Monte Carlo simulations of the Ising model play an important role in the field of computational statistical physics, and they have revealed many properties of the model over the past few decades. However, the effect of frustration due to random disorder, in particular the possible spin glass phase, remains a crucial but poorly understood problem. One of the obstacles in the Monte Carlo simulation of random frustrated systems is their long relaxation time, which makes an efficient parallel implementation on state-of-the-art computation platforms highly desirable. The Graphics Processing Unit (GPU) is such a platform, providing an opportunity to significantly enhance the computational performance and thus gain new insight into this problem. In this paper, we present optimization and tuning approaches for the CUDA implementation of the spin glass simulation on GPUs. We discuss the integration of various design alternatives, such as GPU kernel construction with minimal communication, memory tiling, and look-up tables. We present a binary data format, Compact Asynchronous Multispin Coding (CAMSC), which provides an additional 28.4% speedup compared with the traditionally used Asynchronous Multispin Coding (AMSC). Our overall design sustains a performance of 33.5 ps per spin flip attempt for simulating the three-dimensional Edwards-Anderson model with parallel tempering, which significantly improves the performance over existing GPU implementations.
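The multispin-coding idea behind AMSC/CAMSC is that one machine word stores the same site's spin for many replicas, so a single XOR evaluates a bond for all of them at once. A toy NumPy sketch of that data layout for a 2D ±J model follows; it is a didactic illustration only, not the paper's CUDA kernels or its compact bit format.

```python
import numpy as np

L, BITS = 16, 64        # 16 x 16 lattice; 64 replicas packed into each uint64 word
rng = np.random.default_rng(0)

# bit k of spins[i, j] is the spin at site (i, j) in replica k (1 = up, 0 = down)
spins = np.zeros((L, L), dtype=np.uint64)
for k in range(BITS):
    spins |= rng.integers(0, 2, size=(L, L), dtype=np.uint64) << np.uint64(k)

# +/-J disorder shared by all replicas: an all-ones word marks an antiferro bond
AF = np.uint64(0xFFFFFFFFFFFFFFFF)
hbond = np.where(rng.random((L, L)) < 0.5, AF, np.uint64(0))
vbond = np.where(rng.random((L, L)) < 0.5, AF, np.uint64(0))

def energies(spins):
    """Per-replica Edwards-Anderson energy: one XOR per bond sets the
    'unsatisfied' bit of all 64 replicas at once; E = 2*unsat - 2*L*L."""
    unsat_h = spins ^ np.roll(spins, -1, axis=1) ^ hbond
    unsat_v = spins ^ np.roll(spins, -1, axis=0) ^ vbond
    out = np.empty(BITS)
    for k in range(BITS):
        mask = np.uint64(1) << np.uint64(k)
        n_unsat = ((unsat_h & mask) != 0).sum() + ((unsat_v & mask) != 0).sum()
        out[k] = 2.0 * int(n_unsat) - 2.0 * L * L
    return out

print(energies(spins)[:4])
```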
Linking Well-Tempered Metadynamics Simulations with Experiments
Barducci, Alessandro; Bonomi, Massimiliano; Parrinello, Michele
2010-01-01
Linking experiments with the atomistic resolution provided by molecular dynamics simulations can shed light on the structure and dynamics of protein-disordered states. The sampling limitations of classical molecular dynamics can be overcome using metadynamics, which is based on the introduction of a history-dependent bias on a small number of suitably chosen collective variables. Even if such bias distorts the probability distribution of the other degrees of freedom, the equilibrium Boltzmann distribution can be reconstructed using a recently developed reweighting algorithm. Quantitative comparison with experimental data is thus possible. Here we show the potential of this combined approach by characterizing the conformational ensemble explored by a 13-residue helix-forming peptide by means of a well-tempered metadynamics/parallel tempering approach and comparing the reconstructed nuclear magnetic resonance scalar couplings with experimental data. PMID:20441734
Essential slow degrees of freedom in protein-surface simulations: A metadynamics investigation.
Prakash, Arushi; Sprenger, K G; Pfaendtner, Jim
2018-03-29
Many proteins exhibit strong binding affinities to surfaces, with binding energies much greater than thermal fluctuations. When modelling these protein-surface systems with classical molecular dynamics (MD) simulations, the large forces that exist at the protein/surface interface generally confine the system to a single free energy minimum. Exploring the full conformational space of the protein, especially finding other stable structures, becomes prohibitively expensive. Coupling MD simulations with metadynamics (enhanced sampling) has fast become a common method for sampling the adsorption of such proteins. In this paper, we compare three different flavors of metadynamics, specifically well-tempered, parallel-bias, and parallel-tempering in the well-tempered ensemble, to exhaustively sample the conformational surface-binding landscape of model peptide GGKGG. We investigate the effect of mobile ions and ion charge, as well as the choice of collective variable (CV), on the binding free energy of the peptide. We make the case for explicitly biasing ions to sample the true binding free energy of biomolecules when the ion concentration is high and the binding free energies of the solute and ions are similar. We also make the case for choosing CVs that apply bias to all atoms of the solute to speed up calculations and obtain the maximum possible amount of information about the system. Copyright © 2017 Elsevier Inc. All rights reserved.
Parallel tempering Monte Carlo simulations of lysozyme orientation on charged surfaces
NASA Astrophysics Data System (ADS)
Xie, Yun; Zhou, Jian; Jiang, Shaoyi
2010-02-01
In this work, the parallel tempering Monte Carlo (PTMC) algorithm is applied to accurately and efficiently identify the global-minimum-energy orientation of a protein adsorbed on a surface in a single simulation. When applying the PTMC method to simulate lysozyme orientation on charged surfaces, it is found that lysozyme could easily be adsorbed on negatively charged surfaces with "side-on" and "back-on" orientations. When driven by dominant electrostatic interactions, lysozyme tends to be adsorbed on negatively charged surfaces with the side-on orientation, for which the active site of lysozyme faces sideways. The side-on orientation agrees well with the experimental results where the adsorbed orientation of lysozyme is determined by electrostatic interactions. As the contribution from van der Waals interactions gradually dominates, the back-on orientation becomes the preferred one. For this orientation, the active site of lysozyme faces outward, which conforms to the experimental results where the orientation of adsorbed lysozyme is co-determined by electrostatic interactions and van der Waals interactions. It is also found that, despite its net positive charge, lysozyme could be adsorbed on positively charged surfaces with both "end-on" and back-on orientations, owing to the nonuniform charge distribution over the lysozyme surface and the screening effect from ions in solution. The PTMC simulation method provides a way to determine the preferred orientation of proteins on surfaces for biosensor and biomaterial applications.
Optimal estimates of free energies from multistate nonequilibrium work data.
Maragakis, Paul; Spichty, Martin; Karplus, Martin
2006-03-17
We derive the optimal estimates of the free energies of an arbitrary number of thermodynamic states from nonequilibrium work measurements; the work data are collected from forward and reverse switching processes and obey a fluctuation theorem. The maximum likelihood formulation properly reweights all pathways contributing to a free energy difference and is directly applicable to simulations and experiments. We demonstrate dramatic gains in efficiency by combining the analysis with parallel tempering simulations for alchemical mutations of model amino acids.
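The two-state case of this maximum-likelihood estimator is the familiar Bennett acceptance ratio, whose self-consistent equation balances Fermi-function sums over forward and reverse work values. A sketch follows, assuming work in units of kT and solving by bisection; the synthetic Gaussian test data are constructed to satisfy the fluctuation theorem with ΔF = 2.

```python
import numpy as np

def bar_delta_f(w_forward, w_reverse, lo=-100.0, hi=100.0):
    """Two-state maximum-likelihood (Bennett acceptance ratio) estimate of
    Delta F from forward and reverse work values in units of kT. The
    bracketing interval is an illustrative assumption."""
    wf = np.asarray(w_forward, float)
    wr = np.asarray(w_reverse, float)
    M = np.log(len(wf) / len(wr))

    def fermi(x):
        return 1.0 / (1.0 + np.exp(np.clip(x, -500, 500)))

    def imbalance(df):
        # at the maximum-likelihood solution the two Fermi sums balance
        return fermi(M + wf - df).sum() - fermi(-M + wr + df).sum()

    for _ in range(200):            # bisection on a monotone function of df
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# synthetic check: Gaussian work distributions consistent with the
# fluctuation theorem for dF = 2 kT and variance 2 (kT)^2
rng = np.random.default_rng(0)
dF, var = 2.0, 2.0
wf = rng.normal(dF + var / 2, np.sqrt(var), 5000)
wr = rng.normal(-dF + var / 2, np.sqrt(var), 5000)
print(bar_delta_f(wf, wr))          # should be close to 2.0
```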
A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data.
Liang, Faming; Kim, Jinsu; Song, Qifan
2016-01-01
Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, typically requiring a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for taming powerful MCMC methods for big data analysis; that is, the full-data log-likelihood is replaced by a Monte Carlo average of log-likelihoods calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively.
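A minimal sketch of the BMH idea for a toy Gaussian-mean posterior: the full-data log-likelihood in the Metropolis-Hastings ratio is replaced by an average over pre-drawn bootstrap subsamples. The subsample size m and the n/m rescaling are illustrative assumptions, not necessarily the paper's exact weighting.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(3.0, 1.0, 100_000)      # "big" dataset; unknown mean, unit variance
k, m, n = 20, 2_000, len(data)
subs = [rng.choice(data, size=m, replace=True) for _ in range(k)]  # scorable in parallel

def surrogate_loglik(theta):
    """Monte Carlo average of per-subsample log-likelihoods, rescaled to
    the full-data size, in place of a full-data scan."""
    return np.mean([(n / m) * np.sum(-0.5 * (s - theta) ** 2) for s in subs])

def bmh_chain(n_iter=3000, step=0.005):
    theta = float(subs[0].mean())         # start near the data; flat prior assumed
    loglik, chain = surrogate_loglik(theta), []
    for _ in range(n_iter):
        prop = theta + step * rng.normal()             # random-walk proposal
        loglik_prop = surrogate_loglik(prop)
        if np.log(rng.random()) < loglik_prop - loglik:  # Metropolis-Hastings test
            theta, loglik = prop, loglik_prop
        chain.append(theta)
    return np.asarray(chain)

print(bmh_chain()[1000:].mean())          # close to the true mean 3.0
```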
Structure of aqueous proline via parallel tempering molecular dynamics and neutron diffraction.
Troitzsch, R Z; Martyna, G J; McLain, S E; Soper, A K; Crain, J
2007-07-19
The structure of aqueous L-proline amino acid has been the subject of much debate centering on the validity of various proposed models, which differ widely in the extent to which local and long-range correlations are present. Here, aqueous proline is investigated by atomistic, replica exchange molecular dynamics simulations, and the results are compared to neutron diffraction and small angle neutron scattering (SANS) data, which have been reported recently (McLain, S.; Soper, A.; Terry, A.; Watts, A. J. Phys. Chem. B 2007, 111, 4568). Comparisons between neutron experiments and simulation are made via the static structure factor S(Q), which is measured and computed for several systems with different H/D isotopic compositions at a 1:20 molar ratio. Several different empirical water models (TIP3P, TIP4P, and SPC/E) in conjunction with the CHARMM22 force field are investigated. Agreement between experiment and simulation is reasonably good across the entire Q range, although there are significant model-dependent variations in some cases. In general, agreement is improved slightly upon application of approximate quantum corrections obtained from gas-phase path integral simulations. Dimers and short oligomeric chains formed by hydrogen bonds (frequently bifurcated) coexist with apolar (hydrophobic) contacts; these emerge as the dominant local motifs in the mixture. Evidence for long-range association is more equivocal: no long-range structures form spontaneously in the MD simulations, and no obvious low-Q signature is seen in the SANS data. Moreover, associations introduced artificially as an initial condition, to replicate a long-standing proposed mesoscale structure for proline correlations, are annealed out by parallel tempering MD simulations. However, some small residual aggregates do remain, implying a greater degree of long-range order than is apparent in the SANS data.
Enhanced Sampling in the Well-Tempered Ensemble
NASA Astrophysics Data System (ADS)
Bonomi, M.; Parrinello, M.
2010-05-01
We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the fly by a recently developed reweighting method [M. Bonomi et al., J. Comput. Chem. 30, 1615 (2009)]. We apply WTE and its parallel tempering variant to the 2d Ising model and to a Gō model of HIV protease, demonstrating in these two representative cases that convergence is accelerated by orders of magnitude.
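The mechanics behind WTE are those of well-tempered metadynamics with the potential energy itself as the collective variable: each deposited Gaussian is shrunk by the bias already accumulated at the current energy, and the bias factor γ sets how much the energy fluctuations are amplified. A bare sketch follows; parameter values are placeholders, and a production code would also handle forces and reweighting.

```python
import math

class WellTemperedEnergyBias:
    """Well-tempered metadynamics bias V(E) on the potential energy E.
    gamma = (T + dT)/T is the bias factor; w0 and sigma are the initial
    hill height and width (illustrative defaults)."""

    def __init__(self, w0=1.0, sigma=50.0, gamma=20.0, kT=1.0):
        self.w0, self.sigma, self.gamma, self.kT = w0, sigma, gamma, kT
        self.hills = []                                  # (center, height) pairs

    def value(self, energy):
        return sum(h * math.exp(-0.5 * ((energy - c) / self.sigma) ** 2)
                   for c, h in self.hills)

    def deposit(self, energy):
        # tempering: the new hill is damped by the bias already present here,
        # so energy fluctuations converge to those of the ensemble amplified by gamma
        damp = math.exp(-self.value(energy) / ((self.gamma - 1.0) * self.kT))
        self.hills.append((energy, self.w0 * damp))

bias = WellTemperedEnergyBias()
for e in (1000.0, 1040.0, 1000.0, 980.0):               # energies visited by the MD
    bias.deposit(e)
print(round(bias.value(1000.0), 3))
```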
Population annealing with weighted averages: A Monte Carlo method for rough free-energy landscapes
NASA Astrophysics Data System (ADS)
Machta, J.
2010-08-01
The population annealing algorithm introduced by Hukushima and Iba is described. Population annealing combines simulated annealing and Boltzmann weighted differential reproduction within a population of replicas to sample equilibrium states. Population annealing gives direct access to the free energy. It is shown that unbiased measurements of observables can be obtained by weighted averages over many runs with weight factors related to the free-energy estimate from the run. Population annealing is well suited to parallelization and may be a useful alternative to parallel tempering for systems with rough free-energy landscapes such as spin glasses. The method is demonstrated for spin glasses.
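The reweighting-and-resampling step at the heart of population annealing, together with the free-energy increment it yields, can be sketched in a few lines; between such steps each replica would be relaxed by ordinary MCMC at the new temperature. Names and the multinomial resampling choice are illustrative.

```python
import numpy as np

def population_annealing_step(configs, energies, beta_old, beta_new, rng):
    """Reweight replicas by exp(-(beta_new - beta_old) * E), resample the
    population in proportion, and return the increment of beta*F estimated
    as -ln of the mean weight (the direct free-energy access noted above)."""
    d_beta = beta_new - beta_old
    shift = energies.min()                              # for numerical stability
    w = np.exp(-d_beta * (energies - shift))
    d_beta_f = -np.log(w.mean()) + d_beta * shift       # increment of beta*F
    counts = rng.multinomial(len(configs), w / w.sum())
    idx = np.repeat(np.arange(len(configs)), counts)
    return [configs[i] for i in idx], energies[idx], d_beta_f

# toy usage with stand-in configurations; real code would now run MCMC at beta_new
rng = np.random.default_rng(0)
configs, energies = list(range(1000)), rng.normal(0.0, 1.0, 1000)
configs, energies, d_beta_f = population_annealing_step(configs, energies, 1.0, 1.2, rng)
print(len(configs), round(d_beta_f, 3))
```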
Coslovich, Daniele; Ozawa, Misaki; Kob, Walter
2018-05-17
The physical behavior of glass-forming liquids presents complex features of both dynamic and thermodynamic nature. Some studies indicate the presence of thermodynamic anomalies and of crossovers in the dynamic properties, but their origin and degree of universality is difficult to assess. Moreover, conventional simulations are barely able to cover the range of temperatures at which these crossovers usually occur. To address these issues, we simulate the Kob-Andersen Lennard-Jones mixture using efficient protocols based on multi-CPU and multi-GPU parallel tempering. Our setup enables us to probe the thermodynamics and dynamics of the liquid at equilibrium well below the critical temperature of the mode-coupling theory, T_MCT. We find that below T_MCT the analysis is hampered by partial crystallization of the metastable liquid, which nucleates extended regions populated by large particles arranged in an fcc structure. By filtering out crystalline samples, we reveal that the specific heat grows in a regular manner down to temperatures well below T_MCT. Possible thermodynamic anomalies suggested by previous studies can thus occur only in a region of the phase diagram where the system is highly metastable. Using the equilibrium configurations obtained from the parallel tempering simulations, we perform molecular dynamics and Monte Carlo simulations to probe the equilibrium dynamics down to temperatures well below T_MCT. A temperature-derivative analysis of the relaxation time and diffusion data allows us to assess different dynamic scenarios around T_MCT. Hints of a dynamic crossover come from analysis of the four-point dynamic susceptibility. Finally, we discuss possible future numerical strategies to clarify the nature of crossover phenomena in glass-forming liquids.
Mori, Yoshiharu; Okamoto, Yuko
2013-02-01
A simulated tempering method, referred to as simulated-tempering umbrella sampling, is proposed for calculating the free energies of chemical reactions. First-principles molecular dynamics simulations with this simulated tempering were performed to study the intramolecular proton-transfer reaction of malonaldehyde in aqueous solution. Conformational sampling in reaction-coordinate space can be easily enhanced with this method, and the free energy along a reaction coordinate can be calculated accurately. Moreover, simulated-tempering umbrella sampling provides trajectory data more efficiently than the conventional umbrella sampling method.
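In simulated tempering the temperature index itself is a dynamical variable, updated by a Metropolis test against the stationary weight exp(-beta_k * E + g_k); the weights g_k must be tuned or learned for roughly uniform visitation of the ladder. A minimal sketch of the temperature move, with the weights assumed given (combining such moves with an umbrella potential along the reaction coordinate gives the flavor described in the abstract):

```python
import math
import random

def temperature_move(k, energy, betas, g):
    """Attempt to move the temperature index k -> k +/- 1 given the current
    potential energy; g[k] are the simulated-tempering weights (ideally
    close to beta_k * F_k), assumed pre-tuned here."""
    k_new = k + random.choice((-1, 1))
    if not 0 <= k_new < len(betas):
        return k
    log_acc = -(betas[k_new] - betas[k]) * energy + (g[k_new] - g[k])
    if log_acc >= 0 or random.random() < math.exp(log_acc):
        return k_new
    return k

# usage inside a sampler: after every block of MD/MC at betas[k], call
# k = temperature_move(k, current_energy, betas, g)
```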
Effects of Polymer Conjugation on Hybridization Thermodynamics of Oligonucleic Acids.
Ghobadi, Ahmadreza F; Jayaraman, Arthi
2016-09-15
In this work, we perform coarse-grained (CG) and atomistic simulations to study the effects of polymer conjugation on hybridization/melting thermodynamics of oligonucleic acids (ONAs). We present coarse-grained Langevin molecular dynamics simulations (CG-NVT) to assess the effects of the polymer flexibility, length, and architecture on hybridization/melting of ONAs with different ONA duplex sequences, backbone chemistry, and duplex concentration. In these CG-NVT simulations, we use our recently developed CG model of ONAs in implicit solvent, and treat the conjugated polymer as a CG chain with purely repulsive Weeks-Chandler-Andersen interactions with all other species in the system. We find that 8-100-mer linear polymer conjugation destabilizes 8-mer ONA duplexes with weaker Watson-Crick hydrogen bonding (WC H-bonding) interactions at low duplex concentrations, while the same polymer conjugation has an insignificant impact on 8-mer ONA duplexes with stronger WC H-bonding. To ensure the configurational space is sampled properly in the CG-NVT simulations, we also perform CG well-tempered metadynamics simulations (CG-NVT-MetaD) and analyze the free energy landscape of ONA hybridization for a select few systems. We demonstrate that CG-NVT-MetaD simulation results are consistent with the CG-NVT simulations for the studied systems. To examine the limitations of coarse-graining in capturing ONA-polymer interactions, we perform atomistic parallel tempering metadynamics simulations in the well-tempered ensemble (AA-MetaD) for a 4-mer DNA in explicit water with and without conjugation to 8-mer poly(ethylene glycol) (PEG). AA-MetaD simulations also show that, for a short DNA duplex at T = 300 K, a condition where the DNA duplex is unstable, conjugation with PEG further destabilizes the DNA duplex. We conclude with a comparison of results from these three different types of simulations and discuss their limitations and strengths.
Exhaustively sampling peptide adsorption with metadynamics.
Deighan, Michael; Pfaendtner, Jim
2013-06-25
Simulating the adsorption of a peptide or protein and obtaining quantitative estimates of thermodynamic observables remains challenging for many reasons. One reason is the dearth of molecular scale experimental data available for validating such computational models. We also lack simulation methodologies that effectively address the dual challenges of simulating protein adsorption: overcoming strong surface binding and sampling conformational changes. Unbiased classical simulations do not address either of these challenges. Previous attempts that apply enhanced sampling generally focus on only one of the two issues, leaving the other to chance or brute force computing. To improve our ability to accurately resolve adsorbed protein orientation and conformational states, we have applied the Parallel Tempering Metadynamics in the Well-Tempered Ensemble (PTMetaD-WTE) method to several explicitly solvated protein/surface systems. We simulated the adsorption behavior of two peptides, LKα14 and LKβ15, onto two self-assembled monolayer (SAM) surfaces with carboxyl and methyl terminal functionalities. PTMetaD-WTE proved effective at achieving rapid convergence of the simulations, whose results elucidated different aspects of peptide adsorption, including binding free energies, side-chain orientations, and preferred conformations. We investigated how specific molecular features of the surface/protein interface change the shape of the multidimensional peptide binding free energy landscape. Additionally, we compared our enhanced sampling technique with umbrella sampling and also evaluated three commonly used molecular dynamics force fields.
Application of the DMRG in two dimensions: a parallel tempering algorithm
NASA Astrophysics Data System (ADS)
Hu, Shijie; Zhao, Jize; Zhang, Xuefeng; Eggert, Sebastian
The Density Matrix Renormalization Group (DMRG) is known to be a powerful algorithm for treating one-dimensional systems. When the DMRG is applied in two dimensions, however, the convergence becomes much less reliable, and "metastable states" typically appear, which are unfortunately quite robust even when a very high number of DMRG states is kept. To overcome this problem we have developed a parallel tempering DMRG algorithm. Similar to parallel tempering in quantum Monte Carlo, this algorithm allows the systematic switching of DMRG states between different model parameters, which is very efficient for solving convergence problems. Using this method we have determined the phase diagram of the xxz model on the anisotropic triangular lattice, which can be realized by hardcore bosons in optical lattices. Supported by SFB Transregio 49 of the Deutsche Forschungsgemeinschaft (DFG) and the Allianz für Hochleistungsrechnen Rheinland-Pfalz (AHRP).
Slepoy, A; Peters, M D; Thompson, A P
2007-11-30
Molecular dynamics and other molecular simulation methods rely on a potential energy function, based only on the relative coordinates of the atomic nuclei. Such a function, called a force field, approximately represents the electronic structure interactions of a condensed matter system. Developing such approximate functions and fitting their parameters remains an arduous, time-consuming process, relying on expert physical intuition. To address this problem, a functional programming methodology was developed that may enable automated discovery of entirely new force-field functional forms, while simultaneously fitting parameter values. The method uses a combination of genetic programming, Metropolis Monte Carlo importance sampling and parallel tempering, to efficiently search a large space of candidate functional forms and parameters. The methodology was tested using a nontrivial problem with a well-defined globally optimal solution: a small set of atomic configurations was generated and the energy of each configuration was calculated using the Lennard-Jones pair potential. Starting with a population of random functions, our fully automated, massively parallel implementation of the method reproducibly discovered the original Lennard-Jones pair potential by searching for several hours on 100 processors, sampling only a minuscule portion of the total search space. This result indicates that, with further improvement, the method may be suitable for unsupervised development of more accurate force fields with completely new functional forms. Copyright (c) 2007 Wiley Periodicals, Inc.
Schwerdtfeger, Peter; Smits, Odile; Pahl, Elke; Jerabek, Paul
2018-06-12
State-of-the-art relativistic coupled-cluster theory is used to construct many-body potentials for the rare gas element radon in order to determine its bulk properties, including the solid-to-liquid phase transition, from parallel tempering Monte Carlo simulations through either direct sampling of the bulk or a finite cluster approach. The calculated melting temperatures are 201(3) K and 201(6) K from bulk simulations and from extrapolation of finite cluster values, respectively. This is in excellent agreement with the often debated (but widely cited) and only available value of 202 K, dating back to measurements by Gray and Ramsay in 1909. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Ghorai, Sankar; Chaudhury, Pinaki
2018-05-30
We have used a replica exchange Monte Carlo procedure, popularly known as parallel tempering, to study the problem of Coulomb explosion in homogeneous Ar and Xe dicationic clusters as well as mixed Ar-Xe dicationic clusters of varying sizes and relative compositions. All the clusters studied carry two units of positive charge. The simulations reveal that in all cases there is a cutoff size below which the clusters fragment. For pure Ar the cutoff is around 95 atoms, while for Xe it is 55. For the mixed clusters, with increasing Xe content the cutoff limit for suppression of Coulomb explosion gradually decreases from 95 for a pure Ar cluster to 55 for a pure Xe cluster; the hallmark of this study is this smooth progression. All the clusters are simulated using the reliable potential energy surface developed by Gay and Berne (Gay and Berne, Phys. Rev. Lett. 1982, 49, 194). For the hetero clusters, we also discuss two different charge distributions, that is, one in which both positive charges sit on two Xe atoms and another where the two charges sit on one Xe atom and one Ar atom. The observed fragmentation patterns are such that single ionic ejections are the favored dissociation channel. © 2017 Wiley Periodicals, Inc.
Liquid-liquid transition in the ST2 model of water
NASA Astrophysics Data System (ADS)
Debenedetti, Pablo
2013-03-01
We present clear evidence of the existence of a metastable liquid-liquid phase transition in the ST2 model of water. Using four different techniques (the weighted histogram analysis method with single-particle moves, well-tempered metadynamics with single-particle moves, weighted histograms with parallel tempering and collective particle moves, and conventional molecular dynamics), we calculate the free energy surface over a range of thermodynamic conditions, we perform a finite size scaling analysis for the free energy barrier between the coexisting liquid phases, we demonstrate the attainment of diffusive behavior, and we perform stringent thermodynamic consistency checks. The results provide conclusive evidence of a first-order liquid-liquid transition. We also show that structural equilibration in the sluggish low-density phase is attained over the time scale of our simulations, and that crystallization times are significantly longer than structural equilibration, even under deeply supercooled conditions. We place our results in the context of the theory of metastability.
Monte Carlo simulation of Hamaker nanospheres coated with dipolar particles
NASA Astrophysics Data System (ADS)
Meyra, Ariel G.; Zarragoicoechea, Guillermo J.; Kuz, Victor A.
2012-01-01
Parallel tempering Monte Carlo simulation is carried out on systems of N attractive Hamaker spheres dressed with n dipolar particles that are able to move on the surface of the spheres. Different cluster configurations emerge for given values of the control parameters. The energy per sphere and the pair distribution functions of spheres and dipoles, as functions of temperature, density, external electric field, and/or the angular orientation of the dipoles, are used to analyse the state of aggregation of the system. As a consequence of the non-central interaction, the model predicts complex structures such as the self-assembly of spheres by a double crown of dipoles. This interesting result could help in understanding some recent experiments in colloidal science and biology.
Modeling and Simulation of Quenching and Tempering Process in steels
NASA Astrophysics Data System (ADS)
Deng, Xiaohu; Ju, Dongying
Quenching and tempering (Q&T) is a combined heat treatment process used to achieve maximum toughness and ductility at a specified hardness and strength. It is important to develop a mathematical model of the quenching and tempering process to satisfy mechanical property requirements at low cost. This paper presents a modified model to predict structural evolution and hardness distribution during the quenching and tempering of steels. The model takes into account tempering parameters, carbon content, and isothermal and non-isothermal transformations. Moreover, precipitation of transition carbides, decomposition of retained austenite, and precipitation of cementite can be simulated separately. Hardness distributions of the quenched and tempered workpiece are predicted by an experimental regression equation. To validate the model, it is employed to predict the tempering of 80MnCr5 steel. The predicted precipitation dynamics of transition carbides and cementite are consistent with previous experimental and simulated results from the literature. The model is then implemented within the framework of the developed simulation code COSMAP to simulate microstructure, stress, and distortion in the heat-treated component, and applied to simulate the Q&T process of J55 steel. The calculated results show good agreement with the experimental ones, indicating that the model is effective for simulating the Q&T process of steels.
Marsili, Simone; Signorini, Giorgio Federico; Chelli, Riccardo; Marchi, Massimo; Procacci, Piero
2010-04-15
We present the new release of the ORAC engine (Procacci et al., J Comput Chem 1997, 18, 1834), a FORTRAN suite to simulate complex biosystems at the atomistic level. The previous release of the ORAC code included multiple time step integration, the smooth particle mesh Ewald method, and constant pressure and constant temperature simulations. The present release has been supplemented with the most advanced techniques for enhanced sampling in atomistic systems, including replica exchange with solute tempering, metadynamics, and steered molecular dynamics. All these computational technologies have been implemented for parallel architectures using the standard MPI communication protocol. ORAC is an open-source program distributed free of charge under the GNU general public license (GPL) at http://www.chim.unifi.it/orac. © 2009 Wiley Periodicals, Inc.
[Simulation on the seasonal growth patterns of grassland plant communities in northern China].
Zhang, Li; Zheng, Yuan-Run
2008-10-01
Soil moisture is the key factor limiting the productivity of grassland in northern China, which ranges from arid to subhumid regions. In this paper, the seasonal and annual growth, foliage projective cover (FPC), evaporative coefficient (k), and net primary productivity (NPP) of 7 types of grasslands in North China were simulated using a simple model based on well established ecological processes of water balance and on climatic data collected at 460 sites over 40 years. The observed NPPs were used to validate the model, and the simulated NPPs were in good agreement with the observed ones. The simulated k, NPP, and FPC decreased from east to west in temperate grasslands, and from southeast to northwest on the Qinghai-Tibet Plateau, reflecting the moisture gradient in northern China. Alpine meadow had the highest k, NPP, and FPC of the 7 grassland types; alpine steppe had the second highest FPC but an NPP similar to that of temperate steppe; and the three simulated parameters of temperate desert were the smallest. The simulated results suggested that the livestock density should be lower than 5.2, 2.3, 3.6, 2.1, 1.0, 0.6, and 0.2 sheep unit x hm(-2), while the coverage of rehabilitated vegetation should be about 93%, 79%, 56%, 50%, 44%, 38%, and 37% in alpine meadow, alpine steppe, temperate meadow steppe, temperate steppe, temperate desert steppe, temperate steppe desert, and temperate desert, respectively.
Granato, Enzo
2008-07-11
Phase coherence and vortex order in a Josephson-junction array at irrational frustration are studied by extensive Monte Carlo simulations using the parallel-tempering method. A scaling analysis of the correlation length of phase variables in the fully equilibrated system shows that the critical temperature vanishes with a power-law divergent correlation length and critical exponent ν_ph, in agreement with recent results from resistivity scaling analysis. A similar scaling analysis for vortex variables reveals a different critical exponent ν_v, suggesting that there are two distinct correlation lengths associated with a decoupled zero-temperature phase transition.
Exploring the Energy Landscapes of Protein Folding Simulations with Bayesian Computation
Burkoff, Nikolas S.; Várnai, Csilla; Wells, Stephen A.; Wild, David L.
2012-01-01
Nested sampling is a Bayesian sampling technique developed to explore probability distributions localized in an exponentially small area of the parameter space. The algorithm provides both posterior samples and an estimate of the evidence (marginal likelihood) of the model. The nested sampling algorithm also provides an efficient way to calculate free energies and the expectation value of thermodynamic observables at any temperature, through a simple post processing of the output. Previous applications of the algorithm have yielded large efficiency gains over other sampling techniques, including parallel tempering. In this article, we describe a parallel implementation of the nested sampling algorithm and its application to the problem of protein folding in a Gō-like force field of empirical potentials that were designed to stabilize secondary structure elements in room-temperature simulations. We demonstrate the method by conducting folding simulations on a number of small proteins that are commonly used for testing protein-folding procedures. A topological analysis of the posterior samples is performed to produce energy landscape charts, which give a high-level description of the potential energy surface for the protein folding simulations. These charts provide qualitative insights into both the folding process and the nature of the model and force field used. PMID:22385859
Free Energy Landscape of GAGA and UUCG RNA Tetraloops.
Bottaro, Sandro; Banáš, Pavel; Šponer, Jiří; Bussi, Giovanni
2016-10-20
We report the folding thermodynamics of ccUUCGgg and ccGAGAgg RNA tetraloops using atomistic molecular dynamics simulations. We obtain a previously unreported estimate of the folding free energy using parallel tempering in combination with well-tempered metadynamics. A key ingredient is the use of a recently developed metric distance, eRMSD, as a biased collective variable. We find that the native fold of both tetraloops is not the global free energy minimum with the Amber χOL3 force field. The estimated folding free energies are 30.2 ± 0.5 kJ/mol for UUCG and 7.5 ± 0.6 kJ/mol for GAGA, in striking disagreement with experimental data. We evaluate the viability of all possible one-dimensional backbone force field corrections. We find that disfavoring the gauche+ region of the α and ζ angles consistently improves the existing force field. The level of accuracy achieved with these corrections, however, cannot be considered sufficient judging on the basis of available thermodynamic data and solution experiments.
Olson, Mark A
2018-01-22
Intrinsically disordered proteins are characterized by their large manifold of thermally accessible conformations and their related statistical weights, making them an interesting target of simulation studies. To assess the development of a computational framework for modeling this distinct class of proteins, this work examines temperature-based replica-exchange simulations to generate a conformational ensemble of a 28-residue peptide from the Ebola virus protein VP35. Starting from a prefolded helix-β-turn-helix topology observed in a crystallographic assembly, the simulation strategy tested is the recently refined CHARMM36m force field combined with a generalized Born solvent model. A comparison of two replica-exchange methods is provided, where one is a traditional approach with a fixed set of temperatures and the other is an adaptive scheme in which the thermal windows are allowed to move in temperature space. The assessment is further extended to include a comparison with equivalent CHARMM22 simulation data sets. The analysis finds CHARMM36m to shift the minimum in the potential of mean force (PMF) to a lower fractional helicity compared with CHARMM22, while the latter showed greater conformational plasticity along the helix-forming reaction coordinate. Among the simulation models, only the adaptive tempering method with CHARMM36m found an ensemble of conformational heterogeneity consisting of transitions between α-helix-β-hairpin folds and unstructured states that produced a PMF of fractional fold propensity in qualitative agreement with circular dichroism experiments reporting a disordered peptide.
NASA Astrophysics Data System (ADS)
Karimi, Hamed; Rosenberg, Gili; Katzgraber, Helmut G.
2017-10-01
We present and apply a general-purpose, multistart algorithm for improving the performance of low-energy samplers used for solving optimization problems. The algorithm iteratively fixes the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are smaller and less connected, and samplers tend to give better low-energy samples for these problems. The algorithm is trivially parallelizable since each start in the multistart algorithm is independent, and could be applied to any heuristic solver that can be run multiple times to give a sample. We present results for several classes of hard problems solved using simulated annealing, path-integral quantum Monte Carlo, parallel tempering with isoenergetic cluster moves, and a quantum annealer, and show that the success metrics and the scaling are improved substantially. When combined with this algorithm, the quantum annealer's scaling was substantially improved for native Chimera graph problems. In addition, with this algorithm the scaling of the time to solution of the quantum annealer is comparable to the Hamze-de Freitas-Selby algorithm on the weak-strong cluster problems introduced by Boixo et al. Parallel tempering with isoenergetic cluster moves was able to consistently solve three-dimensional spin glass problems with 8000 variables when combined with our method, whereas without our method it could not solve any.
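The variable-fixing step can be sketched as follows: draw a batch of low-energy samples from any heuristic solver, then clamp the variables whose values persist across most of the batch and re-solve the smaller residual problem. The sampler interface and the agreement threshold below are hypothetical, not the paper's exact criterion.

```python
import numpy as np

def fix_and_reduce(sampler, n_vars, n_samples=50, threshold=0.9, rng=None):
    """One multistart round: clamp every +/-1 variable whose value agrees
    in at least `threshold` of the low-energy samples; the rest stay free."""
    rng = rng or np.random.default_rng()
    samples = np.array([sampler(rng) for _ in range(n_samples)])
    mean = samples.mean(axis=0)               # in [-1, 1]; |mean| = 2p - 1
    fixed = {i: int(np.sign(mean[i]))
             for i in range(n_vars) if abs(mean[i]) >= 2 * threshold - 1}
    free = [i for i in range(n_vars) if i not in fixed]
    return fixed, free                        # clamp `fixed`, re-solve over `free`

# toy demo: a "solver" whose low-energy states mostly agree on the first half
def toy_sampler(rng):
    s = rng.choice([-1, 1], size=20)
    s[:10] = 1                                # strongly pinned variables
    return s

fixed, free = fix_and_reduce(toy_sampler, 20)
print(sorted(fixed), len(free))
```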
Population Annealing Monte Carlo for Frustrated Systems
NASA Astrophysics Data System (ADS)
Amey, Christopher; Machta, Jonathan
Population annealing is a sequential Monte Carlo algorithm that efficiently simulates equilibrium systems with rough free energy landscapes such as spin glasses and glassy fluids. A large population of configurations is initially thermalized at high temperature and then cooled to low temperature according to an annealing schedule. The population is kept in thermal equilibrium at every annealing step via resampling configurations according to their Boltzmann weights. Population annealing is comparable to parallel tempering in terms of efficiency, but has several distinct and useful features. In this talk I will give an introduction to population annealing and present recent progress in understanding its equilibration properties and optimizing it for spin glasses. Results from large-scale population annealing simulations for the Ising spin glass in 3D and 4D will be presented. NSF Grant DMR-1507506.
Off-diagonal expansion quantum Monte Carlo
NASA Astrophysics Data System (ADS)
Albash, Tameem; Wagenbreth, Gene; Hen, Itay
2017-12-01
We propose a Monte Carlo algorithm designed to simulate quantum as well as classical systems at equilibrium, bridging the algorithmic gap between quantum and classical thermal simulation algorithms. The method is based on a decomposition of the quantum partition function that can be viewed as a series expansion about its classical part. We argue that the algorithm not only provides a theoretical advancement in the field of quantum Monte Carlo simulations, but is optimally suited to tackle quantum many-body systems that exhibit a range of behaviors from "fully quantum" to "fully classical," in contrast to many existing methods. We demonstrate the advantages, sometimes by orders of magnitude, of the technique by comparing it against existing state-of-the-art schemes such as path integral quantum Monte Carlo and stochastic series expansion. We also illustrate how our method allows for the unification of quantum and classical thermal parallel tempering techniques into a single algorithm and discuss its practical significance.
The Metropolis Monte Carlo method with CUDA enabled Graphic Processing Units
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hall, Clifford; School of Physics, Astronomy, and Computational Sciences, George Mason University, 4400 University Dr., Fairfax, VA 22030; Ji, Weixiao
2014-02-01
We present a CPU–GPU system for runtime acceleration of large molecular simulations using GPU computation and memory swaps. The memory architecture of the GPU can be used both as container for simulation data stored on the graphics card and as floating-point code target, providing an effective means for the manipulation of atomistic or molecular data on the GPU. To fully take advantage of this mechanism, efficient GPU realizations of algorithms used to perform atomistic and molecular simulations are essential. Our system implements a versatile molecular engine, including inter-molecule interactions and orientational variables for performing the Metropolis Monte Carlo (MMC) algorithm, which is one type of Markov chain Monte Carlo. By combining memory objects with floating-point code fragments we have implemented an MMC parallel engine that entirely avoids the communication time of molecular data at runtime. Our runtime acceleration system is a forerunner of a new class of CPU–GPU algorithms exploiting memory concepts combined with threading for avoiding bus bandwidth and communication. The testbed molecular system used here is a condensed phase system of oligopyrrole chains. A benchmark shows a size scaling speedup of 60 for systems with 210,000 pyrrole monomers. Our implementation can easily be combined with MPI to connect in parallel several CPU–GPU duets.
Highlights:
• We parallelize the Metropolis Monte Carlo (MMC) algorithm on one CPU–GPU duet.
• The Adaptive Tempering Monte Carlo employs MMC and profits from this CPU–GPU implementation.
• Our benchmark shows a size scaling-up speedup of 62 for systems with 225,000 particles.
• The testbed involves a polymeric system of oligopyrroles in the condensed phase.
• The CPU–GPU parallelization includes dipole–dipole and Mie–Jones classic potentials.
Entropic stabilization of isolated beta-sheets.
Dugourd, Philippe; Antoine, Rodolphe; Breaux, Gary; Broyer, Michel; Jarrold, Martin F
2005-04-06
Temperature-dependent electric deflection measurements have been performed for a series of unsolvated alanine-based peptides (Ac-WA(n)-NH(2), where Ac = acetyl, W = tryptophan, A = alanine, and n = 3, 5, 10, 13, and 15). The measurements are interpreted using Monte Carlo simulations performed with a parallel tempering algorithm. Despite alanine's high helix propensity in solution, the results suggest that unsolvated Ac-WA(n)-NH(2) peptides with n > 10 adopt beta-sheet conformations at room temperature. Previous studies have shown that protonated alanine-based peptides adopt helical or globular conformations in the gas phase, depending on the location of the charge. Thus, the charge more than anything else controls the structure.
Simulated Annealing in the Variable Landscape
NASA Astrophysics Data System (ADS)
Hasegawa, Manabu; Kim, Chang Ju
An experimental analysis is conducted to test whether an appropriately chosen smoothness-temperature schedule enhances the optimizing ability of the MASSS method, the combination of the Metropolis algorithm (MA) and the search-space smoothing (SSS) method. The test is performed on two types of random traveling salesman problems. The results show that the optimization performance of the MA is substantially improved by a single smoothing alone, and slightly more by a single smoothing with cooling and by a de-smoothing process with heating. The performance is also compared to that of the parallel tempering method, and a clear advantage of the smoothing idea is observed, depending on the problem.
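Search-space smoothing deforms the problem instance itself rather than the acceptance rule: distances are pushed toward their mean at large smoothing strength and relaxed back to the true instance on a schedule. A sketch in the spirit of the original SSS transform, assuming distances normalized so that deviations from the mean are at most 1:

```python
import numpy as np

def smoothed_distances(d, alpha):
    """Flatten the TSP landscape: deviations from the mean distance are
    raised to the power alpha (>= 1), which shrinks them when |d - dbar| <= 1;
    alpha = 1 recovers the original instance."""
    d = np.asarray(d, float)
    dev = d - d.mean()
    return d.mean() + np.sign(dev) * np.abs(dev) ** alpha

# de-smoothing schedule: optimize on alpha = 4, then 2, then 1, seeding each
# stage with the best tour found in the previous one
d = np.random.default_rng(0).random((5, 5))
d = (d + d.T) / 2
print(smoothed_distances(d, 3).round(3))
```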
Luo, Di; Mu, Yuguang
2016-06-09
G-quadruplex is a noncanonical yet crucial secondary structure of nucleic acids, which has proven important in cell aging, anticancer therapies, gene expression, and genome stability. In this study, the stability and folding dynamics of human telomeric DNA G-quadruplexes were investigated via enhanced sampling techniques. First, temperature replica exchange MD (REMD) simulations were employed to compare the thermal stabilities of the five established folding topologies. The hybrid-2 type adopted by the extended human telomeric sequence is revealed to be the most stable conformation in our simulations. Next, the free energy landscapes and folding intermediates of the hybrid-1 and -2 types were investigated with parallel tempering metadynamics simulations in the well-tempered ensemble. It was observed that the N-glycosidic conformations of guanines can flip over to accommodate the cyclic Hoogsteen H-bonding on G-tetrads in which they were not originally involved. Furthermore, a hairpin and a triplex intermediate were identified for the folding of the hybrid-1 type conformation, whereas for the hybrid-2 type no folding intermediates were observed on its free energy surface. However, the energy barrier from its native topology to the transition structure is found to be extremely high compared to that of the hybrid-1 type, which is consistent with our stability predictions from the REMD simulations. We hope the insights presented in this work can help complement the current understanding of the stability and dynamics of G-quadruplexes, which is necessary not only to stabilize the structures but also to intervene in their formation in the genome.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walker, Anthony P; Hanson, Paul J; DeKauwe, Martin G
2014-01-01
Free Air CO2 Enrichment (FACE) experiments provide a remarkable wealth of data to test the sensitivities of terrestrial ecosystem models (TEMs). In this study, a broad set of 11 TEMs were compared to 22 years of data from two contrasting FACE experiments in temperate forests of the southeastern US: the evergreen Duke Forest and the deciduous Oak Ridge forest. We evaluated the models' ability to reproduce observed net primary productivity (NPP), transpiration and leaf area index (LAI) in ambient CO2 treatments. Encouragingly, many models simulated annual NPP and transpiration within observed uncertainty. Daily transpiration model errors were often related to errors in leaf area phenology and peak LAI. Our analysis demonstrates that the simulation of LAI often drives the simulation of transpiration, and hence there is a need to adopt the most appropriate of hypothesis-driven methods to simulate and predict LAI. Of the three competing hypotheses determining peak LAI, (1) optimisation to maximise carbon export, (2) increasing SLA with canopy depth and (3) the pipe model, the pipe model produced LAI closest to the observations. Modelled phenology was either prescribed or based on broader empirical calibrations to climate. In some cases, simulation accuracy was achieved through compensating biases in component variables. For example, NPP accuracy was sometimes achieved with counter-balancing biases in nitrogen use efficiency and nitrogen uptake. Combined analysis of parallel measurements aids the identification of offsetting biases; without it, over-confidence in model abilities to predict ecosystem function may emerge, potentially leading to erroneous predictions of change under future climates.
Binding Modes of Teixobactin to Lipid II: Molecular Dynamics Study.
Liu, Yang; Liu, Yaxin; Chan-Park, Mary B; Mu, Yuguang
2017-12-08
Teixobactin (TXB) is a newly discovered antibiotic targeting the bacterial cell wall precursor Lipid II (LII). In the present work, four binding modes of TXB on LII were identified by a contact-map based clustering method. The highly flexible binary complex ensemble was generated by parallel tempering metadynamics simulation in a well-tempered ensemble (PTMetaD-WTE). In agreement with experimental findings, the pyrophosphate group and the attached first sugar subunit of LII are found to be the minimal motif for stable TXB binding. Three of the four binding modes involve the ring structure of TXB and have relatively higher binding affinities, indicating the importance of the ring motif of TXB in LII recognition. TXB-LII complexes with a ratio of 2:1 are also predicted, with configurations in which the ring motifs of two TXB molecules bind the pyrophosphate-MurNAc moiety and the glutamic acid residue of one LII, respectively. Our findings disclose that the ring motif of TXB is critical to LII binding and that novel antibiotics can be designed based on its mimetics.
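The well-tempered bias update at the core of PTMetaD-WTE is compact enough to sketch. Below is a minimal one-dimensional illustration of the hill-deposition rule; the grid range, hill parameters, and bias factor gamma are illustrative placeholders, not values from the study.

```python
import numpy as np

def wt_metad_bias(cv_traj, w0=1.2, sigma=0.1, gamma=10.0, kT=2.5,
                  grid=np.linspace(-np.pi, np.pi, 500)):
    """Accumulate a well-tempered metadynamics bias V(s) on a grid.

    cv_traj : collective-variable values at each deposition time
    w0      : initial hill height, sigma : hill width, gamma : bias factor
    """
    V = np.zeros_like(grid)
    for s in cv_traj:
        # Current bias at the deposition point (nearest grid node)
        v_here = V[np.abs(grid - s).argmin()]
        # Well-tempered scaling: hills shrink where bias has accumulated
        height = w0 * np.exp(-v_here / ((gamma - 1.0) * kT))
        V += height * np.exp(-0.5 * ((grid - s) / sigma)**2)
    # In the long-time limit, F(s) is estimated as -(gamma/(gamma-1)) * V(s)
    return V
```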
USDA-ARS?s Scientific Manuscript database
Simulation models can be used to make management decisions when properly parameterized. This study aimed to parameterize the ALMANAC (Agricultural Land Management Alternatives with Numerical Assessment Criteria) crop simulation model for dry bean in the semi-arid temperate areas of Mexico. The par...
Replica exchange with solute tempering: A method for sampling biological systems in explicit water
NASA Astrophysics Data System (ADS)
Liu, Pu; Kim, Byungchan; Friesner, Richard A.; Berne, B. J.
2005-09-01
An innovative replica exchange (parallel tempering) method called replica exchange with solute tempering (REST) for the efficient sampling of aqueous protein solutions is presented here. The method bypasses the poor scaling with system size of standard replica exchange and thus reduces the number of replicas (parallel processes) that must be used. This reduction is accomplished by deforming the Hamiltonian function for each replica in such a way that the acceptance probability for the exchange of replica configurations does not depend on the number of explicit water molecules in the system. For proof of concept, REST is compared with standard replica exchange for an alanine dipeptide molecule in water. The comparisons confirm that REST greatly reduces the number of CPUs required by regular replica exchange and increases the sampling efficiency. This method reduces the CPU time required for calculating thermodynamic averages and for the ab initio folding of proteins in explicit water.
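The key idea, deforming each replica's Hamiltonian so that the water-water energy drops out of the exchange test, can be written explicitly. The following is the scaled potential as we recall it from the REST formulation, with E_pp, E_pw, and E_ww the protein-protein, protein-water, and water-water terms and β_0 the inverse temperature of the replica of interest; treat it as a sketch to be checked against the original paper.

```latex
E_m(X) = E_{pp}(X) + \frac{\beta_0 + \beta_m}{2\beta_m}\,E_{pw}(X)
       + \frac{\beta_0}{\beta_m}\,E_{ww}(X),
\qquad
\Delta_{mn} = (\beta_m - \beta_n)\left[
  \Big(E_{pp} + \tfrac{1}{2}E_{pw}\Big)(X_n) -
  \Big(E_{pp} + \tfrac{1}{2}E_{pw}\Big)(X_m)\right].
```

Since Δ_mn involves only the protein-protein and protein-water terms, the acceptance probability min(1, e^{-Δ_mn}) is independent of the number of explicit water molecules.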
Apfelbeck, Beate; Helm, Barbara; Illera, Juan Carlos; Mortega, Kim G; Smiddy, Patrick; Evans, Neil P
2017-05-22
Latitudinal variation in avian life histories falls along a slow-fast pace of life continuum: tropical species produce small clutches, but have a high survival probability, while in temperate species the opposite pattern is found. This study investigated whether differential investment into reproduction and survival of tropical and temperate species is paralleled by differences in the secretion of the vertebrate hormone corticosterone (CORT). Depending on circulating concentrations, CORT can act both as a metabolic hormone (low to medium levels) and as a stress hormone (high levels) and can thereby influence reproductive decisions. Baseline and stress-induced CORT was measured across sequential stages of the breeding season in males and females of closely related taxa of stonechats (Saxicola spp.) from a wide distribution area. We compared stonechats from 13 sites, representing Canary Islands, European temperate and East African tropical areas. Stonechats are highly seasonal breeders at all these sites, but vary between tropical and temperate regions with regard to reproductive investment and presumably also survival. In accordance with life-history theory, during parental stages, post-capture (baseline) CORT was overall lower in tropical than in temperate stonechats. However, during mating stages, tropical males had elevated post-capture (baseline) CORT concentrations, which did not differ from those of temperate males. Female and male mates of a pair showed correlated levels of post-capture CORT when sampled after simulated territorial intrusions. In contrast to the hypothesis that species with low reproduction and high annual survival should be more risk-sensitive, tropical stonechats had lower stress-induced CORT concentrations than temperate stonechats. We also found relatively high post-capture (baseline) and stress-induced CORT concentrations in slow-paced Canary Islands stonechats. Our data support and refine the view that baseline CORT facilitates energetically demanding activities in males and females and reflects investment into reproduction. Low parental workload was associated with lower post-capture (baseline) CORT, as expected for a slow pace of life in tropical species. On a finer resolution, however, this tropical-temperate contrast did not generally hold. Post-capture (baseline) CORT was higher during mating stages, in particular in tropical males, possibly to support the energetic needs of mate-guarding. Counter to predictions based on life-history theory, our data do not confirm the hypothesis that long-lived tropical populations have higher stress-induced CORT concentrations than short-lived temperate populations. Instead, in the predator-rich tropical environments of African stonechats, a dampened stress response during parental stages may increase survival probabilities of young. Overall our data further support an association between life history and baseline CORT, but challenge the role of stress-induced CORT as a mediator of tropical-temperate variation in life history.
Finite-size polyelectrolyte bundles at thermodynamic equilibrium
NASA Astrophysics Data System (ADS)
Sayar, M.; Holm, C.
2007-01-01
We present the results of extensive computer simulations performed on solutions of monodisperse charged rod-like polyelectrolytes in the presence of trivalent counterions. To overcome energy barriers we used a combination of parallel tempering and hybrid Monte Carlo techniques. Our results show that for small values of the electrostatic interaction the solution mostly consists of dispersed single rods. The potential of mean force between the polyelectrolyte monomers yields an attractive interaction at short distances. For a range of larger values of the Bjerrum length, we find finite-size polyelectrolyte bundles at thermodynamic equilibrium. Further increase of the Bjerrum length eventually leads to phase separation and precipitation. We discuss the origin of the observed thermodynamic stability of the finite-size aggregates.
Free-energy landscape of protein oligomerization from atomistic simulations
Barducci, Alessandro; Bonomi, Massimiliano; Prakash, Meher K.; Parrinello, Michele
2013-01-01
In the realm of protein–protein interactions, the assembly process of homooligomers plays a fundamental role because the majority of proteins fall into this category. A comprehensive understanding of this multistep process requires the characterization of the driving molecular interactions and the transient intermediate species. The latter are often short-lived and thus remain elusive to most experimental investigations. Molecular simulations provide a unique tool to shed light onto these complex processes complementing experimental data. Here we combine advanced sampling techniques, such as metadynamics and parallel tempering, to characterize the oligomerization landscape of fibritin foldon domain. This system is an evolutionarily optimized trimerization motif that represents an ideal model for experimental and computational mechanistic studies. Our results are fully consistent with previous experimental nuclear magnetic resonance and kinetic data, but they provide a unique insight into fibritin foldon assembly. In particular, our simulations unveil the role of nonspecific interactions and suggest that an interplay between thermodynamic bias toward native structure and residual conformational disorder may provide a kinetic advantage. PMID:24248370
Vapor-liquid equilibrium and critical asymmetry of square well and short square well chain fluids.
Li, Liyan; Sun, Fangfang; Chen, Zhitong; Wang, Long; Cai, Jun
2014-08-07
The critical behavior of square well fluids with variable interaction ranges and of short square well chain fluids has been investigated by grand canonical ensemble Monte Carlo simulations. The critical temperatures and densities were estimated by a finite-size scaling analysis with the help of the histogram reweighting technique. The vapor-liquid coexistence curve in the near-critical region was determined using hyper-parallel tempering Monte Carlo simulations. The simulation results for coexistence diameters show that the contribution of |t|^(1-α) to the coexistence diameter dominates the singular behavior in all systems investigated. The contribution of |t|^(2β) to the coexistence diameter is larger for systems with a smaller interaction range λ. For short square well chain fluids, the longer the chain length, the larger the contribution of |t|^(2β). The molecular configuration greatly influences the critical asymmetry: a short soft chain fluid shows weaker critical asymmetry than a stiff chain fluid with the same chain length.
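Histogram reweighting, used above to locate the critical point, rests on a single identity: an energy histogram P_β(E) collected at inverse temperature β determines the distribution at a nearby β'. In its simplest canonical, single-histogram form:

```latex
P_{\beta'}(E) \;=\; \frac{P_{\beta}(E)\, e^{-(\beta'-\beta)E}}
                         {\sum_{E'} P_{\beta}(E')\, e^{-(\beta'-\beta)E'}} .
```

For grand canonical data such as that used here, the exponent also carries the particle-number term (β'μ' - βμ)N.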
Communication: Multiple atomistic force fields in a single enhanced sampling simulation
NASA Astrophysics Data System (ADS)
Hoang Viet, Man; Derreumaux, Philippe; Nguyen, Phuong H.
2015-07-01
The main concerns of biomolecular dynamics simulations are the convergence of the conformational sampling and the dependence of the results on the force fields. While the first issue can be addressed by employing enhanced sampling techniques such as simulated tempering or replica exchange molecular dynamics, repeating these simulations with different force fields is very time consuming. Here, we propose an automatic method that includes different force fields in a single advanced sampling simulation. Conformational sampling using three all-atom force fields is enhanced by simulated tempering, and by formulating the weight parameters of the simulated tempering method in terms of the energy fluctuations, the system is able to perform a random walk in both temperature and force-field spaces. The method is first demonstrated on a 1D system and then validated by the folding of the 10-residue chignolin peptide in explicit water.
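The temperature and force-field moves described above rest on the standard simulated tempering acceptance test, with tier weights g_m (here derived from energy fluctuations) correcting for free-energy differences between tiers. A minimal, generic sketch, with all names hypothetical:

```python
import math, random

def st_tier_accept(E_old, E_new, beta_old, beta_new, g_old, g_new):
    """Simulated tempering jump between tiers at fixed configuration.

    E_old / E_new : potential energy of the current configuration under
                    the current and proposed tier (the tiers may differ
                    in temperature, force field, or both)
    g_old / g_new : dimensionless tier weights (the paper derives these
                    from energy fluctuations; here they are just inputs)
    """
    log_acc = -(beta_new * E_new - beta_old * E_old) + (g_new - g_old)
    return log_acc >= 0.0 or random.random() < math.exp(log_acc)
```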
Continuous Easy-Plane Deconfined Phase Transition on the Kagome Lattice
NASA Astrophysics Data System (ADS)
Zhang, Xue-Feng; He, Yin-Chen; Eggert, Sebastian; Moessner, Roderich; Pollmann, Frank
2018-03-01
We use large scale quantum Monte Carlo simulations to study an extended Hubbard model of hard core bosons on the kagome lattice. In the limit of strong nearest-neighbor interactions at 1/3 filling, the interplay between frustration and quantum fluctuations leads to a valence bond solid ground state. The system undergoes a quantum phase transition to a superfluid phase as the interaction strength is decreased. It is still under debate whether the transition is weakly first order or represents an unconventional continuous phase transition. We present a theory in terms of an easy-plane noncompact CP^1 gauge theory describing the phase transition at 1/3 filling. Utilizing large scale quantum Monte Carlo simulations with parallel tempering in the canonical ensemble up to 15,552 spins, we provide evidence that the phase transition is continuous at exactly 1/3 filling. A careful finite size scaling analysis reveals an unconventional scaling behavior hinting at deconfined quantum criticality.
Spichty, Martin; Taly, Antoine; Hagn, Franz; Kessler, Horst; Barluenga, Sofia; Winssinger, Nicolas; Karplus, Martin
2009-01-01
We determine the binding mode of a macrocyclic radicicol-like oxime to yeast HSP90 by combining computer simulations and experimental measurements. We sample the macrocyclic scaffold of the unbound ligand by parallel tempering simulations and dock the most populated conformations to yeast HSP90. Docking poses are then evaluated by binding free energy estimations with the linear interaction energy method. Comparison of QM/MM-calculated NMR chemical shifts with experimental shift data for a selective subset of backbone 15N provides an additional evaluation criterion. As a last test we check the binding modes against available structure-activity relationships. We find that the most likely binding mode of the oxime to yeast HSP90 is very similar to the known structure of the radicicol-HSP90 complex. PMID:19482409
Temperature-dependent errors in nuclear lattice simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Dean; Thomson, Richard
2007-06-15
We study the temperature dependence of discretization errors in nuclear lattice simulations. We find that for systems with strong attractive interactions the predominant error arises from the breaking of Galilean invariance. We propose a local 'well-tempered' lattice action which eliminates much of this error. The well-tempered action can be readily implemented in lattice simulations for nuclear systems as well as cold atomic Fermi systems.
Water isotope effect on the thermostability of a polio viral RNA hairpin: A metadynamics study.
Pathak, Arup K; Bandyopadhyay, Tusar
2017-04-28
Oral polio vaccine is considered to be the most thermolabile of all the common childhood vaccines. Although heavy water (D2O) has long been known to stabilise attenuated viral RNA against thermodegradation, the molecular underpinnings of its mechanism of action are still lacking. Understanding the basis of D2O action is an important step that might reform the way other thermolabile drugs are stored and could possibly minimize the cold chain problem. Here, using a combination of parallel tempering and well-tempered metadynamics simulation in light water (H2O) and in D2O, we have fully described the free energy surface associated with the folding/unfolding of an RNA hairpin containing a non-canonical basepair motif, which is conserved within the 3'-untranslated region of poliovirus-like enteroviruses. Simulations reveal that in heavy water (D2O) there is a considerable increase in the stability of the folded basin, as monitored through an intramolecular hydrogen bond (HB) and the size, shape, and flexibility of RNA structures. This translates into a melting temperature higher by 41 K in D2O than in light water (H2O). We have explored the hydration dynamics of the RNA, the hydration shell around the RNA surface, and the spatial dependence of RNA-solvent collective HB dynamics in the two water systems. Simulation in heavy water clearly showed that D2O strengthens the HB network in the solvent, lengthens inter-residue water-bridge lifetimes, and weakens dynamical coupling of the hairpin to its solvation environment, which enhances the rigidity of solvent-exposed sites of the native configurations. The results suggest that, like other added osmoprotectants, D2O can act as a thermostabilizer when used as a solvent.
Multidimensional generalized-ensemble algorithms for complex systems.
Mitsutake, Ayori; Okamoto, Yuko
2009-06-07
We give general formulations of the multidimensional multicanonical algorithm, simulated tempering, and replica-exchange method. We generalize the original potential energy function E_0 by adding any physical quantity V of interest as a new energy term. These multidimensional generalized-ensemble algorithms then perform a random walk not only in E_0 space but also in V space. Among the three algorithms, the replica-exchange method is the easiest to perform because the weight factor is just a product of regular Boltzmann-like factors, while the weight factors for the multicanonical algorithm and simulated tempering are not a priori known. We give a simple procedure for obtaining the weight factors for these two latter algorithms, which uses a short replica-exchange simulation and the multiple-histogram reweighting techniques. As an example of applications of these algorithms, we have performed a two-dimensional replica-exchange simulation and a two-dimensional simulated-tempering simulation using an alpha-helical peptide system. From these simulations, we study the helix-coil transitions of the peptide in the gas phase and in aqueous solution.
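For the replica-exchange member of this family, the "product of Boltzmann-like factors" makes the exchange test explicit. With the generalized energy E_λ = E_0 + λV, one standard form of the acceptance exponent for swapping configurations q_m and q_n between parameter sets (β_m, λ_m) and (β_n, λ_n), consistent with the abstract's description though not quoted from it, is

```latex
\Delta = (\beta_m - \beta_n)\left[E_0(q_n) - E_0(q_m)\right]
       + (\beta_m \lambda_m - \beta_n \lambda_n)\left[V(q_n) - V(q_m)\right],
```

with the swap accepted with probability min(1, e^{-Δ}); setting λ_m = λ_n recovers ordinary parallel tempering.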
Chemineau, Philippe; Daveau, Agnès; Cognié, Yves; Aumont, Gilles; Chesneau, Didier
2004-01-01
Background Seasonality of ovulatory activity is observed in European sheep and goat breeds, whereas tropical breeds show almost continuous ovulatory activity. It is not known if these tropical breeds are sensitive or not to temperate photoperiod. This study was therefore designed to determine whether tropical Creole goats and Black-Belly ewes are sensitive to temperate photoperiod. Two groups of adult females in each species, either progeny or directly born from imported embryos, were used and maintained in light-proof rooms under simulated temperate (8 to 16 h of light per day) or tropical (11 – 13 h) photoperiods. Ovulatory activity was determined by blood progesterone assays for more than two years. The experiment lasted 33 months in goats and 25 months in ewes. Results Marked seasonality of ovulatory activity appeared in the temperate group of Creole female goats. The percentage of female goats experiencing at least one ovulation per month dramatically decreased from May to September for the three years (0%, 27% and 0%, respectively). Tropical female goats demonstrated much less seasonality, as the percentage of goats experiencing at least one ovulation per month never went below 56%. These differences were significant. Both groups of temperate and tropical Black-Belly ewes experienced a marked seasonality in their ovulatory activity, with only a slightly significant difference between groups. The percentage of ewes experiencing at least one ovulation per month dropped dramatically in April and rose again in August (tropical ewes) or September (temperate ewes). The percentage of ewes experiencing at least one ovulation per month never went below 8% and 17% (for tropical and temperate ewes respectively) during the spring and summer months. Conclusions An important seasonality in ovulatory activity of tropical Creole goats was observed when females were exposed to a simulated temperate photoperiod. An unexpected finding was that Black-Belly ewes and, to a lesser extent, Creole goats exposed to a simulated tropical photoperiod also showed seasonality in their ovulatory activity. Such results indicate that both species are capable of showing seasonality under the photoperiodic changes of the temperate zone even though they do not originate from these regions. PMID:15333134
NASA Astrophysics Data System (ADS)
Curotto, E.
2015-12-01
Structural optimizations, classical NVT ensemble, and variational Monte Carlo simulations of ion Stockmayer clusters parameterized to approximate the Li+(CH3NO2)n (n = 1-20) systems are performed. The Metropolis algorithm enhanced by the parallel tempering strategy is used to measure internal energies and heat capacities, and a parallel version of the genetic algorithm is employed to obtain the most important minima. The first solvation sheath is octahedral and this feature remains the dominant theme in the structure of clusters with n ≥ 6. The first "magic number" is identified using the adiabatic solvent dissociation energy, and it marks the completion of the second solvation layer for the lithium ion-nitromethane clusters. It corresponds to the n = 18 system, a solvated ion with the first sheath having octahedral symmetry, weakly bound to an eight-membered and a four-membered ring crowning a vertex of the octahedron. Variational Monte Carlo estimates of the adiabatic solvent dissociation energy reveal that quantum effects further enhance the stability of the n = 18 system relative to its neighbors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Hao; Mey, Antonia S. J. S.; Noé, Frank
2014-12-07
We propose a discrete transition-based reweighting analysis method (dTRAM) for analyzing configuration-space-discretized simulation trajectories produced at different thermodynamic states (temperatures, Hamiltonians, etc.). dTRAM provides maximum-likelihood estimates of stationary quantities (probabilities, free energies, expectation values) at any thermodynamic state. In contrast to the weighted histogram analysis method (WHAM), dTRAM does not require data to be sampled from global equilibrium, and can thus produce superior estimates for enhanced sampling data such as parallel/simulated tempering, replica exchange, umbrella sampling, or metadynamics. In addition, dTRAM provides optimal estimates of Markov state models (MSMs) from the discretized state-space trajectories at all thermodynamic states. Under suitable conditions, these MSMs can be used to calculate kinetic quantities (e.g., rates, timescales). In the limit of a single thermodynamic state, dTRAM estimates a maximum likelihood reversible MSM, while in the limit of uncorrelated sampling data, dTRAM is identical to WHAM. dTRAM is thus a generalization of both estimators.
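Since dTRAM reduces to WHAM for uncorrelated data, that limit is worth sketching. Below is a minimal self-consistent iteration over discretized bins; this is the classic WHAM estimator included as a reference point, not the dTRAM algorithm itself.

```python
import numpy as np

def wham(counts, bias, n_iter=2000):
    """Classic WHAM self-consistency loop.

    counts : (K, B) array, samples of thermodynamic state k observed in bin i
    bias   : (K, B) array, reduced bias energies u_k(x_i)
    Returns unbiased bin probabilities pi (B,) and free energies f (K,).
    """
    K, B = counts.shape
    N_k = counts.sum(axis=1)          # samples per thermodynamic state
    C_i = counts.sum(axis=0)          # samples per bin, pooled over states
    f = np.zeros(K)
    for _ in range(n_iter):
        # Unbiased probabilities given the current free-energy estimates
        denom = (N_k[:, None] * np.exp(f[:, None] - bias)).sum(axis=0)
        pi = C_i / denom
        pi /= pi.sum()
        # Free energies given the current probabilities
        f = -np.log((np.exp(-bias) * pi[None, :]).sum(axis=1))
        f -= f[0]                      # fix the arbitrary offset
    return pi, f
```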
Free energy landscape from path-sampling: application to the structural transition in LJ38
NASA Astrophysics Data System (ADS)
Adjanor, G.; Athènes, M.; Calvo, F.
2006-09-01
We introduce a path-sampling scheme that allows equilibrium state-ensemble averages to be computed by means of a biased distribution of non-equilibrium paths. This non-equilibrium method is applied to the case of the 38-atom Lennard-Jones atomic cluster, which has a double-funnel energy landscape. We calculate the free energy profile along the Q4 bond orientational order parameter. At high or moderate temperature the results obtained using the non-equilibrium approach are consistent with those obtained using conventional equilibrium methods, including parallel tempering and Wang-Landau Monte Carlo simulations. At lower temperatures, the non-equilibrium approach becomes more efficient in exploring the relevant inherent structures. In particular, the free energy agrees with the predictions of the harmonic superposition approximation.
Effect of Aspergillus niger xylanase on dough characteristics and bread quality attributes.
Ahmad, Zulfiqar; Butt, Masood Sadiq; Ahmed, Anwaar; Riaz, Muhammad; Sabir, Syed Mubashar; Farooq, Umar; Rehman, Fazal Ur
2014-10-01
The present study was conducted to investigate the impact of various treatments of xylanase produced by Aspergillus niger, applied in bread making processes during tempering of wheat kernels and during dough mixing, on dough quality characteristics (dryness, stiffness, elasticity, extensibility, coherency) and bread quality parameters (volume, specific volume, density, moisture retention and sensory attributes). Different doses (200, 400, 600, 800 and 1,000 IU) of purified enzyme were applied in parallel to 1 kg of wheat grains during tempering and to 1 kg of flour (straight grade flour) during dough mixing. The samples of wheat kernels were agitated at different intervals for uniformity in tempering. After milling and dough making, both types of flour (enzyme-treated during tempering or during mixing) showed improved dough characteristics, but the improvement was more prominent in the samples receiving enzyme treatment during tempering. Moreover, xylanase decreased the dryness and stiffness of the dough, increased its elasticity, extensibility and coherency, increased loaf volume and decreased bread density. Xylanase treatments also resulted in higher moisture retention and improved sensory attributes of the bread. From the results, it is concluded that dough characteristics and bread quality improved significantly in response to enzyme treatment during tempering as compared to application during mixing.
Zhang, Jie; Li, Yongxiang; Zheng, Jun; Zhang, Hongwei; Yang, Xiaohong; Wang, Jianhua; Wang, Guoying
2017-01-01
The extensive genetic variation present in maize (Zea mays) germplasm makes it possible to detect signatures of positive artificial selection that occurred during temperate and tropical maize improvement. Here we report an analysis of 532,815 polymorphisms from a maize association panel consisting of 368 diverse temperate and tropical inbred lines. We developed a gene-oriented approach adapting exonic polymorphisms to identify recently selected alleles by comparing haplotypes across the maize genome. This analysis revealed evidence of selection for more than 1100 genomic regions during recent improvement, and included regulatory genes and key genes with visible mutant phenotypes. We find that selected candidate target genes in temperate maize are enriched in biosynthetic processes, and further examination of these candidates highlights two cases, sucrose flux and oil storage, in which multiple genes in a common pathway can be cooperatively selected. Finally, based on available parallel gene expression data, we hypothesize that some genes were selected for regulatory variations, resulting in altered gene expression. PMID:28099470
NASA Astrophysics Data System (ADS)
Dettmer, J.; Quijano, J. E.; Dosso, S. E.; Holland, C. W.; Mandolesi, E.
2016-12-01
Geophysical seabed properties are important for the detection and classification of unexploded ordnance. However, current surveying methods such as vertical seismic profiling, coring, or inversion are of limited use when surveying large areas with high spatial sampling density. We consider surveys based on a source and receiver array towed by an autonomous vehicle, which produce large volumes of seabed reflectivity data that contain unprecedented and detailed seabed information. The data are analyzed with a particle filter, which requires efficient reflection-coefficient computation, efficient inversion algorithms and efficient use of computer resources. The filter quantifies the information content of multiple sequential data sets by considering results from previous data along the survey track to inform the importance sampling at the current point. Challenges arise from environmental changes along the track, where the number of sediment layers and their properties change. This is addressed by a trans-dimensional model in the filter which allows layering complexity to change along a track. Efficiency is improved by likelihood tempering of various particle subsets and including exchange moves (parallel tempering). The filter is implemented on a hybrid computer that combines central processing units (CPUs) and graphics processing units (GPUs) to exploit three levels of parallelism: (1) fine-grained parallel computation of spherical reflection coefficients with a GPU implementation of Levin integration; (2) updating particles by concurrent CPU processes which exchange information using automatic load balancing (coarse-grained parallelism); (3) overlapping CPU-GPU communication (a major bottleneck) with GPU computation by staggering CPU access to the multiple GPUs. The algorithm is applied to spherical reflection coefficients for data sets along a 14-km track on the Malta Plateau, Mediterranean Sea. We demonstrate substantial efficiency gains over previous methods. [This research was supported in part by the U.S. Department of Defense, through the Strategic Environmental Research and Development Program (SERDP).]
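The likelihood-tempering ingredient can be sketched independently of the geoacoustics: hot particle subsets scale the log likelihood by 1/T, flattening the misfit surface so they mix more freely. A toy sketch follows, in which the Gaussian `log_like` is only a stand-in for the reflection-coefficient forward model.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_like(particles, data):
    # Placeholder Gaussian misfit; the real filter computes spherical
    # reflection coefficients via Levin integration on the GPU.
    return -0.5 * np.sum((particles - data)**2, axis=-1)

def tempered_resample(particles, data, temperature):
    """Importance-weight and resample one particle subset.

    Weights use the likelihood raised to 1/temperature, so subsets at
    high temperature see a flattened misfit and explore more broadly.
    """
    logw = log_like(particles, data) / temperature
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```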
Diffusion control for a tempered anomalous diffusion system using fractional-order PI controllers.
Juan Chen; Zhuang, Bo; Chen, YangQuan; Cui, Baotong
2017-05-09
This paper is concerned with the diffusion control problem of a tempered anomalous diffusion system based on fractional-order PI controllers. The contribution of this paper is to introduce fractional-order PI controllers into the tempered anomalous diffusion system for mobile actuator motion and spraying control. For the proposed control force, convergence analysis of the system described by mobile actuator dynamical equations is presented based on Lyapunov stability arguments. Moreover, a new Centroidal Voronoi Tessellation (CVT) algorithm based on fractional-order PI controllers, henceforth called the FOPI-based CVT algorithm, is provided together with a modified simulation platform called Fractional-Order Diffusion Mobile Actuator-Sensor 2-Dimension Fractional-Order Proportional Integral (FO-Diff-MAS2D-FOPI). Finally, extensive numerical simulations for the tempered anomalous diffusion process are presented to verify the effectiveness of the proposed fractional-order PI controllers.
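A fractional-order PI law replaces the integer-order integral of an ordinary PI controller with a fractional one, u(t) = Kp e(t) + Ki I^λ e(t). One common discretization of I^λ is the Grünwald-Letnikov expansion; the sketch below uses it and is a generic illustration, not the controller design of the paper.

```python
import numpy as np
from scipy.special import binom

def fopi_control(err_hist, Kp=1.0, Ki=0.5, lam=0.8, h=0.01):
    """Fractional-order PI law: u = Kp*e(t) + Ki * I^lam e(t).

    The fractional integral is discretized with Grunwald-Letnikov:
      I^lam e(t_n) ~= h^lam * sum_j (-1)^j C(-lam, j) e(t_{n-j}).
    All gains and the order lam are illustrative placeholders.
    """
    e = np.asarray(err_hist, dtype=float)   # e[0] oldest ... e[-1] newest
    j = np.arange(len(e))
    w = (-1.0) ** j * binom(-lam, j)         # GL weights, w_0 = 1
    frac_int = h ** lam * np.dot(w, e[::-1])
    return Kp * e[-1] + Ki * frac_int
```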
NASA Astrophysics Data System (ADS)
Telasang, Gururaj; Dutta Majumdar, Jyotsna; Wasekar, Nitin; Padmanabham, G.; Manna, Indranil
2015-05-01
This study reports a detailed investigation of the microstructure and mechanical properties (wear resistance and tensile strength) of hardened and tempered AISI H13 tool steel substrate following laser cladding with AISI H13 tool steel powder in as-clad and after post-cladding conventional bulk isothermal tempering [at 823 K (550 °C) for 2 hours] heat treatment. Laser cladding was carried out on AISI H13 tool steel substrate using a 6 kW continuous wave diode laser coupled with fiber delivering an energy density of 133 J/mm2 and equipped with a co-axial powder feeding nozzle capable of feeding powder at the rate of 13.3 × 10^-3 g/mm2. Laser clad zone comprises martensite, retained austenite, and carbides, and measures an average hardness of 600 to 650 VHN. Subsequent isothermal tempering converted the microstructure into one with tempered martensite and uniform dispersion of carbides with a hardness of 550 to 650 VHN. Interestingly, laser cladding introduced residual compressive stress of 670 ± 15 MPa, which reduces to 580 ± 20 MPa following isothermal tempering. Micro-tensile testing with specimens machined from the clad zone across or transverse to cladding direction showed high strength but failure in brittle mode. On the other hand, similar testing with samples sectioned from the clad zone parallel or longitudinal to the direction of laser cladding prior to and after post-cladding tempering recorded lower strength but ductile failure with 4.7 and 8 pct elongation, respectively. Wear resistance of the laser surface clad and post-cladding tempered samples (evaluated by fretting wear testing) registered superior performance as compared to that of conventional hardened and tempered AISI H13 tool steel.
Sidler, Dominik; Cristòfol-Clough, Michael; Riniker, Sereina
2017-06-13
Replica-exchange enveloping distribution sampling (RE-EDS) allows the efficient estimation of free-energy differences between multiple end-states from a single molecular dynamics (MD) simulation. In EDS, a reference state is sampled, which can be tuned by two types of parameters, i.e., smoothness parameter(s) and energy offsets, such that all end-states are sufficiently sampled. However, the choice of these parameters is not trivial. Replica exchange (RE) or parallel tempering is a widely applied technique to enhance sampling. By combining EDS with the RE technique, the parameter choice problem could be simplified and the challenge shifted toward an optimal distribution of the replicas in the smoothness-parameter space. The choice of a certain replica distribution can alter the sampling efficiency significantly. In this work, global round-trip time optimization (GRTO) algorithms are tested for use in RE-EDS simulations. In addition, a local round-trip time optimization (LRTO) algorithm is proposed for systems with slowly adapting environments, where a reliable estimate for the round-trip time is challenging to obtain. The optimization algorithms were applied to RE-EDS simulations of a system of nine small-molecule inhibitors of phenylethanolamine N-methyltransferase (PNMT). The energy offsets were determined using our recently proposed parallel energy-offset (PEOE) estimation scheme. While the multistate GRTO algorithm yielded the best replica distribution for the ligands in water, the multistate LRTO algorithm was found to be the method of choice for the ligands in complex with PNMT. With this, the 36 alchemical free-energy differences between the nine ligands were calculated successfully from a single RE-EDS simulation 10 ns in length. Thus, RE-EDS presents an efficient method for the estimation of relative binding free energies.
Pan, Albert C; Weinreich, Thomas M; Piana, Stefano; Shaw, David E
2016-03-08
Molecular dynamics (MD) simulations can describe protein motions in atomic detail, but transitions between protein conformational states sometimes take place on time scales that are infeasible or very expensive to reach by direct simulation. Enhanced sampling methods, the aim of which is to increase the sampling efficiency of MD simulations, have thus been extensively employed. The effectiveness of such methods when applied to complex biological systems like proteins, however, has been difficult to establish because even enhanced sampling simulations of such systems do not typically reach time scales at which convergence is extensive enough to reliably quantify sampling efficiency. Here, we obtain sufficiently converged simulations of three proteins to evaluate the performance of simulated tempering, a member of a widely used class of enhanced sampling methods that use elevated temperature to accelerate sampling. Simulated tempering simulations with individual lengths of up to 100 μs were compared to (previously published) conventional MD simulations with individual lengths of up to 1 ms. With two proteins, BPTI and ubiquitin, we evaluated the efficiency of sampling of conformational states near the native state, and for the third, the villin headpiece, we examined the rate of folding and unfolding. Our comparisons demonstrate that simulated tempering can consistently achieve a substantial sampling speedup of an order of magnitude or more relative to conventional MD.
[Development of APSIM (agricultural production systems simulator) and its application].
Shen, Yuying; Nan, Zhibiao; Bellotti, Bill; Robertson, Michael; Chen, Wen; Shao, Xinqing
2002-08-01
Soil-crop simulation models are an effective tool for supporting decisions on agricultural management. APSIM (Agricultural Production Systems Simulator) was developed to simulate the biophysical processes in farming systems, particularly the economic and ecological features of systems under climatic risk. The current literature reveals that APSIM can be applied across a wide range of zones, including temperate continental, temperate maritime, sub-tropical, arid, and Mediterranean climates, and on soil types including clay, duplex soil, vertisol, silt sandy, silt loam and silt clay loam. More than 20 crops have been simulated well. APSIM is powerful for describing crop structure, crop sequence, yield prediction, and quality control as well as erosion estimation under different planting patterns.
Morphology and properties of low-carbon bainite
NASA Astrophysics Data System (ADS)
Ohtani, H.; Okaguchi, S.; Fujishiro, Y.; Ohmori, Y.
1990-03-01
The morphology of low-carbon bainite in commercial-grade high-tensile-strength steels, in both isothermal transformation and continuous cooling transformation, is lathlike ferrite elongated in the <111>b direction. Based on carbide distribution, three types of bainite are classified: Type I is carbide-free, Type II has fine carbide platelets lying between laths, and Type III has carbides parallel to a specific ferrite plane. At the initial stage of transformation, upper bainitic ferrite forms a subunit elongated in the [-101]f direction, which is nearly parallel to the [111]b direction, with a parallelogram-shaped cross section. Coalescence of the subunits yields the lathlike bainite with the [-101]f growth direction and a habit plane between (232)f and (111)f. Cementite particles precipitate on the sidewise growth tips of the Type II bainitic ferrite subunit. This results in the cementite platelets aligning parallel to a specific ferrite plane in the laths after coalescence. These morphologies of bainite are the same in various kinds of low-carbon high-strength steels. The lowest brittle-ductile transition temperature and the highest strength were obtained either by Type III bainite or by a bainite/martensite duplex structure, because the crack path is limited by the fine unit microstructure. It should also be noted that the tempered duplex structure has higher strength than tempered martensite in the tempering temperature range between 200 °C and 500 °C. In the case of controlled rolling, the subsequent accelerated cooling produces a complex structure comprised of ferrite, cementite, and martensite as well as Type I bainite. Type I bainite in this structure is refined by controlled rolling and plays a very important role in improving the strength and toughness of low-carbon steels.
Bayesian tomography by interacting Markov chains
NASA Astrophysics Data System (ADS)
Romary, T.
2017-12-01
In seismic tomography, we seek to determine the velocity of the underground from noisy first arrival travel time observations. In most situations, this is an ill posed inverse problem that admits several imperfect solutions. Given an a priori distribution over the parameters of the velocity model, the Bayesian formulation allows us to state this problem as a probabilistic one, with a solution in the form of a posterior distribution. The posterior distribution is generally high dimensional and may exhibit multimodality. Moreover, as it is known only up to a constant, the only sensible way to address this problem is to try to generate simulations from the posterior. The natural tools to perform these simulations are Markov chain Monte Carlo (MCMC) methods. Classical implementations of MCMC algorithms generally suffer from slow mixing: the generated states are slow to enter the stationary regime, that is, to fit the observations, and when one mode of the posterior is eventually identified, it may become difficult to visit others. Using a varying temperature parameter that relaxes the constraint on the data may help to enter the stationary regime. Moreover, the sequential nature of MCMC makes it ill suited to parallel implementation. Running a large number of chains in parallel may be suboptimal, as the information gathered by each chain is not mutualized. Parallel tempering (PT) can be seen as a first attempt to make parallel chains at different temperatures communicate, but they exchange information only between current states. In this talk, I will show that PT actually belongs to a general class of interacting Markov chain algorithms. I will also show that this class enables the design of interacting schemes that can take advantage of the whole history of the chain, by authorizing exchanges toward already visited states. The algorithms will be illustrated with toy examples and an application to first arrival traveltime tomography.
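The interaction used by parallel tempering, exchanging the current states of chains run at neighboring temperatures, is simple to state, which is what makes it a natural special case of the broader interacting-chain class. A generic sketch with a user-supplied log-posterior (all names hypothetical):

```python
import numpy as np

def pt_swap_sweep(states, log_post, betas, rng=None):
    """One sweep of neighbor swaps between tempered chains.

    states   : list of current chain states, one per temperature
    log_post : function m -> log posterior density (untempered)
    betas    : decreasing inverse temperatures, betas[0] = 1
    """
    if rng is None:
        rng = np.random.default_rng()
    for k in range(len(betas) - 1):
        lp_k, lp_k1 = log_post(states[k]), log_post(states[k + 1])
        # Accept with probability min(1, exp((b_k - b_{k+1})(lp_{k+1} - lp_k)))
        log_acc = (betas[k] - betas[k + 1]) * (lp_k1 - lp_k)
        if np.log(rng.random()) < log_acc:
            states[k], states[k + 1] = states[k + 1], states[k]
    return states
```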
Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping
NASA Astrophysics Data System (ADS)
Kubica, Aleksander; Beverland, Michael E.; Brandão, Fernando; Preskill, John; Svore, Krysta M.
2018-05-01
Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the thresholds for 1D stringlike and 2D sheetlike logical operators to be p3DCC^(1) ≃ 1.9% and p3DCC^(2) ≃ 27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.
All-atomic simulations on human telomeric G-quadruplex DNA binding with thioflavin T.
Luo, Di; Mu, Yuguang
2015-04-16
Ligand-stabilized human telomeric G-quadruplex DNA is believed to be an anticancer agent, as it can impede the continuous elongation of telomeres by telomerase in cancer cells. In this study, five well-established human telomeric G-quadruplex DNA models were probed on their binding behaviors with thioflavin T (ThT) via both conventional molecular dynamics (MD) and well-tempered metadynamics (WT-MetaD) simulations. Novel dynamics and characteristic binding patterns were disclosed by the MD simulations. It was observed that the K+-promoted parallel and hybridized human telomeric G-quadruplex conformations show higher binding affinities to ThT than the Na+- and K+-promoted basket conformations. End, sandwich, and base stacking driven by π-π interactions are identified as the major binding mechanisms. As the most energetically favorable binding mode, the sandwich stacking observed in the (3 + 1) hybridized form 1 G-quadruplex conformation is triggered by a reversible conformational change of the G-quadruplex. To further examine the free energy landscapes, WT-MetaD simulations were applied to the G-quadruplex-ThT systems. All of the major binding modes predicted by the MD simulations are confirmed by the WT-MetaD simulations. The results of this work not only accord with existing experimental findings, but also reinforce our understanding of the dynamics of G-quadruplexes and aid the future development of G-quadruplex-stabilizing ligands.
Simulation of Temperature Field Distribution for Cutting Tempered Glass by Ultraviolet Laser
NASA Astrophysics Data System (ADS)
Yang, B. J.; He, Y. C.; Dai, F.; Lin, X. C.
2017-03-01
The finite element software ANSYS was adopted to simulate the temperature field distribution for laser cutting of tempered glass, and the influence of different process parameters, including laser power, glass thickness and cutting speed, on the temperature field distribution was studied in detail. The results show that the laser power has a greater influence on the temperature field distribution than the other parameters; when the laser power reaches 60 W, the highest temperature reaches 749 °C, which is higher than the glass softening temperature. This indicates that the material near the laser spot is melted and the molten slag is removed quickly by the high-energy water beam. Finally, the FEM analysis was verified by a water-guided laser cutting experiment on tempered glass.
Jueterbock, A; Franssen, S U; Bergmann, N; Gu, J; Coyer, J A; Reusch, T B H; Bornberg-Bauer, E; Olsen, J L
2016-11-01
Populations distributed across a broad thermal cline are instrumental in addressing adaptation to increasing temperatures under global warming. Using a space-for-time substitution design, we tested for parallel adaptation to warm temperatures along two independent thermal clines in Zostera marina, the most widely distributed seagrass in the temperate Northern Hemisphere. A North-South pair of populations was sampled along the European and North American coasts and exposed to a simulated heatwave in a common-garden mesocosm. Transcriptomic responses under control, heat stress and recovery were recorded in 99 RNAseq libraries with ~13,000 uniquely annotated, expressed genes. We corrected for phylogenetic differentiation among populations to discriminate neutral from adaptive differentiation. The two southern populations recovered faster from heat stress and showed parallel transcriptomic differentiation, as compared with northern populations. Among 2389 differentially expressed genes, 21 exceeded neutral expectations and were likely involved in parallel adaptation to warm temperatures. However, the strongest differentiation following phylogenetic correction was between the three Atlantic populations and the Mediterranean population with 128 of 4711 differentially expressed genes exceeding neutral expectations. Although adaptation to warm temperatures is expected to reduce sensitivity to heatwaves, the continued resistance of seagrass to further anthropogenic stresses may be impaired by heat-induced downregulation of genes related to photosynthesis, pathogen defence and stress tolerance.
Chodera, John D; Shirts, Michael R
2011-11-21
The widespread popularity of replica exchange and expanded ensemble algorithms for simulating complex molecular systems in chemistry and biophysics has generated much interest in discovering new ways to enhance the phase space mixing of these protocols in order to improve sampling of uncorrelated configurations. Here, we demonstrate how both of these classes of algorithms can be considered as special cases of Gibbs sampling within a Markov chain Monte Carlo framework. Gibbs sampling is a well-studied scheme in the field of statistical inference in which different random variables are alternately updated from conditional distributions. While the update of the conformational degrees of freedom by Metropolis Monte Carlo or molecular dynamics unavoidably generates correlated samples, we show how judicious updating of the thermodynamic state indices--corresponding to thermodynamic parameters such as temperature or alchemical coupling variables--can substantially increase mixing while still sampling from the desired distributions. We show how state update methods in common use can lead to suboptimal mixing, and present some simple, inexpensive alternatives that can increase mixing of the overall Markov chain, reducing simulation times necessary to obtain estimates of the desired precision. These improved schemes are demonstrated for several common applications, including an alchemical expanded ensemble simulation, parallel tempering, and multidimensional replica exchange umbrella sampling.
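The Gibbs sampling view suggests the simplest upgrade over neighbor-only swaps: redraw the thermodynamic state index of a configuration from its full conditional distribution. A minimal sketch, with reduced potentials u_k(x) and log weights f_k assumed given:

```python
import numpy as np

def gibbs_state_update(x, reduced_u, f, rng=None):
    """Redraw the thermodynamic state index from its full conditional.

    reduced_u : function (k, x) -> reduced potential u_k(x), e.g. beta_k * U(x)
    f         : log weights (dimensionless free energies), one per state
    Sampling over all K states at once, rather than proposing only
    neighbor swaps, can substantially improve state-space mixing.
    """
    if rng is None:
        rng = np.random.default_rng()
    K = len(f)
    logp = np.array([f[k] - reduced_u(k, x) for k in range(K)])
    p = np.exp(logp - logp.max())
    return rng.choice(K, p=p / p.sum())
```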
In silico direct folding of thrombin-binding aptamer G-quadruplex at all-atom level
Yang, Changwon; Kulkarni, Mandar; Lim, Manho
2017-01-01
The reversible folding of the thrombin-binding DNA aptamer G-quadruplex (GQ; TBA-15) starting from fully unfolded states was demonstrated using a prolonged time scale (10–12 μs) parallel tempering metadynamics (PTMetaD) simulation method in conjunction with a modified version of the AMBER bsc1 force field. For unbiased descriptions of the folding free energy landscape of TBA-15, this force field was minimally modified. From this direct folding simulation using the modified bsc1 force field, reasonably converged free energy landscapes were obtained in K+-rich aqueous solution (150 mM), providing detailed atomistic pictures of GQ folding mechanisms for TBA-15. This study found that TBA folding occurred via multiple folding pathways, with two major free energy barriers of 13 and 15 kcal/mol, in the presence of several intermediate states of G-triplex variants. The early formation of these intermediates was associated with the capture of a single K+ ion. Interestingly, these intermediate states appear to undergo facile transitions among themselves through relatively small energy barriers. PMID:29112755
On the Helix Propensity in Generalized Born Solvent Descriptions of Modeling the Dark Proteome
Olson, Mark A.
2017-01-01
Intrinsically disordered proteins that populate the so-called “Dark Proteome” offer challenging benchmarks of atomistic simulation methods to accurately model conformational transitions on a multidimensional energy landscape. This work explores the application of parallel tempering with implicit solvent models as a computational framework to capture the conformational ensemble of an intrinsically disordered peptide derived from the Ebola virus protein VP35. A recent X-ray crystallographic study reported a protein-peptide interface where the VP35 peptide underwent a folding transition from a disordered form to a helix-β-turn-helix topological fold upon molecular association with the Ebola protein NP. An assessment is provided of the accuracy of two generalized Born solvent models (GBMV2 and GBSW2) using the CHARMM force field and applied with temperature-based replica exchange dynamics to calculate the disorder propensity of the peptide and its probability density of states in a continuum solvent. A further comparison is presented of applying an explicit/implicit solvent hybrid replica exchange simulation of the peptide to determine the effect of modeling water interactions at the all-atom resolution. PMID:28197405
Sprenger, K G; Pfaendtner, Jim
2016-06-07
Thermodynamic analyses can provide key insights into the origins of protein self-assembly on surfaces, protein function, and protein stability. However, obtaining quantitative measurements of thermodynamic observables from unbiased classical simulations of peptide or protein adsorption is challenging because of sampling limitations brought on by strong biomolecule/surface binding forces as well as time scale limitations. We used the parallel tempering metadynamics in the well-tempered ensemble (PTMetaD-WTE) enhanced sampling method to study the adsorption behavior and thermodynamics of several explicitly solvated model peptide adsorption systems, providing new molecular-level insight into the biomolecule adsorption process. Specifically studied were the peptides LKα14 and LKβ15 and the trp-cage miniprotein adsorbing onto a charged, hydrophilic self-assembled monolayer surface functionalized with a carboxylic acid/carboxylate headgroup and a neutral, hydrophobic methyl-terminated self-assembled monolayer surface. Binding free energies were calculated as a function of temperature for each system and decomposed into their respective energetic and entropic contributions. We investigated how specific interfacial features such as peptide/surface electrostatic interactions and surface-bound ion content affect the thermodynamic landscape of adsorption and lead to differences in surface-bound conformations of the peptides. Results show that upon adsorption to the charged surface, configurational entropy gains of the released solvent molecules dominate the configurational entropy losses of the bound peptide. This behavior leads to an apparent increase in overall system entropy upon binding and therefore to the surprising and seemingly nonphysical result of an apparent increased binding free energy at elevated temperatures. Opposite effects and conclusions are found for the neutral surface. Additional simulations demonstrate that by adjusting the ionic strength of the solution, results that show the expected physical behavior, i.e., peptide binding strength that decreases with increasing temperature or is independent of temperature altogether, can be recovered on the charged surface. On the basis of this analysis, an overall free energy for the entire thermodynamic cycle for peptide adsorption on charged surfaces is constructed and validated with independent simulations.
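The decomposition of binding free energies into energetic and entropic contributions described above follows from the temperature dependence of ΔG. A minimal numerical sketch, with invented placeholder values rather than data from the study:

```python
import numpy as np

# Hypothetical binding free energies (kcal/mol) at several temperatures (K).
T = np.array([300.0, 320.0, 340.0, 360.0])
dG = np.array([-8.1, -8.6, -9.0, -9.5])   # placeholder values only

dS = -np.gradient(dG, T)   # entropy of binding: dS = -d(dG)/dT
dH = dG + T * dS           # enthalpy via dG = dH - T*dS

for t, g, h, s in zip(T, dG, dH, dS):
    print(f"T={t:5.1f} K  dG={g:5.2f}  dH={h:5.2f}  -T*dS={-t*s:5.2f} kcal/mol")
```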
NASA Astrophysics Data System (ADS)
Goulko, Olga; Kent, Adrian
2017-11-01
We introduce and physically motivate the following problem in geometric combinatorics, originally inspired by analysing Bell inequalities. A grasshopper lands at a random point on a planar lawn of area 1. It then jumps once, a fixed distance d, in a random direction. What shape should the lawn be to maximize the chance that the grasshopper remains on the lawn after jumping? We show that, perhaps surprisingly, a disc-shaped lawn is not optimal for any d > 0. We investigate further by introducing a spin model whose ground state corresponds to the solution of a discrete version of the grasshopper problem. Simulated annealing and parallel tempering searches are consistent with the hypothesis that, for d < π^(-1/2), the optimal lawn resembles a cogwheel with n cogs, where the integer n is close to π(arcsin(√π·d/2))^(-1). We find transitions to other shapes for d ≳ π^(-1/2).
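The disc baseline that the cogwheel shapes beat is easy to reproduce numerically. A short Monte Carlo sketch (function name and parameters are illustrative):

```python
import numpy as np

def disc_retention_probability(d, n_samples=1_000_000, seed=0):
    """Estimate the probability that a grasshopper starting at a uniform
    point on a unit-area disc is still on the disc after one jump of
    fixed length d in a uniformly random direction."""
    rng = np.random.default_rng(seed)
    R = 1.0 / np.sqrt(np.pi)                    # radius of a unit-area disc
    r = R * np.sqrt(rng.random(n_samples))      # uniform sampling inside the disc
    phi = 2.0 * np.pi * rng.random(n_samples)
    theta = 2.0 * np.pi * rng.random(n_samples) # jump direction
    x = r * np.cos(phi) + d * np.cos(theta)
    y = r * np.sin(phi) + d * np.sin(theta)
    return np.mean(x * x + y * y <= R * R)

print(disc_retention_probability(0.3))
```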
Three-Dimensional Color Code Thresholds via Statistical-Mechanical Mapping.
Kubica, Aleksander; Beverland, Michael E; Brandão, Fernando; Preskill, John; Svore, Krysta M
2018-05-04
Three-dimensional (3D) color codes have advantages for fault-tolerant quantum computing, such as protected quantum gates with relatively low overhead and robustness against imperfect measurement of error syndromes. Here we investigate the storage threshold error rates for bit-flip and phase-flip noise in the 3D color code (3DCC) on the body-centered cubic lattice, assuming perfect syndrome measurements. In particular, by exploiting a connection between error correction and statistical mechanics, we estimate the threshold for 1D stringlike and 2D sheetlike logical operators to be p_{3DCC}^{(1)}≃1.9% and p_{3DCC}^{(2)}≃27.6%. We obtain these results by using parallel tempering Monte Carlo simulations to study the disorder-temperature phase diagrams of two new 3D statistical-mechanical models: the four- and six-body random coupling Ising models.
Solar radiation-driven inactivation of bacteria, virus and protozoan pathogen models was quantified in simulated drinking water at a temperate latitude (34°S). The water was seeded with Enterococcus faecalis, Clostridium sporogenes spores, and P22 bacteriophage, each at ca 1 x 10...
TEACHING COMPOSITION. WHAT RESEARCH SAYS TO THE TEACHER, NUMBER 18.
ERIC Educational Resources Information Center
BURROWS, ALVINA T.
ALTHOUGH CHILDREN'S NEEDS FOR WRITTEN EXPRESSION PROBABLY PARALLEL THOSE OF ADULTS, THE REASON BEHIND CHILDREN'S CHOICE OF WRITING OVER SPEAKING IN GIVEN INSTANCES IS OPEN TO CONJECTURE. MOREOVER, THE COMMON ASSUMPTION BY TEACHERS THAT CHILDREN CAN AND SHOULD WRITE ABOUT PERSONAL INTERESTS OUGHT TO BE TEMPERED BY THE IDEA THAT MANY INTERESTS ARE…
Liu, Jie; Peng, Chunwang; Yu, Gaobo; Zhou, Jian
2015-10-06
The surrounding conditions, such as surface charge density and ionic strength, play an important role in enzyme adsorption. The adsorption of a nonmodular type-A feruloyl esterase from Aspergillus niger (AnFaeA) on charged surfaces was investigated by parallel tempering Monte Carlo (PTMC) and all-atom molecular dynamics (AAMD) simulations at different surface charge densities (±0.05 and ±0.16 C·m⁻²) and ionic strengths (0.007 and 0.154 M). The adsorption energy, orientation, and conformational changes were analyzed. Simulation results show that whether AnFaeA can adsorb onto a charged surface is mainly controlled by electrostatic interactions between AnFaeA and the charged surface. The electrostatic interactions between AnFaeA and charged surfaces are weakened when the ionic strength increases. The positively charged surface at low surface charge density and high ionic strength conditions can maximize the utilization of the immobilized AnFaeA. The counterion layer plays a key role in the adsorption of AnFaeA on the negatively charged COOH-SAM. The native conformation of AnFaeA is well preserved under all of these conditions. The results of this work can be used for the controlled immobilization of AnFaeA.
TemperSAT: A new efficient fair-sampling random k-SAT solver
NASA Astrophysics Data System (ADS)
Fang, Chao; Zhu, Zheng; Katzgraber, Helmut G.
The set membership problem is of great importance to many applications and, in particular, database searches for target groups. Recently, an approach to speed up set membership searches based on the NP-hard constraint-satisfaction problem (random k-SAT) has been developed. However, the bottleneck of the approach lies in finding the solution to a large SAT formula efficiently and, in particular, a large number of independent solutions is needed to reduce the probability of false positives. Unfortunately, traditional random k-SAT solvers such as WalkSAT are biased when seeking solutions to the Boolean formulas. By porting parallel tempering Monte Carlo to the sampling of binary optimization problems, we introduce a new algorithm (TemperSAT) whose performance is comparable to current state-of-the-art SAT solvers for large k with the added benefit that theoretically it can find many independent solutions quickly. We illustrate our results by comparing to the currently fastest implementation of WalkSAT, WalkSATlm.
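As a rough sketch of the underlying idea — parallel tempering over an energy defined as the number of unsatisfied clauses — consider the following; it is illustrative only and not the authors' TemperSAT implementation.

```python
import numpy as np

def unsat_count(clauses, assign):
    """Energy of an assignment: the number of unsatisfied clauses.
    A clause is a list of signed 1-based literals (+i true, -i false)."""
    return sum(
        not any((lit > 0) == assign[abs(lit) - 1] for lit in clause)
        for clause in clauses
    )

def pt_ksat(clauses, n_vars, betas, sweeps=2000, seed=1):
    rng = np.random.default_rng(seed)
    assigns = [rng.random(n_vars) < 0.5 for _ in betas]   # one replica per beta
    energies = [unsat_count(clauses, a) for a in assigns]
    for _ in range(sweeps):
        for k, beta in enumerate(betas):                  # single-flip Metropolis
            v = rng.integers(n_vars)
            assigns[k][v] ^= True
            e_new = unsat_count(clauses, assigns[k])
            if e_new <= energies[k] or rng.random() < np.exp(-beta * (e_new - energies[k])):
                energies[k] = e_new
            else:
                assigns[k][v] ^= True                     # reject: undo the flip
        for k in range(len(betas) - 1):                   # replica-exchange swaps
            delta = (betas[k] - betas[k + 1]) * (energies[k] - energies[k + 1])
            if delta >= 0 or rng.random() < np.exp(delta):
                assigns[k], assigns[k + 1] = assigns[k + 1], assigns[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]
        if energies[0] == 0:                              # betas[0]: coldest replica
            return assigns[0]
    return None

clauses = [[1, 2, -3], [-1, 3, 4], [2, -4, 5], [-2, -5, 3]]
print(pt_ksat(clauses, n_vars=5, betas=[3.0, 2.0, 1.0, 0.5]))
```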
Red spruce (Picea rubens Sarg.) cold hardiness and freezing injury susceptibility. Chapter 18
Donald H. DeHayes; Paul G. Schaberg; G.Richard Strimbeck
2001-01-01
To survive subfreezing winter temperatures, perennial plant species have evolved tissue-specific mechanisms to undergo changes in freezing tolerance that parallel seasonal variations in climate. As such, most northern temperate tree species, including conifers, are adapted to the habitat and climatic conditions within their natural ranges and suffer little or no...
Understanding Cryptic Pocket Formation in Protein Targets by Enhanced Sampling Simulations.
Oleinikovas, Vladimiras; Saladino, Giorgio; Cossins, Benjamin P; Gervasio, Francesco L
2016-11-02
Cryptic pockets, that is, sites on protein targets that only become apparent when drugs bind, provide a promising alternative to classical binding sites for drug development. Here, we investigate the nature and dynamical properties of cryptic sites in four pharmacologically relevant targets, while comparing the efficacy of various simulation-based approaches in discovering them. We find that the studied cryptic sites do not correspond to local minima in the computed conformational free energy landscape of the unliganded proteins. They thus promptly close in all of the molecular dynamics simulations performed, irrespective of the force field used. Temperature-based enhanced sampling approaches, such as parallel tempering, do not improve the situation, as the entropic term does not help in the opening of the sites. The use of fragment probes helps, as in long simulations it occasionally leads to the opening of, and binding to, the cryptic sites. Our observed mechanism of cryptic site formation is suggestive of an interplay between two classical mechanisms: induced fit and conformational selection. Employing this insight, we developed a novel Hamiltonian replica exchange-based method, "SWISH" (Sampling Water Interfaces through Scaled Hamiltonians), which, combined with probes, resulted in a promising general approach for cryptic site discovery. We also addressed the issue of "false positives" and propose a simple approach to distinguish them from druggable cryptic pockets. Our simulations, whose cumulative sampling time was more than 200 μs, help in clarifying the molecular mechanism of pocket formation, providing a solid basis for the choice of an efficient computational method.
Electrostatics-mediated α-chymotrypsin inhibition by functionalized single-walled carbon nanotubes.
Zhao, Daohui; Zhou, Jian
2017-01-04
The α-chymotrypsin (α-ChT) enzyme is extensively used for studying nanomaterial-induced enzymatic activity inhibition. A recent experimental study reported that carboxylized carbon nanotubes (CNTs) played an important role in regulating the α-ChT activity. In this study, parallel tempering Monte Carlo and molecular dynamics simulations were combined to elucidate the interactions between α-ChT and CNTs in relation to the CNT functional group density. The simulation results indicate that the adsorption and the driving force of α-ChT on different CNTs are contingent on the carboxyl density. Meanwhile, minor secondary structural changes are observed in adsorption processes. It is revealed that α-ChT interacts with pristine CNTs through hydrophobic forces and exhibits a non-competitive characteristic with the active site facing towards the solution; while it binds to carboxylized CNTs with the active pocket through a dominant electrostatic association, which causes enzymatic activity inhibition in a competitive-like mode. These findings are in line with experimental results, and well interpret the activity inhibition of α-ChT at the molecular level. Moreover, this study would shed light on the detailed mechanism of specific recognition and regulation of α-ChT by other functionalized nanomaterials.
Huang, Shan; Roy, Kaustuv; Valentine, James W; Jablonski, David
2015-04-21
Paleontological data provide essential insights into the processes shaping the spatial distribution of present-day biodiversity. Here, we combine biogeographic data with the fossil record to investigate the roles of parallelism (similar diversities reached via changes from similar starting points), convergence (similar diversities reached from different starting points), and divergence in shaping the present-day latitudinal diversity gradients of marine bivalves along the two North American coasts. Although both faunas show the expected overall poleward decline in species richness, the trends differ between the coasts, and the discrepancies are not explained simply by present-day temperature differences. Instead, the fossil record indicates that both coasts have declined in overall diversity over the past 3 My, but the western Atlantic fauna suffered more severe Pliocene-Pleistocene extinction than did the eastern Pacific. Tropical western Atlantic diversity remains lower than the eastern Pacific, but warm temperate western Atlantic diversity recovered to exceed that of the temperate eastern Pacific, either through immigration or in situ origination. At the clade level, bivalve families shared by the two coasts followed a variety of paths toward today's diversities. The drivers of these lineage-level differences remain unclear, but species with broad geographic ranges during the Pliocene were more likely than geographically restricted species to persist in the temperate zone, suggesting that past differences in geographic range sizes among clades may underlie between-coast contrasts. More detailed comparative work on regional extinction intensities and selectivities, and subsequent recoveries (by in situ speciation or immigration), is needed to better understand present-day diversity patterns and model future changes.
Zhu, Jing; Fry, James D.
2018-01-01
The natural habitat of Drosophila melanogaster Meigen (Diptera: Drosophilidae) is fermenting fruits, which can be rich in ethanol. For unknown reasons, temperate populations of this cosmopolitan species have higher ethanol resistance than tropical populations. To determine whether this difference is accompanied by a parallel difference in preference for ethanol, we compared two European and two tropical African populations in feeding and oviposition preference for ethanol-supplemented medium. Although females of all populations laid significantly more eggs on medium with ethanol than on control medium, preference of European females for ethanol increased as ethanol concentration increased from 2 to 6%, whereas that of African females decreased. In feeding tests, African females preferred control medium over medium with 4% ethanol, whereas European females showed no preference. Males of all populations strongly preferred control medium. The combination of preference for ethanol in oviposition, and avoidance or neutrality in feeding, gives evidence that adults choose breeding sites with ethanol for the benefit of larvae, rather than for their own benefit. The stronger oviposition preference for ethanol of temperate than tropical females suggests that this benefit may be more important in temperate populations. Two possible benefits of ethanol for which there is some experimental evidence are cryoprotection and protection against natural enemies.
Jason B. Fellman; Eran Hood; Richard T. Edwards; Jeremy B. Jones
2009-01-01
Dissolved organic matter (DOM) is an important component of aquatic food webs. We compare the uptake kinetics for NH4-N and different fractions of DOM during soil and salmon leachate additions by evaluating the uptake of organic forms of carbon (DOC) and nitrogen (DON), and proteinaceous DOM, as measured by parallel factor (PARAFAC) modeling of...
NASA Astrophysics Data System (ADS)
Yu, Hao; Zhou, Tao
The heat treatment applied during the manufacturing of induction bend pipe was simulated. The evolution of ferrite, M/A islands and substructure after tempering at 500–700 °C was characterized by means of optical microscopy, the positron annihilation technique, SEM, TEM, XRD and EBSD. The mechanical performance was evaluated by tensile tests, Charpy V-notch impact tests (-20 °C) and Vickers hardness tests (10 kgf). Microstructure observations showed that fine and homogeneous M/A islands as well as dislocation packages in the quasi-polygonal ferrite matrix after tempering at 600–650 °C generated an optimal combination of strength and toughness. After tempering at 700 °C, the yield strength decreased dramatically. EBSD analysis indicated that the effective grain size diminished as the tempering temperature increased, which raises the energy cost of microcrack propagation and thereby improves impact toughness. Dislocation analysis suggested that the decrease and pile-up of dislocations benefited the combination of strength and toughness.
Sun, Rui; Dama, James F; Tan, Jeffrey S; Rose, John P; Voth, Gregory A
2016-10-11
Metadynamics is an important enhanced sampling technique in molecular dynamics simulation to efficiently explore potential energy surfaces. The recently developed transition-tempered metadynamics (TTMetaD) has been proven to converge asymptotically without sacrificing exploration of the collective variable space in the early stages of simulations, unlike other convergent metadynamics (MetaD) methods. We have applied TTMetaD to study the permeation of drug-like molecules through a lipid bilayer to further investigate the usefulness of this method as applied to problems of relevance to medicinal chemistry. First, ethanol permeation through a lipid bilayer was studied to compare TTMetaD with nontempered metadynamics and well-tempered metadynamics. The bias energies computed from various metadynamics simulations were compared to the potential of mean force calculated from umbrella sampling. Though all of the MetaD simulations agree with one another asymptotically, TTMetaD is able to predict the most accurate and reliable estimate of the potential of mean force for permeation in the early stages of the simulations and is robust to the choice of required additional parameters. We also show that using multiple randomly initialized replicas allows convergence analysis and also provides an efficient means to converge the simulations in shorter wall times and, more unexpectedly, in shorter CPU times; splitting the CPU time between multiple replicas appears to lead to less overall error. After validating the method, we studied the permeation of a more complicated drug-like molecule, trimethoprim. Three sets of TTMetaD simulations with different choices of collective variables were carried out, and all converged within feasible simulation time. The minimum free energy paths showed that TTMetaD was able to predict almost identical permeation mechanisms in each case despite significantly different definitions of collective variables.
NASA Astrophysics Data System (ADS)
Laloy, Eric; Linde, Niklas; Jacques, Diederik; Mariethoz, Grégoire
2016-04-01
The sequential geostatistical resampling (SGR) algorithm is a Markov chain Monte Carlo (MCMC) scheme for sampling from possibly non-Gaussian, complex spatially-distributed prior models such as geologic facies or categorical fields. In this work, we highlight the limits of standard SGR for posterior inference of high-dimensional categorical fields with realistically complex likelihood landscapes and benchmark a parallel tempering implementation (PT-SGR). Our proposed PT-SGR approach is demonstrated using synthetic (error corrupted) data from steady-state flow and transport experiments in categorical 7575- and 10,000-dimensional 2D conductivity fields. In both case studies, every SGR trial gets trapped in a local optimum while PT-SGR maintains a higher diversity in the sampled model states. The advantage of PT-SGR is most apparent in an inverse transport problem where the posterior distribution is made bimodal by construction. PT-SGR then converges towards the appropriate data misfit much faster than SGR and partly recovers the two modes. In contrast, for the same computational resources SGR does not fit the data to the appropriate error level and hardly produces a locally optimal solution that looks visually similar to one of the two reference modes. Although PT-SGR clearly surpasses SGR in performance, our results also indicate that using a small number (16-24) of temperatures (and thus parallel cores) may not permit complete sampling of the posterior distribution by PT-SGR within a reasonable computational time (less than 1-2 weeks).
Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics.
Bonomi, M; Barducci, A; Parrinello, M
2009-08-01
Metadynamics is a widely used and successful method for reconstructing the free-energy surface of complex systems as a function of a small number of suitably chosen collective variables. This is achieved by biasing the dynamics of the system. The bias acting on the collective variables distorts the probability distribution of the other variables. Here we present a simple reweighting algorithm for recovering the unbiased probability distribution of any variable from a well-tempered metadynamics simulation. We show the efficiency of the reweighting procedure by reconstructing the distribution of the four backbone dihedral angles of alanine dipeptide from two- and even one-dimensional metadynamics simulations.
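For context, the well-tempered bias that this reweighting procedure starts from can be sketched in one dimension. Parameters, units, and the synthetic trajectory below are assumptions for illustration only.

```python
import numpy as np

kB = 0.0019872  # kcal/(mol*K), assumed unit system

def well_tempered_bias(traj_s, grid, T=300.0, dT=1200.0, w0=0.1, sigma=0.2, stride=500):
    """Deposit Gaussians on a single collective variable s with heights
    tempered by exp(-V(s)/(kB*dT)), the defining rule of well-tempered
    metadynamics; the bias then converges to -dT/(T+dT) times the free energy."""
    V = np.zeros_like(grid)
    for s in traj_s[::stride]:
        w = w0 * np.exp(-np.interp(s, grid, V) / (kB * dT))
        V += w * np.exp(-((grid - s) ** 2) / (2.0 * sigma ** 2))
    return V

grid = np.linspace(-np.pi, np.pi, 200)
traj = np.random.default_rng(0).uniform(-np.pi, np.pi, 500_000)  # synthetic CV trajectory
V = well_tempered_bias(traj, grid)
F = -(300.0 + 1200.0) / 1200.0 * V    # free energy estimate, up to a constant
F -= F.min()
```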
Aggregation of peptides in the tube model with correlated sidechain orientations
NASA Astrophysics Data System (ADS)
Hung, Nguyen Ba; Hoang, Trinh Xuan
2015-06-01
The ability of proteins and peptides to aggregate and form toxic amyloid fibrils is associated with a range of diseases, including BSE (mad cow disease), Alzheimer's and Parkinson's diseases. In this study, we investigate the role of amino acid sequence in the aggregation propensity by using a modified tube model with a new procedure for the hydrophobic interaction. In this model, the amino acid sidechains are not considered explicitly, but their orientations are taken into account in the formation of hydrophobic contacts. Extensive Monte Carlo simulations for systems of short peptides are carried out with the use of the parallel tempering technique. Our results show that the propensity to form aggregates, and the structures of the aggregates, strongly depend on the amino acid sequence and the number of peptides. Some sequences may not aggregate at all at a presumed physiological temperature, while others can easily form fibril-like β-sheet structures. Our study provides insight into the principles of how the formation of amyloid can be governed by amino acid sequence.
Conformational free energies of methyl-α-L-iduronic and methyl-β-D-glucuronic acids in water
NASA Astrophysics Data System (ADS)
Babin, Volodymyr; Sagui, Celeste
2010-03-01
We present a simulation protocol that allows for efficient sampling of the degrees of freedom of a solute in explicit solvent. The protocol involves using a nonequilibrium umbrella sampling method, in this case the recently developed adaptively biased molecular dynamics method, to compute an approximate free energy for the slow modes of the solute in explicit solvent. This approximate free energy is then used to set up a Hamiltonian replica exchange scheme that samples both from biased and unbiased distributions. The final accurate free energy is recovered via the weighted histogram analysis technique applied to all the replicas, and equilibrium properties of the solute are computed from the unbiased trajectory. We illustrate the approach by applying it to the study of the puckering landscapes of the methyl glycosides of α-L-iduronic acid and its C5 epimer β-D-glucuronic acid in water. Substantial savings in computational resources are gained in comparison with the standard parallel tempering method.
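The biased/unbiased exchange at the heart of this protocol uses the standard Hamiltonian replica exchange acceptance test, sketched below under the assumption that each replica's potential can be evaluated on the other's configuration; names are illustrative.

```python
import numpy as np

def hrex_accept(beta, U_a, U_b, x_a, x_b, rng):
    """Acceptance for exchanging configurations x_a and x_b between two
    replicas at the same inverse temperature beta but with different
    potentials U_a and U_b (e.g., one biased by the approximate free
    energy, one unbiased)."""
    delta = beta * (U_a(x_b) + U_b(x_a) - U_a(x_a) - U_b(x_b))
    return delta <= 0.0 or rng.random() < np.exp(-delta)

rng = np.random.default_rng(1)
U_biased = lambda x: 0.5 * x ** 2              # toy biased potential
U_plain = lambda x: 0.5 * x ** 2 + np.sin(x)   # toy unbiased potential
print(hrex_accept(1.0, U_biased, U_plain, 0.3, -0.7, rng))
```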
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Zhiyang; Zhang, Xiong
A dynamic computer simulation is carried out for the climates of 35 cities distributed around the world. The variation of the annual air-conditioning energy loads due to changes in the longwave emissivity and the solar reflectance of the building envelopes is studied to find the most appropriate exterior building finishes in various climates (including a tropical climate, a subtropical climate, a mountain plateau climate, a frigid-temperate climate and a temperate climate). Both the longwave emissivity and the solar reflectance are set from 0.1 to 0.9 with an interval of 0.1 in the simulation. The annual air-conditioning energy load trends of each city are listed in a chart. The results show that both the longwave emissivity and the solar reflectance of building envelopes play significant roles in energy saving for buildings. In tropical climates, the optical parameters of the building exterior surface affect building energy saving most significantly. In the mountain plateau climates and the subarctic climates, the impacts on energy saving in buildings due to changes in the longwave emissivity and the solar reflectance are still considerable, but in the temperate continental climates and the temperate maritime climates, only limited effects are seen.
Rauscher, Sarah; Neale, Chris; Pomès, Régis
2009-10-13
Generalized-ensemble algorithms in temperature space have become popular tools to enhance conformational sampling in biomolecular simulations. A random walk in temperature leads to a corresponding random walk in potential energy, which can be used to cross over energetic barriers and overcome the problem of quasi-nonergodicity. In this paper, we introduce two novel methods: simulated tempering distributed replica sampling (STDR) and virtual replica exchange (VREX). These methods are designed to address the practical issues inherent in the replica exchange (RE), simulated tempering (ST), and serial replica exchange (SREM) algorithms. RE requires a large, dedicated, and homogeneous cluster of CPUs to function efficiently when applied to complex systems. ST and SREM both have the drawback of requiring extensive initial simulations, possibly adaptive, for the calculation of weight factors or potential energy distribution functions. STDR and VREX alleviate the need for lengthy initial simulations, and for synchronization and extensive communication between replicas. Both methods are therefore suitable for distributed or heterogeneous computing platforms. We perform an objective comparison of all five algorithms in terms of both implementation issues and sampling efficiency. We use disordered peptides in explicit water as test systems, for a total simulation time of over 42 μs. Efficiency is defined in terms of both structural convergence and temperature diffusion, and we show that these definitions of efficiency are in fact correlated. Importantly, we find that ST-based methods exhibit faster temperature diffusion and correspondingly faster convergence of structural properties compared to RE-based methods. Within the RE-based methods, VREX is superior to both SREM and RE. On the basis of our observations, we conclude that ST is ideal for simple systems, while STDR is well-suited for complex systems.
Simulation of carbon isotope discrimination of the terrestrial biosphere
NASA Astrophysics Data System (ADS)
Suits, N. S.; Denning, A. S.; Berry, J. A.; Still, C. J.; Kaduk, J.; Miller, J. B.; Baker, I. T.
2005-03-01
We introduce a multistage model of carbon isotope discrimination during C3 photosynthesis and global maps of C3/C4 plant ratios to an ecophysiological model of the terrestrial biosphere (SiB2) in order to predict the carbon isotope ratios of terrestrial plant carbon globally at a 1° resolution. The model is driven by observed meteorology from the European Centre for Medium-Range Weather Forecasts (ECMWF), constrained by satellite-derived Normalized Difference Vegetation Index (NDVI) and run for the years 1983-1993. Modeled mean annual C3 discrimination during this period is 19.2‰; total mean annual discrimination by the terrestrial biosphere (C3 and C4 plants) is 15.9‰. We test simulation results in three ways. First, we compare the modeled response of C3 discrimination to changes in physiological stress, including daily variations in vapor pressure deficit (vpd) and monthly variations in precipitation, to observed changes in discrimination inferred from Keeling plot intercepts. Second, we compare mean δ13C ratios from selected biomes (Broadleaf, Temperate Broadleaf, Temperate Conifer, and Boreal) to the observed values from Keeling plots at these biomes. Third, we compare simulated zonal δ13C ratios in the Northern Hemisphere (20°N to 60°N) to values predicted from high-frequency variations in measured atmospheric CO2 and δ13C from terrestrially dominated sites within the NOAA-Globalview flask network. The modeled response to changes in vapor pressure deficit compares favorably to observations. Simulated discrimination in tropical forests of the Amazon basin is less sensitive to changes in monthly precipitation than is suggested by some observations. Mean model δ13C ratios for Broadleaf, Temperate Broadleaf, Temperate Conifer, and Boreal biomes compare well with the few measurements available; however, there is more variability in observations than in the simulation, and modeled δ13C values for tropical forests are heavy relative to observations. Simulated zonal δ13C ratios in the Northern Hemisphere capture patterns of zonal δ13C inferred from atmospheric measurements better than previous investigations. Finally, there is still a need for additional constraints to verify that carbon isotope models behave as expected.
Parallelization and automatic data distribution for nuclear reactor simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebrock, L.M.
1997-07-01
Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine cannot run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.
A scalable parallel black oil simulator on distributed memory parallel computers
NASA Astrophysics Data System (ADS)
Wang, Kun; Liu, Hui; Chen, Zhangxin
2015-11-01
This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Resonant behavior of the generalized Langevin system with tempered Mittag–Leffler memory kernel
NASA Astrophysics Data System (ADS)
Chen, Yao; Wang, Xudong; Deng, Weihua
2018-05-01
The generalized Langevin equation describes anomalous dynamics. Noise is not only a source of uncertainty; it can also play a positive role in detecting weak signals carrying information, a phenomenon termed stochastic resonance (SR). This paper analyzes the anomalous resonant behaviors of the generalized Langevin system with a multiplicative dichotomous noise and an internal tempered Mittag–Leffler noise. For a system with a fluctuating harmonic potential, we obtain exact expressions for several measures of SR, such as the first moment, the amplitude and autocorrelation function of the output signal, and the signal-to-noise ratio. We analyze the influence of the tempering parameter and the memory exponent on both the bona fide SR and the general SR. Moreover, we find that the critical memory exponent changes regularly as the tempering parameter increases. Almost all the theoretical results are validated by numerical simulations.
Parallelized direct execution simulation of message-passing parallel programs
NASA Technical Reports Server (NTRS)
Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.
1994-01-01
As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution in which one directly executes the application code but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor
NASA Technical Reports Server (NTRS)
Rao, Hariprasad Nannapaneni
1989-01-01
The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
NASA Technical Reports Server (NTRS)
Nicol, David; Fujimoto, Richard
1992-01-01
This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.
Synchronization Of Parallel Discrete Event Simulations
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S.
1992-01-01
Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress, combining the best of optimistic and conservative synchronization strategies while avoiding their major disadvantages. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Lusk, Christopher H; Kelly, Jeff W G; Gleason, Sean M
2013-03-01
A trade-off between shade tolerance and growth in high light is thought to underlie the temporal dynamics of humid forests. On the other hand, it has been suggested that tree species sorting on temperature gradients involves a trade-off between growth rate and cold resistance. Little is known about how these two major trade-offs interact. Seedlings of Australian tropical and cool-temperate rainforest trees were grown in glasshouse environments to compare growth versus shade-tolerance trade-offs in these two assemblages. Biomass distribution, photosynthetic capacity and vessel diameters were measured in order to examine the functional correlates of species differences in light requirements and growth rate. Species light requirements were assessed by field estimation of the light compensation point for stem growth. Light-demanding and shade-tolerant tropical species differed markedly in relative growth rates (RGR), but this trend was less evident among temperate species. This pattern was paralleled by biomass distribution data: specific leaf area (SLA) and leaf area ratio (LAR) of tropical species were significantly positively correlated with compensation points, but not those of cool-temperate species. The relatively slow growth and small SLA and LAR of Tasmanian light-demanders were associated with narrow vessels and low potential sapwood conductivity. The conservative xylem traits, small LAR and modest RGR of Tasmanian light-demanders are consistent with selection for resistance to freeze-thaw embolism, at the expense of growth rate. Whereas competition for light favours rapid growth in light-demanding trees native to environments with warm, frost-free growing seasons, frost resistance may be an equally important determinant of the fitness of light-demanders in cool-temperate rainforest, as seedlings establishing in large openings are exposed to sub-zero temperatures that can occur throughout most of the year.
Jin, Dongliang; Coasne, Benoit
2017-10-24
Different molecular simulation strategies are used to assess the stability of methane hydrate under various temperature and pressure conditions. First, using two water molecular models, free energy calculations consisting of the Einstein molecule approach in combination with semigrand Monte Carlo simulations are used to determine the pressure-temperature phase diagram of methane hydrate. With these calculations, we also estimate the chemical potentials of water and methane and methane occupancy at coexistence. Second, we also consider two other advanced molecular simulation techniques that allow probing the phase diagram of methane hydrate: the direct coexistence method in the Grand Canonical ensemble and the hyperparallel tempering Monte Carlo method. These two direct techniques are found to provide stability conditions that are consistent with the pressure-temperature phase diagram obtained using rigorous free energy calculations. The phase diagram obtained in this work, which is found to be consistent with previous simulation studies, is close to its experimental counterpart provided the TIP4P/Ice model is used to describe the water molecule.
Mori, Yoshiharu; Okumura, Hisashi
2015-12-05
Simulated tempering (ST) is a useful method to enhance the sampling of molecular simulations. When ST is used, the Metropolis algorithm, which satisfies the detailed balance condition, is usually applied to calculate the transition probability. Recently, an alternative method that satisfies the global balance condition instead of the detailed balance condition was proposed by Suwa and Todo. In this study, an ST method based on the Suwa-Todo algorithm is proposed. Molecular dynamics simulations with ST are performed with three algorithms (the Metropolis, heat bath, and Suwa-Todo algorithms) for calculating the transition probability. Among the three algorithms, the Suwa-Todo algorithm yields the highest acceptance ratio and the shortest autocorrelation time, suggesting that sampling by an ST simulation with the Suwa-Todo algorithm is the most efficient. In addition, because the acceptance ratio of the Suwa-Todo algorithm is higher than that of the Metropolis algorithm, the number of temperature states can be reduced by 25% with the Suwa-Todo algorithm compared with the Metropolis algorithm.
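For reference, the conventional Metropolis temperature move that the Suwa-Todo variant replaces looks as follows; g holds the simulated-tempering weight factors, and all names are illustrative assumptions.

```python
import numpy as np

def st_temperature_move(m, E, betas, g, rng):
    """Metropolis update of the temperature index in simulated tempering.
    The stationary weight of index m is proportional to exp(-betas[m]*E + g[m]),
    so the weights g (ideally the dimensionless free energies) control how
    evenly the temperature ladder is visited."""
    n = m + rng.choice([-1, 1])            # propose a neighboring temperature
    if n < 0 or n >= len(betas):
        return m
    delta = -(betas[n] - betas[m]) * E + (g[n] - g[m])
    if delta >= 0.0 or rng.random() < np.exp(delta):
        return n
    return m

rng = np.random.default_rng(2)
betas = np.array([2.0, 1.5, 1.0, 0.7])
g = np.zeros_like(betas)                   # weights would be tuned in practice
print(st_temperature_move(1, E=-5.0, betas=betas, g=g, rng=rng))
```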
Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software
Weingarten, N Scott; Larentzos, James P
2015-08-01
SIMULATED CLIMATE CHANGE EFFECTS ON DISSOLVED OXYGEN CHARACTERISTICS IN ICE-COVERED LAKES. (R824801)
A deterministic, one-dimensional model is presented which simulates daily dissolved oxygen (DO) profiles and associated water temperatures, ice covers and snow covers for dimictic and polymictic lakes of the temperate zone. The lake parameters required as model input are surface ...
Modeling methane emissions by cattle production systems in Mexico
NASA Astrophysics Data System (ADS)
Castelan-Ortega, O. A.; Ku Vera, J.; Molina, L. T.
2013-12-01
Methane emissions from livestock are among the largest sources of methane in Mexico. The purpose of the present paper is to provide a realistic estimate of the national inventory of methane produced by the enteric fermentation of cattle, based on an integrated simulation model, and to provide estimates of CH4 produced by cattle fed typical diets from the tropical and temperate climates of Mexico. The Mexican cattle population of 23.3 million heads was divided into two groups. The first group (7.8 million heads) represents cattle of the tropical climate regions. The second group (15.5 million heads) represents the cattle in the temperate climate regions. This approach allows the effect of diet on CH4 production to be incorporated into the analysis, because forage quality is lower in the tropics than in temperate regions. The cattle population in each group was subdivided into two categories: cows (COW) and other types of cattle (OTHE), which included calves, heifers, steers and bulls. The daily CH4 production by each category of animal along an average production cycle of 365 days was simulated, instead of using a default emission factor as in the Tier 1 approach. Daily milk yield, live weight changes associated with the lactation, and dry matter intake were simulated for the entire production cycle. The Moe and Tyrrell (1979) model was used to simulate CH4 production for the COW category, the linear model of Mills et al. (2003) for the OTHE category in temperate regions, and the Kurihara et al. (1999) model for the OTHE category in the tropical regions, as it was developed for cattle fed tropical diets. All models were integrated with a cow submodel to form an Integrated Simulation Model (ISM). The AFRC (1993) equations and the lactation curve model of Morant and Gnanasakthy (1989) were used to construct the cow submodel. The ISM simulates on a daily basis the CH4 production, milk yield, live weight changes associated with lactation and dry matter intake. The total daily CH4 emission per region was calculated by multiplying the number of heads of cattle in each region by their corresponding simulated emission factor, either COW or OTHE, as predicted by the ISM. The total CH4 emissions from the Mexican cattle population were then calculated by adding up the daily emissions from each region. The predicted total emission of methane produced by the 23.3 million heads of cattle in Mexico is approximately 2.02 Tg/year, of which 1.28 Tg is produced by cattle in temperate regions and the rest by cattle in the tropics. It was concluded that the modeling approach was suitable for producing a better estimate of the national methane inventory for cattle. It is flexible enough to incorporate more cattle groups or classification schemes and productivity levels.
Huang, Kun; García, Angel E
2014-10-14
The lateral heterogeneity of cellular membranes plays an important role in many biological functions such as signaling and regulating membrane proteins. This heterogeneity can result from preferential interactions between membrane components or interactions with membrane proteins. One major difficulty in molecular dynamics simulations aimed at studying the membrane heterogeneity is that lipids diffuse slowly and collectively in bilayers, and therefore, it is difficult to reach equilibrium in lateral organization in bilayer mixtures. Here, we propose the use of the replica exchange with solute tempering (REST) approach to accelerate lateral relaxation in heterogeneous bilayers. REST is based on the replica exchange method but tempers only the solute, leaving the temperature of the solvent fixed. Since the number of replicas in REST scales approximately only with the degrees of freedom in the solute, REST enables us to enhance the configuration sampling of lipid bilayers with fewer replicas, in comparison with the temperature replica exchange molecular dynamics simulation (T-REMD) where the number of replicas scales with the degrees of freedom of the entire system. We apply the REST method to a cholesterol and 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC) bilayer mixture and find that the lateral distribution functions of all molecular pair types converge much faster than in the standard MD simulation. The relative diffusion rate between molecules in REST is, on average, an order of magnitude faster than in the standard MD simulation. Although REST was initially proposed to study protein folding and its efficiency in protein folding is still under debate, we find a unique application of REST to accelerate lateral equilibration in mixed lipid membranes and suggest a promising way to probe membrane lateral heterogeneity through molecular dynamics simulation.
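The bookkeeping behind REST can be sketched as follows, using the REST2-style scaling (solute-solute energies scaled by λ, solute-solvent by √λ, solvent-solvent untouched); whether this exact variant matches the one used in the study is an assumption, and all names are illustrative.

```python
import numpy as np

def rest_energy(lam, E_ss, E_sw, E_ww):
    """Effective potential of a REST2-style replica with scaling lam = beta_m/beta_0:
    only the 'solute' (here, the lipids of interest) is effectively heated."""
    return lam * E_ss + np.sqrt(lam) * E_sw + E_ww

def rest_swap_accept(beta0, lam_a, lam_b, comps_a, comps_b, rng):
    """comps_* = (E_ss, E_sw, E_ww) for the configuration in each replica.
    All replicas run at the physical temperature, hence the single beta0."""
    delta = beta0 * (rest_energy(lam_a, *comps_b) + rest_energy(lam_b, *comps_a)
                     - rest_energy(lam_a, *comps_a) - rest_energy(lam_b, *comps_b))
    return delta <= 0.0 or rng.random() < np.exp(-delta)

rng = np.random.default_rng(3)
print(rest_swap_accept(beta0=1.678,  # ~300 K in kcal/mol units (assumed)
                       lam_a=1.0, lam_b=0.7,
                       comps_a=(-120.0, -300.0, -5000.0),
                       comps_b=(-100.0, -280.0, -5010.0),
                       rng=rng))
```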
Forest turnover rates follow global and regional patterns of productivity
Stephenson, N.L.; van Mantgem, P.J.
2005-01-01
Using a global database, we found that forest turnover rates (the average of tree mortality and recruitment rates) parallel broad-scale patterns of net primary productivity. First, forest turnover was higher in tropical than in temperate forests. Second, as recently demonstrated by others, Amazonian forest turnover was higher on fertile than infertile soils. Third, within temperate latitudes, turnover was highest in angiosperm forests, intermediate in mixed forests, and lowest in gymnosperm forests. Finally, within a single forest physiognomic type, turnover declined sharply with elevation (hence with temperature). These patterns of turnover in populations of trees are broadly similar to the patterns of turnover in populations of plant organs (leaves and roots) found in other studies. Our findings suggest a link between forest mass balance and the population dynamics of trees, and have implications for understanding and predicting the effects of environmental changes on forest structure and terrestrial carbon dynamics.
Longhi, Giovanna; Fornili, Sandro L; Turco Liveri, Vincenzo
2015-07-07
Experimental investigations using mass spectrometry have established that surfactant molecules are able to form aggregates in the gas phase. However, there is no general consensus on the organization of these aggregates and how it depends on the aggregation number and the surfactant molecular structure. In the present paper we investigate the structural organization of some surfactants in vacuo by molecular dynamics and well-tempered metadynamics simulations, widely exploring the space of their possible conformations. To study how the specific molecular features of such compounds affect their organization, we have considered three paradigmatic surfactants: the anionic single-chain sodium dodecyl sulfate (SDS), the anionic double-chain sodium bis(2-ethylhexyl) sulfosuccinate (AOT) and the zwitterionic single-chain dodecyl phosphatidyl choline (DPC), within a wide aggregation number range (from 5 to 100). We observe that for low aggregation numbers the aggregates show in vacuo the typical structure of reverse micelles, while for large aggregation numbers a variety of globular aggregates occur, characterized by the coexistence of interlaced domains formed by the polar or ionic heads and by the alkyl chains of the surfactants. Well-tempered metadynamics simulations allow us to confirm that the structural organizations obtained after 50 ns of molecular dynamics simulations are practically the equilibrium ones. Similarities and differences between surfactant aggregates in vacuo and in apolar media are also discussed.
Simulation Exploration through Immersive Parallel Planes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunhart-Lupo, Nicholas J; Bush, Brian W; Gruchalla, Kenny M
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selections, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
A path-level exact parallelization strategy for sequential simulation
NASA Astrophysics Data System (ADS)
Peredo, Oscar F.; Baeza, Daniel; Ortiz, Julián M.; Herrero, José R.
2018-01-01
Sequential simulation is a well-known method in geostatistical modelling. Following the Bayesian approach for simulation of conditionally dependent random events, the Sequential Indicator Simulation (SIS) method draws simulated values for K categories (categorical case) or classes defined by K different thresholds (continuous case). Similarly, the Sequential Gaussian Simulation (SGS) method draws simulated values from a multivariate Gaussian field. In this work, a path-level approach to parallelize the SIS and SGS methods is presented. A first stage of re-arrangement of the simulation path is performed, followed by a second stage of parallel simulation of non-conflicting nodes. A key advantage of the proposed parallelization method is that it generates realizations identical to those of the original non-parallelized methods. Case studies are presented using two sequential simulation codes from GSLIB: SISIM and SGSIM. Execution time and speedup results are shown for large-scale domains, with many categories and maximum kriging neighbours in each case, achieving high speedup in the best scenarios using 16 threads of execution on a single machine.
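A minimal sketch of the two-stage idea, under assumptions: `path` is the ordered node list of a sequential simulation and `conflicts(a, b)` says whether node b lies inside the search neighbourhood of node a. Grouping each node one level after all of its conflicting predecessors is one way to guarantee that concurrent nodes are independent while the sequential realization is reproduced exactly. All names are illustrative, not the paper's code.

```python
# Stage 1: re-arrange the path into levels of non-conflicting nodes.
# Stage 2 (not shown): simulate each level's nodes concurrently.
def level_schedule(path, conflicts):
    levels = {}  # node -> level index
    for i, node in enumerate(path):
        # A node waits for every earlier path node whose neighbourhood it shares.
        deps = [levels[path[j]] for j in range(i) if conflicts(path[j], node)]
        levels[node] = 1 + max(deps) if deps else 0
    n_levels = 1 + max(levels.values(), default=0)
    return [[n for n in path if levels[n] == k] for k in range(n_levels)]

# 1D toy grid: nodes conflict when within 2 cells of each other.
path = [0, 5, 2, 7, 4]
print(level_schedule(path, conflicts=lambda a, b: abs(a - b) <= 2))
# -> [[0, 5], [2, 7], [4]]; each inner list can be simulated in parallel.
```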
A compositional reservoir simulator on distributed memory parallel computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rame, M.; Delshad, M.
1995-12-31
This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
Unconstrained Enhanced Sampling for Free Energy Calculations of Biomolecules: A Review
Miao, Yinglong; McCammon, J. Andrew
2016-01-01
Free energy calculations are central to understanding the structure, dynamics and function of biomolecules. Yet insufficient sampling of biomolecular configurations is often regarded as one of the main sources of error. Many enhanced sampling techniques have been developed to address this issue. Notably, enhanced sampling methods based on biasing collective variables (CVs), including the widely used umbrella sampling, adaptive biasing force and metadynamics, have been discussed in a recent excellent review (Abrams and Bussi, Entropy, 2014). Here, we aim to review enhanced sampling methods that do not require predefined system-dependent CVs for biomolecular simulations and as such do not suffer from the hidden energy barrier problem encountered in the CV-biasing methods. These methods include, but are not limited to, replica exchange/parallel tempering, self-guided molecular/Langevin dynamics, essential energy space random walk and accelerated molecular dynamics. While it is impractical to describe every detail of each method, we provide a summary of the methods along with their applications and offer our perspectives. We conclude with challenges and prospects of the unconstrained enhanced sampling methods for accurate biomolecular free energy calculations.
Chen, Weiliang; De Schutter, Erik
2017-01-01
Stochastic, spatial reaction-diffusion simulations have been widely used in systems biology and computational neuroscience. However, the increasing scale and complexity of models and morphologies have exceeded the capacity of any serial implementation. This led to the development of parallel solutions that benefit from the boost in performance of modern supercomputers. In this paper, we describe an MPI-based, parallel operator-splitting implementation for stochastic spatial reaction-diffusion simulations with irregular tetrahedral meshes. The performance of our implementation is first examined and analyzed with simulations of a simple model. We then demonstrate its application to real-world research by simulating the reaction-diffusion components of a published calcium burst model in both Purkinje neuron sub-branch and full dendrite morphologies. Simulation results indicate that our implementation is capable of achieving super-linear speedup for balanced loading simulations with reasonable molecule density and mesh quality. In the best scenario, a parallel simulation with 2,000 processes runs more than 3,600 times faster than its serial SSA counterpart, and achieves more than 20-fold speedup relative to a parallel simulation with 100 processes. In a more realistic scenario with dynamic calcium influx and data recording, the parallel simulation with 1,000 processes and no load balancing is still 500 times faster than the conventional serial SSA simulation.
Efficient hierarchical trans-dimensional Bayesian inversion of magnetotelluric data
NASA Astrophysics Data System (ADS)
Xiang, Enming; Guo, Rongwen; Dosso, Stan E.; Liu, Jianxin; Dong, Hao; Ren, Zhengyong
2018-06-01
This paper develops an efficient hierarchical trans-dimensional (trans-D) Bayesian algorithm to invert magnetotelluric (MT) data for subsurface geoelectrical structure, with unknown geophysical model parameterization (the number of conductivity-layer interfaces) and data-error models parameterized by an auto-regressive (AR) process to account for potential error correlations. The reversible-jump Markov-chain Monte Carlo algorithm, which adds/removes interfaces and AR parameters in birth/death steps, is applied to sample the trans-D posterior probability density for model parameterization, model parameters, error variance and AR parameters, accounting for the uncertainties of model dimension and data-error statistics in the uncertainty estimates of the conductivity profile. To provide efficient sampling over the multiple subspaces of different dimensions, advanced proposal schemes are applied. Parameter perturbations are carried out in principal-component space, defined by eigen-decomposition of the unit-lag model covariance matrix, to minimize the effect of inter-parameter correlations and provide effective perturbation directions and length scales. Parameters of new layers in birth steps are proposed from the prior, instead of focused distributions centred at existing values, to improve birth acceptance rates. Parallel tempering, based on a series of parallel interacting Markov chains with successively relaxed likelihoods, is applied to improve chain mixing over model dimensions. The trans-D inversion is applied in a simulation study to examine the resolution of model structure according to the data information content. The inversion is also applied to a measured MT data set from south-central Australia.
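The parallel tempering step mentioned above exchanges states between chains running at successively relaxed likelihoods. A minimal sketch of that swap move, assuming chain k samples a posterior whose log-likelihood is raised to an inverse temperature beta[k] (beta[0] = 1 is the target chain) and `loglik` holds each chain's current log-likelihood; the names are illustrative, not the authors' code.

```python
import math
import random

def try_swap(states, loglik, beta, i, j):
    """Metropolis swap between tempered chains i and j."""
    log_alpha = (beta[i] - beta[j]) * (loglik[j] - loglik[i])
    if log_alpha >= 0 or random.random() < math.exp(log_alpha):
        states[i], states[j] = states[j], states[i]
        loglik[i], loglik[j] = loglik[j], loglik[i]
        return True
    return False
```

Swaps are typically attempted between chains at neighbouring temperatures after a fixed number of within-chain updates, which is what allows the cold chain to inherit moves made easily at relaxed likelihoods.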
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Chen, Yousu; Wu, Di
2015-12-09
Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor dynamic simulation solution. High-performance computing (HPC) based parallel computing is a promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.
Borukhovich, Efim; Du, Guanxing; Stratmann, Matthias; Boeff, Martin; Shchyglo, Oleg; Hartmaier, Alexander; Steinbach, Ingo
2016-01-01
Martensitic steels form a material class with a versatile range of properties that can be selected by varying the processing chain. In order to study and design the desired processing with minimal experimental effort, modeling tools are required. In this work, a full processing cycle from quenching over tempering to mechanical testing is simulated with a single modeling framework that combines the features of the phase-field method and a coupled chemo-mechanical approach. In order to perform the mechanical testing, the mechanical part is extended to the large-deformation case and coupled to crystal plasticity and a linear damage model. The quenching process is governed by the austenite-martensite transformation. In the tempering step, carbon segregation to the grain boundaries and the resulting cementite formation occur. During mechanical testing, the obtained material sample undergoes a large deformation that leads to local failure. The initial formation of the damage zones is observed to happen next to the carbides, while the final damage morphology follows the martensite microstructure. This multi-scale approach can be applied to design optimal microstructures dependent on processing and materials composition.
D. Bachelet; J. Lenihan; R. Neilson; R. Drapek; T. Kittel
2005-01-01
The dynamic global vegetation model MC1 was used to examine climate, fire, and ecosystems interactions in Alaska under historical (1922-1996) and future (1997-2100) climate conditions. Projections show that by the end of the 21st century, 75%-90% of the area simulated as tundra in 1922 is replaced by boreal and temperate forest. From 1922 to 1996, simulation results...
Veselý, Lukáš; Buřič, Miloš; Kouba, Antonín
2015-01-01
The spreading of new crayfish species poses a serious risk for freshwater ecosystems; because they are omnivores they influence more than one level in the trophic chain and they represent a significant part of the benthic biomass. Both the environmental change through global warming and the expansion of the pet trade increase the possibilities of their spreading. We investigated the potential of four “warm water” highly invasive crayfish species to overwinter in the temperate zone, so as to predict whether these species pose a risk for European freshwaters. We used 15 specimens of each of the following species: the red swamp crayfish (Procambarus clarkii), the marbled crayfish (Procambarus fallax f. virginalis), the yabby (Cherax destructor), and the redclaw (Cherax quadricarinatus). Specimens were acclimatized and kept for 6.5 months at temperatures simulating the winter temperature regime of European temperate zone lentic ecosystems. We conclude that the red swamp crayfish, marbled crayfish and yabby have the ability to withstand low winter temperatures relevant for lentic habitats in the European temperate zone, making them a serious invasive threat to freshwater ecosystems.
Parallel Signal Processing and System Simulation using aCe
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2003-01-01
Recently, networked and cluster computing have become very popular for both signal processing and system simulation. The new C-based parallel language aCe is ideally suited for parallel signal processing applications and system simulation, since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, aCe's architecture-adaptive programming model allows programmers to implement algorithms and system simulation applications on parallel architectures with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we focus on some fundamental features of aCe and present a signal processing application (FFT).
On the suitability of the connection machine for direct particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonard
1990-01-01
The algorithmic structure of the vectorizable Stanford particle simulation (SPS) method was examined and reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel form, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation, and a master/slave algorithm is developed to minimize communication cost in large table look-ups. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of the lack of a broad-based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms specifically of use for direct particle simulation but which also expand the data parallel idiom.
NASA Astrophysics Data System (ADS)
Yan, Hui; Wang, K. G.; Jones, Jim E.
2016-06-01
A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics of phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computer power available from high-performance architectures. The parallelized code enables an increase in three-dimensional simulation system size up to a 512³ grid cube. Through the parallelized code, practical runtimes can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high-resolution parallel simulations is greatly improved over that obtainable from serial simulations. A detailed performance analysis of speed-up and scalability is presented, showing good scalability which improves with increasing problem size. In addition, a model for prediction of runtime is developed, which shows good agreement with actual run times from numerical tests.
Suppressing correlations in massively parallel simulations of lattice models
NASA Astrophysics Data System (ADS)
Kelling, Jeffrey; Ódor, Géza; Gemming, Sibylle
2017-11-01
For lattice Monte Carlo simulations, parallelization is crucial to make studies of large systems and long simulation times feasible, while sequential simulations remain the gold standard for correlation-free dynamics. Here, various domain decomposition schemes are compared, concluding with one which delivers virtually correlation-free simulations on GPUs. Extensive simulations of the octahedron model for 2+1 dimensional Kardar-Parisi-Zhang surface growth, which is very sensitive to correlation in the site-selection dynamics, were performed to show self-consistency of the parallel runs and agreement with the sequential algorithm. We present a GPU implementation providing a speedup of about 30× over a parallel CPU implementation on a single socket and at least 180× with respect to the sequential reference.
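One ingredient common to such domain decomposition schemes is keeping concurrent updates outside each other's interaction range while moving the domain boundaries between sweeps so that the boundaries themselves do not imprint correlations. A minimal sketch of that idea for a 1D ring with nearest-neighbour interactions, assuming the lattice length is divisible by the worker count; the names and the strip layout are illustrative, not the paper's GPU code.

```python
import random

def parallel_sweep(lattice, n_workers, update_site):
    """One decomposition sweep with a randomly shifted origin."""
    L = len(lattice)
    w = L // n_workers  # assumes L % n_workers == 0
    origin = random.randrange(L)  # random shift of all domain borders
    for k in range(n_workers):    # conceptually concurrent workers
        lo = (origin + k * w) % L
        # Only interior strip sites are updated, so no worker ever writes
        # within the interaction range (1 site here) of another's strip.
        for s in range(1, w - 1):
            update_site(lattice, (lo + s) % L)
```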
Haldar, Susanta; Kührová, Petra; Banáš, Pavel; Spiwok, Vojtěch; Šponer, Jiří; Hobza, Pavel; Otyepka, Michal
2015-08-11
RNA hairpins capped by 5'-GNRA-3' or 5'-UNCG-3' tetraloops (TLs) are prominent RNA structural motifs. Despite their small size, a wealth of experimental data, and recent progress in theoretical simulations of their structural dynamics and folding, our understanding of the folding and unfolding processes of these small RNA elements is still limited. Theoretical description of the folding and unfolding processes requires robust sampling, which can be achieved either by exhaustive time scales in standard molecular dynamics simulations or by sophisticated enhanced sampling methods using temperature acceleration or biasing potentials. Here, we study the structural dynamics of 5'-GNRA-3' and 5'-UNCG-3' TLs by 15-μs-long standard simulations and a series of well-tempered metadynamics runs, attempting to accelerate sampling by biasing a few chosen collective variables (CVs). Both methods provide useful insights. The unfolding and refolding mechanisms of the GNRA TL observed by well-tempered metadynamics agree with the (reverse) folding mechanism suggested by recent replica exchange molecular dynamics simulations. The orientation of the glycosidic bond of the GL4 nucleobase is critical for the UUCG TL folding pathway, and our data strongly support the hypothesis that GL4-anti forms a kinetic trap along the folding pathway. Along with giving useful insight, our study also demonstrates that using only a few CVs apparently does not capture the full folding landscape of the RNA TLs. Despite several sophisticated selections of the CVs, formation of the loop appears to remain a hidden variable, preventing full convergence of the metadynamics. Finally, our data suggest that the unfolded state might be overstabilized by the force fields used.
Simulating the onset of spring vegetation growth across the Northern Hemisphere.
Liu, Qiang; Fu, Yongshuo H; Liu, Yongwen; Janssens, Ivan A; Piao, Shilong
2018-03-01
Changes in the spring onset of vegetation growth in response to climate change can profoundly impact climate-biosphere interactions. Thus, robust simulation of spring onset is essential to accurately predict ecosystem responses and feedback to ongoing climate change. To date, the ability of vegetation phenology models to reproduce spatiotemporal patterns of spring onset at larger scales has not been thoroughly investigated. In this study, we took advantage of phenology observations via remote sensing to calibrate and evaluate six models, including both one-phase (considering only forcing temperatures) and two-phase (involving forcing, chilling, and photoperiod) models across the Northern Hemisphere between 1982 and 2012. Overall, we found that the model that integrated the photoperiod effect performed best at capturing spatiotemporal patterns of spring phenology in boreal and temperate forests. By contrast, all of the models performed poorly in simulating the onset of growth in grasslands. These results suggest that the photoperiod plays a role in controlling the onset of growth in most Northern Hemisphere forests, whereas other environmental factors (e.g., precipitation) should be considered when simulating the onset of growth in grasslands. We also found that the one-phase model performed as well as the two-phase models in boreal forests, which implies that the chilling requirement is probably fulfilled across most of the boreal zone. Conversely, two-phase models performed better in temperate forests than the one-phase model, suggesting that photoperiod and chilling play important roles in these temperate forests. Our results highlight the significance of including chilling and photoperiod effects in models of the spring onset of forest growth at large scales, and indicate that the consideration of additional drivers may be required for grasslands. © 2017 John Wiley & Sons Ltd.
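A minimal sketch of the "one-phase" forcing-temperature idea described above: growing degrees accumulate from a start day, and spring onset is the first day the accumulated forcing exceeds a critical threshold. The base temperature, start day, and threshold are illustrative parameters of the kind such models calibrate, not values from the study.

```python
def one_phase_onset(daily_temp, t_base=5.0, start_doy=1, f_crit=150.0):
    """Predict onset day-of-year from a list of daily mean temperatures."""
    forcing = 0.0
    for doy, temp in enumerate(daily_temp, start=1):
        if doy < start_doy:
            continue
        forcing += max(0.0, temp - t_base)  # degree-day accumulation
        if forcing >= f_crit:
            return doy                       # predicted onset of growth
    return None                              # threshold never reached
# e.g. one_phase_onset(list_of_365_daily_means)
```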
SPEEDES - A multiple-synchronization environment for parallel discrete-event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeff S.
1992-01-01
Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES) is a unified parallel simulation environment. It supports multiple-synchronization protocols without requiring users to recompile their code. When a SPEEDES simulation runs on one node, all the extra parallel overhead is removed automatically at run time. When the same executable runs in parallel, the user preselects the synchronization algorithm from a list of options. SPEEDES currently runs on UNIX networks and on the California Institute of Technology/Jet Propulsion Laboratory Mark III Hypercube. SPEEDES also supports interactive simulations. Featured in the SPEEDES environment is a new parallel synchronization approach called Breathing Time Buckets. This algorithm uses some of the conservative techniques found in Time Bucket synchronization, along with the optimism that characterizes the Time Warp approach. A mathematical model derived from first principles predicts the performance of Breathing Time Buckets. Along with the Breathing Time Buckets algorithm, this paper discusses the rules for processing events in SPEEDES, describes the implementation of various other synchronization protocols supported by SPEEDES, describes some new ones for the future, discusses interactive simulations, and then gives some performance results.
Influence of equilibrium shear flow in the parallel magnetic direction on edge localized mode crash
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Y.; Xiong, Y. Y.; Chen, S. Y., E-mail: sychen531@163.com
2016-04-15
The influence of the parallel shear flow on the evolution of peeling-ballooning (P-B) modes is studied with the BOUT++ four-field code in this paper. The parallel shear flow has different effects in linear and nonlinear simulations. In the linear simulations, the growth rate of the edge localized mode (ELM) can be increased by the Kelvin-Helmholtz term, which can be caused by the parallel shear flow. In the nonlinear simulations, the results accord with the linear simulations in the linear phase. However, the ELM size is reduced by the parallel shear flow at the beginning of the turbulence phase, which is recognized as the P-B filaments' structure. Then, during the turbulence phase, the ELM size is decreased by the shear flow.
Random number generators for large-scale parallel Monte Carlo simulations on FPGA
NASA Astrophysics Data System (ADS)
Lin, Y.; Wang, F.; Liu, B.
2018-05-01
Through parallelization, field programmable gate arrays (FPGAs) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGAs present both new constraints and new opportunities for the implementation of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application-based tests, this study evaluates all four of the RNGs used in previous FPGA-based MC studies and newly proposed FPGA implementations of two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations, a parallel version of the additive lagged Fibonacci generator (parallel ALFG), is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
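For reference, an additive lagged Fibonacci generator produces x[n] = (x[n-s] + x[n-r]) mod 2^m from its last r outputs. The sketch below uses the small classic lag pair (5, 17) and naive per-stream seeding purely for illustration; production LPMC generators use much larger lags and carefully de-correlated stream states, and the FPGA version in the paper maps this recurrence into hardware.

```python
import random

class ALFG:
    """Additive lagged Fibonacci generator x[n] = x[n-s] + x[n-r] mod 2^m."""
    def __init__(self, seed_words, s=5, r=17, m=32):
        # At least one seed word must be odd for a full-period sequence.
        assert len(seed_words) == r and any(w % 2 for w in seed_words)
        self.state = list(seed_words)  # circular buffer of the last r outputs
        self.s, self.r = s, r
        self.mask = (1 << m) - 1
        self.i = 0

    def next(self):
        r, s, i = self.r, self.s, self.i
        # (i - r) % r == i % r, so we read x[n-r] before overwriting its slot.
        x = (self.state[(i - s) % r] + self.state[(i - r) % r]) & self.mask
        self.state[i % r] = x
        self.i += 1
        return x

# One independently seeded stream per parallel worker (first word forced odd).
streams = [ALFG([random.getrandbits(32) | (k == 0) for k in range(17)])
           for _ in range(4)]
print([g.next() for g in streams])
```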
A sweep algorithm for massively parallel simulation of circuit-switched networks
NASA Technical Reports Server (NTRS)
Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.
1992-01-01
A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384-processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
Parallelization of sequential Gaussian, indicator and direct simulation algorithms
NASA Astrophysics Data System (ADS)
Nunes, Ruben; Almeida, José A.
2010-08-01
Improving the performance and robustness of algorithms on new high-performance parallel computing architectures is a key issue in efficiently performing 2D and 3D studies with large amounts of data. In geostatistics, sequential simulation algorithms are good candidates for parallelization. When compared with other computational applications in geosciences (such as fluid flow simulators), sequential simulation software is not extremely computationally intensive, but parallelization can make it more efficient and creates alternatives for its integration in inverse modelling approaches. This paper describes the implementation and benchmarking of a parallel version of the three classic sequential simulation algorithms: direct sequential simulation (DSS), sequential indicator simulation (SIS) and sequential Gaussian simulation (SGS). For this purpose, the source used was GSLIB, but the entire code was extensively modified to take the parallelization approach into account and was also rewritten in the C programming language. The paper also explains in detail the parallelization strategy and the main modifications. Regarding the integration of secondary information, the DSS algorithm is able to perform simple kriging with local means, kriging with an external drift and collocated cokriging with both local and global correlations. SIS includes a local correction of probabilities. Finally, a brief comparison is presented of simulation results using one, two and four processors. All performance tests were carried out on 2D soil data samples. The source code is completely open source and easy to read. It should be noted that the code is only fully compatible with Microsoft Visual C and should be adapted for other systems/compilers.
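For orientation, the sequential structure that all three algorithms share (and that the paper parallelizes) is a random path over the grid, a local kriging estimate at each node, and a draw conditioned on everything simulated so far. A heavily simplified skeleton, assuming a helper `krige(node, known)` that returns a local mean and variance; real codes such as the modified GSLIB routines solve a kriging system over a search neighbourhood.

```python
import math
import random

def sgs(nodes, krige, conditioning):
    """Skeleton of sequential Gaussian simulation (illustrative only)."""
    known = dict(conditioning)  # node -> value; hard data condition the start
    for node in random.sample(nodes, len(nodes)):  # random simulation path
        if node in known:
            continue
        mean, var = krige(node, known)   # simple kriging from neighbours
        known[node] = random.gauss(mean, math.sqrt(var))
    return known
```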
Relation of Parallel Discrete Event Simulation algorithms with physical models
NASA Astrophysics Data System (ADS)
Shchur, L. N.; Shchur, L. V.
2015-09-01
We extend the concept of local simulation times in parallel discrete event simulation (PDES) in order to take into account the architecture of current hardware and software in high-performance computing. We briefly review previous research on the mapping of PDES onto physical problems, and emphasise how physical results may help to predict the behaviour of parallel algorithms.
Parallel Simulation of Subsonic Fluid Dynamics on a Cluster of Workstations.
1994-11-01
Simulations model, for example, the flow of air inside wind musical instruments. Typical simulations achieve 80% parallel efficiency (speedup/processors) using 20 HP-Apollo workstations.
Wood phenology: from organ-scale processes to terrestrial ecosystem models
NASA Astrophysics Data System (ADS)
Delpierre, Nicolas; Guillemot, Joannès
2016-04-01
In temperate and boreal trees, a dormancy period prevents organ development during adverse climatic conditions. Whereas the phenology of leaves and flowers has received considerable attention, to date, little is known regarding the phenology of other tree organs such as wood, fine roots, fruits and reserve compounds. In this presentation, we review both the role of environmental drivers in determining the phenology of wood and the models used to predict its phenology in temperate and boreal forest trees. Temperature is a key driver of the resumption of wood activity in spring. There is no such clear dominant environmental cue involved in the cessation of wood formation in autumn, but temperature and water stress appear as prominent factors. We show that wood phenology is a key driver of the interannual variability of wood growth in temperate tree species. Incorporating representations of wood phenology in a terrestrial ecosystem model substantially improved the simulation of wood growth under current climate.
Migration behaviour of silicone moulds in contact with different foodstuffs.
Helling, Ruediger; Kutschbach, Katja; Joachim Simat, Thomas
2010-03-01
Various foodstuffs were prepared in silicone baking moulds and analyzed for siloxane migration using a previously developed and validated ¹H-NMR method. Meat loaf significantly exceeded the overall migration limit of 60 mg kg⁻¹ (10 mg sdm⁻¹) in the first and third experiments. The highest siloxane migration found in a meat loaf after preparation in a commercial mould was 177 mg kg⁻¹. In contrast, milk-based food showed very low or non-detectable migration (<2.4 mg kg⁻¹), even when containing high fat levels. Similar results were achieved using 50% ethanol as the simulant for milk-based products, as defined in the Plastics Directive 2007/19/EEC. After solvent extraction of the moulds, simulating long-term usage, no further migration into the food was detectable, indicating that there is no significant formation of low molecular weight, potentially migrating siloxanes from the elastomer. During repeated usage, the moulds showed a high uptake of fat: up to 8.0 g fat per kg elastomer. Proper tempering of the moulds had a major influence on the migration properties of siloxanes into different foodstuffs. Non-tempered moulds with a high level of volatile organic compounds (1.1%) were shown to have considerably higher migration than the equivalent tempered moulds.
Xyce parallel electronic simulator users guide, version 6.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase, a message passing parallel implementation, which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Data parallel sorting for particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications-intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
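The reduction rests on the observation that after a time step only a small fraction of particles change cells, so the sorted order can be maintained by extracting the movers, re-keying them, and merging two sorted runs in linear time. A serial sketch of that reduction with (cell, particle_id) pairs; the names and the dict-based cell lookup are illustrative, not the paper's data parallel formulation.

```python
def merge_runs(a, b):
    """Merge two runs sorted by cell index into one sorted run."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i][0] <= b[j][0]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def resort(sorted_pairs, new_cell):
    """Restore cell order after a step: keep stayers, merge in the movers."""
    stay = [(c, p) for c, p in sorted_pairs if new_cell[p] == c]
    moved = sorted((new_cell[p], p) for c, p in sorted_pairs if new_cell[p] != c)
    return merge_runs(stay, moved)

pairs = [(0, 4), (0, 1), (1, 2), (2, 0), (2, 3)]      # sorted by cell
new_cell = {0: 2, 1: 0, 2: 1, 3: 1, 4: 1}             # cells after the step
print(resort(pairs, new_cell))
```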
Model-data fusion across ecosystems: from multisite optimizations to global simulations
NASA Astrophysics Data System (ADS)
Kuppel, S.; Peylin, P.; Maignan, F.; Chevallier, F.; Kiely, G.; Montagnani, L.; Cescatti, A.
2014-11-01
This study uses a variational data assimilation framework to simultaneously constrain a global ecosystem model with eddy covariance measurements of daily net ecosystem exchange (NEE) and latent heat (LE) fluxes from a large number of sites grouped in seven plant functional types (PFTs). It is an attempt to bridge the gap between the numerous site-specific parameter optimization works found in the literature and the generic parameterization used by most land surface models within each PFT. The present multisite approach allows deriving PFT-generic sets of optimized parameters enhancing the agreement between measured and simulated fluxes at most of the sites considered, with performances often comparable to those of the corresponding site-specific optimizations. Besides reducing the PFT-averaged model-data root-mean-square difference (RMSD) and the associated daily output uncertainty, the optimization improves the simulated CO2 balance at tropical and temperate forest sites. The major site-level NEE adjustments at the seasonal scale are reduced amplitude in C3 grasslands and boreal forests, increased seasonality in temperate evergreen forests, and better model-data phasing in temperate deciduous broadleaf forests. Conversely, the poorer performances in tropical evergreen broadleaf forests point to deficiencies regarding the modelling of phenology and soil water stress for this PFT. An evaluation with data-oriented estimates of photosynthesis (GPP - gross primary productivity) and ecosystem respiration (Reco) rates indicates distinctively improved simulations of both gross fluxes. The multisite parameter sets are then tested against CO2 concentrations measured at 53 locations around the globe, showing significant adjustments of the modelled seasonality of atmospheric CO2 concentration, whose relevance seems PFT-dependent, along with an improved interannual variability. Lastly, a global-scale evaluation with remote sensing NDVI (normalized difference vegetation index) measurements indicates an improvement of the simulated seasonal variations of the foliar cover for all considered PFTs.
Churski, Marcin; Bubnicki, Jakub W; Jędrzejewska, Bogumiła; Kuijper, Dries P J; Cromsigt, Joris P G M
2017-04-01
Plant biomass consumers (mammalian herbivory and fire) are increasingly seen as major drivers of ecosystem structure and function but the prevailing paradigm in temperate forest ecology is still that their dynamics are mainly bottom-up resource-controlled. Using conceptual advances from savanna ecology, particularly the demographic bottleneck model, we present a novel view on temperate forest dynamics that integrates consumer and resource control. We used a fully factorial experiment, with varying levels of ungulate herbivory and resource (light) availability, to investigate how these factors shape recruitment of five temperate tree species. We ran simulations to project how inter- and intraspecific differences in height increment under the different experimental scenarios influence long-term recruitment of tree species. Strong herbivore-driven demographic bottlenecks occurred in our temperate forest system, and bottlenecks were as strong under resource-rich as under resource-poor conditions. Increased browsing by herbivores in resource-rich patches strongly counteracted the increased escape strength of saplings in these patches. This finding is a crucial extension of the demographic bottleneck model which assumes that increased resource availability allows plants to more easily escape consumer-driven bottlenecks. Our study demonstrates that a more dynamic understanding of consumer-resource interactions is necessary, where consumers and plants both respond to resource availability. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Ilott, Andrew J; Palucha, Sebastian; Hodgkinson, Paul; Wilson, Mark R
2013-10-10
The well-tempered, smoothly converging form of the metadynamics algorithm has been implemented in classical molecular dynamics simulations and used to obtain an estimate of the free energy surface explored by the molecular rotations in the plastic crystal, octafluoronaphthalene. The biased simulations explore the full energy surface extremely efficiently, more than 4 orders of magnitude faster than unbiased molecular dynamics runs. The metadynamics collective variables used have also been expanded to include the simultaneous orientations of three neighboring octafluoronaphthalene molecules. Analysis of the resultant three-dimensional free energy surface, which is sampled to a very high degree despite its significant complexity, demonstrates that there are strong correlations between the molecular orientations. Although this correlated motion is of limited applicability in terms of exploiting dynamical motion in octafluoronaphthalene, the approach used is extremely well suited to the investigation of the function of crystalline molecular machines.
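At the core of the well-tempered algorithm used above is a bias built from Gaussians deposited along the collective variable (CV) trajectory, with heights damped by the bias already present: W_k = W0 · exp(−V(s_k)/(kB·ΔT)). A minimal sketch for a single 1D CV on a grid, purely to illustrate the update; the study itself biases a three-molecule orientational CV space, and all names here are illustrative.

```python
import math

def deposit(bias_grid, grid, s_now, w0, sigma, kb_dt):
    """Add one well-tempered Gaussian hill centred at the current CV value."""
    # Bias already accumulated at the nearest grid point to s_now.
    v_here = bias_grid[min(range(len(grid)), key=lambda i: abs(grid[i] - s_now))]
    height = w0 * math.exp(-v_here / kb_dt)  # well-tempered damping of the hill
    for i, s in enumerate(grid):
        bias_grid[i] += height * math.exp(-0.5 * ((s - s_now) / sigma) ** 2)
    return height
```

Because the hill height decays where bias has already been laid down, the bias converges smoothly instead of overfilling basins, which is what makes the free energy estimate recoverable from the deposited bias.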
A new deadlock resolution protocol and message matching algorithm for the extreme-scale simulator
Engelmann, Christian; Naughton, III, Thomas J.
2016-03-22
Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different HPC architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1) a new deadlock resolution protocol to reduce the parallel discrete event simulation overhead and (2) a new simulated MPI message matching algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement. The simulation overhead for running the NAS Parallel Benchmark suite was reduced from 102% to 0% for the embarrassingly parallel (EP) benchmark and from 1,020% to 238% for the conjugate gradient (CG) benchmark. xSim offers a highly accurate simulation mode for better tracking of injected MPI process failures. Furthermore, with highly accurate simulation, the overhead was reduced from 3,332% to 204% for EP and from 37,511% to 13,808% for CG.
A Systems Approach to Scalable Transportation Network Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S
2006-01-01
Emerging needs in transportation network modeling and simulation are raising new challenges with respect to scalability of network size and vehicular traffic intensity, speed of simulation for simulation-based optimization, and fidelity of vehicular behavior for accurate capture of event phenomena. Parallel execution is warranted to sustain the required detail, size and speed. However, few parallel simulators exist for such applications, partly due to the challenges underlying their development. Moreover, many simulators are based on time-stepped models, which can be computationally inefficient for the purposes of modeling evacuation traffic. Here an approach is presented to designing a simulator with memory and speed efficiency as the goals from the outset, and, specifically, scalability via parallel execution. The design makes use of discrete event modeling techniques as well as parallel simulation methods. Our simulator, called SCATTER, is being developed, incorporating such design considerations. Preliminary performance results are presented on benchmark road networks, showing scalability to one million vehicles simulated on one processor.
ANNarchy: a code generation approach to neural simulations on parallel hardware
Vitay, Julien; Dinkelbach, Helge Ü.; Hamker, Fred H.
2015-01-01
Many modern neural simulators focus on the simulation of networks of spiking neurons on parallel hardware. Another important framework in computational neuroscience, rate-coded neural networks, is mostly difficult or impossible to implement using these simulators. We present here the ANNarchy (Artificial Neural Networks architect) neural simulator, which allows one to easily define and simulate rate-coded and spiking networks, as well as combinations of both. The interface in Python has been designed to be close to the PyNN interface, while the definition of neuron and synapse models can be specified using an equation-oriented mathematical description similar to the Brian neural simulator. This information is used to generate C++ code that will efficiently perform the simulation on the chosen parallel hardware (multi-core system or graphical processing unit). Several numerical methods are available to transform ordinary differential equations into efficient C++ code. We compare the parallel performance of the simulator to existing solutions.
Xyce parallel electronic simulator : users' guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.
2011-05-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; (2) improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase, a message passing parallel implementation, which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1989-01-01
The data parallel implementation of a particle simulation for hypersonic rarefied flow described by Dagum associates a single parallel data element with each particle in the simulation. The simulated space is divided into discrete regions called cells containing a variable and constantly changing number of particles. The implementation requires a global sort of the parallel data elements so as to arrange them in an order that allows immediate access to the information associated with cells in the simulation. Described here is a very fast algorithm for performing the necessary ranking of the parallel data elements. The performance of the new algorithm is compared with that of the microcoded instruction for ranking on the Connection Machine.
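Such data-parallel ranking reduces to a prefix sum (scan): once elements are keyed by cell, the rank of each element is its cell's start offset plus its position within the cell. A serial sketch of that reduction, with Python standing in for the per-processor data-parallel operations; the names are illustrative, not the Connection Machine code.

```python
def ranks_from_keys(keys, n_cells):
    """Rank of each element in cell order, via histogram + exclusive scan."""
    counts = [0] * n_cells
    for k in keys:                           # histogram of cell occupancy
        counts[k] += 1
    offsets, run = [0] * n_cells, 0
    for c in range(n_cells):                 # exclusive prefix sum of counts
        offsets[c], run = run, run + counts[c]
    rank, seen = [0] * len(keys), [0] * n_cells
    for i, k in enumerate(keys):
        rank[i] = offsets[k] + seen[k]       # offset of cell + position within it
        seen[k] += 1
    return rank

print(ranks_from_keys([2, 0, 1, 2, 0], n_cells=3))  # -> [3, 0, 2, 4, 1]
```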
Symplectic molecular dynamics simulations on specially designed parallel computers.
Borstnik, Urban; Janezic, Dusanka
2005-01-01
We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required per step and fewer steps are needed, enabling fast MD simulations. We study the computational performance of MD simulations of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
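The split-integration idea can be illustrated on a single stiff oscillator: propagate the harmonic part analytically (a rotation in phase space) and apply the remaining slow forces as momentum kicks, which is what permits time steps longer than a plain Verlet step would tolerate. A minimal 1D sketch under that assumption; all names and parameters are illustrative, not the SISM code.

```python
import math

def sism_step(x, v, dt, omega, slow_force, mass=1.0):
    """One split step: half slow kick, analytic harmonic flow, half slow kick."""
    v += 0.5 * dt * slow_force(x) / mass
    c, s = math.cos(omega * dt), math.sin(omega * dt)
    # Exact solution of the harmonic part (RHS uses pre-update x and v).
    x, v = x * c + (v / omega) * s, v * c - x * omega * s
    v += 0.5 * dt * slow_force(x) / mass
    return x, v
```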
Parallel discrete-event simulation of FCFS stochastic queueing networks
NASA Technical Reports Server (NTRS)
Nicol, David M.
1988-01-01
Physical systems are inherently parallel. Intuition suggests that simulations of these systems may be amenable to parallel execution. The parallel execution of a discrete-event simulation requires careful synchronization of processes in order to ensure the execution's correctness; this synchronization can degrade performance. Largely negative results were recently reported in a study which used a well-known synchronization method on queueing network simulations. Discussed here is a synchronization method (appointments) which has proven itself to be effective on simulations of FCFS queueing networks. The key concept behind appointments is the provision of lookahead. Lookahead is a prediction of a processor's future behavior, based on an analysis of the processor's simulation state. We show how lookahead can be computed for FCFS queueing network simulations, give performance data that demonstrate the method's effectiveness under moderate to heavy loads, and discuss performance tradeoffs between the quality of lookahead and the cost of computing lookahead.
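One way lookahead can arise for a non-preemptive FCFS server, sketched below under assumptions: because service is first-come-first-served, the jobs already queued bound the earliest timestamp at which any future arrival could depart, and that bound can be promised (an "appointment") to downstream processors. The function and its inputs are illustrative, not the paper's formulation; queued jobs' service times are assumed known on arrival.

```python
def fcfs_lookahead(now, busy_until, queued_service_times, min_service):
    """Earliest possible departure time of a job that has not yet arrived."""
    # All queued work must finish first, then the new job needs its own service.
    horizon = max(now, busy_until) + sum(queued_service_times)
    return horizon + min_service

# A downstream processor may safely simulate up to this timestamp.
print(fcfs_lookahead(now=10.0, busy_until=12.5,
                     queued_service_times=[1.0, 2.0], min_service=0.5))
```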
Horrocks, Nicholas P C; Hegemann, Arne; Matson, Kevin D; Hine, Kathryn; Jaquier, Sophie; Shobrak, Mohammed; Williams, Joseph B; Tinbergen, Joost M; Tieleman, B Irene
2012-01-01
Immune defense may vary as a result of trade-offs with other life-history traits or in parallel with variation in antigen levels in the environment. We studied lark species (Alaudidae) in the Arabian Desert and temperate Netherlands to test opposing predictions from these two hypotheses. Based on their slower pace of life, the trade-off hypothesis predicts relatively stronger immune defenses in desert larks compared with temperate larks. However, as predicted by the antigen exposure hypothesis, reduced microbial abundances in deserts should result in desert-living larks having relatively weaker immune defenses. We quantified host-independent and host-dependent microbial abundances of culturable microbes in ambient air and from the surfaces of birds. We measured components of immunity by quantifying concentrations of the acute-phase protein haptoglobin, natural antibody-mediated agglutination titers, complement-mediated lysis titers, and the microbicidal ability of whole blood. Desert-living larks were exposed to significantly lower concentrations of airborne microbes than temperate larks, and densities of some bird-associated microbes were also lower in desert species. Haptoglobin concentrations and lysis titers were also significantly lower in desert-living larks, but other immune indexes did not differ. Thus, contrary to the trade-off hypothesis, we found little evidence that a slow pace of life predicted increased immunological investment. In contrast, and in support of the antigen exposure hypothesis, associations between microbial exposure and some immune indexes were apparent. Measures of antigen exposure, including assessment of host-independent and host-dependent microbial assemblages, can provide novel insights into the mechanisms underlying immunological variation.
Progress in Unsteady Turbopump Flow Simulations
NASA Technical Reports Server (NTRS)
Kiris, Cetin C.; Chan, William; Kwak, Dochan; Williams, Robert
2002-01-01
This viewgraph presentation discusses unsteady flow simulations for a turbopump intended for a reusable launch vehicle (RLV). The simulation process makes use of computational grids and parallel processing. The architecture of the parallel computers used is discussed, as is the scripting of turbopump simulations.
NASA Astrophysics Data System (ADS)
Byun, Hye Suk; El-Naggar, Mohamed Y.; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya
2017-10-01
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time-dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked-list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.
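For context, the elementary serial KMC step that such methods build on can be sketched as follows (a generic Gillespie/BKL-style step, not the SPKMC code itself); SPKMC's contribution is to let spatial domains execute these steps synchronously, so the time increment no longer shrinks with system size.

```python
import math, random

# Generic kinetic Monte Carlo step (hypothetical sketch).
def kmc_step(rates, t):
    """Select one event with probability rate/total and advance the clock
    by an exponentially distributed waiting time."""
    total = sum(rates)
    pick = random.uniform(0.0, total)
    chosen = len(rates) - 1          # fallback guards against float rounding
    acc = 0.0
    for i, r in enumerate(rates):
        acc += r
        if pick <= acc:
            chosen = i
            break
    t += -math.log(1.0 - random.random()) / total
    return chosen, t

event, t = kmc_step([0.5, 1.2, 0.3], t=0.0)
```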
A hybrid parallel framework for the cellular Potts model simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Yi; He, Kejing; Dong, Shoubin
2009-01-01
The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximate, and cannot be used for large-scale, complex 3D simulations. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are increasingly common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation (~10^8 sites) of the complex collective behavior of numerous cells (~10^6).
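The lattice update that the framework parallelizes with OpenMP is, at its core, a Metropolis spin-copy attempt; a bare-bones serial sketch (hypothetical; delta_energy stands in for the CPM's adhesion and volume-constraint terms):

```python
import math, random

def cpm_attempt(lattice, delta_energy, temperature):
    """One Metropolis attempt: try to copy a neighbour's cell id."""
    nx, ny = len(lattice), len(lattice[0])
    x, y = random.randrange(nx), random.randrange(ny)
    dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    sx, sy = (x + dx) % nx, (y + dy) % ny    # periodic 4-neighbour
    source = lattice[sx][sy]
    if source == lattice[x][y]:
        return False                          # same cell, nothing to do
    dE = delta_energy(lattice, x, y, source)  # adhesion + volume terms
    if dE <= 0 or random.random() < math.exp(-dE / temperature):
        lattice[x][y] = source                # accept the copy
        return True
    return False
```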
Parallel discrete event simulation: A shared memory approach
NASA Technical Reports Server (NTRS)
Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.
1987-01-01
With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
Lu, Wei; Fan, Wen Yi; Tian, Tian
2016-05-01
Keeping other parameters as empirical constants, different numerical combinations of the main photosynthetic parameters Vcmax and Jmax were tested to estimate daily GPP by an iteration method. To optimize Vcmax and Jmax in the BEPSHourly model at hourly time steps, daily GPP simulated with the different parameter combinations was compared with flux tower data from the temperate deciduous broad-leaved forest of the Maoershan Forest Farm in Northeast China. Comparing the simulated daily GPP with the observed flux data in 2011 showed that the optimal Vcmax and Jmax for the deciduous broad-leaved forest in Northeast China were 41.1 μmol·m⁻²·s⁻¹ and 82.8 μmol·m⁻²·s⁻¹, respectively, with a minimal RMSE of 1.10 g C·m⁻²·d⁻¹ and a maximum R² of 0.95. After Vcmax and Jmax optimization, the BEPSHourly model simulated the seasonal variation of GPP better.
Terrestrial biosphere changes over the last 120 kyr
NASA Astrophysics Data System (ADS)
Hoogakker, B. A. A.; Smith, R. S.; Singarayer, J. S.; Marchant, R.; Prentice, I. C.; Allen, J. R. M.; Anderson, R. S.; Bhagwat, S. A.; Behling, H.; Borisova, O.; Bush, M.; Correa-Metrio, A.; de Vernal, A.; Finch, J. M.; Fréchette, B.; Lozano-Garcia, S.; Gosling, W. D.; Granoszewski, W.; Grimm, E. C.; Grüger, E.; Hanselman, J.; Harrison, S. P.; Hill, T. R.; Huntley, B.; Jiménez-Moreno, G.; Kershaw, P.; Ledru, M.-P.; Magri, D.; McKenzie, M.; Müller, U.; Nakagawa, T.; Novenko, E.; Penny, D.; Sadori, L.; Scott, L.; Stevenson, J.; Valdes, P. J.; Vandergoes, M.; Velichko, A.; Whitlock, C.; Tzedakis, C.
2016-01-01
A new global synthesis and biomization of long (> 40 kyr) pollen-data records is presented and used with simulations from the HadCM3 and FAMOUS climate models and the BIOME4 vegetation model to analyse the dynamics of the global terrestrial biosphere and carbon storage over the last glacial-interglacial cycle. Simulated biome distributions using BIOME4 driven by HadCM3 and FAMOUS at the global scale over time generally agree well with those inferred from pollen data. Global average areas of grassland and dry shrubland, desert, and tundra biomes show large-scale increases during the Last Glacial Maximum, between ca. 64 and 74 ka BP and cool substages of Marine Isotope Stage 5, at the expense of the tropical forest, warm-temperate forest, and temperate forest biomes. These changes are reflected in BIOME4 simulations of global net primary productivity, showing good agreement between the two models. Such changes are likely to affect terrestrial carbon storage, which in turn influences the stable carbon isotopic composition of seawater as terrestrial carbon is depleted in 13C.
δ15N constraints on long-term nitrogen balances in temperate forests
Perakis, S.S.; Sinkhorn, E.R.; Compton, J.E.
2011-01-01
Biogeochemical theory emphasizes nitrogen (N) limitation and the many factors that can restrict N accumulation in temperate forests, yet lacks a working model of conditions that can promote naturally high N accumulation. We used a dynamic simulation model of ecosystem N and δ15N to evaluate which combination of N input and loss pathways could produce a range of high ecosystem N contents characteristic of forests in the Oregon Coast Range. Total ecosystem N at nine study sites ranged from 8,788 to 22,667 kg ha⁻¹ and carbon (C) ranged from 188 to 460 Mg ha⁻¹, with highest values near the coast. Ecosystem δ15N displayed a curvilinear relationship with ecosystem N content, and largely reflected mineral soil, which accounted for 96-98% of total ecosystem N. Model simulations of ecosystem N balances parameterized with field rates of N leaching required long-term average N inputs that exceed atmospheric deposition and asymbiotic and epiphytic N2-fixation, and that were consistent with cycles of post-fire N2-fixation by early-successional red alder. Soil water δ15N-NO3⁻ patterns suggested a shift in relative N losses from denitrification to nitrate leaching as N accumulated, and simulations identified nitrate leaching as the primary N loss pathway that constrains maximum N accumulation. Whereas current theory emphasizes constraints on biological N2-fixation and disturbance-mediated N losses as factors that limit N accumulation in temperate forests, our results suggest that wildfire can foster substantial long-term N accumulation in ecosystems that are colonized by symbiotic N2-fixing vegetation.
GPU accelerated population annealing algorithm
NASA Astrophysics Data System (ADS)
Barash, Lev Yu.; Weigel, Martin; Borovský, Michal; Janke, Wolfhard; Shchur, Lev N.
2017-11-01
Population annealing is a promising recent approach for Monte Carlo simulations in statistical physics, in particular for the simulation of systems with complex free-energy landscapes. It is a hybrid method, combining importance sampling through Markov chains with elements of sequential Monte Carlo in the form of population control. While it appears to provide algorithmic capabilities for the simulation of such systems that are roughly comparable to those of more established approaches such as parallel tempering, it is intrinsically much more suitable for massively parallel computing. Here, we tap into this structural advantage and present a highly optimized implementation of the population annealing algorithm on GPUs that promises speed-ups of several orders of magnitude as compared to a serial implementation on CPUs. While the sample code is for simulations of the 2D ferromagnetic Ising model, it should be easily adapted for simulations of other spin models, including disordered systems. Our code includes implementations of some advanced algorithmic features that have only recently been suggested, namely the automatic adaptation of temperature steps and a multi-histogram analysis of the data at different temperatures.
Program Files doi: http://dx.doi.org/10.17632/sgzt4b7b3m.1
Licensing provisions: Creative Commons Attribution license (CC BY 4.0)
Programming language: C, CUDA
External routines/libraries: NVIDIA CUDA Toolkit 6.5 or newer
Nature of problem: The program calculates the internal energy, specific heat, several magnetization moments, entropy and free energy of the 2D Ising model on square lattices of edge length L with periodic boundary conditions as a function of inverse temperature β.
Solution method: The code uses population annealing, a hybrid method combining Markov chain updates with population control. The code is implemented for NVIDIA GPUs using the CUDA language and employs advanced techniques such as multi-spin coding, adaptive temperature steps and multi-histogram reweighting.
Additional comments: Code repository at https://github.com/LevBarash/PAising. The system size and size of the population of replicas are limited depending on the memory of the GPU device used. For the default parameter values used in the sample programs, L = 64, θ = 100, β0 = 0, βf = 1, Δβ = 0.005, R = 20 000, a typical run time on an NVIDIA Tesla K80 GPU is 151 seconds for the single spin coded (SSC) and 17 seconds for the multi-spin coded (MSC) program (see Section 2 for a description of these parameters).
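A schematic of one population-annealing temperature step (our hypothetical Python sketch of the textbook algorithm, not the paper's CUDA code): replicas are reweighted for the new inverse temperature, resampled so the population stays near its target size, and then decorrelated by Monte Carlo sweeps.

```python
import math, random

def pa_step(replicas, energies, d_beta, sweep):
    """Reweight by exp(-d_beta*E), resample, then re-equilibrate."""
    e0 = min(energies)                       # shift for numerical stability
    weights = [math.exp(-d_beta * (e - e0)) for e in energies]
    mean_w = sum(weights) / len(weights)
    resampled = []
    for rep, w in zip(replicas, weights):
        tau = w / mean_w                     # expected number of copies
        n = int(tau) + (random.random() < tau - int(tau))
        resampled.extend([rep] * n)          # real code would deep-copy
    return [sweep(r) for r in resampled]     # Metropolis sweeps at new beta
```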
Phosphorus limits Eucalyptus grandis seedling growth in an unburnt rain forest soil
Tng, David Y. P.; Janos, David P.; Jordan, Gregory J.; Weber, Ellen; Bowman, David M. J. S.
2014-01-01
Although rain forest is characterized as pyrophobic, pyrophilic giant eucalypts grow as rain forest emergents in both temperate and tropical Australia. In temperate Australia, such eucalypts depend on extensive, infrequent fires to produce conditions suitable for seedling growth. Little is known, however, about constraints on seedlings of tropical giant eucalypts. We tested whether seedlings of Eucalyptus grandis experience edaphic constraints similar to their temperate counterparts. We hypothesized that phosphorus addition would alleviate edaphic constraints. We grew seedlings in a factorial experiment combining fumigation (to simulate nutrient release and soil pasteurization by fire), soil type (E. grandis forest versus rain forest soil) and phosphorus addition as factors. We found that phosphorus was the principal factor limiting E. grandis seedling survival and growth in rain forest soil, and that fumigation enhanced survival of seedlings in both E. grandis forest and rain forest soil. We conclude that, similar to edaphic constraints on temperate giant eucalypts, mineral nutrient and biotic attributes of a tropical rain forest soil may hamper E. grandis seedling establishment. In rain forest soil, E. grandis seedlings benefited from conditions akin to a fire-generated ashbed (i.e., an “ashbed effect”). PMID:25339968
Sliding mode controllers for a tempered glass furnace.
Almutairi, Naif B; Zribi, Mohamed
2016-01-01
This paper investigates the design of two sliding mode controllers (SMCs) applied to a tempered glass furnace system. The main objective of the proposed controllers is to regulate the glass plate temperature, the upper-wall temperature and the lower-wall temperature in the furnace to a common desired temperature. The first controller is a conventional sliding mode controller. The key step in the design of this controller is the introduction of a nonlinear transformation that maps the dynamic model of the tempered glass furnace into the generalized controller canonical form; this step facilitates the design of the sliding mode controller. The second controller is based on a state-dependent coefficient (SDC) factorization of the tempered glass furnace dynamic model. Using an SDC factorization, a simplified sliding mode controller is designed. The simulation results indicate that the two proposed control schemes work very well. Moreover, the robustness of the control schemes to changes in the system's parameters as well as to disturbances is investigated. In addition, the proposed control schemes are compared with a fuzzy PID controller; the comparison shows that the SDC-based sliding mode controller performs better.
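To illustrate the ingredients named here, a generic first-order sliding mode control law in Python (a hypothetical sketch; the paper's controllers are derived from the actual furnace model and its canonical form):

```python
def smc_control(error, d_error, lam=1.0, k=5.0, phi=0.1):
    """u = -k * sat(s/phi) with sliding surface s = e' + lam*e.
    The saturation (boundary layer) replaces sign() to reduce chattering."""
    s = d_error + lam * error
    sat = max(-1.0, min(1.0, s / phi))
    return -k * sat

# Example: temperature error of 3 degrees, rising at 0.5 deg/s
u = smc_control(3.0, 0.5)
```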
Adamo, Shelley A; Baker, Jillian L; Lovett, Maggie M E; Wilson, Graham
2012-12-01
Climate change will result in warmer temperatures and an increase in the frequency and severity of extreme weather events. Given that higher temperatures increase the reproductive rate of temperate zone insects, insect population growth rates are predicted to increase in the temperate zone in response to climate change. This consensus, however, rests on the assumption that food is freely available. Under conditions of limited food, the reproductive output of the Texan cricket Gryllus texensis (Cade and Otte) was highest at its current normal average temperature and declined with increasing temperature. Moreover, low food availability decreased survival during a simulated heat wave. Therefore, the effects of climate change on this species, and possibly on many others, are likely to hinge on food availability. Extrapolation from our data suggests that G. texensis will show larger yearly fluctuations in population size as climate change continues, and this will also have ecological repercussions. Only those temperate zone insects with a ready supply of food (e.g., agricultural pests) are likely to experience the predicted increase in population growth in response to climate change; food-limited species are likely to experience a population decline.
Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P
DOE Office of Scientific and Technical Information (OSTI.GOV)
Candel, A.; Kabel, A.; Lee, L.
In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).
Xyce Parallel Electronic Simulator Users' Guide Version 6.8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message-passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Effects of oncogenic mutations on the conformational free-energy landscape of EGFR kinase
Sutto, Ludovico; Gervasio, Francesco Luigi
2013-01-01
Activating mutations in the epidermal growth factor receptor (EGFR) tyrosine kinase are frequently found in many cancers. It has been suggested that changes in the equilibrium between its active and inactive conformations are linked to its oncogenic potential. Here, we quantify the effects of some of the most common single (L858R and T790M) and double (T790M-L858R) oncogenic mutations on the conformational free-energy landscape of the EGFR kinase domain by using massive molecular dynamics simulations together with parallel tempering, metadynamics, and one of the best force-fields available. Whereas the wild-type EGFR catalytic domain monomer is mostly found in an inactive conformation, our results show a clear shift toward the active conformation for all of the mutants. The L858R mutation stabilizes the active conformation at the expense of the inactive conformation and rigidifies the αC-helix. The T790M gatekeeper mutant favors activation by stabilizing a hydrophobic cluster. Finally, T790M with L858R shows a significant positive epistasis effect. This combination not only stabilizes the active conformation, but changes the free-energy landscape in nontrivial ways, lowering the transition barriers. PMID:23754386
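The parallel tempering component used in these simulations rests on a simple exchange test between neighbouring replicas; a generic sketch of the standard acceptance criterion (not the authors' production code):

```python
import math, random

def try_swap(beta_i, beta_j, E_i, E_j):
    """Metropolis criterion for swapping configurations between two
    replicas: accept with probability min(1, exp[(b_i - b_j)(E_i - E_j)])."""
    delta = (beta_i - beta_j) * (E_i - E_j)
    return delta >= 0 or random.random() < math.exp(delta)
```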
Bayesian inference on EMRI signals using low frequency approximations
NASA Astrophysics Data System (ADS)
Ali, Asad; Christensen, Nelson; Meyer, Renate; Röver, Christian
2012-07-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes, the detection and parameter estimation of such sources are challenging tasks. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions.
A hybrid algorithm for parallel molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Mangiardi, Chris M.; Meyer, R.
2017-10-01
This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.
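The short-range force evaluation that such schemes decompose is typically organized through cell (linked-list) structures; a simplified sketch of the idea (hypothetical, far simpler than the paper's hybrid MPI/thread/SIMD implementation):

```python
def build_cells(positions, box, rc):
    """Bin particles into cubic cells of width >= rc so that only particles
    in the same or neighbouring cells need distance tests against the
    cutoff rc. Positions are assumed wrapped into [0, box)."""
    n = max(1, int(box / rc))                # cells per dimension
    cells = {}
    for i, (x, y, z) in enumerate(positions):
        key = (int(x / box * n) % n,
               int(y / box * n) % n,
               int(z / box * n) % n)
        cells.setdefault(key, []).append(i)
    return cells, n
```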
A parallel simulated annealing algorithm for standard cell placement on a hypercube computer
NASA Technical Reports Server (NTRS)
Jones, Mark Howard
1987-01-01
A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small and large distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than the uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.
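The serial kernel that the hypercube version distributes is the usual annealing accept/reject on a placement cost; a minimal sketch (hypothetical, with the move generation and cost model elided):

```python
import math, random

def accept(delta_cost, T):
    """Standard Metropolis acceptance for a proposed placement move."""
    return delta_cost <= 0 or random.random() < math.exp(-delta_cost / T)

T = 100.0
while T > 0.01:
    # propose a cell exchange or displacement, compute delta_cost,
    # and keep the move if accept(delta_cost, T) is True ...
    T *= 0.95          # geometric cooling schedule (illustrative)
```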
Methods of parallel computation applied on granular simulations
NASA Astrophysics Data System (ADS)
Martins, Gustavo H. B.; Atman, Allbens P. F.
2017-06-01
Parallel computing becomes cheaper and more accessible every year, and as a consequence its applications are spreading over all research areas. Granular materials are a promising area for parallel computing. To support this statement we study the impact of parallel computing on simulations of the BNE (Brazil Nut Effect), the remarkable rising of an intruder confined in a granular medium when it is vertically shaken against gravity. By means of DEM (Discrete Element Method) simulations, we study the code performance, testing different methods to improve wall-clock time. A comparison between serial and parallel algorithms, using OpenMP®, is also shown. The best improvement was obtained by optimizing the function that finds contacts using Verlet's cells.
Turbomachinery CFD on parallel computers
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Milner, Edward J.; Quealy, Angela; Townsend, Scott E.
1992-01-01
The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. Particularly, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts are focused on implementing the average-passage turbomachinery model on MIMD, distributed memory parallel computers. Performance results are given for inviscid, single blade row and viscous, multistage applications on several parallel computers, including networked workstations.
Massively parallel multicanonical simulations
NASA Astrophysics Data System (ADS)
Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard
2018-03-01
Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
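The serial core being parallelized here samples with multicanonical weights rather than Boltzmann factors; a compact sketch of the two essential pieces (a generic illustration, not the GPU code):

```python
import math, random

def muca_accept(E_old, E_new, logW):
    """Accept a move with probability min(1, W(E_new)/W(E_old))."""
    dW = logW.get(E_new, 0.0) - logW.get(E_old, 0.0)
    return dW >= 0 or random.random() < math.exp(dW)

def update_weights(logW, hist):
    """Iterate the weights from the measured energy histogram so that
    over-visited energies are penalized, flattening the distribution."""
    for E, h in hist.items():
        if h > 0:
            logW[E] = logW.get(E, 0.0) - math.log(h)
```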
Solar wind interaction with Venus and Mars in a parallel hybrid code
NASA Astrophysics Data System (ADS)
Jarvinen, Riku; Sandroos, Arto
2013-04-01
We discuss the development and applications of a new parallel hybrid simulation, in which ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction between the solar wind and Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform, also developed at the FMI. The FMI's sequential hybrid model has been used for studies of the plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. In particular, the model has been used to interpret in situ particle and magnetic field observations from the plasma environments of Mars, Venus and Titan. Further, Corsair is an open source MPI (Message Passing Interface) particle and mesh simulation platform, aimed mainly at simulations of diffusive shock acceleration in the solar corona and interplanetary space, but now also being extended for global planetary hybrid simulations. In this presentation we discuss challenges and strategies of parallelizing a legacy simulation code, as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.
Coherence, causation, and the future of cognitive neuroscience research.
Ramey, Christopher H; Chrysikou, Evangelia G
2014-01-01
Nachev and Hacker's conceptual analysis of the neural antecedents of voluntary action underscores the real danger of ignoring the meta-theoretical apparatus of cognitive neuroscience research. In this response, we temper certain claims (e.g., whether or not certain research questions are incoherent), consider a more extreme consequence of their argument against cognitive neuroscience (i.e., whether or not one can speak about causation with neural antecedents at all), and, finally, highlight recent methodological developments that exemplify cognitive neuroscientists' focus on studying the brain as a parallel, dynamic, and highly complex biological system.
Acoustic simulation in architecture with parallel algorithm
NASA Astrophysics Data System (ADS)
Li, Xiaohong; Zhang, Xinrong; Li, Dan
2004-03-01
To address the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in the scene is solved with this method. The impulse responses between sources and receivers in each frequency band, calculated with multiple processes, are then combined into the whole frequency response. The numerical experiment shows that the parallel algorithm can improve the efficiency of acoustic simulation of complex scenes.
Massively parallel simulator of optical coherence tomography of inhomogeneous turbid media.
Malektaji, Siavash; Lima, Ivan T; Escobar I, Mauricio R; Sherif, Sherif S
2017-10-01
An accurate and practical simulator for Optical Coherence Tomography (OCT) could be an important tool to study the underlying physical phenomena in OCT such as multiple light scattering. Recently, many researchers have investigated simulation of OCT of turbid media, e.g., tissue, using Monte Carlo methods. The main drawback of these earlier simulators is the long computational time required to produce accurate results. We developed a massively parallel simulator of OCT of inhomogeneous turbid media that obtains both Class I diffusive reflectivity, due to ballistic and quasi-ballistic scattered photons, and Class II diffusive reflectivity, due to multiply scattered photons. This Monte Carlo-based simulator is implemented on graphics processing units (GPUs), using the Compute Unified Device Architecture (CUDA) platform and programming model, to exploit the parallel nature of propagation of photons in tissue. It models an arbitrarily shaped sample medium as a tetrahedron-based mesh and uses an advanced importance sampling scheme. This new simulator speeds up simulations of OCT of inhomogeneous turbid media by about two orders of magnitude. To demonstrate this result, we have compared the computation times of our new parallel simulator and its serial counterpart using two samples of inhomogeneous turbid media. We have shown that our parallel implementation reduced the simulation time of OCT of the first sample medium from 407 min to 92 min by using a single GPU card, to 12 min by using 8 GPU cards and to 7 min by using 16 GPU cards. For the second sample medium, the OCT simulation time was reduced from 209 h to 35.6 h by using a single GPU card, to 4.65 h by using 8 GPU cards, and to only 2 h by using 16 GPU cards. Therefore our new parallel simulator is considerably more practical to use than its central processing unit (CPU)-based counterpart. Our new parallel OCT simulator could be a practical tool to study the different physical phenomena underlying OCT, or to design OCT systems with improved performance.
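The elementary operation that the GPU executes in parallel for huge photon ensembles is the sampling of each photon's free path between interaction events; a minimal sketch (hypothetical; the real simulator additionally handles tetrahedral mesh boundaries and importance sampling):

```python
import math, random

def free_path(mu_t):
    """Distance to the next scattering/absorption event in a medium with
    total interaction coefficient mu_t = mu_a + mu_s (Beer-Lambert law)."""
    return -math.log(1.0 - random.random()) / mu_t

# Example: mean free path for mu_t = 10 / mm is 0.1 mm
samples = [free_path(10.0) for _ in range(100_000)]
print(sum(samples) / len(samples))   # close to 0.1
```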
NASA Technical Reports Server (NTRS)
Hsieh, Shang-Hsien
1993-01-01
The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.
Extended Hamiltonian approach to continuous tempering
NASA Astrophysics Data System (ADS)
Gobbo, Gianpaolo; Leimkuhler, Benedict J.
2015-06-01
We introduce an enhanced sampling simulation technique based on continuous tempering, i.e., on continuously varying the temperature of the system under investigation. Our approach is mathematically straightforward, being based on an extended Hamiltonian formulation in which an auxiliary degree of freedom, determining the effective temperature, is coupled to the physical system. The physical system and its temperature evolve continuously in time according to the equations of motion derived from the extended Hamiltonian. Due to the Hamiltonian structure, it is easy to show that a particular subset of the configurations of the extended system is distributed according to the canonical ensemble for the physical system at the correct physical temperature.
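One schematic realization of such an extended Hamiltonian (our notation, intended only to illustrate the construction described above, not necessarily the authors' exact form) couples an auxiliary variable α, with conjugate momentum p_α, to the physical potential:

$$ H_{\mathrm{ext}}(q, p, \alpha, p_\alpha) = K(p) + \lambda(\alpha)\,U(q) + \frac{p_\alpha^2}{2m_\alpha} + \phi(\alpha), $$

where λ(α) rescales the physical potential U (so the effective temperature is T/λ(α)) and φ(α) confines the auxiliary degree of freedom; on the subset of configurations with λ(α) = 1, the physical system is sampled from the canonical ensemble at the true temperature.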
Deforestation intensifies hot days
NASA Astrophysics Data System (ADS)
Stoy, Paul C.
2018-05-01
Deforestation often increases land-surface and near-surface temperatures, but climate models struggle to simulate this effect. Research now shows that deforestation has increased the severity of extreme heat in temperate regions of North America and Europe. This points to opportunities to mitigate extreme heat.
NASA Astrophysics Data System (ADS)
Thurner, Martin; Beer, Christian; Carvalhais, Nuno; Forkel, Matthias; Tito Rademacher, Tim; Santoro, Maurizio; Tum, Markus; Schmullius, Christiane
2016-04-01
Long-term vegetation dynamics are one of the key uncertainties of the carbon cycle. There are large differences between global vegetation models in simulated vegetation carbon stocks and fluxes, including productivity, respiration and carbon turnover. In particular, the implementation of climate-related mortality processes (for instance drought, fire, frost or insect effects) is often missing or insufficient in current models, and their importance at the global scale is highly uncertain. These shortcomings have persisted owing to the lack of spatially extensive information on vegetation carbon stocks, which cannot be provided by inventory data alone. Instead, we have recently been able to estimate northern boreal and temperate forest carbon stocks from radar remote sensing data. Our spatially explicit product (0.01° resolution) shows strong agreement with inventory-based estimates at a regional scale and allows for a spatial evaluation of carbon stocks and dynamics simulated by global vegetation models. By combining this state-of-the-art biomass product with NPP datasets originating from remote sensing, we are able to study the relation between carbon turnover rate and a set of climate indices in northern boreal and temperate forests along spatial gradients. We observe an increasing turnover rate with colder winter temperatures and longer winters in boreal forests, suggesting that frost damage and the trade-off between frost adaptation and growth are important mortality processes in this ecosystem. In contrast, turnover rate increases with climatic conditions favouring drought and insect outbreaks in temperate forests. The investigated global vegetation models from the Inter-Sectoral Impact Model Intercomparison Project (ISI-MIP), including HYBRID4, JeDi, JULES, LPJml, ORCHIDEE, SDGVM, and VISIT, are able to reproduce the observation-based spatial climate/turnover-rate relationships only to a limited extent. While most of the models compare relatively well in terms of NPP, simulated vegetation carbon stocks are severely biased compared to our biomass dataset. Current limitations lead to considerable uncertainties in the estimated vegetation carbon turnover, contributing substantially to the forest feedback to climate change. Our results are the basis for improving mortality concepts in models and estimating their impact on the land carbon balance.
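The diagnostic implied by combining a biomass map with an NPP product is the standard steady-state turnover relation (our reading; the abstract does not state its estimator explicitly):

$$ \tau = \frac{C_{\mathrm{veg}}}{\mathrm{NPP}}, \qquad k = \frac{1}{\tau} = \frac{\mathrm{NPP}}{C_{\mathrm{veg}}}, $$

so that, assuming vegetation carbon is approximately in equilibrium, the turnover time τ (or rate k) can be mapped spatially wherever both datasets overlap.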
Crashworthiness simulations with DYNA3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schauer, D.A.; Hoover, C.G.; Kay, G.J.
1996-04-01
Current progress in parallel algorithm research and applications in vehicle crash simulation is described for the explicit, finite element algorithms in DYNA3D. Problem partitioning methods and parallel algorithms for contact at material interfaces are the two challenging algorithm research problems that are addressed. Two prototype parallel contact algorithms have been developed for treating the cases of local and arbitrary contact. Demonstration problems for local contact are crashworthiness simulations with 222 locally defined contact surfaces and a vehicle/barrier collision modeled with arbitrary contact. A simulation of crash tests conducted for a vehicle impacting a U-channel small sign post embedded in soil has been run on both the serial and parallel versions of DYNA3D. A significant reduction in computational time has been observed when running these problems on the parallel version. However, to achieve maximum efficiency, complex problems must be appropriately partitioned, especially when contact dominates the computation.
pWeb: A High-Performance, Parallel-Computing Framework for Web-Browser-Based Medical Simulation.
Halic, Tansel; Ahn, Woojin; De, Suvranu
2014-01-01
This work presents pWeb, a new language and compiler for the parallelization of client-side compute-intensive web applications such as surgical simulations. The recently introduced HTML5 standard has enabled creating unprecedented applications on the web. Low performance of the web browser, however, remains the bottleneck for computationally intensive applications such as visualization of complex scenes, real-time physical simulation and image processing, compared to native applications. The new proposed language is built upon web workers for multithreaded programming in HTML5. The language provides the fundamental functionality of parallel programming languages, as well as the fork/join parallel model, which is not supported by web workers. The language compiler automatically generates an equivalent parallel script that complies with the HTML5 standard. A case study on realistic rendering for surgical simulations demonstrates enhanced performance with a compact set of instructions.
n-body simulations using message passing parallel computers.
NASA Astrophysics Data System (ADS)
Grama, A. Y.; Kumar, V.; Sameh, A.
The authors present new parallel formulations of the Barnes-Hut method for n-body simulations on message passing computers. These parallel formulations partition the domain efficiently incurring minimal communication overhead. This is in contrast to existing schemes that are based on sorting a large number of keys or on the use of global data structures. The new formulations are augmented by alternate communication strategies which serve to minimize communication overhead. The impact of these communication strategies is experimentally studied. The authors report on experimental results obtained from an astrophysical simulation on an nCUBE2 parallel computer.
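At the heart of any Barnes-Hut formulation, parallel or not, is the opening criterion that decides when a whole tree node may stand in for its particles; a minimal sketch (hypothetical code, independent of the message-passing scheme discussed in the paper):

```python
import math

def use_node_as_whole(node_size, node_com, particle_pos, theta=0.5):
    """Treat a tree node of extent node_size as a single pseudo-particle at
    its centre of mass when s/d < theta; otherwise descend into children.
    Smaller theta means higher accuracy and more work."""
    d = math.dist(node_com, particle_pos)
    return d > 0 and node_size / d < theta

print(use_node_as_whole(1.0, (10.0, 0.0, 0.0), (0.0, 0.0, 0.0)))  # True
```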
A conservative approach to parallelizing the Sharks World simulation
NASA Technical Reports Server (NTRS)
Nicol, David M.; Riffe, Scott E.
1990-01-01
Parallelizing a benchmark problem for parallel simulation, the Sharks World, is described. The described solution is conservative, in the sense that no state information is saved and no 'rollbacks' occur. The approach illustrates both the principal advantage and the principal disadvantage of conservative parallel simulation. The advantage is that, by exploiting lookahead, an approach was found that dramatically improves on the serial execution time and also achieves excellent speedups. The disadvantage is that if the model rules are changed in such a way that the lookahead is destroyed, it is difficult to modify the solution to accommodate the changes.
AC losses in horizontally parallel HTS tapes for possible wireless power transfer applications
NASA Astrophysics Data System (ADS)
Shen, Boyang; Geng, Jianzhao; Zhang, Xiuchang; Fu, Lin; Li, Chao; Zhang, Heng; Dong, Qihuan; Ma, Jun; Gawith, James; Coombs, T. A.
2017-12-01
This paper presents the concept of using horizontally parallel HTS tapes, with an AC loss study and an investigation of possible wireless power transfer (WPT) applications. An example with three parallel HTS tapes was proposed; its AC losses were studied both experimentally, using the electrical method, and numerically, using a 2D H-formulation on the FEM platform of COMSOL Multiphysics. The electromagnetic induction around the three parallel tapes was monitored in the COMSOL simulation. The electromagnetic induction and AC losses generated by a conventional three-turn coil were simulated as well, and compared to the case of three parallel tapes carrying the same AC transport current. The analysis demonstrates that parallel HTS tapes could potentially be used in wireless power transfer systems, with lower total AC losses than conventional HTS coils.
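For reference, the 2D H-formulation named here solves Faraday's law with the magnetic field components as state variables and a power-law resistivity for the superconductor (the standard form in the HTS modelling literature, not specific to this paper):

$$ \nabla \times (\rho\, \nabla \times \mathbf{H}) = -\mu_0 \mu_r\, \frac{\partial \mathbf{H}}{\partial t}, \qquad \mathbf{J} = \nabla \times \mathbf{H}, \qquad E = E_0 \left( \frac{|J|}{J_c} \right)^{n}, $$

and the AC loss per cycle follows from integrating E · J over the conductor cross-section and one period.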
Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message-passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
NASA Astrophysics Data System (ADS)
Jo, Sunhwan; Jiang, Wei
2015-12-01
Replica Exchange with Solute Tempering (REST2) is a powerful sampling-enhancement algorithm for molecular dynamics (MD) in that it needs a significantly smaller number of replicas yet achieves higher sampling efficiency relative to the standard temperature-exchange algorithm. In this paper, we extend the applicability of REST2 to quantitative biophysical simulations through a robust and generic implementation in the highly scalable MD software NAMD. The rescaling procedure for the force-field parameters controlling the REST2 "hot region" is implemented in NAMD at the source-code level. A user can conveniently select the hot region through VMD and write the selection information into a PDB file. The rescaling keyword/parameter is exposed through NAMD's Tcl scripting interface, which enables on-the-fly changes of simulation parameters. Our implementation of REST2 is built within a communication-enabled Tcl script on top of Charm++; thus the communication overhead of an exchange attempt is vanishingly small. Such a generic implementation facilitates seamless cooperation between REST2 and other modules of NAMD to provide enhanced sampling for complex biomolecular simulations. Three challenging applications were carried out on an IBM Blue Gene/Q supercomputer to demonstrate the efficacy of the present implementation: native REST2 simulation of a peptide folding-unfolding transition, free energy perturbation/REST2 for the absolute binding affinity of a protein-ligand complex, and umbrella sampling/REST2 Hamiltonian exchange for free-energy landscape calculations.
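The rescaling procedure referred to here follows the standard REST2 effective potential for replica m (with pp denoting interactions within the hot region, pw the hot-cold cross terms, and ww the cold-cold terms):

$$ E_m = \frac{\beta_m}{\beta_0}\, E_{\mathrm{pp}} + \sqrt{\frac{\beta_m}{\beta_0}}\, E_{\mathrm{pw}} + E_{\mathrm{ww}}, $$

so only the force-field parameters of the hot region and its cross interactions need to be scaled, which is what the source-level implementation in NAMD applies.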
Sensitivity of DIVWAG to Variations in Weather Parameters
1976-04-01
Keywords: DIVWAG; war game; simulation. The study exercises a simulation of a division-level war game to determine the significance of varying battlefield parameters, i.e., artillery parameters, troop and ... The only Red artillery weapons doing better in bad weather are the 130mm guns, but this statistic is tempered by the few casualties occurring in ...
NASA Astrophysics Data System (ADS)
Valasek, Lukas; Glasa, Jan
2017-12-01
Current fire simulation systems are capable of utilizing the advantages of available high-performance computing (HPC) platforms and of modeling fires efficiently in parallel. In this paper, the efficiency of a corridor fire simulation on an HPC cluster is discussed. The parallel MPI version of the Fire Dynamics Simulator is used to test the efficiency of selected strategies for allocating the cluster's computational resources when a larger number of computational cores is used. Simulation results indicate that if the number of cores used is not a multiple of the number of cores per cluster node, some allocation strategies provide more efficient calculations than others.
NASA Astrophysics Data System (ADS)
Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao
2017-01-01
Multi-scale, high-resolution modeling of the rock failure process is a powerful means in modern rock mechanics studies to reveal complex failure mechanisms and to evaluate engineering risks. However, continuous multi-scale modeling of rock, from deformation and damage to failure, places high demands on the design, implementation scheme and computational capacity of the numerical software system. This study is aimed at developing a parallel finite element procedure: a parallel rock failure process analysis (RFPA) simulator capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties and to represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of the representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows-Linux interactive platform. A numerical model is built to test the parallel performance of the FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and on field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computational efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In the laboratory-scale simulation, well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In the field-scale simulation, the formation process of net fracture spacing, from initiation and propagation to saturation, can be revealed completely. In the engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering scale.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmalz, Mark S
2011-07-24
Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation G′ for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping G → G′, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphics Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.
Validation of the enthalpy method by means of analytical solution
NASA Astrophysics Data System (ADS)
Kleiner, Thomas; Rückamp, Martin; Bondzio, Johannes; Humbert, Angelika
2014-05-01
Numerical simulations have moved in recent years from describing the cold-temperate transition surface (CTS) explicitly towards an enthalpy description, which avoids incorporating a singular surface inside the model (Aschwanden et al., 2012). In enthalpy methods the CTS is represented as a level set of the enthalpy state variable. This method has several numerical and practical advantages (e.g. representation of the full energy by one scalar field, no restriction on the topology and shape of the CTS). The proposed method is rather new in glaciology and, to our knowledge, not verified and validated against analytical solutions. Unfortunately we are still lacking analytical solutions for sufficiently complex thermo-mechanically coupled polythermal ice flow. However, we present two experiments to test the implementation of the enthalpy equation and the corresponding boundary conditions. The first experiment tests in particular the functionality of the boundary condition scheme and the corresponding basal melt rate calculation. Depending on the thermal situation at the base, the numerical code may have to switch to another boundary type (from Neumann to Dirichlet or vice versa). The main idea of this set-up is to test reversibility during transients: a formerly cold ice body that runs through a warmer period, with an associated build-up of a liquid water layer at the base, must be able to return to its initial steady state. Since we impose several assumptions on the experiment design, analytical solutions can be formulated for different quantities during distinct stages of the simulation. The second experiment tests the positioning of the internal CTS in a parallel-sided polythermal slab. We compare our simulation results to the analytical solution proposed by Greve and Blatter (2009). Results from three different ice flow models (COMIce, ISSM, TIMFD3) are presented.
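The enthalpy variable in such schemes relates to temperature and water content roughly as follows (a common convention in the literature, e.g. Aschwanden et al., 2012; individual codes differ in details):

$$ E = \begin{cases} c_i\,(T - T_{\mathrm{ref}}), & E < E_{\mathrm{pmp}} \quad (\text{cold ice}),\\ E_{\mathrm{pmp}} + L\,\omega, & E \ge E_{\mathrm{pmp}} \quad (\text{temperate ice}), \end{cases} $$

where c_i is the heat capacity of ice, E_pmp the enthalpy at the pressure-melting point, L the latent heat of fusion and ω the liquid water fraction; the CTS is then simply the level set E = E_pmp.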
Program For Parallel Discrete-Event Simulation
NASA Technical Reports Server (NTRS)
Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.
1991-01-01
User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
Human impact on wildfires varies between regions and with vegetation productivity
NASA Astrophysics Data System (ADS)
Lasslop, Gitta; Kloster, Silvia
2017-11-01
We assess the influence of humans on burned area simulated with a dynamic global vegetation model. The human impact in the model is based on population density and cropland fraction, which were identified as important drivers of burned area in analyses of global datasets and are commonly used in global models. After an evaluation of the sensitivity to these two variables, we extend the model by including an additional effect of the cropland fraction on fire duration. The general pattern of human influence is similar in both model versions: the strongest human impact is found in regions with intermediate productivity, where fire occurrence is not limited by fuel load or climatic conditions. Human effects in the model increase burned area in the tropics, while in temperate regions burned area is reduced. While population density is similar on average in the tropical and temperate regions, the cropland fraction is higher in temperate regions and leads to a strong suppression of fire. The model shows a low human impact in the boreal region, where both population density and cropland fraction are very low and fire is limited by climatic conditions as well as by vegetation productivity. Previous studies attributed a decrease in fire activity found in global charcoal datasets to human activity. This is confirmed by our simulations, which show a decrease in burned area only when the human influence on fire is accounted for, not with natural effects on fires alone. We assess how the vegetation-fire feedback influences the results by comparing simulations with dynamic vegetation biogeography to simulations with prescribed vegetation. The vegetation-fire feedback increases the human impact on burned area by 10% for present-day conditions. These results emphasize that projections of burned area need to account for the interactions between fire, climate, vegetation and humans.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thornquist, Heidi K.; Fixel, Deborah A.; Fett, David Brian
The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE and transient modes using standard analog (DAE) and/or device (PDE) device models, including several age- and radiation-aware devices. It supports a variety of computing platforms, both serial and parallel. Lastly, it uses a variety of modern solution algorithms, including dynamic parallel load-balancing and iterative solvers.
Tutorial: Parallel Computing of Simulation Models for Risk Analysis.
Reilly, Allison C; Staid, Andrea; Gao, Michael; Guikema, Seth D
2016-10-01
Simulation models are widely used in risk analysis to study the effects of uncertainties on outcomes of interest in complex problems. Often, these models are computationally complex and time consuming to run. This latter point may be at odds with time-sensitive evaluations or may limit the number of parameters that are considered. In this article, we give an introductory tutorial focused on parallelizing simulation code to better leverage modern computing hardware, enabling risk analysts to better utilize simulation-based methods for quantifying uncertainty in practice. This article is aimed primarily at risk analysts who use simulation methods but do not yet utilize parallelization to decrease the computational burden of these models. The discussion is focused on conceptual aspects of embarrassingly parallel computer code and software considerations. Two complementary examples are shown using the languages MATLAB and R. A brief discussion of hardware considerations is located in the Appendix. © 2016 Society for Risk Analysis.
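As a concrete illustration of the embarrassingly parallel pattern the tutorial describes, here is a minimal Python sketch; the article's own examples are in MATLAB and R and are not reproduced here, and the toy demand/capacity failure model, distributions, and replication count are invented for illustration.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def one_replication(seed):
    rng = np.random.default_rng(seed)            # independent stream per replication
    demand = rng.lognormal(mean=1.0, sigma=0.5)  # uncertain input (assumed)
    capacity = rng.normal(loc=4.0, scale=1.0)    # uncertain input (assumed)
    return float(demand > capacity)              # failure indicator

if __name__ == "__main__":
    seeds = range(100_000)                       # one independent run per seed
    with ProcessPoolExecutor() as ex:            # replications run in parallel
        failures = list(ex.map(one_replication, seeds, chunksize=1000))
    print("estimated failure probability:", np.mean(failures))
```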
Real-time electron dynamics for massively parallel excited-state simulations
NASA Astrophysics Data System (ADS)
Andrade, Xavier
The simulation of the real-time dynamics of electrons, based on time dependent density functional theory (TDDFT), is a powerful approach to study electronic excited states in molecular and crystalline systems. What makes the method attractive is its flexibility to simulate different kinds of phenomena beyond the linear-response regime, including strongly-perturbed electronic systems and non-adiabatic electron-ion dynamics. Electron-dynamics simulations are also attractive from a computational point of view. They can run efficiently on massively parallel architectures due to the low communication requirements. Our implementations of electron dynamics, based on the codes Octopus (real-space) and Qball (plane-waves), allow us to simulate systems composed of thousands of atoms and to obtain good parallel scaling up to 1.6 million processor cores. Due to the versatility of real-time electron dynamics and its parallel performance, we expect it to become the method of choice to apply the capabilities of exascale supercomputers for the simulation of electronic excited states.
Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S; Seal, Sudip K
2010-01-01
The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the µsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised.
NASA Astrophysics Data System (ADS)
Shen, Lin; Xie, Liangxu; Yang, Mingjun
2017-04-01
Conformational sampling on rugged energy landscapes is a perennial challenge in computer simulations. The recently developed integrated tempering sampling method, together with its selective variant (SITS), has emerged as a powerful tool for exploring the free energy landscapes and functional motions of various systems. The estimation of weighting factors constitutes a critical step in these methods and requires accurate calculation of partition function ratios between different thermodynamic states. In this work, we propose a new adaptive update algorithm to compute the weighting factors based on the weighted histogram analysis method (WHAM). The adaptive-WHAM algorithm with SITS is then applied to study the thermodynamic properties of several representative peptide systems solvated in an explicit water box. The performance of the new algorithm is validated in simulations of these solvated peptide systems. We anticipate more applications of this coupled optimisation and production algorithm to other complicated systems such as biochemical reactions in solution.
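For context, the sketch below shows the standard self-consistent WHAM iteration for the free-energy offsets (weighting factors) from energy histograms. It is a generic textbook illustration, not the authors' adaptive on-the-fly update, and the synthetic histograms and temperature ladder are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
betas = np.array([1.0, 0.8, 0.6, 0.4])         # inverse temperatures (assumed)
E = np.linspace(-5, 5, 200)                    # energy-bin centers
# Synthetic, overlapping energy histograms standing in for simulation output.
n_kE = np.stack([np.histogram(rng.normal(-b, 1.0, 5000),
                              bins=200, range=(-5, 5))[0] for b in betas])
N_k = n_kE.sum(axis=1)

f = np.zeros(len(betas))                       # f_k = -ln Z_k (up to a constant)
for _ in range(500):
    # Density of states combining all histograms with their current weights.
    denom = np.sum(N_k[:, None] * np.exp(f[:, None] - betas[:, None] * E[None, :]), axis=0)
    omega = n_kE.sum(axis=0) / denom
    f_new = -np.log(np.sum(omega[None, :] * np.exp(-betas[:, None] * E[None, :]), axis=1))
    f_new -= f_new[0]                          # fix the arbitrary constant
    if np.max(np.abs(f_new - f)) < 1e-10:
        break
    f = f_new

print("partition-function ratios Z_k/Z_0:", np.exp(-f))
```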
NASA Astrophysics Data System (ADS)
Ewers, B. E.; Bretfeld, M.; Millar, D.; Hall, J. S.; Beverly, D.; Hall, J. S.; Ogden, F. L.; Mackay, D. S.
2016-12-01
Process-based models of tree impacts on the hydrologic cycle must include not only plant hydraulic limitations but also photosynthetic controls, because plants lose water to gain carbon. The Terrestrial Regional Ecosystem Exchange Simulator (TREES) is one such model. TREES includes a Bayesian model-data fusion approach that provides rigorous tests of patterns in tree transpiration data against biophysical processes in the model. TREES has been extensively tested against many temperate tree data sets, including those experiencing severe and lethal drought. We test TREES against data from sap flow-scaled transpiration in 76 tropical trees (representing 42 different species) in secondary forests of three different ages (8, 25, and 80+ years) located in the Panama Canal Watershed. These data were collected during the third driest El Niño-Southern Oscillation (ENSO) event on record in Panama during 2015/2016. Tree transpiration response to vapor pressure deficit and solar radiation was the same in the two older forests, but showed an additional response to limited soil moisture in the youngest forest. Volumetric water content at 30 and 50 cm depths was 8% lower in the 8 year old forest than in the 80+ year old forest. TREES could not simulate this difference in soil moisture without increasing simulated root area. TREES simulations were improved by including light response curves of leaf photosynthesis, root vulnerability to cavitation, and canopy position impacts on light. TREES was able to simulate the anisohydric (loose stomatal regulation of leaf water potential) and isohydric (tight stomatal regulation) behavior of the 73 tree species a priori, indicating that species-level information is not required. Analyses of posterior probability distributions indicate that TREES model predictions of individual tree transpiration would likely be improved with more detailed root and soil moisture data in all forest ages, with the most improvement likely in the 8 year old forest. Our results suggest that a biophysical tree transpiration model developed in temperate forests can be applied to the tropics and could be used to improve predictions of evapotranspiration from changing land cover in tropical hydrology models.
Yang, Y Isaac; Zhang, Jun; Che, Xing; Yang, Lijiang; Gao, Yi Qin
2016-03-07
In order to efficiently overcome high free energy barriers embedded in a complex energy landscape and calculate overall thermodynamic properties using molecular dynamics simulations, we developed and implemented a sampling strategy that combines metadynamics with the (selective) integrated tempering sampling (ITS/SITS) method. The dominant local minima on the potential energy surface (PES) are partially exalted by accumulating history-dependent potentials as in metadynamics, and the sampling over the entire PES is further enhanced by ITS/SITS. With this hybrid method, the simulated system can be rapidly driven across the dominant barrier along selected collective coordinates. Then, ITS/SITS ensures a fast convergence of the sampling over the entire PES and an efficient calculation of the overall thermodynamic properties of the simulation system. To test the accuracy and efficiency of this method, we first benchmarked it by calculating the ϕ-ψ distribution of alanine dipeptide in explicit solvent. We further applied it to examine the design of template molecules for aromatic meta-C-H activation in solutions and to investigate solution conformations of the nonapeptide Bradykinin, which involve slow cis-trans isomerizations of three proline residues.
Modelling and simulation of parallel triangular triple quantum dots (TTQD) by using SIMON 2.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fathany, Maulana Yusuf, E-mail: myfathany@gmail.com; Fuada, Syifaul, E-mail: fsyifaul@gmail.com; Lawu, Braham Lawas, E-mail: bram-labs@rocketmail.com
2016-04-19
This research presents a modeling analysis of parallel Triple Quantum Dots (TQD) using SIMON (SIMulation Of Nano-structures). The Single Electron Transistor (SET) is used as the basic concept of the modeling. We design the structure of the parallel TQD in metal with a triangular geometry, referred to as Triangular Triple Quantum Dots (TTQD). We simulate it under several scenarios with different parameters, such as different capacitance values, various gate voltages, and different thermal conditions.
Parallel-Processing Test Bed For Simulation Software
NASA Technical Reports Server (NTRS)
Blech, Richard; Cole, Gary; Townsend, Scott
1996-01-01
Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-the-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).
A parallel finite element simulator for ion transport through three-dimensional ion channel systems.
Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo
2013-09-15
A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large-scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I-V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can be included in PNP model now, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL. It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. Copyright © 2013 Wiley Periodicals, Inc.
Transport Mechanism of Guest Methane in Water-Filled Nanopores
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bui, Tai; Phan, Anh; Cole, David R.
2017-05-11
We computed the transport of methane through 1 nm wide slit-shaped pores carved out of selected solid substrates using classical molecular dynamics simulations. The transport mechanism was elucidated via the implementation of the well-tempered metadynamics algorithm, which allowed for the quantification and visualization of the free energy landscape sampled by the guest molecule. Models for silica, magnesium oxide, alumina, muscovite, and calcite were used as solid substrates. Slit-shaped pores of width 1 nm were carved out of these materials and filled with liquid water. Methane was then inserted at low concentration. The results show that the diffusion of methane through the hydrated pores is strongly dependent on the solid substrate. While methane molecules diffuse isotropically along the directions parallel to the pore surfaces in most of the pores considered, anisotropic diffusion was observed in the hydrated calcite pore. The differences observed in the various pores are due to local molecular properties of confined water, including molecular structure and solvation free energy. The transport mechanism and the diffusion coefficients are dependent on the free energy barriers encountered by one methane molecule as it migrates from one preferential adsorption site to a neighboring one. It was found that the heterogeneous water distribution in different hydration layers and the low free energy pathways in the plane parallel to the pore surfaces yield the anisotropic diffusion of methane molecules in the hydrated calcite pore. Our observations contribute to an ongoing debate on the relation between local free energy profiles and diffusion coefficients and could have important practical consequences in various applications, ranging from the design of selective membranes for gas separations to the sustainable deployment of shale gas.
NASA Astrophysics Data System (ADS)
Shoemaker, C. A.; Pang, M.; Akhtar, T.; Bindel, D.
2016-12-01
New parallel surrogate global optimization algorithms are developed and applied to objective functions that are expensive simulations (possibly with multiple local minima). The algorithms can be applied to most geophysical simulations, including those with nonlinear partial differential equations. The optimization does not require that the simulations themselves be parallelized. Asynchronous (and synchronous) parallel execution is available in the optimization toolbox "pySOT". The parallel algorithms are modified from their serial versions to eliminate fine-grained parallelism. The optimization is computed with the open-source software pySOT, a Surrogate Global Optimization Toolbox that allows the user to pick the type of surrogate (or ensembles), the search procedure on the surrogate, and the type of parallelism (synchronous or asynchronous). pySOT also allows the user to develop new algorithms by modifying parts of the code. In the applications here, the objective function takes up to 30 minutes for one simulation, and serial optimization can take over 200 hours. Results from the Yellowstone (NSF) and NCSS (Singapore) supercomputers are given for groundwater contaminant hydrology simulations, with applications to model parameter estimation and decontamination management. All results are compared with alternatives. The first results are for optimization of pumping at many wells to reduce the cost of decontamination of groundwater at a superfund site. The optimization runs with up to 128 processors. Superlinear speed-up is obtained for up to 16 processors, and efficiency with 64 processors is over 80%. Each evaluation of the objective function requires the solution of nonlinear partial differential equations to describe the impact of spatially distributed pumping and model parameters on model predictions for the spatial and temporal distribution of groundwater contaminants. The second application uses asynchronous parallel global optimization for groundwater quality model calibration. The time for a single objective function evaluation varies unpredictably, so efficiency is improved with asynchronous parallel calculations to improve load balancing. The third application (done at NCSS) incorporates new global surrogate multi-objective parallel search algorithms into pySOT and applies them to a large watershed calibration problem.
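The sketch below outlines a generic synchronous surrogate global optimization loop of the kind pySOT implements; it deliberately does not use the actual pySOT API, the candidate-search rule is simplified, and the cheap analytic objective stands in for an expensive groundwater simulation.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def expensive_sim(x):
    # Cheap stand-in for a simulation that would take ~30 minutes per run.
    return float(np.sum(x**2) + 3.0 * np.sin(4.0 * x).sum())

rng = np.random.default_rng(0)
dim, n_init, n_batch = 2, 10, 4
X = rng.uniform(-3, 3, (n_init, dim))             # initial space-filling design
y = np.array([expensive_sim(x) for x in X])       # evaluations (parallelizable)

for it in range(20):
    surrogate = RBFInterpolator(X, y, smoothing=1e-6)   # fit surrogate to all data
    cand = rng.uniform(-3, 3, (500, dim))               # random candidate points
    # Select a batch with the best predicted values; in pySOT such a batch is
    # dispatched to workers synchronously or asynchronously.
    batch = cand[np.argsort(surrogate(cand))[:n_batch]]
    X = np.vstack([X, batch])
    y = np.concatenate([y, [expensive_sim(x) for x in batch]])

print("best found:", X[np.argmin(y)], y.min())
```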
2012-10-01
[Fragmented report snippet] … using the open-source code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) (http://lammps.sandia.gov) (23). The commercial … parameters are proprietary and cannot be ported to the LAMMPS simulation code. In our molecular dynamics simulations at the atomistic resolution, we … Abbreviations: IBI, iterative Boltzmann inversion; LAMMPS, Large-scale Atomic/Molecular Massively Parallel Simulator; MAPS, Materials Processes and Simulations; MS …
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-10-20
This look-ahead dynamic simulation software system incorporates high-performance parallel computing technologies, significantly reduces the solution time for each transient simulation case, and brings dynamic simulation analysis into on-line applications to enable more transparency for better reliability and asset utilization. It takes a snapshot of the current power grid status, performs the system dynamic simulation using parallel computing, and outputs the transient response of the power system in real time.
NASA Astrophysics Data System (ADS)
Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.
2011-10-01
A major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models are now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
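A toy sketch of the partitioning idea, treating the channel network as a directed graph, computing sub-basin workloads, and assigning them greedily to processors, is given below. The topology, the uniform per-node workload, and the greedy rule are illustrative assumptions, not the actual tRIBS partitioner.

```python
from collections import defaultdict

# Toy channel network: each node drains to its downstream parent ("OUT" = outlet).
downstream = {"A": "C", "B": "C", "C": "E", "D": "E", "E": "OUT", "F": "OUT"}
work = {n: 1.0 for n in downstream}       # per-node hydrologic workload (assumed)

children = defaultdict(list)
for node, parent in downstream.items():
    children[parent].append(node)

def subtree_work(node):
    """Total workload drained through `node`, i.e. the cost of its sub-basin."""
    return work.get(node, 0.0) + sum(subtree_work(c) for c in children[node])

# Greedy balance: assign each top-level sub-basin to the least-loaded processor.
nprocs = 2
loads = [0.0] * nprocs
assignment = {}
for basin in sorted(children["OUT"], key=subtree_work, reverse=True):
    p = loads.index(min(loads))
    assignment[basin] = p
    loads[p] += subtree_work(basin)

print(assignment, loads)
```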
NASA Astrophysics Data System (ADS)
Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.
2013-08-01
Parallel-in-time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator f (e.g., the Verlet algorithm) is available to propagate the system from time t_i (trajectory positions and velocities x_i = (r_i, v_i)) to time t_{i+1} (x_{i+1}) by x_{i+1} = f_i(x_i), the dynamics problem spanning the interval t_0...t_M can be transformed into a root-finding problem, F(X) = [x_i - f_{i-1}(x_{i-1})]_{i=1,...,M} = 0, for the trajectory variables. The root-finding problem is solved using a variety of root-finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel-in-time methods is discussed, and the effectiveness of various approaches to solving the root-finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsened time steps provide preconditioners for the root-finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel-in-time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present-day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution time/parallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel-in-time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
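The sketch below implements a Parareal-style iteration, one of the related parallel-in-time schemes discussed in the abstract; a coarse propagator plays the role of the preconditioner. The harmonic-oscillator propagators, step sizes, and iteration counts are illustrative assumptions.

```python
import numpy as np

def step(x, dt, nsub):
    """Semi-implicit Euler for the harmonic oscillator r' = v, v' = -r."""
    r, v = x
    for _ in range(nsub):
        v = v - dt * r
        r = r + dt * v
    return np.array([r, v])

def fine(x):   return step(x, 1e-3, 100)   # accurate propagator over one slice
def coarse(x): return step(x, 1e-1, 1)     # cheap propagator (the preconditioner)

M = 20                                     # time slices (one processor each)
X = [np.array([1.0, 0.0])]                 # initial guess from a coarse sweep
for i in range(M):
    X.append(coarse(X[i]))

for k in range(5):                         # Parareal iterations
    F = [fine(X[i]) for i in range(M)]     # embarrassingly parallel in i
    Xn = [X[0]]
    for i in range(M):                     # serial coarse correction sweep
        Xn.append(coarse(Xn[i]) + F[i] - coarse(X[i]))
    X = Xn

print("end state after", M, "slices:", X[-1])
```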
NASA Astrophysics Data System (ADS)
Thomas, R. Q.; Williams, M.
2014-09-01
Carbon (C) and nitrogen (N) cycles are coupled in terrestrial ecosystems through multiple processes including photosynthesis, tissue allocation, respiration, N fixation, N uptake, and decomposition of litter and soil organic matter. Capturing the constraint of N on terrestrial C uptake and storage has been a focus of the Earth System Modeling community. However, there is little understanding of the trade-offs and sensitivities of allocating C and N to different tissues in order to optimize the productivity of plants. Here we describe a new, simple model of ecosystem C-N cycling and interactions (ACONITE) that builds on theory related to plant economics in order to predict key ecosystem properties (leaf area index, leaf C : N, N fixation, and plant C use efficiency) based on the outcome of assessments of the marginal change in net C or N uptake associated with a change in allocation of C or N to plant tissues. We simulated and evaluated steady-state ecosystem stocks and fluxes in three different forest ecosystem types (tropical evergreen, temperate deciduous, and temperate evergreen). Leaf C : N differed among the three ecosystem types (temperate deciduous < tropical evergreen < temperate evergreen), a result that compared well to observations from a global database describing plant traits. Gross primary productivity (GPP) and net primary productivity (NPP) estimates compared well to observed fluxes at the simulation sites. Simulated N fixation at steady-state, calculated based on relative demand for N and the marginal return on C investment to acquire N, was an order of magnitude higher in the tropical forest than in the temperate forest, consistent with observations. A sensitivity analysis revealed that parameterization of the relationship between leaf N and leaf respiration had the largest influence on leaf area index and leaf C : N. A parameter governing how photosynthesis scales with day length had the largest influence on total vegetation C, GPP, and NPP. Multiple parameters associated with photosynthesis, respiration, and N uptake influenced the rate of N fixation. Overall, our ability to constrain leaf area index and allow spatially and temporally variable leaf C : N can help address challenges simulating these properties in ecosystem and Earth System models. Furthermore, the simple approach with emergent properties based on coupled C-N dynamics has potential for use in research that uses data-assimilation methods to integrate data on both the C and N cycles to improve C flux forecasts.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cowan, Nicolas B.; Voigt, Aiko; Abbot, Dorian S., E-mail: n-cowan@nortwestern.edu
In order to understand the climate on terrestrial planets orbiting nearby Sun-like stars, one would like to know their thermal inertia. We use a global climate model to simulate the thermal phase variations of Earth analogs and test whether these data could distinguish between planets with different heat storage and heat transport characteristics. In particular, we consider a temperate climate with polar ice caps (like the modern Earth) and a snowball state where the oceans are globally covered in ice. We first quantitatively study the periodic radiative forcing from, and climatic response to, rotation, obliquity, and eccentricity. Orbital eccentricity and seasonal changes in albedo cause variations in the global-mean absorbed flux. The responses of the two climates to these global seasons indicate that the temperate planet has 3× the bulk heat capacity of the snowball planet due to the presence of liquid water oceans. The obliquity seasons in the temperate simulation are weaker than one would expect based on thermal inertia alone; this is due to cross-equatorial oceanic and atmospheric energy transport. Thermal inertia and cross-equatorial heat transport have qualitatively different effects on obliquity seasons, insofar as heat transport tends to reduce seasonal amplitude without inducing a phase lag. For an Earth-like planet, however, this effect is masked by the mixing of signals from low thermal inertia regions (sea ice and land) with that from high thermal inertia regions (oceans), which also produces a damped response with small phase lag. We then simulate thermal light curves as they would appear to a high-contrast imaging mission (TPF-I/Darwin). In order of importance to the present simulations, which use modern-Earth orbital parameters, the three drivers of thermal phase variations are (1) obliquity seasons, (2) diurnal cycle, and (3) global seasons. Obliquity seasons are the dominant source of phase variations for most viewing angles. A pole-on observer would measure peak-to-trough amplitudes of 13% and 47% for the temperate and snowball climates, respectively. Diurnal heating is important for equatorial observers (≈5% phase variations), because the obliquity effects cancel to first order from that vantage. Finally, we compare the prospects of optical versus thermal direct imaging missions for constraining the climate on exoplanets and conclude that while zero- and one-dimensional models are best served by thermal measurements, second-order models accounting for seasons and planetary thermal inertia would require both optical and thermal observations.
Xyce Parallel Electronic Simulator : users' guide, version 2.0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont
2004-06-01
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) the capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; (2) improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices; (4) a client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI); and (5) object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase: a message-passing parallel implementation that allows it to run efficiently on the widest possible range of computing platforms, including serial, shared-memory, and distributed-memory parallel machines as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. To this end, the device package in the Xyce Parallel Electronic Simulator is designed to support a variety of device model inputs. These input formats include standard analytical models, behavioral models, look-up tables, and mesh-level PDE device models. Combined with this flexible interface is an architectural design that greatly simplifies the addition of circuit models. One of the most important features of Xyce is in providing a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia now has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods) research and development can be performed. Ultimately, these capabilities are migrated to end users.
NASA Technical Reports Server (NTRS)
Sohn, Andrew; Biswas, Rupak
1996-01-01
Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations
NASA Astrophysics Data System (ADS)
Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.
2016-07-01
Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
Parallel simulation of tsunami inundation on a large-scale supercomputer
NASA Astrophysics Data System (ADS)
Oishi, Y.; Imamura, F.; Sugawara, D.
2013-12-01
An accurate prediction of tsunami inundation is important for disaster mitigation purposes. One approach is to approximate the tsunami wave source through an instant inversion analysis using real-time observation data (e.g., Tsushima et al., 2009) and then use the resulting wave source data in an instant tsunami inundation simulation. However, a bottleneck of this approach is the large computational cost of the non-linear inundation simulation and the computational power of recent massively parallel supercomputers is helpful to enable faster than real-time execution of a tsunami inundation simulation. Parallel computers have become approximately 1000 times faster in 10 years (www.top500.org), and so it is expected that very fast parallel computers will be more and more prevalent in the near future. Therefore, it is important to investigate how to efficiently conduct a tsunami simulation on parallel computers. In this study, we are targeting very fast tsunami inundation simulations on the K computer, currently the fastest Japanese supercomputer, which has a theoretical peak performance of 11.2 PFLOPS. One computing node of the K computer consists of 1 CPU with 8 cores that share memory, and the nodes are connected through a high-performance torus-mesh network. The K computer is designed for distributed-memory parallel computation, so we have developed a parallel tsunami model. Our model is based on TUNAMI-N2 model of Tohoku University, which is based on a leap-frog finite difference method. A grid nesting scheme is employed to apply high-resolution grids only at the coastal regions. To balance the computation load of each CPU in the parallelization, CPUs are first allocated to each nested layer in proportion to the number of grid points of the nested layer. Using CPUs allocated to each layer, 1-D domain decomposition is performed on each layer. In the parallel computation, three types of communication are necessary: (1) communication to adjacent neighbours for the finite difference calculation, (2) communication between adjacent layers for the calculations to connect each layer, and (3) global communication to obtain the time step which satisfies the CFL condition in the whole domain. A preliminary test on the K computer showed the parallel efficiency on 1024 cores was 57% relative to 64 cores. We estimate that the parallel efficiency will be considerably improved by applying a 2-D domain decomposition instead of the present 1-D domain decomposition in future work. The present parallel tsunami model was applied to the 2011 Great Tohoku tsunami. The coarsest resolution layer covers a 758 km × 1155 km region with a 405 m grid spacing. A nesting of five layers was used with the resolution ratio of 1/3 between nested layers. The finest resolution region has 5 m resolution and covers most of the coastal region of Sendai city. To complete 2 hours of simulation time, the serial (non-parallel) computation took approximately 4 days on a workstation. To complete the same simulation on 1024 cores of the K computer, it took 45 minutes which is more than two times faster than real-time. This presentation discusses the updated parallel computational performance and the efficient use of the K computer when considering the characteristics of the tsunami inundation simulation model in relation to the characteristics and capabilities of the K computer.
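A minimal mpi4py sketch of two of the three communication types described above, the ghost-cell exchange between adjacent neighbours in a 1-D domain decomposition and the global reduction for the CFL time step, is shown below; the grid size, field values, and per-rank time-step limits are placeholders.

```python
# Run with: mpiexec -n 4 python halo_1d.py  (file name is arbitrary)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

nx_local = 100                          # interior points per rank (placeholder)
h = np.zeros(nx_local + 2)              # wave height with one ghost cell per side
h[1:-1] = rank                          # placeholder initial condition

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Communication type (1): ghost-cell exchange with adjacent neighbours.
comm.Sendrecv(sendbuf=h[1:2], dest=left, recvbuf=h[-1:], source=right)
comm.Sendrecv(sendbuf=h[-2:-1], dest=right, recvbuf=h[0:1], source=left)

# Communication type (3): global reduction for the CFL-limited time step.
dt_local = 0.05 + 0.01 * rank           # stand-in for an h-dependent CFL limit
dt = comm.allreduce(dt_local, op=MPI.MIN)
if rank == 0:
    print("global time step:", dt)
```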
Methodology of modeling and measuring computer architectures for plasma simulations
NASA Technical Reports Server (NTRS)
Wang, L. P. T.
1977-01-01
A brief introduction to plasma simulation using computers and the difficulties on currently available computers is given. Through the use of an analyzing and measuring methodology - SARA, the control flow and data flow of a particle simulation model REM2-1/2D are exemplified. After recursive refinements the total execution time may be greatly shortened and a fully parallel data flow can be obtained. From this data flow, a matched computer architecture or organization could be configured to achieve the computation bound of an application problem. A sequential type simulation model, an array/pipeline type simulation model, and a fully parallel simulation model of a code REM2-1/2D are proposed and analyzed. This methodology can be applied to other application problems which have implicitly parallel nature.
Massively parallel quantum computer simulator
NASA Astrophysics Data System (ADS)
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
PENTACLE: Parallelized particle-particle particle-tree code for planet formation
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori
2017-10-01
We have newly developed a parallelized particle-particle particle-tree code for planet formation, PENTACLE, which is a parallelized hybrid N-body integrator executed on a CPU-based (super)computer. PENTACLE uses a fourth-order Hermite algorithm to calculate gravitational interactions between particles within a cut-off radius and a Barnes-Hut tree method for gravity from particles beyond. It also implements an open-source library designed for full automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize a Barnes-Hut tree algorithm for a memory-distributed supercomputer. These allow us to handle 1-10 million particles in a high-resolution N-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of the scaled cut-off radius R̃_cut and the time step Δt. It turns out that the accuracy of a hybrid N-body simulation is controlled through Δt/R̃_cut, and Δt/R̃_cut ~ 0.1 is necessary to simulate accurately the accretion process of a planet for ≥ 10^6 yr. For all those interested in large-scale particle simulations, PENTACLE, customized for planet formation, will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT licence.
Numerical characteristics of quantum computer simulation
NASA Astrophysics Data System (ADS)
Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.
2016-12-01
The simulation of quantum circuits is significantly important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, which makes the use of modern high-performance parallel computation essential. As is well known, an arbitrary quantum computation in the circuit model can be composed of only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. The unique properties of quantum systems translate into the computational properties of the considered algorithms: quantum parallelism makes the simulation of quantum gates highly parallel, while quantum entanglement leads to a problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. The experimental part was carried out on the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good basis for the research and testing of development methods for data-intensive parallel software, and the considered methodology of analysis can be successfully used for the improvement of algorithms in quantum information science.
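As a minimal illustration of the structure discussed above, the sketch below applies a single-qubit gate to an n-qubit state vector with NumPy: the gate acts independently on 2^(n-1) amplitude pairs, which is the quantum parallelism that makes the simulation highly parallel, while the memory stride set by the target qubit is what drives the locality problem. The qubit count and gate are illustrative choices.

```python
import numpy as np

n = 20                                   # qubits (illustrative); 2**n amplitudes
psi = np.zeros(2**n, dtype=complex)
psi[0] = 1.0                             # start in |00...0>

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard gate

def apply_1q(psi, gate, t, n):
    """Apply a single-qubit gate to target qubit t of an n-qubit state vector."""
    # Axis 1 of the reshape enumerates bit t: the gate is a tiny matrix product
    # applied independently to 2**(n-1) amplitude pairs (the parallelism), with
    # a memory stride of 2**t between pair members (the locality issue).
    psi = psi.reshape(2**(n - t - 1), 2, 2**t)
    return np.einsum('ab,ibj->iaj', gate, psi).reshape(-1)

psi = apply_1q(psi, H, 0, n)             # contiguous pairs, cache-friendly
psi = apply_1q(psi, H, n - 1, n)         # stride 2**(n-1), cache-hostile
print(psi[:4])
```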
Comparison of stochastic optimization methods for all-atom folding of the Trp-Cage protein.
Schug, Alexander; Herges, Thomas; Verma, Abhinav; Lee, Kyu Hwan; Wenzel, Wolfgang
2005-12-09
The performances of three different stochastic optimization methods for all-atom protein structure prediction are investigated and compared. We use the recently developed all-atom free-energy force field (PFF01), which was demonstrated to correctly predict the native conformation of several proteins as the global optimum of the free energy surface. The trp-cage protein (PDB-code 1L2Y) is folded with the stochastic tunneling method, a modified parallel tempering method, and the basin-hopping technique. All the methods correctly identify the native conformation, and their relative efficiency is discussed.
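Of the three methods compared, basin hopping has a readily available implementation in SciPy; the sketch below applies it to a toy rugged surface as a stand-in for the all-atom free-energy landscape and is in no way the authors' PFF01 setup.

```python
import numpy as np
from scipy.optimize import basinhopping

def rugged(x):
    # Toy surface with many local minima superposed on a global bowl.
    return float(np.sum(x**2) + 2.0 * np.sin(5.0 * x).sum())

result = basinhopping(rugged, x0=np.array([3.0, -2.0]), niter=200, stepsize=1.0)
print("best minimum found:", result.x, result.fun)
```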
Visualization and Tracking of Parallel CFD Simulations
NASA Technical Reports Server (NTRS)
Vaziri, Arsi; Kremenetsky, Mark
1995-01-01
We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task between the CM-5 and the workstation can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate -> store -> visualize' post-processing approach.
Design of object-oriented distributed simulation classes
NASA Technical Reports Server (NTRS)
Schoeffler, James D. (Principal Investigator)
1995-01-01
Distributed simulation of aircraft engines as part of a computer aided design package is being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for 'Numerical Propulsion Simulation System'. NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT 'Actor' model of a concurrent object and uses 'connectors' to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.
Partitioning and packing mathematical simulation models for calculation on parallel computers
NASA Technical Reports Server (NTRS)
Arpasi, D. J.; Milner, E. J.
1986-01-01
The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.
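A minimal sketch of the packing step, greedily assigning equation-group costs to the least-loaded processor (a longest-processing-time heuristic), is shown below; the module names and costs are invented, and the heuristic is an illustrative stand-in for the report's packing algorithm.

```python
import heapq

# Hypothetical per-module computation costs for groups of coupled equations.
costs = {"rotor": 9.0, "combustor": 7.0, "inlet": 4.0, "nozzle": 4.0, "bleed": 2.0}

def pack(costs, nprocs):
    """Assign each equation group to the currently least-loaded processor."""
    heap = [(0.0, p, []) for p in range(nprocs)]
    heapq.heapify(heap)
    for name, c in sorted(costs.items(), key=lambda kv: -kv[1]):  # largest first
        load, p, items = heapq.heappop(heap)
        heapq.heappush(heap, (load + c, p, items + [name]))
    return sorted(heap, key=lambda e: e[1])

for load, p, items in pack(costs, 2):
    print(f"processor {p}: load={load} modules={items}")
```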
Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors
NASA Astrophysics Data System (ADS)
Yi, Hongsuk
2014-03-01
Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with Tersoff potentials for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism, combining tightly coupled thread-level and task-level parallelism with the 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementations on Xeon Phi is clearly superior to that of the x86 CPU architecture.
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zuo, Wangda; McNeil, Andrew; Wetter, Michael
2011-09-06
We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.
Slażyński, Leszek; Bohte, Sander
2012-01-01
The arrival of graphics processing unit (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state for each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures to the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate in better than real time plausible spiking neural networks of up to 50,000 neurons, processing over 35 million spiking events per second.
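The additive update that enables this parallelism can be illustrated with a vectorized NumPy sketch: each time step advances all membrane potentials and synaptic filters with a few array operations, exactly the data-parallel pattern that maps well to GPUs. The network size, constants, and noise drive are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, dt = 1000, 1e-3                      # neurons, time step [s] (illustrative)
tau_m, tau_s, thresh = 20e-3, 5e-3, 1.0
W = rng.normal(0.0, 1.5, (N, N)) / np.sqrt(N)   # random recurrent weights

syn = np.zeros(N)                       # filtered presynaptic spike trains
u = np.zeros(N)                         # membrane potentials
d_m, d_s = np.exp(-dt / tau_m), np.exp(-dt / tau_s)

total_spikes = 0.0
for step in range(1000):                # 1 s of simulated time
    spikes = (u > thresh).astype(float)
    u = np.where(spikes > 0, 0.0, u)    # reset neurons that just fired
    syn = d_s * syn + spikes            # additive synaptic filter update
    # Additive membrane update: decay + recurrent input + noisy drive,
    # computed for the whole population in parallel.
    u = d_m * u + (W @ syn) * dt + 0.1 * rng.random(N)
    total_spikes += spikes.sum()

print("mean firing rate [Hz]:", total_spikes / (N * 1000 * dt))
```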
Dewaraja, Yuni K; Ljungberg, Michael; Majumdar, Amitava; Bose, Abhijit; Koral, Kenneth F
2002-02-01
This paper reports the implementation of the SIMIND Monte Carlo code on an IBM SP2 distributed memory parallel computer. Basic aspects of running Monte Carlo particle transport calculations on parallel architectures are described. Our parallelization is based on equally partitioning photons among the processors and uses the Message Passing Interface (MPI) library for interprocessor communication and the Scalable Parallel Random Number Generator (SPRNG) to generate uncorrelated random number streams. These parallelization techniques are also applicable to other distributed memory architectures. A linear increase in computing speed with the number of processors is demonstrated for up to 32 processors. This speed-up is especially significant in Single Photon Emission Computed Tomography (SPECT) simulations involving higher energy photon emitters, where explicit modeling of the phantom and collimator is required. For (131)I, the accuracy of the parallel code is demonstrated by comparing simulated and experimental SPECT images from a heart/thorax phantom. Clinically realistic SPECT simulations using the voxel-man phantom are carried out to assess scatter and attenuation correction.
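A minimal mpi4py sketch of the partitioning strategy described above: photon histories are split evenly across ranks, each rank runs its own random stream, and the tallies are summed on rank 0. The detection probability and per-rank seeding are toy stand-ins (the paper uses SPRNG for uncorrelated streams); nothing here is SIMIND code.

```python
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_total = 1_000_000
# Equal partition of photon histories, remainder spread over low ranks.
n_local = n_total // size + (1 if rank < n_total % size else 0)

rng = random.Random(12345 + rank)   # simple stand-in for an uncorrelated stream
detected = 0
for _ in range(n_local):
    if rng.random() < 0.1:          # toy "photon reaches detector" event
        detected += 1

# Combine the per-rank tallies on rank 0.
total = comm.reduce(detected, op=MPI.SUM, root=0)
if rank == 0:
    print(f"detected {total} of {n_total} photons")
```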
Advanced Computational Methods for High-accuracy Refinement of Protein Low-quality Models
NASA Astrophysics Data System (ADS)
Zang, Tianwu
Predicting the three-dimensional structure of proteins has been a major interest in modern computational biology. While many successful methods can generate models within 3-5 Å root-mean-square deviation (RMSD) of the solution, progress in refining these models has been slow, and effective methods are urgently needed to bring low-quality models into higher-accuracy ranges (e.g., less than 2 Å RMSD). In this thesis, I present several novel computational methods to address the high-accuracy refinement problem. First, an enhanced sampling method, named parallel continuous simulated tempering (PCST), is developed to accelerate molecular dynamics (MD) simulation. Second, two energy biasing methods, the Structure-Based Model (SBM) and the Ensemble-Based Model (EBM), are introduced to perform targeted sampling around important conformations. Third, a three-step method is developed to blindly select high-quality models along the MD simulation. These methods work together to achieve significant refinement of low-quality models without any knowledge of the solution. Their effectiveness is examined in different applications. Using the PCST-SBM method, models with higher global distance test scores (GDT_TS) are generated and selected in MD simulations of 18 targets from the refinement category of the 10th Critical Assessment of Structure Prediction (CASP10). In addition, refinement tests on two CASP10 targets using the PCST-EBM method indicate that EBM may bring the initial model to even higher-quality levels. Furthermore, a multi-round PCST-SBM refinement protocol improves the quality of a protein model to a level sufficiently high for molecular replacement in X-ray crystallography. Our results underscore the crucial role of enhanced sampling in protein structure prediction and demonstrate that considerable improvement of low-accuracy structures is still achievable with current force fields.
Synchronisation under shocks: The Lévy Kuramoto model
NASA Astrophysics Data System (ADS)
Roberts, Dale; Kalloniatis, Alexander C.
2018-04-01
We study the Kuramoto model of identical oscillators on Erdős-Rényi (ER) and Barabási-Albert (BA) scale-free networks, examining the dynamics when perturbed by a Lévy noise. Lévy noise exhibits heavier tails than Gaussian noise while allowing for their tempering in a controlled manner. This allows us to understand how 'shocks' influence individual oscillator and collective system behaviour of a paradigmatic complex system. Skewed α-stable Lévy noise, equivalent to fractional diffusion perturbations, is considered, but overlaid by exponential tempering of rate λ. In an earlier paper we found that synchrony takes a variety of forms for identical Kuramoto oscillators subject to stable Lévy noise, not seen for the Gaussian case, and changing with α: a noise-induced drift, a smooth α dependence of the synchronisation cross-over point of ER and BA networks, and a severe loss of synchronisation at low values of α. In the presence of tempering we observe both analytically and numerically a dramatic change to the α < 1 behaviour, where synchronisation is sustained over a larger range of values of the 'noise strength' σ, improved compared to the α > 1 tempered cases. Analytically we study the system close to the phase synchronised fixed point and solve the tempered fractional Fokker-Planck equation. There we observe that densities show stronger support in the basin of attraction at low α for fixed coupling, σ and tempering λ. We then perform numerical simulations for networks of size N = 1000 and average degree d̄ = 10. There, we compute the order parameter r as a function of σ for fixed α and λ and observe values of r ≈ 1 over larger ranges of σ for α < 1 and λ ≠ 0. In addition we observe drift of both positive and negative slopes for different α and λ when native frequencies are equal, and confirm a sustainment of synchronisation down to low values of α. We propose a mechanism for this in terms of the basic shape of the tempered stable Lévy densities for various α and how it feeds into Kuramoto oscillator dynamics, and illustrate this with examples of specific paths.
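A minimal sketch of the synchronisation measure used above, the Kuramoto order parameter r = |⟨exp(iθ)⟩|, for identical oscillators on an ER-like network integrated with Euler-Maruyama steps. Gaussian noise is used as a simple stand-in, since sampling tempered α-stable increments is beyond this sketch; all sizes and constants are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, sigma, dt = 200, 2.0, 0.5, 0.01
A = (rng.random((N, N)) < 10 / N).astype(float)   # ER-like adjacency, mean degree ~10
deg = np.maximum(A.sum(axis=1), 1.0)
theta = rng.uniform(0.0, 2.0 * np.pi, N)

for _ in range(2000):
    # Network coupling: sum over neighbours of sin(theta_j - theta_i).
    coupling = (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
    theta += dt * K * coupling / deg + sigma * np.sqrt(dt) * rng.normal(size=N)

r = abs(np.exp(1j * theta).mean())    # r ~ 1 signals phase synchronisation
print(f"order parameter r = {r:.3f}")
```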
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
Komarov, Ivan; D'Souza, Roshan M
2012-01-01
The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data-structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8×-120× performance gain over various state-of-the-art serial algorithms when simulating different types of models.
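For reference, here is a sketch of the serial direct-method GSSA that GPU variants like the one above accelerate, for a toy single reaction A + B → C with rate constant k. The GPU version parallelises the propensity update and search steps across a warp; the serial logic below is only the baseline, with all names and values illustrative.

```python
import math, random

def gillespie(a0, b0, k, t_end, seed=0):
    """Direct-method SSA for the single reaction A + B -> C."""
    rng = random.Random(seed)
    t, a, b, c = 0.0, a0, b0, 0
    while t < t_end:
        propensity = k * a * b              # single-reaction propensity
        if propensity == 0.0:
            break                           # no reactants left
        # Exponentially distributed waiting time to the next event.
        t += -math.log(1.0 - rng.random()) / propensity
        a, b, c = a - 1, b - 1, c + 1       # fire the reaction
    return a, b, c

print(gillespie(a0=1000, b0=800, k=1e-4, t_end=10.0))
```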
Freezing Transition Studies Through Constrained Cell Model Simulation
NASA Astrophysics Data System (ADS)
Nayhouse, Michael; Kwon, Joseph Sang-Il; Heng, Vincent R.; Amlani, Ankur M.; Orkoulas, G.
2014-10-01
In the present work, a simulation method based on cell models is used to deduce the fluid-solid transition of a system of particles that interact via a pair potential. The simulations are implemented under constant-pressure conditions on a generalized version of the constrained cell model. The constrained cell model is constructed by dividing the volume into Wigner-Seitz cells and confining each particle in a single cell. This model is a special case of a more general cell model which is formed by introducing an additional field variable that controls the number of particles per cell and, thus, the relative stability of the solid against the fluid phase. High field values force configurations with one particle per cell and thus favor the solid phase. Fluid-solid coexistence on the isotherm that corresponds to a reduced temperature of 2 is determined from constant-pressure simulations of the generalized cell model using tempering and histogram reweighting techniques. The entire fluid-solid phase boundary is determined through a thermodynamic integration technique based on histogram reweighting, using the previous coexistence point as a reference point. The vapor-liquid phase diagram is obtained from constant-pressure simulations of the unconstrained system using tempering and histogram reweighting. The phase diagram of the system is found to contain a stable critical point and a triple point. The phase diagram of the corresponding constrained cell model is also found to contain both a stable critical point and a triple point.
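A minimal single-histogram reweighting sketch of the kind referenced above: energies sampled at inverse temperature β₀ are reweighted to estimate an observable at a nearby β. The energy samples and the observable below are synthetic stand-ins for simulation output, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta = 0.5, 0.52
E = rng.normal(100.0, 5.0, 100_000)   # energies sampled at beta0 (synthetic)
O = E**2                              # observable measured on each sample

# Reweighting factor w_i = exp(-(beta - beta0) * E_i), stabilised by
# subtracting the maximum log-weight before exponentiating.
logw = -(beta - beta0) * E
logw -= logw.max()
w = np.exp(logw)
O_reweighted = (w * O).sum() / w.sum()
print(f"<O> at beta = {beta}: {O_reweighted:.1f}")
```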
Traffic Simulations on Parallel Computers Using Domain Decomposition Techniques
DOT National Transportation Integrated Search
1995-01-01
Large scale simulations of Intelligent Transportation Systems (ITS) can only be achieved by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow the performance of traffic...
Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji
2015-07-01
GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310-323. doi: 10.1002/wcms.1220.
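A generic sketch of the Metropolis exchange test used in T-REMD, the simplest of the REMD algorithms listed above: neighbouring replicas swap configurations with probability min(1, exp[(β_i − β_j)(E_i − E_j)]). This is not GENESIS code; the temperatures and energies below are illustrative.

```python
import math, random

kB = 0.0019872  # Boltzmann constant in kcal/(mol K)

def try_swap(T_i, T_j, E_i, E_j, rng=random):
    """Metropolis acceptance test for exchanging two replica configurations."""
    beta_i, beta_j = 1.0 / (kB * T_i), 1.0 / (kB * T_j)
    delta = (beta_i - beta_j) * (E_j - E_i)   # accept iff exp(-delta) >= u
    return delta <= 0.0 or rng.random() < math.exp(-delta)

# Example: attempt a swap between 300 K and 310 K replicas.
print(try_swap(300.0, 310.0, E_i=-1200.0, E_j=-1180.0))
```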
On extending parallelism to serial simulators
NASA Technical Reports Server (NTRS)
Nicol, David; Heidelberger, Philip
1994-01-01
This paper describes an approach to discrete event simulation modeling that appears to be effective for developing portable and efficient parallel execution of models of large distributed systems and communication networks. In this approach, the modeler develops submodels using an existing sequential simulation modeling tool, using the full expressive power of the tool. A set of modeling language extensions permit automatically synchronized communication between submodels; however, the automation requires that any such communication must take a nonzero amount of simulation time. Within this modeling paradigm, a variety of conservative synchronization protocols can transparently support conservative execution of submodels on potentially different processors. A specific implementation of this approach, U.P.S. (Utilitarian Parallel Simulator), is described, along with performance results on the Intel Paragon.
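A minimal sketch of the conservative rule that the nonzero communication delay enables: a submodel may safely advance only to the minimum, over its input channels, of the sender's clock plus that channel's lookahead. The function and values are illustrative, not from U.P.S.

```python
def safe_advance_time(channels):
    """channels: list of (sender_clock, lookahead) pairs, lookahead > 0.

    Returns the time up to which all events on the receiving submodel
    can be processed without risk of a straggler arriving in the past.
    """
    return min(clock + lookahead for clock, lookahead in channels)

# A submodel fed by two neighbours may process all events up to t = 12.5:
print(safe_advance_time([(10.0, 2.5), (11.0, 3.0)]))
```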
Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers
NASA Astrophysics Data System (ADS)
Ogino, T.
High Performance Fortran (HPF) is one of the modern and common techniques for achieving high performance parallel computation. We have translated a three-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code proved almost comparable to that of the VPP Fortran version. A three-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops, with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation, permitting comparison with catalog values. We conclude that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
The cost of conservative synchronization in parallel discrete event simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approaches the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor.
A New Parallel Boundary Condition for Turbulence Simulations in Stellarators
NASA Astrophysics Data System (ADS)
Martin, Mike F.; Landreman, Matt; Dorland, William; Xanthopoulos, Pavlos
2017-10-01
For gyrokinetic simulations of core turbulence, the ``twist-and-shift'' parallel boundary condition (Beer et al., PoP, 1995), which involves a shift in radial wavenumber proportional to the global shear and a quantization of the simulation domain's aspect ratio, is the standard choice. But as this condition was derived under the assumption of axisymmetry, ``twist-and-shift'' as it stands is formally incorrect for turbulence simulations in stellarators. Moreover, for low-shear stellarators like W7X and HSX, the use of a global shear in the traditional boundary condition places an inflexible constraint on the aspect ratio of the domain, requiring more grid points to fully resolve its extent. Here, we present a parallel boundary condition for ``stellarator-symmetric'' simulations that relies on the local shear along a field line. This boundary condition is similar to ``twist-and-shift'', but has an added flexibility in choosing the parallel length of the domain based on local shear considerations in order to optimize certain parameters such as the aspect ratio of the simulation domain.
The Distributed Diagonal Force Decomposition Method for Parallelizing Molecular Dynamics Simulations
Boršnik, Urban; Miller, Benjamin T.; Brooks, Bernard R.; Janežič, Dušanka
2011-01-01
Parallelization is an effective way to reduce the computational time needed for molecular dynamics simulations. We describe a new parallelization method, the distributed-diagonal force decomposition method, with which we extend and improve the existing force decomposition methods. Our new method requires less data communication during molecular dynamics simulations than replicated data and current force decomposition methods, increasing the parallel efficiency. It also dynamically load-balances the processors' computational load throughout the simulation. The method is readily implemented in existing molecular dynamics codes and it has been incorporated into the CHARMM program, allowing its immediate use in conjunction with the many molecular dynamics simulation techniques that are already present in the program. We also present the design of the Force Decomposition Machine, a cluster of personal computers and networks that is tailored to running molecular dynamics simulations using the distributed diagonal force decomposition method. The design is expandable and provides various degrees of fault resilience. This approach is easily adaptable to computers with Graphics Processing Units because it is independent of the processor type being used. PMID:21793007
Parallelizing Timed Petri Net simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1993-01-01
The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.
Parallel computing method for simulating hydrological processes of large rivers under climate change
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.
2016-12-01
Climate change is one of the most widely recognized global environmental problems. It has altered watershed hydrological processes in their distribution over time and space, especially in the world's large rivers. Watershed hydrological simulation based on a physically based distributed hydrological model can yield better results than lumped models. However, such simulation involves a very large amount of calculation, especially for large rivers, and thus requires huge computing resources that may not be steadily available to researchers, or only at high expense; this has seriously restricted both research and application. The current parallel methods mostly parallelize over the space and time dimensions, computing the natural features in order (by grid cell, unit, or subbasin) from upstream to downstream based on the distributed hydrological model. This article proposes a high-performance computing method for hydrological process simulation with a high speedup ratio and parallel efficiency. It combines the spatial and temporal runoff characteristics of the distributed hydrological model with distributed data storage, an in-memory database, distributed computing, and parallel computing based on computing power units. The method is highly adaptable and extensible: it makes full use of the available computing and storage resources when these are limited, and its computing efficiency improves nearly linearly as computing resources increase. The method can satisfy the parallel computing requirements of hydrological process simulation in small, medium, and large rivers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, William Michael; Plimpton, Steven James; Wang, Peng
2010-03-01
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality.
Characterisation of VOC, SVOC, and PM emissions from peat burnt in laboratory simulations
Peat, or organic soil, is a vast store of organic carbon, widely distributed from polar temperate to equatorial regions. Drainage for agriculture and drought are drying vast areas of peat, exposing it to increasing fire risk, which may be exacerbated by climate change. This has ...
Research in parallel computing
NASA Technical Reports Server (NTRS)
Ortega, James M.; Henderson, Charles
1994-01-01
This report summarizes work on parallel computations for NASA Grant NAG-1-1529 for the period 1 Jan. - 30 June 1994. Short summaries on highly parallel preconditioners, target-specific parallel reductions, and simulation of delta-cache protocols are provided.
NASA Technical Reports Server (NTRS)
Reinsch, K. G. (Editor); Schmidt, W. (Editor); Ecer, A. (Editor); Haeuser, Jochem (Editor); Periaux, J. (Editor)
1992-01-01
A conference was held on parallel computational fluid dynamics and produced related papers. Topics discussed in these papers include: parallel implicit and explicit solvers for compressible flow, parallel computational techniques for Euler and Navier-Stokes equations, grid generation techniques for parallel computers, and aerodynamic simulation on massively parallel systems.
NASA Astrophysics Data System (ADS)
Furuichi, M.; Nishiura, D.
2015-12-01
Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and the Discrete Element Method (DEM) have been widely used to solve continuum and particle motions in computational geodynamics. These mesh-free methods are suitable for problems with complex geometries and boundaries. In addition, their Lagrangian nature allows non-diffusive advection, useful for tracking history-dependent properties (e.g., rheology) of the material. These potential advantages over mesh-based methods offer effective numerical applications to geophysical flow and tectonic processes, such as tsunamis with free surfaces and floating bodies, magma intrusion with rock fracture, and shear-zone pattern generation in granular deformation. Investigating such geodynamical problems with particle-based methods requires millions to billions of particles for realistic simulation, so parallel computing is essential for handling the huge computational cost. An efficient parallel implementation of the SPH and DEM methods is, however, known to be difficult, especially on distributed-memory architectures. Lagrangian methods inherently suffer a workload imbalance problem when parallelized over domains fixed in space, because particles move around and workloads change during the simulation. Dynamic load balancing is therefore a key technique for performing large-scale SPH and DEM simulations. In this work, we present a parallel implementation technique for the SPH and DEM methods that utilizes dynamic load balancing algorithms, aimed at high-resolution simulations over large domains on massively parallel supercomputer systems. Our method treats the imbalance in execution time across MPI processes as the nonlinear residual of the parallel domain decomposition and minimizes it with a Newton-like iteration; a slice-grid algorithm provides flexible domain decomposition in space. Numerical tests show that our approach is suitable for handling particles with different calculation costs (e.g., boundary particles) as well as heterogeneous computer architectures. We analyze the parallel efficiency and scalability on supercomputer systems (K computer, Earth Simulator 3, etc.).
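A toy sketch of one dimension of the slice-grid rebalancing idea: shift each slice boundary so that slices with above-average measured cost shrink and fast slices widen. The paper solves this with a Newton-like iteration; the simple damped proportional update, the function name, and the measurements below are all illustrative.

```python
def rebalance(boundaries, times, damping=0.5):
    """boundaries: slice edges x_0 < ... < x_P; times: measured cost per slice.

    Returns adjusted interior boundaries; the domain ends stay fixed.
    """
    mean = sum(times) / len(times)
    new = list(boundaries)
    for i in range(1, len(boundaries) - 1):
        width = boundaries[i] - boundaries[i - 1]
        # Widen slices that ran fast, shrink slices that ran slow.
        new[i] = new[i - 1] + width * (1 + damping * (mean - times[i - 1]) / mean)
    return new

# Four slices on [0, 1]; the first slice was the slowest, so it shrinks:
print(rebalance([0.0, 0.25, 0.5, 0.75, 1.0], [1.2, 0.9, 1.0, 0.9]))
```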
A tool for simulating parallel branch-and-bound methods
NASA Astrophysics Data System (ADS)
Golubeva, Yana; Orlov, Yury; Posypkin, Mikhail
2016-01-01
The Branch-and-Bound method is known as one of the most powerful but very resource-consuming global optimization methods. Parallel and distributed computing can efficiently cope with this issue. The major difficulty in the parallel B&B method is the need for dynamic load redistribution; the design and study of load balancing algorithms is therefore a separate and very important research topic. This paper presents a tool for simulating the parallel Branch-and-Bound method. The simulator allows one to run load balancing algorithms with various numbers of processors, sizes of the search tree, and characteristics of the supercomputer's interconnect, thereby fostering deep study of load distribution strategies. The process of resolving the optimization problem by the B&B method is replaced by a stochastic branching process. Data exchanges are modeled using the concept of logical time. The user-friendly graphical interface to the simulator provides efficient visualization and convenient performance analysis.
NASA Astrophysics Data System (ADS)
Kum, Oyeon; Dickson, Brad M.; Stuart, Steven J.; Uberuaga, Blas P.; Voter, Arthur F.
2004-11-01
Parallel replica dynamics simulation methods appropriate for the simulation of chemical reactions in molecular systems with many conformational degrees of freedom have been developed and applied to study the microsecond-scale pyrolysis of n-hexadecane in the temperature range of 2100-2500 K. The algorithm uses a transition detection scheme that is based on molecular topology, rather than energetic basins. This algorithm allows efficient parallelization of small systems even when using more processors than particles (in contrast to more traditional parallelization algorithms), and even when there are frequent conformational transitions (in contrast to previous implementations of the parallel replica algorithm). The parallel efficiency for pyrolysis initiation reactions was over 90% on 61 processors for this 50-atom system. The parallel replica dynamics technique results in reaction probabilities that are statistically indistinguishable from those obtained from direct molecular dynamics, under conditions where both are feasible, but allows simulations at temperatures as much as 1000 K lower than direct molecular dynamics simulations. The rate of initiation displayed Arrhenius behavior over the entire temperature range, with an activation energy and frequency factor of Ea = 79.7 kcal/mol and log10(A/s^-1) = 14.8, respectively, in reasonable agreement with experiment and empirical kinetic models. Several interesting unimolecular reaction mechanisms were observed in simulations of the chain propagation reactions above 2000 K, which are not included in most coarse-grained kinetic models. More studies are needed in order to determine whether these mechanisms are experimentally relevant, or specific to the potential energy surface used.
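A worked check of the Arrhenius fit reported above: with Ea = 79.7 kcal/mol and log10(A/s⁻¹) = 14.8 taken from the abstract, the initiation rate k(T) = A exp(−Ea/RT) follows directly at the two ends of the studied temperature range.

```python
import math

R = 0.0019872           # gas constant in kcal/(mol K)
Ea, A = 79.7, 10**14.8  # activation energy and frequency factor from the abstract

for T in (2100.0, 2500.0):
    k = A * math.exp(-Ea / (R * T))   # Arrhenius rate
    print(f"T = {T:.0f} K: k ~ {k:.2e} 1/s")
# Rates of ~1e6-1e8 1/s are consistent with the microsecond-scale
# pyrolysis the simulations target.
```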
Parallel, Asynchronous Executive (PAX): System concepts, facilities, and architecture
NASA Technical Reports Server (NTRS)
Jones, W. H.
1983-01-01
The Parallel, Asynchronous Executive (PAX) is a software operating system simulation that allows many computers to work on a single problem at the same time. PAX is currently implemented on a UNIVAC 1100/42 computer system. Independent UNIVAC runstreams are used to simulate independent computers. Data are shared among independent UNIVAC runstreams through shared mass-storage files. PAX has achieved the following: (1) applied several computing processes simultaneously to a single, logically unified problem; (2) resolved most parallel processor conflicts by careful work assignment; (3) resolved by means of worker requests to PAX all conflicts not resolved by work assignment; (4) provided fault isolation and recovery mechanisms to meet the problems of an actual parallel, asynchronous processing machine. Additionally, one real-life problem has been constructed for the PAX environment. This is CASPER, a collection of aerodynamic and structural dynamic problem simulation routines. CASPER is not discussed in this report except to provide examples of parallel-processing techniques.
Effectiveness of simulation for improvement in self-efficacy among novice nurses: a meta-analysis.
Franklin, Ashley E; Lee, Christopher S
2014-11-01
The influence of simulation on self-efficacy for novice nurses has been reported inconsistently in the literature. Effect sizes across studies were synthesized using random-effects meta-analyses. Simulation improved self-efficacy in one-group, pretest-posttest studies (Hedges' g=1.21, 95% CI [0.63, 1.78]; p<0.001). Simulation also was favored over control teaching interventions in improving self-efficacy in studies with experimental designs (Hedges' g=0.27, 95% CI [0.1, 0.44]; p=0.002). In nonexperimental designs, consistent conclusions about the influence of simulation were tempered by significant between-study differences in effects. Simulation is effective at increasing self-efficacy among novice nurses, compared with traditional control groups. Copyright 2014, SLACK Incorporated.
A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nomura, K; Seymour, R; Wang, W
2009-02-17
A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based on hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e., petaflops·day of computing) is estimated as NT = 2.14 (e.g., N = 2.14 million atoms for T = 1 microsecond).
PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joshi, Abhijit S; Jain, Prashant K; Mudrich, Jaime A
2012-01-01
At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90 and parallelized using the Message Passing Interface (MPI) library. The Silo library is used to compact and write the data files, and the VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multi relaxation time (MRT) LBM schemes have been implemented in PRATHAM. To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted, allowing large-scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.
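A minimal sketch of the Smagorinsky closure described above, in lattice units: the eddy viscosity is ν_t = (C_s Δ)² |S|, and in a BGK-type LBM it enters by locally enlarging the relaxation time via τ = 3ν + 1/2. The constant and the strain-rate magnitude below are illustrative (in practice |S| is obtained locally, e.g., from non-equilibrium moments); this is not PRATHAM code.

```python
def local_relaxation_time(nu0, strain_rate_mag, Cs=0.1, Delta=1.0):
    """BGK relaxation time with Smagorinsky eddy viscosity, lattice units."""
    nu_t = (Cs * Delta) ** 2 * strain_rate_mag  # subgrid eddy viscosity
    return 3.0 * (nu0 + nu_t) + 0.5             # tau = 3*nu + 1/2 in LBM-BGK

# A cell with molecular viscosity 0.01 and |S| = 0.2 relaxes more slowly:
print(local_relaxation_time(nu0=0.01, strain_rate_mag=0.2))
```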
Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM
NASA Astrophysics Data System (ADS)
de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl
2002-03-01
We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 10^5-10^6) in reasonable CPU times. The performances of our implementation of the parallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.
Absolute Humidity and the Seasonality of Influenza (Invited)
NASA Astrophysics Data System (ADS)
Shaman, J. L.; Pitzer, V.; Viboud, C.; Grenfell, B.; Goldstein, E.; Lipsitch, M.
2010-12-01
Much of the observed wintertime increase of mortality in temperate regions is attributed to seasonal influenza. A recent re-analysis of laboratory experiments indicates that absolute humidity strongly modulates the airborne survival and transmission of the influenza virus. Here we show that the onset of increased wintertime influenza-related mortality in the United States is associated with anomalously low absolute humidity levels during the prior weeks. We then use an epidemiological model, in which observed absolute humidity conditions temper influenza transmission rates, to successfully simulate the seasonal cycle of observed influenza-related mortality. The model results indicate that direct modulation of influenza transmissibility by absolute humidity alone is sufficient to produce this observed seasonality. These findings provide epidemiological support for the hypothesis that absolute humidity drives seasonal variations of influenza transmission in temperate regions. In addition, we show that variations of the basic and effective reproductive numbers for influenza, caused by seasonal changes in absolute humidity, are consistent with the general timing of pandemic influenza outbreaks observed for 2009 A/H1N1 in temperate regions. Indeed, absolute humidity conditions correctly identify the region of the United States vulnerable to a third, wintertime wave of pandemic influenza. These findings suggest that the timing of pandemic influenza outbreaks is controlled by a combination of absolute humidity conditions, levels of susceptibility and changes in population mixing and contact rates.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bylaska, Eric J., E-mail: Eric.Bylaska@pnnl.gov; Weare, Jonathan Q., E-mail: weare@uchicago.edu; Weare, John H., E-mail: jweare@ucsd.edu
2013-08-21
Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator f (e.g., the Verlet algorithm) is available to propagate the system from time t_i (trajectory positions and velocities x_i = (r_i, v_i)) to time t_{i+1} (x_{i+1}) by x_{i+1} = f_i(x_i), the dynamics problem spanning an interval from t_0 to t_M can be transformed into a root finding problem, F(X) = [x_i − f(x_{i−1})]_{i=1,…,M} = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsened time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence, and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and an HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup ((serial execution time)/(parallel execution time)) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow TCP/IP networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way, these algorithms can be used for long-time, high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of an MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
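A toy sketch of the root-finding formulation above: the whole trajectory X = (x_1, …, x_M) is treated as the unknown of F(X) = [x_i − f(x_{i−1})] = 0. For simplicity this sketch uses a plain fixed-point sweep rather than the paper's quasi-Newton schemes, and a velocity-Verlet step for a 1D harmonic oscillator as the propagator; in the paper each residual entry is evaluated on its own processor.

```python
import numpy as np

def f(x, dt=0.05):
    """One velocity-Verlet step for a unit-mass, unit-stiffness oscillator."""
    r, v = x
    a = -r
    r_new = r + dt * v + 0.5 * dt**2 * a
    v_new = v + 0.5 * dt * (a - r_new)   # a_new = -r_new
    return np.array([r_new, v_new])

M = 200
x0 = np.array([1.0, 0.0])
X = np.tile(x0, (M, 1))                  # crude initial guess for the trajectory

for sweep in range(300):
    prev = np.vstack([x0[None, :], X[:-1]])
    FX = X - np.array([f(p) for p in prev])  # residual; each row is independent
    X -= FX                              # simplest corrector: X <- f(prev)
    if np.abs(FX).max() < 1e-12:
        break

print(f"converged after {sweep + 1} sweeps; final position {X[-1][0]:.4f}")
```

This plain sweep needs about M iterations because information propagates one step per sweep; the quasi-Newton and preconditioned variants in the paper are what make the approach pay off in practice.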
NASA Astrophysics Data System (ADS)
Wang, Audrey; Price, David T.
2007-03-01
A simple integrated algorithm was developed to relate global climatology to the distributions of tree plant functional types (PFTs). Multivariate cluster analysis was performed to analyze the statistical homogeneity of the climate space occupied by individual tree PFTs. Forested regions identified from the satellite-based GLC2000 classification were separated into tropical, temperate, and boreal sub-PFTs for use in the Canadian Terrestrial Ecosystem Model (CTEM). Global data sets of monthly minimum temperature, growing degree days, an index of climatic moisture, and estimated PFT cover fractions were then used as variables in the cluster analysis. The statistical results for individual PFT clusters were found to be consistent with other global-scale classifications of dominant vegetation. Improving on previous quantifications of the climatic limitations on PFT distributions, the results also demonstrated overlap of PFT cluster boundaries reflecting vegetation transitions, for example between tropical and temperate biomes. The resulting global database should provide a better basis for simulating the interaction of climate change and terrestrial ecosystem dynamics using global vegetation models.
Jolly, William M; Nemani, Ramakrishna; Running, Steven W
2004-09-01
Some saplings and shrubs growing in the understory of temperate deciduous forests extend their periods of leaf display beyond that of the overstory, resulting in periods when understory radiation, and hence productivity, are not limited by the overstory canopy. To assess the importance of the duration of leaf display on the productivity of understory and overstory trees of deciduous forests in the north eastern United States, we applied the simulation model, BIOME-BGC with climate data for Hubbard Brook Experimental Forest, New Hampshire, USA and mean ecophysiological data for species of deciduous, temperate forests. Extension of the overstory leaf display period increased overstory leaf area index (LAI) by only 3 to 4% and productivity by only 2 to 4%. In contrast, extending the growing season of the understory relative to the overstory by one week in both spring and fall, increased understory LAI by 35% and productivity by 32%. A 2-week extension of the growing period in both spring and fall increased understory LAI by 53% and productivity by 55%.
Constitutive Model Constants for Al7075-T651 and Al7075-T6
NASA Astrophysics Data System (ADS)
Brar, N. S.; Joshi, V. S.; Harris, B. W.
2009-12-01
Aluminum 7075-T651 and 7075-T6 are characterized at quasi-static and high strain rates to determine Johnson-Cook (J-C) strength and fracture model constants. Constitutive model constants are required as input to computer codes to simulate projectile (fragment) impact or similar impact events on structural components made of these materials. Although the two tempers show similar elongation at breakage, the ultimate tensile strength of T651 temper is generally lower than the T6 temper. Johnson-Cook strength model constants (A, B, n, C, and m) for the two alloys are determined from high strain rate tension stress-strain data at room and high temperature to 250°C. The Johnson-Cook fracture model constants are determined from quasi-static and medium strain rate as well as high temperature tests on notched and smooth tension specimens. Although the J-C strength model constants are similar, the fracture model constants show wide variations. Details of the experimental method used and the results for the two alloys are presented.
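The J-C strength model named above has a standard functional form, σ = (A + Bεⁿ)(1 + C ln(ε̇/ε̇₀))(1 − T*ᵐ) with homologous temperature T* = (T − T_room)/(T_melt − T_room). The sketch below evaluates it; the constants are placeholders for illustration only, since the fitted values for Al7075-T651 and Al7075-T6 are not given in the abstract.

```python
import math

def jc_flow_stress(eps, epsdot, T, A=520e6, B=477e6, n=0.52, C=0.025, m=1.0,
                   epsdot0=1.0, Troom=293.0, Tmelt=893.0):
    """Johnson-Cook flow stress (Pa); all constants here are placeholders."""
    Tstar = max(0.0, (T - Troom) / (Tmelt - Troom))  # homologous temperature
    strain_term = A + B * eps**n                      # strain hardening
    rate_term = 1.0 + C * math.log(epsdot / epsdot0)  # strain-rate hardening
    thermal_term = 1.0 - Tstar**m                     # thermal softening
    return strain_term * rate_term * thermal_term

# Flow stress at 5% plastic strain, 1000/s, and 250 C (523 K):
print(f"{jc_flow_stress(0.05, 1.0e3, 523.0) / 1e6:.0f} MPa")
```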
NASA Technical Reports Server (NTRS)
Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.
1991-01-01
A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
NASA Astrophysics Data System (ADS)
Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens
2017-10-01
The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi:http://dx.doi.org/10.17632/7ztzj63r68.1 Licencing provisions: Apache-2.0 Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++ Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
PARALLEL HOP: A SCALABLE HALO FINDER FOR MASSIVE COSMOLOGICAL DATA SETS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skory, Stephen; Turk, Matthew J.; Norman, Michael L.
2010-11-15
Modern N-body cosmological simulations contain billions (10^9) of dark matter particles. These simulations require hundreds to thousands of gigabytes of memory and employ hundreds to tens of thousands of processing cores on many compute nodes. In order to study the distribution of dark matter in a cosmological simulation, the dark matter halos must be identified using a halo finder, which establishes the halo membership of every particle in the simulation. The resources required for halo finding are similar to the requirements for the simulation itself. In particular, simulations have become too extensive to use commonly employed halo finders, such that the computational requirements to identify halos must now be spread across multiple nodes and cores. Here, we present a scalable-parallel halo finding method called Parallel HOP for large-scale cosmological simulation data. Based on the halo finder HOP, it utilizes the message passing interface and domain decomposition to distribute the halo finding workload across multiple compute nodes, enabling analysis of much larger data sets than is possible with the strictly serial or previous parallel implementations of HOP. We provide a reference implementation of this method as a part of the toolkit yt, an analysis toolkit for adaptive mesh refinement data that includes complementary analysis modules. Additionally, we discuss a suite of benchmarks that demonstrate that this method scales well up to several hundred tasks and data sets in excess of 2000^3 particles. The Parallel HOP method and our implementation can be readily applied to any kind of N-body simulation data and is therefore widely applicable.
Fully Parallel MHD Stability Analysis Tool
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2014-10-01
Progress on the full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulating MHD instabilities with low, intermediate, and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is achieved by repeating the steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
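A minimal sketch of the inverse iteration step mentioned above: to extract an eigenmode near a shift σ, repeatedly solve (A − σI)y = x and normalise. In the parallel code it is this solve that gets distributed; here a small dense symmetric random matrix stands in for the MARS stability operator, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
A = rng.normal(size=(n, n))
A = 0.5 * (A + A.T)                 # symmetric stand-in operator
sigma = 1.0                         # shift near the eigenvalue of interest

# Factor (A - sigma*I) once (via its inverse, fine for a small sketch)
# and reuse it every iteration.
M = np.linalg.inv(A - sigma * np.eye(n))

x = rng.normal(size=n)
for _ in range(50):
    y = M @ x                       # solve (A - sigma*I) y = x
    x = y / np.linalg.norm(y)       # normalise to prevent overflow

eigenvalue = x @ A @ x              # Rayleigh quotient of the converged vector
print(f"eigenvalue nearest {sigma}: {eigenvalue:.6f}")
```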
NASA Astrophysics Data System (ADS)
Yang, Sheng-Chun; Lu, Zhong-Yuan; Qian, Hu-Jun; Wang, Yong-Lei; Han, Jie-Ping
2017-11-01
In this work, we upgraded the electrostatic interaction method of CU-ENUF (Yang et al., 2016), which first applied CUNFFT (nonequispaced fast Fourier transforms based on CUDA) to the reciprocal-space electrostatic computation so that the entire electrostatic calculation is carried out on the GPU. The upgraded edition of CU-ENUF runs in a hybrid parallel way: the computation is first parallelized across multiple computer nodes, and then further on the GPU installed in each node. With this parallel strategy, the size of the simulation system is no longer restricted by the throughput of a single CPU or GPU. The most critical technical problem, how to parallelize CUNFFT within this strategy, is solved through careful analysis of its basic principles together with several algorithmic techniques. Furthermore, the upgraded method can compute electrostatic interactions for both atomistic molecular dynamics (MD) and dissipative particle dynamics (DPD). Finally, benchmarks conducted for validation and performance indicate that the upgraded method not only achieves good precision when suitable parameters are set, but also provides an efficient way to compute electrostatic interactions for very large simulation systems. Program Files doi: http://dx.doi.org/10.17632/zncf24fhpv.1 Licensing provisions: GNU General Public License 3 (GPL) Programming language: C, C++, and CUDA C Supplementary material: The program is designed for efficient computation of electrostatic interactions in large-scale simulation systems and runs on computers equipped with NVIDIA GPUs. It has been tested on (a) a single computer node with an Intel(R) Core(TM) i7-3770 @ 3.40 GHz (CPU) and a GTX 980 Ti (GPU), and (b) MPI-parallel computer nodes with the same configuration. Nature of problem: In molecular dynamics simulation, the electrostatic interaction is the most time-consuming computation because of its long-range character and slow convergence in simulation space, and it accounts for most of the total simulation time. Although the GPU-based parallel method CU-ENUF (Yang et al., 2016) achieved a qualitative leap over previous methods in computing electrostatic interactions, its capability is limited by the throughput of a single GPU for super-scale simulation systems. An effective method is therefore needed to compute electrostatic interactions efficiently for simulation systems of super-scale size. Solution method: We constructed a hybrid parallel architecture in which CPUs and GPUs are combined to accelerate the electrostatic computation effectively. First, the simulation system is divided into many subtasks via a domain-decomposition method. MPI (Message Passing Interface) is then used to implement CPU-parallel computation, with each computer node handling one subtask, and each subtask is in turn executed efficiently in parallel on that node's GPU. In this hybrid parallel method, the most critical technical problem, parallelizing CUNFFT (nonequispaced fast Fourier transform based on CUDA), is solved through careful analysis of its basic principles together with several algorithmic techniques. Restrictions: HP-ENUF is mainly oriented to super-scale system simulations, for which its performance advantage is fully realized.
However, for a small simulation system containing fewer than 10^6 particles, the multiple-node mode has no apparent efficiency advantage over the single-node mode, and may even be less efficient, owing to the network delay among computer nodes. References: (1) S.-C. Yang, H.-J. Qian, Z.-Y. Lu, Appl. Comput. Harmon. Anal. 2016, http://dx.doi.org/10.1016/j.acha.2016.04.009. (2) S.-C. Yang, Y.-L. Wang, G.-S. Jiao, H.-J. Qian, Z.-Y. Lu, J. Comput. Chem. 37 (2016) 378. (3) S.-C. Yang, Y.-L. Zhu, H.-J. Qian, Z.-Y. Lu, Appl. Chem. Res. Chin. Univ., 2017, http://dx.doi.org/10.1007/s40242-016-6354-5. (4) Y.-L. Zhu, H. Liu, Z.-W. Li, H.-J. Qian, G. Milano, Z.-Y. Lu, J. Comput. Chem. 34 (2013) 2197.
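The hybrid strategy described above can be illustrated with a minimal sketch: MPI distributes slab subdomains across nodes, and each node hands its slab to a local GPU kernel. The sketch below uses Python with mpi4py purely for illustration (HP-ENUF itself is written in C/C++/CUDA C); the per-node CUNFFT kernel is replaced by a dummy NumPy placeholder, and all names and sizes are hypothetical.

```python
# Minimal sketch of the hybrid idea (not HP-ENUF itself): MPI splits the
# particles into slab subdomains; each rank would hand its slab to a local
# GPU kernel, represented here by a dummy NumPy placeholder.
import numpy as np
from mpi4py import MPI  # assumes mpi4py is installed

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 100_000  # hypothetical particle count
if rank == 0:
    rng = np.random.default_rng(0)
    pos = rng.random((N, 3))
    chg = rng.choice([-1.0, 1.0], size=N)
    order = np.argsort(pos[:, 0])  # slab decomposition along x
    parts = list(zip(np.array_split(pos[order], size),
                     np.array_split(chg[order], size)))
else:
    parts = None

local_pos, local_chg = comm.scatter(parts, root=0)

def reciprocal_energy_on_gpu(pos, chg):
    """Placeholder for the per-node CUNFFT reciprocal-space kernel."""
    return float(np.sum(chg) ** 2) / (1.0 + len(chg))  # dummy value

local_e = reciprocal_energy_on_gpu(local_pos, local_chg)
total_e = comm.reduce(local_e, op=MPI.SUM, root=0)
if rank == 0:
    print("total (dummy) reciprocal-space energy:", total_e)
```

Run with, for example, `mpiexec -n 4 python sketch.py`; each rank stands in for one computer node with its own GPU.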
Hipp, Andrew L; Manos, Paul S; González-Rodríguez, Antonio; Hahn, Marlene; Kaproth, Matthew; McVay, John D; Avalos, Susana Valencia; Cavender-Bares, Jeannine
2018-01-01
Oaks (Quercus, Fagaceae) are the dominant tree genus of North America in species number and biomass, and Mexico is a global center of oak diversity. Understanding the origins of oak diversity is key to understanding the biodiversity of northern temperate forests. A phylogenetic study of biogeography, niche evolution and diversification patterns in Quercus was performed using 300 samples representing 146 species. Next-generation sequencing data were generated using the restriction-site associated DNA (RAD-seq) method. A time-calibrated maximum likelihood phylogeny was inferred and analyzed with bioclimatic, soils, and leaf habit data to reconstruct the biogeographic and evolutionary history of the American oaks. Our highly resolved phylogeny demonstrates sympatric parallel diversification in climatic niche, leaf habit, and diversification rates. The two major American oak clades arose in what is now the boreal zone and radiated, in parallel, from eastern North America into Mexico and Central America. Oaks adapted rapidly to niche transitions. The Mexican oaks are particularly numerous, not because Mexico is a center of origin, but because of high rates of lineage diversification associated with high rates of evolution along moisture gradients and between the evergreen and deciduous leaf habits. Sympatric parallel diversification in the oaks has shaped the diversity of North American forests. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji
2015-01-01
GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310–323. doi: 10.1002/wcms.1220 PMID:26753008
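For orientation, the swap step of the T-REMD algorithms listed above reduces to a Metropolis test on the energies of neighboring replicas. Below is a minimal sketch under standard T-REMD assumptions; it is not GENESIS code, and the units, temperature ladder, and energies are illustrative.

```python
import math, random

def attempt_swap(E_i, E_j, T_i, T_j, kB=0.0019872041):  # kcal/(mol K), assumed
    """Metropolis acceptance for exchanging neighboring replicas."""
    delta = (1.0 / (kB * T_i) - 1.0 / (kB * T_j)) * (E_j - E_i)
    return delta <= 0.0 or random.random() < math.exp(-delta)

# Hypothetical ladder of replica temperatures and instantaneous energies.
temps = [300.0, 310.3, 320.9, 331.9]          # K
energies = [-120.0, -118.5, -116.0, -113.2]   # kcal/mol
for k in range(len(temps) - 1):
    if attempt_swap(energies[k], energies[k + 1], temps[k], temps[k + 1]):
        energies[k], energies[k + 1] = energies[k + 1], energies[k]
        print("swapped replicas", k, "and", k + 1)
```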
NASA Technical Reports Server (NTRS)
Krosel, S. M.; Milner, E. J.
1982-01-01
The application of predictor-corrector integration algorithms developed for the digital parallel processing environment is investigated. The algorithms are implemented and evaluated through the use of a software simulator which provides an approximate representation of the parallel processing hardware. Test cases which focus on the use of the algorithms are presented, and a specific application using a linear model of a turbofan engine is considered. Results are presented showing the effects of integration step size and the number of processors on simulation accuracy. Real-time performance, interprocessor communication, and algorithm startup are also discussed.
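The class of algorithm under study can be sketched compactly. The example below is a generic second-order Adams-Bashforth predictor with a trapezoidal Adams-Moulton corrector in PECE form, applied to a small linear system standing in for the turbofan model; it illustrates the method family and the startup issue, not the report's specific variants.

```python
import numpy as np

def pece_step(f, t, x, x_dot_prev, h):
    """One PECE step: AB2 predictor, trapezoidal (AM2) corrector."""
    x_dot = f(t, x)                                    # Evaluate
    x_pred = x + h * (1.5 * x_dot - 0.5 * x_dot_prev)  # Predict (AB2)
    x_dot_pred = f(t + h, x_pred)                      # Evaluate
    x_corr = x + 0.5 * h * (x_dot + x_dot_pred)        # Correct (AM2)
    return x_corr, x_dot

# Linear test model x' = A x (a stand-in for a linear engine model).
A = np.array([[0.0, 1.0], [-4.0, -0.4]])
f = lambda t, x: A @ x

h, x = 0.01, np.array([1.0, 0.0])
x_dot_prev = f(0.0, x)  # startup: seed the derivative history with f(t0, x0)
for n in range(1000):
    x, x_dot_prev = pece_step(f, n * h, x, x_dot_prev, h)
print(x)
```

Note the startup step: the two-step predictor needs a derivative history, which is why algorithm startup is a distinct concern in the report.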
Shen, Wenfeng; Wei, Daming; Xu, Weimin; Zhu, Xin; Yuan, Shizhong
2010-10-01
Biological computations like electrocardiological modelling and simulation usually require high-performance computing environments. This paper introduces an implementation of parallel computation for computer simulation of electrocardiograms (ECGs) in a personal computer environment with an Intel Core (TM) 2 Quad Q6600 CPU and a Geforce 8800GT GPU, with software support from OpenMP and CUDA. It was tested in three parallelization setups: (a) a four-core CPU without a general-purpose GPU, (b) a general-purpose GPU plus one core of the CPU, and (c) a four-core CPU plus a general-purpose GPU. To take full advantage of a multi-core CPU and a general-purpose GPU, an algorithm based on load-prediction dynamic scheduling was developed and applied to setup (c). In the simulation with 1600 time steps, the speedup of the parallel computation as compared to the serial computation was 3.9 in setup (a), 16.8 in setup (b), and 20.0 in setup (c). This study demonstrates that a current PC with a multi-core CPU and a general-purpose GPU provides a good environment for parallel computations in biological modelling and simulation studies. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
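A load-prediction dynamic scheduler of the general kind described for setup (c) can be sketched as follows: measure each device's recent throughput and split the next batch of work in proportion. The Python sketch below is a generic illustration, not the paper's algorithm; the smoothing rule and all numbers are assumptions.

```python
def split_work(n_items, rate_cpu, rate_gpu):
    """Give each device a share proportional to its predicted rate."""
    share_gpu = rate_gpu / (rate_cpu + rate_gpu)
    n_gpu = round(n_items * share_gpu)
    return n_items - n_gpu, n_gpu

def update_rate(old_rate, items_done, elapsed, alpha=0.5):
    """Exponentially smoothed throughput prediction (items per second)."""
    return (1 - alpha) * old_rate + alpha * (items_done / elapsed)

# Hypothetical loop over time steps of a simulation.
rate_cpu, rate_gpu = 1.0, 4.0          # initial throughput guesses
for step in range(5):
    n_cpu, n_gpu = split_work(1600, rate_cpu, rate_gpu)
    # ... run n_cpu items on the CPU cores and n_gpu on the GPU, timing each ...
    t_cpu, t_gpu = 0.8, 0.2            # stand-in measured times (seconds)
    rate_cpu = update_rate(rate_cpu, n_cpu, t_cpu)
    rate_gpu = update_rate(rate_gpu, n_gpu, t_gpu)
    print(step, n_cpu, n_gpu)
```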
Dependability analysis of parallel systems using a simulation-based approach. M.S. Thesis
NASA Technical Reports Server (NTRS)
Sawyer, Darren Charles
1994-01-01
The analysis of dependability in large, complex, parallel systems executing real applications or workloads is examined in this thesis. To effectively demonstrate the wide range of dependability problems that can be analyzed through simulation, the analysis of three case studies is presented. For each case, the organization of the simulation model used is outlined, and the results from simulated fault injection experiments are explained, showing the usefulness of this method in dependability modeling of large parallel systems. The simulation models are constructed using DEPEND and C++. Where possible, methods to increase dependability are derived from the experimental results. Another interesting facet of all three cases is the presence of some kind of workload or application executing in the simulation while faults are injected. This provides a completely new dimension to this type of study, not possible to model accurately with analytical approaches.
2014-06-12
interferometry and polarimetry. In the paper, the model was used to simulate SAR data for Mangrove (tropical) and Nezer (temperate) forests for P-band and... Scattering Model Applied to Radiometry, Interferometry, and Polarimetry at P- and L-Band. IEEE Transactions on Geoscience and Remote Sensing 44(4): 849
PARTIAL INHIBITION OF IN VITRO POLLEN GERMINATION BY SIMULATED SOLAR ULTRAVIOLET-B RADIATION
Pollen from four temperate-latitude taxa was treated with UV radiation in a portion of the UV-B (280-320 nm) waveband during in vitro germination. Inhibition of germination was noted in this pollen compared to samples treated identically except for the exclusion of the UV-B portion...
Impact of seasonality on artificial drainage discharge under temperate climate conditions
Ulrike Hirt; Annett Wetzig; Devandra Amatya; Marisa Matranga
2011-01-01
Artificial drainage systems affect all components of the water and matter balance. For the proper simulation of water and solute fluxes, information is needed about artificial drainage discharge rates and their response times. However, there is relatively little information available about the response of artificial drainage systems to precipitation. To address this...
2013-08-01
Purpose: This work models dislocations in the energetic molecular crystal RDX using the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). The SB potential, which includes dispersion and electrostatic interactions, is used for HMX/RDX; constants for the SB potential are given in table 1.
Constitutive Model Calibration via Autonomous Multiaxial Experimentation (Postprint)
2016-09-17
test machine. Experimental data is reduced and finite element simulations are conducted in parallel with the test based on experimental strain conditions. Optimization methods... be used directly in finite element simulations of more complex geometries. Keywords: Axial/torsional experimentation • Plasticity • Constitutive model
Parallel Implementation of the Discontinuous Galerkin Method
NASA Technical Reports Server (NTRS)
Baggag, Abdalkader; Atkins, Harold; Keyes, David
1999-01-01
This paper describes a parallel implementation of the discontinuous Galerkin method. Discontinuous Galerkin is a spatially compact method that retains its accuracy and robustness on non-smooth unstructured grids and is well suited for time dependent simulations. Several parallelization approaches are studied and evaluated. The most natural and symmetric of the approaches has been implemented in an object-oriented code used to simulate aeroacoustic scattering. The parallel implementation is MPI-based and has been tested on various parallel platforms such as the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. The scalability results presented for the SGI Origin show slightly superlinear speedup on a fixed-size problem due to cache effects.
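For reference, fixed-size (strong-scaling) speedup and parallel efficiency are computed as below; superlinear speedup corresponds to efficiency above 1, which can occur when the per-processor working set drops into cache. The timings here are hypothetical, not the paper's measurements.

```python
def speedup(t1, tp):
    """Fixed-size speedup relative to the single-processor time."""
    return t1 / tp

def efficiency(t1, tp, p):
    """Parallel efficiency; values above 1.0 are superlinear."""
    return speedup(t1, tp) / p

times = {1: 1000.0, 2: 495.0, 4: 242.0, 8: 118.0, 16: 61.0}  # hypothetical seconds
for p, t in times.items():
    print(p, round(speedup(times[1], t), 2), round(efficiency(times[1], t, p), 3))
```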
Yang, Fu-lin; Zhou, Guang-sheng; Zhang, Feng; Wang, Feng-yu; Bao, Fang; Ping, Xiao-yan
2009-12-01
Based on the meteorological and biological observation data from the temperate desert steppe ecosystem research station in Sunitezuoqi of Inner Mongolia during the growth season (from May 1st to October 15th, 2008), the diurnal and seasonal characteristics of surface albedo in the steppe were analyzed, and a related model was constructed. In the steppe, the diurnal variation of surface albedo was mainly affected by solar altitude, being higher just after sunrise and before sunset and lower at midday. During the growth season, the surface albedo ranged from 0.20 to 0.34, with an average of 0.25; it was higher in May, decreased in June, remained relatively stable from July to September, and increased in October. This seasonal variation was related to the phenology of canopy leaves and was affected by precipitation. Soil water content (SWC) and leaf area index (LAI) were the key factors affecting the surface albedo. A model for the surface albedo as a function of SWC and LAI was developed, which showed good agreement between simulated and observed surface albedo.
Experimental evidence for beneficial effects of projected climate change on hibernating amphibians.
Üveges, Bálint; Mahr, Katharina; Szederkényi, Márk; Bókony, Veronika; Hoi, Herbert; Hettyey, Attila
2016-05-27
Amphibians are the most threatened vertebrates today, experiencing worldwide declines. In recent years considerable effort has been invested in identifying the causes of these declines. Climate change has been identified as one such cause; however, the expected effects of predicted milder, shorter winters on the hibernation success of temperate-zone amphibians have remained controversial, mainly due to a lack of controlled experimental studies. Here we present a laboratory experiment testing the effects of simulated climate change on hibernating juvenile common toads (Bufo bufo). We simulated hibernation conditions by exposing toadlets to current (1.5 °C) or elevated (4.5 °C) hibernation temperatures in combination with current (91 days) or shortened (61 days) hibernation length. We found that a shorter winter and a milder hibernation temperature increased the survival of toads during hibernation. Furthermore, the increase in temperature and shortening of the cold period had a synergistic positive effect on body mass change during hibernation. Consequently, while climate change may pose severe challenges for amphibians of the temperate zone during their activity period, the negative effects may be dampened by the shorter and milder winters experienced during hibernation.
Yu, Mei; Gao, Qiong
2011-01-01
Background and Aims The ability to simulate plant competition accurately is essential for plant functional type (PFT)-based models used in climate-change studies, yet gaps and uncertainties remain in our understanding of the details of the competition mechanisms and in ecosystem responses at a landscape level. This study examines secondary succession in a temperate deciduous forest in eastern China with the aim of determining if competition between tree types can be explained by differences in leaf ecophysiological traits and growth allometry, and whether ecophysiological traits and habitat spatial configurations among PFTs differentiate their responses to climate change. Methods A temperate deciduous broadleaved forest in eastern China was studied, containing two major vegetation types dominated by Quercus liaotungensis (OAK) and by birch/poplar (Betula platyphylla and Populus davidiana; BIP), respectively. The Terrestrial Ecosystem Simulator (TESim) suite of models was used to examine carbon and water dynamics using parameters measured at the site, and the model was evaluated against long-term data collected at the site. Key Results Simulations indicated that a higher assimilation rate for the BIP vegetation than OAK led to the former's dominance during early successional stages with relatively low competition. In middle/late succession with intensive competition for below-ground resources, BIP, with its lower drought tolerance/resistance and smaller allocation to leaves/roots, gave way to OAK. At landscape scale, predictions with increased temperature extrapolated from existing weather records resulted in increased average net primary productivity (NPP; +19 %), heterotrophic respiration (+23 %) and net ecosystem carbon balance (+17 %). The BIP vegetation in higher and cooler habitats showed 14 % greater sensitivity to increased temperature than the OAK at lower and warmer locations. Conclusions Drought tolerance/resistance and morphology-related allocation strategy (i.e. more allocation to leaves/roots) played key roles in the competition between the vegetation types. The overall site-average impacts of increased temperature on NPP and carbon stored in plants were found to be positive, despite negative effects of increased respiration and soil water stress, with such impacts being more significant for BIP located in higher and cooler habitats. PMID:21835816
NASA Astrophysics Data System (ADS)
Thomas, R. Q.; Williams, M.
2014-12-01
Carbon (C) and nitrogen (N) cycles are coupled in terrestrial ecosystems through multiple processes including photosynthesis, tissue allocation, respiration, N fixation, N uptake, and decomposition of litter and soil organic matter. Capturing the constraint of N on terrestrial C uptake and storage has been a focus of the Earth System modelling community. Here we explore the trade-offs and sensitivities of allocating C and N to different tissues in order to optimize the productivity of plants using a new, simple model of ecosystem C-N cycling and interactions (ACONITE). ACONITE builds on theory related to plant economics in order to predict key ecosystem properties (leaf area index, leaf C:N, N fixation, and plant C use efficiency) based on the optimization of the marginal change in net C or N uptake associated with a change in allocation of C or N to plant tissues. We simulated and evaluated steady-state and transient ecosystem stocks and fluxes in three different forest ecosystem types (tropical evergreen, temperate deciduous, and temperate evergreen). Leaf C:N differed among the three ecosystem types (temperate deciduous < tropical evergreen < temperate evergreen), a result that compared well to observations from a global database describing plant traits. Gross primary productivity (GPP) and net primary productivity (NPP) estimates compared well to observed fluxes at the simulation sites. A sensitivity analysis revealed that parameterization of the relationship between leaf N and leaf respiration had the largest influence on leaf area index and leaf C:N. Also, a widely used linear leaf N-respiration relationship did not yield a realistic leaf C:N, while a more recently reported non-linear relationship simulated leaf C:N that compared better to the global trait database than the linear relationship. Overall, our ability to constrain leaf area index and allow spatially and temporally variable leaf C:N can help address challenges simulating these properties in ecosystem and Earth System models. Furthermore, the simple approach with emergent properties based on coupled C-N dynamics has potential for use in research that uses data-assimilation methods to integrate data on both the C and N cycles to improve C flux forecasts.
Providing a parallel and distributed capability for JMASS using SPEEDES
NASA Astrophysics Data System (ADS)
Valinski, Maria; Driscoll, Jonathan; McGraw, Robert M.; Meyer, Bob
2002-07-01
The Joint Modeling And Simulation System (JMASS) is a Tri-Service simulation environment that supports engineering and engagement-level simulations. As JMASS is expanded to support other Tri-Service domains, the current set of modeling services must be expanded for High Performance Computing (HPC) applications by adding support for advanced time-management algorithms, parallel and distributed topologies, and high speed communications. By providing support for these services, JMASS can better address modeling domains requiring computationally intense parallel calculations, such as clutter, vulnerability and lethality calculations, and underwater-based scenarios. A risk reduction effort implementing some HPC services for JMASS using the SPEEDES (Synchronous Parallel Environment for Emulation and Discrete Event Simulation) Simulation Framework has recently concluded. As an artifact of the JMASS-SPEEDES integration, not only can HPC functionality be brought to the JMASS program through SPEEDES, but an additional HLA-based capability can be demonstrated that further addresses interoperability issues. The JMASS-SPEEDES integration provided a means of adding HLA capability to preexisting JMASS scenarios through an implementation of the standard JMASS port communication mechanism that allows players to communicate.
CUBE: Information-optimized parallel cosmological N-body simulation code
NASA Astrophysics Data System (ADS)
Yu, Hao-Ran; Pen, Ue-Li; Wang, Xin
2018-05-01
CUBE, written in Coarray Fortran, is a particle-mesh based parallel cosmological N-body simulation code. The memory usage of CUBE can be as low as 6 bytes per particle. Particle-pairwise (PP) forces, cosmological neutrinos, and a spherical-overdensity (SO) halofinder are included.
Efficient parallel simulation of CO2 geologic sequestration insaline aquifers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Keni; Doughty, Christine; Wu, Yu-Shu
2007-01-01
An efficient parallel simulator for large-scale, long-term CO2 geologic sequestration in saline aquifers has been developed. The parallel simulator is a three-dimensional, fully implicit model that solves large, sparse linear systems arising from discretization of the partial differential equations for mass and energy balance in porous and fractured media. The simulator is based on the ECO2N module of the TOUGH2 code and inherits all the process capabilities of the single-CPU TOUGH2 code, including a comprehensive description of the thermodynamics and thermophysical properties of H2O-NaCl-CO2 mixtures, modeling single and/or two-phase isothermal or non-isothermal flow processes, two-phase mixtures, fluid phases appearing or disappearing, as well as salt precipitation or dissolution. The new parallel simulator uses MPI for parallel implementation, the METIS software package for simulation domain partitioning, and the iterative parallel linear solver package Aztec for solving linear equations by multiple processors. In addition, the parallel simulator has been implemented with an efficient communication scheme. Test examples show that a linear or super-linear speedup can be obtained on Linux clusters as well as on supercomputers. Because of the significant improvement in both simulation time and memory requirement, the new simulator provides a powerful tool for tackling larger scale and more complex problems than can be solved by single-CPU codes. A high-resolution simulation example is presented that models buoyant convection, induced by a small increase in brine density caused by dissolution of CO2.
Hardness of H13 Tool Steel After Non-isothermal Tempering
NASA Astrophysics Data System (ADS)
Nelson, E.; Kohli, A.; Poirier, D. R.
2018-04-01
A direct method to calculate the tempering response of a tool steel (H13) that exhibits secondary hardening is presented. Based on the traditional method of presenting tempering response in terms of isothermal tempering, we show that the tempering response for a steel undergoing a non-isothermal tempering schedule can be predicted. Experiments comprised (1) isothermal tempering, (2) non-isothermal tempering with relatively slow heating to the process temperature, and (3) fast-heating cycles that are relevant to tempering by induction heating. After establishing the tempering response of the steel under simple isothermal conditions, it can be applied to non-isothermal tempering by using a numerical method to calculate the tempering parameter. Calculated results are verified by the experiments.
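The abstract does not reproduce the paper's numerical method, but the general idea of reducing a non-isothermal schedule to a single tempering parameter can be sketched. Below, a Hollomon-Jaffe-style parameter P = T(C + log10 t) is accumulated over a (T, t) schedule by converting the previously accumulated exposure into an equivalent time at each new temperature; the constant C, the schedule, and the discretization are illustrative assumptions, not values from the paper.

```python
import math

def tempering_parameter(schedule, C=20.0):
    """Accumulate a Hollomon-Jaffe-style parameter P = T*(C + log10 t)
    over a non-isothermal schedule given as (T_kelvin, hours) segments.
    Prior exposure is converted to an equivalent time at each segment's T."""
    P = None
    for T, dt in schedule:
        if P is None:
            t_eq = dt
        else:
            t_eq = 10.0 ** (P / T - C) + dt  # equivalent prior time at T
        P = T * (C + math.log10(t_eq))
    return P

# Illustrative schedule: a slow ramp approximated by short holds, then a soak.
ramp = [(500.0 + 20 * k, 0.05) for k in range(5)]  # K, hours
schedule = ramp + [(866.0, 2.0)]                   # ~593 C soak for 2 h
print("equivalent tempering parameter:", round(tempering_parameter(schedule)))
```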
Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.
Bhandarkar, S M; Chirravuri, S; Arnold, J
1996-01-01
Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
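The best-performing MIMD scheme above, multiple independent annealing searches with the best final solution kept, can be sketched generically with Python's multiprocessing. The toy permutation objective below is a stand-in, not the clone-ordering cost function of the paper.

```python
import math, random
from multiprocessing import Pool

def cost(order):
    """Stand-in objective: count out-of-order adjacent pairs."""
    return sum(1 for a, b in zip(order, order[1:]) if a > b)

def anneal(seed, n=50, T0=5.0, cooling=0.995, steps=20000):
    """One independent simulated-annealing search with pairwise-swap moves."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    c, T = cost(order), T0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        order[i], order[j] = order[j], order[i]
        c_new = cost(order)
        if c_new <= c or rng.random() < math.exp((c - c_new) / T):
            c = c_new
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
        T *= cooling
    return c, order

if __name__ == "__main__":
    with Pool(8) as pool:                 # 8 independent searches in parallel
        results = pool.map(anneal, range(8))
    best_cost, best_order = min(results)  # keep the best final solution
    print("best cost:", best_cost)
```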
McGuire, A.D.; Melillo, J.M.; Randerson, J.T.; Parton, W.J.; Heimann, Martin; Meier, R.A.; Clein, Joy S.; Kicklighter, D.W.; Sauf, W.
2000-01-01
Simulations by global terrestrial biogeochemical models (TBMs) consistently underestimate the concentration of atmospheric carbon dioxide (CO2) at high latitude monitoring stations during the nongrowing season. We hypothesized that heterotrophic respiration is underestimated during the nongrowing season primarily because TBMs do not generally consider the insulative effects of snowpack on soil temperature. To evaluate this hypothesis, we compared the performance of baseline and modified versions of three TBMs in simulating the seasonal cycle of atmospheric CO2 at high latitude CO2 monitoring stations; the modified version maintained soil temperature at 0 °C when modeled snowpack was present. The three TBMs include the Carnegie-Ames-Stanford Approach (CASA), Century, and the Terrestrial Ecosystem Model (TEM). In comparison with the baseline simulation of each model, the snowpack simulations caused higher releases of CO2 between November and March and greater uptake of CO2 between June and August for latitudes north of 30°N. We coupled the monthly estimates of CO2 exchange, the seasonal carbon dioxide flux fields generated by the HAMOCC3 seasonal ocean carbon cycle model, and fossil fuel source fields derived from standard sources to the three-dimensional atmospheric transport model TM2, forced by observed winds, to simulate the seasonal cycle of atmospheric CO2 at each of seven high latitude monitoring stations. In comparison to the CO2 concentrations simulated with the baseline fluxes of each TBM, concentrations simulated using the snowpack fluxes are generally in better agreement with observed concentrations between August and March at each of the monitoring stations. Thus, representation of the insulative effects of snowpack in TBMs generally improves simulation of atmospheric CO2 concentrations in high latitudes during both the late growing season and the nongrowing season. These simulations highlight the global importance of biogeochemical processes during the nongrowing season in estimating the carbon balance of ecosystems in northern high and temperate latitudes.
Ma, Jun; Hu, Yuanman; Bu, Rencang; Chang, Yu; Deng, Huawei; Qin, Qin
2014-01-01
The aboveground carbon sequestration rate (ACSR) reflects the influence of climate change on forest dynamics. To reveal the long-term effects of climate change on forest succession and carbon sequestration, a forest landscape succession and disturbance model (LANDIS Pro7.0) was used to simulate the ACSR of a temperate forest at the community and species levels in northeastern China, based on both current and predicted climatic data. On the community level, the ACSR of mixed Korean pine hardwood forests and mixed larch hardwood forests fluctuated throughout the simulation, while large mid-simulation declines in ACSR emerged in the spruce-fir and aspen-white birch forests. On the species level, the ACSR of all conifers except Korean pine declined greatly around the 2070s. The ACSR of dominant hardwoods in the Lesser Khingan Mountains area, such as Manchurian ash, Amur cork, black elm, and ribbed birch, fluctuated over broad ranges. Pioneer species experienced a sharp decline around the 2080s and finally disappeared in the simulation. The differences in ACSR among the various climates were mainly identified in mixed Korean pine hardwood forests, in all conifers, and in a few hardwoods in the last quarter of the simulation. These results indicate that climate warming can influence the ACSR in the Lesser Khingan Mountains area, with the largest impact commonly emerging in the A2 scenario. The ACSR of coniferous species was more strongly affected by climate change than that of deciduous species. PMID:24763409
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zuo, Wangda; McNeil, Andrew; Wetter, Michael
2013-05-23
Building designers are increasingly relying on complex fenestration systems to reduce energy consumed for lighting and HVAC in low energy buildings. Radiance, a lighting simulation program, has been used to conduct daylighting simulations for complex fenestration systems. Depending on the configuration, a simulation can take hours or even days on a personal computer. This paper describes how to accelerate the matrix multiplication portion of a Radiance three-phase daylight simulation by conducting parallel computing on the heterogeneous hardware of a personal computer. The algorithm was optimized and the computational part was implemented in parallel using OpenCL. The speed of the new approach was evaluated using various daylighting simulation cases on a multicore central processing unit and a graphics processing unit. Based on the measurements and analysis of the time usage for the Radiance daylighting simulation, further speedups can be achieved by using fast I/O devices and storing the data in a binary format.
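The computation being accelerated is, at its core, a chain of matrix multiplications: in Radiance's three-phase method the sensor response is commonly written as i = V T D s (view, transmission, and daylight matrices applied to a sky vector). A NumPy illustration of that data flow, with hypothetical dimensions and random stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensor, n_fen_out, n_fen_in, n_sky = 10_000, 145, 145, 146  # assumed sizes

V = rng.random((n_sensor, n_fen_out))  # view matrix: fenestration -> sensors
T = rng.random((n_fen_out, n_fen_in))  # BSDF transmission matrix
D = rng.random((n_fen_in, n_sky))      # daylight matrix: sky -> fenestration
s = rng.random(n_sky)                  # sky vector for one timestep

# Precompute V @ T @ D once, then reuse it for every timestep's sky vector;
# this repeated dense product is the hot loop that benefits from offloading.
VTD = V @ T @ D
illuminance = VTD @ s
print(illuminance.shape)
```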
Using parallel computing for the display and simulation of the space debris environment
NASA Astrophysics Data System (ADS)
Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.
2011-07-01
Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
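Because every debris object can be propagated independently, the per-object arithmetic maps directly onto data-parallel hardware. The sketch below vectorizes a deliberately simplified analytical propagation (circular orbits only) over all objects with NumPy; a GPU version would perform the same arithmetic one object per thread. This is an illustration of the parallel structure, not the propagator of the paper.

```python
import numpy as np

MU = 398_600.4418  # km^3/s^2, Earth's gravitational parameter

def propagate_circular(a, raan, inc, phase0, t):
    """Positions of many objects on circular orbits at time t (seconds).
    a: semi-major axes (km); raan, inc, phase0: angles (rad); all arrays."""
    n = np.sqrt(MU / a**3)           # mean motion, per object
    u = phase0 + n * t               # argument of latitude
    x, y = a * np.cos(u), a * np.sin(u)   # in-plane position
    yi, zi = y * np.cos(inc), y * np.sin(inc)  # rotate by inclination
    xe = x * np.cos(raan) - yi * np.sin(raan)  # rotate by RAAN
    ye = x * np.sin(raan) + yi * np.cos(raan)
    return np.stack([xe, ye, zi], axis=1)

rng = np.random.default_rng(2)
N = 500_000                          # hypothetical number of debris objects
pos = propagate_circular(rng.uniform(6_771, 7_371, N),  # LEO shells, km
                         rng.uniform(0, 2 * np.pi, N),
                         rng.uniform(0, np.pi, N),
                         rng.uniform(0, 2 * np.pi, N), t=3600.0)
print(pos.shape)
```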
Xyce Parallel Electronic Simulator Users' Guide Version 6.7.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting
This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), including support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message-passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms, including serial, shared-memory, and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Developing parallel GeoFEST(P) using the PYRAMID AMR library
NASA Technical Reports Server (NTRS)
Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Tisdale, Robert E.
2004-01-01
The PYRAMID parallel unstructured adaptive mesh refinement (AMR) library has been coupled with the GeoFEST geophysical finite element simulation tool to support parallel active tectonics simulations. Specifically, we have demonstrated modeling of coseismic and postseismic surface displacement due to a simulated earthquake for the Landers system of interacting faults in Southern California. The new software demonstrated a 25-times resolution improvement and a 4-times reduction in time to solution over the sequential baseline milestone case. Simulations on workstations using a few tens of thousands of stress displacement finite elements can now be expanded to multiple millions of elements with greater than 98% scaled efficiency on various parallel platforms over many hundreds of processors. Our most recent work has demonstrated that we can dynamically adapt the computational grid as stress grows on a fault. In this paper, we describe the major issues and challenges associated with coupling these two programs to create GeoFEST(P). Performance and visualization results are also described.
Parallel Discrete Molecular Dynamics Simulation With Speculation and In-Order Commitment
Khan, Md. Ashfaquzzaman; Herbordt, Martin C.
2011-01-01
Discrete molecular dynamics simulation (DMD) uses simplified and discretized models enabling simulations to advance by event rather than by timestep. DMD is an instance of discrete event simulation and so is difficult to scale: even in this multi-core era, all reported DMD codes are serial. In this paper we discuss the inherent difficulties of scaling DMD and present our method of parallelizing DMD through event-based decomposition. Our method is microarchitecture inspired: speculative processing of events exposes parallelism, while in-order commitment ensures correctness. We analyze the potential of this parallelization method for shared-memory multiprocessors. Achieving scalability required extensive experimentation with scheduling and synchronization methods to mitigate serialization. The speed-up achieved for a variety of system sizes and complexities is nearly 6× on an 8-core and over 9× on a 12-core processor. We present and verify analytical models that account for the achieved performance as a function of available concurrency and architectural limitations. PMID:21822327
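The speculate-then-commit idea can be sketched in miniature: process a window of the earliest pending events in parallel, then commit them in timestamp order, serially re-processing any event that conflicts with an earlier one in the window. The conflict rule (shared particle IDs) and all names below are illustrative assumptions, not the paper's implementation.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def process(event, speculative):
    """Stand-in for computing an event's outcome (e.g., a collision update)."""
    t, eid, particles = event
    return eid, sum(particles)

def run_speculative(events, window=4):
    """Speculatively process the earliest `window` events in parallel, then
    commit them in timestamp order; an event that shares particles with an
    earlier event in the batch is re-processed serially before committing."""
    heapq.heapify(events)  # events are (time, id, particles) tuples
    with ThreadPoolExecutor() as pool:
        while events:
            batch = [heapq.heappop(events)
                     for _ in range(min(window, len(events)))]
            results = list(pool.map(lambda e: process(e, True), batch))
            touched = set()
            for ev, res in zip(batch, results):  # in-order commitment
                if touched & set(ev[2]):         # mis-speculation: redo
                    res = process(ev, False)
                touched.update(ev[2])
                print(f"commit t={ev[0]:.2f} result={res}")

run_speculative([(0.10, 0, (1, 2)), (0.20, 1, (3, 4)),
                 (0.25, 2, (2, 5)), (0.40, 3, (6, 7))])
```

In a real DMD code each commit would update particle states and schedule newly predicted events; the sketch only demonstrates the speculation window and the in-order commit rule.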
NASA Technical Reports Server (NTRS)
Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke
1989-01-01
Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated every time step. However, Do-all or Do-across techniques cannot be applied to parallel processing of the simulation, since there exist data dependencies from the end of one iteration to the beginning of the next, and furthermore data input and data output are required every sampling period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block consisting of arithmetic assignment statements, must be used. In the proposed method, near-fine-grain tasks, each of which consists of one or more floating point operations, are generated to extract parallelism from the calculation and are assigned to processors using optimal static scheduling at compile time, in order to reduce the large run-time overhead caused by the use of near-fine-grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR), which has been developed to exploit the advantages of static scheduling algorithms to the maximum extent.
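Static scheduling of near-fine-grain tasks is essentially list scheduling over a task graph: order tasks by a priority such as critical-path length, then place each on the processor that lets it start earliest. The following is a generic sketch of that scheme, not OSCAR's actual heuristic; costs and the task graph are hypothetical.

```python
def critical_path_lengths(tasks):
    """tasks: {name: (cost, successors)}; longest path to exit per task."""
    memo = {}
    def cp(t):
        if t not in memo:
            cost, succs = tasks[t]
            memo[t] = cost + max((cp(s) for s in succs), default=0)
        return memo[t]
    return {t: cp(t) for t in tasks}

def list_schedule(tasks, n_proc):
    """Greedy static list scheduling: highest critical-path priority first,
    each task placed on the processor giving the earliest start time."""
    cp = critical_path_lengths(tasks)
    preds = {t: [] for t in tasks}
    for t, (_, succs) in tasks.items():
        for s in succs:
            preds[s].append(t)
    proc_free = [0.0] * n_proc
    finish, assign = {}, {}
    # With positive costs, descending CP length is a valid topological order.
    for t in sorted(tasks, key=lambda t: -cp[t]):
        earliest = max((finish[p] for p in preds[t]), default=0.0)
        k = min(range(n_proc), key=lambda i: max(proc_free[i], earliest))
        start = max(proc_free[k], earliest)
        finish[t] = start + tasks[t][0]
        proc_free[k] = finish[t]
        assign[t] = k
    return assign, max(finish.values())

# Tiny DAG of floating-point operations: (cost, successors).
tasks = {"a": (1, ["c"]), "b": (2, ["c", "d"]),
         "c": (1, ["e"]), "d": (3, ["e"]), "e": (1, [])}
print(list_schedule(tasks, n_proc=2))
```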
A parallel implementation of an off-lattice individual-based model of multicellular populations
NASA Astrophysics Data System (ADS)
Harvey, Daniel G.; Fletcher, Alexander G.; Osborne, James M.; Pitt-Francis, Joe
2015-07-01
As computational models of multicellular populations include ever more detailed descriptions of biophysical and biochemical processes, the computational cost of simulating such models limits their ability to generate novel scientific hypotheses and testable predictions. While developments in microchip technology continue to increase the power of individual processors, parallel computing offers an immediate increase in available processing power. To make full use of parallel computing technology, it is necessary to develop specialised algorithms. To this end, we present a parallel algorithm for a class of off-lattice individual-based models of multicellular populations. The algorithm divides the spatial domain between computing processes and comprises communication routines that ensure the model is correctly simulated on multiple processors. The parallel algorithm is shown to accurately reproduce the results of a deterministic simulation performed using a pre-existing serial implementation. We test the scaling of computation time, memory use and load balancing as more processes are used to simulate a cell population of fixed size. We find approximate linear scaling of both speed-up and memory consumption on up to 32 processor cores. Dynamic load balancing is shown to provide speed-up for non-regular spatial distributions of cells in the case of a growing population.
NASA Astrophysics Data System (ADS)
Ferrando, N.; Gosálvez, M. A.; Cerdá, J.; Gadea, R.; Sato, K.
2011-03-01
Presently, dynamic surface-based models are required to contain increasingly large numbers of points and to propagate them over longer time periods. For large numbers of surface points, the octree data structure can be used as a balance between low memory occupation and relatively rapid access to the stored data. For evolution rules that depend on neighborhood states, extended simulation periods can be obtained by using simplified atomistic propagation models, such as the Cellular Automata (CA). This method, however, has an intrinsically parallel updating nature, and the corresponding simulations are highly inefficient when performed on classical Central Processing Units (CPUs), which are designed for the sequential execution of tasks. In this paper, a series of guidelines is presented for the efficient adaptation of octree-based CA simulations of complex, evolving surfaces to massively parallel computing hardware. A Graphics Processing Unit (GPU) is used as a cost-efficient example of the parallel architectures. For the actual simulations, we consider surface propagation during anisotropic wet chemical etching of silicon as a computationally challenging process with widespread use in microengineering applications. A continuous CA model that is intrinsically parallel in nature is used for the time evolution. Our study strongly indicates that parallel computations of dynamically evolving surfaces simulated using CA methods benefit significantly from the incorporation of octrees as support data structures, substantially decreasing the overall computational time and memory usage.
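An octree of the kind used here as a support structure can be sketched in a few lines: a leaf cell stores surface points and splits into eight children when it exceeds a fixed capacity. The capacity, coordinates, and point source below are illustrative assumptions, not the paper's data layout.

```python
import random

class Octree:
    """Minimal point octree: a leaf splits into eight children once it holds
    more than `capacity` points. Sketches the support data structure only,
    not the cellular-automata surface propagation itself."""
    def __init__(self, center, half, capacity=8):
        self.center, self.half, self.capacity = center, half, capacity
        self.points, self.children = [], None

    def _child_index(self, p):
        cx, cy, cz = self.center
        return (p[0] >= cx) + 2 * (p[1] >= cy) + 4 * (p[2] >= cz)

    def insert(self, p):
        if self.children is not None:
            self.children[self._child_index(p)].insert(p)
            return
        self.points.append(p)
        if len(self.points) > self.capacity:
            h = self.half / 2
            cx, cy, cz = self.center
            off = lambda bit: h if bit else -h
            self.children = [Octree((cx + off(i & 1), cy + off(i & 2),
                                     cz + off(i & 4)), h, self.capacity)
                             for i in range(8)]
            for q in self.points:  # push stored points down into children
                self.children[self._child_index(q)].insert(q)
            self.points = []

tree = Octree(center=(0.5, 0.5, 0.5), half=0.5)
for _ in range(10_000):  # stand-ins for surface points of the etch front
    tree.insert((random.random(), random.random(), random.random()))
```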
TOUGH2_MP: A parallel version of TOUGH2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Keni; Wu, Yu-Shu; Ding, Chris
2003-04-09
TOUGH2_MP is a massively parallel version of TOUGH2. It was developed for running on distributed-memory parallel computers to simulate large simulation problems that may not be solved by the standard, single-CPU TOUGH2 code. The new code implements an efficient massively parallel scheme, while preserving the full capacity and flexibility of the original TOUGH2 code. The new software uses the METIS software package for grid partitioning and the AZTEC software package for linear-equation solving. The standard message-passing interface is adopted for communication among processors. Numerical performance of the current version of the code has been tested on CRAY-T3E and IBM RS/6000 SP platforms. In addition, the parallel code has been successfully applied to real field problems of multi-million-cell simulations for three-dimensional multiphase and multicomponent fluid and heat flow, as well as solute transport. In this paper, we review the development of TOUGH2_MP and discuss its basic features, modules, and their applications.
Parallel VLSI architecture emulation and the organization of APSA/MPP
NASA Technical Reports Server (NTRS)
Odonnell, John T.
1987-01-01
The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a Vax. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms.
A real-time, dual processor simulation of the rotor system research aircraft
NASA Technical Reports Server (NTRS)
Mackie, D. B.; Alderete, T. S.
1977-01-01
A real-time, man-in-the-loop simulation of the Rotor System Research Aircraft (RSRA) was conducted. The unique feature of this simulation was that two digital computers were used in parallel to solve the equations of the RSRA mathematical model. The design, development, and implementation of the simulation are documented. Program validation is discussed, and examples of data recordings are given. This simulation provided an important research tool for the RSRA project in terms of safe and cost-effective design analysis. In addition, valuable knowledge concerning parallel processing and a powerful simulation hardware and software system was gained.
Capabilities of Fully Parallelized MHD Stability Code MARS
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2016-10-01
Results of full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. Parallel version of MARS, named PMARS, has been recently developed at FAR-TECH. Parallelized MARS is an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, implemented in MARS. Parallelization of the code included parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse vector iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the MARS algorithm using parallel libraries and procedures. Parallelized MARS is capable of calculating eigenmodes with significantly increased spatial resolution: up to 5,000 adapted radial grid points with up to 500 poloidal harmonics. Such resolution is sufficient for simulation of kink, tearing and peeling-ballooning instabilities with physically relevant parameters. Work is supported by the U.S. DOE SBIR program.
Fully Parallel MHD Stability Analysis Tool
NASA Astrophysics Data System (ADS)
Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang
2015-11-01
Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Results of MARS parallelization and of the development of a new fixed-boundary equilibrium code adapted for MARS input will be reported. Work is supported by the U.S. DOE SBIR program.
NASA Astrophysics Data System (ADS)
Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian
2018-01-01
We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
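The underlying workload is embarrassingly parallel: each photon performs an independent weighted random walk. The toy sketch below captures that structure for an infinite homogeneous medium (exponential step lengths, isotropic scattering, absorption handled by weight decay); it is a structural illustration, not the validated MC kernel of the paper. Each photon would map to one work-item in an OpenCL implementation; the coefficients here are assumed values.

```python
import numpy as np

def mc_photons(n_photons, mu_a=0.01, mu_s=10.0, seed=0):
    """Toy photon transport in an infinite homogeneous medium: exponential
    step lengths, isotropic scattering, absorption as weight decay.
    Returns the weighted mean displacement after 100 scattering events."""
    rng = np.random.default_rng(seed)
    mu_t = mu_a + mu_s
    pos = np.zeros((n_photons, 3))
    weight = np.ones(n_photons)
    for _ in range(100):                        # scattering events
        step = -np.log(rng.random(n_photons)) / mu_t
        costh = 2 * rng.random(n_photons) - 1   # isotropic new direction
        phi = 2 * np.pi * rng.random(n_photons)
        sinth = np.sqrt(1 - costh**2)
        d = np.stack([sinth * np.cos(phi), sinth * np.sin(phi), costh], axis=1)
        pos += d * step[:, None]
        weight *= mu_s / mu_t                   # surviving (non-absorbed) fraction
    return np.mean(np.linalg.norm(pos, axis=1) * weight)

print(mc_photons(100_000))
```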
Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...
2016-09-18
This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
Parallel ALLSPD-3D: Speeding Up Combustor Analysis Via Parallel Processing
NASA Technical Reports Server (NTRS)
Fricker, David M.
1997-01-01
The ALLSPD-3D Computational Fluid Dynamics code for reacting flow simulation was run on a set of benchmark test cases to determine its parallel efficiency. These test cases included non-reacting and reacting flow simulations with varying numbers of processors. Also, the tests explored the effects of scaling the simulation with the number of processors in addition to distributing a constant size problem over an increasing number of processors. The test cases were run on a cluster of IBM RS/6000 Model 590 workstations with ethernet and ATM networking plus a shared memory SGI Power Challenge L workstation. The results indicate that the network capabilities significantly influence the parallel efficiency, i.e., a shared memory machine is fastest and ATM networking provides acceptable performance. The limitations of ethernet greatly hamper the rapid calculation of flows using ALLSPD-3D.
Applying Parallel Processing Techniques to Tether Dynamics Simulation
NASA Technical Reports Server (NTRS)
Wells, B. Earl
1996-01-01
The focus of this research has been to determine the effectiveness of applying parallel processing techniques to a sizable real-world problem, the simulation of the dynamics associated with a tether which connects two objects in low earth orbit, and to explore the degree to which the parallelization process can be automated through the creation of new software tools. The goal has been to utilize this specific application problem as a base to develop more generally applicable techniques.
Parallel discrete event simulation using shared memory
NASA Technical Reports Server (NTRS)
Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.
1988-01-01
With traditional event-list techniques, evaluating a detailed discrete-event simulation model can often require hours or even days of computation time. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared-memory experiments using the Chandy-Misra distributed-simulation algorithm to simulate networks of queues is presented. Parameters of the study include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
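The conservative rule at the heart of the Chandy-Misra algorithm can be illustrated briefly: a logical process may only consume events once its input-channel clocks (often advanced by null messages) guarantee that no earlier event can still arrive. A toy sketch (all data structures invented for illustration):

```python
import heapq

def safe_events(pending, channel_clocks):
    """Yield events that are provably safe to process now.

    pending: heap of (timestamp, event) tuples for one logical process.
    channel_clocks: lower bounds on future timestamps per input channel.
    """
    horizon = min(channel_clocks.values())   # nothing earlier can arrive
    safe = []
    while pending and pending[0][0] <= horizon:
        safe.append(heapq.heappop(pending))
    return safe

pending = [(1.0, "arrival"), (2.5, "arrival"), (4.0, "arrival")]
heapq.heapify(pending)
print(safe_events(pending, {"upstream_a": 3.0, "upstream_b": 2.7}))
# -> events at t=1.0 and t=2.5 are safe; t=4.0 must wait for the channels.
```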
User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Earth Sciences Division; Zhang, Keni; Zhang, Keni
TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one-, two-, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on TOUGH2 Version 1.4 with the EOS3, EOS9, and T2R3D modules, software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0, together with additional capabilities for handling fractured media from V1.4. This report provides a quick-start guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the standard version of the TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of the parallel methodology, code structure, and the mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad
1995-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can be parallelized effectively on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs relative to changes in the number of grid points. By increasing the number of processors, slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. This slower-than-linear speedup occurs because the computational cost is dominated by the FFT routines, which yield less-than-ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
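The remapping idea described here, redistributing the data mid-calculation so each FFT pass operates on locally contiguous data, can be shown in serial miniature (a toy illustration, not the PSDNS code; a distributed version would perform the transpose with an all-to-all exchange):

```python
import numpy as np

def fft2_with_remap(field):
    """2-D FFT done as two 1-D passes with a 'remap' (transpose) in between.

    In a slab-decomposed parallel code, each rank owns full rows first and
    FFTs them locally; the transpose (an all-to-all in MPI) then gives each
    rank full columns for the second pass.
    """
    after_x = np.fft.fft(field, axis=1)     # pass 1: rows are local
    remapped = after_x.T.copy()             # the data remap / transpose
    after_y = np.fft.fft(remapped, axis=1)  # pass 2: columns now local
    return after_y.T

rng = np.random.default_rng(1)
f = rng.standard_normal((8, 8))
assert np.allclose(fft2_with_remap(f), np.fft.fft2(f))
```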
Parallel-plate transmission line type of EMP simulators: Systematic review and recommendations
NASA Astrophysics Data System (ADS)
Giri, D. V.; Liu, T. K.; Tesche, F. M.; King, R. W. P.
1980-05-01
This report presents various aspects of the two-parallel-plate transmission line type of EMP simulator. Much of the work is the result of research efforts conducted during the last two decades at the Air Force Weapons Laboratory, as well as in industry and universities. The principal features of individual simulator components are discussed. The report also emphasizes that it is imperative to integrate our understanding of the individual components so that we can draw meaningful conclusions about the performance of the simulator as a whole.
MMS Observations and Hybrid Simulations of Surface Ripples at a Marginally Quasi-Parallel Shock
NASA Astrophysics Data System (ADS)
Gingell, Imogen; Schwartz, Steven J.; Burgess, David; Johlander, Andreas; Russell, Christopher T.; Burch, James L.; Ergun, Robert E.; Fuselier, Stephen; Gershman, Daniel J.; Giles, Barbara L.; Goodrich, Katherine A.; Khotyaintsev, Yuri V.; Lavraud, Benoit; Lindqvist, Per-Arne; Strangeway, Robert J.; Trattner, Karlheinz; Torbert, Roy B.; Wei, Hanying; Wilder, Frederick
2017-11-01
Simulations and observations of collisionless shocks have shown that deviations from the nominal local shock normal orientation, that is, surface waves or ripples, are expected to propagate in the ramp and overshoot of quasi-perpendicular shocks. Here we identify signatures of a surface ripple propagating during a crossing of Earth's marginally quasi-parallel (θBn ≈ 45°) or quasi-parallel bow shock on 27 November 2015 06:01:44 UTC by the Magnetospheric Multiscale (MMS) mission, and we determine the ripple's properties using multispacecraft methods. Using two-dimensional hybrid simulations, we confirm that surface ripples are a feature of marginally quasi-parallel and quasi-parallel shocks under the observed solar wind conditions. In addition, since these marginally quasi-parallel and quasi-parallel shocks are expected to undergo a cyclic reformation of the shock front, we discuss the impact of multiple sources of nonstationarity on shock structure. Importantly, ripples are shown to be transient phenomena, developing faster than an ion gyroperiod and only during the period of the reformation cycle when a newly developed shock ramp is unaffected by turbulence in the foot. We conclude that the change in the properties of the ripple observed by MMS is consistent with the reformation of the shock front over a time scale of an ion gyroperiod.
NASA Astrophysics Data System (ADS)
Sabzikar, Farzad; Meerschaert, Mark M.; Chen, Jinghua
2015-07-01
Fractional derivatives and integrals are convolutions with a power law. Multiplying by an exponential factor leads to tempered fractional derivatives and integrals. Tempered fractional diffusion equations, where the usual second derivative in space is replaced by a tempered fractional derivative, govern the limits of random walk models with an exponentially tempered power law jump distribution. The limiting tempered stable probability densities exhibit semi-heavy tails, which are commonly observed in finance. Tempered power law waiting times lead to tempered fractional time derivatives, which have proven useful in geophysics. The tempered fractional derivative or integral of a Brownian motion, called a tempered fractional Brownian motion, can exhibit semi-long range dependence. The increments of this process, called tempered fractional Gaussian noise, provide a useful new stochastic model for wind speed data. A tempered fractional difference forms the basis for numerical methods to solve tempered fractional diffusion equations, and it also provides a useful new correlation model in time series.
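The tempered fractional difference mentioned here as the basis for numerical methods is commonly built from exponentially tempered Grunwald-Letnikov weights. A small sketch under that standard construction (a generic illustration, not code from the paper; step size h, tempering parameter lam, order alpha):

```python
import numpy as np

def tempered_gl_weights(alpha, lam, h, n_terms):
    """Exponentially tempered Grunwald-Letnikov weights
    w_j = (-1)^j C(alpha, j) exp(-j*lam*h), via the usual recurrence."""
    w = np.empty(n_terms)
    w[0] = 1.0
    for j in range(1, n_terms):
        w[j] = w[j - 1] * (j - 1 - alpha) / j      # (-1)^j C(alpha, j)
    return w * np.exp(-lam * h * np.arange(n_terms))

def tempered_difference(f, alpha, lam, h):
    """Apply the tempered fractional difference at the last grid point."""
    w = tempered_gl_weights(alpha, lam, h, len(f))
    return np.dot(w, f[::-1])                      # sum_j w_j f(x - j h)

x = np.arange(0.0, 2.0, 0.01)
print(tempered_difference(np.exp(-x), alpha=0.8, lam=1.5, h=0.01))
```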
Carbon sequestration in managed temperate coniferous forests under climate change
NASA Astrophysics Data System (ADS)
Dymond, Caren C.; Beukema, Sarah; Nitschke, Craig R.; Coates, K. David; Scheller, Robert M.
2016-03-01
Management of temperate forests has the potential to increase carbon sinks and mitigate climate change. However, those opportunities may be confounded by negative climate change impacts. We therefore need a better understanding of climate change alterations to temperate forest carbon dynamics before developing mitigation strategies. The purpose of this project was to investigate the interactions of species composition, fire, management, and climate change in the Copper-Pine Creek valley, a temperate coniferous forest with a wide range of growing conditions. To do so, we used the LANDIS-II modelling framework, including the new Forest Carbon Succession extension, to simulate forest ecosystems under four different productivity scenarios, with and without climate change effects, until 2050. Significantly, the new extension allowed us to calculate net sector productivity, a carbon accounting metric that integrates aboveground and belowground carbon dynamics, disturbances, and the eventual fate of forest products. The model output was validated against literature values. The results implied that the species' optimum growing conditions relative to current and future conditions strongly influenced future carbon dynamics. Warmer growing conditions led to increased carbon sinks and storage in the colder and wetter ecoregions, but not necessarily in the others. Climate change impacts varied among species and site conditions, indicating that both of these components need to be taken into account when considering climate change mitigation activities and adaptive management. The introduction of a new carbon indicator, net sector productivity, promises to be useful in assessing management effectiveness and mitigation activities.
NASA Astrophysics Data System (ADS)
Bao, Jian; Lau, Calvin; Kuley, Animesh; Lin, Zhihong; Fulton, Daniel; Tajima, Toshiki; Tri Alpha Energy, Inc. Team
2017-10-01
Collisional and turbulent transport in a field-reversed configuration (FRC) is studied in global particle simulations using GTC (the gyrokinetic toroidal code). The global FRC geometry is incorporated in GTC by using a field-aligned mesh in cylindrical coordinates, which enables global simulation coupling the core and the scrape-off layer (SOL) across the separatrix. Furthermore, fully kinetic ions are implemented in GTC to treat the magnetic-null point in the FRC core. Both global simulations coupling the core and SOL regions and independent SOL-region simulations have been carried out to study turbulence. In this work, the "logical sheath boundary condition" is implemented to study parallel transport in the SOL. This method helps to relax the time and spatial steps without resolving the electron plasma frequency and Debye length, which enables turbulent transport simulation with sheath effects. We will study collisional and turbulent SOL parallel transport with mirror geometry and sheath boundary conditions in the C2-W divertor.
Simulated parallel annealing within a neighborhood for optimization of biomechanical systems.
Higginson, J S; Neptune, R R; Anderson, F C
2005-09-01
Optimization problems for biomechanical systems have become extremely complex. Simulated annealing (SA) algorithms have performed well in a variety of test problems and biomechanical applications; however, despite advances in computer speed, convergence to optimal solutions for systems of even moderate complexity has remained prohibitive. The objective of this study was to develop a portable parallel version of a SA algorithm for solving optimization problems in biomechanics. The algorithm for simulated parallel annealing within a neighborhood (SPAN) was designed to minimize interprocessor communication time and closely retain the heuristics of the serial SA algorithm. The computational speed of the SPAN algorithm scaled linearly with the number of processors on different computer platforms for a simple quadratic test problem and for a more complex forward dynamic simulation of human pedaling.
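To make the general idea concrete (candidate states evaluated concurrently within a neighborhood while keeping the serial Metropolis acceptance), here is a hedged cartoon of one parallel annealing step. This is generic Python, not the SPAN implementation; the objective function is an invented stand-in for an expensive biomechanical simulation:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def cost(x):
    # Toy objective standing in for an expensive forward simulation.
    return float(np.sum((x - 1.7) ** 2))

def anneal_step(state, temp, n_workers=4, rng=np.random.default_rng(0)):
    """One SA step: evaluate a neighborhood of perturbed states in parallel,
    then apply the usual serial Metropolis acceptance to the best candidate."""
    candidates = [state + rng.normal(scale=temp, size=state.shape)
                  for _ in range(n_workers)]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        costs = list(pool.map(cost, candidates))   # parallel evaluations
    best = int(np.argmin(costs))
    if costs[best] < cost(state) or rng.random() < np.exp(
            (cost(state) - costs[best]) / temp):
        return candidates[best], costs[best]
    return state, cost(state)

if __name__ == "__main__":
    x = np.zeros(3)
    for t in np.geomspace(1.0, 1e-3, 30):          # simple cooling schedule
        x, c = anneal_step(x, t)
    print(x, c)
```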
Simulation of Hypervelocity Impact on Aluminum-Nextel-Kevlar Orbital Debris Shields
NASA Technical Reports Server (NTRS)
Fahrenthold, Eric P.
2000-01-01
An improved hybrid particle-finite element method has been developed for hypervelocity impact simulation. The method combines the general contact-impact capabilities of particle codes with the true Lagrangian kinematics of large-strain finite element formulations. Unlike some alternative schemes that couple Lagrangian finite element models with smooth particle hydrodynamics, the present formulation makes no use of slidelines or penalty forces. The method has been implemented in a parallel, three-dimensional computer code. Simulations of three-dimensional orbital debris impact problems using this parallel hybrid particle-finite element code show good agreement with experiment and good speedup in parallel computation. The simulations included single- and multi-plate shields, as well as aluminum and composite shielding materials, at an impact velocity of eleven kilometers per second.
Parallel Grand Canonical Monte Carlo (ParaGrandMC) Simulation Code
NASA Technical Reports Server (NTRS)
Yamakov, Vesselin I.
2016-01-01
This report provides an overview of the Parallel Grand Canonical Monte Carlo (ParaGrandMC) simulation code. This is a highly scalable parallel FORTRAN code for simulating the thermodynamic evolution of metal alloy systems at the atomic level, and predicting the thermodynamic state, phase diagram, chemical composition and mechanical properties. The code is designed to simulate multi-component alloy systems, predict solid-state phase transformations such as austenite-martensite transformations, precipitate formation, recrystallization, capillary effects at interfaces, surface absorption, etc., which can aid the design of novel metallic alloys. While the software is mainly tailored for modeling metal alloys, it can also be used for other types of solid-state systems, and to some degree for liquid or gaseous systems, including multiphase systems forming solid-liquid-gas interfaces.
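The elementary move in any grand canonical Monte Carlo code is the insertion/deletion trial; the textbook acceptance rules are sketched below (generic GCMC formulas, not ParaGrandMC internals; mu is the chemical potential and Lambda the thermal de Broglie wavelength, with invented reduced-unit numbers):

```python
import numpy as np

def accept_insertion(beta, mu, volume, n_atoms, delta_e, thermal_wl):
    """Metropolis acceptance probability for inserting one particle in the
    grand canonical ensemble: min(1, V/(Lambda^3 (N+1)) e^{beta(mu - dE)})."""
    arg = volume / (thermal_wl ** 3 * (n_atoms + 1)) * np.exp(beta * (mu - delta_e))
    return min(1.0, arg)

def accept_deletion(beta, mu, volume, n_atoms, delta_e, thermal_wl):
    """Acceptance for deleting one particle:
    min(1, Lambda^3 N / V * e^{-beta(mu + dE)})."""
    arg = thermal_wl ** 3 * n_atoms / volume * np.exp(-beta * (mu + delta_e))
    return min(1.0, arg)

# Hypothetical reduced-unit numbers, just to exercise the formulas.
print(accept_insertion(beta=1.0, mu=-3.0, volume=1000.0,
                       n_atoms=100, delta_e=-1.2, thermal_wl=1.0))
```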
Simulation of ICESat-2 canopy height retrievals for different ecosystems
NASA Astrophysics Data System (ADS)
Neuenschwander, A. L.
2016-12-01
Slated for launch in late 2017 (or early 2018), the ICESat-2 satellite will provide a global distribution of geodetic measurements, from a space-based laser altimeter, of both the terrain surface and relative canopy heights, providing significant benefit to society through applications ranging from improved global digital terrain models to distributions of above-ground vegetation structure. The ATLAS instrument designed for ICESat-2 will utilize a different technology than is found on most laser mapping systems. The photon-counting technology of the ATLAS instrument onboard ICESat-2 will record the arrival time associated with a single photon detection. That detection can occur anywhere within the vertical distribution of the reflected signal, that is, anywhere within the vertical distribution of the canopy. This uncertainty about where within the vegetation layer the photon will be returned is referred to as the vertical sampling error. Preliminary simulation studies to estimate vertical sampling error have been conducted for several ecosystems, including woodland savanna, montane conifers, temperate hardwoods, tropical forest, and boreal forest. The results from these simulations indicate that the canopy heights reported on the ATL08 data product will underestimate the top canopy height by 1-4 m. Although simulation results indicate that ICESat-2 will underestimate top canopy height, there is a strong correlation between ICESat-2 heights and relative canopy height metrics (e.g., RH75, RH90). In tropical forest, simulation results indicate that the ICESat-2 height correlates strongly with RH90. Similarly, in temperate broadleaf forest, the simulated ICESat-2 heights were also strongly correlated with RH90. In boreal forest, the simulated ICESat-2 heights are strongly correlated with RH75 heights. It is hypothesized that the correlations between simulated ICESat-2 heights and canopy height metrics are a function of both canopy cover and vegetation physiology (e.g., leaf size/shape), which contribute to the horizontal and vertical structure of the vegetation.
Evaluating long-term cumulative hydrologic effects of forest management: a conceptual approach
Robert R. Ziemer
1992-01-01
It is impractical to address experimentally many aspects of cumulative hydrologic effects, since to do so would require studying large watersheds for a century or more. Monte Carlo simulations were conducted using three hypothetical 10,000-ha fifth-order forested watersheds. Most of the physical processes expressed by the model are transferable from temperate to...
An Empirical Development of Parallelization Guidelines for Time-Driven Simulation
1989-12-01
...program parallelization. In this research effort a Ballistic Missile Defense (BMD) time driven simulation program, developed by DESE Research and... continuously, or continuously with discrete changes superimposed. The distinguishing feature of these simulations is the interaction between discretely...
Xyce Parallel Electronic Simulator : reference guide, version 2.0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.
Predicting Protein Structure Using Parallel Genetic Algorithms.
1994-12-01
Molecular dynamics attempts to simulate the protein folding process. However, the time steps required for this simulation are on the order of one... harmonics. These two factors have limited molecular dynamics simulations to less than a few nanoseconds (10^-9 sec), even on today's fastest supercomputers...
Xyce™ Parallel Electronic Simulator Reference Guide Version 6.8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.
VizieR Online Data Catalog: Solar wind 3D magnetohydrodynamic simulation (Chhiber+, 2017)
NASA Astrophysics Data System (ADS)
Chhiber, R.; Subedi, P.; Usmanov, A. V.; Matthaeus, W. H.; Ruffolo, D.; Goldstein, M. L.; Parashar, T. N.
2017-08-01
We use a three-dimensional magnetohydrodynamic simulation of the solar wind to calculate cosmic-ray diffusion coefficients throughout the inner heliosphere (2 R⊙ to 3 au). The simulation resolves large-scale solar wind flow, which is coupled to small-scale fluctuations through a turbulence model. Simulation results specify background solar wind fields and turbulence parameters, which are used to compute diffusion coefficients and study their behavior in the inner heliosphere. The parallel mean free path (mfp) is evaluated using quasi-linear theory, while the perpendicular mfp is determined from nonlinear guiding center theory with the random ballistic interpretation. Several runs examine varying turbulent energy and different solar source dipole tilts. We find that for most of the inner heliosphere, the radial mfp is dominated by diffusion parallel to the mean magnetic field; the parallel mfp remains at least an order of magnitude larger than the perpendicular mfp, except in the heliospheric current sheet, where the perpendicular mfp may be a few times larger than the parallel mfp. In the ecliptic region, the perpendicular mfp may influence the radial mfp at heliocentric distances larger than 1.5 au; our estimates of the parallel mfp in the ecliptic region at 1 au agree well with the Palmer "consensus" range of 0.08-0.3 au. Solar activity increases perpendicular diffusion and reduces parallel diffusion. The parallel mfp mostly varies with rigidity (P) as P^0.33, and the perpendicular mfp is weakly dependent on P. The mfps are weakly influenced by the choice of long-wavelength power spectra. (2 data files).
Smoldyn on graphics processing units: massively parallel Brownian dynamics simulations.
Dematté, Lorenzo
2012-01-01
Space is a very important aspect of the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transport phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation can be time consuming, especially if we want to capture the system's behavior in a reliable way using stochastic methods in conjunction with a high spatial resolution. In order to deliver on the promise made by systems biology to understand a system as a whole, we need to scale up the size of the models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single-molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computationally demanding steps (computation of diffusion, unimolecular, and bimolecular reactions, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel for each molecule of the system. The implementation offers good speed-ups and real-time, high-quality graphics output.
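The per-molecule diffusion step that the paper offloads to the GPU is naturally data-parallel. A hedged vectorized sketch in NumPy follows (the GPU version would perform the same update with one thread per molecule; all parameters are invented):

```python
import numpy as np

def brownian_step(positions, diff_coeff, dt, rng):
    """Advance every molecule by one Brownian-dynamics step in parallel.

    positions: (N, 3) array of molecule coordinates.
    Displacements are Gaussian with std sqrt(2*D*dt) per axis, the
    standard Brownian update used in particle-based simulators.
    """
    sigma = np.sqrt(2.0 * diff_coeff * dt)
    return positions + rng.normal(scale=sigma, size=positions.shape)

rng = np.random.default_rng(42)
pos = np.zeros((1_000_000, 3))            # one million molecules at once
for _ in range(10):
    pos = brownian_step(pos, diff_coeff=1e-2, dt=1e-3, rng=rng)
print(pos.mean(axis=0), pos.var(axis=0))  # variance ~ 2*D*t per axis
```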
NASA Astrophysics Data System (ADS)
Zhang, Q.; Drake, J. F.; Swisdak, M.
2017-12-01
How ions and electrons are energized in magnetic reconnection outflows is an essential topic throughout the heliosphere. Here we carry out guide-field PIC Riemann simulations to explore the ion and electron energization mechanisms far downstream of the x-line. Riemann simulations, with their simple magnetic geometry, facilitate the study of the reconnection outflow far downstream of the x-line in much more detail than is possible with conventional reconnection simulations. We find that the ions are accelerated at rotational discontinuities, counterstream, and give rise to two slow shocks. We demonstrate that the energization mechanism at the slow shocks is essentially the same as that of parallel electrostatic shocks. Also, the electron-confining electric potential at the slow shocks is driven by the counterstreaming beams, which tend to break quasi-neutrality. Based on this picture, we build a kinetic model to self-consistently predict the downstream ion and electron temperatures. Additional explorations using parallel shock simulations also imply that in a very low beta regime (0.001-0.01 for a modest guide field), electron energization will be insignificant compared to ion energization. Our model and the parallel shock simulations might be used as simple tools to understand and estimate the energization of ions and electrons, and the energy partition, far downstream of the x-line.
Parallel Tensor Compression for Large-Scale Scientific Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan
As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed-memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
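For orientation, a Tucker decomposition can be computed in its simplest serial form by a truncated higher-order SVD: unfold the tensor along each mode, keep the leading left singular vectors as factor matrices, and contract them against the tensor to form the core. A minimal NumPy sketch, not the distributed implementation described above:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: the given mode becomes the rows, all others the columns."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor, ranks):
    """Truncated higher-order SVD giving a Tucker approximation."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = tensor
    for mode, u in enumerate(factors):
        # Contract factor u against the current mode of the core tensor.
        core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

rng = np.random.default_rng(0)
t = rng.standard_normal((20, 20, 20))
core, factors = hosvd(t, ranks=(5, 5, 5))
print(core.shape, [u.shape for u in factors])   # (5,5,5) core, 20x5 factors
```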
Synchronous parallel system for emulation and discrete event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeffrey S. (Inventor)
1992-01-01
A synchronous parallel system for emulation and discrete event simulation having parallel nodes responds to received messages at each node by generating event objects having individual time stamps, stores only the changes to state variables of the simulation object attributable to the event object, and produces corresponding messages. The system refrains from transmitting the messages and changing the state variables while it determines whether the changes are superseded, and then stores the unchanged state variables in the event object for later restoral to the simulation object if called for. This determination preferably includes sensing the time stamp of each new event object and determining which new event object has the earliest time stamp as the local event horizon, determining the earliest local event horizon of the nodes as the global event horizon, and ignoring the events whose time stamps are less than the global event horizon. Host processing between the system and external terminals enables such a terminal to query, monitor, command or participate with a simulation object during the simulation process.
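The event-horizon bookkeeping described in this patent can be paraphrased in a few lines: each node's local event horizon is the earliest time stamp among its newly generated event objects, and the global event horizon is the minimum across nodes. An illustrative sketch (invented data structures, not the patented implementation):

```python
def local_event_horizon(new_events):
    """new_events: list of (time_stamp, payload) generated this cycle."""
    return min(ts for ts, _ in new_events)

def global_event_horizon(nodes_events):
    """The earliest local event horizon over all nodes with new events."""
    return min(local_event_horizon(ev) for ev in nodes_events if ev)

nodes = [
    [(12.5, "msgA"), (9.1, "msgB")],   # node 0
    [(10.4, "msgC")],                  # node 1
    [(15.0, "msgD"), (11.2, "msgE")],  # node 2
]
print(global_event_horizon(nodes))     # 9.1: the safe synchronization point
```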
GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.
Hess, Berk; Kutzner, Carsten; van der Spoel, David; Lindahl, Erik
2008-03-01
Molecular simulation is an extremely useful, but computationally very expensive tool for studies of chemical and biomolecular systems. Here, we present a new implementation of our molecular simulation toolkit GROMACS which now both achieves extremely high performance on single processors from algorithmic optimizations and hand-coded routines and simultaneously scales very well on parallel machines. The code encompasses a minimal-communication domain decomposition algorithm, full dynamic load balancing, a state-of-the-art parallel constraint solver, and efficient virtual site algorithms that allow removal of hydrogen atom degrees of freedom to enable integration time steps up to 5 fs for atomistic simulations also in parallel. To improve the scaling properties of the common particle mesh Ewald electrostatics algorithms, we have in addition used a Multiple-Program, Multiple-Data approach, with separate node domains responsible for direct and reciprocal space interactions. Not only does this combination of algorithms enable extremely long simulations of large systems but also it provides that simulation performance on quite modest numbers of standard cluster nodes.
Parallelization of Program to Optimize Simulated Trajectories (POST3D)
NASA Technical Reports Server (NTRS)
Hammond, Dana P.; Korte, John J. (Technical Monitor)
2001-01-01
This paper describes the parallelization of the Program to Optimize Simulated Trajectories (POST3D). POST3D uses a gradient-based optimization algorithm that reaches an optimum design point by moving from one design point to the next. The gradient calculations required to complete the optimization process dominate the computational time and have been parallelized using a Single Program Multiple Data (SPMD) approach on a distributed-memory NUMA (non-uniform memory access) architecture. The Origin2000 was used for the tests presented.
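Finite-difference gradient evaluation parallelizes naturally because each design-variable perturbation is independent of the others. A hedged SPMD-flavored sketch (generic finite differences with an invented objective, not the POST3D source):

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def objective(x):
    # Stand-in for one expensive trajectory simulation.
    return float(np.sum(np.sin(x) + 0.5 * x ** 2))

def _component(args):
    x, i, h = args
    xp = x.copy()
    xp[i] += h
    return (objective(xp) - objective(x)) / h   # forward difference

def parallel_gradient(x, h=1e-6):
    """Each worker perturbs one design variable; all run concurrently."""
    tasks = [(x, i, h) for i in range(len(x))]
    with ProcessPoolExecutor() as pool:
        return np.array(list(pool.map(_component, tasks)))

if __name__ == "__main__":
    x0 = np.array([0.1, 0.5, 1.0, 2.0])
    print(parallel_gradient(x0))        # ~ cos(x) + x, componentwise
```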
Design of a real-time wind turbine simulator using a custom parallel architecture
NASA Technical Reports Server (NTRS)
Hoffman, John A.; Gluck, R.; Sridhar, S.
1995-01-01
The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is very much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CUs) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CUs are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU allows several tasks to be done in each cycle, including an IO operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CUs are interfaced to each other and to other portions of the simulator using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/output requirements. CUs can be added in any number to share a given computational load. This flexible bus feature is very different from that of many other parallel processors, which usually have a throughput limit because of rigid bus architecture.
Accelerating Wright–Fisher Forward Simulations on the Graphics Processing Unit
Lawrie, David S.
2017-01-01
Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/. PMID:28768689
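The single-locus forward step that makes this algorithm "embarrassingly parallel" is an independent binomial draw per locus. A vectorized NumPy sketch (the GPU code performs the same update with one thread per locus; all parameters here are invented):

```python
import numpy as np

def wright_fisher_step(freqs, pop_size, sel_coeff, rng):
    """One generation for many independent loci at once.

    freqs: allele frequencies, one per simulated locus.
    Selection deterministically shifts the expected frequency, then
    drift is a binomial draw of 2N gametes per locus.
    """
    w = freqs * (1.0 + sel_coeff)                  # selection on the allele
    expected = w / (w + (1.0 - freqs))             # normalized expectation
    return rng.binomial(2 * pop_size, expected) / (2.0 * pop_size)

rng = np.random.default_rng(7)
f = np.full(100_000, 0.05)                         # 100k loci in parallel
for _ in range(500):
    f = wright_fisher_step(f, pop_size=1000, sel_coeff=0.01, rng=rng)
print((f == 0).mean(), (f == 1).mean())            # loss vs fixation fractions
```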
Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.
2011-01-01
We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276
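To give a flavor of how population-based samplers map onto many-core hardware, here is a vectorized sketch that advances many independent Metropolis chains in lockstep (plain NumPy standing in for GPU arrays; the target and tuning are invented):

```python
import numpy as np

def parallel_metropolis(n_chains, n_steps, step_size, rng):
    """Run many independent Metropolis chains in lockstep on a
    standard-normal target; every array op acts on all chains at once."""
    x = rng.standard_normal(n_chains)
    log_p = -0.5 * x ** 2
    for _ in range(n_steps):
        prop = x + step_size * rng.standard_normal(n_chains)
        log_p_prop = -0.5 * prop ** 2
        accept = np.log(rng.random(n_chains)) < (log_p_prop - log_p)
        x = np.where(accept, prop, x)
        log_p = np.where(accept, log_p_prop, log_p)
    return x

rng = np.random.default_rng(3)
samples = parallel_metropolis(n_chains=100_000, n_steps=200,
                              step_size=0.8, rng=rng)
print(samples.mean(), samples.std())   # ~0 and ~1 for the N(0,1) target
```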
NASA Technical Reports Server (NTRS)
Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad
1994-01-01
The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
Estimation of Nitrous Oxide Emissions from US Grasslands.
Mummey; Smith; Bluhm
2000-02-01
Nitrous oxide (N(2)O) emissions from temperate grasslands are poorly quantified and may be an important part of the atmospheric N(2)O budget. In this study N(2)O emissions were simulated for 1052 grassland sites in the United States using the NGAS model of Parton and others (1996) coupled with an organic matter decomposition model. N(2)O flux was calculated for each site using soil and land use data obtained from the National Resource Inventory (NRI) database and weather data obtained from NASA. The estimates were regionalized based upon temperature and moisture isotherms. Annual N(2)O emissions for each region were based on the grassland area of each region and the mean estimated annual N(2)O flux from NRI grassland sites in the region. The regional fluxes ranged from 0.18 to 1.02 kg N(2)O N/ha/yr with the mean flux for all regions being 0.28 kg N(2)O N/ha/yr. Even though fluxes from the western regions were relatively low, these regions made the largest contribution to total emissions due to their large grassland area. Total US grassland N(2)O emissions were estimated to be about 67 Gg N(2)O N/yr. Emissions from the Great Plains states, which contain the largest expanse of natural grassland in the United States, were estimated to average 0.24 kg N(2)O N/ha/yr. Using the annual flux estimate for the temperate Great Plains, we estimate that temperate grasslands worldwide may potentially produce 0.27 Tg N(2)O N/yr. Even though our estimate for global temperate grassland N(2)O emissions is less than published estimates for other major temperate and tropical biomes, our results indicate that temperate grasslands are a significant part of both United States and global atmospheric N(2)O budgets. This study demonstrates the utility of models for regional N(2)O flux estimation, although additional data from carefully designed field studies are needed to further validate model results.
Brischoux, François; Dupoué, Andréaz; Lourdais, Olivier; Angelier, Frédéric
2016-02-01
Temperate ectotherms are expected to benefit from climate change (e.g., increased activity time), but the impacts of climate warming during the winter have mostly been overlooked. Milder winters are expected to decrease body condition upon emergence, and thus to affect crucial life-history traits, such as survival and reproduction. Mild winter temperatures could also trigger a state of chronic physiological stress due to inadequate thermal conditions that preclude both dormancy and activity. We tested these hypotheses on a typical temperate ectothermic vertebrate, the aspic viper (Vipera aspis). We simulated different wintering conditions for three groups of aspic vipers (cold: ~6 °C, mild: ~14 °C, and no wintering: ~24 °C) during a one-month period. We found that mild wintering conditions induced a marked decrease in body condition and provoked an alteration of some hormonal mechanisms involved in emergence. Such effects are likely to bear ultimate consequences for reproduction, and thus population persistence. We emphasize that future studies should incorporate the critical, albeit neglected, winter season when assessing the potential impacts of global changes on ectotherms.
Parallel programming with Easy Java Simulations
NASA Astrophysics Data System (ADS)
Esquembre, F.; Christian, W.; Belloni, M.
2018-01-01
Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface, together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as the time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
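A shared-memory example of the kind the paper teaches, dividing one explicit update of the Schrödinger equation among threads, can be sketched as follows (in Python/NumPy rather than the paper's Java tools; the grid sizes and the naive Euler update are illustrative only):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

N, DX, DT = 4000, 0.05, 1e-4
x = np.arange(N) * DX
psi = np.exp(-((x - 100.0) ** 2) / 4.0).astype(complex)  # Gaussian packet
new = np.empty_like(psi)                                 # shared output array

def update_slice(bounds):
    """Each thread updates its own chunk of the shared arrays
    (one naive explicit Euler step of i dpsi/dt = -psi'')."""
    lo, hi = bounds
    i = np.arange(max(lo, 1), min(hi, N - 1))            # skip boundary points
    lap = (psi[i - 1] - 2 * psi[i] + psi[i + 1]) / DX ** 2
    new[i] = psi[i] + 1j * DT * lap

chunks = [(k * N // 4, (k + 1) * N // 4) for k in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(update_slice, chunks))
print(np.abs(new[1:-1]).max())   # interior points only; boundaries not updated
```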
A parallel computational model for GATE simulations.
Rannou, F R; Vega-Acevedo, N; El Bitar, Z
2013-12-01
GATE/Geant4 Monte Carlo simulations are computationally demanding applications, requiring thousands of processor hours to produce realistic results. The classical strategy of distributing the simulation of individual events does not apply efficiently to Positron Emission Tomography (PET) experiments, because it requires centralized coincidence processing and incurs large communication overheads. We propose a parallel computational model for GATE that handles event generation and coincidence processing in a simple and efficient way by decentralizing event generation and processing while maintaining a centralized event and time coordinator. The model is implemented with the inclusion of a new set of factory classes that can run the same executable in sequential or parallel mode. A Mann-Whitney test shows that the output produced by this parallel model in terms of number of tallies is equivalent (but not equal) to its sequential counterpart. Computational performance evaluation shows that the software is scalable and well balanced.
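The decentralized-generation/centralized-coordination split described here can be caricatured in a few lines: workers produce time-ordered singles streams, and a central coordinator merges them and pairs events within a coincidence window. A toy sketch (invented structures, not GATE classes):

```python
import heapq
import numpy as np

def worker_stream(worker_id, n_events, rng):
    """One decentralized worker: a time-ordered stream of singles."""
    times = np.sort(rng.uniform(0.0, 1.0, n_events))
    return [(t, worker_id) for t in times]

def find_coincidences(streams, window=1e-4):
    """The central coordinator: merge streams in time order and pair
    consecutive events from different sources within the window."""
    merged = list(heapq.merge(*streams))
    pairs = []
    for (t1, d1), (t2, d2) in zip(merged, merged[1:]):
        if t2 - t1 <= window and d1 != d2:
            pairs.append((t1, t2, d1, d2))
    return pairs

rng = np.random.default_rng(11)
streams = [worker_stream(i, 5000, rng) for i in range(4)]
print(len(find_coincidences(streams)))
```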
Global MHD simulation of magnetosphere using HPF
NASA Astrophysics Data System (ADS)
Ogino, T.
We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code had been fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code were shown to be almost comparable to those of the VPP Fortran version. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops, with an efficiency of 76.5%, using 56 PEs of the Fujitsu VPP5000/56 in vector and parallel computation, which permitted comparison with catalog values. We conclude that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
DOE Office of Scientific and Technical Information (OSTI.GOV)
G.A. Pope; K. Sephernoori; D.C. McKinney
1996-03-15
This report describes the application of distributed-memory parallel programming techniques to a compositional simulator called UTCHEM. The University of Texas Chemical Flooding reservoir simulator (UTCHEM) is a general-purpose vectorized chemical flooding simulator that models the transport of chemical species in three-dimensional, multiphase flow through permeable media. The parallel version of UTCHEM addresses solving large-scale problems by reducing the amount of time required to obtain the solution, as well as by providing a flexible and portable programming environment. In this work, the original parallel version of UTCHEM was modified and ported to the CRAY T3D and CRAY T3E distributed-memory multiprocessor computers, using CRAY-PVM as the interprocessor communication library. Also, the data communication routines were modified such that portability of the original code across different computer architectures was made possible.
Parallel Simulation of Unsteady Turbulent Flames
NASA Technical Reports Server (NTRS)
Menon, Suresh
1996-01-01
Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task, since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, their high cost and limited availability make practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed-memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render unsteady simulations of the type discussed above more feasible and affordable. This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed-memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy-viscosity-based subgrid models. This is acceptable for the moment/energy closure, since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used. Recently, a new model for turbulent combustion was developed, in which combustion is modeled within the subgrid (small scales) using a methodology that simulates the mixing, the molecular transport, and the chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure, and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion, and chemical kinetics; therefore, within each grid cell, a significant amount of computation must be carried out before the large-scale (LES-resolved) effects are incorporated. This approach is therefore uniquely suited for parallel processing and has been implemented on various systems such as the Intel Paragon, IBM SP-2, Cray T3D, and SGI Power Challenge (PC) using the system-independent Message Passing Interface (MPI) library. In this paper, timing data on these machines are reported along with some characteristic results.
NASA Astrophysics Data System (ADS)
Thomas, R. Q.; Williams, M.
2014-04-01
Carbon (C) and nitrogen (N) cycles are coupled in terrestrial ecosystems through multiple processes, including photosynthesis, tissue allocation, respiration, N fixation, N uptake, and decomposition of litter and soil organic matter. Capturing the constraint of N on terrestrial C uptake and storage has been a focus of the Earth System modelling community. However, there is little understanding of the trade-offs and sensitivities of allocating C and N to different tissues in order to optimize the productivity of plants. Here we describe a new, simple model of ecosystem C-N cycling and interactions (ACONITE) that builds on theory related to plant economics in order to predict key ecosystem properties (leaf area index, leaf C : N, N fixation, and plant C use efficiency) using emergent constraints provided by marginal returns on investment for C and/or N allocation. We simulated and evaluated steady-state ecosystem stocks and fluxes in three different forest ecosystem types (tropical evergreen, temperate deciduous, and temperate evergreen). Leaf C : N differed among the three ecosystem types (temperate deciduous < tropical evergreen < temperate evergreen), a result that compared well to observations from a global database describing plant traits. Gross primary productivity (GPP) and net primary productivity (NPP) estimates compared well to observed fluxes at the simulation sites. Simulated N fixation at steady state, calculated based on relative demand for N and the marginal return on C investment to acquire N, was an order of magnitude higher in the tropical forest than in the temperate forests, consistent with observations. A sensitivity analysis revealed that parameterization of the relationship between leaf N and leaf respiration had the largest influence on leaf area index and leaf C : N. Also, a widely used linear leaf N-respiration relationship did not yield a realistic leaf C : N, while a more recently reported non-linear relationship performed better. A parameter governing how photosynthesis scales with day length had the largest influence on total vegetation C, GPP, and NPP. Multiple parameters associated with photosynthesis, respiration, and N uptake influenced the rate of N fixation. Overall, our ability to constrain leaf area index and to allow spatially and temporally variable leaf C : N helps address challenges for ecosystem and Earth System models. Furthermore, this simple approach with emergent properties based on coupled C-N dynamics has potential for use in research that employs data-assimilation methods to integrate data on both the C and N cycles to improve C flux forecasts.
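The marginal-return logic, keep allocating carbon to canopy while the marginal carbon gain exceeds the marginal cost, can be illustrated with a toy optimization (the saturating GPP curve and the linear leaf cost below are invented stand-ins for the model's functions):

```python
import numpy as np

def gpp(lai, gpp_max=3000.0, k=0.5):
    """Toy saturating canopy photosynthesis (gC m-2 yr-1) vs leaf area index."""
    return gpp_max * (1.0 - np.exp(-k * lai))

def leaf_cost(lai, cost_per_lai=400.0):
    """Toy linear construction-plus-respiration cost of holding that canopy."""
    return cost_per_lai * lai

def optimal_lai(d_lai=0.01):
    """Increase LAI while the marginal C return exceeds the marginal C cost."""
    lai = 0.0
    while gpp(lai + d_lai) - gpp(lai) > leaf_cost(lai + d_lai) - leaf_cost(lai):
        lai += d_lai
    return lai

print(f"emergent optimal LAI ~ {optimal_lai():.2f}")
```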
Visualization Co-Processing of a CFD Simulation
NASA Technical Reports Server (NTRS)
Vaziri, Arsi
1999-01-01
OVERFLOW, a widely used CFD simulation code, is combined with a visualization system, pV3, to experiment with an environment for simulation/visualization co-processing on an SGI Origin 2000 (O2K) system. The shared-memory version of the solver is used, with the O2K 'pfa' preprocessor invoked to automatically discover parallelism in the source code. No other explicit parallelism is enabled. In order to study the scaling and performance of the visualization co-processing system, sample runs are made with processor groups in the range of 1 to 254 processors. The data exchange between the visualization system and the simulation system is rapid enough for user interactivity when the problem size is small. This shared-memory version of OVERFLOW, with minimal parallelization, does not scale well to an increasing number of available processors. The visualization task takes about 18 to 30% of the total processing time and does not appear to be a major contributor to the poor scaling. Improper load balancing and inter-processor communication overhead are contributors to this poor performance. Work is in progress aimed at obtaining improved parallel performance of the solver and at removing the limitations of serial data transfer to pV3 by examining various parallelization/communication strategies, including the use of explicit message passing.
Testing for carryover effects after cessation of treatments: a design approach.
Sturdevant, S Gwynn; Lumley, Thomas
2016-08-02
Recently, trials addressing noisy measurements with diagnosis occurring by exceeding thresholds (such as diabetes and hypertension) have been published which attempt to measure carryover: the impact that treatment has on an outcome after cessation. The design of these trials has been criticised, and simulations have been conducted which suggest that the parallel designs used are not adequate to test this hypothesis; two proposed solutions are that either a different parallel design or a cross-over design could allow for diagnosis of carryover. We undertook a systematic simulation study to determine the ability of a cross-over or a parallel-group trial design to detect carryover effects on incident hypertension in a population with prehypertension. We simulated blood pressure and focused on varying the criteria used to diagnose systolic hypertension. Using the difference in cumulative incidence of hypertension to analyse parallel-group or cross-over trials resulted in none of the designs having an acceptable Type I error rate: under the null hypothesis of no carryover, the error rate is well above the nominal 5% level. When a treatment is effective during the intervention period, reliable testing for a carryover effect is difficult. Neither parallel-group nor cross-over designs using the difference in cumulative incidence appear to be a feasible approach. Future trials should ensure their design and analysis are validated by simulation.
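A toy Monte Carlo in C can illustrate the inflation mechanism the authors describe. All parameters below (drift, effect size, measurement noise, thresholds, durations) are invented for illustration and are not taken from the study.

    /* Toy illustration: with threshold-based diagnosis, an effective treatment
     * with NO carryover still shifts end-of-study cumulative incidence between
     * arms, so a naive difference-in-incidence test misreads it as carryover. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    static double gauss(void) {            /* Box-Muller standard normal */
        double u = (rand() + 1.0) / (RAND_MAX + 2.0);
        double v = (rand() + 1.0) / (RAND_MAX + 2.0);
        return sqrt(-2.0 * log(u)) * cos(2.0 * 3.141592653589793 * v);
    }

    int main(void) {
        const int n = 5000, t_treat = 24, t_total = 48;  /* months, assumed */
        int inc[2] = {0, 0};                             /* diagnoses per arm */
        for (int arm = 0; arm < 2; arm++) {
            for (int i = 0; i < n; i++) {
                double sbp = 130.0;                      /* prehypertensive start */
                int diagnosed = 0;
                for (int t = 0; t < t_total && !diagnosed; t++) {
                    sbp += 0.15;                         /* slow upward drift */
                    /* treatment lowers SBP only WHILE on treatment: no carryover */
                    double effect = (arm == 1 && t < t_treat) ? -8.0 : 0.0;
                    if (sbp + effect + 6.0 * gauss() > 140.0) diagnosed = 1;
                }
                inc[arm] += diagnosed;
            }
        }
        printf("cumulative incidence: control %.3f vs treated %.3f\n",
               (double)inc[0] / n, (double)inc[1] / n);
        return 0;
    }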
Algorithm theoretical basis for GEDI level-4A footprint above ground biomass density.
NASA Astrophysics Data System (ADS)
Kellner, J. R.; Armston, J.; Blair, J. B.; Duncanson, L.; Hancock, S.; Hofton, M. A.; Luthcke, S. B.; Marselis, S.; Tang, H.; Dubayah, R.
2017-12-01
The Global Ecosystem Dynamics Investigation is a NASA Earth-Venture-2 mission that will place a multi-beam waveform lidar instrument on the International Space Station. GEDI data will provide globally representative measurements of vertical height profiles (waveforms) and estimates of above ground carbon stocks throughout the planet's temperate and tropical regions. Here we describe the current algorithm theoretical basis for the L4A footprint above ground biomass data product. The L4A data product is above ground biomass density (AGBD, Mg ha⁻¹) at the scale of individual GEDI footprints (25 m diameter). Footprint AGBD is derived from statistical models that relate waveform height metrics to field-estimated above ground biomass. The field estimates are from long-term permanent plot inventories in which all free-standing woody plants greater than a diameter threshold have been identified and mapped. We simulated GEDI waveforms from discrete-return airborne lidar data using the GEDI waveform simulator. We associated height metrics from simulated waveforms with field-estimated AGBD at 61 sites in temperate and tropical regions of North and South America, Europe, Africa, Asia and Australia. We evaluated the ability of empirical and physically-based regression and machine learning models to predict AGBD at the footprint level. Our analysis benchmarks the performance of these models in terms of site- and region-specific accuracy and transferability using a globally comprehensive calibration and validation dataset.
Xyce parallel electronic simulator reference guide, Version 6.0.1.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.
2014-01-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].
Xyce parallel electronic simulator reference guide, version 6.0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.
2013-08-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].
DOE Office of Scientific and Technical Information (OSTI.GOV)
Procassini, R.J.
1997-12-31
The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.
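A hedged sketch of the particle exchange such a message-passing decomposition requires is shown below. This is generic MPI, not MONACO source; the particle layout and the ring-style left/right neighbour pattern are illustrative assumptions.

    /* Exchange particles that have crossed subdomain boundaries with the two
     * neighbouring ranks. Counts are non-deterministic, so each exchange first
     * communicates how many particles follow. Caller must size 'in' for the
     * worst case. */
    #include <mpi.h>

    typedef struct { double x[3], E, w; } Particle;   /* assumed layout */

    void exchange_particles(MPI_Comm comm, int left, int right,
                            Particle *out_l, int n_l,
                            Particle *out_r, int n_r,
                            Particle *in, int *n_in) {
        MPI_Datatype ptype;
        MPI_Type_contiguous(sizeof(Particle), MPI_BYTE, &ptype);
        MPI_Type_commit(&ptype);

        MPI_Status st;
        int got = 0, k = 0;
        /* counts first, then payload, in deadlock-free paired exchanges */
        MPI_Sendrecv(&n_l, 1, MPI_INT, left, 0, &got, 1, MPI_INT, right, 0, comm, &st);
        MPI_Sendrecv(out_l, n_l, ptype, left, 1, in, got, ptype, right, 1, comm, &st);
        k = got;
        MPI_Sendrecv(&n_r, 1, MPI_INT, right, 2, &got, 1, MPI_INT, left, 2, comm, &st);
        MPI_Sendrecv(out_r, n_r, ptype, right, 3, in + k, got, ptype, left, 3, comm, &st);
        *n_in = k + got;
        MPI_Type_free(&ptype);
    }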
A Generic Mesh Data Structure with Parallel Applications
ERIC Educational Resources Information Center
Cochran, William Kenneth, Jr.
2009-01-01
High performance, massively-parallel multi-physics simulations are built on efficient mesh data structures. Most data structures are designed from the bottom up, focusing on the implementation of linear algebra routines. In this thesis, we explore a top-down approach to design, evaluating the various needs of many aspects of simulation, not just…
A three-dimensional spectral algorithm for simulations of transition and turbulence
NASA Technical Reports Server (NTRS)
Zang, T. A.; Hussaini, M. Y.
1985-01-01
A spectral algorithm for simulating three-dimensional, incompressible, parallel shear flows is described. It applies to the channel, to the parallel boundary layer, and to other shear flows with one wall-bounded and two periodic directions. Representative applications to the channel and to the heated boundary layer are presented.
Parallel Performance of a Combustion Chemistry Simulation
Skinner, Gregg; Eigenmann, Rudolf
1995-01-01
We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.
On Parallelizing Single Dynamic Simulation Using HPC Techniques and APIs of Commercial Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Diao, Ruisheng; Jin, Shuangshuang; Howell, Frederic
Time-domain simulations are heavily used in today's planning and operation practices to assess power system transient stability and post-transient voltage/frequency profiles following severe contingencies to comply with industry standards. Because of the increased modeling complexity, it is several times slower than real time for state-of-the-art commercial packages to complete a dynamic simulation for a large-scale model. With the growing stochastic behavior introduced by emerging technologies, the power industry has seen a growing need for performing security assessment in real time. This paper presents a parallel implementation framework to speed up a single dynamic simulation by leveraging the existing stability model library in commercial tools through their application programming interfaces (APIs). Several high performance computing (HPC) techniques are explored, such as parallelizing the calculation of generator current injection, identifying fast linear solvers for the network solution, and parallelizing data outputs when interacting with the APIs of the commercial package TSAT. The proposed method has been tested on a WECC planning base case with detailed synchronous generator models and exhibits outstanding scalable performance with sufficient accuracy.
The 2nd Symposium on the Frontiers of Massively Parallel Computations
NASA Technical Reports Server (NTRS)
Mills, Ronnie (Editor)
1988-01-01
Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.
Peng, Chunwang; Liu, Jie; Zhao, Daohui; Zhou, Jian
2014-09-30
In this work, the adsorption of hydrophobin (HFBI) on four different self-assembled monolayers (SAMs) (i.e., CH3-SAM, OH-SAM, COOH-SAM, and NH2-SAM) was investigated by parallel tempering Monte Carlo and molecular dynamics simulations. Simulation results indicate that the orientation of HFBI adsorbed on neutral surfaces is dominated by a hydrophobic dipole. HFBI adsorbs on the hydrophobic CH3-SAM through its hydrophobic patch and adopts a nearly vertical hydrophobic dipole relative to the surface, while the dipole is nearly horizontal when HFBI is adsorbed on the hydrophilic OH-SAM. On charged SAM surfaces, HFBI adopts a nearly vertical electric dipole relative to the surface. HFBI has the narrowest orientation distribution on the CH3-SAM, and thus can form an ordered monolayer and reverse the wettability of the surface. For HFBI adsorption on charged SAMs, the adsorption strength weakens as the surface charge density increases. Compared with the other SAMs, a larger area of the hydrophobic patch is exposed to the solution when HFBI adsorbs on the NH2-SAM. This leads to an increase in the hydrophobicity of the surface, which is consistent with experimental results. The binding of HFBI to the CH3-SAM is mainly through hydrophobic interactions, while for the OH-SAM it is mediated through a hydration water layer near the surface. For the charged SAM surfaces, the adsorption is mainly induced by electrostatic interactions between the charged surfaces and the oppositely charged residues. The effect of a hydrophobic dipole on protein adsorption onto hydrophobic surfaces is similar to that of an electric dipole for charged surfaces. Therefore, the hydrophobic dipole may be applied to predict the probable orientations of proteins adsorbed on hydrophobic surfaces.
Impacts of climate change on paddy rice yield in a temperate climate.
Kim, Han-Yong; Ko, Jonghan; Kang, Suchel; Tenhunen, John
2013-02-01
The crop simulation model is a suitable tool for evaluating the potential impacts of climate change on crop production and on the environment. This study investigates the effects of climate change on paddy rice production in temperate climate regions under the East Asian monsoon system using the CERES-Rice 4.0 crop simulation model. The model was first calibrated and validated for crop production under elevated CO2 and various temperature conditions. Data were obtained from experiments performed using a temperature gradient field chamber (TGFC) with a CO2 enrichment system installed at Chonnam National University in Gwangju, Korea, in 2009 and 2010. Based on the empirical calibration and validation, the model was applied to deliver a simulated forecast of paddy rice production for the region, as well as for other Japonica rice growing regions in East Asia, projecting for the years 2050 and 2100. In these climate change projection simulations for Gwangju, Korea, the yield increases due to CO2 elevation (+12.6% and +22.0%) were offset by temperature increases, with variation dependent upon the cultivars, resulting in significant yield decreases (-22.1% and -35.0%). The projected yields were determined to increase with latitude due to reduced temperature effects, showing the highest increase of any study location (+24%) in Harbin, China. It appears that the potential negative impact on crop production may be mitigated by appropriate cultivar selection and cultivation changes such as alteration of the planting date. Results reported in this study using the CERES-Rice 4.0 model demonstrate its promising potential for further application in simulating the impacts of climate change on rice production from a local to a regional scale under the monsoon climate system. © 2012 Blackwell Publishing Ltd.
Hu, Zhongmin; Shi, Hao; Cheng, Kaili; Wang, Ying-Ping; Piao, Shilong; Li, Yue; Zhang, Li; Xia, Jianyang; Zhou, Lei; Yuan, Wenping; Running, Steve; Li, Longhui; Hao, Yanbin; He, Nianpeng; Yu, Qiang; Yu, Guirui
2018-04-17
Given the important contribution of semiarid regions to the global land carbon cycle, accurate modeling of the interannual variability (IAV) of terrestrial gross primary productivity (GPP) is important but remains challenging. By decomposing GPP into leaf area index (LAI) and photosynthesis per leaf area (i.e., GPP_leaf), we investigated the IAV of GPP and the mechanisms responsible in a temperate grassland of northwestern China. We further assessed six ecosystem models for their capability to reproduce the observed IAV of GPP in this temperate grassland from 2004 to 2011. We observed that the responses of LAI and GPP_leaf to soil water significantly contributed to the IAV of GPP at the grassland ecosystem. Two of the six models, with prescribed LAI, simulated the observed IAV of GPP quite well, but still underestimated the variance of GPP_leaf and therefore the variance of GPP. In comparison, the patterns simulated by the other four models, with prognostic LAI, differed significantly from the observed IAV of GPP. Only some models with prognostic LAI could capture the observed sharp decline of GPP in drought years. Further analysis indicated that accurately representing the responses of GPP_leaf and leaf stomatal conductance to soil moisture is critical for the models to reproduce the observed IAV of GPP_leaf. Our framework also identified that the contributions of LAI and GPP_leaf to the observed IAV of GPP were relatively independent. We conclude that our framework of decomposing GPP into LAI and GPP_leaf has significant potential for facilitating future model intercomparison, benchmarking, and optimization, and should be adopted for future data-model comparisons. © 2018 John Wiley & Sons Ltd.
Synchrotron x-ray microtomography of the interior microstructure of chocolate
NASA Astrophysics Data System (ADS)
Lügger, Svenja K.; Wilde, Fabian; Dülger, Nihan; Reinke, Lennart M.; Kozhar, Sergii; Beckmann, Felix; Greving, Imke; Vieira, Josélio; Heinrich, Stefan; Palzer, Stefan
2016-10-01
The structure of chocolate, a multicomponent food product, was analyzed using microtomography. Chocolate consists of a semi-solid cocoa butter matrix and a dense network of suspended particles. A detailed analysis of the microstructure is needed to understand mass transport phenomena. Transport of lipids from, e.g., a filling or liquid cocoa butter is responsible for major problems in the confectionery industry such as chocolate bloom: the appearance of visible white spots or a grayish haze on the chocolate surface, which leads to consumer rejection and thus large sales losses for the confectionery industry. In this study it was possible to visualize the inner structure of chocolate and clearly distinguish the particles from the continuous phase by taking advantage of the high density contrast of synchrotron radiation. Consequently, particle arrangement and cracks within the sample were made visible. The cracks are several micrometers thick and propagate throughout the entire sample. Images of pure cocoa butter (chocolate without any particles) did not show any cracks, confirming that cracks are a result of embedded particles. They arise during the manufacturing process. Thus, the solidification process, a critical manufacturing step, was simulated with finite element methods in order to understand crack formation during this step. The simulation showed that cracks arise because of significant contraction of the cocoa butter matrix phase without any major change of volume of the suspended particles. Tempering of the chocolate mass prior to solidification is another critical step for good product quality. We found that samples which solidified in an uncontrolled manner are less homogeneous than tempered samples. In summary, our study visualized for the first time the inner microstructure of tempered and untempered cocoa butter as well as chocolate without sample destruction, and revealed cracks which might act as transport pathways.
Parallelization of a Monte Carlo particle transport simulation code
NASA Astrophysics Data System (ADS)
Hadjidoukas, P.; Bousis, C.; Emfietzoglou, D.
2010-05-01
We have developed a high performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared- and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures, including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors, and a 200 dual-processor HP cluster. For large problem sizes, which are limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow for the study of higher particle energies with the use of more accurate physical models, and improve statistics, as more particle tracks can be simulated in low response time.
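The per-rank stream separation such a parallelization depends on can be sketched as follows. A crude seed offset stands in here for the SPRNG/DCMT stream libraries the paper actually integrates; the rate tallied is a placeholder for a real particle history.

    /* Each MPI rank runs independent histories on its own random stream and
     * the tallies are reduced at the end; seeds/constants are illustrative. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        srand(12345u + 1000003u * rank);   /* crude stream separation (assumed) */

        long local_hits = 0, n = 1000000;
        for (long i = 0; i < n; i++) {     /* stand-in for one particle history */
            double x = rand() / (double)RAND_MAX;
            if (x < 0.5) local_hits++;
        }
        long total = 0;
        MPI_Reduce(&local_hits, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("tally = %ld over %ld histories on %d ranks\n",
                   total, n * size, size);
        MPI_Finalize();
        return 0;
    }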
Kan, Zigui; Zhu, Qiang; Yang, Lijiang; Huang, Zhixiong; Jin, Biaobing; Ma, Jing
2017-05-04
The conformation of cellulose with degrees of polymerization n = 1-12 in the ionic liquid 1,3-dimethylimidazolium chloride ([C1mim]Cl), and the intermolecular interactions between them, were studied by means of molecular dynamics (MD) simulations with fixed-charge and charge-variable polarizable force fields, respectively. The integrated tempering enhanced sampling method was also employed in the simulations in order to improve sampling efficiency. Cellulose undergoes significant conformational changes from a gaseous right-hand helical twist along the long axis to a flexible conformation in the ionic liquid. The intermolecular interactions between cellulose and the ionic liquid were studied by both infrared spectroscopy measurements and theoretical simulations. Designated by their puckering parameters, the pyranose rings of cellulose oligomers are mainly arranged in a chair conformation. With increasing degree of polymerization of cellulose, boat and skew-boat conformations appear in the MD simulations, especially in the simulations with the polarizable model. The number and population of hydrogen bonds (HBs) between the cellulose and the chloride anions show that the chloride anion is prone to form HBs whenever it approaches the hydroxyl groups of cellulose; thus, each hydroxyl group is fully hydrogen bonded to the chloride anion. MD simulations with the polarizable model presented more abundant conformations than those with the nonpolarizable model. The application of the enhanced sampling method further enlarged the conformational space that could be visited by facilitating the system's escape from local minima. It was found that the electrostatic interactions between the cellulose and the ionic liquid contribute more to the total interaction energies than the van der Waals interactions. Although the interaction energy between the cellulose and the anion is about 2.9 times that between the cellulose and the cation, the role of the cation is non-negligible. In contrast, the interaction energy between the cellulose and water is too weak to dissolve cellulose in water.
Adaptive multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model
NASA Astrophysics Data System (ADS)
Navarro, Cristóbal A.; Huang, Wei; Deng, Youjin
2016-08-01
This work presents an adaptive multi-GPU Exchange Monte Carlo approach for the simulation of the 3D Random Field Ising Model (RFIM). The design is based on a two-level parallelization. The first level, spin-level parallelism, maps the parallel computation as optimal 3D thread-blocks that simulate blocks of spins in shared memory with minimal halo surface, assuming a constant block volume. The second level, replica-level parallelism, uses multi-GPU computation to handle the simulation of an ensemble of replicas. CUDA's concurrent kernel execution feature is used in order to fill the occupancy of each GPU with many replicas, providing a performance boost that is most noticeable at the smallest values of L. In addition to the two-level parallel design, the work proposes an adaptive multi-GPU approach that dynamically builds a proper temperature set free of exchange bottlenecks. The strategy is based on mid-point insertions at the temperature gaps where the exchange rate is most compromised. The extra work generated by the insertions is balanced across the GPUs independently of where the mid-point insertions were performed. Performance results show that spin-level performance is approximately two orders of magnitude faster than a single-core CPU version and one order of magnitude faster than a parallel multi-core CPU version running on 16 cores. Multi-GPU performance is highly convenient under a weak scaling setting, reaching up to 99% efficiency as long as the number of GPUs and L increase together. The combination of the adaptive approach with the parallel multi-GPU design has extended our simulation capability to sizes of L = 32, 64 on a workstation with two GPUs. Sizes beyond L = 64 can eventually be studied using larger multi-GPU systems.
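The mid-point insertion strategy can be illustrated with a short C routine. The acceptance-rate threshold and ladder values below are invented for the example, and the per-gap swap rates would be re-measured after each insertion in a real run.

    /* Find the temperature gap with the worst exchange (swap) acceptance rate
     * and insert a new replica at its midpoint. */
    #include <stdio.h>

    #define MAXT 64

    int insert_midpoint(double T[], const double swap_rate[], int *nT,
                        double min_rate) {
        int worst = -1;
        double lowest = min_rate;
        for (int i = 0; i < *nT - 1; i++)        /* most-compromised gap */
            if (swap_rate[i] < lowest) { lowest = swap_rate[i]; worst = i; }
        if (worst < 0 || *nT >= MAXT) return 0;  /* ladder healthy or full */

        for (int i = *nT; i > worst + 1; i--)    /* shift to open a slot */
            T[i] = T[i - 1];
        T[worst + 1] = 0.5 * (T[worst] + T[worst + 2]);  /* midpoint */
        (*nT)++;
        return 1;
    }

    int main(void) {
        double T[MAXT] = {0.8, 1.0, 1.4, 2.0};
        double rate[MAXT] = {0.35, 0.02, 0.28};  /* gap 1 is the bottleneck */
        int nT = 4;
        insert_midpoint(T, rate, &nT, 0.10);
        for (int i = 0; i < nT; i++) printf("%.2f ", T[i]);
        printf("\n");                            /* 0.80 1.00 1.20 1.40 2.00 */
        return 0;
    }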
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.
1999-10-14
Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Also discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to distribute computational work equally across an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three-dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speedups for fixed problem size, a class of problems of immediate practical importance.
Vanadium Microalloyed High Strength Martensitic Steel Sheet for Hot-Dip Coating
NASA Astrophysics Data System (ADS)
Hutchinson, Bevis; Komenda, Jacek; Martin, David
Cold rolled steels with various vanadium and nitrogen levels have been treated to simulate the application of galvanizing and galvannealing to hardened martensitic microstructures. Strength levels were raised by 100-150 MPa by alloying with vanadium, which mitigates the effect of tempering. This opens the way for new ultra-high strength steels with corrosion resistant coatings produced by hot-dip galvanising.
Simulated impacts of insect defoliation on forest carbon dynamics
D. Medvigy; K.L. Clark; N.S. Skowronski; K.V.R. Schäfer
2012-01-01
Many temperate and boreal forests are subject to insect epidemics. In the eastern US, over 41 million square meters of tree basal area are thought to be at risk of gypsy moth defoliation. However, the decadal-to-century scale implications of defoliation events for ecosystem carbon dynamics are not well understood. In this study, the effects of defoliation intensity,...
Wen J. Wang; Hong S. He; Frank R. Thompson; Jacob S. Fraser; Brice B. Hanberry; William D. Dijak
2015-01-01
Most temperate forests in U.S. are recovering from heavy exploitation and are in intermediate successional stages where partial tree harvest is the primary disturbance. Changes in regional forest composition in response to climate change are often predicted for plant functional types using biophysical process models. These models usually simplify the simulation of...
Robert S. Ahl; Scott W. Woods; Hans R. Zuuring
2008-01-01
The Soil and Water Assessment Tool (SWAT) has been applied successfully in temperate environments but little is known about its performance in the snow-dominated, forested, mountainous watersheds that provide much of the water supply in western North America. To address this knowledge gap, we configured SWAT to simulate the streamflow of Tenderfoot Creek (TCSWAT)....
Douglas J. Shinneman; Brian J. Palik; Meredith W. Cornett
2012-01-01
Management strategies to restore forest landscapes are often designed to concurrently reduce fire risk. However, the compatibility of these two objectives is not always clear, and uncoordinated management among landowners may have unintended consequences. We used a forest landscape simulation model to compare the effects of contemporary management and hypothetical...
Zhao, Dongsheng; Wu, Shaohong; Yin, Yunhe
2013-01-01
The impact of regional climate change on net primary productivity (NPP) is an important aspect in the study of ecosystems’ response to global climate change. China’s ecosystems are very sensitive to climate change owing to the influence of the East Asian monsoon. The Lund–Potsdam–Jena Dynamic Global Vegetation Model for China (LPJ-CN), a global dynamical vegetation model developed for China’s terrestrial ecosystems, was applied in this study to simulate the NPP changes affected by future climate change. As the LPJ-CN model is based on natural vegetation, the simulation in this study did not consider the influence of anthropogenic activities. Results suggest that future climate change would have adverse effects on natural ecosystems, with NPP tending to decrease in eastern China, particularly in the temperate and warm temperate regions. NPP would increase in western China, with a concentration in the Tibetan Plateau and the northwest arid regions. The increasing trend in NPP in western China and the decreasing trend in eastern China would be further enhanced by the warming climate. The spatial distribution of NPP, which declines from the southeast coast to the northwest inland, would have minimal variation under scenarios of climate change. PMID:23593325
Using sketch-map coordinates to analyze and bias molecular dynamics simulations
Tribello, Gareth A.; Ceriotti, Michele; Parrinello, Michele
2012-01-01
When examining complex problems, such as the folding of proteins, coarse grained descriptions of the system drive our investigation and help us to rationalize the results. Oftentimes collective variables (CVs), derived through some chemical intuition about the process of interest, serve this purpose. Because finding these CVs is the most difficult part of any investigation, we recently developed a dimensionality reduction algorithm, sketch-map, that can be used to build a low-dimensional map of a phase space of high-dimensionality. In this paper we discuss how these machine-generated CVs can be used to accelerate the exploration of phase space and to reconstruct free-energy landscapes. To do so, we develop a formalism in which high-dimensional configurations are no longer represented by low-dimensional position vectors. Instead, for each configuration we calculate a probability distribution, which has a domain that encompasses the entirety of the low-dimensional space. To construct a biasing potential, we exploit an analogy with metadynamics and use the trajectory to adaptively construct a repulsive, history-dependent bias from the distributions that correspond to the previously visited configurations. This potential forces the system to explore more of phase space by making it desirable to adopt configurations whose distributions do not overlap with the bias. We apply this algorithm to a small model protein and succeed in reproducing the free-energy surface that we obtain from a parallel tempering calculation. PMID:22427357
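For intuition, a metadynamics-flavoured, one-dimensional C sketch of a history-dependent repulsive bias is given below. Note that the actual method deposits bias built from probability distributions over the sketch-map coordinates rather than simple Gaussians on a scalar CV; hill height and width here are invented.

    /* Accumulate repulsive Gaussian hills at visited values of a collective
     * coordinate s; the bias and its force push the system away from regions
     * it has already explored. */
    #include <math.h>
    #include <stdio.h>

    #define MAXHILLS 10000
    static double centres[MAXHILLS];
    static int nhills = 0;
    static const double w = 0.1, sigma = 0.2;   /* hill height/width (assumed) */

    void deposit(double s) { if (nhills < MAXHILLS) centres[nhills++] = s; }

    double bias(double s) {                     /* V(s) = sum of Gaussians */
        double v = 0.0;
        for (int i = 0; i < nhills; i++) {
            double d = (s - centres[i]) / sigma;
            v += w * exp(-0.5 * d * d);
        }
        return v;
    }

    double bias_force(double s) {               /* -dV/ds, added to MD forces */
        double f = 0.0;
        for (int i = 0; i < nhills; i++) {
            double d = (s - centres[i]) / sigma;
            f += w * exp(-0.5 * d * d) * d / sigma;
        }
        return f;
    }

    int main(void) {
        for (int i = 0; i < 5; i++) deposit(0.1 * i);  /* fill a visited region */
        printf("V(0.2)=%.3f  F(0.2)=%.3f\n", bias(0.2), bias_force(0.2));
        return 0;
    }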
Exploiting molecular dynamics in Nested Sampling simulations of small peptides
NASA Astrophysics Data System (ADS)
Burkoff, Nikolas S.; Baldock, Robert J. N.; Várnai, Csilla; Wild, David L.; Csányi, Gábor
2016-04-01
Nested Sampling (NS) is a parameter space sampling algorithm which can be used for sampling the equilibrium thermodynamics of atomistic systems. NS has previously been used to explore the potential energy surface of a coarse-grained protein model and has significantly outperformed parallel tempering when calculating heat capacity curves of Lennard-Jones clusters. The original NS algorithm uses Monte Carlo (MC) moves; however, a variant, Galilean NS, has recently been introduced which allows NS to be incorporated into a molecular dynamics framework, so NS can be used for systems which lack efficient prescribed MC moves. In this work we demonstrate the applicability of Galilean NS to atomistic systems. We present an implementation of Galilean NS using the Amber molecular dynamics package and demonstrate its viability by sampling alanine dipeptide, both in vacuo and in implicit solvent. Unlike previous studies of this system, we present the heat capacity curves of alanine dipeptide, whose calculation provides a stringent test for sampling algorithms. We also compare our results with those calculated using replica exchange molecular dynamics (REMD) and find good agreement. We show the computational effort required for accurate heat capacity estimation for small peptides. We also calculate the alanine dipeptide Ramachandran free energy surface for a range of temperatures and use it to compare results using the latest Amber force field with previous theoretical and experimental results.
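The constrained, ballistic character of a Galilean NS move can be sketched in one dimension as follows. A toy harmonic potential and a simple velocity reversal stand in for reflection off the true potential gradient; all parameters are invented.

    /* Walker moves ballistically and bounces back whenever a step would
     * exceed the current energy ceiling E_max. */
    #include <stdio.h>

    static double U(double x) { return 0.5 * x * x; }   /* toy potential */

    double galilean_step(double x, double *v, double E_max, int nsteps, double dt) {
        for (int i = 0; i < nsteps; i++) {
            double trial = x + *v * dt;
            if (U(trial) < E_max)
                x = trial;      /* still inside the hard energy constraint */
            else
                *v = -*v;       /* reverse at the E_max boundary */
        }
        return x;
    }

    int main(void) {
        double v = 1.0;
        double x = galilean_step(0.0, &v, 1.0, 1000, 0.01);
        printf("final x = %.3f (confined to |x| < 1.414 for E_max = 1)\n", x);
        return 0;
    }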
Long-term variation of total ozone
NASA Astrophysics Data System (ADS)
Kane, R. P.
1988-03-01
The long-term variation of total ozone is studied from 1957 to date for different latitude zones. The 3-year running averages show that, apart from a small portion showing parallelism with sunspot cycles, the trends in different latitude zones are dissimilar. In particular, where northern latitudes show a rising trend, the southern latitudes show an opposite (decreasing) trend. In the north-temperate group, Europe, North America, and Asia show dissimilar trends. The longer data series (1932 onwards) for Arosa shows, besides a solar-cycle-dependent component, a steady level during 1932-1953 and a downward trend thereafter up to the present. Very localised but long-lasting circulation patterns, different in different geographical regions, are indicated.
Xyce parallel electronic simulator reference guide, version 6.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.
2014-03-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Zubair, Mohammad
1993-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can be effectively parallelized on a distributed-memory parallel machine. By increasing the number of processors, nearly ideal linear speedups are achieved with nonoptimized routines; slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. This slower-than-linear speedup results because the fast Fourier transform (FFT) routine dominates the computational cost and this routine yields less-than-ideal speedups. However, with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise, wall-normal, and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single-processor time to complete a comparable simulation; however, it is estimated that a subgrid-scale model, which reduces the required number of grid points and turns the computation into a large-eddy simulation (PSLES), would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
Computer Science Techniques Applied to Parallel Atomistic Simulation
NASA Astrophysics Data System (ADS)
Nakano, Aiichiro
1998-03-01
Recent developments in parallel processing technology and multiresolution numerical algorithms have established large-scale molecular dynamics (MD) simulations as a new research mode for studying materials phenomena such as fracture. However, this requires large system sizes and long simulated times. We have developed: i) Space-time multiresolution schemes; ii) fuzzy-clustering approach to hierarchical dynamics; iii) wavelet-based adaptive curvilinear-coordinate load balancing; iv) multilevel preconditioned conjugate gradient method; and v) spacefilling-curve-based data compression for parallel I/O. Using these techniques, million-atom parallel MD simulations are performed for the oxidation dynamics of nanocrystalline Al. The simulations take into account the effect of dynamic charge transfer between Al and O using the electronegativity equalization scheme. The resulting long-range Coulomb interaction is calculated efficiently with the fast multipole method. Results for temperature and charge distributions, residual stresses, bond lengths and bond angles, and diffusivities of Al and O will be presented. The oxidation of nanocrystalline Al is elucidated through immersive visualization in virtual environments. A unique dual-degree education program at Louisiana State University will also be discussed in which students can obtain a Ph.D. in Physics & Astronomy and a M.S. from the Department of Computer Science in five years. This program fosters interdisciplinary research activities for interfacing High Performance Computing and Communications with large-scale atomistic simulations of advanced materials. This work was supported by NSF (CAREER Program), ARO, PRF, and Louisiana LEQSF.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin 2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is: 1) developing highly accurate parallel numerical algorithms, 2) conducting preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporating newly developed algorithms into actual simulation packages. This work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
An intelligent processing environment for real-time simulation
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Wells, Buren Earl, Jr.
1988-01-01
The development of a highly efficient and thus truly intelligent processing environment for real-time general purpose simulation of continuous systems is described. Such an environment can be created by mapping the simulation process directly onto the University of Alabama's OPERA architecture. To facilitate this effort, the field of continuous simulation is explored, highlighting areas in which efficiency can be improved. Areas in which parallel processing can be applied are also identified, and several general OPERA-type hardware configurations that support improved simulation are investigated. Three direct execution parallel processing environments are introduced, each of which greatly improves efficiency by exploiting distinct areas of the simulation process. These suggested environments are candidate architectures around which a highly intelligent real-time simulation configuration can be developed.
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
Use of Parallel Micro-Platform for the Simulation the Space Exploration
NASA Astrophysics Data System (ADS)
Velasco Herrera, Victor Manuel; Velasco Herrera, Graciela; Rosano, Felipe Lara; Rodriguez Lozano, Salvador; Lucero Roldan Serrato, Karen
The purpose of this work is to create a parallel micro-platform that simulates the virtual movements of a space exploration in 3D. One of the innovations presented in this design is the application of a lever mechanism for the transmission of movement. The development of such a robot is a challenging task, very different from that of industrial manipulators due to a totally different set of target requirements. This work presents the computer-aided study and simulation of the movement of this parallel manipulator. The model was developed using the computer-aided design platform Unigraphics, in which the geometric modeling of each component and of the final assembly was carried out (CAD), files for the computer-aided manufacture (CAM) of each piece were generated, and the kinematic simulation of the system was performed while evaluating different driving schemes. We used the MATLAB aerospace toolbox and created an adaptive control module to simulate the system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Langer, S; Rotman, D; Schwegler, E
The Institutional Computing Executive Group (ICEG) review of FY05-06 Multiprogrammatic and Institutional Computing (M and IC) activities is presented in the attached report. In summary, we find that the M and IC staff does an outstanding job of acquiring and supporting a wide range of institutional computing resources to meet the programmatic and scientific goals of LLNL. The responsiveness and high quality of support given to users and the programs investing in M and IC reflects the dedication and skill of the M and IC staff. M and IC has successfully managed serial capacity, parallel capacity, and capability computing resources. Serial capacity computing supports a wide range of scientific projects which require access to a few high performance processors within a shared memory computer. Parallel capacity computing supports scientific projects that require a moderate number of processors (up to roughly 1000) on a parallel computer. Capability computing supports parallel jobs that push the limits of simulation science. M and IC has worked closely with Stockpile Stewardship, and together they have made LLNL a premier institution for computational and simulation science. Such a standing is vital to the continued success of laboratory science programs and to the recruitment and retention of top scientists. This report provides recommendations to build on M and IC's accomplishments and improve simulation capabilities at LLNL. We recommend that the institution fully fund (1) operation of the atlas cluster purchased in FY06 to support a few large projects; (2) operation of the thunder and zeus clusters to enable 'mid-range' parallel capacity simulations during normal operation and a limited number of large simulations during dedicated application time; (3) operation of the new yana cluster to support a wide range of serial capacity simulations; (4) improvements to the reliability and performance of the Lustre parallel file system; (5) support for the new GDO petabyte-class storage facility on the green network for use in data-intensive external collaborations; and (6) continued support for visualization and other methods for analyzing large simulations. We also recommend that M and IC begin planning in FY07 for the next upgrade of its parallel clusters. LLNL investments in M and IC have resulted in a world-class simulation capability leading to innovative science. We thank the LLNL management for its continued support and thank the M and IC staff for its vision and dedicated efforts to make it all happen.
A foundation for initial attack simulation: the Fried and Fried fire containment model
Jeremy S. Fried; Burton D. Fried
2010-01-01
The Fried and Fried containment algorithm, which models the effect of suppression efforts on fire growth, allows simulation of any mathematically representable fire shape, provides for "head" and "tail" attack tactics as well as parallel attack (building fireline parallel to but at some offset distance from the free-burning fire perimeter, alone and...
Parallel 3D Finite Element Numerical Modelling of DC Electron Guns
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prudencio, E.; Candel, A.; Ge, L.
2008-02-04
In this paper we present Gun3P, a parallel 3D finite element application that the Advanced Computations Department at the Stanford Linear Accelerator Center is developing for the analysis of beam formation in DC guns and beam transport in klystrons. Gun3P is targeted especially at complex geometries that cannot be described by 2D models and cannot be easily handled by finite difference discretizations. Its parallel capability allows simulations with more accuracy and less processing time than packages currently available. We present simulation results for the L-band Sheet Beam Klystron DC gun, in which case Gun3P is able to reduce simulation time from days to some hours.
Wake Encounter Analysis for a Closely Spaced Parallel Runway Paired Approach Simulation
NASA Technical Reports Server (NTRS)
McKissick, Burnell T.; Rico-Cusi, Fernando J.; Murdoch, Jennifer; Oseguera-Lohr, Rosa M.; Stough, Harry P., III; O'Connor, Cornelius J.; Syed, Hazari I.
2009-01-01
A Monte Carlo simulation of simultaneous approaches performed by two transport category aircraft from the final approach fix to a pair of closely spaced parallel runways was conducted to explore the aft boundary of the safe zone in which separation assurance and wake avoidance are provided. The simulation included variations in runway centerline separation, initial longitudinal spacing of the aircraft, crosswind speed, and aircraft speed during the approach. The data from the simulation showed that the majority of the wake encounters occurred near or over the runway and the aft boundaries of the safe zones were identified for all simulation conditions.
Long-time atomistic simulations with the Parallel Replica Dynamics method
NASA Astrophysics Data System (ADS)
Perez, Danny
Molecular Dynamics (MD) -- the numerical integration of atomistic equations of motion -- is a workhorse of computational materials science. Indeed, MD can in principle be used to obtain any thermodynamic or kinetic quantity, without introducing any approximation or assumptions beyond the adequacy of the interaction potential. It is therefore an extremely powerful and flexible tool to study materials with atomistic spatio-temporal resolution. These enviable qualities, however, come at a steep computational price, limiting the system sizes and simulation times that can be achieved in practice. While the size limitation can be efficiently addressed with massively parallel implementations of MD based on spatial decomposition strategies, allowing for the simulation of trillions of atoms, the same approach usually cannot extend the timescales much beyond microseconds. In this article, we discuss an alternative parallel-in-time approach, the Parallel Replica Dynamics (ParRep) method, which aims at addressing the timescale limitation of MD for systems that evolve through rare state-to-state transitions. We review the formal underpinnings of the method and demonstrate that it can provide arbitrarily accurate results for any definition of the states. When an adequate definition of the states is available, ParRep can simulate trajectories with a parallel speedup approaching the number of replicas used. We demonstrate the usefulness of ParRep by presenting different examples of materials simulations where access to long timescales was essential to reach the physical regime of interest, and discuss practical considerations that must be addressed to carry out these simulations. Work supported by the United States Department of Energy (U.S. DOE), Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division.
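The time accounting at the heart of ParRep can be demonstrated with a runnable toy in which escape from a state is a memoryless (Poisson) process, the regime where the method is exact. The rate constant and the "MD segment" below are stand-ins for real dynamics, and the replicas are run sequentially here purely for illustration.

    /* Toy ParRep accounting: trajectory time accumulated across R replicas is
     * summed into one long physical trajectory; the first escape ends a cycle. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    static double k_rate = 0.01;               /* assumed escape rate */

    static double expdev(void) {               /* exponential waiting time */
        double u = (rand() + 1.0) / (RAND_MAX + 2.0);
        return -log(u) / k_rate;
    }

    /* Time to transition within this block, or t_max if none occurs. */
    static double run_md_segment(double t_max) {
        double tau = expdev();
        return tau < t_max ? tau : t_max;
    }

    double parrep_escape_time(int R, double t_block) {
        double t_accum = 0.0;                  /* accumulated trajectory time */
        for (;;) {
            for (int r = 0; r < R; r++) {      /* conceptually in parallel */
                double tau = run_md_segment(t_block);
                if (tau < t_block)
                    return t_accum + tau;      /* replica times sum into one */
                t_accum += t_block;
            }
        }
    }

    int main(void) {
        double mean = 0.0;
        int trials = 10000;
        for (int i = 0; i < trials; i++) mean += parrep_escape_time(8, 10.0);
        printf("mean escape time %.1f (expect ~%.1f)\n",
               mean / trials, 1.0 / k_rate);
        return 0;
    }

Because the escape process is memoryless, concatenating fresh segments from many replicas samples exactly the same first-passage distribution as one long trajectory, which is why the mean printed above recovers 1/k.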
Xyce parallel electronic simulator : reference guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.
2011-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures, including single-processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.
NASA Astrophysics Data System (ADS)
Sun, Rui; Xiao, Heng
2016-04-01
With the growth of available computational resources, CFD-DEM (computational fluid dynamics-discrete element method) becomes an increasingly promising and feasible approach for the study of sediment transport. Several existing CFD-DEM solvers are applied in chemical engineering and the mining industry. However, a robust CFD-DEM solver for the simulation of sediment transport is still desirable. In this work, the development of a three-dimensional, massively parallel, and open-source CFD-DEM solver, SediFoam, is detailed. This solver is built on the open-source solvers OpenFOAM and LAMMPS. OpenFOAM is a CFD toolbox that can perform three-dimensional fluid flow simulations on unstructured meshes; LAMMPS is a massively parallel DEM solver for molecular dynamics. Several validation tests of SediFoam are performed using cases of a wide range of complexities. The results obtained in the present simulations are consistent with those in the literature, which demonstrates the capability of SediFoam for sediment transport applications. In addition to the validation tests, the parallel efficiency of SediFoam is studied to test the performance of the code for large-scale and complex simulations. The parallel efficiency tests show that the scalability of SediFoam is satisfactory in simulations using up to O(10⁷) particles.
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators
Wang, Wei; Xu, Lifan; Cavazos, John; Huang, Howie H.; Kay, Matthew
2014-01-01
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case-study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than speedup above the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation provided speedups on GPUs of at least faster than the sequential implementation and faster than a parallelized OpenMP implementation. An implementation of OpenMP on Intel MIC coprocessor provided speedups of with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media. PMID:24497950
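As a flavour of the directive-based approach, a 5-point stencil update, the access pattern at the core of 2D wave propagation codes, can be offloaded with a single OpenACC pragma. The physics below is a generic diffusion-style update standing in for the cardiac action potential model, which is not reproduced here.

    /* Offload an explicit 2D stencil sweep; only the pragma usage is the point. */
    #define N 512

    void step(const double *restrict u, double *restrict un, double c) {
        #pragma acc parallel loop collapse(2) copyin(u[0:N*N]) copyout(un[0:N*N])
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++)
                un[i*N + j] = u[i*N + j]
                            + c * (u[(i-1)*N + j] + u[(i+1)*N + j]
                                 + u[i*N + j - 1] + u[i*N + j + 1]
                                 - 4.0 * u[i*N + j]);
    }

In practice one would hoist the data movement out of the time loop (e.g., with an enclosing #pragma acc data region) so the arrays stay resident on the accelerator across steps; the per-call copies above are correct but slow.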
PIC Simulation of Laser Plasma Interactions with Temporal Bandwidths
NASA Astrophysics Data System (ADS)
Tsung, Frank; Weaver, J.; Lehmberg, R.
2015-11-01
We are performing particle-in-cell simulations using the code OSIRIS to study the effects of laser plasma interactions in the presence of temporal bandwidths under conditions relevant to current and future shock ignition experiments on the NIKE laser. Our simulations show that, for sufficiently large bandwidth, the saturation level and the distribution of hot electrons can be affected by the addition of temporal bandwidths (which can be accomplished in experiments using smoothing techniques such as SSD or ISI). We will show that temporal bandwidth alone plays an important role in the control of LPIs in these lasers and discuss future directions. This work is conducted under the auspices of NRL.
Schnek: A C++ library for the development of parallel simulation codes on regular grids
NASA Astrophysics Data System (ADS)
Schmitz, Holger
2018-05-01
A large number of algorithms across the field of computational physics are formulated on grids with a regular topology. We present Schnek, a library that enables fast development of parallel simulations on regular grids. Schnek contains a number of easy-to-use modules that greatly reduce the amount of administrative code for large-scale simulation codes. The library provides an interface for reading simulation setup files with a hierarchical structure. The structure of the setup file is translated into a hierarchy of simulation modules that the developer can specify. The reader parses and evaluates mathematical expressions and initialises variables or grid data. This enables developers to write modular and flexible simulation codes with minimal effort. Regular grids of arbitrary dimension are defined as well as mechanisms for defining physical domain sizes, grid staggering, and ghost cells on these grids. Ghost cells can be exchanged between neighbouring processes using MPI with a simple interface. The grid data can easily be written into HDF5 files using serial or parallel I/O.
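The ghost-cell exchange such a library wraps can be sketched with plain MPI for a one-dimensional slab decomposition (a minimal sketch under that assumption; the function and variable names are ours, not Schnek's interface):

```cpp
// Generic MPI ghost-cell exchange for a 1D slab decomposition; u[0] and
// u[n-1] are the ghost cells, u[1]..u[n-2] the interior.
#include <mpi.h>
#include <vector>

void exchange_ghosts(std::vector<double>& u, MPI_Comm comm) {
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    // MPI_PROC_NULL turns the boundary sends/receives into no-ops.
    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;
    int n = static_cast<int>(u.size());
    // Send first interior cell left, receive right ghost, and vice versa.
    MPI_Sendrecv(&u[1],     1, MPI_DOUBLE, left,  0,
                 &u[n - 1], 1, MPI_DOUBLE, right, 0, comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[n - 2], 1, MPI_DOUBLE, right, 1,
                 &u[0],     1, MPI_DOUBLE, left,  1, comm, MPI_STATUS_IGNORE);
}
```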
Crystal MD: The massively parallel molecular dynamics software for metal with BCC structure
NASA Astrophysics Data System (ADS)
Hu, Changjun; Bai, He; He, Xinfu; Zhang, Boyao; Nie, Ningming; Wang, Xianmeng; Ren, Yingwen
2017-02-01
Material irradiation effects are among the most important issues in the use of nuclear power. However, the lack of high-throughput irradiation facilities and of knowledge of the evolution process leads to limited understanding of the issues involved. With the help of high-performance computing, we can gain a deeper understanding of materials at the micro level. In this paper, a new data structure is proposed for the massively parallel simulation of the evolution of metal materials in an irradiation environment. Based on the proposed data structure, we developed the new molecular dynamics software named Crystal MD. Simulations with Crystal MD achieved over 90% parallel efficiency in test cases, and the code uses more than 25% less memory on multi-core clusters than LAMMPS and IMD, two popular molecular dynamics simulation packages. Using Crystal MD, a two-trillion-particle simulation has been performed on the Tianhe-2 cluster.
Inflated speedups in parallel simulations via malloc()
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
Discrete-event simulation programs make heavy use of dynamic memory allocation in order to support simulation's very dynamic space requirements. When programming in C one is likely to use the malloc() routine. However, a parallel simulation which uses the standard Unix System V malloc() implementation may achieve an overly optimistic speedup, possibly superlinear. An alternate implementation provided on some (but not all) systems can avoid the speedup anomaly, but at the price of significantly reduced available free space. This is especially severe on most parallel architectures, which tend not to support virtual memory. It is shown how a simply implemented user-constructed interface to malloc() can both avoid artificially inflated speedups and make efficient use of the dynamic memory space. The interface simply caches blocks on the basis of their size. The problem is demonstrated empirically, and the effectiveness of the solution is shown both empirically and analytically.
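The size-based caching interface can be sketched as follows (a minimal single-threaded sketch, assuming one simulation process per processor as in the message-passing setting above; the data structures and names are illustrative, not Nicol's implementation):

```cpp
// Size-binned caching front end to malloc(): freed blocks are kept on
// per-size free lists and reused, bypassing the system allocator on the
// hot path. Blocks are never returned to the system in this sketch.
#include <cstdlib>
#include <map>
#include <vector>

static std::map<std::size_t, std::vector<void*>> free_lists;

void* cached_malloc(std::size_t size) {
    auto it = free_lists.find(size);
    if (it != free_lists.end() && !it->second.empty()) {
        void* p = it->second.back();   // reuse a cached block of this size
        it->second.pop_back();
        return p;
    }
    return std::malloc(size);          // fall through to the system allocator
}

void cached_free(void* p, std::size_t size) {
    free_lists[size].push_back(p);     // cache instead of freeing
}
```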
NASA Astrophysics Data System (ADS)
Wendel, D. E.; Olson, D. K.; Hesse, M.; Karimabadi, H.; Daughton, W. S.
2013-12-01
We investigate the distribution of parallel electric fields and their relationship to the location and rate of magnetic reconnection in a large particle-in-cell simulation of 3D turbulent magnetic reconnection with open boundary conditions. The simulation's guide field geometry inhibits the formation of topological features such as separators and null points. Therefore, we derive the location of potential changes in magnetic connectivity by finding the field lines that experience a large relative change between their endpoints, i.e., the quasi-separatrix layer. We find a correspondence between the locus of changes in magnetic connectivity, or the quasi-separatrix layer, and the map of large gradients in the integrated parallel electric field (or quasi-potential). Furthermore, we compare the distribution of parallel electric fields along field lines with the reconnection rate. We find the reconnection rate is controlled by only the low-amplitude, zeroth and first-order trends in the parallel electric field, while the contribution from high amplitude parallel fluctuations, such as electron holes, is negligible. The results impact the determination of reconnection sites within models of 3D turbulent reconnection as well as the inference of reconnection rates from in situ spacecraft measurements. It is difficult through direct observation to isolate the locus of the reconnection parallel electric field amidst the large amplitude fluctuations. However, we demonstrate that a positive slope of the partial sum of the parallel electric field along the field line as a function of field line length indicates where reconnection is occurring along the field line.
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
NASA Astrophysics Data System (ADS)
Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji
2013-03-01
This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
Biocellion: accelerating computer simulation of multicellular biological system models
Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya
2014-01-01
Motivation: Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. Results: We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Availability and implementation: Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. Contact: seunghwa.kang@pnnl.gov PMID:25064572
Advances in Parallelization for Large Scale Oct-Tree Mesh Generation
NASA Technical Reports Server (NTRS)
O'Connell, Matthew; Karman, Steve L.
2015-01-01
Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
On Designing Multicore-Aware Simulators for Systems Biology Endowed with OnLine Statistics
Calcagno, Cristina; Coppo, Mario
2014-01-01
This paper discusses enabling methodologies for the design of a fully parallel, online, interactive tool to support bioinformatics scientists. In particular, the features of these methodologies, supported by the FastFlow parallel programming framework, are shown on a simulation tool that performs the modeling, tuning, and sensitivity analysis of stochastic biological models. A stochastic simulation needs thousands of independent simulation trajectories, turning into big data that should be analysed by statistical and data-mining tools. In the considered approach the two stages are pipelined in such a way that the simulation stage streams out the partial results of all simulation trajectories to the analysis stage, which immediately produces a partial result. The simulation-analysis workflow is validated for performance and for effectiveness of the online analysis in capturing biological systems behavior, on a multicore platform and representative proof-of-concept biological systems. The exploited methodologies include pattern-based parallel programming and data streaming, which provide key features to software designers such as performance portability and efficient in-memory (big) data management and movement. Two paradigmatic classes of biological systems exhibiting multistable and oscillatory behavior are used as a testbed. PMID:25050327
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Bui, Trong T.
1999-01-01
A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
Parallel ecological networks in ecosystems
Olff, Han; Alonso, David; Berg, Matty P.; Eriksson, B. Klemens; Loreau, Michel; Piersma, Theunis; Rooney, Neil
2009-01-01
In ecosystems, species interact with other species directly and through abiotic factors in multiple ways, often forming complex networks of various types of ecological interaction. Out of this suite of interactions, predator–prey interactions have received most attention. The resulting food webs, however, will always operate simultaneously with networks based on other types of ecological interaction, such as through the activities of ecosystem engineers or mutualistic interactions. Little is known about how to classify, organize and quantify these other ecological networks and their mutual interplay. The aim of this paper is to provide new and testable ideas on how to understand and model ecosystems in which many different types of ecological interaction operate simultaneously. We approach this problem by first identifying six main types of interaction that operate within ecosystems, of which food web interactions are one. Then, we propose that food webs are structured along two main axes of organization: a vertical (classic) axis representing trophic position and a new horizontal ‘ecological stoichiometry’ axis representing decreasing palatability of plant parts and detritus for herbivores and detritivores and slower turnover times. The usefulness of these new ideas is then explored with three very different ecosystems as test cases: temperate intertidal mudflats; temperate short grass prairie; and tropical savannah. PMID:19451126
NASA Astrophysics Data System (ADS)
Villegas-Ríos, David; Alonso-Fernández, Alexandre; Domínguez-Petit, Rosario; Saborido-Rey, Fran
2014-02-01
Energy allocation is an important component of life-history variation since it determines the tradeoff between growth and reproduction. In this study we investigated the state-dependent and sex-specific energy allocation pattern and the reproductive investment of a protogynous hermaphrodite fish with parental care. Individuals of Labrus bergylta, a temperate wrasse displaying two main colour patterns (plain and spotted), were obtained from fish markets in NW Spain between 2009 and 2012. Total energy of the gonad, liver, mesenteric fat and muscle (obtained by calorimetric analysis) and gut weight (as a proxy of feeding intensity) were modelled in relation to the reproductive phase of the individuals. A decrease in the energy stored as mesenteric fat from prespawning to spawning paralleled the increase in gonad total energy over the same period. The predicted reduction in stored total energy over the reproductive cycle was higher than the energy required to develop the ovaries for the full range of female sizes analysed, suggesting a capital breeding strategy. Males stored less energy over a season and invested fewer resources in gamete production than females. Reproductive investment (both fecundity and the energy required to produce the gonads) was higher in plain than in spotted females, which is in agreement with the different growth patterns described for the species.
Rivas, A; Barceló-Quintal, I; Moeller, G E
2011-01-01
A multi-stage municipal wastewater treatment system is proposed to comply with Mexican standards for discharge into receiving water bodies. The system is located in Santa Fe de la Laguna, Mexico, an area with a temperate climate. It was designed for 2,700 population equivalents (259.2 m3/d) and consists of preliminary treatment, a septic tank, and two modules operating in parallel, each consisting of a horizontal subsurface-flow wetland, a maturation pond and a vertical-flow polishing wetland. After two years of operation, on-site research was performed. Efficient removal of biochemical oxygen demand (BOD5) (94-98%), chemical oxygen demand (91-93%), total suspended solids (93-97%), total Kjeldahl nitrogen (56-88%) and fecal coliforms (4-5 logs) was obtained. Significant phosphorus removal was not accomplished in this study (25-52%). Evapotranspiration was measured in different treatment units. This study demonstrates that during the dry season wastewater treated by this multi-stage system cannot comply with the limits established by Mexican standards for receiving water bodies of type 'C'. However, it has demonstrated the system's potential for less restrictive uses such as agricultural irrigation and recreation, and it provides an opportunity for wastewater treatment in rural areas without electric energy.
Fast-cycling unit of root turnover in perennial herbaceous plants in a cold temperate ecosystem
NASA Astrophysics Data System (ADS)
Sun, Kai; Luke McCormack, M.; Li, Le; Ma, Zeqing; Guo, Dali
2016-01-01
Roots of perennial plants have both a persistent portion and fast-cycling units represented by different levels of branching. In woody species, the distal nonwoody branch orders as a unit are born and die together relatively rapidly (within 1-2 years). However, whether fast-cycling units also exist in perennial herbs is unknown. We monitored root demography of seven perennial herbs over two years in a cold temperate ecosystem, classifying the largest roots on the root collar or rhizome as basal roots, and the associated finer laterals as secondary, tertiary and quaternary roots. Parallel to woody plants, in which distal root orders form a fast-cycling module, the basal root and its finer laterals also represent a fast-cycling module in herbaceous plants. Within this module, basal roots had a lifespan of 0.5-2 years and represented 62-87% of total root biomass, thus dominating annual root turnover (60-81% of the total). Moreover, root traits including root length, tissue density, and biomass were useful predictors of root lifespan. We conclude that both herbaceous and woody plants have fast-cycling modular units, and future studies identifying the fast-cycling module across plant species should allow a better understanding of how root construction and turnover are linked to whole-plant strategies.
Lin, Shiwei; Wu, Ruidong; Hua, Chaolang; Ma, Jianzhong; Wang, Wenli; Yang, Feiling; Wang, Junjun
2016-01-01
Protecting wilderness areas (WAs) is a crucial proactive approach to sustaining biodiversity. However, studies identifying local-scale WAs for on-ground conservation efforts are still very limited. This paper investigated the spatial patterns of wilderness in a global biodiversity hotspot – the Three Parallel Rivers Region (TPRR) in southwest China. Wilderness was classified into levels 1 to 10 based on a cluster analysis of five indicators, namely human population density, naturalness, fragmentation, remoteness, and ruggedness. Only patches characterized by wilderness level 1 and ≥1.0 km2 were considered WAs. The wilderness levels in the northwest were significantly higher than those in the southeast, and clearly increased with elevation. The WAs covered approximately 25% of TPRR’s land, 89.3% of which was located in the >3,000 m elevation zones. WAs consisted of 20 vegetation types, among which temperate conifer forest, cold temperate shrub and alpine ecosystems covered 79.4% of the WAs’ total area. Most WAs are still not protected by existing reserves. Topography and human activities are the primary factors influencing the spatial patterns of wilderness. We suggest establishing strictly protected reserves for most large WAs, while some sustainable management approaches might be more optimal solutions for many highly fragmented small WAs. PMID:27181186
Fast I/O for Massively Parallel Applications
NASA Technical Reports Server (NTRS)
OKeefe, Matthew T.
1996-01-01
The two primary goals for this report were the design, construction and modeling of parallel disk arrays for scientific visualization and animation, and a study of the I/O requirements of highly parallel applications. In addition, further work in parallel display systems was required to project and animate the very high-resolution frames resulting from our supercomputing simulations in ocean circulation and compressible gas dynamics.
ERIC Educational Resources Information Center
Gil, Arturo; Peidró, Adrián; Reinoso, Óscar; Marín, José María
2017-01-01
This paper presents a tool, LABEL, oriented to the teaching of parallel robotics. The application, organized as a set of tools developed using Easy Java Simulations, enables the study of the kinematics of parallel robotics. A set of classical parallel structures was implemented such that LABEL can solve the inverse and direct kinematic problem of…
The Parallel System for Integrating Impact Models and Sectors (pSIMS)
NASA Technical Reports Server (NTRS)
Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian
2014-01-01
We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Zuwei; Zhao, Haibo, E-mail: klinsmannzhb@163.com; Zheng, Chuguang
2015-01-15
This paper proposes a comprehensive framework for accelerating population balance-Monte Carlo (PBMC) simulation of particle coagulation dynamics. By combining a Markov jump model, a weighted majorant kernel and GPU (graphics processing unit) parallel computing, a significant gain in computational efficiency is achieved. The Markov jump model constructs a coagulation-rule matrix of differentially-weighted simulation particles, so as to capture the time evolution of the particle size distribution with low statistical noise over the full size range and to reduce the number of time loops as far as possible. Here three coagulation rules are highlighted and it is found that constructing an appropriate coagulation rule provides a route to a compromise between the accuracy and cost of PBMC methods. Further, in order to avoid double looping over all simulation particles when considering two-particle events (typically, particle coagulation), the weighted majorant kernel is introduced to estimate the maximum coagulation rates used for acceptance-rejection processes by a single loop over all particles; meanwhile the mean time-step of a coagulation event is estimated by summing the coagulation kernels of rejected and accepted particle pairs. The computational load of these fast differentially-weighted PBMC simulations (based on the Markov jump model) is greatly reduced, becoming proportional to the number of simulation particles in a zero-dimensional system (single cell). Finally, for a spatially inhomogeneous multi-dimensional (multi-cell) simulation, the proposed fast PBMC is performed in each cell, and multiple cells are processed in parallel by the many cores of a GPU, which can execute massively threaded data-parallel tasks to obtain a remarkable speedup (compared with CPU computation, the speedup of GPU parallel computing is as high as 200 in a case of 100 cells with 10 000 simulation particles per cell). These accelerating approaches of PBMC are demonstrated in a physically realistic Brownian coagulation case. The computational accuracy is validated against a benchmark solution of the discrete-sectional method. The simulation results show that the comprehensive approach can attain very favorable improvement in cost without sacrificing computational accuracy.
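The acceptance-rejection step that the majorant kernel enables can be sketched as follows (a minimal sketch: the sum kernel and its majorant here are stand-in examples, not the paper's kernels, and all names are illustrative):

```cpp
#include <algorithm>
#include <cstdio>
#include <random>
#include <utility>
#include <vector>

// Stand-in kernels: K is the "true" coagulation kernel, K_hat a cheap
// majorant with K_hat >= K for all positive particle volumes.
double K(double v1, double v2)     { return v1 + v2; }
double K_hat(double v1, double v2) { return 2.0 * std::max(v1, v2); }

// Draw candidate pairs under the majorant; accept with probability K/K_hat.
std::pair<int, int> select_pair(const std::vector<double>& v, std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, static_cast<int>(v.size()) - 1);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    while (true) {
        int i = pick(rng), j = pick(rng);
        if (i == j) continue;
        // Rejected pairs would also feed the mean time-step estimate
        // described above; only the acceptance test is shown here.
        if (unif(rng) < K(v[i], v[j]) / K_hat(v[i], v[j])) return {i, j};
    }
}

int main() {
    std::mt19937 rng(1);
    std::vector<double> volumes = {1.0, 2.0, 4.0, 8.0};
    auto [i, j] = select_pair(volumes, rng);
    std::printf("coagulate particles %d and %d\n", i, j);
}
```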
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
High-performance computational fluid dynamics: a custom-code approach
NASA Astrophysics Data System (ADS)
Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.
2016-07-01
We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.
Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S; Seal, Sudip K
2011-01-01
In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene / P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes exceeding several hundreds of millions of individuals in the largest cases are successfully exercised to verify model scalability.
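Reverse computation, as opposed to full state saving, can be illustrated with a toy event handler: the forward action is paired with an inverse that exactly undoes it during rollback. This is a hypothetical sketch of the idea, not the authors' model code:

```cpp
// Toy reverse computation for optimistic rollback in a reaction-diffusion
// epidemic model: the reverse handler exactly undoes the forward event, so
// no state snapshot of the cell is needed. Events that destroy information
// (e.g., overwriting a value) are the ones that still need the small amount
// of incremental state saving mentioned above.
struct Cell { int susceptible; int infected; };

inline void infect(Cell& c)      { --c.susceptible; ++c.infected; }  // forward
inline void undo_infect(Cell& c) { ++c.susceptible; --c.infected; }  // rollback
```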
NASA Astrophysics Data System (ADS)
Leamy, Michael J.; Springer, Adam C.
In this research we report a parallel implementation of a Cellular Automata-based simulation tool for computing elastodynamic response on complex, two-dimensional domains. Elastodynamic simulation using Cellular Automata (CA) has recently been presented as an alternative, inherently object-oriented technique for accurately and efficiently computing linear and nonlinear wave propagation in arbitrarily-shaped geometries. The local, autonomous nature of the method should lead to straightforward and efficient parallelization. We address this notion on symmetric multiprocessor (SMP) hardware using a Java-based object-oriented CA code implementing triangular state machines (i.e., automata) and the MPI bindings written in Java (MPJ Express). We use MPJ Express to reconfigure our existing CA code to distribute a domain's automata to cores present on a dual quad-core shared-memory system (eight total processors). We note that this message passing parallelization strategy is directly applicable to clustered computing, which will be the focus of follow-on research. Results on the shared memory platform indicate nearly-ideal, linear speed-up. We conclude that the CA-based elastodynamic simulator is easily configured to run in parallel, and yields excellent speed-up on SMP hardware.
NASA Astrophysics Data System (ADS)
Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu
2015-07-01
The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT, a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed the high-performance computing code THC-MP for massively parallel computers, which greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, and implemented the data initialization and exchange between the computing nodes and the core solving module using hybrid parallel iterative and direct solvers. Numerical accuracy of THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing and sequential computing (the original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results successfully demonstrate the enhanced performance of THC-MP on parallel computing facilities.
Parallel distributed, reciprocal Monte Carlo radiation in coupled, large eddy combustion simulations
NASA Astrophysics Data System (ADS)
Hunsaker, Isaac L.
Radiation is the dominant mode of heat transfer in high temperature combustion environments. Radiative heat transfer affects the gas and particle phases, including all the associated combustion chemistry. The radiative properties are in turn affected by the turbulent flow field. This bi-directional coupling of radiation-turbulence interactions poses a major challenge in creating parallel-capable, high-fidelity combustion simulations. In this work, a new model was developed in which reciprocal Monte Carlo radiation was coupled with a turbulent, large-eddy simulation combustion model. A technique wherein domain patches are stitched together was implemented to allow for scalable parallelism. The combustion model runs in parallel on a decomposed domain. The radiation model runs in parallel on a recomposed domain. The recomposed domain is stored on each processor after information sharing of the decomposed domain is handled via the message passing interface. Verification and validation testing of the new radiation model were favorable. Strong scaling analyses were performed on the Ember cluster and the Titan cluster for the CPU-radiation model and GPU-radiation model, respectively. The model demonstrated strong scaling to over 1,700 and 16,000 processing cores on Ember and Titan, respectively.
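The decompose/recompose pattern can be sketched with a single MPI collective (a minimal sketch assuming equal-sized slabs per rank; the patch-stitching described above generalizes this):

```cpp
// Each rank contributes its slab of the field and receives a full copy of
// the domain for the Monte Carlo radiation solve.
#include <mpi.h>
#include <vector>

std::vector<double> recompose(const std::vector<double>& local, MPI_Comm comm) {
    int nranks;
    MPI_Comm_size(comm, &nranks);
    const int n = static_cast<int>(local.size());
    std::vector<double> global(static_cast<std::size_t>(n) * nranks);
    // Collective all-gather: afterwards every processor holds the whole domain.
    MPI_Allgather(local.data(), n, MPI_DOUBLE,
                  global.data(), n, MPI_DOUBLE, comm);
    return global;
}
```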
Predictive study on the risk of malaria spreading due to global warming
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ono, Masaji
Global warming will bring about temperature elevation, and the habitats of vectors of infectious diseases, such as malaria and dengue fever, will spread into subtropical and temperate zones. The purpose of this study is to simulate the spreading of these diseases through reexamination of existing data and collection of additional information by field survey. From these data, the author will establish the relationship between meteorological conditions, vector density and malaria occurrence, and then simulate and predict malaria epidemics in the case of temperature elevation in Southeast Asia and Japan.
W. Wang; J. Xiao; S. V. Ollinger; J. Chen; A. Noormets
2014-01-01
Stand-replacing disturbances including harvests have substantial impacts on forest carbon (C) fluxes and stocks. The quantification and simulation of these effects is essential for better understanding forest C dynamics and informing forest management in the context of global change. We evaluated the process-based forest ecosystem model, PnET-CN, for how well and by...
NASA Astrophysics Data System (ADS)
Baregheh, Mandana; Mezentsev, Vladimir; Schmitz, Holger
2011-06-01
We describe a parallel multi-threaded approach for high performance modelling of a wide class of phenomena in ultrafast nonlinear optics. A specific implementation has been performed using the highly parallel capabilities of a programmable graphics processor.
Dib, Alain E; Johnson, Chris E; Driscoll, Charles T; Fahey, Timothy J; Hayhoe, Katharine
2014-05-01
Carbon (C) sequestration in forest biomass and soils may help decrease regional C footprints and mitigate future climate change. The efficacy of these practices must be verified by monitoring and by approved calculation methods (i.e., models) to be credible in C markets. Two widely used soil organic matter models - CENTURY and RothC - were used to project changes in SOC pools after clear-cutting disturbance, as well as under a range of future climate and atmospheric carbon dioxide (CO2) scenarios. Data from the temperate, predominantly deciduous Hubbard Brook Experimental Forest (HBEF) in New Hampshire, USA, were used to parameterize and validate the models. Clear-cutting simulations demonstrated that both models can effectively simulate soil C dynamics in the northern hardwood forest when adequately parameterized. The minimum postharvest SOC predicted by RothC occurred in postharvest year 14 and was within 1.5% of the observed minimum, which occurred in year 8. CENTURY predicted the postharvest minimum SOC to occur in year 45, at a value 6.9% greater than the observed minimum; the slow response of both models to disturbance suggests that they may overestimate the time required to reach new steady-state conditions. Four climate change scenarios were used to simulate future changes in SOC pools. Climate-change simulations predicted increases in SOC by as much as 7% at the end of this century, partially offsetting future CO2 emissions. This sequestration was the product of enhanced forest productivity, and associated litter input to the soil, due to increased temperature, precipitation and CO2. The simulations also suggested that considerable losses of SOC (8-30%) could occur if forest vegetation at HBEF does not respond to changes in climate and CO2 levels. Therefore, the source/sink behavior of temperate forest soils likely depends on the degree to which forest growth is stimulated by new climate and CO2 conditions. © 2013 John Wiley & Sons Ltd.
Persistent random walk of cells involving anomalous effects and random death
NASA Astrophysics Data System (ADS)
Fedotov, Sergei; Tan, Abby; Zubarev, Andrey
2015-04-01
The purpose of this paper is to implement a random death process into a persistent random walk model which produces sub-ballistic superdiffusion (Lévy walk). We develop a stochastic two-velocity jump model of cell motility for which the switching rate depends upon the time which the cell has spent moving in one direction. It is assumed that the switching rate is a decreasing function of residence (running) time. This assumption leads to the power law for the velocity switching time distribution. This describes the anomalous persistence of cell motility: the longer the cell moves in one direction, the smaller the switching probability to another direction becomes. We derive master equations for the cell densities with the generalized switching terms involving the tempered fractional material derivatives. We show that the random death of cells has an important implication for the transport process through tempering of the superdiffusive process. In the long-time limit we write stationary master equations in terms of exponentially truncated fractional derivatives in which the rate of death plays the role of tempering of a Lévy jump distribution. We find the upper and lower bounds for the stationary profiles corresponding to the ballistic transport and diffusion with the death-rate-dependent diffusion coefficient. Monte Carlo simulations confirm these bounds.
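A minimal Monte Carlo sketch of this two-velocity model follows; the specific Pareto running-time law, the death rate, and all parameter values are illustrative assumptions chosen to give a decreasing switching rate (1 < mu < 2), not the paper's settings:

```cpp
// Two-velocity persistent random walk with power-law (Pareto-tailed) running
// times and exponential random death. Each particle reverses velocity after
// each run; displacement is tracked up to min(T, death time).
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::exponential_distribution<double> death(0.1);   // death rate (assumed)
    const double mu = 1.5, t0 = 1.0, v = 1.0, T = 1e3;  // 1 < mu < 2: superdiffusive
    const int N = 100000;
    double msd = 0.0;
    for (int k = 0; k < N; ++k) {
        double t = 0.0, x = 0.0, dir = (u(rng) < 0.5) ? -1.0 : 1.0;
        const double t_death = death(rng);
        while (t < T && t < t_death) {
            // Pareto run time: survival P(tau > s) = (t0/(t0+s))^mu,
            // i.e., the switching rate decreases with running time.
            double tau = t0 * (std::pow(u(rng), -1.0 / mu) - 1.0);
            double dt = std::min(tau, std::min(T, t_death) - t);
            x += dir * v * dt;
            t += dt;
            dir = -dir;                                 // velocity reversal
        }
        msd += x * x / N;
    }
    std::printf("mean squared displacement at min(T, death): %g\n", msd);
    return 0;
}
```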
Ozone-induced stomatal sluggishness changes carbon and water balance of temperate deciduous forests.
Hoshika, Yasutomo; Katata, Genki; Deushi, Makoto; Watanabe, Makoto; Koike, Takayoshi; Paoletti, Elena
2015-05-06
Tropospheric ozone concentrations have increased by 60-100% in the Northern Hemisphere since the 19th century. The phytotoxic nature of ozone can impair forest productivity. In addition, ozone affects stomatal functions, by both favoring stomatal closure and impairing stomatal control. Ozone-induced stomatal sluggishness, i.e., a delay in stomatal responses to fluctuating stimuli, has the potential to change the carbon and water balance of forests. This effect has to be included in models for ozone risk assessment. Here we examine the effects of ozone-induced stomatal sluggishness on carbon assimilation and transpiration of temperate deciduous forests in the Northern Hemisphere in 2006-2009 by combining a detailed multi-layer land surface model and a global atmospheric chemistry model. An analysis of results from ozone FACE (Free-Air Controlled Exposure) experiments suggested that ozone-induced stomatal sluggishness can be incorporated into modelling based on a simple parameter (gmin, minimum stomatal conductance) which is used in the coupled photosynthesis-stomatal model. Our simulation showed that ozone can decrease the water use efficiency, i.e., the ratio of net CO2 assimilation to transpiration, of temperate deciduous forests by up to 20% when ozone-induced stomatal sluggishness is considered, and by only up to 5% when the stomatal sluggishness is neglected.
NASA Astrophysics Data System (ADS)
Moon, Joonoh; Lee, Chang-Hoon; Lee, Tae-Ho; Kim, Hyoung Chan
2015-01-01
The phase transformation and mechanical properties in the weld heat-affected zone (HAZ) of a reduced activation ferritic/martensitic steel were explored. The samples for HAZs were prepared using a Gleeble simulator at different heat inputs. The base steel consisted of tempered martensite and carbides through quenching and tempering treatment, whereas the HAZs consisted of martensite, δ-ferrite, and a small volume of autotempered martensite. The prior austenite grain size, lath width of martensite, and δ-ferrite fraction in the HAZs increased with increasing heat input. The mechanical properties were evaluated using Vickers hardness and Charpy V-notch impact tests. The Vickers hardness in the HAZs was higher than that in the base steel but did not change noticeably with increasing heat input. The HAZs showed poorer impact properties than the base steel due to the formation of martensite and δ-ferrite. In addition, the impact property of the HAZs deteriorated further with increasing heat input. Post weld heat treatment contributed to improving the impact property of the HAZs through the formation of tempered martensite, but the impact property of the HAZs remained lower than that of the base steel.
The Change in the area of various land covers on the Tibetan Plateau during 1957-2015
NASA Astrophysics Data System (ADS)
Cuo, Lan; Zhang, Yongxin
2017-04-01
With an average elevation of 4000 m and an area of 2.5×10^6 km2, the Tibetan Plateau hosts various fragile ecosystems such as perennial alpine meadow, perennial alpine steppe, temperate evergreen needleleaf trees, temperate deciduous trees, temperate shrub grassland, and barely vegetated desert. Perennial alpine meadow and steppe are the two dominant vegetation types in the heartland of the plateau. MODIS Leaf Area Index (LAI) ranges from 0 to 2 in most parts of the plateau. With climate change, these ecosystems are expected to undergo alteration. This study uses a dynamic vegetation model - Lund-Potsdam-Jena (LPJ) - to investigate the change of the barely vegetated area and other vegetation types caused by climate change during 1957-2015 on the Tibetan Plateau. Model-simulated foliage projective coverage (FPC) and plant functional types (PFTs) are selected for the investigation. The model is first evaluated using both a field-surveyed land cover map and MODIS LAI images. Long-term trends of vegetation FPC are examined. Decadal variations of vegetated and barely vegetated land are compared. The impacts of extreme precipitation, air temperature and CO2 on the expansion and contraction of barely vegetated and vegetated areas are shown. The study will identify the dominant climate factors affecting the desert area in the region.
cellGPU: Massively parallel simulations of dynamic vertex models
NASA Astrophysics Data System (ADS)
Sussman, Daniel M.
2017-10-01
Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations.
Program files doi: http://dx.doi.org/10.17632/6j2cj29t3r.1
Licensing provisions: MIT
Programming language: CUDA/C++
Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate.
Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU.
Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation
NASA Astrophysics Data System (ADS)
Slaughter, A. E.; Permann, C.; Peterson, J. W.; Gaston, D.; Andrs, D.; Miller, J.
2014-12-01
The Idaho National Laboratory (INL)-developed Multiphysics Object Oriented Simulation Environment (MOOSE; www.mooseframework.org), is an open-source, parallel computational framework for enabling the solution of complex, fully implicit multiphysics systems. MOOSE provides a set of computational tools that scientists and engineers can use to create sophisticated multiphysics simulations. Applications built using MOOSE have computed solutions for chemical reaction and transport equations, computational fluid dynamics, solid mechanics, heat conduction, mesoscale materials modeling, geomechanics, and others. To facilitate the coupling of diverse and highly-coupled physical systems, MOOSE employs the Jacobian-free Newton-Krylov (JFNK) method when solving the coupled nonlinear systems of equations arising in multiphysics applications. The MOOSE framework is written in C++, and leverages other high-quality, open-source scientific software packages such as LibMesh, Hypre, and PETSc. MOOSE uses a "hybrid parallel" model which combines both shared memory (thread-based) and distributed memory (MPI-based) parallelism to ensure efficient resource utilization on a wide range of computational hardware. MOOSE-based applications are inherently modular, which allows for simulation expansion (via coupling of additional physics modules) and the creation of multi-scale simulations. Any application developed with MOOSE supports running (in parallel) any other MOOSE-based application. Each application can be developed independently, yet easily communicate with other applications (e.g., conductivity in a slope-scale model could be a constant input, or a complete phase-field micro-structure simulation) without additional code being written. This method of development has proven effective at INL and expedites the development of sophisticated, sustainable, and collaborative simulation tools.
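The JFNK idea the abstract relies on can be summarized in one identity: a Krylov solver needs only Jacobian-vector products, which can be approximated by finite-differencing the nonlinear residual, J(u)v ≈ (F(u + εv) − F(u))/ε, so the Jacobian is never assembled. A generic sketch follows (not MOOSE or PETSc code; the ε heuristic is one common choice):

```cpp
#include <cmath>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Matrix-free Jacobian-vector product: J(u)*v ~= (F(u + eps*v) - F(u)) / eps.
// Fu = F(u) is passed in so the unperturbed residual is evaluated only once
// per Newton iteration, not once per Krylov iteration.
Vec jacobian_vector_product(const std::function<Vec(const Vec&)>& F,
                            const Vec& u, const Vec& v, const Vec& Fu) {
    double vnorm = 0.0;
    for (double vi : v) vnorm += vi * vi;
    vnorm = std::sqrt(vnorm);
    // Scale the perturbation by sqrt(machine epsilon) / ||v||.
    const double eps = (vnorm > 0.0) ? std::sqrt(1e-16) / vnorm : 1.0;
    Vec up(u.size());
    for (std::size_t i = 0; i < u.size(); ++i) up[i] = u[i] + eps * v[i];
    const Vec Fp = F(up);
    Vec Jv(u.size());
    for (std::size_t i = 0; i < u.size(); ++i) Jv[i] = (Fp[i] - Fu[i]) / eps;
    return Jv;
}
```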
NASA Astrophysics Data System (ADS)
Wan, Yuhong; Man, Tianlong; Wu, Fan; Kim, Myung K.; Wang, Dayong
2016-11-01
We present a new self-interference digital holographic approach that allows single-shot capture of the three-dimensional intensity distribution of spatially incoherent objects. Fresnel incoherent correlation holographic microscopy is combined with a parallel phase-shifting technique to instantaneously obtain spatially multiplexed phase-shifting holograms. A compressive-sensing-based reconstruction algorithm is implemented to reconstruct the original object from the undersampled demultiplexed holograms. The scheme is verified with simulations. The validity of the proposed method is demonstrated experimentally, in an indirect way, by simulating the use of a specific parallel phase-shifting recording device.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Gatski, Thomas B.
1997-01-01
A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
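For context on why tridiagonal systems are awkward to parallelize: the standard serial Thomas algorithm below has forward and backward recurrences in which each step depends on the previous one, which is exactly the serial bottleneck the algorithmic innovations mentioned above are designed to sidestep (a textbook sketch, not the paper's solver):

```cpp
#include <vector>

// Thomas algorithm for a tridiagonal system: a = sub-diagonal (a[0] unused),
// b = main diagonal, c = super-diagonal (c[n-1] unused), d = right-hand side
// on entry and solution on exit. b and d are modified in place.
void thomas(std::vector<double>& a, std::vector<double>& b,
            std::vector<double>& c, std::vector<double>& d) {
    const std::size_t n = d.size();
    for (std::size_t i = 1; i < n; ++i) {     // forward elimination (sequential)
        const double m = a[i] / b[i - 1];
        b[i] -= m * c[i - 1];
        d[i] -= m * d[i - 1];
    }
    d[n - 1] /= b[n - 1];
    for (std::size_t i = n - 1; i-- > 0; )    // back substitution (sequential)
        d[i] = (d[i] - c[i] * d[i + 1]) / b[i];
}
```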
Performance issues for domain-oriented time-driven distributed simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1987-01-01
It has long been recognized that simulations form an interesting and important class of computations that may benefit from distributed or parallel processing. Since the point of parallel processing is improved performance, the recent proliferation of multiprocessors requires that we consider the performance issues that naturally arise when attempting to implement a distributed simulation. Three such issues are: (1) the problem of mapping the simulation onto the architecture, (2) the possibilities for performing redundant computation in order to reduce communication, and (3) the avoidance of deadlock due to distributed contention for message-buffer space. These issues are discussed in the context of a battlefield simulation implemented on a medium-scale multiprocessor message-passing architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faduska, A.; Rau, E.; Alger, J.V.
Data are given on the corrosion properties of type 410 stainless steel tempered at 1150°F. Control mechanism-drive motor tubes and some outer housings are constructed of 650°F-tempered type 410 stainless steel. Since the stress corrosion resistance of type 410 in the 1150°F-tempered condition is superior, the use of the 1150°F-tempered material is more desirable for this application. The properties of 410 stainless steel hardened and tempered at 1150°F are given. (W.L.H.)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiong, Yi; Fakcharoenphol, Perapon; Wang, Shihao
2013-12-01
TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporated into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.
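As an illustration of the kind of empirical correlation referred to above, one commonly used pairing makes porosity an exponential function of effective stress and permeability an exponential function of porosity change; whether these are the exact correlations in TOUGH2-EGS-MP is an assumption here, and all names are illustrative:

```cpp
#include <cmath>

// Porosity as a function of mean effective stress (Pa): phi relaxes from a
// zero-stress value phi0 toward a high-stress residual phi_r; c sets the
// stress sensitivity.
double porosity(double phi_r, double phi0, double c, double sigma_eff) {
    return phi_r + (phi0 - phi_r) * std::exp(-c * sigma_eff);
}

// Permeability updated from the porosity change; gamma controls how strongly
// permeability responds to compaction or dilation.
double permeability(double k0, double gamma, double phi, double phi0) {
    return k0 * std::exp(gamma * (phi / phi0 - 1.0));
}
```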
Modularized Parallel Neutron Instrument Simulation on the TeraGrid
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Meili; Cobb, John W; Hagen, Mark E
2007-01-01
In order to build a bridge between the TeraGrid (TG), a national scale cyberinfrastructure resource, and neutron science, the Neutron Science TeraGrid Gateway (NSTG) is focused on introducing productive HPC usage to the neutron science community, primarily the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). Monte Carlo simulations are used as a powerful tool for instrument design and optimization at SNS. One of the successful efforts of a collaboration team composed of NSTG HPC experts and SNS instrument scientists is the development of a software facility named PSoNI, Parallelizing Simulations of Neutron Instruments. Parallelizing the traditional serial instrument simulation on TeraGrid resources, PSoNI quickly computes full instrument simulations at sufficient statistical levels for instrument design. Following the successful commissioning of the SNS, three of the five commissioned instruments in the SNS target station will be available to initial users by the end of 2007. Advanced instrument study, proposal feasibility evaluation, and experiment planning are on the immediate schedule of SNS, which poses further requirements such as flexibility and high runtime efficiency on fast instrument simulation. PSoNI has been redesigned to meet these new challenges and a preliminary version has been developed on TeraGrid. This paper explores the motivation and goals of the new design, and the improved software structure. Further, it describes the realized new features as seen from MPI-parallelized McStas running high resolution design simulations of the SEQUOIA and BSS instruments at SNS. A discussion regarding future work, which targets fast simulation for automated experiment adjustment and comparison of models to data in analysis, is also presented.
Parallel Evolution of Cold Tolerance within Drosophila melanogaster
Braun, Dylan T.; Lack, Justin B.
2017-01-01
Drosophila melanogaster originated in tropical Africa before expanding into strikingly different temperate climates in Eurasia and beyond. Here, we find elevated cold tolerance in three distinct geographic regions: beyond the well-studied non-African case, we show that populations from the highlands of Ethiopia and South Africa have significantly increased cold tolerance as well. We observe greater cold tolerance in outbred versus inbred flies, but only in populations with higher inversion frequencies. Each cold-adapted population shows lower inversion frequencies than a closely related warm-adapted population, suggesting that inversion frequencies may decrease with altitude in addition to latitude. Using the FST-based “Population Branch Excess” statistic (PBE), we found only limited evidence for parallel genetic differentiation at the scale of ∼4 kb windows, specifically between Ethiopian and South African cold-adapted populations. And yet, when we looked for single nucleotide polymorphisms (SNPs) with codirectional frequency change in two or three cold-adapted populations, strong genomic enrichments were observed from all comparisons. These findings could reflect an important role for selection on standing genetic variation leading to “soft sweeps”. One SNP showed sufficient codirectional frequency change in all cold-adapted populations to achieve experiment-wide significance: an intronic variant in the synaptic gene Prosap. Another codirectional outlier SNP, at senseless-2, had a strong association with our cold trait measurements, but in the direction opposite to that predicted. More generally, proteins involved in neurotransmission were enriched as potential targets of parallel adaptation. The ability to study cold tolerance evolution in a parallel framework will enhance this classic study system for climate adaptation. PMID:27777283
Texture and Tempered Condition Combined Effects on Fatigue Behavior in an Al-Cu-Li Alloy
NASA Astrophysics Data System (ADS)
Wang, An; Liu, Zhiyi; Liu, Meng; Wu, Wenting; Bai, Song; Yang, Rongxian
2017-05-01
Texture and tempered-condition combined effects on fatigue behavior in an Al-Cu-Li alloy have been investigated using tensile testing, cyclic loading testing, scanning electron microscopy (SEM), transmission electron microscopy (TEM) and texture analysis. Results showed that in the near-threshold region, T4-tempered samples possessed the lowest fatigue crack propagation (FCP) rate. In the Paris regime, the T4-tempered sample had an FCP rate similar to that of the T6-tempered sample. The T83-tempered sample exhibited the greatest FCP rate among the three tempered conditions. 3% pre-stretching in the T83-tempered sample reduced the intensity of the Goss texture and facilitated T1 precipitation. SEM results showed that less crack deflection was observed in the T83-tempered sample compared to the other two tempered samples. This behavior was attributed to the combined effects of a lower Goss texture intensity and of T1 precipitates retarding reversible dislocation slip in the plastic zone ahead of the crack tip.
Wakschlag, Lauren S.; Choi, Seung W.; Carter, Alice S.; Hullsiek, Heide; Burns, James; McCarthy, Kimberly; Leibenluft, Ellen; Briggs-Gowan, Margaret J.
2013-01-01
Background Temper modulation problems are both a hallmark of early childhood and a common mental health concern. Thus, characterizing specific behavioral manifestations of temper loss along a dimension from normative misbehaviors to clinically significant problems is an important step toward identifying clinical thresholds. Methods Parent-reported patterns of temper loss were delineated in a diverse community sample of preschoolers (n = 1,490). A developmentally sensitive questionnaire, the Multidimensional Assessment of Preschool Disruptive Behavior (MAP-DB), was used to assess temper loss in terms of tantrum features and anger regulation. Specific aims were: (a) document the normative distribution of temper loss in preschoolers from normative misbehaviors to clinically concerning temper loss behaviors, and test for sociodemographic differences; (b) use Item Response Theory (IRT) to model a Temper Loss dimension; and (c) examine associations of temper loss and concurrent emotional and behavioral problems. Results Across sociodemographic subgroups, a unidimensional Temper Loss model fit the data well. Nearly all (83.7%) preschoolers had tantrums sometimes but only 8.6% had daily tantrums. Normative misbehaviors occurred more frequently than clinically concerning temper loss behaviors. Milder behaviors tended to reflect frustration in expectable contexts, whereas clinically concerning problem indicators were unpredictable, prolonged, and/or destructive. In multivariate models, Temper Loss was associated with emotional and behavioral problems. Conclusions Parent reports on a developmentally informed questionnaire, administered to a large and diverse sample, distinguished normative and problematic manifestations of preschool temper loss. A developmental, dimensional approach shows promise for elucidating the boundaries between normative early childhood temper loss and emergent psychopathology. PMID:22928674
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro
2016-08-01
We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which is necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions, which are necessary for efficient parallel execution of particle-based simulations, as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak-scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10^7) to 300 ms (N = 10^9). These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
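FDPS's central abstraction, as described above, is the separation between user-supplied physics and framework-owned parallel machinery. A schematic (non-FDPS) Python rendering of that division of labor, with a brute-force O(N^2) loop standing in for the tree traversal and MPI exchange that the real framework provides:

    import numpy as np

    def framework_calc_force(pos, mass, pair_kernel):
        """Framework side: owns the interaction loop (here brute-force O(N^2);
        FDPS would substitute a tree or neighbor search plus MPI exchange)."""
        n = len(pos)
        force = np.zeros_like(pos)
        for i in range(n):
            for j in range(n):
                if i != j:
                    force[i] += pair_kernel(pos[i], pos[j], mass[j])
        return force

    def gravity_kernel(xi, xj, mj, eps=1e-3):
        """User side: only the physics of one pair interaction."""
        d = xj - xi
        r2 = d @ d + eps**2
        return mj * d / r2**1.5

    pos = np.random.rand(64, 3)
    mass = np.ones(64) / 64
    f = framework_calc_force(pos, mass, gravity_kernel)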
Specification and Analysis of Parallel Machine Architecture
1990-03-17
Parallel Machine Architecture. C.V. Ramamoorthy, Computer Science Division, Dept. of Electrical Engineering and Computer Science, University of California...capacity. (4) Adaptive: the overhead in resolution of deadlocks, etc. should be in proportion to their frequency. (5) Avoid rollbacks: rollbacks can be...snapshots of system state graphically at a rate proportional to simulation time. Some of the examples are as follows: (1) When the simulation clock of
Argonne Simulation Framework for Intelligent Transportation Systems
DOT National Transportation Integrated Search
1996-01-01
A simulation framework has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed to run on parallel computers and distribu...
Parallel DSMC Solution of Three-Dimensional Flow Over a Finite Flat Plate
NASA Technical Reports Server (NTRS)
Nance, Robert P.; Wilmoth, Richard G.; Moon, Bongki; Hassan, H. A.; Saltz, Joel
1994-01-01
This paper describes a parallel implementation of the direct simulation Monte Carlo (DSMC) method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a good load balance. Performance tests are conducted using the code to evaluate various remapping and remapping-interval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen flow over a finite-thickness flat plate. It is shown that the parallel algorithm produces results which compare well with experimental data. Moreover, it yields significantly faster execution times than the scalar code, as well as very good load-balance characteristics.
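One-dimensional chain partitioning, the policy found to work best above, assigns contiguous runs of cells to processors so that the largest per-processor workload stays near the ideal value total/nproc. A minimal greedy sketch (the paper's actual policy may differ; an optimal variant would binary-search the bottleneck value, and the cell costs here are hypothetical):

    def chain_partition(costs, nproc):
        """Cut a 1-D array of per-cell workloads into nproc contiguous chains.
        Greedy: close a chain once it reaches the ideal load, while leaving
        enough cells for the remaining chains (assumes len(costs) >= nproc)."""
        target = sum(costs) / nproc
        chains, current, acc = [], [], 0.0
        for i, c in enumerate(costs):
            current.append(i)
            acc += c
            chains_left = nproc - len(chains)     # chains still to emit
            cells_left = len(costs) - i - 1
            if chains_left > 1 and (acc >= target or cells_left == chains_left - 1):
                chains.append(current)
                current, acc = [], 0.0
        chains.append(current)
        return chains

    # e.g. loads 12, 7, 5 on three processors:
    print(chain_partition([4, 1, 1, 6, 2, 2, 3, 5], nproc=3))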
STOCHSIMGPU: parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB.
Klingbeil, Guido; Erban, Radek; Giles, Mike; Maini, Philip K
2011-04-15
The importance of stochasticity in biological systems is becoming increasingly recognized and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool STOCHSIMGPU that exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems and show that significant gains in efficiency can be made. It is integrated into MATLAB and works with the Systems Biology Toolbox 2 (SBTOOLBOX2) for MATLAB. The GPU-based parallel implementation of the Gillespie stochastic simulation algorithm (SSA), the logarithmic direct method (LDM) and the next reaction method (NRM) is approximately 85 times faster than the sequential implementation of the NRM on a central processing unit (CPU). Using our software does not require any changes to the user's models, since it acts as a direct replacement of the stochastic simulation software of the SBTOOLBOX2. The software is open source under the GPL v3 and available at http://www.maths.ox.ac.uk/cmb/STOCHSIMGPU. The web site also contains supplementary information. Supplementary data are available at Bioinformatics online.
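For reference, the direct-method SSA that such GPU codes parallelize across threads fits in a few lines. This serial sketch (generic, not STOCHSIMGPU code) simulates one trajectory of a toy reversible isomerization A <-> B; a GPU implementation essentially runs many such trajectories concurrently.

    import numpy as np

    def ssa_direct(x0, stoich, rates, t_end, seed=0):
        """Gillespie direct method: one stochastic trajectory."""
        rng = np.random.default_rng(seed)
        x, t = np.array(x0, dtype=float), 0.0
        while t < t_end:
            a = np.array([r(x) for r in rates])   # propensities
            a0 = a.sum()
            if a0 == 0:
                break                             # no reaction can fire
            t += rng.exponential(1.0 / a0)        # time to next reaction
            j = rng.choice(len(a), p=a / a0)      # which reaction fires
            x += stoich[j]
        return x, t

    # A <-> B with rate constants k1 = 1.0, k2 = 0.5
    stoich = np.array([[-1, 1], [1, -1]])
    rates = [lambda x: 1.0 * x[0], lambda x: 0.5 * x[1]]
    print(ssa_direct([100, 0], stoich, rates, t_end=10.0))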
Predicting Flows of Rarefied Gases
NASA Technical Reports Server (NTRS)
LeBeau, Gerald J.; Wilmoth, Richard G.
2005-01-01
DSMC Analysis Code (DAC) is a flexible, highly automated, easy-to-use computer program for predicting flows of rarefied gases -- especially flows of upper-atmospheric, propulsion, and vented gases impinging on spacecraft surfaces. DAC implements the direct simulation Monte Carlo (DSMC) method, which is widely recognized as standard for simulating flows at densities so low that the continuum-based equations of computational fluid dynamics are invalid. DAC enables users to model complex surface shapes and boundary conditions quickly and easily. The discretization of a flow field into computational grids is automated, thereby relieving the user of a traditionally time-consuming task while ensuring (1) appropriate refinement of grids throughout the computational domain, (2) determination of optimal settings for temporal discretization and other simulation parameters, and (3) satisfaction of the fundamental constraints of the method. In so doing, DAC ensures an accurate and efficient simulation. In addition, DAC can utilize parallel processing to reduce computation time. The domain decomposition needed for parallel processing is completely automated, and the software employs a dynamic load-balancing mechanism to ensure optimal parallel efficiency throughout the simulation.
Enabling parallel simulation of large-scale HPC network systems
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; ...
2016-04-07
Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
A method for data handling numerical results in parallel OpenFOAM simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anton, Alin; Muntean, Sebastian
Parallel computational fluid dynamics simulations produce vast amounts of numerical result data. This paper introduces a method for reducing the size of the data by replaying the interprocessor traffic. The results are recovered only in certain regions of interest configured by the user. A known test case is used for several mesh partitioning scenarios using the OpenFOAM® toolkit [1]. The space savings obtained with classic algorithms remain constant for more than 60 GB of floating-point data. Our method is most efficient on large simulation meshes and is much better suited for compressing large-scale simulation results than the regular algorithms.
Wu, Tianmin; Yang, Lijiang; Zhang, Ruiting; Shao, Qiang; Zhuang, Wei
2013-07-25
We simulated the equilibrium isotope-edited FTIR and 2DIR spectra of the β-hairpin peptide trpzip2 at a series of temperatures. The simulation was based on configuration distributions generated using the GB(OBC) implicit solvent model and the integrated tempering sampling (ITS) technique. A soaking procedure was adapted to generate peptide-in-explicit-solvent configurations for the spectroscopy calculations. The nonlinear exciton propagation (NEP) method was then used to calculate the spectra. In agreement with the experiments, the intensities and ellipticities of the isotope-shifted peaks in our simulated signals show site-specific temperature dependences, suggesting inhomogeneous local thermal stabilities along the peptide chain. Our simulation thus provides a cost-effective means to understand a peptide's conformational change and the related IR spectra across its thermal unfolding transition.
Parallel machine architecture and compiler design facilities
NASA Technical Reports Server (NTRS)
Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex
1990-01-01
The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility for rapid prototyping of parallelizing compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.
JETSPIN: A specific-purpose open-source software for simulations of nanofiber electrospinning
NASA Astrophysics Data System (ADS)
Lauricella, Marco; Pontrelli, Giuseppe; Coluzza, Ivan; Pisignano, Dario; Succi, Sauro
2015-12-01
We present the open-source computer program JETSPIN, specifically designed to simulate the electrospinning process of nanofibers. Its capabilities are demonstrated with reference to the underlying model, along with a description of the relevant input variables and associated test-case simulations. The various interactions included in the electrospinning model implemented in JETSPIN are discussed in detail. The code is designed to exploit different computational architectures, from single-processor to parallel workstations. This paper provides an overview of JETSPIN, focusing primarily on its structure, parallel implementations, functionality, performance, and availability.
Xyce™ Parallel Electronic Simulator Reference Guide, Version 6.5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting
2016-06-01
This document is a reference guide to the Xyce Parallel Electronic Simulator and is a companion document to the Xyce Users’ Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial; users who are new to circuit simulation are better served by the Xyce Users’ Guide. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.
Substructured multibody molecular dynamics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grest, Gary Stephen; Stevens, Mark Jackson; Plimpton, Steven James
2006-11-01
We have enhanced our parallel molecular dynamics (MD) simulation software LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator, lammps.sandia.gov) to include many new features for accelerated simulation, including articulated rigid-body dynamics via coupling to the Rensselaer Polytechnic Institute code POEMS (Parallelizable Open-source Efficient Multibody Software). We use new features of the LAMMPS software package to investigate rhodopsin photoisomerization, water-model surface tension, and capillary waves at the vapor-liquid interface. Finally, we motivate the recipes of MD for practitioners and researchers in numerical analysis and computational mechanics.
Biocellion: accelerating computer simulation of multicellular biological system models.
Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya
2014-11-01
Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information.
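The "fill in the function body of pre-defined model routines" pattern is an inversion-of-control API: the framework iterates over cells and grid voxels, decides where and on which processor each piece of work runs, and calls user hooks. A schematic Python analogue (all names below are hypothetical, not Biocellion's actual C++ interface):

    class ModelRoutines:
        """User subclasses this and fills in the bodies; the framework
        decides when, where, and on which processor each hook runs."""
        def update_cell_state(self, cell, neighbors, env):
            raise NotImplementedError
        def update_grid_voxel(self, voxel):
            raise NotImplementedError

    class SortingModel(ModelRoutines):
        def update_cell_state(self, cell, neighbors, env):
            # e.g. differential adhesion: count like-typed neighbors
            like = [n for n in neighbors if n["type"] == cell["type"]]
            cell["adhesion"] = len(like)
        def update_grid_voxel(self, voxel):
            voxel["nutrient"] *= 0.99              # simple decay

    def framework_step(model, cells, voxels, neighbors_of):
        # the real framework would partition cells/voxels across processors
        for c in cells:
            model.update_cell_state(c, neighbors_of(c), None)
        for v in voxels:
            model.update_grid_voxel(v)

    cells = [{"type": "A", "adhesion": 0}, {"type": "A", "adhesion": 0},
             {"type": "B", "adhesion": 0}]
    voxels = [{"nutrient": 1.0}]
    framework_step(SortingModel(), cells, voxels,
                   lambda c: [n for n in cells if n is not c])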
NASA Astrophysics Data System (ADS)
Zhang, Chi; Ren, Wei
2017-09-01
Central Asia covers a large land area of 5 × 10^6 km^2 and hosts unique temperate dryland ecosystems, containing over 80% of the world's temperate deserts; the region has been experiencing dramatic warming and drought in recent decades. How the temperate dryland responds to complex climate change, however, is still far from clear. This study quantitatively investigates terrestrial net primary productivity (NPP) in response to temperature, precipitation, and atmospheric CO2 during 1980-2014, using the Arid Ecosystem Model, which can realistically predict ecosystems' responses to changes in climate and atmospheric CO2 according to model evaluation against 28 field experiments/observations. The simulation results show that, unlike other middle-/high-latitude regions, NPP in central Asia declined by 10% (0.12 × 10^15 g C) since the 1980s in response to a warmer and drier climate. The dryland's response to warming was weak, while its cropland was sensitive to the CO2 fertilization effect (CFE). However, the CFE was inhibited by the long-term drought from 1998 to 2008, and the positive effect of warming on photosynthesis was largely offset by the enhanced water deficit. The complex interactive effects among climate drivers, the distinct responses of diverse ecosystem types, and intensive, heterogeneous climatic changes led to highly complex patterns of NPP change in central Asia, of which 69% was dominated by precipitation variation, with 20% and 9% dominated by CO2 and temperature, respectively. The Turgay Plateau in northern Kazakhstan and southern Xinjiang in China are hot spots of NPP degradation in response to climate change during the past three decades and in the future.
NASA Astrophysics Data System (ADS)
Holm, J. A.; Knox, R. G.; Koven, C.; Riley, W. J.; Bisht, G.; Fisher, R.; Christoffersen, B. O.; Dietze, M.; Chambers, J. Q.
2017-12-01
The inclusion of dynamic vegetation demography in Earth System Models (ESMs) has been identified as a critical step in moving ESMs towards more realistic representations of plant ecology and the processes that govern climatically important fluxes of carbon, energy, and water. Successful application of dynamic vegetation models and process-based approaches to simulate plant demography, succession, and response to disturbances without climate envelopes at the global scale is a challenging endeavor. We integrated demographic processes using the Functionally-Assembled Terrestrial Ecosystem Simulator (FATES) in the newly developed ACME Land Model (ALM). We then use an ALM-FATES globally gridded simulation for the first time to investigate plant functional type (PFT) distributions and dynamic turnover rates. Initial global simulations successfully include six interacting and competing PFTs (ranging from tropical to boreal, evergreen, deciduous, needleleaf and broadleaf); including more PFTs is planned. Global maps of net primary productivity, leaf area index, and total vegetation biomass from ALM-FATES matched patterns and values when compared to CLM4.5-BGC and MODIS estimates. We also present techniques for PFT parameterization based on the Predictive Ecosystem Analyzer (PEcAn), field-based turnover rates, improved PFT groupings based on trait-tradeoffs, and improved representation of multiple canopy positions. Finally, we applied the improved ALM-FATES model at a central Amazon tropical site and at western U.S. temperate sites, demonstrating improvements in predicted PFT size- and age-structure and regional distribution. Results from the Amazon tropical site investigate the ability and magnitude of a tropical forest to act as a carbon sink by 2100 under a doubling of CO2, while results from the temperate sites investigate the response of forest mortality to increasing drought.
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoginath, Srikanth B; Perumalla, Kalyan S
2013-01-01
With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES) over multiple virtual machines, in contrast to executing in native mode directly over hardware as has traditionally been done over the past decades. While mature VM-based parallel systems now offer new, compelling benefits such as serviceability, dynamic reconfigurability and overall cost effectiveness, the runtime performance of parallel applications can be significantly affected. In particular, most VM-based platforms are optimized for general workloads, but PDES execution exhibits unique dynamics significantly different from other workloads. Here we first present results from experiments that highlight the gross deterioration of the runtime performance of VM-based PDES simulations when executed using traditional VM schedulers, quantitatively showing the bad scaling properties of the scheduler as the number of VMs is increased. The mismatch is fundamental in nature in the sense that any fairness-based VM scheduler implementation would exhibit this mismatch with PDES runs. We also present a new scheduler optimized specifically for PDES applications, and describe its design and implementation. Experimental results obtained from running PDES benchmarks (PHOLD and vehicular traffic simulations) over VMs show over an order of magnitude improvement in the run time with the PDES-optimized scheduler relative to the regular VM scheduler, with over a 20× reduction in the run time of simulations using up to 64 VMs. The observations and results are timely in the context of emerging systems such as cloud platforms and VM-based high performance computing installations, highlighting to the community the need for PDES-specific support, and the feasibility of significantly reducing the runtime overhead for scalable PDES on VM platforms.
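PHOLD, one of the benchmarks used above, is easy to state: each logical process (LP) executes its earliest pending event and, in response, schedules a new event at a randomly chosen LP with an exponentially distributed timestamp increment. A sequential sketch of the workload (the parallel Time Warp machinery that makes its scheduling challenging is omitted):

    import heapq, random

    def phold(n_lps=8, init_events_per_lp=4, end_time=100.0, seed=1):
        rng = random.Random(seed)
        fel = []                                  # future event list
        for lp in range(n_lps):
            for _ in range(init_events_per_lp):
                heapq.heappush(fel, (rng.expovariate(1.0), lp))
        processed = 0
        while fel and fel[0][0] < end_time:
            t, lp = heapq.heappop(fel)            # execute earliest event
            processed += 1
            dest = rng.randrange(n_lps)           # forward to a random LP
            heapq.heappush(fel, (t + rng.expovariate(1.0), dest))
        return processed

    print(phold())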
On the relationship between parallel computation and graph embedding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gupta, A.K.
1989-01-01
The problem of efficiently simulating an algorithm designed for an n-processor parallel machine G on an m-processor parallel machine H with n > m arises when parallel algorithms designed for an ideal size machine are simulated on existing machines which are of a fixed size. The author studies this problem when every processor of H takes over the function of a number of processors in G, and he phrases the simulation problem as a graph embedding problem. New embeddings presented address relevant issues arising from the parallel computation environment. The main focus centers around embedding complete binary trees into smaller-sized binary trees, butterflies, and hypercubes. He also considers simultaneous embeddings of r source machines into a single hypercube. Constant factors play a crucial role in his embeddings since they are not only important in practice but also lead to interesting theoretical problems. All of his embeddings minimize dilation and load, which are the conventional cost measures in graph embeddings and determine the maximum amount of time required to simulate one step of G on H. His embeddings also optimize a new cost measure called (α,β)-utilization which characterizes how evenly the processors of H are used by the processors of G. Ideally, the utilization should be balanced (i.e., every processor of H simulates at most (n/m) processors of G) and the (α,β)-utilization measures how far off from a balanced utilization the embedding is. He presents embeddings for the situation when some processors of G have different capabilities (e.g. memory or I/O) than others and the processors with different capabilities are to be distributed uniformly among the processors of H. Placing such conditions on an embedding results in an increase in some of the cost measures.
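Both classical cost measures are straightforward to compute for any concrete embedding: dilation is the longest host-graph distance between the images of adjacent guest vertices, and load is the largest number of guest vertices mapped to a single host vertex. A small self-contained sketch:

    from collections import Counter, deque

    def host_distance(adj, s, t):
        """BFS shortest-path distance in the host graph."""
        dist, frontier = {s: 0}, deque([s])
        while frontier:
            u = frontier.popleft()
            if u == t:
                return dist[u]
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    frontier.append(v)
        raise ValueError("host graph is disconnected")

    def dilation_and_load(guest_edges, host_adj, phi):
        """phi maps guest vertices to host vertices."""
        dilation = max(host_distance(host_adj, phi[u], phi[v])
                       for u, v in guest_edges)
        load = max(Counter(phi.values()).values())
        return dilation, load

    # 4-cycle guest embedded into a 3-vertex path host 0-1-2
    guest = [(0, 1), (1, 2), (2, 3), (3, 0)]
    host = {0: [1], 1: [0, 2], 2: [1]}
    print(dilation_and_load(guest, host, {0: 0, 1: 1, 2: 2, 3: 1}))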
STOCHASTIC INTEGRATION FOR TEMPERED FRACTIONAL BROWNIAN MOTION.
Meerschaert, Mark M; Sabzikar, Farzad
2014-07-01
Tempered fractional Brownian motion is obtained when the power law kernel in the moving average representation of a fractional Brownian motion is multiplied by an exponential tempering factor. This paper develops the theory of stochastic integrals for tempered fractional Brownian motion. Along the way, we develop some basic results on tempered fractional calculus.
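Concretely, the moving-average construction just described can be written out. With $\alpha = H - \tfrac{1}{2}$, tempering parameter $\lambda \ge 0$, and the convention $(x)_+ = \max(x,0)$ (with $0^0 = 0$), tempered fractional Brownian motion is, up to the paper's normalization,

$$B_{\alpha,\lambda}(t) = \int_{-\infty}^{\infty} \Big[ e^{-\lambda (t-x)_+} (t-x)_+^{\alpha} - e^{-\lambda (-x)_+} (-x)_+^{\alpha} \Big]\, B(\mathrm{d}x),$$

where $B(\mathrm{d}x)$ is Gaussian white noise; setting $\lambda = 0$ recovers ordinary fractional Brownian motion with Hurst index $H$. (This restates the abstract's description; constants are fixed only up to normalization.)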
NASA Astrophysics Data System (ADS)
Hofierka, Jaroslav; Lacko, Michal; Zubal, Stanislav
2017-10-01
In this paper, we describe the parallelization of three complex and computationally intensive modules of GRASS GIS using the OpenMP application programming interface for multi-core computers. These include the v.surf.rst module for spatial interpolation, the r.sun module for solar radiation modeling and the r.sim.water module for water flow simulation. We briefly describe the functionality of the modules and parallelization approaches used in the modules. Our approach includes the analysis of the module's functionality, identification of source code segments suitable for parallelization and proper application of OpenMP parallelization code to create efficient threads processing the subtasks. We document the efficiency of the solutions using the airborne laser scanning data representing land surface in the test area and derived high-resolution digital terrain model grids. We discuss the performance speed-up and parallelization efficiency depending on the number of processor threads. The study showed a substantial increase in computation speeds on a standard multi-core computer while maintaining the accuracy of results in comparison to the output from original modules. The presented parallelization approach showed the simplicity and efficiency of the parallelization of open-source GRASS GIS modules using OpenMP, leading to an increased performance of this geospatial software on standard multi-core computers.
NASA Astrophysics Data System (ADS)
Cai, Yong; Cui, Xiangyang; Li, Guangyao; Liu, Wenyang
2018-04-01
The edge-smooth finite element method (ES-FEM) can improve the computational accuracy of triangular shell elements and the mesh partition efficiency of complex models. In this paper, an approach is developed to perform explicit finite element simulations of contact-impact problems with a graphical processing unit (GPU) using a special edge-smooth triangular shell element based on ES-FEM. Of critical importance for this problem is achieving finer-grained parallelism to enable efficient data loading and to minimize communication between the device and host. Four kinds of parallel strategies are then developed to efficiently solve these ES-FEM based shell element formulas, and various optimization methods are adopted to ensure aligned memory access. Special focus is dedicated to developing an approach for the parallel construction of edge systems. A parallel hierarchy-territory contact-searching algorithm (HITA) and a parallel penalty function calculation method are embedded in this parallel explicit algorithm. Finally, the program flow is well designed, and a GPU-based simulation system is developed, using Nvidia's CUDA. Several numerical examples are presented to illustrate the high quality of the results obtained with the proposed methods. In addition, the GPU-based parallel computation is shown to significantly reduce the computing time.
Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.
2016-01-01
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables. PMID:26845334
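The replicate-and-prune step at the heart of WE is compact: within each bin, the walker count is pushed toward a target m by splitting high-weight walkers (daughters split the parent's weight) and merging low-weight ones (the survivor is chosen with probability proportional to weight, which keeps the estimator unbiased). A minimal sketch for a single bin:

    import random

    def we_resample_bin(walkers, m, rng=random):
        """walkers: list of (weight, state); returns m walkers, same total weight."""
        walkers = sorted(walkers)                     # lightest first
        while len(walkers) > m:                       # merge the two lightest
            (w1, s1), (w2, s2) = walkers[0], walkers[1]
            survivor = s1 if rng.random() < w1 / (w1 + w2) else s2
            walkers = [(w1 + w2, survivor)] + walkers[2:]
            walkers.sort()
        while len(walkers) < m:                       # split the heaviest
            w, s = walkers.pop()
            walkers += [(w / 2, s), (w / 2, s)]
            walkers.sort()
        return walkers

    print(we_resample_bin([(0.5, "a"), (0.3, "b"), (0.2, "c")], m=2))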
Parallel-distributed mobile robot simulator
NASA Astrophysics Data System (ADS)
Okada, Hiroyuki; Sekiguchi, Minoru; Watanabe, Nobuo
1996-06-01
The aim of this project is to achieve an autonomous learning and growth function based on active interaction with the real world. The system should also be able to autonomously acquire knowledge about the context in which jobs take place and how the jobs are executed. This article describes a parallel distributed mobile robot simulator with an autonomous learning and growth function. The autonomous learning and growth function which we are proposing is characterized by its ability to learn and grow through interaction with the real world. When the mobile robot interacts with the real world, the system compares the virtual environment simulation with the interaction result in the real world. The system then improves the virtual environment to match the real-world result more closely; in this way the system learns and grows. It is very important that such a simulation be time-realistic. The parallel distributed mobile robot simulator was developed to simulate the space of a mobile robot system with an autonomous learning and growth function. The simulator constructs a virtual space faithful to the real world and also integrates the interfaces between the user, the actual mobile robot and the virtual mobile robot. Using an ultrafast CG (computer graphics) system (FUJITSU AG series), time-realistic 3D CG is displayed.
Takano, Yu; Nakata, Kazuto; Yonezawa, Yasushige; Nakamura, Haruki
2016-05-05
A massively parallel program for quantum mechanical-molecular mechanical (QM/MM) molecular dynamics simulation, called Platypus (PLATform for dYnamic Protein Unified Simulation), was developed to elucidate protein functions. The speedup and parallelization ratio of Platypus in QM and QM/MM calculations were assessed for a bacteriochlorophyll dimer in the photosynthetic reaction center (DIMER) on the K computer, a massively parallel computer achieving 10 PetaFLOPs with 705,024 cores. Platypus exhibited increasing speedup up to 20,000 cores for HF/cc-pVDZ and B3LYP/cc-pVDZ calculations, and up to 10,000 cores for CASCI(16,16)/6-31G** calculations. We also performed excited-state QM/MM-MD simulations on the chromophore of Sirius (SIRIUS) in water. Sirius is a pH-insensitive and photo-stable ultramarine fluorescent protein. Platypus accelerated on-the-fly excited-state QM/MM-MD simulations for SIRIUS in water using over 4,000 cores, and also succeeded in a 50-ps (200,000-step) on-the-fly excited-state QM/MM-MD simulation of SIRIUS in water.
Parallel Stochastic discrete event simulation of calcium dynamics in neuron.
Ishlam Patoary, Mohammad Nazrul; Tropper, Carl; McDougal, Robert A; Zhongwei, Lin; Lytton, William W
2017-09-26
The intra-cellular calcium signaling pathways of a neuron depend on both biochemical reactions and diffusion. Some quasi-isolated compartments (e.g. spines) are so small and calcium concentrations so low that a single extra molecule diffusing in by chance can make a nontrivial difference in concentration (percentage-wise). These rare events can affect dynamics discretely in such a way that they cannot be evaluated by a deterministic simulation. Stochastic models of such a system provide a more detailed understanding than existing deterministic models because they capture behavior at the molecular level. Our research focuses on the development of a high-performance parallel discrete event simulation environment, Neuron Time Warp (NTW), which is intended for use in the parallel simulation of stochastic reaction-diffusion systems such as intra-cellular calcium signaling. NTW is integrated with NEURON, a simulator which is widely used within the neuroscience community. We simulate two models, a calcium buffer model and a calcium wave model. The calcium buffer model is employed to verify the correctness and performance of NTW by comparing it to a serial deterministic simulation in NEURON. We also derived a discrete event calcium wave model from a deterministic model using the stochastic IP3R structure.
Computational Thermodynamics Characterization of 7075, 7039, and 7020 Aluminum Alloys Using JMatPro
2011-09-01
parameters of temperature and time may be selected to simulate effects on microstructure during annealing, solution treating, quenching, and tempering...nucleation may be taken into account by use of a wetting angle function. Activation energy may be taken into account for rapidly quenched alloys...the stable forms of precipitates that result from solutionizing, annealing or intermediate heat treatment, and phase formation during nonequilibrium
Estimating the capital recovery costs of alternative patch retention treatments in eastern hardwoods
Chris B. LeDoux; Andrew Whitman
2006-01-01
We used a simulation model to estimate the economic opportunity costs and the density of large stems retained for patch retention in two temperate oak stands representative of the oak/hickory forest type in the eastern United States. Opportunity/retention costs ranged from $321.0 to $760.7/ha [$129.9 to $307.8/acre] depending on the species mix in the stand, the...
NASA Astrophysics Data System (ADS)
Arefi, Hadi H.; Yamamoto, Takeshi
2017-12-01
Conventional molecular-dynamics (cMD) simulation has a well-known limitation in accessible time and length scales, and thus various enhanced sampling techniques have been proposed to alleviate the problem. In this paper, we explore the utility of replica exchange with solute tempering (REST) (i.e., a variant of Hamiltonian replica exchange methods) to simulate the self-assembly of a supramolecular polymer in explicit solvent and compare the performance with temperature-based replica exchange MD (T-REMD) as well as cMD. As a test system, we consider a relatively simple all-atom model of supramolecular polymerization (namely, benzene-1,3,5-tricarboxamides in methylcyclohexane solvent). Our results show that both REST and T-REMD are able to predict highly ordered polymer structures with helical H-bonding patterns, in contrast to cMD which completely fails to obtain such a structure for the present model. At the same time, we have also experienced some technical challenge (i.e., aggregation-dispersion transition and the resulting bottleneck for replica traversal), which is illustrated numerically. Since the computational cost of REST scales more moderately than T-REMD, we expect that REST will be useful for studying the self-assembly of larger systems in solution with enhanced rearrangement of monomers.
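At the swap step, both T-REMD and REST reduce to the same Metropolis criterion: a temperature exchange between neighboring replicas i and j is accepted with probability min(1, exp[(β_i − β_j)(E_i − E_j)]); in REST the energies entering the criterion are the appropriately scaled solute-tempered Hamiltonians rather than raw potential energies. A minimal sketch of one swap attempt:

    import math, random

    def attempt_swap(beta_i, E_i, beta_j, E_j, rng=random):
        """Metropolis acceptance for exchanging two replicas' configurations."""
        delta = (beta_i - beta_j) * (E_i - E_j)
        return delta >= 0 or rng.random() < math.exp(delta)

    # swaps succeed more often when the replicas' energy distributions overlap
    print(attempt_swap(beta_i=1.0, E_i=-95.0, beta_j=0.8, E_j=-100.0))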
Design of a bounded wave EMP (Electromagnetic Pulse) simulator
NASA Astrophysics Data System (ADS)
Sevat, P. A. A.
1989-06-01
Electromagnetic Pulse (EMP) simulators are used to simulate the EMP generated by a nuclear weapon and to harden equipment against the effects of EMP. At present, DREO has a 1 m EMP simulator for testing computer-terminal-size equipment. To develop the R and D capability for testing larger objects, such as a helicopter, a much bigger threat-level facility is required. This report concerns the design of a bounded wave EMP simulator suitable for testing large equipment. Different types of simulators are described and their pros and cons are discussed. A bounded wave parallel-plate type simulator is chosen for its efficiency and minimal environmental impact. Detailed designs are given for 6 m and 10 m parallel-plate wire-grid simulators. Electromagnetic fields inside and outside the simulators are computed. Preliminary specifications for a pulse generator required for the simulator are also given. Finally, the electromagnetic fields radiated from the simulator are computed and discussed.
Durham extremely large telescope adaptive optics simulation platform.
Basden, Alastair; Butterley, Timothy; Myers, Richard; Wilson, Richard
2007-03-01
Adaptive optics systems are essential on all large telescopes for which image quality is important. These are complex systems with many design parameters requiring optimization before good performance can be achieved. The simulation of adaptive optics systems is therefore necessary to categorize the expected performance. We describe an adaptive optics simulation platform, developed at Durham University, which can be used to simulate adaptive optics systems on the largest proposed future extremely large telescopes as well as on current systems. This platform is modular, object oriented, and has the benefit of hardware application acceleration that can be used to improve the simulation performance, essential for ensuring that the run time of a given simulation is acceptable. The simulation platform described here can be highly parallelized using parallelization techniques suited for adaptive optics simulation, while still offering the user complete control while the simulation is running. The results from the simulation of a ground layer adaptive optics system are provided as an example to demonstrate the flexibility of this simulation platform.
Super and parallel computers and their impact on civil engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamat, M.P.
1986-01-01
This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.
Evaluation of Parallel Analysis Methods for Determining the Number of Factors
ERIC Educational Resources Information Center
Crawford, Aaron V.; Green, Samuel B.; Levy, Roy; Lo, Wen-Juo; Scott, Lietta; Svetina, Dubravka; Thompson, Marilyn S.
2010-01-01
Population and sample simulation approaches were used to compare the performance of parallel analysis using principal component analysis (PA-PCA) and parallel analysis using principal axis factoring (PA-PAF) to identify the number of underlying factors. Additionally, the accuracies of the mean eigenvalue and the 95th percentile eigenvalue criteria…
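Parallel analysis retains a factor only if its observed eigenvalue exceeds what random data of the same shape would produce, with either the mean or the 95th-percentile random eigenvalue as the criterion. A compact PA-PCA sketch (illustrative, not the study's simulation code):

    import numpy as np

    def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
        """PA-PCA: retain components whose observed correlation-matrix
        eigenvalues exceed the chosen percentile of random-data eigenvalues."""
        rng = np.random.default_rng(seed)
        n, p = data.shape
        obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
        rand = np.empty((n_sims, p))
        for k in range(n_sims):
            sim = rng.standard_normal((n, p))
            rand[k] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
        crit = np.percentile(rand, percentile, axis=0)
        return int(np.sum(obs > crit)), obs, crit

    X = np.random.default_rng(1).standard_normal((300, 6))
    X[:, 1] += X[:, 0]                    # induce one real factor
    n_factors, _, _ = parallel_analysis(X)
    print(n_factors)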
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters
Bajaj, Chandrajit
2009-01-01
Tomographic imaging and computer simulations are increasingly yielding massive datasets. Interactive and exploratory visualizations have rapidly become indispensable tools to study large volumetric imaging and simulation data. Our scalable isosurface visualization framework on commodity off-the-shelf clusters is an end-to-end parallel and progressive platform, from initial data access to the final display. Interactive browsing of extracted isosurfaces is made possible by using parallel isosurface extraction and rendering in conjunction with a new specialized piece of image-compositing hardware called Metabuffer. In this paper, we focus on the back-end scalability by introducing a fully parallel and out-of-core isosurface extraction algorithm. It achieves scalability by using both parallel and out-of-core processing and parallel disks. It statically partitions the volume data to parallel disks with a balanced workload spectrum, and builds I/O-optimal external interval trees to minimize the number of I/O operations of loading large data from disk. We also describe an isosurface compression scheme that is efficient for progressive extraction, transmission and storage of isosurfaces. PMID:19756231
Progress in Unsteady Turbopump Flow Simulations Using Overset Grid Systems
NASA Technical Reports Server (NTRS)
Kiris, Cetin C.; Chan, William; Kwak, Dochan
2002-01-01
This viewgraph presentation provides information on unsteady flow simulations for the Second Generation RLV (Reusable Launch Vehicle) baseline turbopump. Three impeller rotations were simulated using a 34.3-million-grid-point model. MPI/OpenMP hybrid parallelism and MLP shared-memory parallelism have been implemented and benchmarked in INS3D, an incompressible Navier-Stokes solver. For RLV turbopump simulations, a speedup of more than 30 times has been obtained. Moving-boundary capability is obtained by using the DCF module. Scripting capability from CAD geometry to solution has been developed. Unsteady flow simulations for the advanced consortium impeller/diffuser using a 39-million-grid-point model are currently underway; 1.2 impeller rotations are completed. The fluid/structure coupling has been initiated.