parallel computation chemistry: Topics by Science.gov

Sample records for parallel computation chemistry

The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

PubMed

Thackston, Russell; Fortenberry, Ryan C

2015-05-05

The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon World Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
Parallel-Processing Test Bed For Simulation Software

NASA Technical Reports Server (NTRS)

Blech, Richard; Cole, Gary; Townsend, Scott

1996-01-01

Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).
Software Applications on the Peregrine System | High-Performance Computing

Science.gov Websites

programming and optimization. Gaussian Chemistry Program for calculating molecular electronic structure and Materials Science Open-source classical molecular dynamics program designed for massively parallel systems framework Q-Chem Chemistry ab initio quantum chemistry package for predictin molecular structures
Partial Overhaul and Initial Parallel Optimization of KINETICS, a Coupled Dynamics and Chemistry Atmosphere Model

NASA Technical Reports Server (NTRS)

Nguyen, Howard; Willacy, Karen; Allen, Mark

2012-01-01

KINETICS is a coupled dynamics and chemistry atmosphere model that is data intensive and computationally demanding. The potential performance gain from using a supercomputer motivates the adaptation from a serial version to a parallelized one. Although the initial parallelization had been done, bottlenecks caused by an abundance of communication calls between processors led to an unfavorable drop in performance. Before starting on the parallel optimization process, a partial overhaul was required because a large emphasis was placed on streamlining the code for user convenience and revising the program to accommodate the new supercomputers at Caltech and JPL. After the first round of optimizations, the partial runtime was reduced by a factor of 23; however, performance gains are dependent on the size of the data, the number of processors requested, and the computer used.
Chemical calculations on Cray computers

NASA Technical Reports Server (NTRS)

Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

1989-01-01

The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software the performance of different elementary program structures are examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.
Massively parallel sparse matrix function calculations with NTPoly

NASA Astrophysics Data System (ADS)

Dawson, William; Nakajima, Takahito

2018-04-01

We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
A Framework for Load Balancing of Tensor Contraction Expressions via Dynamic Task Partitioning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lai, Pai-Wei; Stock, Kevin; Rajbhandari, Samyam

In this paper, we introduce the Dynamic Load-balanced Tensor Contractions (DLTC), a domain-specific library for efficient task parallel execution of tensor contraction expressions, a class of computation encountered in quantum chemistry and physics. Our framework decomposes each contraction into smaller unit of tasks, represented by an abstraction referred to as iterators. We exploit an extra level of parallelism by having tasks across independent contractions executed concurrently through a dynamic load balancing run- time. We demonstrate the improved performance, scalability, and flexibility for the computation of tensor contraction expressions on parallel computers using examples from coupled cluster methods.
Current status and future prospects for enabling chemistry technology in the drug discovery process.

PubMed

Djuric, Stevan W; Hutchins, Charles W; Talaty, Nari N

2016-01-01

This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of "dangerous" reagents. Also featured are advances in the "computer-assisted drug design" area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; NIkam, Akshay; Lai, Pai-Wei

Tensor contractions represent the most compute-intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions makes them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution in contracting symmetric tensors while also avoiding redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the CCSD quantum chemistry method. We also present a novel approach to tensor redistribution thatmore » can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.« less
Parallel algorithms for quantum chemistry. I. Integral transformations on a hypercube multiprocessor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Whiteside, R.A.; Binkley, J.S.; Colvin, M.E.

1987-02-15

For many years it has been recognized that fundamental physical constraints such as the speed of light will limit the ultimate speed of single processor computers to less than about three billion floating point operations per second (3 GFLOPS). This limitation is becoming increasingly restrictive as commercially available machines are now within an order of magnitude of this asymptotic limit. A natural way to avoid this limit is to harness together many processors to work on a single computational problem. In principle, these parallel processing computers have speeds limited only by the number of processors one chooses to acquire. Themore » usefulness of potentially unlimited processing speed to a computationally intensive field such as quantum chemistry is obvious. If these methods are to be applied to significantly larger chemical systems, parallel schemes will have to be employed. For this reason we have developed distributed-memory algorithms for a number of standard quantum chemical methods. We are currently implementing these on a 32 processor Intel hypercube. In this paper we present our algorithm and benchmark results for one of the bottleneck steps in quantum chemical calculations: the four index integral transformation.« less
First Encounters of the Close Kind: The Formation Process of Airline Flight Crews

DTIC Science & Technology

1987-01-01

process and aircrew performance, Foushee notes an interesting etymological parallel: "Webster’s New Collegiate Dictionary (1961) defines cockpit as ’a...here combines applications from the physical science of chemistry and the modern science of computers. In chemistry , a shell is a space occupied by
Current status and future prospects for enabling chemistry technology in the drug discovery process

PubMed Central

Djuric, Stevan W.; Hutchins, Charles W.; Talaty, Nari N.

2016-01-01

This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of “dangerous” reagents. Also featured are advances in the “computer-assisted drug design” area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities. PMID:27781094
Parallel Performance of a Combustion Chemistry Simulation

DOE PAGES

Skinner, Gregg; Eigenmann, Rudolf

1995-01-01

We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.
Calculating Potential Energy Curves with Quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Powell, Andrew D.; Dawes, Richard

2014-06-01

Quantum Monte Carlo (QMC) is a computational technique that can be applied to the electronic Schrödinger equation for molecules. QMC methods such as Variational Monte Carlo (VMC) and Diffusion Monte Carlo (DMC) have demonstrated the capability of capturing large fractions of the correlation energy, thus suggesting their possible use for high-accuracy quantum chemistry calculations. QMC methods scale particularly well with respect to parallelization making them an attractive consideration in anticipation of next-generation computing architectures which will involve massive parallelization with millions of cores. Due to the statistical nature of the approach, in contrast to standard quantum chemistry methods, uncertainties (error-bars) are associated with each calculated energy. This study focuses on the cost, feasibility and practical application of calculating potential energy curves for small molecules with QMC methods. Trial wave functions were constructed with the multi-configurational self-consistent field (MCSCF) method from GAMESS-US.[1] The CASINO Monte Carlo quantum chemistry package [2] was used for all of the DMC calculations. An overview of our progress in this direction will be given. References: M. W. Schmidt et al. J. Comput. Chem. 14, 1347 (1993). R. J. Needs et al. J. Phys.: Condensed Matter 22, 023201 (2010).
Multicore Challenges and Benefits for High Performance Scientific Computing

DOE PAGES

Nielsen, Ida M. B.; Janssen, Curtis L.

2008-01-01

Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexitymore » of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.« less
High-Fidelity, Computational Modeling of Non-Equilibrium Discharges for Combustion Applications

DTIC Science & Technology

2013-10-01

gradient reconstruction)  4th order RK time integration  Domain decomposition parallel enabled Plasma chemistry mechanism 22  Methane-air... plasma chemistry mechanism  Species and pathways relevant to plasma time scale (~10’s ns)  26 Species : E, O, N2 , O2 , H , N2+ , O2+ , N4+ , O4...Photoionization (3-term Helmholtz equation model) 0.0067 0.0447 0.0346 0.1121 0.3059 0.5994 Plasma chemistry mechanism used in studies 81
Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform.

PubMed

Pronk, Sander; Pouya, Iman; Lundborg, Magnus; Rotskoff, Grant; Wesén, Björn; Kasson, Peter M; Lindahl, Erik

2015-06-09

Computational chemistry and other simulation fields are critically dependent on computing resources, but few problems scale efficiently to the hundreds of thousands of processors available in current supercomputers-particularly for molecular dynamics. This has turned into a bottleneck as new hardware generations primarily provide more processing units rather than making individual units much faster, which simulation applications are addressing by increasingly focusing on sampling with algorithms such as free-energy perturbation, Markov state modeling, metadynamics, or milestoning. All these rely on combining results from multiple simulations into a single observation. They are potentially powerful approaches that aim to predict experimental observables directly, but this comes at the expense of added complexity in selecting sampling strategies and keeping track of dozens to thousands of simulations and their dependencies. Here, we describe how the distributed execution framework Copernicus allows the expression of such algorithms in generic workflows: dataflow programs. Because dataflow algorithms explicitly state dependencies of each constituent part, algorithms only need to be described on conceptual level, after which the execution is maximally parallel. The fully automated execution facilitates the optimization of these algorithms with adaptive sampling, where undersampled regions are automatically detected and targeted without user intervention. We show how several such algorithms can be formulated for computational chemistry problems, and how they are executed efficiently with many loosely coupled simulations using either distributed or parallel resources with Copernicus.
Supercomputers Of The Future

NASA Technical Reports Server (NTRS)

Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron

1992-01-01

Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10(Sup 18) floating-point operations per second (FLOP's) and memory capacity greater than 10(Sup 15) words. Also, new parallel computer architectures and new structured numerical methods will make necessary speed and capacity available.
Modelling Detailed-Chemistry Effects on Turbulent Diffusion Flames using a Parallel Solution-Adaptive Scheme

NASA Astrophysics Data System (ADS)

Jha, Pradeep Kumar

Capturing the effects of detailed-chemistry on turbulent combustion processes is a central challenge faced by the numerical combustion community. However, the inherent complexity and non-linear nature of both turbulence and chemistry require that combustion models rely heavily on engineering approximations to remain computationally tractable. This thesis proposes a computationally efficient algorithm for modelling detailed-chemistry effects in turbulent diffusion flames and numerically predicting the associated flame properties. The cornerstone of this combustion modelling tool is the use of parallel Adaptive Mesh Refinement (AMR) scheme with the recently proposed Flame Prolongation of Intrinsic low-dimensional manifold (FPI) tabulated-chemistry approach for modelling complex chemistry. The effect of turbulence on the mean chemistry is incorporated using a Presumed Conditional Moment (PCM) approach based on a beta-probability density function (PDF). The two-equation k-w turbulence model is used for modelling the effects of the unresolved turbulence on the mean flow field. The finite-rate of methane-air combustion is represented here by using the GRI-Mech 3.0 scheme. This detailed mechanism is used to build the FPI tables. A state of the art numerical scheme based on a parallel block-based solution-adaptive algorithm has been developed to solve the Favre-averaged Navier-Stokes (FANS) and other governing partial-differential equations using a second-order accurate, fully-coupled finite-volume formulation on body-fitted, multi-block, quadrilateral/hexahedral mesh for two-dimensional and three-dimensional flow geometries, respectively. A standard fourth-order Runge-Kutta time-marching scheme is used for time-accurate temporal discretizations. Numerical predictions of three different diffusion flames configurations are considered in the present work: a laminar counter-flow flame; a laminar co-flow diffusion flame; and a Sydney bluff-body turbulent reacting flow. Comparisons are made between the predicted results of the present FPI scheme and Steady Laminar Flamelet Model (SLFM) approach for diffusion flames. The effects of grid resolution on the predicted overall flame solutions are also assessed. Other non-reacting flows have also been considered to further validate other aspects of the numerical scheme. The present schemes predict results which are in good agreement with published experimental results and reduces the computational cost involved in modelling turbulent diffusion flames significantly, both in terms of storage and processing time.
Clinical chemistry through Clinical Chemistry: a journal timeline.

PubMed

Rej, Robert

2004-12-01

The establishment of the modern discipline of clinical chemistry was concurrent with the foundation of the journal Clinical Chemistry and that of the American Association for Clinical Chemistry in the late 1940s and early 1950s. To mark the 50th volume of this Journal, I chronicle and highlight scientific milestones, and those within the discipline, as documented in the pages of Clinical Chemistry. Amazing progress has been made in the field of laboratory diagnostics over these five decades, in many cases paralleling-as well as being bolstered by-the rapid pace in the development of computer technologies. Specific areas of laboratory medicine particularly well represented in Clinical Chemistry include lipids, endocrinology, protein markers, quality of laboratory measurements, molecular diagnostics, and general advances in methodology and instrumentation.

Overview of the NCC

NASA Technical Reports Server (NTRS)

Liu, Nan-Suey

2001-01-01

A multi-disciplinary design/analysis tool for combustion systems is critical for optimizing the low-emission, high-performance combustor design process. Based on discussions between then NASA Lewis Research Center and the jet engine companies, an industry-government team was formed in early 1995 to develop the National Combustion Code (NCC), which is an integrated system of computer codes for the design and analysis of combustion systems. NCC has advanced features that address the need to meet designer's requirements such as "assured accuracy", "fast turnaround", and "acceptable cost". The NCC development team is comprised of Allison Engine Company (Allison), CFD Research Corporation (CFDRC), GE Aircraft Engines (GEAE), NASA Glenn Research Center (LeRC), and Pratt & Whitney (P&W). The "unstructured mesh" capability and "parallel computing" are fundamental features of NCC from its inception. The NCC system is composed of a set of "elements" which includes grid generator, main flow solver, turbulence module, turbulence and chemistry interaction module, chemistry module, spray module, radiation heat transfer module, data visualization module, and a post-processor for evaluating engine performance parameters. Each element may have contributions from several team members. Such a multi-source multi-element system needs to be integrated in a way that facilitates inter-module data communication, flexibility in module selection, and ease of integration. The development of the NCC beta version was essentially completed in June 1998. Technical details of the NCC elements are given in the Reference List. Elements such as the baseline flow solver, turbulence module, and the chemistry module, have been extensively validated; and their parallel performance on large-scale parallel systems has been evaluated and optimized. However the scalar PDF module and the Spray module, as well as their coupling with the baseline flow solver, were developed in a small-scale distributed computing environment. As a result, the validation of the NCC beta version as a whole was quite limited. Current effort has been focused on the validation of the integrated code and the evaluation/optimization of its overall performance on large-scale parallel systems.
Field programmable chemistry: integrated chemical and electronic processing of informational molecules towards electronic chemical cells.

PubMed

Wagler, Patrick F; Tangen, Uwe; Maeke, Thomas; McCaskill, John S

2012-07-01

The topic addressed is that of combining self-constructing chemical systems with electronic computation to form unconventional embedded computation systems performing complex nano-scale chemical tasks autonomously. The hybrid route to complex programmable chemistry, and ultimately to artificial cells based on novel chemistry, requires a solution of the two-way massively parallel coupling problem between digital electronics and chemical systems. We present a chemical microprocessor technology and show how it can provide a generic programmable platform for complex molecular processing tasks in Field Programmable Chemistry, including steps towards the grand challenge of constructing the first electronic chemical cells. Field programmable chemistry employs a massively parallel field of electrodes, under the control of latched voltages, which are used to modulate chemical activity. We implement such a field programmable chemistry which links to chemistry in rather generic, two-phase microfluidic channel networks that are separated into weakly coupled domains. Electric fields, produced by the high-density array of electrodes embedded in the channel floors, are used to control the transport of chemicals across the hydrodynamic barriers separating domains. In the absence of electric fields, separate microfluidic domains are essentially independent with only slow diffusional interchange of chemicals. Electronic chemical cells, based on chemical microprocessors, exploit a spatially resolved sandwich structure in which the electronic and chemical systems are locally coupled through homogeneous fine-grained actuation and sensor networks and play symmetric and complementary roles. We describe how these systems are fabricated, experimentally test their basic functionality, simulate their potential (e.g. for feed forward digital electrophoretic (FFDE) separation) and outline the application to building electronic chemical cells. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Ray Next-Event Estimator Transport of Primary and Secondary Gamma Rays

DTIC Science & Technology

2011-03-01

McGraw-Hill. Choppin, G. R., Liljenzin, J.-O., & Rydberg, J. (2002). Radiochemistry and Nuclear Chemistry (3rd ed.). Woburn, MA: Butterworth- Heinemann ...time-energy bins. Any performance enhancements (maybe parallel searching?) to the search routines decrease estimator computational time
Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; Rastello, Fabrice; Kowalski, Karol

The four-index integral transform is a fundamental and computationally demanding calculation used in many computational chemistry suites such as NWChem. It transforms a four-dimensional tensor from an atomic basis to a molecular basis. This transformation is most efficiently implemented as a sequence of four tensor contractions that each contract a four-dimensional tensor with a two-dimensional transformation matrix. Differing degrees of permutation symmetry in the intermediate and final tensors in the sequence of contractions cause intermediate tensors to be much larger than the final tensor and limit the number of electronic states in the modeled systems. Loop fusion, in conjunction withmore » tiling, can be very effective in reducing the total space requirement, as well as data movement. However, the large number of possible choices for loop fusion and tiling, and data/computation distribution across a parallel system, make it challenging to develop an optimized parallel implementation for the four-index integral transform. We develop a novel approach to address this problem, using lower bounds modeling of data movement complexity. We establish relationships between available aggregate physical memory in a parallel computer system and ineffective fusion configurations, enabling their pruning and consequent identification of effective choices and a characterization of optimality criteria. This work has resulted in the development of a significantly improved implementation of the four-index transform that enables higher performance and the ability to model larger electronic systems than the current implementation in the NWChem quantum chemistry software suite.« less
Workshop report on large-scale matrix diagonalization methods in chemistry theory institute

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C.H.; Shepard, R.L.; Huss-Lederman, S.

The Large-Scale Matrix Diagonalization Methods in Chemistry theory institute brought together 41 computational chemists and numerical analysts. The goal was to understand the needs of the computational chemistry community in problems that utilize matrix diagonalization techniques. This was accomplished by reviewing the current state of the art and looking toward future directions in matrix diagonalization techniques. This institute occurred about 20 years after a related meeting of similar size. During those 20 years the Davidson method continued to dominate the problem of finding a few extremal eigenvalues for many computational chemistry problems. Work on non-diagonally dominant and non-Hermitian problems asmore » well as parallel computing has also brought new methods to bear. The changes and similarities in problems and methods over the past two decades offered an interesting viewpoint for the success in this area. One important area covered by the talks was overviews of the source and nature of the chemistry problems. The numerical analysts were uniformly grateful for the efforts to convey a better understanding of the problems and issues faced in computational chemistry. An important outcome was an understanding of the wide range of eigenproblems encountered in computational chemistry. The workshop covered problems involving self- consistent-field (SCF), configuration interaction (CI), intramolecular vibrational relaxation (IVR), and scattering problems. In atomic structure calculations using the Hartree-Fock method (SCF), the symmetric matrices can range from order hundreds to thousands. These matrices often include large clusters of eigenvalues which can be as much as 25% of the spectrum. However, if Cl methods are also used, the matrix size can be between 10{sup 4} and 10{sup 9} where only one or a few extremal eigenvalues and eigenvectors are needed. Working with very large matrices has lead to the development of« less
Parallel scalability of Hartree-Fock calculations

NASA Astrophysics Data System (ADS)

Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.

2015-03-01

Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
Work stealing for GPU-accelerated parallel programs in a global address space framework: WORK STEALING ON GPU-ACCELERATED SYSTEMS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a functionmore » of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain.« less
Work stealing for GPU-accelerated parallel programs in a global address space framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a functionmore » of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain« less
Extensible Computational Chemistry Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

2012-08-09

ECCE provides a sophisticated graphical user interface, scientific visualization tools, and the underlying data management framework enabling scientists to efficiently set up calculations and store, retrieve, and analyze the rapidly growing volumes of data produced by computational chemistry studies. ECCE was conceived as part of the Environmental Molecular Sciences Laboratory construction to solve the problem of researchers being able to effectively utilize complex computational chemistry codes and massively parallel high performance compute resources. Bringing the power of these codes and resources to the desktops of researcher and thus enabling world class research without users needing a detailed understanding of themore » inner workings of either the theoretical codes or the supercomputers needed to run them was a grand challenge problem in the original version of the EMSL. ECCE allows collaboration among researchers using a web-based data repository where the inputs and results for all calculations done within ECCE are organized. ECCE is a first of kind end-to-end problem solving environment for all phases of computational chemistry research: setting up calculations with sophisticated GUI and direct manipulation visualization tools, submitting and monitoring calculations on remote high performance supercomputers without having to be familiar with the details of using these compute resources, and performing results visualization and analysis including creating publication quality images. ECCE is a suite of tightly integrated applications that are employed as the user moves through the modeling process.« less
Cumulative reports and publications through December 31, 1989

NASA Technical Reports Server (NTRS)

1990-01-01

A complete list of reports from the Institute for Computer Applications in Science and Engineering (ICASE) is presented. The major categories of the current ICASE research program are: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effectual numerical methods; computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, structural analysis, and chemistry; computer systems and software, especially vector and parallel computers, microcomputers, and data management. Since ICASE reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available.
Development of a Grid-Independent Geos-Chem Chemical Transport Model (v9-02) as an Atmospheric Chemistry Module for Earth System Models

NASA Technical Reports Server (NTRS)

Long, M. S.; Yantosca, R.; Nielsen, J. E; Keller, C. A.; Da Silva, A.; Sulprizio, M. P.; Pawson, S.; Jacob, D. J.

2015-01-01

The GEOS-Chem global chemical transport model (CTM), used by a large atmospheric chemistry research community, has been re-engineered to also serve as an atmospheric chemistry module for Earth system models (ESMs). This was done using an Earth System Modeling Framework (ESMF) interface that operates independently of the GEOSChem scientific code, permitting the exact same GEOSChem code to be used as an ESM module or as a standalone CTM. In this manner, the continual stream of updates contributed by the CTM user community is automatically passed on to the ESM module, which remains state of science and referenced to the latest version of the standard GEOS-Chem CTM. A major step in this re-engineering was to make GEOS-Chem grid independent, i.e., capable of using any geophysical grid specified at run time. GEOS-Chem data sockets were also created for communication between modules and with external ESM code. The grid-independent, ESMF-compatible GEOS-Chem is now the standard version of the GEOS-Chem CTM. It has been implemented as an atmospheric chemistry module into the NASA GEOS- 5 ESM. The coupled GEOS-5-GEOS-Chem system was tested for scalability and performance with a tropospheric oxidant-aerosol simulation (120 coupled species, 66 transported tracers) using 48-240 cores and message-passing interface (MPI) distributed-memory parallelization. Numerical experiments demonstrate that the GEOS-Chem chemistry module scales efficiently for the number of cores tested, with no degradation as the number of cores increases. Although inclusion of atmospheric chemistry in ESMs is computationally expensive, the excellent scalability of the chemistry module means that the relative cost goes down with increasing number of cores in a massively parallel environment.
plasmaFoam: An OpenFOAM framework for computational plasma physics and chemistry

NASA Astrophysics Data System (ADS)

Venkattraman, Ayyaswamy; Verma, Abhishek Kumar

2016-09-01

As emphasized in the 2012 Roadmap for low temperature plasmas (LTP), scientific computing has emerged as an essential tool for the investigation and prediction of the fundamental physical and chemical processes associated with these systems. While several in-house and commercial codes exist, with each having its own advantages and disadvantages, a common framework that can be developed by researchers from all over the world will likely accelerate the impact of computational studies on advances in low-temperature plasma physics and chemistry. In this regard, we present a finite volume computational toolbox to perform high-fidelity simulations of LTP systems. This framework, primarily based on the OpenFOAM solver suite, allows us to enhance our understanding of multiscale plasma phenomenon by performing massively parallel, three-dimensional simulations on unstructured meshes using well-established high performance computing tools that are widely used in the computational fluid dynamics community. In this talk, we will present preliminary results obtained using the OpenFOAM-based solver suite with benchmark three-dimensional simulations of microplasma devices including both dielectric and plasma regions. We will also discuss the future outlook for the solver suite.
Employing OpenCL to Accelerate Ab Initio Calculations on Graphics Processing Units.

PubMed

Kussmann, Jörg; Ochsenfeld, Christian

2017-06-13

We present an extension of our graphics processing units (GPU)-accelerated quantum chemistry package to employ OpenCL compute kernels, which can be executed on a wide range of computing devices like CPUs, Intel Xeon Phi, and AMD GPUs. Here, we focus on the use of AMD GPUs and discuss differences as compared to CUDA-based calculations on NVIDIA GPUs. First illustrative timings are presented for hybrid density functional theory calculations using serial as well as parallel compute environments. The results show that AMD GPUs are as fast or faster than comparable NVIDIA GPUs and provide a viable alternative for quantum chemical applications.
Cumulative reports and publications through 31 December 1983

NASA Technical Reports Server (NTRS)

1983-01-01

All reports for the calendar years 1975 through December 1983 are listed by author. Since ICASE reports are intended to be preprints of articles for journals and conference proceedings, the published reference is included when available. Thirteen older journal and conference proceedings references are included as well as five additional reports by ICASE personnel. Major categories of research covered include: (1) numerical methods, with particular emphasis on the development and analysis of basic algorithms; (2) computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, structural analysis, and chemistry; and (3) computer systems and software, especially vector and parallel computers, microcomputers, and data management.
Airbreathing Propulsion System Analysis Using Multithreaded Parallel Processing

NASA Technical Reports Server (NTRS)

Schunk, Richard Gregory; Chung, T. J.; Rodriguez, Pete (Technical Monitor)

2000-01-01

In this paper, parallel processing is used to analyze the mixing, and combustion behavior of hypersonic flow. Preliminary work for a sonic transverse hydrogen jet injected from a slot into a Mach 4 airstream in a two-dimensional duct combustor has been completed [Moon and Chung, 1996]. Our aim is to extend this work to three-dimensional domain using multithreaded domain decomposition parallel processing based on the flowfield-dependent variation theory. Numerical simulations of chemically reacting flows are difficult because of the strong interactions between the turbulent hydrodynamic and chemical processes. The algorithm must provide an accurate representation of the flowfield, since unphysical flowfield calculations will lead to the faulty loss or creation of species mass fraction, or even premature ignition, which in turn alters the flowfield information. Another difficulty arises from the disparity in time scales between the flowfield and chemical reactions, which may require the use of finite rate chemistry. The situations are more complex when there is a disparity in length scales involved in turbulence. In order to cope with these complicated physical phenomena, it is our plan to utilize the flowfield-dependent variation theory mentioned above, facilitated by large eddy simulation. Undoubtedly, the proposed computation requires the most sophisticated computational strategies. The multithreaded domain decomposition parallel processing will be necessary in order to reduce both computational time and storage. Without special treatments involved in computer engineering, our attempt to analyze the airbreathing combustion appears to be difficult, if not impossible.
Earth system modelling on system-level heterogeneous architectures: EMAC (version 2.42) on the Dynamical Exascale Entry Platform (DEEP)

NASA Astrophysics Data System (ADS)

Christou, Michalis; Christoudias, Theodoros; Morillo, Julián; Alvarez, Damian; Merx, Hendrik

2016-09-01

We examine an alternative approach to heterogeneous cluster-computing in the many-core era for Earth system models, using the European Centre for Medium-Range Weather Forecasts Hamburg (ECHAM)/Modular Earth Submodel System (MESSy) Atmospheric Chemistry (EMAC) model as a pilot application on the Dynamical Exascale Entry Platform (DEEP). A set of autonomous coprocessors interconnected together, called Booster, complements a conventional HPC Cluster and increases its computing performance, offering extra flexibility to expose multiple levels of parallelism and achieve better scalability. The EMAC model atmospheric chemistry code (Module Efficiently Calculating the Chemistry of the Atmosphere (MECCA)) was taskified with an offload mechanism implemented using OmpSs directives. The model was ported to the MareNostrum 3 supercomputer to allow testing with Intel Xeon Phi accelerators on a production-size machine. The changes proposed in this paper are expected to contribute to the eventual adoption of Cluster-Booster division and Many Integrated Core (MIC) accelerated architectures in presently available implementations of Earth system models, towards exploiting the potential of a fully Exascale-capable platform.
Optimizing Performance of Combustion Chemistry Solvers on Intel's Many Integrated Core (MIC) Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sitaraman, Hariswaran; Grout, Ray W

This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero and multidimensional combustion solvers, which can then be effectively used on the emerging generation of Intel's Many Integrated Core/Xeon Phi processors. These processors offer increased computing performance via large number of lightweight cores at relatively lower clock speeds compared to traditional processors (e.g. Intel Sandybridge/Ivybridge) used in current supercomputers. This style of processor can be productively used for chemistry integrators that form a costly part of computational combustion codes, in spite of their relatively lower clock speeds. Performance commensurate with traditional processors is achieved heremore » through the combination of careful memory layout, exposing multiple levels of fine grain parallelism and through extensive use of vendor supported libraries (Cilk Plus and Math Kernel Libraries). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. These optimizations resulted in a factor of ~ 3 speed-up using Intel 2013 compiler and ~ 1.5 using Intel 2017 compiler for large chemical mechanisms compared to the unoptimized version on the Intel Xeon Phi. The strategies, especially with respect to memory usage and vectorization, should also be beneficial for general purpose computational fluid dynamics codes.« less
IMACS 󈨟: Proceedings of the IMACS World Congress on Computation and Applied Mathematics (13th) Held in Dublin, Ireland on July 22-26, 1991. Volume 2. Computational Fluid Dynamics and Wave Propagation, Parallel Computing, Concurrent and Supercomputing, Computational Physics/Computational Chemistry and Evolutionary Systems

DTIC Science & Technology

1991-01-01

Investigations of uniform convergence Sorption gel5ster Stoffe in porisen Medien (in German). Ver- are-in progress. lag P -Lang, Frankfurt/M., 1991 (in press...ad- sorption terms. Numerical results for solute transport with instantaneous, /it + 01(p): - q(p)#= 0, X > 0, t > 0. (5) nonlinear adsorption-are...13 ] A, S113S = (PfIJl) 8Iax, S1136= m A, y H137=- pf A, v 138 -f A, IIso l r opic-visco° c ousic- Y 139 im A, z-pliole (C’ilz) y 1141 = [ PfA 𔃻 + P
Parallelization of the Physical-Space Statistical Analysis System (PSAS)

NASA Technical Reports Server (NTRS)

Larson, J. W.; Guo, J.; Lyster, P. M.

1999-01-01

Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. The problem of computational reproducibility is well known in the parallel computing community. It is a requirement that the parallel code perform calculations in a fashion that will yield identical results on different configurations of processing elements on the same platform. In some cases this problem can be solved by sacrificing performance. Meeting this requirement and still achieving high performance is very difficult. Topics to be discussed include: current PSAS design and parallelization strategy; reproducibility issues; load balance vs. database memory demands, possible solutions to these problems.
Component-based integration of chemistry and optimization software.

PubMed

Kenny, Joseph P; Benson, Steven J; Alexeev, Yuri; Sarich, Jason; Janssen, Curtis L; McInnes, Lois Curfman; Krishnan, Manojkumar; Nieplocha, Jarek; Jurrus, Elizabeth; Fahlstrom, Carl; Windus, Theresa L

2004-11-15

Typical scientific software designs make rigid assumptions regarding programming language and data structures, frustrating software interoperability and scientific collaboration. Component-based software engineering is an emerging approach to managing the increasing complexity of scientific software. Component technology facilitates code interoperability and reuse. Through the adoption of methodology and tools developed by the Common Component Architecture Forum, we have developed a component architecture for molecular structure optimization. Using the NWChem and Massively Parallel Quantum Chemistry packages, we have produced chemistry components that provide capacity for energy and energy derivative evaluation. We have constructed geometry optimization applications by integrating the Toolkit for Advanced Optimization, Portable Extensible Toolkit for Scientific Computation, and Global Arrays packages, which provide optimization and linear algebra capabilities. We present a brief overview of the component development process and a description of abstract interfaces for chemical optimizations. The components conforming to these abstract interfaces allow the construction of applications using different chemistry and mathematics packages interchangeably. Initial numerical results for the component software demonstrate good performance, and highlight potential research enabled by this platform.

A high performance scientific cloud computing environment for materials simulations

NASA Astrophysics Data System (ADS)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Development of massive multilevel molecular dynamics simulation program, Platypus (PLATform for dYnamic Protein Unified Simulation), for the elucidation of protein functions.

PubMed

Takano, Yu; Nakata, Kazuto; Yonezawa, Yasushige; Nakamura, Haruki

2016-05-05

A massively parallel program for quantum mechanical-molecular mechanical (QM/MM) molecular dynamics simulation, called Platypus (PLATform for dYnamic Protein Unified Simulation), was developed to elucidate protein functions. The speedup and the parallelization ratio of Platypus in the QM and QM/MM calculations were assessed for a bacteriochlorophyll dimer in the photosynthetic reaction center (DIMER) on the K computer, a massively parallel computer achieving 10 PetaFLOPs with 705,024 cores. Platypus exhibited the increase in speedup up to 20,000 core processors at the HF/cc-pVDZ and B3LYP/cc-pVDZ, and up to 10,000 core processors by the CASCI(16,16)/6-31G** calculations. We also performed excited QM/MM-MD simulations on the chromophore of Sirius (SIRIUS) in water. Sirius is a pH-insensitive and photo-stable ultramarine fluorescent protein. Platypus accelerated on-the-fly excited-state QM/MM-MD simulations for SIRIUS in water, using over 4000 core processors. In addition, it also succeeded in 50-ps (200,000-step) on-the-fly excited-state QM/MM-MD simulations for the SIRIUS in water. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Programs as Polypeptides.

PubMed

Williams, Lance R

2016-01-01

Object-oriented combinator chemistry (OOCC) is an artificial chemistry with composition devices borrowed from object-oriented and functional programming languages. Actors in OOCC are embedded in space and subject to diffusion; since they are neither created nor destroyed, their mass is conserved. Actors use programs constructed from combinators to asynchronously update their own states and the states of other actors in their neighborhoods. The fact that programs and combinators are themselves reified as actors makes it possible to build programs that build programs from combinators of a few primitive types using asynchronous spatial processes that resemble chemistry as much as computation. To demonstrate this, OOCC is used to define a parallel, asynchronous, spatially distributed self-replicating system modeled in part on the living cell. Since interactions among its parts result in the construction of more of these same parts, the system is strongly constructive. The system's high normalized complexity is contrasted with that of a simple composome.
Interactive computer modeling of combustion chemistry and coalescence-dispersion modeling of turbulent combustion

NASA Technical Reports Server (NTRS)

Pratt, D. T.

1984-01-01

An interactive computer code for simulation of a high-intensity turbulent combustor as a single point inhomogeneous stirred reactor was developed from an existing batch processing computer code CDPSR. The interactive CDPSR code was used as a guide for interpretation and direction of DOE-sponsored companion experiments utilizing Xenon tracer with optical laser diagnostic techniques to experimentally determine the appropriate mixing frequency, and for validation of CDPSR as a mixing-chemistry model for a laboratory jet-stirred reactor. The coalescence-dispersion model for finite rate mixing was incorporated into an existing interactive code AVCO-MARK I, to enable simulation of a combustor as a modular array of stirred flow and plug flow elements, each having a prescribed finite mixing frequency, or axial distribution of mixing frequency, as appropriate. Further increase the speed and reliability of the batch kinetics integrator code CREKID was increased by rewriting in vectorized form for execution on a vector or parallel processor, and by incorporating numerical techniques which enhance execution speed by permitting specification of a very low accuracy tolerance.
Parallel medicinal chemistry approaches to selective HDAC1/HDAC2 inhibitor (SHI-1:2) optimization.

PubMed

Kattar, Solomon D; Surdi, Laura M; Zabierek, Anna; Methot, Joey L; Middleton, Richard E; Hughes, Bethany; Szewczak, Alexander A; Dahlberg, William K; Kral, Astrid M; Ozerova, Nicole; Fleming, Judith C; Wang, Hongmei; Secrist, Paul; Harsch, Andreas; Hamill, Julie E; Cruz, Jonathan C; Kenific, Candia M; Chenard, Melissa; Miller, Thomas A; Berk, Scott C; Tempest, Paul

2009-02-15

The successful application of both solid and solution phase library synthesis, combined with tight integration into the medicinal chemistry effort, resulted in the efficient optimization of a novel structural series of selective HDAC1/HDAC2 inhibitors by the MRL-Boston Parallel Medicinal Chemistry group. An initial lead from a small parallel library was found to be potent and selective in biochemical assays. Advanced compounds were the culmination of iterative library design and possess excellent biochemical and cellular potency, as well as acceptable PK and efficacy in animal models.
Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe

A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals formore » the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. In conclusion, the chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted up to date.« less
Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

DOE PAGES

Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe; ...

2017-11-14

A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals formore » the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. In conclusion, the chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted up to date.« less
Pushing configuration-interaction to the limit: Towards massively parallel MCSCF calculations

NASA Astrophysics Data System (ADS)

Vogiatzis, Konstantinos D.; Ma, Dongxia; Olsen, Jeppe; Gagliardi, Laura; de Jong, Wibe A.

2017-11-01

A new large-scale parallel multiconfigurational self-consistent field (MCSCF) implementation in the open-source NWChem computational chemistry code is presented. The generalized active space approach is used to partition large configuration interaction (CI) vectors and generate a sufficient number of batches that can be distributed to the available cores. Massively parallel CI calculations with large active spaces can be performed. The new parallel MCSCF implementation is tested for the chromium trimer and for an active space of 20 electrons in 20 orbitals, which can now routinely be performed. Unprecedented CI calculations with an active space of 22 electrons in 22 orbitals for the pentacene systems were performed and a single CI iteration calculation with an active space of 24 electrons in 24 orbitals for the chromium tetramer was possible. The chromium tetramer corresponds to a CI expansion of one trillion Slater determinants (914 058 513 424) and is the largest conventional CI calculation attempted up to date.
Algorithms Bridging Quantum Computation and Chemistry

NASA Astrophysics Data System (ADS)

McClean, Jarrod Ryan

The design of new materials and chemicals derived entirely from computation has long been a goal of computational chemistry, and the governing equation whose solution would permit this dream is known. Unfortunately, the exact solution to this equation has been far too expensive and clever approximations fail in critical situations. Quantum computers offer a novel solution to this problem. In this work, we develop not only new algorithms to use quantum computers to study hard problems in chemistry, but also explore how such algorithms can help us to better understand and improve our traditional approaches. In particular, we first introduce a new method, the variational quantum eigensolver, which is designed to maximally utilize the quantum resources available in a device to solve chemical problems. We apply this method in a real quantum photonic device in the lab to study the dissociation of the helium hydride (HeH+) molecule. We also enhance this methodology with architecture specific optimizations on ion trap computers and show how linear-scaling techniques from traditional quantum chemistry can be used to improve the outlook of similar algorithms on quantum computers. We then show how studying quantum algorithms such as these can be used to understand and enhance the development of classical algorithms. In particular we use a tool from adiabatic quantum computation, Feynman's Clock, to develop a new discrete time variational principle and further establish a connection between real-time quantum dynamics and ground state eigenvalue problems. We use these tools to develop two novel parallel-in-time quantum algorithms that outperform competitive algorithms as well as offer new insights into the connection between the fermion sign problem of ground states and the dynamical sign problem of quantum dynamics. Finally we use insights gained in the study of quantum circuits to explore a general notion of sparsity in many-body quantum systems. In particular we use developments from the field of compressed sensing to find compact representations of ground states. As an application we study electronic systems and find solutions dramatically more compact than traditional configuration interaction expansions, offering hope to extend this methodology to challenging systems in chemical and material design.
Research in Computational Astrobiology

NASA Technical Reports Server (NTRS)

Chaban, Galina; Jaffe, Richard; Liang, Shoudan; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.

2002-01-01

We present results from several projects in the new field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. We have developed a procedure for calculating long-range effects in molecular dynamics using a plane wave expansion of the electrostatic potential. This method is expected to be highly efficient for simulating biological systems on massively parallel supercomputers. We have perform genomics analysis on a family of actin binding proteins. We have performed quantum mechanical calculations on carbon nanotubes and nucleic acids, which simulations will allow us to investigate possible sources of organic material on the early earth. Finally, we have developed a model of protobiological chemistry using neural networks.
Computational Science at the Argonne Leadership Computing Facility

NASA Astrophysics Data System (ADS)

Romero, Nichols

2014-03-01

The goal of the Argonne Leadership Computing Facility (ALCF) is to extend the frontiers of science by solving problems that require innovative approaches and the largest-scale computing systems. ALCF's most powerful computer - Mira, an IBM Blue Gene/Q system - has nearly one million cores. How does one program such systems? What software tools are available? Which scientific and engineering applications are able to utilize such levels of parallelism? This talk will address these questions and describe a sampling of projects that are using ALCF systems in their research, including ones in nanoscience, materials science, and chemistry. Finally, the ways to gain access to ALCF resources will be presented. This research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357.
An Overview of the NCC Spray/Monte-Carlo-PDF Computations

NASA Technical Reports Server (NTRS)

Raju, M. S.; Liu, Nan-Suey (Technical Monitor)

2000-01-01

This paper advances the state-of-the-art in spray computations with some of our recent contributions involving scalar Monte Carlo PDF (Probability Density Function), unstructured grids and parallel computing. It provides a complete overview of the scalar Monte Carlo PDF and Lagrangian spray computer codes developed for application with unstructured grids and parallel computing. Detailed comparisons for the case of a reacting non-swirling spray clearly highlight the important role that chemistry/turbulence interactions play in the modeling of reacting sprays. The results from the PDF and non-PDF methods were found to be markedly different and the PDF solution is closer to the reported experimental data. The PDF computations predict that some of the combustion occurs in a predominantly premixed-flame environment and the rest in a predominantly diffusion-flame environment. However, the non-PDF solution predicts wrongly for the combustion to occur in a vaporization-controlled regime. Near the premixed flame, the Monte Carlo particle temperature distribution shows two distinct peaks: one centered around the flame temperature and the other around the surrounding-gas temperature. Near the diffusion flame, the Monte Carlo particle temperature distribution shows a single peak. In both cases, the computed PDF's shape and strength are found to vary substantially depending upon the proximity to the flame surface. The results bring to the fore some of the deficiencies associated with the use of assumed-shape PDF methods in spray computations. Finally, we end the paper by demonstrating the computational viability of the present solution procedure for its use in 3D combustor calculations by summarizing the results of a 3D test case with periodic boundary conditions. For the 3D case, the parallel performance of all the three solvers (CFD, PDF, and spray) has been found to be good when the computations were performed on a 24-processor SGI Origin work-station.
Flame-vortex interactions imaged in microgravity

NASA Technical Reports Server (NTRS)

Driscoll, James F.; Dahm, Werner J. A.; Sichel, Martin

1995-01-01

The scientific objective is to obtain high quality color-enhanced digital images of a vortex exerting aerodynamic strain on premixed and nonpremixed flames with the complicating effects of buoyancy removed. The images will provide universal (buoyancy free) scaling relations that are required to improve several types of models of turbulent combustion, including KIVA-3, discrete vortex, and large-eddy simulations. The images will be used to help quantify several source terms in the models, including those due to flame stretch, flame-generated vorticity, flame curvature, and preferential diffusion, for a range of vortex sizes and flame conditions. The experiment is an ideal way to study turbulence-chemistry interactions and isolate the effect of vortices of different sizes and strengths in a repeatable manner. A parallel computational effort is being conducted which considers full chemistry and preferential diffusion.
Ab Initio Reactive Computer Aided Molecular Design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martínez, Todd J.

Few would dispute that theoretical chemistry tools can now provide keen insights into chemical phenomena. Yet the holy grail of efficient and reliable prediction of complex reactivity has remained elusive. Fortunately, recent advances in electronic structure theory based on the concepts of both element- and rank-sparsity, coupled with the emergence of new highly parallel computer architectures, have led to a significant increase in the time and length scales which can be simulated using first principles molecular dynamics. This then opens the possibility of new discovery-based approaches to chemical reactivity, such as the recently proposed ab initio nanoreactor. Here, we arguemore » that due to these and other recent advances, the holy grail of computational discovery for complex chemical reactivity is rapidly coming within our reach.« less
Ab Initio Reactive Computer Aided Molecular Design

DOE PAGES

Martínez, Todd J.

2017-03-21

Few would dispute that theoretical chemistry tools can now provide keen insights into chemical phenomena. Yet the holy grail of efficient and reliable prediction of complex reactivity has remained elusive. Fortunately, recent advances in electronic structure theory based on the concepts of both element- and rank-sparsity, coupled with the emergence of new highly parallel computer architectures, have led to a significant increase in the time and length scales which can be simulated using first principles molecular dynamics. This then opens the possibility of new discovery-based approaches to chemical reactivity, such as the recently proposed ab initio nanoreactor. Here, we arguemore » that due to these and other recent advances, the holy grail of computational discovery for complex chemical reactivity is rapidly coming within our reach.« less
Developing Chemistry and Kinetic Modeling Tools for Low-Temperature Plasma Simulations

NASA Astrophysics Data System (ADS)

Jenkins, Thomas; Beckwith, Kris; Davidson, Bradley; Kruger, Scott; Pankin, Alexei; Roark, Christine; Stoltz, Peter

2015-09-01

We discuss the use of proper orthogonal decomposition (POD) methods in VSim, a FDTD plasma simulation code capable of both PIC/MCC and fluid modeling. POD methods efficiently generate smooth representations of noisy self-consistent or test-particle PIC data, and are thus advantageous in computing macroscopic fluid quantities from large PIC datasets (e.g. for particle-based closure computations) and in constructing optimal visual representations of the underlying physics. They may also confer performance advantages for massively parallel simulations, due to the significant reduction in dataset sizes conferred by truncated singular-value decompositions of the PIC data. We also demonstrate how complex LTP chemistry scenarios can be modeled in VSim via an interface with MUNCHKIN, a developing standalone python/C++/SQL code that identifies reaction paths for given input species, solves 1D rate equations for the time-dependent chemical evolution of the system, and generates corresponding VSim input blocks with appropriate cross-sections/reaction rates. MUNCHKIN also computes reaction rates from user-specified distribution functions, and conducts principal path analyses to reduce the number of simulated chemical reactions. Supported by U.S. Department of Energy SBIR program, Award DE-SC0009501.
Environmental research program. 1995 Annual report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, N.J.

1996-06-01

The objective of the Environmental Research Program is to enhance the understanding of, and mitigate the effects of pollutants on health, ecological systems, global and regional climate, and air quality. The program is multidisciplinary and includes fundamental research and development in efficient and environmentally benign combustion, pollutant abatement and destruction, and novel methods of detection and analysis of criteria and noncriteria pollutants. This diverse group conducts investigations in combustion, atmospheric and marine processes, flue-gas chemistry, and ecological systems. Combustion chemistry research emphasizes modeling at microscopic and macroscopic scales. At the microscopic scale, functional sensitivity analysis is used to explore themore » nature of the potential-to-dynamics relationships for reacting systems. Rate coefficients are estimated using quantum dynamics and path integral approaches. At the macroscopic level, combustion processes are modelled using chemical mechanisms at the appropriate level of detail dictated by the requirements of predicting particular aspects of combustion behavior. Parallel computing has facilitated the efforts to use detailed chemistry in models of turbulent reacting flow to predict minor species concentrations.« less
Advances in molecular quantum chemistry contained in the Q-Chem 4 program package

NASA Astrophysics Data System (ADS)

Shao, Yihan; Gan, Zhengting; Epifanovsky, Evgeny; Gilbert, Andrew T. B.; Wormit, Michael; Kussmann, Joerg; Lange, Adrian W.; Behn, Andrew; Deng, Jia; Feng, Xintian; Ghosh, Debashree; Goldey, Matthew; Horn, Paul R.; Jacobson, Leif D.; Kaliman, Ilya; Khaliullin, Rustam Z.; Kuś, Tomasz; Landau, Arie; Liu, Jie; Proynov, Emil I.; Rhee, Young Min; Richard, Ryan M.; Rohrdanz, Mary A.; Steele, Ryan P.; Sundstrom, Eric J.; Woodcock, H. Lee, III; Zimmerman, Paul M.; Zuev, Dmitry; Albrecht, Ben; Alguire, Ethan; Austin, Brian; Beran, Gregory J. O.; Bernard, Yves A.; Berquist, Eric; Brandhorst, Kai; Bravaya, Ksenia B.; Brown, Shawn T.; Casanova, David; Chang, Chun-Min; Chen, Yunqing; Chien, Siu Hung; Closser, Kristina D.; Crittenden, Deborah L.; Diedenhofen, Michael; DiStasio, Robert A., Jr.; Do, Hainam; Dutoi, Anthony D.; Edgar, Richard G.; Fatehi, Shervin; Fusti-Molnar, Laszlo; Ghysels, An; Golubeva-Zadorozhnaya, Anna; Gomes, Joseph; Hanson-Heine, Magnus W. D.; Harbach, Philipp H. P.; Hauser, Andreas W.; Hohenstein, Edward G.; Holden, Zachary C.; Jagau, Thomas-C.; Ji, Hyunjun; Kaduk, Benjamin; Khistyaev, Kirill; Kim, Jaehoon; Kim, Jihan; King, Rollin A.; Klunzinger, Phil; Kosenkov, Dmytro; Kowalczyk, Tim; Krauter, Caroline M.; Lao, Ka Un; Laurent, Adèle D.; Lawler, Keith V.; Levchenko, Sergey V.; Lin, Ching Yeh; Liu, Fenglai; Livshits, Ester; Lochan, Rohini C.; Luenser, Arne; Manohar, Prashant; Manzer, Samuel F.; Mao, Shan-Ping; Mardirossian, Narbe; Marenich, Aleksandr V.; Maurer, Simon A.; Mayhall, Nicholas J.; Neuscamman, Eric; Oana, C. Melania; Olivares-Amaya, Roberto; O'Neill, Darragh P.; Parkhill, John A.; Perrine, Trilisa M.; Peverati, Roberto; Prociuk, Alexander; Rehn, Dirk R.; Rosta, Edina; Russ, Nicholas J.; Sharada, Shaama M.; Sharma, Sandeep; Small, David W.; Sodt, Alexander; Stein, Tamar; Stück, David; Su, Yu-Chuan; Thom, Alex J. W.; Tsuchimochi, Takashi; Vanovschi, Vitalii; Vogt, Leslie; Vydrov, Oleg; Wang, Tao; Watson, Mark A.; Wenzel, Jan; White, Alec; Williams, Christopher F.; Yang, Jun; Yeganeh, Sina; Yost, Shane R.; You, Zhi-Qiang; Zhang, Igor Ying; Zhang, Xing; Zhao, Yan; Brooks, Bernard R.; Chan, Garnet K. L.; Chipman, Daniel M.; Cramer, Christopher J.; Goddard, William A., III; Gordon, Mark S.; Hehre, Warren J.; Klamt, Andreas; Schaefer, Henry F., III; Schmidt, Michael W.; Sherrill, C. David; Truhlar, Donald G.; Warshel, Arieh; Xu, Xin; Aspuru-Guzik, Alán; Baer, Roi; Bell, Alexis T.; Besley, Nicholas A.; Chai, Jeng-Da; Dreuw, Andreas; Dunietz, Barry D.; Furlani, Thomas R.; Gwaltney, Steven R.; Hsu, Chao-Ping; Jung, Yousung; Kong, Jing; Lambrecht, Daniel S.; Liang, WanZhen; Ochsenfeld, Christian; Rassolov, Vitaly A.; Slipchenko, Lyudmila V.; Subotnik, Joseph E.; Van Voorhis, Troy; Herbert, John M.; Krylov, Anna I.; Gill, Peter M. W.; Head-Gordon, Martin

2015-01-01

A summary of the technical advances that are incorporated in the fourth major release of the Q-Chem quantum chemistry program is provided, covering approximately the last seven years. These include developments in density functional theory methods and algorithms, nuclear magnetic resonance (NMR) property evaluation, coupled cluster and perturbation theories, methods for electronically excited and open-shell species, tools for treating extended environments, algorithms for walking on potential surfaces, analysis tools, energy and electron transfer modelling, parallel computing capabilities, and graphical user interfaces. In addition, a selection of example case studies that illustrate these capabilities is given. These include extensive benchmarks of the comparative accuracy of modern density functionals for bonded and non-bonded interactions, tests of attenuated second order Møller-Plesset (MP2) methods for intermolecular interactions, a variety of parallel performance benchmarks, and tests of the accuracy of implicit solvation models. Some specific chemical examples include calculations on the strongly correlated Cr2 dimer, exploring zeolite-catalysed ethane dehydrogenation, energy decomposition analysis of a charged ter-molecular complex arising from glycerol photoionisation, and natural transition orbitals for a Frenkel exciton state in a nine-unit model of a self-assembling nanotube.
Collectively loading an application in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
Tensor contraction engine: Abstraction and automated parallel implementation of configuration-interaction, coupled-cluster, and many-body perturbation theories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hirata, So

2003-11-20

We develop a symbolic manipulation program and program generator (Tensor Contraction Engine or TCE) that automatically derives the working equations of a well-defined model of second-quantized many-electron theories and synthesizes efficient parallel computer programs on the basis of these equations. Provided an ansatz of a many-electron theory model, TCE performs valid contractions of creation and annihilation operators according to Wick's theorem, consolidates identical terms, and reduces the expressions into the form of multiple tensor contractions acted by permutation operators. Subsequently, it determines the binary contraction order for each multiple tensor contraction with the minimal operation and memory cost, factorizes commonmore » binary contractions (defines intermediate tensors), and identifies reusable intermediates. The resulting ordered list of binary tensor contractions, additions, and index permutations is translated into an optimized program that is combined with the NWChem and UTChem computational chemistry software packages. The programs synthesized by TCE take advantage of spin symmetry, Abelian point-group symmetry, and index permutation symmetry at every stage of calculations to minimize the number of arithmetic operations and storage requirement, adjust the peak local memory usage by index range tiling, and support parallel I/O interfaces and dynamic load balancing for parallel executions. We demonstrate the utility of TCE through automatic derivation and implementation of parallel programs for various models of configuration-interaction theory (CISD, CISDT, CISDTQ), many-body perturbation theory [MBPT(2), MBPT(3), MBPT(4)], and coupled-cluster theory (LCCD, CCD, LCCSD, CCSD, QCISD, CCSDT, and CCSDTQ).« less

Parallelization and checkpointing of GPU applications through program transformation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Solano-Quinde, Lizandro Damian

2012-01-01

GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that makes writing general-purpose applications for running on GPUs tractable have consolidated GPUs as an alternative for accelerating general purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) Industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running in multi-GPU systems. Furthermore, multi-GPU systems help to solvemore » the GPU memory limitation for applications with large application memory footprint. Parallelizing single-GPU applications has been approached by libraries that distribute the workload at runtime, however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. Like any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems present new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. The goal of this work is to exploit higher levels of parallelism and to develop support for application-level fault tolerance in applications using multiple GPUs. Our techniques reduce the burden of enhancing single-GPU applications to support these features. To achieve our goal, this work designs and implements a framework for enhancing a single-GPU OpenCL application through application transformation.« less
SDA 7: A modular and parallel implementation of the simulation of diffusional association software

PubMed Central

Martinez, Michael; Romanowska, Julia; Kokh, Daria B.; Ozboyaci, Musa; Yu, Xiaofeng; Öztürk, Mehmet Ali; Richter, Stefan

2015-01-01

The simulation of diffusional association (SDA) Brownian dynamics software package has been widely used in the study of biomacromolecular association. Initially developed to calculate bimolecular protein–protein association rate constants, it has since been extended to study electron transfer rates, to predict the structures of biomacromolecular complexes, to investigate the adsorption of proteins to inorganic surfaces, and to simulate the dynamics of large systems containing many biomacromolecular solutes, allowing the study of concentration‐dependent effects. These extensions have led to a number of divergent versions of the software. In this article, we report the development of the latest version of the software (SDA 7). This release was developed to consolidate the existing codes into a single framework, while improving the parallelization of the code to better exploit modern multicore shared memory computer architectures. It is built using a modular object‐oriented programming scheme, to allow for easy maintenance and extension of the software, and includes new features, such as adding flexible solute representations. We discuss a number of application examples, which describe some of the methods available in the release, and provide benchmarking data to demonstrate the parallel performance. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc. PMID:26123630
West Virginia US Department of Energy experimental program to stimulate competitive research. Section 2: Human resource development; Section 3: Carbon-based structural materials research cluster; Section 3: Data parallel algorithms for scientific computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1994-02-02

This report consists of three separate but related reports. They are (1) Human Resource Development, (2) Carbon-based Structural Materials Research Cluster, and (3) Data Parallel Algorithms for Scientific Computing. To meet the objectives of the Human Resource Development plan, the plan includes K--12 enrichment activities, undergraduate research opportunities for students at the state`s two Historically Black Colleges and Universities, graduate research through cluster assistantships and through a traineeship program targeted specifically to minorities, women and the disabled, and faculty development through participation in research clusters. One research cluster is the chemistry and physics of carbon-based materials. The objective of thismore » cluster is to develop a self-sustaining group of researchers in carbon-based materials research within the institutions of higher education in the state of West Virginia. The projects will involve analysis of cokes, graphites and other carbons in order to understand the properties that provide desirable structural characteristics including resistance to oxidation, levels of anisotropy and structural characteristics of the carbons themselves. In the proposed cluster on parallel algorithms, research by four WVU faculty and three state liberal arts college faculty are: (1) modeling of self-organized critical systems by cellular automata; (2) multiprefix algorithms and fat-free embeddings; (3) offline and online partitioning of data computation; and (4) manipulating and rendering three dimensional objects. This cluster furthers the state Experimental Program to Stimulate Competitive Research plan by building on existing strengths at WVU in parallel algorithms.« less
Impact of the Flipped Classroom on Student Performance and Retention: A Parallel Controlled Study in General Chemistry

ERIC Educational Resources Information Center

Ryan, Michael D.; Reid, Scott A.

2016-01-01

Despite much recent interest in the flipped classroom, quantitative studies are slowly emerging, particularly in the sciences. We report a year-long parallel controlled study of the flipped classroom in a second-term general chemistry course. The flipped course was piloted in the off-semester course in Fall 2014, and the availability of the…
Combinatorial computational chemistry approach for materials design: applications in deNOx catalysis, Fischer-Tropsch synthesis, lanthanoid complex, and lithium ion secondary battery.

PubMed

Koyama, Michihisa; Tsuboi, Hideyuki; Endou, Akira; Takaba, Hiromitsu; Kubo, Momoji; Del Carpio, Carlos A; Miyamoto, Akira

2007-02-01

Computational chemistry can provide fundamental knowledge regarding various aspects of materials. While its impact in scientific research is greatly increasing, its contributions to industrially important issues are far from satisfactory. In order to realize industrial innovation by computational chemistry, a new concept "combinatorial computational chemistry" has been proposed by introducing the concept of combinatorial chemistry to computational chemistry. This combinatorial computational chemistry approach enables theoretical high-throughput screening for materials design. In this manuscript, we review the successful applications of combinatorial computational chemistry to deNO(x) catalysts, Fischer-Tropsch catalysts, lanthanoid complex catalysts, and cathodes of the lithium ion secondary battery.
A Technological Acceptance of Remote Laboratory in Chemistry Education

ERIC Educational Resources Information Center

Ling, Wendy Sing Yii; Lee, Tien Tien; Tho, Siew Wei

2017-01-01

The purpose of this study is to evaluate the technological acceptance of Chemistry students, and the opinions of Chemistry lecturers and laboratory assistants towards the use of remote laboratory in Chemistry education. The convergent parallel design mixed method was carried out in this study. The instruments involved were questionnaire and…
Quantitative collision induced mass spectrometry of substituted piperazines - A correlative analysis between theory and experiment

NASA Astrophysics Data System (ADS)

Ivanova, Bojidarka; Spiteller, Michael

2017-12-01

The present paper deals with quantitative kinetics and thermodynamics of collision induced dissociation (CID) reactions of piperazines under different experimental conditions together with a systematic description of effect of counter-ions on common MS fragment reactions of piperazines; and intra-molecular effect of quaternary cyclization of substituted piperazines yielding to quaternary salts. There are discussed quantitative model equations of rate constants as well as free Gibbs energies of series of m-independent CID fragment processes in GP, which have been evidenced experimentally. Both kinetic and thermodynamic parameters are also predicted by computational density functional theory (DFT) and ab initio both static and dynamic methods. The paper examines validity of Maxwell-Boltzmann distribution to non-Boltzmann CID processes in quantitatively as well. The experiments conducted within the latter framework yield to an excellent correspondence with theoretical quantum chemical modeling. The important property of presented model equations of reaction kinetics is the applicability in predicting unknown and assigning of known mass spectrometric (MS) patterns. The nature of "GP" continuum of CID-MS coupled scheme of measurements with electrospray ionization (ESI) source is discussed, performing parallel computations in gas-phase (GP) and polar continuum at different temperatures and ionic strengths. The effect of pressure is presented. The study contributes significantly to methodological and phenomenological developments of CID-MS and its analytical implementations for quantitative and structural analyses. It also demonstrates great prospective of a complementary application of experimental CID-MS and computational quantum chemistry studying chemical reactivity, among others. To a considerable extend this work underlies the place of computational quantum chemistry to the field of experimental analytical chemistry in particular highlighting the structural analysis.
Highly parallel implementation of non-adiabatic Ehrenfest molecular dynamics

NASA Astrophysics Data System (ADS)

Kanai, Yosuke; Schleife, Andre; Draeger, Erik; Anisimov, Victor; Correa, Alfredo

2014-03-01

While the adiabatic Born-Oppenheimer approximation tremendously lowers computational effort, many questions in modern physics, chemistry, and materials science require an explicit description of coupled non-adiabatic electron-ion dynamics. Electronic stopping, i.e. the energy transfer of a fast projectile atom to the electronic system of the target material, is a notorious example. We recently implemented real-time time-dependent density functional theory based on the plane-wave pseudopotential formalism in the Qbox/qb@ll codes. We demonstrate that explicit integration using a fourth-order Runge-Kutta scheme is very suitable for modern highly parallelized supercomputers. Applying the new implementation to systems with hundreds of atoms and thousands of electrons, we achieved excellent performance and scalability on a large number of nodes both on the BlueGene based ``Sequoia'' system at LLNL as well as the Cray architecture of ``Blue Waters'' at NCSA. As an example, we discuss our work on computing the electronic stopping power of aluminum and gold for hydrogen projectiles, showing an excellent agreement with experiment. These first-principles calculations allow us to gain important insight into the the fundamental physics of electronic stopping.
Evolution of computational chemistry, the "launch pad" to scientific computational models: The early days from a personal account, the present status from the TACC-2012 congress, and eventual future applications from the global simulation approach

NASA Astrophysics Data System (ADS)

Clementi, Enrico

2012-06-01

This is the introductory chapter to the AIP Proceedings volume "Theory and Applications of Computational Chemistry: The First Decade of the Second Millennium" where we discuss the evolution of "computational chemistry". Very early variational computational chemistry developments are reported in Sections 1 to 7, and 11, 12 by recalling some of the computational chemistry contributions by the author and his collaborators (from late 1950 to mid 1990); perturbation techniques are not considered in this already extended work. Present day's computational chemistry is partly considered in Sections 8 to 10 where more recent studies by the author and his collaborators are discussed, including the Hartree-Fock-Heitler-London method; a more general discussion on present day computational chemistry is presented in Section 14. The following chapters of this AIP volume provide a view of modern computational chemistry. Future computational chemistry developments can be extrapolated from the chapters of this AIP volume; further, in Sections 13 and 15 present an overall analysis on computational chemistry, obtained from the Global Simulation approach, by considering the evolution of scientific knowledge confronted with the opportunities offered by modern computers.
What Physicists Should Know About High Performance Computing - Circa 2002

NASA Astrophysics Data System (ADS)

Frederick, Donald

2002-08-01

High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.
Virtually going green: The role of quantum computational chemistry in reducing pollution and toxicity in chemistry

NASA Astrophysics Data System (ADS)

Stevens, Jonathan

2017-07-01

Continuing advances in computational chemistry has permitted quantum mechanical calculation to assist in research in green chemistry and to contribute to the greening of chemical practice. Presented here are recent examples illustrating the contribution of computational quantum chemistry to green chemistry, including the possibility of using computation as a green alternative to experiments, but also illustrating contributions to greener catalysis and the search for greener solvents. Examples of applications of computation to ambitious projects for green synthetic chemistry using carbon dioxide are also presented.
Aggregating job exit statuses of a plurality of compute nodes executing a parallel application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregatingmore » each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.« less
Accelerating Dust Storm Simulation by Balancing Task Allocation in Parallel Computing Environment

NASA Astrophysics Data System (ADS)

Gui, Z.; Yang, C.; XIA, J.; Huang, Q.; YU, M.

2013-12-01

Dust storm has serious negative impacts on environment, human health, and assets. The continuing global climate change has increased the frequency and intensity of dust storm in the past decades. To better understand and predict the distribution, intensity and structure of dust storm, a series of dust storm models have been developed, such as Dust Regional Atmospheric Model (DREAM), the NMM meteorological module (NMM-dust) and Chinese Unified Atmospheric Chemistry Environment for Dust (CUACE/Dust). The developments and applications of these models have contributed significantly to both scientific research and our daily life. However, dust storm simulation is a data and computing intensive process. Normally, a simulation for a single dust storm event may take several days or hours to run. It seriously impacts the timeliness of prediction and potential applications. To speed up the process, high performance computing is widely adopted. By partitioning a large study area into small subdomains according to their geographic location and executing them on different computing nodes in a parallel fashion, the computing performance can be significantly improved. Since spatiotemporal correlations exist in the geophysical process of dust storm simulation, each subdomain allocated to a node need to communicate with other geographically adjacent subdomains to exchange data. Inappropriate allocations may introduce imbalance task loads and unnecessary communications among computing nodes. Therefore, task allocation method is the key factor, which may impact the feasibility of the paralleling. The allocation algorithm needs to carefully leverage the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire system. This presentation introduces two algorithms for such allocation and compares them with evenly distributed allocation method. Specifically, 1) In order to get optimized solutions, a quadratic programming based modeling method is proposed. This algorithm performs well with small amount of computing tasks. However, its efficiency decreases significantly as the subdomain number and computing node number increase. 2) To compensate performance decreasing for large scale tasks, a K-Means clustering based algorithm is introduced. Instead of dedicating to get optimized solutions, this method can get relatively good feasible solutions within acceptable time. However, it may introduce imbalance communication for nodes or node-isolated subdomains. This research shows both two algorithms have their own strength and weakness for task allocation. A combination of the two algorithms is under study to obtain a better performance. Keywords: Scheduling; Parallel Computing; Load Balance; Optimization; Cost Model
Theoretical Hammett Plot for the Gas-Phase Ionization of Benzoic Acid versus Phenol: A Computational Chemistry Lab Exercise

ERIC Educational Resources Information Center

Ziegler, Blake E.

2013-01-01

Computational chemistry undergraduate laboratory courses are now part of the chemistry curriculum at many universities. However, there remains a lack of computational chemistry exercises available to instructors. This exercise is presented for students to develop skills using computational chemistry software while supplementing their knowledge of…
Enabling Chemistry Technologies and Parallel Synthesis-Accelerators of Drug Discovery Programmes.

PubMed

Vasudevan, A; Bogdan, A R; Koolman, H F; Wang, Y; Djuric, S W

There is a pressing need to improve overall productivity in the pharmaceutical industry. Judicious investments in chemistry technologies can have a significant impact on cycle times, cost of goods and probability of technical success. This perspective describes some of these technologies developed and implemented at AbbVie, and their applications to the synthesis of novel scaffolds and to parallel synthesis. © 2017 Elsevier B.V. All rights reserved.
Performance Characterization of Global Address Space Applications: A Case Study with NWChem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hammond, Jeffrey R.; Krishnamoorthy, Sriram; Shende, Sameer

The use of global address space languages and one-sided communication for complex applications is gaining attention in the parallel computing community. However, lack of good evaluative methods to observe multiple levels of performance makes it difficult to isolate the cause of performance deficiencies and to understand the fundamental limitations of system and application design for future improvement. NWChem is a popular computational chemistry package which depends on the Global Arrays/ ARMCI suite for partitioned global address space functionality to deliver high-end molecular modeling capabilities. A workload characterization methodology was developed to support NWChem performance engineering on large-scale parallel platforms. Themore » research involved both the integration of performance instrumentation and measurement in the NWChem software, as well as the analysis of one-sided communication performance in the context of NWChem workloads. Scaling studies were conducted for NWChem on Blue Gene/P and on two large-scale clusters using different generation Infiniband interconnects and x86 processors. The performance analysis and results show how subtle changes in the runtime parameters related to the communication subsystem could have significant impact on performance behavior. The tool has successfully identified several algorithmic bottlenecks which are already being tackled by computational chemists to improve NWChem performance.« less
Molcas 8: New capabilities for multiconfigurational quantum chemical calculations across the periodic table.

PubMed

Aquilante, Francesco; Autschbach, Jochen; Carlson, Rebecca K; Chibotaru, Liviu F; Delcey, Mickaël G; De Vico, Luca; Fdez Galván, Ignacio; Ferré, Nicolas; Frutos, Luis Manuel; Gagliardi, Laura; Garavelli, Marco; Giussani, Angelo; Hoyer, Chad E; Li Manni, Giovanni; Lischka, Hans; Ma, Dongxia; Malmqvist, Per Åke; Müller, Thomas; Nenov, Artur; Olivucci, Massimo; Pedersen, Thomas Bondo; Peng, Daoling; Plasser, Felix; Pritchard, Ben; Reiher, Markus; Rivalta, Ivan; Schapiro, Igor; Segarra-Martí, Javier; Stenrup, Michael; Truhlar, Donald G; Ungur, Liviu; Valentini, Alessio; Vancoillie, Steven; Veryazov, Valera; Vysotskiy, Victor P; Weingart, Oliver; Zapata, Felipe; Lindh, Roland

2016-02-15

In this report, we summarize and describe the recent unique updates and additions to the Molcas quantum chemistry program suite as contained in release version 8. These updates include natural and spin orbitals for studies of magnetic properties, local and linear scaling methods for the Douglas-Kroll-Hess transformation, the generalized active space concept in MCSCF methods, a combination of multiconfigurational wave functions with density functional theory in the MC-PDFT method, additional methods for computation of magnetic properties, methods for diabatization, analytical gradients of state average complete active space SCF in association with density fitting, methods for constrained fragment optimization, large-scale parallel multireference configuration interaction including analytic gradients via the interface to the Columbus package, and approximations of the CASPT2 method to be used for computations of large systems. In addition, the report includes the description of a computational machinery for nonlinear optical spectroscopy through an interface to the QM/MM package Cobramm. Further, a module to run molecular dynamics simulations is added, two surface hopping algorithms are included to enable nonadiabatic calculations, and the DQ method for diabatization is added. Finally, we report on the subject of improvements with respects to alternative file options and parallelization. © 2015 Wiley Periodicals, Inc.
NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations

NASA Astrophysics Data System (ADS)

Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A.

2010-09-01

The latest release of NWChem delivers an open-source computational chemistry package with extensive capabilities for large scale simulations of chemical and biological systems. Utilizing a common computational framework, diverse theoretical descriptions can be used to provide the best solution for a given scientific problem. Scalable parallel implementations and modular software design enable efficient utilization of current computational architectures. This paper provides an overview of NWChem focusing primarily on the core theoretical modules provided by the code and their parallel performance. Program summaryProgram title: NWChem Catalogue identifier: AEGI_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEGI_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Open Source Educational Community License No. of lines in distributed program, including test data, etc.: 11 709 543 No. of bytes in distributed program, including test data, etc.: 680 696 106 Distribution format: tar.gz Programming language: Fortran 77, C Computer: all Linux based workstations and parallel supercomputers, Windows and Apple machines Operating system: Linux, OS X, Windows Has the code been vectorised or parallelized?: Code is parallelized Classification: 2.1, 2.2, 3, 7.3, 7.7, 16.1, 16.2, 16.3, 16.10, 16.13 Nature of problem: Large-scale atomistic simulations of chemical and biological systems require efficient and reliable methods for ground and excited solutions of many-electron Hamiltonian, analysis of the potential energy surface, and dynamics. Solution method: Ground and excited solutions of many-electron Hamiltonian are obtained utilizing density-functional theory, many-body perturbation approach, and coupled cluster expansion. These solutions or a combination thereof with classical descriptions are then used to analyze potential energy surface and perform dynamical simulations. Additional comments: Full documentation is provided in the distribution file. This includes an INSTALL file giving details of how to build the package. A set of test runs is provided in the examples directory. The distribution file for this program is over 90 Mbytes and therefore is not delivered directly when download or Email is requested. Instead a html file giving details of how the program can be obtained is sent. Running time: Running time depends on the size of the chemical system, complexity of the method, number of cpu's and the computational task. It ranges from several seconds for serial DFT energy calculations on a few atoms to several hours for parallel coupled cluster energy calculations on tens of atoms or ab-initio molecular dynamics simulation on hundreds of atoms.
Electron-correlated fragment-molecular-orbital calculations for biomolecular and nano systems.

PubMed

Tanaka, Shigenori; Mochizuki, Yuji; Komeiji, Yuto; Okiyama, Yoshio; Fukuzawa, Kaori

2014-06-14

Recent developments in the fragment molecular orbital (FMO) method for theoretical formulation, implementation, and application to nano and biomolecular systems are reviewed. The FMO method has enabled ab initio quantum-mechanical calculations for large molecular systems such as protein-ligand complexes at a reasonable computational cost in a parallelized way. There have been a wealth of application outcomes from the FMO method in the fields of biochemistry, medicinal chemistry and nanotechnology, in which the electron correlation effects play vital roles. With the aid of the advances in high-performance computing, the FMO method promises larger, faster, and more accurate simulations of biomolecular and related systems, including the descriptions of dynamical behaviors in solvent environments. The current status and future prospects of the FMO scheme are addressed in these contexts.
Contributions of computational chemistry and biophysical techniques to fragment-based drug discovery.

PubMed

Gozalbes, Rafael; Carbajo, Rodrigo J; Pineda-Lucena, Antonio

2010-01-01

In the last decade, fragment-based drug discovery (FBDD) has evolved from a novel approach in the search of new hits to a valuable alternative to the high-throughput screening (HTS) campaigns of many pharmaceutical companies. The increasing relevance of FBDD in the drug discovery universe has been concomitant with an implementation of the biophysical techniques used for the detection of weak inhibitors, e.g. NMR, X-ray crystallography or surface plasmon resonance (SPR). At the same time, computational approaches have also been progressively incorporated into the FBDD process and nowadays several computational tools are available. These stretch from the filtering of huge chemical databases in order to build fragment-focused libraries comprising compounds with adequate physicochemical properties, to more evolved models based on different in silico methods such as docking, pharmacophore modelling, QSAR and virtual screening. In this paper we will review the parallel evolution and complementarities of biophysical techniques and computational methods, providing some representative examples of drug discovery success stories by using FBDD.

Simulated Raman Spectral Analysis of Organic Molecules

NASA Astrophysics Data System (ADS)

Lu, Lu

The advent of the laser technology in the 1960s solved the main difficulty of Raman spectroscopy, resulted in simplified Raman spectroscopy instruments and also boosted the sensitivity of the technique. Up till now, Raman spectroscopy is commonly used in chemistry and biology. As vibrational information is specific to the chemical bonds, Raman spectroscopy provides fingerprints to identify the type of molecules in the sample. In this thesis, we simulate the Raman Spectrum of organic and inorganic materials by General Atomic and Molecular Electronic Structure System (GAMESS) and Gaussian, two computational codes that perform several general chemistry calculations. We run these codes on our CPU-based high-performance cluster (HPC). Through the message passing interface (MPI), a standardized and portable message-passing system which can make the codes run in parallel, we are able to decrease the amount of time for computation and increase the sizes and capacities of systems simulated by the codes. From our simulations, we will set up a database that allows search algorithm to quickly identify N-H and O-H bonds in different materials. Our ultimate goal is to analyze and identify the spectra of organic matter compositions from meteorites and compared these spectra with terrestrial biologically-produced amino acids and residues.
Large-Scale Parallel Simulations of Turbulent Combustion using Combined Dimension Reduction and Tabulation of Chemistry

DTIC Science & Technology

2012-05-22

tabulation of the reduced space is performed using the In Situ Adaptive Tabulation ( ISAT ) algorithm. In addition, we use x2f mpi – a Fortran library...for parallel vector-valued function evaluation (used with ISAT in this context) – to efficiently redistribute the chemistry workload among the...Constrained-Equilibrium (RCCE) method, and tabulation of the reduced space is performed using the In Situ Adaptive Tabulation ( ISAT ) algorithm. In addition
Parallel computational fluid dynamics '91; Conference Proceedings, Stuttgart, Germany, Jun. 10-12, 1991

NASA Technical Reports Server (NTRS)

Reinsch, K. G. (Editor); Schmidt, W. (Editor); Ecer, A. (Editor); Haeuser, Jochem (Editor); Periaux, J. (Editor)

1992-01-01

A conference was held on parallel computational fluid dynamics and produced related papers. Topics discussed in these papers include: parallel implicit and explicit solvers for compressible flow, parallel computational techniques for Euler and Navier-Stokes equations, grid generation techniques for parallel computers, and aerodynamic simulation om massively parallel systems.
The Research of the Parallel Computing Development from the Angle of Cloud Computing

NASA Astrophysics Data System (ADS)

Peng, Zhensheng; Gong, Qingge; Duan, Yanyu; Wang, Yun

2017-10-01

Cloud computing is the development of parallel computing, distributed computing and grid computing. The development of cloud computing makes parallel computing come into people’s lives. Firstly, this paper expounds the concept of cloud computing and introduces two several traditional parallel programming model. Secondly, it analyzes and studies the principles, advantages and disadvantages of OpenMP, MPI and Map Reduce respectively. Finally, it takes MPI, OpenMP models compared to Map Reduce from the angle of cloud computing. The results of this paper are intended to provide a reference for the development of parallel computing.
Spatial data analytics on heterogeneous multi- and many-core parallel architectures using python

USGS Publications Warehouse

Laura, Jason R.; Rey, Sergio J.

2017-01-01

Parallel vector spatial analysis concerns the application of parallel computational methods to facilitate vector-based spatial analysis. The history of parallel computation in spatial analysis is reviewed, and this work is placed into the broader context of high-performance computing (HPC) and parallelization research. The rise of cyber infrastructure and its manifestation in spatial analysis as CyberGIScience is seen as a main driver of renewed interest in parallel computation in the spatial sciences. Key problems in spatial analysis that have been the focus of parallel computing are covered. Chief among these are spatial optimization problems, computational geometric problems including polygonization and spatial contiguity detection, the use of Monte Carlo Markov chain simulation in spatial statistics, and parallel implementations of spatial econometric methods. Future directions for research on parallelization in computational spatial analysis are outlined.
Metal-Acetylacetonate Synthesis Experiments: Which Is Greener?

ERIC Educational Resources Information Center

Ribeiro, M. Gabriela T. C.; Machado, Adlio A. S. C.

2011-01-01

A procedure for teaching green chemistry through laboratory experiments is presented in which students are challenged to use the 12 principles of green chemistry to review and modify synthesis protocols to improve greenness. A global metric, green star, is used in parallel with green chemistry mass metrics to evaluate the improvement in greenness.…
Integrating Computational Chemistry into a Course in Classical Thermodynamics

ERIC Educational Resources Information Center

Martini, Sheridan R.; Hartzell, Cynthia J.

2015-01-01

Computational chemistry is commonly addressed in the quantum mechanics course of undergraduate physical chemistry curricula. Since quantum mechanics traditionally follows the thermodynamics course, there is a lack of curricula relating computational chemistry to thermodynamics. A method integrating molecular modeling software into a semester long…
chemf: A purely functional chemistry toolkit.

PubMed

Höck, Stefan; Riedl, Rainer

2012-12-20

Although programming in a type-safe and referentially transparent style offers several advantages over working with mutable data structures and side effects, this style of programming has not seen much use in chemistry-related software. Since functional programming languages were designed with referential transparency in mind, these languages offer a lot of support when writing immutable data structures and side-effects free code. We therefore started implementing our own toolkit based on the above programming paradigms in a modern, versatile programming language. We present our initial results with functional programming in chemistry by first describing an immutable data structure for molecular graphs together with a couple of simple algorithms to calculate basic molecular properties before writing a complete SMILES parser in accordance with the OpenSMILES specification. Along the way we show how to deal with input validation, error handling, bulk operations, and parallelization in a purely functional way. At the end we also analyze and improve our algorithms and data structures in terms of performance and compare it to existing toolkits both object-oriented and purely functional. All code was written in Scala, a modern multi-paradigm programming language with a strong support for functional programming and a highly sophisticated type system. We have successfully made the first important steps towards a purely functional chemistry toolkit. The data structures and algorithms presented in this article perform well while at the same time they can be safely used in parallelized applications, such as computer aided drug design experiments, without further adjustments. This stands in contrast to existing object-oriented toolkits where thread safety of data structures and algorithms is a deliberate design decision that can be hard to implement. Finally, the level of type-safety achieved by Scala highly increased the reliability of our code as well as the productivity of the programmers involved in this project.
chemf: A purely functional chemistry toolkit

PubMed Central

2012-01-01

Background Although programming in a type-safe and referentially transparent style offers several advantages over working with mutable data structures and side effects, this style of programming has not seen much use in chemistry-related software. Since functional programming languages were designed with referential transparency in mind, these languages offer a lot of support when writing immutable data structures and side-effects free code. We therefore started implementing our own toolkit based on the above programming paradigms in a modern, versatile programming language. Results We present our initial results with functional programming in chemistry by first describing an immutable data structure for molecular graphs together with a couple of simple algorithms to calculate basic molecular properties before writing a complete SMILES parser in accordance with the OpenSMILES specification. Along the way we show how to deal with input validation, error handling, bulk operations, and parallelization in a purely functional way. At the end we also analyze and improve our algorithms and data structures in terms of performance and compare it to existing toolkits both object-oriented and purely functional. All code was written in Scala, a modern multi-paradigm programming language with a strong support for functional programming and a highly sophisticated type system. Conclusions We have successfully made the first important steps towards a purely functional chemistry toolkit. The data structures and algorithms presented in this article perform well while at the same time they can be safely used in parallelized applications, such as computer aided drug design experiments, without further adjustments. This stands in contrast to existing object-oriented toolkits where thread safety of data structures and algorithms is a deliberate design decision that can be hard to implement. Finally, the level of type-safety achieved by Scala highly increased the reliability of our code as well as the productivity of the programmers involved in this project. PMID:23253942
Outlook Bright for Computers in Chemistry.

ERIC Educational Resources Information Center

Baum, Rudy M.

1981-01-01

Discusses the recent decision to close down the National Resource for Computation in Chemistry (NRCC), implications of that decision, and various alternatives in the field of computational chemistry. (CS)
Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Verma, Prakash; Morales, Jorge A., E-mail: jorge.morales@ttu.edu; Perera, Ajith

2013-11-07

Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. Inmore » this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the {sup 11}B, {sup 17}O, {sup 9}Be, {sup 19}F, {sup 1}H, {sup 13}C, {sup 35}Cl, {sup 33}S,{sup 14}N, {sup 31}P, and {sup 67}Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N{sup 7}-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate experimental ESR spectra, to interpret spin-density distributions, and to characterize and identify radical species is illustrated with our results from large organic radicals. Those include species relevant for organic chemistry, petroleum industry, and biochemistry, such as the cyclo-hexyl, 1-adamatyl, and Zn-porphycene anion radicals, inter alia.« less
Broadcasting collective operation contributions throughout a parallel computer

DOEpatents

Faraj, Ahmad [Rochester, MN

2012-02-21

Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.
Application Portable Parallel Library

NASA Technical Reports Server (NTRS)

Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

1995-01-01

Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also include heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Quantum kernel applications in medicinal chemistry.

PubMed

Huang, Lulu; Massa, Lou

2012-07-01

Progress in the quantum mechanics of biological molecules is being driven by computational advances. The notion of quantum kernels can be introduced to simplify the formalism of quantum mechanics, making it especially suitable for parallel computation of very large biological molecules. The essential idea is to mathematically break large biological molecules into smaller kernels that are calculationally tractable, and then to represent the full molecule by a summation over the kernels. The accuracy of the kernel energy method (KEM) is shown by systematic application to a great variety of molecular types found in biology. These include peptides, proteins, DNA and RNA. Examples are given that explore the KEM across a variety of chemical models, and to the outer limits of energy accuracy and molecular size. KEM represents an advance in quantum biology applicable to problems in medicine and drug design.
Multiscale Modeling: A Review

NASA Astrophysics Data System (ADS)

Horstemeyer, M. F.

This review of multiscale modeling covers a brief history of various multiscale methodologies related to solid materials and the associated experimental influences, the various influence of multiscale modeling on different disciplines, and some examples of multiscale modeling in the design of structural components. Although computational multiscale modeling methodologies have been developed in the late twentieth century, the fundamental notions of multiscale modeling have been around since da Vinci studied different sizes of ropes. The recent rapid growth in multiscale modeling is the result of the confluence of parallel computing power, experimental capabilities to characterize structure-property relations down to the atomic level, and theories that admit multiple length scales. The ubiquitous research that focus on multiscale modeling has broached different disciplines (solid mechanics, fluid mechanics, materials science, physics, mathematics, biological, and chemistry), different regions of the world (most continents), and different length scales (from atoms to autos).
Laser Powered Launch Vehicle Performance Analyses

NASA Technical Reports Server (NTRS)

Chen, Yen-Sen; Liu, Jiwen; Wang, Ten-See (Technical Monitor)

2001-01-01

The purpose of this study is to establish the technical ground for modeling the physics of laser powered pulse detonation phenomenon. Laser powered propulsion systems involve complex fluid dynamics, thermodynamics and radiative transfer processes. Successful predictions of the performance of laser powered launch vehicle concepts depend on the sophisticate models that reflects the underlying flow physics including the laser ray tracing the focusing, inverse Bremsstrahlung (IB) effects, finite-rate air chemistry, thermal non-equilibrium, plasma radiation and detonation wave propagation, etc. The proposed work will extend the base-line numerical model to an efficient design analysis tool. The proposed model is suitable for 3-D analysis using parallel computing methods.
Performance of parallel computation using CUDA for solving the one-dimensional elasticity equations

NASA Astrophysics Data System (ADS)

Darmawan, J. B. B.; Mungkasi, S.

2017-01-01

In this paper, we investigate the performance of parallel computation in solving the one-dimensional elasticity equations. Elasticity equations are usually implemented in engineering science. Solving these equations fast and efficiently is desired. Therefore, we propose the use of parallel computation. Our parallel computation uses CUDA of the NVIDIA. Our research results show that parallel computation using CUDA has a great advantage and is powerful when the computation is of large scale.
Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing

NASA Technical Reports Server (NTRS)

Navaz, Homayun K.

2002-01-01

Computational Fluid Dynamics (CFD) has considerably evolved in the last decade. There are many computer programs that can perform computations on viscous internal or external flows with chemical reactions. CFD has become a commonly used tool in the design and analysis of gas turbines, ramjet combustors, turbo-machinery, inlet ducts, rocket engines, jet interaction, missile, and ramjet nozzles. One of the problems of interest to NASA has always been the performance prediction for rocket and air-breathing engines. Due to the complexity of flow in these engines it is necessary to resolve the flowfield into a fine mesh to capture quantities like turbulence and heat transfer. However, calculation on a high-resolution grid is associated with a prohibitively increasing computational time that can downgrade the value of the CFD for practical engineering calculations. The Liquid Thrust Chamber Performance (LTCP) code was developed for NASA/MSFC (Marshall Space Flight Center) to perform liquid rocket engine performance calculations. This code is a 2D/axisymmetric full Navier-Stokes (NS) solver with fully coupled finite rate chemistry and Eulerian treatment of liquid fuel and/or oxidizer droplets. One of the advantages of this code has been the resemblance of its input file to the JANNAF (Joint Army Navy NASA Air Force Interagency Propulsion Committee) standard TDK code, and its automatic grid generation for JANNAF defined combustion chamber wall geometry. These options minimize the learning effort for TDK users, and make the code a good candidate for performing engineering calculations. Although the LTCP code was developed for liquid rocket engines, it is a general-purpose code and has been used for solving many engineering problems. However, the single zone formulation of the LTCP has limited the code to be applicable to problems with complex geometry. Furthermore, the computational time becomes prohibitively large for high-resolution problems with chemistry, two-equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include the multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.
Increasing processor utilization during parallel computation rundown

NASA Technical Reports Server (NTRS)

Jones, W. H.

1986-01-01

Some parallel processing environments provide for asynchronous execution and completion of general purpose parallel computations from a single computational phase. When all the computations from such a phase are complete, a new parallel computational phase is begun. Depending upon the granularity of the parallel computations to be performed, there may be a shortage of available work as a particular computational phase draws to a close (computational rundown). This can result in the waste of computing resources and the delay of the overall problem. In many practical instances, strict sequential ordering of phases of parallel computation is not totally required. In such cases, the beginning of one phase can be correctly computed before the end of a previous phase is completed. This allows additional work to be generated somewhat earlier to keep computing resources busy during each computational rundown. The conditions under which this can occur are identified and the frequency of occurrence of such overlapping in an actual parallel Navier-Stokes code is reported. A language construct is suggested and possible control strategies for the management of such computational phase overlapping are discussed.
Simulation of Aerosols and Chemistry with a Unified Global Model

NASA Technical Reports Server (NTRS)

Chin, Mian

2004-01-01

This project is to continue the development of the global simulation capabilities of tropospheric and stratospheric chemistry and aerosols in a unified global model. This is a part of our overall investigation of aerosol-chemistry-climate interaction. In the past year, we have enabled the tropospheric chemistry simulations based on the GEOS-CHEM model, and added stratospheric chemical reactions into the GEOS-CHEM such that a globally unified troposphere-stratosphere chemistry and transport can be simulated consistently without any simplifications. The tropospheric chemical mechanism in the GEOS-CHEM includes 80 species and 150 reactions. 24 tracers are transported, including O3, NOx, total nitrogen (NOy), H2O2, CO, and several types of hydrocarbon. The chemical solver used in the GEOS-CHEM model is a highly accurate sparse-matrix vectorized Gear solver (SMVGEAR). The stratospheric chemical mechanism includes an additional approximately 100 reactions and photolysis processes. Because of the large number of total chemical reactions and photolysis processes and very different photochemical regimes involved in the unified simulation, the model demands significant computer resources that are currently not practical. Therefore, several improvements will be taken, such as massive parallelization, code optimization, or selecting a faster solver. We have also continued aerosol simulation (including sulfate, dust, black carbon, organic carbon, and sea-salt) in the global model to cover most of year 2002. These results have been made available to many groups worldwide and accessible from the website http://code916.gsfc.nasa.gov/People/Chin/aot.html.

Broadcasting a message in a parallel computer

DOEpatents

Berg, Jeremy E [Rochester, MN; Faraj, Ahmad A [Rochester, MN

2011-08-02

Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
Force user's manual: A portable, parallel FORTRAN

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

1990-01-01

The use of Force, a parallel, portable FORTRAN on shared memory parallel computers is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray-YMP, Convex 220, Flex/32, Encore, Sequent, Alliant computers on which it is installed.
Disciplines, models, and computers: the path to computational quantum chemistry.

PubMed

Lenhard, Johannes

2014-12-01

Many disciplines and scientific fields have undergone a computational turn in the past several decades. This paper analyzes this sort of turn by investigating the case of computational quantum chemistry. The main claim is that the transformation from quantum to computational quantum chemistry involved changes in three dimensions. First, on the side of instrumentation, small computers and a networked infrastructure took over the lead from centralized mainframe architecture. Second, a new conception of computational modeling became feasible and assumed a crucial role. And third, the field of computa- tional quantum chemistry became organized in a market-like fashion and this market is much bigger than the number of quantum theory experts. These claims will be substantiated by an investigation of the so-called density functional theory (DFT), the arguably pivotal theory in the turn to computational quantum chemistry around 1990.
Architecture Adaptive Computing Environment

NASA Technical Reports Server (NTRS)

Dorband, John E.

2006-01-01

Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple- instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
Parallel computing works

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of manymore » computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.« less
Computer Series, 101: Accurate Equations of State in Computational Chemistry Projects.

ERIC Educational Resources Information Center

Albee, David; Jones, Edward

1989-01-01

Discusses the use of computers in chemistry courses at the United States Military Academy. Provides two examples of computer projects: (1) equations of state, and (2) solving for molar volume. Presents BASIC and PASCAL listings for the second project. Lists 10 applications for physical chemistry. (MVL)
Distributing an executable job load file to compute nodes in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gooding, Thomas M.

Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications linkmore » over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.« less
TABULATED EQUIVALENT SDR FLAMELET (TESF) MODEFL

DOE Office of Scientific and Technical Information (OSTI.GOV)

KUNDU, PRITHWISH; AMEEN, mUHSIN MOHAMMED; UNNIKRISHNAN, UMESH

The code consists of an implementation of a novel tabulated combustion model for non-premixed flames in CFD solvers. This novel technique/model is used to implement an unsteady flamelet tabulation without using progress variables for non-premixed flames. It also has the capability to include history effects which is unique within tabulated flamelet models. The flamelet table generation code can be run in parallel to generate tables with large chemistry mechanisms in relatively short wall clock times. The combustion model/code reads these tables. This framework can be coupled with any CFD solver with RANS as well as LES turbulence models. This frameworkmore » enables CFD solvers to run large chemistry mechanisms with large number of grids at relatively lower computational costs. Currently it has been coupled with the Converge CFD code and validated against available experimental data. This model can be used to simulate non-premixed combustion in a variety of applications like reciprocating engines, gas turbines and industrial burners operating over a wide range of fuels.« less
Development and Assessment of a Chemistry-Based Computer Video Game as a Learning Tool

ERIC Educational Resources Information Center

Martinez-Hernandez, Kermin Joel

2010-01-01

The chemistry-based computer video game is a multidisciplinary collaboration between chemistry and computer graphics and technology fields developed to explore the use of video games as a possible learning tool. This innovative approach aims to integrate elements of commercial video game and authentic chemistry context environments into a learning…
Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostlund, Neil

This research showed the feasibility of applying the concepts of the Semantic Web to Computation Chemistry. We have created the first web portal (www.chemsem.com) that allows data created in the calculations of quantum chemistry, and other such chemistry calculations to be placed on the web in a way that makes the data accessible to scientists in a semantic form never before possible. The semantic web nature of the portal allows data to be searched, found, and used as an advance over the usual approach of a relational database. The semantic data on our portal has the nature of a Giantmore » Global Graph (GGG) that can be easily merged with related data and searched globally via a SPARQL Protocol and RDF Query Language (SPARQL) that makes global searches for data easier than with traditional methods. Our Semantic Web Portal requires that the data be understood by a computer and hence defined by an ontology (vocabulary). This ontology is used by the computer in understanding the data. We have created such an ontology for computational chemistry (purl.org/gc) that encapsulates a broad knowledge of the field of computational chemistry. We refer to this ontology as the Gainesville Core. While it is perhaps the first ontology for computational chemistry and is used by our portal, it is only a start of what must be a long multi-partner effort to define computational chemistry. In conjunction with the above efforts we have defined a new potential file standard (Common Standard for eXchange – CSX for computational chemistry data). This CSX file is the precursor of data in the Resource Description Framework (RDF) form that the semantic web requires. Our portal translates CSX files (as well as other computational chemistry data files) into RDF files that are part of the graph database that the semantic web employs. We propose a CSX file as a convenient way to encapsulate computational chemistry data.« less
Computational Chemistry in the Pharmaceutical Industry: From Childhood to Adolescence.

PubMed

Hillisch, Alexander; Heinrich, Nikolaus; Wild, Hanno

2015-12-01

Computational chemistry within the pharmaceutical industry has grown into a field that proactively contributes to many aspects of drug design, including target selection and lead identification and optimization. While methodological advancements have been key to this development, organizational developments have been crucial to our success as well. In particular, the interaction between computational and medicinal chemistry and the integration of computational chemistry into the entire drug discovery process have been invaluable. Over the past ten years we have shaped and developed a highly efficient computational chemistry group for small-molecule drug discovery at Bayer HealthCare that has significantly impacted the clinical development pipeline. In this article we describe the setup and tasks of the computational group and discuss external collaborations. We explain what we have found to be the most valuable and productive methods and discuss future directions for computational chemistry method development. We share this information with the hope of igniting interesting discussions around this topic. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
First-Principles Design of Novel Catalytic and Chemoresponsive Materials

NASA Astrophysics Data System (ADS)

Roling, Luke T.

An emerging trend in materials design is the use of computational chemistry tools to accelerate materials discovery and implementation. In particular, the parallel nature of computational models enables high-throughput screening approaches that would be laborious and time-consuming with experiments alone, and can be useful for identifying promising candidate materials for experimental synthesis and evaluation. Additionally, atomic-scale modeling allows researchers to obtain a detailed understanding of phenomena invisible to many current experimental techniques. In this thesis, we highlight mechanistic studies and successes in catalyst design for heterogeneous electrochemical reactions, discussing both anode and cathode chemistries. In particular, we evaluate the properties of a new class of Pd-Pt core-shell and hollow nanocatalysts toward the oxygen reduction reaction. We do not limit our study to electrochemical reactivity, but also consider these catalysts in a broader context by performing in-depth studies of their stability at elevated temperatures as well as investigating the mechanisms by which they are able to form. We also present fundamental surface science studies, investigating graphene formation and H2 dissociation, which are processes of both fundamental and practical interest in many catalytic applications. Finally, we extend our materials design paradigm outside the field of catalysis to develop and apply a model for the detection of small chemical analytes by chemoresponsive liquid crystals, and offer several predictions for improving the detection of small chemicals. A close connection between computation, synthesis, and experimental evaluation is essential to the work described herein, as computations are used to gain fundamental insight into experimental observations, and experiments and synthesis are in turn used to validate predictions of material activities from computational models.
Enhancement of DFT-calculations at petascale: Nuclear Magnetic Resonance, Hybrid Density Functional Theory and Car-Parrinello calculations

NASA Astrophysics Data System (ADS)

Varini, Nicola; Ceresoli, Davide; Martin-Samos, Layla; Girotto, Ivan; Cavazzoni, Carlo

2013-08-01

One of the most promising techniques used for studying the electronic properties of materials is based on Density Functional Theory (DFT) approach and its extensions. DFT has been widely applied in traditional solid state physics problems where periodicity and symmetry play a crucial role in reducing the computational workload. With growing compute power capability and the development of improved DFT methods, the range of potential applications is now including other scientific areas such as Chemistry and Biology. However, cross disciplinary combinations of traditional Solid-State Physics, Chemistry and Biology drastically improve the system complexity while reducing the degree of periodicity and symmetry. Large simulation cells containing of hundreds or even thousands of atoms are needed to model these kind of physical systems. The treatment of those systems still remains a computational challenge even with modern supercomputers. In this paper we describe our work to improve the scalability of Quantum ESPRESSO (Giannozzi et al., 2009 [3]) for treating very large cells and huge numbers of electrons. To this end we have introduced an extra level of parallelism, over electronic bands, in three kernels for solving computationally expensive problems: the Sternheimer equation solver (Nuclear Magnetic Resonance, package QE-GIPAW), the Fock operator builder (electronic ground-state, package PWscf) and most of the Car-Parrinello routines (Car-Parrinello dynamics, package CP). Final benchmarks show our success in computing the Nuclear Magnetic Response (NMR) chemical shift of a large biological assembly, the electronic structure of defected amorphous silica with hybrid exchange-correlation functionals and the equilibrium atomic structure of height Porphyrins anchored to a Carbon Nanotube, on many thousands of CPU cores.
Transuranic Computational Chemistry.

PubMed

Kaltsoyannis, Nikolas

2018-02-26

Recent developments in the chemistry of the transuranic elements are surveyed, with particular emphasis on computational contributions. Examples are drawn from molecular coordination and organometallic chemistry, and from the study of extended solid systems. The role of the metal valence orbitals in covalent bonding is a particular focus, especially the consequences of the stabilization of the 5f orbitals as the actinide series is traversed. The fledgling chemistry of transuranic elements in the +II oxidation state is highlighted. Throughout, the symbiotic interplay of experimental and computational studies is emphasized; the extraordinary challenges of experimental transuranic chemistry afford computational chemistry a particularly valuable role at the frontier of the periodic table. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zamora, Richard James; Voter, Arthur F.; Perez, Danny

Due to its unrivaled ability to predict the dynamical evolution of interacting atoms, molecular dynamics (MD) is a widely used computational method in theoretical chemistry, physics, biology, and engineering. Despite its success, MD is only capable of modeling time scales within several orders of magnitude of thermal vibrations, leaving out many important phenomena that occur at slower rates. The Temperature Accelerated Dynamics (TAD) method overcomes this limitation by thermally accelerating the state-to-state evolution captured by MD. Due to the algorithmically complex nature of the serial TAD procedure, implementations have yet to improve performance by parallelizing the concurrent exploration of multiplemore » states. Here we utilize a discrete event-based application simulator to introduce and explore a new Speculatively Parallel TAD (SpecTAD) method. We investigate the SpecTAD algorithm, without a full-scale implementation, by constructing an application simulator proxy (SpecTADSim). Finally, following this method, we discover that a nontrivial relationship exists between the optimal SpecTAD parameter set and the number of CPU cores available at run-time. Furthermore, we find that a majority of the available SpecTAD boost can be achieved within an existing TAD application using relatively simple algorithm modifications.« less
Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics

DOE PAGES

Zamora, Richard James; Voter, Arthur F.; Perez, Danny; ...

2016-12-01

Due to its unrivaled ability to predict the dynamical evolution of interacting atoms, molecular dynamics (MD) is a widely used computational method in theoretical chemistry, physics, biology, and engineering. Despite its success, MD is only capable of modeling time scales within several orders of magnitude of thermal vibrations, leaving out many important phenomena that occur at slower rates. The Temperature Accelerated Dynamics (TAD) method overcomes this limitation by thermally accelerating the state-to-state evolution captured by MD. Due to the algorithmically complex nature of the serial TAD procedure, implementations have yet to improve performance by parallelizing the concurrent exploration of multiplemore » states. Here we utilize a discrete event-based application simulator to introduce and explore a new Speculatively Parallel TAD (SpecTAD) method. We investigate the SpecTAD algorithm, without a full-scale implementation, by constructing an application simulator proxy (SpecTADSim). Finally, following this method, we discover that a nontrivial relationship exists between the optimal SpecTAD parameter set and the number of CPU cores available at run-time. Furthermore, we find that a majority of the available SpecTAD boost can be achieved within an existing TAD application using relatively simple algorithm modifications.« less
Exploiting Locality in Quantum Computation for Quantum Chemistry.

PubMed

McClean, Jarrod R; Babbush, Ryan; Love, Peter J; Aspuru-Guzik, Alán

2014-12-18

Accurate prediction of chemical and material properties from first-principles quantum chemistry is a challenging task on traditional computers. Recent developments in quantum computation offer a route toward highly accurate solutions with polynomial cost; however, this solution still carries a large overhead. In this Perspective, we aim to bring together known results about the locality of physical interactions from quantum chemistry with ideas from quantum computation. We show that the utilization of spatial locality combined with the Bravyi-Kitaev transformation offers an improvement in the scaling of known quantum algorithms for quantum chemistry and provides numerical examples to help illustrate this point. We combine these developments to improve the outlook for the future of quantum chemistry on quantum computers.
Matching pursuit parallel decomposition of seismic data

NASA Astrophysics Data System (ADS)

Li, Chuanhui; Zhang, Fanchang

2017-07-01

In order to improve the computation speed of matching pursuit decomposition of seismic data, a matching pursuit parallel algorithm is designed in this paper. We pick a fixed number of envelope peaks from the current signal in every iteration according to the number of compute nodes and assign them to the compute nodes on average to search the optimal Morlet wavelets in parallel. With the help of parallel computer systems and Message Passing Interface, the parallel algorithm gives full play to the advantages of parallel computing to significantly improve the computation speed of the matching pursuit decomposition and also has good expandability. Besides, searching only one optimal Morlet wavelet by every compute node in every iteration is the most efficient implementation.
Computer hardware fault administration

DOEpatents

Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

2010-09-14

Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.
In-Service Chemistry Teachers Training: The Impact of Introducing Computer Technology on Teachers' Attitudes.

ERIC Educational Resources Information Center

Dori, Y. J.; Barnea, N.

A computer-assisted instruction (CAI) module on polymers was used to introduce chemistry teachers (n=64) to the variety of possibilities and benefits of using courseware in the current chemistry curriculum in Israel. From an analysis of a pre-and post-attitude questionnaire regarding the use of computers in chemistry teaching, it was concluded…

Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-02-11

Data communications in a parallel active messaging interface ('PAMI') or a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution of a compute node, including specification of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications instruction, the instruction characterized by instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance witht the instruction type, the transfer data from the origin endpoin to the target endpoint.
High-performance computing — an overview

NASA Astrophysics Data System (ADS)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
The semantics of Chemical Markup Language (CML) for computational chemistry : CompChem.

PubMed

Phadungsukanan, Weerapong; Kraft, Markus; Townsend, Joe A; Murray-Rust, Peter

2012-08-07

: This paper introduces a subdomain chemistry format for storing computational chemistry data called CompChem. It has been developed based on the design, concepts and methodologies of Chemical Markup Language (CML) by adding computational chemistry semantics on top of the CML Schema. The format allows a wide range of ab initio quantum chemistry calculations of individual molecules to be stored. These calculations include, for example, single point energy calculation, molecular geometry optimization, and vibrational frequency analysis. The paper also describes the supporting infrastructure, such as processing software, dictionaries, validation tools and database repositories. In addition, some of the challenges and difficulties in developing common computational chemistry dictionaries are discussed. The uses of CompChem are illustrated by two practical applications.
The semantics of Chemical Markup Language (CML) for computational chemistry : CompChem

PubMed Central

2012-01-01

This paper introduces a subdomain chemistry format for storing computational chemistry data called CompChem. It has been developed based on the design, concepts and methodologies of Chemical Markup Language (CML) by adding computational chemistry semantics on top of the CML Schema. The format allows a wide range of ab initio quantum chemistry calculations of individual molecules to be stored. These calculations include, for example, single point energy calculation, molecular geometry optimization, and vibrational frequency analysis. The paper also describes the supporting infrastructure, such as processing software, dictionaries, validation tools and database repositories. In addition, some of the challenges and difficulties in developing common computational chemistry dictionaries are discussed. The uses of CompChem are illustrated by two practical applications. PMID:22870956
Rapid determination of enantiomeric excess: a focus on optical approaches.

PubMed

Leung, Diana; Kang, Sung Ok; Anslyn, Eric V

2012-01-07

High-throughput screening (HTS) methods are becoming increasingly essential in discovering chiral catalysts or auxiliaries for asymmetric transformations due to the advent of parallel synthesis and combinatorial chemistry. Both parallel synthesis and combinatorial chemistry can lead to the exploration of a range of structural candidates and reaction conditions as a means to obtain the highest enantiomeric excess (ee) of a desired transformation. One current bottleneck in these approaches to asymmetric reactions is the determination of ee, which has led researchers to explore a wide range of HTS techniques. To be truly high-throughput, it has been proposed that a technique that can analyse a thousand or more samples per day is needed. Many of the current approaches to this goal are based on optical methods because they allow for a rapid determination of ee due to quick data collection and their parallel analysis capabilities. In this critical review these techniques are reviewed with a discussion of their respective advantages and drawbacks, and with a contrast to chromatographic methods (180 references). This journal is © The Royal Society of Chemistry 2012
Construction of a robust, large-scale, collaborative database for raw data in computational chemistry: the Collaborative Chemistry Database Tool (CCDBT).

PubMed

Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A

2012-04-01

A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid, including parsers written for different levels of data types and sets created by the parser loader after loading parser engines and configurations. Copyright Â© 2011 Elsevier Inc. All rights reserved.
From data to analysis: linking NWChem and Avogadro with the syntax and semantics of Chemical Markup Language

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Jong, Wibe A.; Walker, Andrew M.; Hanwell, Marcus D.

Background Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations. Integrating data requires semantically rich information. In this paper the generation of semantically rich data from the NWChem computational chemistry software is discussed within the Chemical Markup Language (CML) framework. Results The NWChem computational chemistry software has been modified and coupled to the FoX library to write CML compliant XML data files. The FoX library was expanded to represent the lexical input files used by the computational chemistry software. Conclusions The production of CML compliant XMLmore » files for the computational chemistry software NWChem can be relatively easily accomplished using the FoX library. A unified computational chemistry or CompChem convention and dictionary needs to be developed through a community-based effort. The long-term goal is to enable a researcher to do Google-style chemistry and physics searches.« less
A four-dimensional variational chemistry data assimilation scheme for Eulerian chemistry transport modeling

NASA Astrophysics Data System (ADS)

Eibern, Hendrik; Schmidt, Hauke

1999-08-01

The inverse problem of data assimilation of tropospheric trace gas observations into an Eulerian chemistry transport model has been solved by the four-dimensional variational technique including chemical reactions, transport, and diffusion. The University of Cologne European Air Pollution Dispersion Chemistry Transport Model 2 with the Regional Acid Deposition Model 2 gas phase mechanism is taken as the basis for developing a full four-dimensional variational data assimilation package, on the basis of the adjoint model version, which includes the adjoint operators of horizontal and vertical advection, implicit vertical diffusion, and the adjoint gas phase mechanism. To assess the potential and limitations of the technique without degrading the impact of nonperfect meteorological analyses and statistically not established error covariance estimates, artificial meteorological data and observations are used. The results are presented on the basis of a suite of experiments, where reduced records of artificial "observations" are provided to the assimilation procedure, while other "data" is retained for performance control of the analysis. The paper demonstrates that the four-dimensional variational technique is applicable for a comprehensive chemistry transport model in terms of computational and storage requirements on advanced parallel platforms. It is further shown that observed species can generally be analyzed, even if the "measurements" have unbiased random errors. More challenging experiments are presented, aiming to tax the skill of the method (1) by restricting available observations mostly to surface ozone observations for a limited assimilation interval of 6 hours and (2) by starting with poorly chosen first guess values. In this first such application to a three-dimensional chemistry transport model, success was also achieved in analyzing not only observed but also chemically closely related unobserved constituents.
Computational Chemistry Comparison and Benchmark Database

National Institute of Standards and Technology Data Gateway

SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access) The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Web-Based Job Submission Interface for the GAMESS Computational Chemistry Program

ERIC Educational Resources Information Center

Perri, M. J.; Weber, S. H.

2014-01-01

A Web site is described that facilitates use of the free computational chemistry software: General Atomic and Molecular Electronic Structure System (GAMESS). Its goal is to provide an opportunity for undergraduate students to perform computational chemistry experiments without the need to purchase expensive software.
The 2nd Symposium on the Frontiers of Massively Parallel Computations

NASA Technical Reports Server (NTRS)

Mills, Ronnie (Editor)

1988-01-01

Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2013-11-12

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call cite statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

NASA Technical Reports Server (NTRS)

Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

2017-01-01

Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
Performance Evaluation in Network-Based Parallel Computing

NASA Technical Reports Server (NTRS)

Dezhgosha, Kamyar

1996-01-01

Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine) which is a software system for linking clusters of machines. Second, a set of three basic applications were selected. The applications consist of a parallel search, a parallel sort, a parallel matrix multiplication. These application programs were implemented in C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism which requires less frequent communication between processes will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
THE INTEGRATED USE OF COMPUTATIONAL CHEMISTRY, SCANNING PROBE MICROSCOPY, AND VIRTUAL REALITY TO PREDICT THE CHEMICAL REACTIVITY OF ENVIRONMENTAL SURFACES

EPA Science Inventory

In the last decade three new techniques scanning probe microscopy (SPM), virtual reality (YR) and computational chemistry ave emerged with the combined capability of a priori predicting the chemically reactivity of environmental surfaces. Computational chemistry provides the cap...
Comptox Chemistry Dashboard: Web-Based Data Integration Hub for Environmental Chemistry and Toxicology Data (ACS Fall meeting 4 of 12)

EPA Science Inventory

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrate advances in biology, chemistry, exposure and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and da...
Parallel Computing Using Web Servers and "Servlets".

ERIC Educational Resources Information Center

Lo, Alfred; Bloor, Chris; Choi, Y. K.

2000-01-01

Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
A class of parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
Parallel computing of a climate model on the dawn 1000 by domain decomposition method

NASA Astrophysics Data System (ADS)

Bi, Xunqiang

1997-12-01

In this paper the parallel computing of a grid-point nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massive parallel computer made by National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. The potential ways to increase the speed-up ratio and exploit more resources of future massively parallel supercomputation are also discussed.

Parallel Computing Strategies for Irregular Algorithms

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

NASA Astrophysics Data System (ADS)

Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

2013-08-01

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time ti (trajectory positions and velocities xi = (ri, vi)) to time ti + 1 (xi + 1) by xi + 1 = fi(xi), the dynamics problem spanning an interval from t0…tM can be transformed into a root finding problem, F(X) = [xi - f(x(i - 1)]i = 1, M = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution time/parallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
Parallel solution of sparse one-dimensional dynamic programming problems

NASA Technical Reports Server (NTRS)

Nicol, David M.

1989-01-01

Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.
Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

PubMed

Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

2018-04-20

A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
A Parallel Genetic Algorithm for Automated Electronic Circuit Design

NASA Technical Reports Server (NTRS)

Long, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris

2000-01-01

Parallelized versions of genetic algorithms (GAs) are popular primarily for three reasons: the GA is an inherently parallel algorithm, typical GA applications are very compute intensive, and powerful computing platforms, especially Beowulf-style computing clusters, are becoming more affordable and easier to implement. In addition, the low communication bandwidth required allows the use of inexpensive networking hardware such as standard office ethernet. In this paper we describe a parallel GA and its use in automated high-level circuit design. Genetic algorithms are a type of trial-and-error search technique that are guided by principles of Darwinian evolution. Just as the genetic material of two living organisms can intermix to produce offspring that are better adapted to their environment, GAs expose genetic material, frequently strings of 1s and Os, to the forces of artificial evolution: selection, mutation, recombination, etc. GAs start with a pool of randomly-generated candidate solutions which are then tested and scored with respect to their utility. Solutions are then bred by probabilistically selecting high quality parents and recombining their genetic representations to produce offspring solutions. Offspring are typically subjected to a small amount of random mutation. After a pool of offspring is produced, this process iterates until a satisfactory solution is found or an iteration limit is reached. Genetic algorithms have been applied to a wide variety of problems in many fields, including chemistry, biology, and many engineering disciplines. There are many styles of parallelism used in implementing parallel GAs. One such method is called the master-slave or processor farm approach. In this technique, slave nodes are used solely to compute fitness evaluations (the most time consuming part). The master processor collects fitness scores from the nodes and performs the genetic operators (selection, reproduction, variation, etc.). Because of dependency issues in the GA, it is possible to have idle processors. However, as long as the load at each processing node is similar, the processors are kept busy nearly all of the time. In applying GAs to circuit design, a suitable genetic representation 'is that of a circuit-construction program. We discuss one such circuit-construction programming language and show how evolution can generate useful analog circuit designs. This language has the desirable property that virtually all sets of combinations of primitives result in valid circuit graphs. Our system allows circuit size (number of devices), circuit topology, and device values to be evolved. Using a parallel genetic algorithm and circuit simulation software, we present experimental results as applied to three analog filter and two amplifier design tasks. For example, a figure shows an 85 dB amplifier design evolved by our system, and another figure shows the performance of that circuit (gain and frequency response). In all tasks, our system is able to generate circuits that achieve the target specifications.
Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

NASA Astrophysics Data System (ADS)

Houba, Tomas

Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.
MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

NASA Astrophysics Data System (ADS)

Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

2018-02-01

We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution, however, the efficiency in terms of the computing resource usage decreases with increasing the number of processors used in the parallel computing.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2013-10-29

Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.
A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU

NASA Astrophysics Data System (ADS)

Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha

2018-03-01

Since Graphic Processing Unit (GPU) has a strong ability of floating-point computation and memory bandwidth for data parallelism, it has been widely used in the areas of common computing such as molecular dynamics (MD), computational fluid dynamics (CFD) and so on. The emergence of compute unified device architecture (CUDA), which reduces the complexity of compiling program, brings the great opportunities to CFD. There are three different modes for parallel solution of NS equations: parallel solver based on CPU, parallel solver based on GPU and heterogeneous parallel solver based on collaborating CPU and GPU. As we can see, GPUs are relatively rich in compute capacity but poor in memory capacity and the CPUs do the opposite. We need to make full use of the GPUs and CPUs, so a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver’s computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experiment results, which demonstrate that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow, it decreases for turbulent flow, but it still can reach more than 20. What’s more, the speedup increases as the grid size becomes larger.
Elucidating Hyperconjugation from Electronegativity to Predict Drug Conformational Energy in a High Throughput Manner.

PubMed

Liu, Zhaomin; Pottel, Joshua; Shahamat, Moeed; Tomberg, Anna; Labute, Paul; Moitessier, Nicolas

2016-04-25

Computational chemists use structure-based drug design and molecular dynamics of drug/protein complexes which require an accurate description of the conformational space of drugs. Organic chemists use qualitative chemical principles such as the effect of electronegativity on hyperconjugation, the impact of steric clashes on stereochemical outcome of reactions, and the consequence of resonance on the shape of molecules to rationalize experimental observations. While computational chemists speak about electron densities and molecular orbitals, organic chemists speak about partial charges and localized molecular orbitals. Attempts to reconcile these two parallel approaches such as programs for natural bond orbitals and intrinsic atomic orbitals computing Lewis structures-like orbitals and reaction mechanism have appeared. In the past, we have shown that encoding and quantifying chemistry knowledge and qualitative principles can lead to predictive methods. In the same vein, we thought to understand the conformational behaviors of molecules and to encode this knowledge back into a molecular mechanics tool computing conformational potential energy and to develop an alternative to atom types and training of force fields on large sets of molecules. Herein, we describe a conceptually new approach to model torsion energies based on fundamental chemistry principles. To demonstrate our approach, torsional energy parameters were derived on-the-fly from atomic properties. When the torsional energy terms implemented in GAFF, Parm@Frosst, and MMFF94 were substituted by our method, the accuracy of these force fields to reproduce MP2-derived torsional energy profiles and their transferability to a variety of functional groups and drug fragments were overall improved. In addition, our method did not rely on atom types and consequently did not suffer from poor automated atom type assignments.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gooding, Thomas M.

Distributing an executable job load file to compute nodes in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: determining, by a compute node in the parallel computer, whether the compute node is participating in a job; determining, by the compute node in the parallel computer, whether a descendant compute node is participating in the job; responsive to determining that the compute node is participating in the job or that the descendant compute node is participating in the job, communicating, by the compute node to a parent compute node, an identification of a data communications linkmore » over which the compute node receives data from the parent compute node; constructing a class route for the job, wherein the class route identifies all compute nodes participating in the job; and broadcasting the executable load file for the job along the class route for the job.« less
SIAM Conference on Parallel Processing for Scientific Computing, 4th, Chicago, IL, Dec. 11-13, 1989, Proceedings

NASA Technical Reports Server (NTRS)

Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)

1990-01-01

Attention is given to such topics as an evaluation of block algorithm variants in LAPACK and presents a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized Eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes the topics of asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercube vs. 2-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.
The ChemViz Project: Using a Supercomputer To Illustrate Abstract Concepts in Chemistry.

ERIC Educational Resources Information Center

Beckwith, E. Kenneth; Nelson, Christopher

1998-01-01

Describes the Chemistry Visualization (ChemViz) Project, a Web venture maintained by the University of Illinois National Center for Supercomputing Applications (NCSA) that enables high school students to use computational chemistry as a technique for understanding abstract concepts. Discusses the evolution of computational chemistry and provides a…
Effects of Combined Hands-on Laboratory and Computer Modeling on Student Learning of Gas Laws: A Quasi-Experimental Study

ERIC Educational Resources Information Center

Liu, Xiufeng

2006-01-01

Based on current theories of chemistry learning, this study intends to test a hypothesis that computer modeling enhanced hands-on chemistry laboratories are more effective than hands-on laboratories or computer modeling laboratories alone in facilitating high school students' understanding of chemistry concepts. Thirty-three high school chemistry…
Improving Students' Understanding of Molecular Structure through Broad-Based Use of Computer Models in the Undergraduate Organic Chemistry Lecture

ERIC Educational Resources Information Center

Springer, Michael T.

2014-01-01

Several articles suggest how to incorporate computer models into the organic chemistry laboratory, but relatively few papers discuss how to incorporate these models broadly into the organic chemistry lecture. Previous research has suggested that "manipulating" physical or computer models enhances student understanding; this study…
Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation

PubMed Central

Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho

2014-01-01

The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting. PMID:27081299
Handling Big Data in Medical Imaging: Iterative Reconstruction with Large-Scale Automated Parallel Computation.

PubMed

Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho

2014-11-01

The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting.
Computations and interpretations: The growth of quantum chemistry, 1927-1967

NASA Astrophysics Data System (ADS)

Park, Buhm Soon

1999-10-01

This dissertation is a contribution to the historical study of scientific disciplines in the twentieth century. It seeks to examine the development of quantum chemistry during the four decades after its inception in 1927. This development was manifest in theories, tools, scientists, and institutions, all of which constituted the disciplinary identity of quantum chemistry. To characterize its identity, I deal with the origins of key ideas and concepts; the change of computational tools from desk calculators to digital computers; the formation of a network among research groups and individuals; and the institutionalization of annual meetings. The dissertation's thesis is three-fold. First, in the pre- World War II years, there were individual contributions to the development of theories in quantum chemistry, but the founding fathers worked in their disciplinary contexts of physics or chemistry with little interest in building a quantum chemistry community. Second, the introduction of electronic digital computers in the postwar years affected the resurgence of the ab initio approach-the attempt to solve the Schrödinger equation without recourse to empirical data-and also the emergence of a community of quantum chemists. But the use of computers did not give rise to a consensus over the aims, methods, or content of the discipline. Third, quantum chemistry exerted a significant influence upon the transformation of chemical education and research in general, thanks to ``chemical translators,'' who sought to explain the gist of quantum chemistry in a language that chemists could understand. In sum, quantum chemistry has been a discipline characterized by diverse traditions, and the whole of chemistry has been under the influence of computations and interpretations made by quantum chemists.
Global Particulate Matter Source Apportionment

NASA Astrophysics Data System (ADS)

Lamancusa, C.; Wagstrom, K.

2017-12-01

As our global society develops and grows it is necessary to better understand the impacts and nuances of atmospheric chemistry, in particular those associated with atmospheric particulate matter. We have developed a source apportionment scheme for the GEOS-Chem global atmospheric chemical transport model. While these approaches have existed for several years in regional chemical transport models, the Global Particulate Matter Source Apportionment Technology (GPSAT) represents the first incorporation into a global chemical transport model. GPSAT runs in parallel to a standard GEOS-Chem run. GPSAT uses the fact that all molecules of a given species have the same probability of undergoing any given process as a core principle. This allows GPSAT to track many different species using only the flux information provided by GEOS-Chem's many processes. GPSAT accounts for the change in source specific concentrations as a result of aqueous and gas-phase chemistry, horizontal and vertical transport, condensation and evaporation on particulate matter, emissions, and wet and dry deposition. By using fluxes, GPSAT minimizes computational cost by circumventing the computationally costly chemistry and transport solvers. GPSAT will allow researchers to address many pertinent research questions about global particulate matter including the global impact of emissions from different source regions and the climate impacts from different source types and regions. For this first application of GPSAT, we investigate the contribution of the twenty largest urban areas worldwide to global particulate matter concentrations. The species investigated include: ammonium, nitrates, sulfates, and the secondary organic aerosols formed by the oxidation of benzene, isoprene, and terpenes. While GPSAT is not yet publically available, we will incorporate it into a future standard release of GEOS-Chem so that all GEOS-Chem users will have access to this new tool.
Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics.

PubMed

Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

2012-09-25

Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.

Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics

PubMed Central

2012-01-01

Background Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363
Parallel approach in RDF query processing

NASA Astrophysics Data System (ADS)

Vajgl, Marek; Parenica, Jan

2017-07-01

Parallel approach is nowadays a very cheap solution to increase computational power due to possibility of usage of multithreaded computational units. This hardware became typical part of nowadays personal computers or notebooks and is widely spread. This contribution deals with experiments how evaluation of computational complex algorithm of the inference over RDF data can be parallelized over graphical cards to decrease computational time.
NCC: A Multidisciplinary Design/Analysis Tool for Combustion Systems

NASA Technical Reports Server (NTRS)

Liu, Nan-Suey; Quealy, Angela

1999-01-01

A multi-disciplinary design/analysis tool for combustion systems is critical for optimizing the low-emission, high-performance combustor design process. Based on discussions between NASA Lewis Research Center and the jet engine companies, an industry-government team was formed in early 1995 to develop the National Combustion Code (NCC), which is an integrated system of computer codes for the design and analysis of combustion systems. NCC has advanced features that address the need to meet designer's requirements such as "assured accuracy", "fast turnaround", and "acceptable cost". The NCC development team is comprised of Allison Engine Company (Allison), CFD Research Corporation (CFDRC), GE Aircraft Engines (GEAE), NASA Lewis Research Center (LeRC), and Pratt & Whitney (P&W). This development team operates under the guidance of the NCC steering committee. The "unstructured mesh" capability and "parallel computing" are fundamental features of NCC from its inception. The NCC system is composed of a set of "elements" which includes grid generator, main flow solver, turbulence module, turbulence and chemistry interaction module, chemistry module, spray module, radiation heat transfer module, data visualization module, and a post-processor for evaluating engine performance parameters. Each element may have contributions from several team members. Such a multi-source multi-element system needs to be integrated in a way that facilitates inter-module data communication, flexibility in module selection, and ease of integration.
Limits of computational biology.

PubMed

Bray, Dennis

2015-01-01

Are we close to a complete inventory of living processes so that we might expect in the near future to reproduce every essential aspect necessary for life? Or are there mechanisms and processes in cells and organisms that are presently inaccessible to us? Here I argue that a close examination of a particularly well-understood system--that of Escherichia coli chemotaxis--shows we are still a long way from a complete description. There is a level of molecular uncertainty, particularly that responsible for fine-tuning and adaptation to myriad external conditions, which we presently cannot resolve or reproduce on a computer. Moreover, the same uncertainty exists for any process in any organism and is especially pronounced and important in higher animals such as humans. Embryonic development, tissue homeostasis, immune recognition, memory formation, and survival in the real world, all depend on vast numbers of subtle variations in cell chemistry most of which are presently unknown or only poorly characterized. Overcoming these limitations will require us to not only accumulate large quantities of highly detailed data but also develop new computational methods able to recapitulate the massively parallel processing of living cells.
A comparative study of serial and parallel aeroelastic computations of wings

NASA Technical Reports Server (NTRS)

Byun, Chansup; Guruswamy, Guru P.

1994-01-01

A procedure for computing the aeroelasticity of wings on parallel multiple-instruction, multiple-data (MIMD) computers is presented. In this procedure, fluids are modeled using Euler equations, and structures are modeled using modal or finite element equations. The procedure is designed in such a way that each discipline can be developed and maintained independently by using a domain decomposition approach. In the present parallel procedure, each computational domain is scalable. A parallel integration scheme is used to compute aeroelastic responses by solving fluid and structural equations concurrently. The computational efficiency issues of parallel integration of both fluid and structural equations are investigated in detail. This approach, which reduces the total computational time by a factor of almost 2, is demonstrated for a typical aeroelastic wing by using various numbers of processors on the Intel iPSC/860.
Research in parallel computing

NASA Technical Reports Server (NTRS)

Ortega, James M.; Henderson, Charles

1994-01-01

This report summarizes work on parallel computations for NASA Grant NAG-1-1529 for the period 1 Jan. - 30 June 1994. Short summaries on highly parallel preconditioners, target-specific parallel reductions, and simulation of delta-cache protocols are provided.
A Flipped Classroom Redesign in General Chemistry

ERIC Educational Resources Information Center

Reid, Scott A.

2016-01-01

The flipped classroom continues to attract significant attention in higher education. Building upon our recent parallel controlled study of the flipped classroom in a second-term general chemistry course ("J. Chem. Educ.," 2016, 93, 13-23), here we report on a redesign of the flipped course aimed at scaling up total enrollment while…
Parallel computations and control of adaptive structures

NASA Technical Reports Server (NTRS)

Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)

1991-01-01

The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction of the state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer and the impact of the approach presented for applications in other disciplines than aerospace industry is assessed.
Design of a massively parallel computer using bit serial processing elements

NASA Technical Reports Server (NTRS)

Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

1995-01-01

A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

2013-08-21

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f , (e.g. Verlet algorithm) is available to propagate the system from time ti (trajectory positions and velocities xi = (ri; vi)) to time ti+1 (xi+1) by xi+1 = fi(xi), the dynamics problem spanning an interval from t0 : : : tM can be transformed into a root finding problem, F(X) = [xi - f (x(i-1)]i=1;M = 0, for the trajectory variables. The root finding problem is solved using amore » variety of optimization techniques, including quasi-Newton and preconditioned quasi-Newton optimization schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed and the effectiveness of various approaches to solving the root finding problem are tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl+4H2O AIMD simulation at the MP2 level. The maximum speedup obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow TCP/IP networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl+4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. By using these algorithms we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 seconds per time step to 6.9 seconds per time step.« less
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations.

PubMed

Bylaska, Eric J; Weare, Jonathan Q; Weare, John H

2013-08-21

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time ti (trajectory positions and velocities xi = (ri, vi)) to time ti + 1 (xi + 1) by xi + 1 = fi(xi), the dynamics problem spanning an interval from t0[ellipsis (horizontal)]tM can be transformed into a root finding problem, F(X) = [xi - f(x(i - 1)]i = 1, M = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution/timeparallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
Numerical study of supersonic combustors by multi-block grids with mismatched interfaces

NASA Technical Reports Server (NTRS)

Moon, Young J.

1990-01-01

A three dimensional, finite rate chemistry, Navier-Stokes code was extended to a multi-block code with mismatched interface for practical calculations of supersonic combustors. To ensure global conservation, a conservative algorithm was used for the treatment of mismatched interfaces. The extended code was checked against one test case, i.e., a generic supersonic combustor with transverse fuel injection, examining solution accuracy, convergence, and local mass flux error. After testing, the code was used to simulate the chemically reacting flow fields in a scramjet combustor with parallel fuel injectors (unswept and swept ramps). Computational results were compared with experimental shadowgraph and pressure measurements. Fuel-air mixing characteristics of the unswept and swept ramps were compared and investigated.
Performance Evaluation of Parallel Branch and Bound Search with the Intel iPSC (Intel Personal SuperComputer) Hypercube Computer.

DTIC Science & Technology

1986-12-01

17 III. Analysis of Parallel Design ................................................ 18 Parallel Abstract Data ...Types ........................................... 18 Abstract Data Type .................................................. 19 Parallel ADT...22 Data -Structure Design ........................................... 23 Object-Oriented Design
Promoting Intrinsic and Extrinsic Motivation among Chemistry Students Using Computer-Assisted Instruction

ERIC Educational Resources Information Center

Gambari, Isiaka A.; Gbodi, Bimpe E.; Olakanmi, Eyitao U.; Abalaka, Eneojo N.

2016-01-01

The role of computer-assisted instruction in promoting intrinsic and extrinsic motivation among Nigerian secondary school chemistry students was investigated in this study. The study employed two modes of computer-assisted instruction (computer simulation instruction and computer tutorial instructional packages) and two levels of gender (male and…
Hypercluster Parallel Processor

NASA Technical Reports Server (NTRS)

Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela

1992-01-01

Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
Parallel processing architecture for computing inverse differential kinematic equations of the PUMA arm

NASA Technical Reports Server (NTRS)

Hsia, T. C.; Lu, G. Z.; Han, W. H.

1987-01-01

In advanced robot control problems, on-line computation of inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be inplemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
Parallels between New Paradigms in Science and in Reading and Literary Theories.

ERIC Educational Resources Information Center

Weaver, Constance

Drawing upon research from a number of fields, this paper explores parallels between new paradigms in the sciences--particularly physics, chemistry, and biology--and new paradigms in reading and literary theory--particularly a socio- or psycholinguistic, semiotic, transactional view of reading and a transactional view of the literary experience.…
Molecular dynamics simulations through GPU video games technologies

PubMed Central

Loukatou, Styliani; Papageorgiou, Louis; Fakourelis, Paraskevas; Filntisi, Arianna; Polychronidou, Eleftheria; Bassis, Ioannis; Megalooikonomou, Vasileios; Makałowski, Wojciech; Vlachakis, Dimitrios; Kossida, Sophia

2016-01-01

Bioinformatics is the scientific field that focuses on the application of computer technology to the management of biological information. Over the years, bioinformatics applications have been used to store, process and integrate biological and genetic information, using a wide range of methodologies. One of the most de novo techniques used to understand the physical movements of atoms and molecules is molecular dynamics (MD). MD is an in silico method to simulate the physical motions of atoms and molecules under certain conditions. This has become a state strategic technique and now plays a key role in many areas of exact sciences, such as chemistry, biology, physics and medicine. Due to their complexity, MD calculations could require enormous amounts of computer memory and time and therefore their execution has been a big problem. Despite the huge computational cost, molecular dynamics have been implemented using traditional computers with a central memory unit (CPU). A graphics processing unit (GPU) computing technology was first designed with the goal to improve video games, by rapidly creating and displaying images in a frame buffer such as screens. The hybrid GPU-CPU implementation, combined with parallel computing is a novel technology to perform a wide range of calculations. GPUs have been proposed and used to accelerate many scientific computations including MD simulations. Herein, we describe the new methodologies developed initially as video games and how they are now applied in MD simulations. PMID:27525251
A scalable parallel black oil simulator on distributed memory parallel computers

NASA Astrophysics Data System (ADS)

Wang, Kun; Liu, Hui; Chen, Zhangxin

2015-11-01

This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

NASA Technical Reports Server (NTRS)

Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.

Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

NASA Astrophysics Data System (ADS)

Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

2017-07-01

Calculation of the matrix-vector multiplication in the real-world problems often involves large matrix with arbitrary size. Therefore, parallelization is needed to speed up the calculation process that usually takes a long time. Graph partitioning techniques that have been discussed in the previous studies cannot be used to complete the parallelized calculation of matrix-vector multiplication with arbitrary size. This is due to the assumption of graph partitioning techniques that can only solve the square and symmetric matrix. Hypergraph partitioning techniques will overcome the shortcomings of the graph partitioning technique. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented by the GPU (graphics processing unit).
[Series: Medical Applications of the PHITS Code (2): Acceleration by Parallel Computing].

PubMed

Furuta, Takuya; Sato, Tatsuhiko

2015-01-01

Time-consuming Monte Carlo dose calculation becomes feasible owing to the development of computer technology. However, the recent development is due to emergence of the multi-core high performance computers. Therefore, parallel computing becomes a key to achieve good performance of software programs. A Monte Carlo simulation code PHITS contains two parallel computing functions, the distributed-memory parallelization using protocols of message passing interface (MPI) and the shared-memory parallelization using open multi-processing (OpenMP) directives. Users can choose the two functions according to their needs. This paper gives the explanation of the two functions with their advantages and disadvantages. Some test applications are also provided to show their performance using a typical multi-core high performance workstation.
Evolving binary classifiers through parallel computation of multiple fitness cases.

PubMed

Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni

2005-06-01

This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. Such an approach achieves high computation efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized explicitly using parallel computation in the case of cellular programming or implicitly taking advantage of the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
Studying an Eulerian Computer Model on Different High-performance Computer Platforms and Some Applications

NASA Astrophysics Data System (ADS)

Georgiev, K.; Zlatev, Z.

2010-11-01

The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The model computational domain covers Europe and some neighbour parts belong to the Atlantic Ocean, Asia and Africa. If DEM model is to be applied by using fine grids, then its discretization leads to a huge computational problem. This implies that such a model as DEM must be run only on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, some comparison results of running of this model on different kind of vector (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.) will be presented. The main idea in the parallel version of DEM is domain partitioning approach. Discussions according to the effective use of the cache and hierarchical memories of the modern computers as well as the performance, speed-ups and efficiency achieved will be done. The parallel code of DEM, created by using MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kind of vector and parallel computers. Some important applications of the computer model output are presented in short.
A Debugger for Computational Grid Applications

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation gives an overview of a debugger for computational grid applications. Details are given on NAS parallel tools groups (including parallelization support tools, evaluation of various parallelization strategies, and distributed and aggregated computing), debugger dependencies, scalability, initial implementation, the process grid, and information on Globus.
Application of a Scalable, Parallel, Unstructured-Grid-Based Navier-Stokes Solver

NASA Technical Reports Server (NTRS)

Parikh, Paresh

2001-01-01

A parallel version of an unstructured-grid based Navier-Stokes solver, USM3Dns, previously developed for efficient operation on a variety of parallel computers, has been enhanced to incorporate upgrades made to the serial version. The resultant parallel code has been extensively tested on a variety of problems of aerospace interest and on two sets of parallel computers to understand and document its characteristics. An innovative grid renumbering construct and use of non-blocking communication are shown to produce superlinear computing performance. Preliminary results from parallelization of a recently introduced "porous surface" boundary condition are also presented.
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing

NASA Astrophysics Data System (ADS)

Decyk, V. K.; Dauger, D. E.

We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-incell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the main stream of computing.
Parallel simulation of tsunami inundation on a large-scale supercomputer

NASA Astrophysics Data System (ADS)

Oishi, Y.; Imamura, F.; Sugawara, D.

2013-12-01

An accurate prediction of tsunami inundation is important for disaster mitigation purposes. One approach is to approximate the tsunami wave source through an instant inversion analysis using real-time observation data (e.g., Tsushima et al., 2009) and then use the resulting wave source data in an instant tsunami inundation simulation. However, a bottleneck of this approach is the large computational cost of the non-linear inundation simulation and the computational power of recent massively parallel supercomputers is helpful to enable faster than real-time execution of a tsunami inundation simulation. Parallel computers have become approximately 1000 times faster in 10 years (www.top500.org), and so it is expected that very fast parallel computers will be more and more prevalent in the near future. Therefore, it is important to investigate how to efficiently conduct a tsunami simulation on parallel computers. In this study, we are targeting very fast tsunami inundation simulations on the K computer, currently the fastest Japanese supercomputer, which has a theoretical peak performance of 11.2 PFLOPS. One computing node of the K computer consists of 1 CPU with 8 cores that share memory, and the nodes are connected through a high-performance torus-mesh network. The K computer is designed for distributed-memory parallel computation, so we have developed a parallel tsunami model. Our model is based on TUNAMI-N2 model of Tohoku University, which is based on a leap-frog finite difference method. A grid nesting scheme is employed to apply high-resolution grids only at the coastal regions. To balance the computation load of each CPU in the parallelization, CPUs are first allocated to each nested layer in proportion to the number of grid points of the nested layer. Using CPUs allocated to each layer, 1-D domain decomposition is performed on each layer. In the parallel computation, three types of communication are necessary: (1) communication to adjacent neighbours for the finite difference calculation, (2) communication between adjacent layers for the calculations to connect each layer, and (3) global communication to obtain the time step which satisfies the CFL condition in the whole domain. A preliminary test on the K computer showed the parallel efficiency on 1024 cores was 57% relative to 64 cores. We estimate that the parallel efficiency will be considerably improved by applying a 2-D domain decomposition instead of the present 1-D domain decomposition in future work. The present parallel tsunami model was applied to the 2011 Great Tohoku tsunami. The coarsest resolution layer covers a 758 km × 1155 km region with a 405 m grid spacing. A nesting of five layers was used with the resolution ratio of 1/3 between nested layers. The finest resolution region has 5 m resolution and covers most of the coastal region of Sendai city. To complete 2 hours of simulation time, the serial (non-parallel) computation took approximately 4 days on a workstation. To complete the same simulation on 1024 cores of the K computer, it took 45 minutes which is more than two times faster than real-time. This presentation discusses the updated parallel computational performance and the efficient use of the K computer when considering the characteristics of the tsunami inundation simulation model in relation to the characteristics and capabilities of the K computer.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

PubMed

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment

PubMed Central

2014-01-01

Background To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks. PMID:24428926
Parallel CE/SE Computations via Domain Decomposition

NASA Technical Reports Server (NTRS)

Himansu, Ananda; Jorgenson, Philip C. E.; Wang, Xiao-Yen; Chang, Sin-Chung

2000-01-01

This paper describes the parallelization strategy and achieved parallel efficiency of an explicit time-marching algorithm for solving conservation laws. The Space-Time Conservation Element and Solution Element (CE/SE) algorithm for solving the 2D and 3D Euler equations is parallelized with the aid of domain decomposition. The parallel efficiency of the resultant algorithm on a Silicon Graphics Origin 2000 parallel computer is checked.
Aerodynamic simulation on massively parallel systems

NASA Technical Reports Server (NTRS)

Haeuser, Jochem; Simon, Horst D.

1992-01-01

This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design to be used for advanced transportation in commercial applications as well as in space flight. The discussion clearly shows that massively parallel systems are the only alternative which is both cost effective and on the other hand can provide the necessary TeraFlops, needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary fitted grids and the fully unstructured tetrahedra grids. A fully structured grid best represents the flow physics, while the unstructured grid gives best geometrical flexibility. The multiblock grid employed is structured within a block, but completely unstructured on the block level. While a completely unstructured grid is not straightforward to parallelize, the above mentioned multiblock grid is inherently parallel, in particular for multiple instruction multiple datastream (MIMD) machines. In this paper guidelines are provided for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system. Some preliminary results for an 8K CM2 machine will also be mentioned. The code run is the two dimensional grid generation module of Grid, which is a general two dimensional and three dimensional grid generation code for complex geometries. A system of nonlinear Poisson equations is solved. This code is also a good testcase for complex fluid dynamics codes, since the same datastructures are used. All systems provided good speedups, but message passing MIMD systems seem to be best suited for large miltiblock applications.
Parallel Algorithms for Least Squares and Related Computations.

DTIC Science & Technology

1991-03-22

for dense computations in linear algebra . The work has recently been published in a general reference book on parallel algorithms by SIAM. AFO SR...written his Ph.D. dissertation with the principal investigator. (See publication 6.) • Parallel Algorithms for Dense Linear Algebra Computations. Our...and describe and to put into perspective a selection of the more important parallel algorithms for numerical linear algebra . We give a major new
Reliability models for dataflow computer systems

NASA Technical Reports Server (NTRS)

Kavi, K. M.; Buckles, B. P.

1985-01-01

The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.
Parallel computing method for simulating hydrological processesof large rivers under climate change

NASA Astrophysics Data System (ADS)

Wang, H.; Chen, Y.

2016-12-01

Climate change is one of the proverbial global environmental problems in the world.Climate change has altered the watershed hydrological processes in time and space distribution, especially in worldlarge rivers.Watershed hydrological process simulation based on physically based distributed hydrological model can could have better results compared with the lumped models.However, watershed hydrological process simulation includes large amount of calculations, especially in large rivers, thus needing huge computing resources that may not be steadily available for the researchers or at high expense, this seriously restricted the research and application. To solve this problem, the current parallel method are mostly parallel computing in space and time dimensions.They calculate the natural features orderly thatbased on distributed hydrological model by grid (unit, a basin) from upstream to downstream.This articleproposes ahigh-performancecomputing method of hydrological process simulation with high speedratio and parallel efficiency.It combinedthe runoff characteristics of time and space of distributed hydrological model withthe methods adopting distributed data storage, memory database, distributed computing, parallel computing based on computing power unit.The method has strong adaptability and extensibility,which means it canmake full use of the computing and storage resources under the condition of limited computing resources, and the computing efficiency can be improved linearly with the increase of computing resources .This method can satisfy the parallel computing requirements ofhydrological process simulation in small, medium and large rivers.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bylaska, Eric J., E-mail: Eric.Bylaska@pnnl.gov; Weare, Jonathan Q., E-mail: weare@uchicago.edu; Weare, John H., E-mail: jweare@ucsd.edu

2013-08-21

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time t{sub i} (trajectory positions and velocities x{sub i} = (r{sub i}, v{sub i})) to time t{sub i+1} (x{sub i+1}) by x{sub i+1} = f{sub i}(x{sub i}), the dynamics problem spanning an interval from t{sub 0}…t{sub M} can be transformed into a root finding problem, F(X) = [x{sub i} − f(x{sub (i−1})]{sub i} {sub =1,M} = 0, for themore » trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H{sub 2}O AIMD simulation at the MP2 level. The maximum speedup ((serial execution time)/(parallel execution time) ) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H{sub 2}O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.« less
Integrating Computational Chemistry into the Physical Chemistry Curriculum

ERIC Educational Resources Information Center

Johnson, Lewis E.; Engel, Thomas

2011-01-01

Relatively few undergraduate physical chemistry programs integrate molecular modeling into their quantum mechanics curriculum owing to concerns about limited access to computational facilities, the cost of software, and concerns about increasing the course material. However, modeling exercises can be integrated into an undergraduate course at a…
Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

2014-02-11

Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective opeartion through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-08-12

Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Why not make a PC cluster of your own? 5. AppleSeed: A Parallel Macintosh Cluster for Scientific Computing

NASA Astrophysics Data System (ADS)

Decyk, Viktor K.; Dauger, Dean E.

We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.

What Chemists (or Chemistry Students) Need to Know about Computing.

ERIC Educational Resources Information Center

Swift, Mary L.; Zielinski, Theresa Julia

1995-01-01

Presents key points of an on-line conference discussion and integrates them with information from the literature. Key points included: computer as a tool for learning, study, research, and communication; hardware, software, computing concepts, and other teaching concerns; and the appropriate place for chemistry computer-usage instruction. (45…
An Educational Approach to Computationally Modeling Dynamical Systems

ERIC Educational Resources Information Center

Chodroff, Leah; O'Neal, Tim M.; Long, David A.; Hemkin, Sheryl

2009-01-01

Chemists have used computational science methodologies for a number of decades and their utility continues to be unabated. For this reason we developed an advanced lab in computational chemistry in which students gain understanding of general strengths and weaknesses of computation-based chemistry by working through a specific research problem.…
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in the porous media on complex subsurface problem is a computationally intensive application. To meet the increasingly computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT that is a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed a high performance computing THC-MP based on massive parallel computer, which extends greatly on the computational capability for the original code. The domain decomposition method was applied to the coupled numerical computing procedure in the THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes and the core solving module using the hybrid parallel iterative and direct solver. Numerical accuracy of the THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from the parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field scale carbon sequestration applications on the multicore cluster. The results demonstrate successfully the enhanced performance using the THC-MP on parallel computing facilities.
A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)

NASA Technical Reports Server (NTRS)

Straeter, T. A.; Markos, A. T.

1975-01-01

A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.
Molecular Modeling and Computational Chemistry at Humboldt State University.

ERIC Educational Resources Information Center

Paselk, Richard A.; Zoellner, Robert W.

2002-01-01

Describes a molecular modeling and computational chemistry (MM&CC) facility for undergraduate instruction and research at Humboldt State University. This facility complex allows the introduction of MM&CC throughout the chemistry curriculum with tailored experiments in general, organic, and inorganic courses as well as a new molecular modeling…
Integration of Computational Chemistry into the Undergraduate Organic Chemistry Laboratory Curriculum

ERIC Educational Resources Information Center

Esselman, Brian J.; Hill, Nicholas J.

2016-01-01

Advances in software and hardware have promoted the use of computational chemistry in all branches of chemical research to probe important chemical concepts and to support experimentation. Consequently, it has become imperative that students in the modern undergraduate curriculum become adept at performing simple calculations using computational…
Conformational Analysis of Drug Molecules: A Practical Exercise in the Medicinal Chemistry Course

ERIC Educational Resources Information Center

Yuriev, Elizabeth; Chalmers, David; Capuano, Ben

2009-01-01

Medicinal chemistry is a specialized, scientific discipline. Computational chemistry and structure-based drug design constitute important themes in the education of medicinal chemists. This problem-based task is associated with structure-based drug design lectures. It requires students to use computational techniques to investigate conformational…
Methods of parallel computation applied on granular simulations

NASA Astrophysics Data System (ADS)

Martins, Gustavo H. B.; Atman, Allbens P. F.

2017-06-01

Every year, parallel computing has becoming cheaper and more accessible. As consequence, applications were spreading over all research areas. Granular materials is a promising area for parallel computing. To prove this statement we study the impact of parallel computing in simulations of the BNE (Brazil Nut Effect). This property is due the remarkable arising of an intruder confined to a granular media when vertically shaken against gravity. By means of DEM (Discrete Element Methods) simulations, we study the code performance testing different methods to improve clock time. A comparison between serial and parallel algorithms, using OpenMP® is also shown. The best improvement was obtained by optimizing the function that find contacts using Verlet's cells.
Parallel computation using boundary elements in solid mechanics

NASA Technical Reports Server (NTRS)

Chien, L. S.; Sun, C. T.

1990-01-01

The inherent parallelism of the boundary element method is shown. The boundary element is formulated by assuming the linear variation of displacements and tractions within a line element. Moreover, MACSYMA symbolic program is employed to obtain the analytical results for influence coefficients. Three computational components are parallelized in this method to show the speedup and efficiency in computation. The global coefficient matrix is first formed concurrently. Then, the parallel Gaussian elimination solution scheme is applied to solve the resulting system of equations. Finally, and more importantly, the domain solutions of a given boundary value problem are calculated simultaneously. The linear speedups and high efficiencies are shown for solving a demonstrated problem on Sequent Symmetry S81 parallel computing system.
Parallel Algorithms for the Exascale Era

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robey, Robert W.

New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this workmore » has been done by undergraduates and published in leading scientific journals.« less
Research in computer science

NASA Technical Reports Server (NTRS)

Ortega, J. M.

1986-01-01

Various graduate research activities in the field of computer science are reported. Among the topics discussed are: (1) failure probabilities in multi-version software; (2) Gaussian Elimination on parallel computers; (3) three dimensional Poisson solvers on parallel/vector computers; (4) automated task decomposition for multiple robot arms; (5) multi-color incomplete cholesky conjugate gradient methods on the Cyber 205; and (6) parallel implementation of iterative methods for solving linear equations.
A high-speed linear algebra library with automatic parallelism

NASA Technical Reports Server (NTRS)

Boucher, Michael L.

1994-01-01

Parallel or distributed processing is key to getting highest performance workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely small even though there are numerous computationally demanding programs that would significantly benefit from application of parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.
RRAM-based parallel computing architecture using k-nearest neighbor classification for pattern recognition

NASA Astrophysics Data System (ADS)

Jiang, Yuning; Kang, Jinfeng; Wang, Xinan

2017-03-01

Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
Symplectic molecular dynamics simulations on specially designed parallel computers.

PubMed

Borstnik, Urban; Janezic, Dusanka

2005-01-01

We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
Computation of Reacting Flows in Combustion Processes

NASA Technical Reports Server (NTRS)

Keith, Theo G., Jr.; Chen, Kuo-Huey

1997-01-01

The main objective of this research was to develop an efficient three-dimensional computer code for chemically reacting flows. The main computer code developed is ALLSPD-3D. The ALLSPD-3D computer program is developed for the calculation of three-dimensional, chemically reacting flows with sprays. The ALL-SPD code employs a coupled, strongly implicit solution procedure for turbulent spray combustion flows. A stochastic droplet model and an efficient method for treatment of the spray source terms in the gas-phase equations are used to calculate the evaporating liquid sprays. The chemistry treatment in the code is general enough that an arbitrary number of reaction and species can be defined by the users. Also, it is written in generalized curvilinear coordinates with both multi-block and flexible internal blockage capabilities to handle complex geometries. In addition, for general industrial combustion applications, the code provides both dilution and transpiration cooling capabilities. The ALLSPD algorithm, which employs the preconditioning and eigenvalue rescaling techniques, is capable of providing efficient solution for flows with a wide range of Mach numbers. Although written for three-dimensional flows in general, the code can be used for two-dimensional and axisymmetric flow computations as well. The code is written in such a way that it can be run in various computer platforms (supercomputers, workstations and parallel processors) and the GUI (Graphical User Interface) should provide a user-friendly tool in setting up and running the code.
Chemistry by Computer.

ERIC Educational Resources Information Center

Garmon, Linda

1981-01-01

Describes the features of various computer chemistry programs. Utilization of computer graphics, color, digital imaging, and other innovations are discussed in programs including those which aid in the identification of unknowns, predict whether chemical reactions are feasible, and predict the biological activity of xenobiotic compounds. (CS)
Parallelization of fine-scale computation in Agile Multiscale Modelling Methodology

NASA Astrophysics Data System (ADS)

Macioł, Piotr; Michalik, Kazimierz

2016-10-01

Nowadays, multiscale modelling of material behavior is an extensively developed area. An important obstacle against its wide application is high computational demands. Among others, the parallelization of multiscale computations is a promising solution. Heterogeneous multiscale models are good candidates for parallelization, since communication between sub-models is limited. In this paper, the possibility of parallelization of multiscale models based on Agile Multiscale Methodology framework is discussed. A sequential, FEM based macroscopic model has been combined with concurrently computed fine-scale models, employing a MatCalc thermodynamic simulator. The main issues, being investigated in this work are: (i) the speed-up of multiscale models with special focus on fine-scale computations and (ii) on decreasing the quality of computations enforced by parallel execution. Speed-up has been evaluated on the basis of Amdahl's law equations. The problem of `delay error', rising from the parallel execution of fine scale sub-models, controlled by the sequential macroscopic sub-model is discussed. Some technical aspects of combining third-party commercial modelling software with an in-house multiscale framework and a MPI library are also discussed.
Parallel algorithms for mapping pipelined and parallel computations

NASA Technical Reports Server (NTRS)

Nicol, David M.

1988-01-01

Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm sup 3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm sup 2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
Synthesizing parallel imaging applications using the CAP (computer-aided parallelization) tool

NASA Astrophysics Data System (ADS)

Gennart, Benoit A.; Mazzariol, Marc; Messerli, Vincent; Hersch, Roger D.

1997-12-01

Imaging applications such as filtering, image transforms and compression/decompression require vast amounts of computing power when applied to large data sets. These applications would potentially benefit from the use of parallel processing. However, dedicated parallel computers are expensive and their processing power per node lags behind that of the most recent commodity components. Furthermore, developing parallel applications remains a difficult task: writing and debugging the application is difficult (deadlocks), programs may not be portable from one parallel architecture to the other, and performance often comes short of expectations. In order to facilitate the development of parallel applications, we propose the CAP computer-aided parallelization tool which enables application programmers to specify at a high-level of abstraction the flow of data between pipelined-parallel operations. In addition, the CAP tool supports the programmer in developing parallel imaging and storage operations. CAP enables combining efficiently parallel storage access routines and image processing sequential operations. This paper shows how processing and I/O intensive imaging applications must be implemented to take advantage of parallelism and pipelining between data access and processing. This paper's contribution is (1) to show how such implementations can be compactly specified in CAP, and (2) to demonstrate that CAP specified applications achieve the performance of custom parallel code. The paper analyzes theoretically the performance of CAP specified applications and demonstrates the accuracy of the theoretical analysis through experimental measurements.
CSM parallel structural methods research

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1989-01-01

Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.

Parallelized direct execution simulation of message-passing parallel programs

NASA Technical Reports Server (NTRS)

Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

1994-01-01

As massively parallel computers proliferate, there is growing interest in findings ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
A Parallel Controlled Study of the Effectiveness of a Partially Flipped Organic Chemistry Course on Student Performance, Perceptions, and Course Completion

ERIC Educational Resources Information Center

Shattuck, James C.

2016-01-01

Organic chemistry is very challenging to many students pursuing science careers. Flipping the classroom presents an opportunity to significantly improve student success by increasing active learning, which research shows is highly beneficial to student learning. However, flipping an entire course may seem too daunting or an instructor may simply…
Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.
Computer-aided drug discovery research at a global contract research organization

NASA Astrophysics Data System (ADS)

Kitchen, Douglas B.

2017-03-01

Computer-aided drug discovery started at Albany Molecular Research, Inc in 1997. Over nearly 20 years the role of cheminformatics and computational chemistry has grown throughout the pharmaceutical industry and at AMRI. This paper will describe the infrastructure and roles of CADD throughout drug discovery and some of the lessons learned regarding the success of several methods. Various contributions provided by computational chemistry and cheminformatics in chemical library design, hit triage, hit-to-lead and lead optimization are discussed. Some frequently used computational chemistry techniques are described. The ways in which they may contribute to discovery projects are presented based on a few examples from recent publications.
Computer-aided drug discovery research at a global contract research organization.

PubMed

Kitchen, Douglas B

2017-03-01

Computer-aided drug discovery started at Albany Molecular Research, Inc in 1997. Over nearly 20 years the role of cheminformatics and computational chemistry has grown throughout the pharmaceutical industry and at AMRI. This paper will describe the infrastructure and roles of CADD throughout drug discovery and some of the lessons learned regarding the success of several methods. Various contributions provided by computational chemistry and cheminformatics in chemical library design, hit triage, hit-to-lead and lead optimization are discussed. Some frequently used computational chemistry techniques are described. The ways in which they may contribute to discovery projects are presented based on a few examples from recent publications.
The science of computing - Parallel computation

NASA Technical Reports Server (NTRS)

Denning, P. J.

1985-01-01

Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concommitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.
System-wide power management control via clock distribution network

DOEpatents

Coteus, Paul W.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Reed, Don D.

2015-05-19

An apparatus, method and computer program product for automatically controlling power dissipation of a parallel computing system that includes a plurality of processors. A computing device issues a command to the parallel computing system. A clock pulse-width modulator encodes the command in a system clock signal to be distributed to the plurality of processors. The plurality of processors in the parallel computing system receive the system clock signal including the encoded command, and adjusts power dissipation according to the encoded command.
Parallel Computing:. Some Activities in High Energy Physics

NASA Astrophysics Data System (ADS)

Willers, Ian

This paper examines some activities in High Energy Physics that utilise parallel computing. The topic includes all computing from the proposed SIMD front end detectors, the farming applications, high-powered RISC processors and the large machines in the computer centers. We start by looking at the motivation behind using parallelism for general purpose computing. The developments around farming are then described from its simplest form to the more complex system in Fermilab. Finally, there is a list of some developments that are happening close to the experiments.
Implementation of DFT application on ternary optical computer

NASA Astrophysics Data System (ADS)

Junjie, Peng; Youyi, Fu; Xiaofeng, Zhang; Shuai, Kong; Xinyu, Wei

2018-03-01

As its characteristics of huge number of data bits and low energy consumption, optical computing may be used in the applications such as DFT etc. which needs a lot of computation and can be implemented in parallel. According to this, DFT implementation methods in full parallel as well as in partial parallel are presented. Based on resources ternary optical computer (TOC), extensive experiments were carried out. Experimental results show that the proposed schemes are correct and feasible. They provide a foundation for further exploration of the applications on TOC that needs a large amount calculation and can be processed in parallel.
Using Computer Visualization Models in High School Chemistry: The Role of Teacher Beliefs.

ERIC Educational Resources Information Center

Robblee, Karen M.; Garik, Peter; Abegg, Gerald L.; Faux, Russell; Horwitz, Paul

This paper discusses the role of high school chemistry teachers' beliefs in implementing computer visualization software to teach atomic and molecular structure from a quantum mechanical perspective. The informants in this study were four high school chemistry teachers with comparable academic and professional backgrounds. These teachers received…
Integrating Free Computer Software in Chemistry and Biochemistry Instruction: An International Collaboration

ERIC Educational Resources Information Center

Cedeno, David L.; Jones, Marjorie A.; Friesen, Jon A.; Wirtz, Mark W.; Rios, Luz Amalia; Ocampo, Gonzalo Taborda

2010-01-01

At the Universidad de Caldas, Manizales, Colombia, we used their new computer facilities to introduce chemistry graduate students to biochemical database mining and quantum chemistry calculations using freeware. These hands-on workshops allowed the students a strong introduction to easily accessible software and how to use this software to begin…
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data

DOE PAGES

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S.; ...

2017-02-23

Here, a newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides,more » important for metal extraction chemistry, are parametrized using ParFit.« less
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S.

Here, a newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides,more » important for metal extraction chemistry, are parametrized using ParFit.« less
Special purpose parallel computer architecture for real-time control and simulation in robotic applications

NASA Technical Reports Server (NTRS)

Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)

1993-01-01

This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
Exact parallel algorithms for some members of the traveling salesman problem family

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pekny, J.F.

1989-01-01

The traveling salesman problem and its many generalizations comprise one of the best known combinatorial optimization problem families. Most members of the family are NP-complete problems so that exact algorithms require an unpredictable and sometimes large computational effort. Parallel computers offer hope for providing the power required to meet these demands. A major barrier to applying parallel computers is the lack of parallel algorithms. The contributions presented in this thesis center around new exact parallel algorithms for the asymmetric traveling salesman problem (ATSP), prize collecting traveling salesman problem (PCTSP), and resource constrained traveling salesman problem (RCTSP). The RCTSP is amore » particularly difficult member of the family since finding a feasible solution is an NP-complete problem. An exact sequential algorithm is also presented for the directed hamiltonian cycle problem (DHCP). The DHCP algorithm is superior to current heuristic approaches and represents the first exact method applicable to large graphs. Computational results presented for each of the algorithms demonstrates the effectiveness of combining efficient algorithms with parallel computing methods. Performance statistics are reported for randomly generated ATSPs with 7,500 cities, PCTSPs with 200 cities, RCTSPs with 200 cities, DHCPs with 3,500 vertices, and assignment problems of size 10,000. Sequential results were collected on a Sun 4/260 engineering workstation, while parallel results were collected using a 14 and 100 processor BBN Butterfly Plus computer. The computational results represent the largest instances ever solved to optimality on any type of computer.« less
Use of parallel computing in mass processing of laser data

NASA Astrophysics Data System (ADS)

Będkowski, J.; Bratuś, R.; Prochaska, M.; Rzonca, A.

2015-12-01

The first part of the paper includes a description of the rules used to generate the algorithm needed for the purpose of parallel computing and also discusses the origins of the idea of research on the use of graphics processors in large scale processing of laser scanning data. The next part of the paper includes the results of an efficiency assessment performed for an array of different processing options, all of which were substantially accelerated with parallel computing. The processing options were divided into the generation of orthophotos using point clouds, coloring of point clouds, transformations, and the generation of a regular grid, as well as advanced processes such as the detection of planes and edges, point cloud classification, and the analysis of data for the purpose of quality control. Most algorithms had to be formulated from scratch in the context of the requirements of parallel computing. A few of the algorithms were based on existing technology developed by the Dephos Software Company and then adapted to parallel computing in the course of this research study. Processing time was determined for each process employed for a typical quantity of data processed, which helped confirm the high efficiency of the solutions proposed and the applicability of parallel computing to the processing of laser scanning data. The high efficiency of parallel computing yields new opportunities in the creation and organization of processing methods for laser scanning data.
Development and assessment of a chemistry-based computer video game as a learning tool

NASA Astrophysics Data System (ADS)

Martinez-Hernandez, Kermin Joel

The chemistry-based computer video game is a multidisciplinary collaboration between chemistry and computer graphics and technology fields developed to explore the use of video games as a possible learning tool. This innovative approach aims to integrate elements of commercial video game and authentic chemistry context environments into a learning experience through gameplay. The project consists of three areas: development, assessment, and implementation. However, the foci of this study were the development and assessment of the computer video game including possible learning outcomes and game design elements. A chemistry-based game using a mixed genre of a single player first-person game embedded with action-adventure and puzzle components was developed to determine if students' level of understanding of chemistry concepts change after gameplay intervention. Three phases have been completed to assess students' understanding of chemistry concepts prior and after gameplay intervention. Two main assessment instruments (pre/post open-ended content survey and individual semi-structured interviews) were used to assess student understanding of concepts. In addition, game design elements were evaluated for future development phases. Preliminary analyses of the interview data suggest that students were able to understand most of the chemistry challenges presented in the game and the game served as a review for previously learned concepts as well as a way to apply such previous knowledge. To guarantee a better understanding of the chemistry concepts, additions such as debriefing and feedback about the content presented in the game seem to be needed. The use of visuals in the game to represent chemical processes, game genre, and game idea appear to be the game design elements that students like the most about the current computer video game.
From transistor to trapped-ion computers for quantum chemistry.

PubMed

Yung, M-H; Casanova, J; Mezzacapo, A; McClean, J; Lamata, L; Aspuru-Guzik, A; Solano, E

2014-01-07

Over the last few decades, quantum chemistry has progressed through the development of computational methods based on modern digital computers. However, these methods can hardly fulfill the exponentially-growing resource requirements when applied to large quantum systems. As pointed out by Feynman, this restriction is intrinsic to all computational models based on classical physics. Recently, the rapid advancement of trapped-ion technologies has opened new possibilities for quantum control and quantum simulations. Here, we present an efficient toolkit that exploits both the internal and motional degrees of freedom of trapped ions for solving problems in quantum chemistry, including molecular electronic structure, molecular dynamics, and vibronic coupling. We focus on applications that go beyond the capacity of classical computers, but may be realizable on state-of-the-art trapped-ion systems. These results allow us to envision a new paradigm of quantum chemistry that shifts from the current transistor to a near-future trapped-ion-based technology.
From transistor to trapped-ion computers for quantum chemistry

PubMed Central

Yung, M.-H.; Casanova, J.; Mezzacapo, A.; McClean, J.; Lamata, L.; Aspuru-Guzik, A.; Solano, E.

2014-01-01

Over the last few decades, quantum chemistry has progressed through the development of computational methods based on modern digital computers. However, these methods can hardly fulfill the exponentially-growing resource requirements when applied to large quantum systems. As pointed out by Feynman, this restriction is intrinsic to all computational models based on classical physics. Recently, the rapid advancement of trapped-ion technologies has opened new possibilities for quantum control and quantum simulations. Here, we present an efficient toolkit that exploits both the internal and motional degrees of freedom of trapped ions for solving problems in quantum chemistry, including molecular electronic structure, molecular dynamics, and vibronic coupling. We focus on applications that go beyond the capacity of classical computers, but may be realizable on state-of-the-art trapped-ion systems. These results allow us to envision a new paradigm of quantum chemistry that shifts from the current transistor to a near-future trapped-ion-based technology. PMID:24395054
Parallelized computation for computer simulation of electrocardiograms using personal computers with multi-core CPU and general-purpose GPU.

PubMed

Shen, Wenfeng; Wei, Daming; Xu, Weimin; Zhu, Xin; Yuan, Shizhong

2010-10-01

Biological computations like electrocardiological modelling and simulation usually require high-performance computing environments. This paper introduces an implementation of parallel computation for computer simulation of electrocardiograms (ECGs) in a personal computer environment with an Intel CPU of Core (TM) 2 Quad Q6600 and a GPU of Geforce 8800GT, with software support by OpenMP and CUDA. It was tested in three parallelization device setups: (a) a four-core CPU without a general-purpose GPU, (b) a general-purpose GPU plus 1 core of CPU, and (c) a four-core CPU plus a general-purpose GPU. To effectively take advantage of a multi-core CPU and a general-purpose GPU, an algorithm based on load-prediction dynamic scheduling was developed and applied to setting (c). In the simulation with 1600 time steps, the speedup of the parallel computation as compared to the serial computation was 3.9 in setting (a), 16.8 in setting (b), and 20.0 in setting (c). This study demonstrates that a current PC with a multi-core CPU and a general-purpose GPU provides a good environment for parallel computations in biological modelling and simulation studies. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.

Creating a Parallel Version of VisIt for Microsoft Windows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Whitlock, B J; Biagas, K S; Rawson, P L

2011-12-07

VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing powermore » is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPU's has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.« less
HeNCE: A Heterogeneous Network Computing Environment

DOE PAGES

Beguelin, Adam; Dongarra, Jack J.; Geist, George Al; ...

1994-01-01

Network computing seeks to utilize the aggregate resources of many networked computers to solve a single problem. In so doing it is often possible to obtain supercomputer performance from an inexpensive local area network. The drawback is that network computing is complicated and error prone when done by hand, especially if the computers have different operating systems and data formats and are thus heterogeneous. The heterogeneous network computing environment (HeNCE) is an integrated graphical environment for creating and running parallel programs over a heterogeneous collection of computers. It is built on a lower level package called parallel virtual machine (PVM).more » The HeNCE philosophy of parallel programming is to have the programmer graphically specify the parallelism of a computation and to automate, as much as possible, the tasks of writing, compiling, executing, debugging, and tracing the network computation. Key to HeNCE is a graphical language based on directed graphs that describe the parallelism and data dependencies of an application. Nodes in the graphs represent conventional Fortran or C subroutines and the arcs represent data and control flow. This article describes the present state of HeNCE, its capabilities, limitations, and areas of future research.« less
Parallelized multi–graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy

PubMed Central

Tankam, Patrice; Santhanam, Anand P.; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P.

2014-01-01

Abstract. Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6 mm3 skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing. PMID:24695868
Parallelized multi-graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy.

PubMed

Tankam, Patrice; Santhanam, Anand P; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P

2014-07-01

Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6 mm3 skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing.
Axisymmetric computational fluid dynamics analysis of Saturn V/S1-C/F1 nozzle and plume

NASA Technical Reports Server (NTRS)

Ruf, Joseph H.

1993-01-01

An axisymmetric single engine Computational Fluid Dynamics calculation of the Saturn V/S 1-C vehicle base region and F1 engine plume is described. There were two objectives of this work, the first was to calculate an axisymmetric approximation of the nozzle, plume and base region flow fields of S1-C/F1, relate/scale this to flight data and apply this scaling factor to a NLS/STME axisymmetric calculations from a parallel effort. The second was to assess the differences in F1 and STME plume shear layer development and concentration of combustible gases. This second piece of information was to be input/supporting data for assumptions made in NLS2 base temperature scaling methodology from which the vehicle base thermal environments were being generated. The F1 calculations started at the main combustion chamber faceplate and incorporated the turbine exhaust dump/nozzle film coolant. The plume and base region calculations were made for ten thousand feet and 57 thousand feet altitude at vehicle flight velocity and in stagnant freestream. FDNS was implemented with a 14 species, 28 reaction finite rate chemistry model plus a soot burning model for the RP-1/LOX chemistry. Nozzle and plume flow fields are shown, the plume shear layer constituents are compared to a STME plume. Conclusions are made about the validity and status of the analysis and NLS2 vehicle base thermal environment definition methodology.
The EPA CompTox Chemistry Dashboard - an online resource for environmental chemists (ACS Spring Meeting)

EPA Science Inventory

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data drive...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Toxicology Data (SOT)

EPA Science Inventory

The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data drive...
Seeing the forest for the trees: Networked workstations as a parallel processing computer

NASA Technical Reports Server (NTRS)

Breen, J. O.; Meleedy, D. M.

1992-01-01

Unlike traditional 'serial' processing computers in which one central processing unit performs one instruction at a time, parallel processing computers contain several processing units, thereby, performing several instructions at once. Many of today's fastest supercomputers achieve their speed by employing thousands of processing elements working in parallel. Few institutions can afford these state-of-the-art parallel processors, but many already have the makings of a modest parallel processing system. Workstations on existing high-speed networks can be harnessed as nodes in a parallel processing environment, bringing the benefits of parallel processing to many. While such a system can not rival the industry's latest machines, many common tasks can be accelerated greatly by spreading the processing burden and exploiting idle network resources. We study several aspects of this approach, from algorithms to select nodes to speed gains in specific tasks. With ever-increasing volumes of astronomical data, it becomes all the more necessary to utilize our computing resources fully.
Computational Chemistry Using Modern Electronic Structure Methods

ERIC Educational Resources Information Center

Bell, Stephen; Dines, Trevor J.; Chowdhry, Babur Z.; Withnall, Robert

2007-01-01

Various modern electronic structure methods are now days used to teach computational chemistry to undergraduate students. Such quantum calculations can now be easily used even for large size molecules.
Six Years of Parallel Computing at NAS (1987 - 1993): What Have we Learned?

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Cooper, D. M. (Technical Monitor)

1994-01-01

In the fall of 1987 the age of parallelism at NAS began with the installation of a 32K processor CM-2 from Thinking Machines. In 1987 this was described as an "experiment" in parallel processing. In the six years since, NAS acquired a series of parallel machines, and conducted an active research and development effort focused on the use of highly parallel machines for applications in the computational aerosciences. In this time period parallel processing for scientific applications evolved from a fringe research topic into the one of main activities at NAS. In this presentation I will review the history of parallel computing at NAS in the context of the major progress, which has been made in the field in general. I will attempt to summarize the lessons we have learned so far, and the contributions NAS has made to the state of the art. Based on these insights I will comment on the current state of parallel computing (including the HPCC effort) and try to predict some trends for the next six years.
Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R

Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface (`PAMI`) of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single onemore » of the endpoints in the geometry, an instruction for the collective operation.« less
A parallel variable metric optimization algorithm

NASA Technical Reports Server (NTRS)

Straeter, T. A.

1973-01-01

An algorithm, designed to exploit the parallel computing or vector streaming (pipeline) capabilities of computers is presented. When p is the degree of parallelism, then one cycle of the parallel variable metric algorithm is defined as follows: first, the function and its gradient are computed in parallel at p different values of the independent variable; then the metric is modified by p rank-one corrections; and finally, a single univariant minimization is carried out in the Newton-like direction. Several properties of this algorithm are established. The convergence of the iterates to the solution is proved for a quadratic functional on a real separable Hilbert space. For a finite-dimensional space the convergence is in one cycle when p equals the dimension of the space. Results of numerical experiments indicate that the new algorithm will exploit parallel or pipeline computing capabilities to effect faster convergence than serial techniques.
Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R

Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with themore » endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.« less
Template based parallel checkpointing in a massively parallel computer system

DOEpatents

Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

2009-01-13

A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
Computer Simulations of Quantum Theory of Hydrogen Atom for Natural Science Education Students in a Virtual Lab

ERIC Educational Resources Information Center

Singh, Gurmukh

2012-01-01

The present article is primarily targeted for the advanced college/university undergraduate students of chemistry/physics education, computational physics/chemistry, and computer science. The most recent software system such as MS Visual Studio .NET version 2010 is employed to perform computer simulations for modeling Bohr's quantum theory of…
Computer-based, Jeopardy™-like game in general chemistry for engineering majors

NASA Astrophysics Data System (ADS)

Ling, S. S.; Saffre, F.; Kadadha, M.; Gater, D. L.; Isakovic, A. F.

2013-03-01

We report on the design of Jeopardy™-like computer game for enhancement of learning of general chemistry for engineering majors. While we examine several parameters of student achievement and attitude, our primary concern is addressing the motivation of students, which tends to be low in a traditionally run chemistry lectures. The effect of the game-playing is tested by comparing paper-based game quiz, which constitutes a control group, and computer-based game quiz, constituting a treatment group. Computer-based game quizzes are Java™-based applications that students run once a week in the second part of the last lecture of the week. Overall effectiveness of the semester-long program is measured through pretest-postest conceptual testing of general chemistry. The objective of this research is to determine to what extent this ``gamification'' of the course delivery and course evaluation processes may be beneficial to the undergraduates' learning of science in general, and chemistry in particular. We present data addressing gender-specific difference in performance, as well as background (pre-college) level of general science and chemistry preparation. We outline the plan how to extend such approach to general physics courses and to modern science driven electives, and we offer live, in-lectures examples of our computer gaming experience. We acknowledge support from Khalifa University, Abu Dhabi
Computational strategies for three-dimensional flow simulations on distributed computer systems. Ph.D. Thesis Semiannual Status Report, 15 Aug. 1993 - 15 Feb. 1994

NASA Technical Reports Server (NTRS)

Weed, Richard Allen; Sankar, L. N.

1994-01-01

An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance to price ratio of engineering workstations has led to research to development procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate code performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.
Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

NASA Technical Reports Server (NTRS)

Weeks, Cindy Lou

1986-01-01

Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
Parallelization of interpolation, solar radiation and water flow simulation modules in GRASS GIS using OpenMP

NASA Astrophysics Data System (ADS)

Hofierka, Jaroslav; Lacko, Michal; Zubal, Stanislav

2017-10-01

In this paper, we describe the parallelization of three complex and computationally intensive modules of GRASS GIS using the OpenMP application programming interface for multi-core computers. These include the v.surf.rst module for spatial interpolation, the r.sun module for solar radiation modeling and the r.sim.water module for water flow simulation. We briefly describe the functionality of the modules and parallelization approaches used in the modules. Our approach includes the analysis of the module's functionality, identification of source code segments suitable for parallelization and proper application of OpenMP parallelization code to create efficient threads processing the subtasks. We document the efficiency of the solutions using the airborne laser scanning data representing land surface in the test area and derived high-resolution digital terrain model grids. We discuss the performance speed-up and parallelization efficiency depending on the number of processor threads. The study showed a substantial increase in computation speeds on a standard multi-core computer while maintaining the accuracy of results in comparison to the output from original modules. The presented parallelization approach showed the simplicity and efficiency of the parallelization of open-source GRASS GIS modules using OpenMP, leading to an increased performance of this geospatial software on standard multi-core computers.
Fast hydrological model calibration based on the heterogeneous parallel computing accelerated shuffled complex evolution method

NASA Astrophysics Data System (ADS)

Kan, Guangyuan; He, Xiaoyan; Ding, Liuqian; Li, Jiren; Hong, Yang; Zuo, Depeng; Ren, Minglei; Lei, Tianjie; Liang, Ke

2018-01-01

Hydrological model calibration has been a hot issue for decades. The shuffled complex evolution method developed at the University of Arizona (SCE-UA) has been proved to be an effective and robust optimization approach. However, its computational efficiency deteriorates significantly when the amount of hydrometeorological data increases. In recent years, the rise of heterogeneous parallel computing has brought hope for the acceleration of hydrological model calibration. This study proposed a parallel SCE-UA method and applied it to the calibration of a watershed rainfall-runoff model, the Xinanjiang model. The parallel method was implemented on heterogeneous computing systems using OpenMP and CUDA. Performance testing and sensitivity analysis were carried out to verify its correctness and efficiency. Comparison results indicated that heterogeneous parallel computing-accelerated SCE-UA converged much more quickly than the original serial version and possessed satisfactory accuracy and stability for the task of fast hydrological model calibration.

Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Shuangshuang; Chen, Yousu; Wu, Di

2015-12-09

Power system dynamic simulation computes the system response to a sequence of large disturbance, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operation. It consists of a large set of differential and algebraic equations, which is computational intensive and challenging to solve using single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on shared-memory platform, and Messagemore » Passing Interface (MPI) on distributed-memory clusters, respectively. The difference of the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.« less
A hybrid parallel architecture for electrostatic interactions in the simulation of dissipative particle dynamics

NASA Astrophysics Data System (ADS)

Yang, Sheng-Chun; Lu, Zhong-Yuan; Qian, Hu-Jun; Wang, Yong-Lei; Han, Jie-Ping

2017-11-01

In this work, we upgraded the electrostatic interaction method of CU-ENUF (Yang, et al., 2016) which first applied CUNFFT (nonequispaced Fourier transforms based on CUDA) to the reciprocal-space electrostatic computation and made the computation of electrostatic interaction done thoroughly in GPU. The upgraded edition of CU-ENUF runs concurrently in a hybrid parallel way that enables the computation parallelizing on multiple computer nodes firstly, then further on the installed GPU in each computer. By this parallel strategy, the size of simulation system will be never restricted to the throughput of a single CPU or GPU. The most critical technical problem is how to parallelize a CUNFFT in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills. Furthermore, the upgraded method is capable of computing electrostatic interactions for both the atomistic molecular dynamics (MD) and the dissipative particle dynamics (DPD). Finally, the benchmarks conducted for validation and performance indicate that the upgraded method is able to not only present a good precision when setting suitable parameters, but also give an efficient way to compute electrostatic interactions for huge simulation systems. Program Files doi:http://dx.doi.org/10.17632/zncf24fhpv.1 Licensing provisions: GNU General Public License 3 (GPL) Programming language: C, C++, and CUDA C Supplementary material: The program is designed for effective electrostatic interactions of large-scale simulation systems, which runs on particular computers equipped with NVIDIA GPUs. It has been tested on (a) single computer node with Intel(R) Core(TM) i7-3770@ 3.40 GHz (CPU) and GTX 980 Ti (GPU), and (b) MPI parallel computer nodes with the same configurations. Nature of problem: For molecular dynamics simulation, the electrostatic interaction is the most time-consuming computation because of its long-range feature and slow convergence in simulation space, which approximately take up most of the total simulation time. Although the parallel method CU-ENUF (Yang et al., 2016) based on GPU has achieved a qualitative leap compared with previous methods in electrostatic interactions computation, the computation capability is limited to the throughput capacity of a single GPU for super-scale simulation system. Therefore, we should look for an effective method to handle the calculation of electrostatic interactions efficiently for a simulation system with super-scale size. Solution method: We constructed a hybrid parallel architecture, in which CPU and GPU are combined to accelerate the electrostatic computation effectively. Firstly, the simulation system is divided into many subtasks via domain-decomposition method. Then MPI (Message Passing Interface) is used to implement the CPU-parallel computation with each computer node corresponding to a particular subtask, and furthermore each subtask in one computer node will be executed in GPU in parallel efficiently. In this hybrid parallel method, the most critical technical problem is how to parallelize a CUNFFT (nonequispaced fast Fourier transform based on CUDA) in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills. Restrictions: The HP-ENUF is mainly oriented to super-scale system simulations, in which the performance superiority is shown adequately. However, for a small simulation system containing less than 106 particles, the mode of multiple computer nodes has no apparent efficiency advantage or even lower efficiency due to the serious network delay among computer nodes, than the mode of single computer node. References: (1) S.-C. Yang, H.-J. Qian, Z.-Y. Lu, Appl. Comput. Harmon. Anal. 2016, http://dx.doi.org/10.1016/j.acha.2016.04.009. (2) S.-C. Yang, Y.-L. Wang, G.-S. Jiao, H.-J. Qian, Z.-Y. Lu, J. Comput. Chem. 37 (2016) 378. (3) S.-C. Yang, Y.-L. Zhu, H.-J. Qian, Z.-Y. Lu, Appl. Chem. Res. Chin. Univ., 2017, http://dx.doi.org/10.1007/s40242-016-6354-5. (4) Y.-L. Zhu, H. Liu, Z.-W. Li, H.-J. Qian, G. Milano, Z.-Y. Lu, J. Comput. Chem. 34 (2013) 2197.
Terascale direct numerical simulations of turbulent combustion using S3D

NASA Astrophysics Data System (ADS)

Chen, J. H.; Choudhary, A.; de Supinski, B.; DeVries, M.; Hawkes, E. R.; Klasky, S.; Liao, W. K.; Ma, K. L.; Mellor-Crummey, J.; Podhorszki, N.; Sankaran, R.; Shende, S.; Yoo, C. S.

2009-01-01

Computational science is paramount to the understanding of underlying processes in internal combustion engines of the future that will utilize non-petroleum-based alternative fuels, including carbon-neutral biofuels, and burn in new combustion regimes that will attain high efficiency while minimizing emissions of particulates and nitrogen oxides. Next-generation engines will likely operate at higher pressures, with greater amounts of dilution and utilize alternative fuels that exhibit a wide range of chemical and physical properties. Therefore, there is a significant role for high-fidelity simulations, direct numerical simulations (DNS), specifically designed to capture key turbulence-chemistry interactions in these relatively uncharted combustion regimes, and in particular, that can discriminate the effects of differences in fuel properties. In DNS, all of the relevant turbulence and flame scales are resolved numerically using high-order accurate numerical algorithms. As a consequence terascale DNS are computationally intensive, require massive amounts of computing power and generate tens of terabytes of data. Recent results from terascale DNS of turbulent flames are presented here, illustrating its role in elucidating flame stabilization mechanisms in a lifted turbulent hydrogen/air jet flame in a hot air coflow, and the flame structure of a fuel-lean turbulent premixed jet flame. Computing at this scale requires close collaborations between computer and combustion scientists to provide optimized scaleable algorithms and software for terascale simulations, efficient collective parallel I/O, tools for volume visualization of multiscale, multivariate data and automating the combustion workflow. The enabling computer science, applied to combustion science, is also required in many other terascale physics and engineering simulations. In particular, performance monitoring is used to identify the performance of key kernels in the DNS code, S3D and especially memory intensive loops in the code. Through the careful application of loop transformations, data reuse in cache is exploited thereby reducing memory bandwidth needs, and hence, improving S3D's nodal performance. To enhance collective parallel I/O in S3D, an MPI-I/O caching design is used to construct a two-stage write-behind method for improving the performance of write-only operations. The simulations generate tens of terabytes of data requiring analysis. Interactive exploration of the simulation data is enabled by multivariate time-varying volume visualization. The visualization highlights spatial and temporal correlations between multiple reactive scalar fields using an intuitive user interface based on parallel coordinates and time histogram. Finally, an automated combustion workflow is designed using Kepler to manage large-scale data movement, data morphing, and archival and to provide a graphical display of run-time diagnostics.
Ambarish Nag | NREL

Science.gov Websites

|Mathematical biology Education Ph.D., Computational Chemistry, University of Chicago M.S., Chemistry , University of Chicago M.S., (2-Year) Chemistry, Indian Institute of Technology, Kanpur, India B.S., Chemistry
The journey from forensic to predictive materials science using density functional theory

DOE PAGES

Schultz, Peter A.

2017-09-12

Approximate methods for electronic structure, implemented in sophisticated computer codes and married to ever-more powerful computing platforms, have become invaluable in chemistry and materials science. The maturing and consolidation of quantum chemistry codes since the 1980s, based upon explicitly correlated electronic wave functions, has made them a staple of modern molecular chemistry. Here, the impact of first principles electronic structure in physics and materials science had lagged owing to the extra formal and computational demands of bulk calculations.
The journey from forensic to predictive materials science using density functional theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schultz, Peter A.

Approximate methods for electronic structure, implemented in sophisticated computer codes and married to ever-more powerful computing platforms, have become invaluable in chemistry and materials science. The maturing and consolidation of quantum chemistry codes since the 1980s, based upon explicitly correlated electronic wave functions, has made them a staple of modern molecular chemistry. Here, the impact of first principles electronic structure in physics and materials science had lagged owing to the extra formal and computational demands of bulk calculations.
Introduction of Digital Computer Technology Into the Undergraduate Chemistry Laboratory. Final Technical Report.

ERIC Educational Resources Information Center

Perone, Sam P.

The objective of this project has been the development of a successful approach for the incorporation of on-line computer technology into the undergraduate chemistry laboratory. This approach assumes no prior programing, electronics or instrumental analysis experience on the part of the student; it does not displace the chemistry content with…
Computational Chemistry in the Undergraduate Laboratory: A Mechanistic Study of the Wittig Reaction

ERIC Educational Resources Information Center

Albrecht, Birgit

2014-01-01

The Wittig reaction is one of the most useful reactions in organic chemistry. Despite its prominence early in the organic chemistry curriculum, the exact mechanism of this reaction is still under debate, and this controversy is often neglected in the classroom. Introducing a simple computational study of the Wittig reaction illustrates the…
Quantum chemistry simulation on quantum computers: theories and experiments.

PubMed

Lu, Dawei; Xu, Boruo; Xu, Nanyang; Li, Zhaokai; Chen, Hongwei; Peng, Xinhua; Xu, Ruixue; Du, Jiangfeng

2012-07-14

It has been claimed that quantum computers can mimic quantum systems efficiently in the polynomial scale. Traditionally, those simulations are carried out numerically on classical computers, which are inevitably confronted with the exponential growth of required resources, with the increasing size of quantum systems. Quantum computers avoid this problem, and thus provide a possible solution for large quantum systems. In this paper, we first discuss the ideas of quantum simulation, the background of quantum simulators, their categories, and the development in both theories and experiments. We then present a brief introduction to quantum chemistry evaluated via classical computers followed by typical procedures of quantum simulation towards quantum chemistry. Reviewed are not only theoretical proposals but also proof-of-principle experimental implementations, via a small quantum computer, which include the evaluation of the static molecular eigenenergy and the simulation of chemical reaction dynamics. Although the experimental development is still behind the theory, we give prospects and suggestions for future experiments. We anticipate that in the near future quantum simulation will become a powerful tool for quantum chemistry over classical computations.
Computational chemistry at Janssen

NASA Astrophysics Data System (ADS)

van Vlijmen, Herman; Desjarlais, Renee L.; Mirzadegan, Tara

2017-03-01

Computer-aided drug discovery activities at Janssen are carried out by scientists in the Computational Chemistry group of the Discovery Sciences organization. This perspective gives an overview of the organizational and operational structure, the science, internal and external collaborations, and the impact of the group on Drug Discovery at Janssen.
Cooperative storage of shared files in a parallel computing system with dynamic block size

DOEpatents

Bent, John M.; Faibish, Sorin; Grider, Gary

2015-11-10

Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-01-10

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

2012-04-17

Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

NASA Astrophysics Data System (ADS)

Simunovic, S.; Zacharia, T.; Baltas, N.; Spalding, D. B.

1995-03-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, S.; Zacharia, T.; Baltas, N.

1995-04-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. Themore » Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.« less
The IUPAC International Congresses of Pesticide Chemistry (1963-2014) and Pest Management Science: a half-century of progress.

PubMed

Brooks, Gerald T

2014-08-01

As we approach the 2014 San Francisco IUPAC Pesticide Chemistry Congress, we reflect on the 51 years of such congresses every 4 years since 1963. Meanwhile, our journal, Pesticide Science/Pest Management Science, has in parallel continually published relevant science for nearly as long (44 years from 1970). © 2014 Society of Chemical Industry.
Biocellion: accelerating computer simulation of multicellular biological system models

PubMed Central

Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

2014-01-01

Motivation: Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. Results: We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Availability and implementation: Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. Contact: seunghwa.kang@pnnl.gov PMID:25064572
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

NASA Technical Reports Server (NTRS)

Logan, Terry G.

1994-01-01

The purpose of this study is to investigate the performance of the integral equation computations using numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of computational performance of the MPP CM-5 computer and conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Some performance results are obtained on CM-5 with 32, 62, 128 nodes along with those on Cray-YMP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code near or out-performs the equivalent serial FORTRAN code for some cases.
Learning Kinetic Monte Carlo Models of Condensed Phase High Temperature Chemistry from Molecular Dynamics

NASA Astrophysics Data System (ADS)

Yang, Qian; Sing-Long, Carlos; Chen, Enze; Reed, Evan

2017-06-01

Complex chemical processes, such as the decomposition of energetic materials and the chemistry of planetary interiors, are typically studied using large-scale molecular dynamics simulations that run for weeks on high performance parallel machines. These computations may involve thousands of atoms forming hundreds of molecular species and undergoing thousands of reactions. It is natural to wonder whether this wealth of data can be utilized to build more efficient, interpretable, and predictive models. In this talk, we will use techniques from statistical learning to develop a framework for constructing Kinetic Monte Carlo (KMC) models from molecular dynamics data. We will show that our KMC models can not only extrapolate the behavior of the chemical system by as much as an order of magnitude in time, but can also be used to study the dynamics of entirely different chemical trajectories with a high degree of fidelity. Then, we will discuss three different methods for reducing our learned KMC models, including a new and efficient data-driven algorithm using L1-regularization. We demonstrate our framework throughout on a system of high-temperature high-pressure liquid methane, thought to be a major component of gas giant planetary interiors.
Parallel aeroelastic computations for wing and wing-body configurations

NASA Technical Reports Server (NTRS)

Byun, Chansup

1994-01-01

The objective of this research is to develop computationally efficient methods for solving fluid-structural interaction problems by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures on parallel computers. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.
Implementation of a 3D mixing layer code on parallel computers

NASA Technical Reports Server (NTRS)

Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.

1995-01-01

This paper summarizes our progress and experience in the development of a Computational-Fluid-Dynamics code on parallel computers to simulate three-dimensional spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique, while we have not been able to compile the code with the present version of HPF compilers.
NASA Workshop on Computational Structural Mechanics 1987, part 1

NASA Technical Reports Server (NTRS)

Sykes, Nancy P. (Editor)

1989-01-01

Topics in Computational Structural Mechanics (CSM) are reviewed. CSM parallel structural methods, a transputer finite element solver, architectures for multiprocessor computers, and parallel eigenvalue extraction are among the topics discussed.
Using Games To Teach Chemistry: An Annotated Bibliography

NASA Astrophysics Data System (ADS)

Russell, Jeanne V.

1999-04-01

A list of published or marketed games based on a chemistry motif is presented. Each game is listed according to its level, subject matter, and title. A bibliographic notation and a short description are given for each game. For Introductory/High School/General Chemistry, 45 games are listed under the subjects General Knowledge; Elements & Atomic Structure (not Symbols); Nomenclature, Formulas, & Equation Writing; Chemical Reactions: Solutions & Solubilities; and Other Subjects. Seventeen games are listed under Organic Chemistry and 4 games under Other Chemistry Games. Computer games designed for outdated computers (PDP-11, TRS-80, and Apple II) are not included.
Parallel computation with molecular-motor-propelled agents in nanofabricated networks.

PubMed

Nicolau, Dan V; Lard, Mercy; Korten, Till; van Delft, Falco C M J M; Persson, Malin; Bengtsson, Elina; Månsson, Alf; Diez, Stefan; Linke, Heiner; Nicolau, Dan V

2016-03-08

The combinatorial nature of many important mathematical problems, including nondeterministic-polynomial-time (NP)-complete problems, places a severe limitation on the problem size that can be solved with conventional, sequentially operating electronic computers. There have been significant efforts in conceiving parallel-computation approaches in the past, for example: DNA computation, quantum computation, and microfluidics-based computation. However, these approaches have not proven, so far, to be scalable and practical from a fabrication and operational perspective. Here, we report the foundations of an alternative parallel-computation system in which a given combinatorial problem is encoded into a graphical, modular network that is embedded in a nanofabricated planar device. Exploring the network in a parallel fashion using a large number of independent, molecular-motor-propelled agents then solves the mathematical problem. This approach uses orders of magnitude less energy than conventional computers, thus addressing issues related to power consumption and heat dissipation. We provide a proof-of-concept demonstration of such a device by solving, in a parallel fashion, the small instance {2, 5, 9} of the subset sum problem, which is a benchmark NP-complete problem. Finally, we discuss the technical advances necessary to make our system scalable with presently available technology.
Computers in Science: Thinking Outside the Discipline.

ERIC Educational Resources Information Center

Hamilton, Todd M.

2003-01-01

Describes the Computers in Science course which integrates computer-related techniques into the science disciplines of chemistry, physics, biology, and Earth science. Uses a team teaching approach and teaches students how to solve chemistry problems with spreadsheets, identify minerals with X-rays, and chemical and force analysis. (Contains 14…
Computer-Based Molecular Modelling: Finnish School Teachers' Experiences and Views

ERIC Educational Resources Information Center

Aksela, Maija; Lundell, Jan

2008-01-01

Modern computer-based molecular modelling opens up new possibilities for chemistry teaching at different levels. This article presents a case study seeking insight into Finnish school teachers' use of computer-based molecular modelling in teaching chemistry, into the different working and teaching methods used, and their opinions about necessary…
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

NASA Astrophysics Data System (ADS)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared with the performance using benchmark software and the metric was FLoting-point Operations Per Seconds (FLOPS) which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore system? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPs and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
A highly efficient multi-core algorithm for clustering extremely large datasets

PubMed Central

2010-01-01

Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorial SNP data. Our new shared memory parallel algorithms show to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
Alternative modeling methods for plasma-based Rf ion sources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Veitzer, Seth A., E-mail: veitzer@txcorp.com; Kundrapu, Madhusudhan, E-mail: madhusnk@txcorp.com; Stoltz, Peter H., E-mail: phstoltz@txcorp.com

Rf-driven ion sources for accelerators and many industrial applications benefit from detailed numerical modeling and simulation of plasma characteristics. For instance, modeling of the Spallation Neutron Source (SNS) internal antenna H{sup −} source has indicated that a large plasma velocity is induced near bends in the antenna where structural failures are often observed. This could lead to improved designs and ion source performance based on simulation and modeling. However, there are significant separations of time and spatial scales inherent to Rf-driven plasma ion sources, which makes it difficult to model ion sources with explicit, kinetic Particle-In-Cell (PIC) simulation codes. Inmore » particular, if both electron and ion motions are to be explicitly modeled, then the simulation time step must be very small, and total simulation times must be large enough to capture the evolution of the plasma ions, as well as extending over many Rf periods. Additional physics processes such as plasma chemistry and surface effects such as secondary electron emission increase the computational requirements in such a way that even fully parallel explicit PIC models cannot be used. One alternative method is to develop fluid-based codes coupled with electromagnetics in order to model ion sources. Time-domain fluid models can simulate plasma evolution, plasma chemistry, and surface physics models with reasonable computational resources by not explicitly resolving electron motions, which thereby leads to an increase in the time step. This is achieved by solving fluid motions coupled with electromagnetics using reduced-physics models, such as single-temperature magnetohydrodynamics (MHD), extended, gas dynamic, and Hall MHD, and two-fluid MHD models. We show recent results on modeling the internal antenna H{sup −} ion source for the SNS at Oak Ridge National Laboratory using the fluid plasma modeling code USim. We compare demonstrate plasma temperature equilibration in two-temperature MHD models for the SNS source and present simulation results demonstrating plasma evolution over many Rf periods for different plasma temperatures. We perform the calculations in parallel, on unstructured meshes, using finite-volume solvers in order to obtain results in reasonable time.« less
Alternative modeling methods for plasma-based Rf ion sources.

PubMed

Veitzer, Seth A; Kundrapu, Madhusudhan; Stoltz, Peter H; Beckwith, Kristian R C

2016-02-01

Rf-driven ion sources for accelerators and many industrial applications benefit from detailed numerical modeling and simulation of plasma characteristics. For instance, modeling of the Spallation Neutron Source (SNS) internal antenna H(-) source has indicated that a large plasma velocity is induced near bends in the antenna where structural failures are often observed. This could lead to improved designs and ion source performance based on simulation and modeling. However, there are significant separations of time and spatial scales inherent to Rf-driven plasma ion sources, which makes it difficult to model ion sources with explicit, kinetic Particle-In-Cell (PIC) simulation codes. In particular, if both electron and ion motions are to be explicitly modeled, then the simulation time step must be very small, and total simulation times must be large enough to capture the evolution of the plasma ions, as well as extending over many Rf periods. Additional physics processes such as plasma chemistry and surface effects such as secondary electron emission increase the computational requirements in such a way that even fully parallel explicit PIC models cannot be used. One alternative method is to develop fluid-based codes coupled with electromagnetics in order to model ion sources. Time-domain fluid models can simulate plasma evolution, plasma chemistry, and surface physics models with reasonable computational resources by not explicitly resolving electron motions, which thereby leads to an increase in the time step. This is achieved by solving fluid motions coupled with electromagnetics using reduced-physics models, such as single-temperature magnetohydrodynamics (MHD), extended, gas dynamic, and Hall MHD, and two-fluid MHD models. We show recent results on modeling the internal antenna H(-) ion source for the SNS at Oak Ridge National Laboratory using the fluid plasma modeling code USim. We compare demonstrate plasma temperature equilibration in two-temperature MHD models for the SNS source and present simulation results demonstrating plasma evolution over many Rf periods for different plasma temperatures. We perform the calculations in parallel, on unstructured meshes, using finite-volume solvers in order to obtain results in reasonable time.
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Van Rosendale, John

1989-01-01

A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.
Development of a Distributed Parallel Computing Framework to Facilitate Regional/Global Gridded Crop Modeling with Various Scenarios

NASA Astrophysics Data System (ADS)

Jang, W.; Engda, T. A.; Neff, J. C.; Herrick, J.

2017-12-01

Many crop models are increasingly used to evaluate crop yields at regional and global scales. However, implementation of these models across large areas using fine-scale grids is limited by computational time requirements. In order to facilitate global gridded crop modeling with various scenarios (i.e., different crop, management schedule, fertilizer, and irrigation) using the Environmental Policy Integrated Climate (EPIC) model, we developed a distributed parallel computing framework in Python. Our local desktop with 14 cores (28 threads) was used to test the distributed parallel computing framework in Iringa, Tanzania which has 406,839 grid cells. High-resolution soil data, SoilGrids (250 x 250 m), and climate data, AgMERRA (0.25 x 0.25 deg) were also used as input data for the gridded EPIC model. The framework includes a master file for parallel computing, input database, input data formatters, EPIC model execution, and output analyzers. Through the master file for parallel computing, the user-defined number of threads of CPU divides the EPIC simulation into jobs. Then, Using EPIC input data formatters, the raw database is formatted for EPIC input data and the formatted data moves into EPIC simulation jobs. Then, 28 EPIC jobs run simultaneously and only interesting results files are parsed and moved into output analyzers. We applied various scenarios with seven different slopes and twenty-four fertilizer ranges. Parallelized input generators create different scenarios as a list for distributed parallel computing. After all simulations are completed, parallelized output analyzers are used to analyze all outputs according to the different scenarios. This saves significant computing time and resources, making it possible to conduct gridded modeling at regional to global scales with high-resolution data. For example, serial processing for the Iringa test case would require 113 hours, while using the framework developed in this study requires only approximately 6 hours, a nearly 95% reduction in computing time.
Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nielsen, Jens; D’Avezac, Mayeul; Hetherington, James

2013-12-14

Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. Moremore » recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.« less
Case Studies in Systems Chemistry. Final Report. [Includes Complete Case Study, Carboxylic Acid Equilibria

ERIC Educational Resources Information Center

Fleck, George

This publication was produced as a teaching tool for college chemistry. The book is a text for a computer-based unit on the chemistry of acid-base titrations, and is designed for use with FORTRAN or BASIC computer systems, and with a programmable electronic calculator, in a variety of educational settings. The text attempts to present computer…
Introduction to Computational Chemistry: Teaching Hu¨ckel Molecular Orbital Theory Using an Excel Workbook for Matrix Diagonalization

ERIC Educational Resources Information Center

Litofsky, Joshua; Viswanathan, Rama

2015-01-01

Matrix diagonalization, the key technique at the heart of modern computational chemistry for the numerical solution of the Schrödinger equation, can be easily introduced in the physical chemistry curriculum in a pedagogical context using simple Hückel molecular orbital theory for p bonding in molecules. We present details and results of…
Effects of Computer Based Learning on Students' Attitudes and Achievements towards Analytical Chemistry

ERIC Educational Resources Information Center

Akcay, Hüsamettin; Durmaz, Asli; Tüysüz, Cengiz; Feyzioglu, Burak

2006-01-01

The aim of this study was to compare the effects of computer-based learning and traditional method on students' attitudes and achievement towards analytical chemistry. Students from Chemistry Education Department at Dokuz Eylul University (D.E.U) were selected randomly and divided into three groups; two experimental (Eg-1 and Eg-2) and a control…

Reaction of formaldehyde at the ortho- and para-positions of phenol: exploration of mechanisms using computational chemistry.

Treesearch

Anthony H. Conner; Melissa S. Reeves

2001-01-01

Computational chemistry methods can be used to explore the theoretical chemistry behind reactive systems, to compare the relative chemical reactivity of different systems, and, by extension, to predict the reactivity of new systems. Ongoing research has focused on the reactivity of a wide variety of phenolic compounds with formaldehyde using semi-empirical and ab...
Continuous development of schemes for parallel computing of the electrostatics in biological systems: implementation in DelPhi.

PubMed

Li, Chuan; Petukh, Marharyta; Li, Lin; Alexov, Emil

2013-08-15

Due to the enormous importance of electrostatics in molecular biology, calculating the electrostatic potential and corresponding energies has become a standard computational approach for the study of biomolecules and nano-objects immersed in water and salt phase or other media. However, the electrostatics of large macromolecules and macromolecular complexes, including nano-objects, may not be obtainable via explicit methods and even the standard continuum electrostatics methods may not be applicable due to high computational time and memory requirements. Here, we report further development of the parallelization scheme reported in our previous work (Li, et al., J. Comput. Chem. 2012, 33, 1960) to include parallelization of the molecular surface and energy calculations components of the algorithm. The parallelization scheme utilizes different approaches such as space domain parallelization, algorithmic parallelization, multithreading, and task scheduling, depending on the quantity being calculated. This allows for efficient use of the computing resources of the corresponding computer cluster. The parallelization scheme is implemented in the popular software DelPhi and results in speedup of several folds. As a demonstration of the efficiency and capability of this methodology, the electrostatic potential, and electric field distributions are calculated for the bovine mitochondrial supercomplex illustrating their complex topology, which cannot be obtained by modeling the supercomplex components alone. Copyright © 2013 Wiley Periodicals, Inc.
Redundant binary number representation for an inherently parallel arithmetic on optical computers.

PubMed

De Biase, G A; Massini, A

1993-02-10

A simple redundant binary number representation suitable for digital-optical computers is presented. By means of this representation it is possible to build an arithmetic with carry-free parallel algebraic sums carried out in constant time and parallel multiplication in log N time. This redundant number representation naturally fits the 2's complement binary number system and permits the construction of inherently parallel arithmetic units that are used in various optical technologies. Some properties of this number representation and several examples of computation are presented.
Backtracking and Re-execution in the Automatic Debugging of Parallelized Programs

NASA Technical Reports Server (NTRS)

Matthews, Gregory; Hood, Robert; Johnson, Stephen; Leggett, Peter; Biegel, Bryan (Technical Monitor)

2002-01-01

In this work we describe a new approach using relative debugging to find differences in computation between a serial program and a parallel version of th it program. We use a combination of re-execution and backtracking in order to find the first difference in computation that may ultimately lead to an incorrect value that the user has indicated. In our prototype implementation we use static analysis information from a parallelization tool in order to perform the backtracking as well as the mapping required between serial and parallel computations.
Traffic Simulations on Parallel Computers Using Domain Decomposition Techniques

DOT National Transportation Integrated Search

1995-01-01

Large scale simulations of Intelligent Transportation Systems (ITS) can only be acheived by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow the performance of traffic...
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

NASA Technical Reports Server (NTRS)

Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

1997-01-01

The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages are identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft simulations achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.
Parallel computing on Unix workstation arrays

NASA Astrophysics Data System (ADS)

Reale, F.; Bocchino, F.; Sciortino, S.

1994-12-01

We have tested arrays of general-purpose Unix workstations used as MIMD systems for massive parallel computations. In particular we have solved numerically a demanding test problem with a 2D hydrodynamic code, generally developed to study astrophysical flows, by exucuting it on arrays either of DECstations 5000/200 on Ethernet LAN, or of DECstations 3000/400, equipped with powerful Alpha processors, on FDDI LAN. The code is appropriate for data-domain decomposition, and we have used a library for parallelization previously developed in our Institute, and easily extended to work on Unix workstation arrays by using the PVM software toolset. We have compared the parallel efficiencies obtained on arrays of several processors to those obtained on a dedicated MIMD parallel system, namely a Meiko Computing Surface (CS-1), equipped with Intel i860 processors. We discuss the feasibility of using non-dedicated parallel systems and conclude that the convenience depends essentially on the size of the computational domain as compared to the relative processor power and network bandwidth. We point out that for future perspectives a parallel development of processor and network technology is important, and that the software still offers great opportunities of improvement, especially in terms of latency times in the message-passing protocols. In conditions of significant gain in terms of speedup, such workstation arrays represent a cost-effective approach to massive parallel computations.
Parallelization strategies for continuum-generalized method of moments on the multi-thread systems

NASA Astrophysics Data System (ADS)

Bustamam, A.; Handhika, T.; Ernastuti, Kerami, D.

2017-07-01

Continuum-Generalized Method of Moments (C-GMM) covers the Generalized Method of Moments (GMM) shortfall which is not as efficient as Maximum Likelihood estimator by using the continuum set of moment conditions in a GMM framework. However, this computation would take a very long time since optimizing regularization parameter. Unfortunately, these calculations are processed sequentially whereas in fact all modern computers are now supported by hierarchical memory systems and hyperthreading technology, which allowing for parallel computing. This paper aims to speed up the calculation process of C-GMM by designing a parallel algorithm for C-GMM on the multi-thread systems. First, parallel regions are detected for the original C-GMM algorithm. There are two parallel regions in the original C-GMM algorithm, that are contributed significantly to the reduction of computational time: the outer-loop and the inner-loop. Furthermore, this parallel algorithm will be implemented with standard shared-memory application programming interface, i.e. Open Multi-Processing (OpenMP). The experiment shows that the outer-loop parallelization is the best strategy for any number of observations.
Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.

PubMed

Tao, Liang; Kwan, Hon Keung

2012-07-01

Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which make them attractive for real-time image processing.
An abstraction layer for efficient memory management of tabulated chemistry and flamelet solutions

NASA Astrophysics Data System (ADS)

Weise, Steffen; Messig, Danny; Meyer, Bernd; Hasse, Christian

2013-06-01

A large number of methods for simulating reactive flows exist, some of them, for example, directly use detailed chemical kinetics or use precomputed and tabulated flame solutions. Both approaches couple the research fields computational fluid dynamics and chemistry tightly together using either an online or offline approach to solve the chemistry domain. The offline approach usually involves a method of generating databases or so-called Lookup-Tables (LUTs). As these LUTs are extended to not only contain material properties but interactions between chemistry and turbulent flow, the number of parameters and thus dimensions increases. Given a reasonable discretisation, file sizes can increase drastically. The main goal of this work is to provide methods that handle large database files efficiently. A Memory Abstraction Layer (MAL) has been developed that handles requested LUT entries efficiently by splitting the database file into several smaller blocks. It keeps the total memory usage at a minimum using thin allocation methods and compression to minimise filesystem operations. The MAL has been evaluated using three different test cases. The first rather generic one is a sequential reading operation on an LUT to evaluate the runtime behaviour as well as the memory consumption of the MAL. The second test case is a simulation of a non-premixed turbulent flame, the so-called HM1 flame, which is a well-known test case in the turbulent combustion community. The third test case is a simulation of a non-premixed laminar flame as described by McEnally in 1996 and Bennett in 2000. Using the previously developed solver 'flameletFoam' in conjunction with the MAL, memory consumption and the performance penalty introduced were studied. The total memory used while running a parallel simulation was reduced significantly while the CPU time overhead associated with the MAL remained low.
Parallel Computational Fluid Dynamics: Current Status and Future Requirements

NASA Technical Reports Server (NTRS)

Simon, Horst D.; VanDalsem, William R.; Dagum, Leonardo; Kutler, Paul (Technical Monitor)

1994-01-01

One or the key objectives of the Applied Research Branch in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Allies Research Center is the accelerated introduction of highly parallel machines into a full operational environment. In this report we discuss the performance results obtained from the implementation of some computational fluid dynamics (CFD) applications on the Connection Machine CM-2 and the Intel iPSC/860. We summarize some of the experiences made so far with the parallel testbed machines at the NAS Applied Research Branch. Then we discuss the long term computational requirements for accomplishing some of the grand challenge problems in computational aerosciences. We argue that only massively parallel machines will be able to meet these grand challenge requirements, and we outline the computer science and algorithm research challenges ahead.
Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

NASA Astrophysics Data System (ADS)

Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

2018-01-01

We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
A parallel-processing approach to computing for the geographic sciences; applications and systems enhancements

USGS Publications Warehouse

Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Liu, Shu-Guang; Nichols, Erin; Haga, Jim; Maddox, Brian; Bilderback, Chris; Feller, Mark; Homer, George

2001-01-01

The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost, personal computers into parallel computing environments that provide a costeffective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting information science research into parallel computing systems and applications.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schmalz, Mark S

2011-07-24

Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation {und G}more » for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping G {yields} {und G}, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphic Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.« less
Using quantum chemistry muscle to flex massive systems: How to respond to something perturbing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bertoni, Colleen

Computational chemistry uses the theoretical advances of quantum mechanics and the algorithmic and hardware advances of computer science to give insight into chemical problems. It is currently possible to do highly accurate quantum chemistry calculations, but the most accurate methods are very computationally expensive. Thus it is only feasible to do highly accurate calculations on small molecules, since typically more computationally efficient methods are also less accurate. The overall goal of my dissertation work has been to try to decrease the computational expense of calculations without decreasing the accuracy. In particular, my dissertation work focuses on fragmentation methods, intermolecular interactionsmore » methods, analytic gradients, and taking advantage of new hardware.« less
Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

NASA Astrophysics Data System (ADS)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

NASA Astrophysics Data System (ADS)

Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

2015-07-01

We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
The role of computational chemistry in the science and measurements of the atmosphere

NASA Technical Reports Server (NTRS)

Phillips, D. H.

1978-01-01

The role of computational chemistry in determining the stability, photochemistry, spectroscopic parameters, and parameters for estimating reaction rates of atmospheric constituents is discussed. Examples dealing with the photolysis cross sections of HOCl and (1 Delta g) O2 and with the stability of gaseous NH4Cl and asymmetric ClO3 are presented. It is concluded that computational chemistry can play an important role in the study of atmospheric constituents, particularly reactive and short-lived species which are difficult to investigate experimentally.
Computational chemistry and aeroassisted orbital transfer vehicles

NASA Technical Reports Server (NTRS)

Cooper, D. M.; Jaffe, R. L.; Arnold, J. O.

1985-01-01

An analysis of the radiative heating phenomena encountered during a typical aeroassisted orbital transfer vehicle (AOTV) trajectory was made to determine the potential impact of computational chemistry on AOTV design technology. Both equilibrium and nonequilibrium radiation mechanisms were considered. This analysis showed that computational chemistry can be used to predict (1) radiative intensity factors and spectroscopic data; (2) the excitation rates of both atoms and molecules; (3) high-temperature reaction rate constants for metathesis and charge exchange reactions; (4) particle ionization and neutralization rates and cross sections; and (5) spectral line widths.
The Development of Computational Thinking in a High School Chemistry Course

ERIC Educational Resources Information Center

Matsumoto, Paul S.; Cao, Jiankang

2017-01-01

Computational thinking is a component of the Science and Engineering Practices in the Next Generation Science Standards, which were adopted by some states. We describe the activities in a high school chemistry course that may develop students' computational thinking skills by primarily using Excel, a widely available spreadsheet software. These…

Proceedings: Conference on Computers in Chemical Education and Research, Dekalb, Illinois, 19-23 July 1971.

ERIC Educational Resources Information Center

1971

Computers have effected a comprehensive transformation of chemistry. Computers have greatly enhanced the chemist's ability to do model building, simulations, data refinement and reduction, analysis of data in terms of models, on-line data logging, automated control of experiments, quantum chemistry and statistical and mechanical calculations, and…
Molecular Orbitals of NO, NO[superscript+], and NO[superscript-]: A Computational Quantum Chemistry Experiment

ERIC Educational Resources Information Center

Orenha, Renato P.; Galembeck, Sérgio E.

2014-01-01

This computational experiment presents qualitative molecular orbital (QMO) and computational quantum chemistry exercises of NO, NO[superscript+], and NO[superscript-]. Initially students explore several properties of the target molecules by Lewis diagrams and the QMO theory. Then, they compare qualitative conclusions with EHT and DFT calculations…
Synthesis of a drug-like focused library of trisubstituted pyrrolidines using integrated flow chemistry and batch methods.

PubMed

Baumann, Marcus; Baxendale, Ian R; Kuratli, Christoph; Ley, Steven V; Martin, Rainer E; Schneider, Josef

2011-07-11

A combination of flow and batch chemistries has been successfully applied to the assembly of a series of trisubstituted drug-like pyrrolidines. This study demonstrates the efficient preparation of a focused library of these pharmaceutically important structures using microreactor technologies, as well as classical parallel synthesis techniques, and thus exemplifies the impact of integrating innovative enabling tools within the drug discovery process.
Identifying failure in a tree network of a parallel computer

DOEpatents

Archer, Charles J.; Pinnow, Kurt W.; Wallenfelt, Brian P.

2010-08-24

Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.
Design of on-board parallel computer on nano-satellite

NASA Astrophysics Data System (ADS)

You, Zheng; Tian, Hexiang; Yu, Shijie; Meng, Li

2007-11-01

This paper provides one scheme of the on-board parallel computer system designed for the Nano-satellite. Based on the development request that the Nano-satellite should have a small volume, low weight, low power cost, and intelligence, this scheme gets rid of the traditional one-computer system and dual-computer system with endeavor to improve the dependability, capability and intelligence simultaneously. According to the method of integration design, it employs the parallel computer system with shared memory as the main structure, connects the telemetric system, attitude control system, and the payload system by the intelligent bus, designs the management which can deal with the static tasks and dynamic task-scheduling, protect and recover the on-site status and so forth in light of the parallel algorithms, and establishes the fault diagnosis, restoration and system restructure mechanism. It accomplishes an on-board parallel computer system with high dependability, capability and intelligence, a flexible management on hardware resources, an excellent software system, and a high ability in extension, which satisfies with the conception and the tendency of the integration electronic design sufficiently.
Optical Symbolic Computing

NASA Astrophysics Data System (ADS)

Neff, John A.

1989-12-01

Experiments originating from Gestalt psychology have shown that representing information in a symbolic form provides a more effective means to understanding. Computer scientists have been struggling for the last two decades to determine how best to create, manipulate, and store collections of symbolic structures. In the past, much of this struggling led to software innovations because that was the path of least resistance. For example, the development of heuristics for organizing the searching through knowledge bases was much less expensive than building massively parallel machines that could search in parallel. That is now beginning to change with the emergence of parallel architectures which are showing the potential for handling symbolic structures. This paper will review the relationships between symbolic computing and parallel computing architectures, and will identify opportunities for optics to significantly impact the performance of such computing machines. Although neural networks are an exciting subset of massively parallel computing structures, this paper will not touch on this area since it is receiving a great deal of attention in the literature. That is, the concepts presented herein do not consider the distributed representation of knowledge.
Parallel rendering

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1995-01-01

This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
Enhancing PC Cluster-Based Parallel Branch-and-Bound Algorithms for the Graph Coloring Problem

NASA Astrophysics Data System (ADS)

Taoka, Satoshi; Takafuji, Daisuke; Watanabe, Toshimasa

A branch-and-bound algorithm (BB for short) is the most general technique to deal with various combinatorial optimization problems. Even if it is used, computation time is likely to increase exponentially. So we consider its parallelization to reduce it. It has been reported that the computation time of a parallel BB heavily depends upon node-variable selection strategies. And, in case of a parallel BB, it is also necessary to prevent increase in communication time. So, it is important to pay attention to how many and what kind of nodes are to be transferred (called sending-node selection strategy). In this paper, for the graph coloring problem, we propose some sending-node selection strategies for a parallel BB algorithm by adopting MPI for parallelization and experimentally evaluate how these strategies affect computation time of a parallel BB on a PC cluster network.
Biocellion: accelerating computer simulation of multicellular biological system models.

PubMed

Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

2014-11-01

Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Modelling parallel programs and multiprocessor architectures with AXE

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Fineman, Charles E.

1991-01-01

AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate for parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior is described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

NASA Astrophysics Data System (ADS)

Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

2011-03-01

A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine modelizations are difficult to treat for our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian based domain decomposition method brings to a poor parallel efficiency because of an increase in the power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

DOEpatents

Faraj, Ahmad [Rochester, MN

2012-04-17

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
Hybrid massively parallel fast sweeping method for static Hamilton-Jacobi equations

NASA Astrophysics Data System (ADS)

Detrixhe, Miles; Gibou, Frédéric

2016-10-01

The fast sweeping method is a popular algorithm for solving a variety of static Hamilton-Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.
Turbomachinery CFD on parallel computers

NASA Technical Reports Server (NTRS)

Blech, Richard A.; Milner, Edward J.; Quealy, Angela; Townsend, Scott E.

1992-01-01

The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. Particularly, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts are focused on implementing the average-passage turbomachinery model on MIMD, distributed memory parallel computers. Performance results are given for inviscid, single blade row and viscous, multistage applications on several parallel computers, including networked workstations.
A Review of High-Performance Computational Strategies for Modeling and Imaging of Electromagnetic Induction Data

NASA Astrophysics Data System (ADS)

Newman, Gregory A.

2014-01-01

Many geoscientific applications exploit electrostatic and electromagnetic fields to interrogate and map subsurface electrical resistivity—an important geophysical attribute for characterizing mineral, energy, and water resources. In complex three-dimensional geologies, where many of these resources remain to be found, resistivity mapping requires large-scale modeling and imaging capabilities, as well as the ability to treat significant data volumes, which can easily overwhelm single-core and modest multicore computing hardware. To treat such problems requires large-scale parallel computational resources, necessary for reducing the time to solution to a time frame acceptable to the exploration process. The recognition that significant parallel computing processes must be brought to bear on these problems gives rise to choices that must be made in parallel computing hardware and software. In this review, some of these choices are presented, along with the resulting trade-offs. We also discuss future trends in high-performance computing and the anticipated impact on electromagnetic (EM) geophysics. Topics discussed in this review article include a survey of parallel computing platforms, graphics processing units to multicore CPUs with a fast interconnect, along with effective parallel solvers and associated solver libraries effective for inductive EM modeling and imaging.
Toward an automated parallel computing environment for geosciences

NASA Astrophysics Data System (ADS)

Zhang, Huai; Liu, Mian; Shi, Yaolin; Yuen, David A.; Yan, Zhenzhen; Liang, Guoping

2007-08-01

Software for geodynamic modeling has not kept up with the fast growing computing hardware and network resources. In the past decade supercomputing power has become available to most researchers in the form of affordable Beowulf clusters and other parallel computer platforms. However, to take full advantage of such computing power requires developing parallel algorithms and associated software, a task that is often too daunting for geoscience modelers whose main expertise is in geosciences. We introduce here an automated parallel computing environment built on open-source algorithms and libraries. Users interact with this computing environment by specifying the partial differential equations, solvers, and model-specific properties using an English-like modeling language in the input files. The system then automatically generates the finite element codes that can be run on distributed or shared memory parallel machines. This system is dynamic and flexible, allowing users to address different problems in geosciences. It is capable of providing web-based services, enabling users to generate source codes online. This unique feature will facilitate high-performance computing to be integrated with distributed data grids in the emerging cyber-infrastructures for geosciences. In this paper we discuss the principles of this automated modeling environment and provide examples to demonstrate its versatility.
Computer architecture evaluation for structural dynamics computations: Project summary

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1989-01-01

The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.
Multi-threading: A new dimension to massively parallel scientific computation

NASA Astrophysics Data System (ADS)

Nielsen, Ida M. B.; Janssen, Curtis L.

2000-06-01

Multi-threading is becoming widely available for Unix-like operating systems, and the application of multi-threading opens new ways for performing parallel computations with greater efficiency. We here briefly discuss the principles of multi-threading and illustrate the application of multi-threading for a massively parallel direct four-index transformation of electron repulsion integrals. Finally, other potential applications of multi-threading in scientific computing are outlined.
Surface Modification Engineered Assembly of Novel Quantum Dot Architectures for Advanced Applications

DTIC Science & Technology

2008-02-09

Campbell, S. Ogata, and F. Shimojo, “ Multimillion atom simulations of nanosystems on parallel computers,” in Proceedings of the International...nanomesas: multimillion -atom molecular dynamics simulations on parallel computers,” J. Appl. Phys. 94, 6762 (2003). 21. P. Vashishta, R. K. Kalia...and A. Nakano, “ Multimillion atom molecular dynamics simulations of nanoparticles on parallel computers,” Journal of Nanoparticle Research 5, 119-135
Computational chemistry

NASA Technical Reports Server (NTRS)

Arnold, J. O.

1987-01-01

With the advent of supercomputers, modern computational chemistry algorithms and codes, a powerful tool was created to help fill NASA's continuing need for information on the properties of matter in hostile or unusual environments. Computational resources provided under the National Aerodynamics Simulator (NAS) program were a cornerstone for recent advancements in this field. Properties of gases, materials, and their interactions can be determined from solutions of the governing equations. In the case of gases, for example, radiative transition probabilites per particle, bond-dissociation energies, and rates of simple chemical reactions can be determined computationally as reliably as from experiment. The data are proving to be quite valuable in providing inputs to real-gas flow simulation codes used to compute aerothermodynamic loads on NASA's aeroassist orbital transfer vehicles and a host of problems related to the National Aerospace Plane Program. Although more approximate, similar solutions can be obtained for ensembles of atoms simulating small particles of materials with and without the presence of gases. Computational chemistry has application in studying catalysis, properties of polymers, all of interest to various NASA missions, including those previously mentioned. In addition to discussing these applications of computational chemistry within NASA, the governing equations and the need for supercomputers for their solution is outlined.

Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.

2011-07-01

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Moeckel, Marek; Wiedemann, Carsten; Flegel, Sven Kevin; Gelhaus, Johannes; Klinkrad, Heiner; Krag, Holger; Voersmann, Peter

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction of OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
Fragment informatics and computational fragment-based drug design: an overview and update.

PubMed

Sheng, Chunquan; Zhang, Wannian

2013-05-01

Fragment-based drug design (FBDD) is a promising approach for the discovery and optimization of lead compounds. Despite its successes, FBDD also faces some internal limitations and challenges. FBDD requires a high quality of target protein and good solubility of fragments. Biophysical techniques for fragment screening necessitate expensive detection equipment and the strategies for evolving fragment hits to leads remain to be improved. Regardless, FBDD is necessary for investigating larger chemical space and can be applied to challenging biological targets. In this scenario, cheminformatics and computational chemistry can be used as alternative approaches that can significantly improve the efficiency and success rate of lead discovery and optimization. Cheminformatics and computational tools assist FBDD in a very flexible manner. Computational FBDD can be used independently or in parallel with experimental FBDD for efficiently generating and optimizing leads. Computational FBDD can also be integrated into each step of experimental FBDD and help to play a synergistic role by maximizing its performance. This review will provide critical analysis of the complementarity between computational and experimental FBDD and highlight recent advances in new algorithms and successful examples of their applications. In particular, fragment-based cheminformatics tools, high-throughput fragment docking, and fragment-based de novo drug design will provide the focus of this review. We will also discuss the advantages and limitations of different methods and the trends in new developments that should inspire future research. © 2012 Wiley Periodicals, Inc.
Quantum information, cognition, and music.

PubMed

Dalla Chiara, Maria L; Giuntini, Roberto; Leporini, Roberto; Negri, Eleonora; Sergioli, Giuseppe

2015-01-01

Parallelism represents an essential aspect of human mind/brain activities. One can recognize some common features between psychological parallelism and the characteristic parallel structures that arise in quantum theory and in quantum computation. The article is devoted to a discussion of the following questions: a comparison between classical probabilistic Turing machines and quantum Turing machines.possible applications of the quantum computational semantics to cognitive problems.parallelism in music.
Quantum information, cognition, and music

PubMed Central

Dalla Chiara, Maria L.; Giuntini, Roberto; Leporini, Roberto; Negri, Eleonora; Sergioli, Giuseppe

2015-01-01

Parallelism represents an essential aspect of human mind/brain activities. One can recognize some common features between psychological parallelism and the characteristic parallel structures that arise in quantum theory and in quantum computation. The article is devoted to a discussion of the following questions: a comparison between classical probabilistic Turing machines and quantum Turing machines.possible applications of the quantum computational semantics to cognitive problems.parallelism in music. PMID:26539139
Restricted access Improved hydrogeophysical characterization and monitoring through parallel modeling and inversion of time-domain resistivity andinduced-polarization data

USGS Publications Warehouse

Johnson, Timothy C.; Versteeg, Roelof J.; Ward, Andy; Day-Lewis, Frederick D.; Revil, André

2010-01-01

Electrical geophysical methods have found wide use in the growing discipline of hydrogeophysics for characterizing the electrical properties of the subsurface and for monitoring subsurface processes in terms of the spatiotemporal changes in subsurface conductivity, chargeability, and source currents they govern. Presently, multichannel and multielectrode data collections systems can collect large data sets in relatively short periods of time. Practitioners, however, often are unable to fully utilize these large data sets and the information they contain because of standard desktop-computer processing limitations. These limitations can be addressed by utilizing the storage and processing capabilities of parallel computing environments. We have developed a parallel distributed-memory forward and inverse modeling algorithm for analyzing resistivity and time-domain induced polar-ization (IP) data. The primary components of the parallel computations include distributed computation of the pole solutions in forward mode, distributed storage and computation of the Jacobian matrix in inverse mode, and parallel execution of the inverse equation solver. We have tested the corresponding parallel code in three efforts: (1) resistivity characterization of the Hanford 300 Area Integrated Field Research Challenge site in Hanford, Washington, U.S.A., (2) resistivity characterization of a volcanic island in the southern Tyrrhenian Sea in Italy, and (3) resistivity and IP monitoring of biostimulation at a Superfund site in Brandywine, Maryland, U.S.A. Inverse analysis of each of these data sets would be limited or impossible in a standard serial computing environment, which underscores the need for parallel high-performance computing to fully utilize the potential of electrical geophysical methods in hydrogeophysical applications.
Conformational Isomerism of trans-[Pt(NH2C6H11)2I2] and the Classical Wernerian Chemistry of [Pt(NH2C6H11)4]X2 (X = Cl, Br, I)1

PubMed Central

Johnstone, Timothy C.; Lippard, Stephen J.

2012-01-01

X-ray crystallographic analysis of the compound trans-[Pt(NH2C6H11)2I2] revealed the presence of two distinct conformers within one crystal lattice. This compound was studied by variable temperature NMR spectroscopy to investigate the dynamic interconversion between these isomers. The results of this investigation were interpreted using physical (CPK) and computational (molecular mechanics and density functional theory) models. The conversion of the salts [Pt(NH2C6H11)4]X2 into trans-[Pt(NH2C6H11)2X2] (X = Cl, Br, I) was also studied and is discussed here with an emphasis on parallels to the work of Alfred Werner. PMID:23554544
Micro-separation toward systems biology.

PubMed

Liu, Bi-Feng; Xu, Bo; Zhang, Guisen; Du, Wei; Luo, Qingming

2006-02-17

Current biology is experiencing transformation in logic or philosophy that forces us to reevaluate the concept of cell, tissue or entire organism as a collection of individual components. Systems biology that aims at understanding biological system at the systems level is an emerging research area, which involves interdisciplinary collaborations of life sciences, computational and mathematical sciences, systems engineering, and analytical technology, etc. For analytical chemistry, developing innovative methods to meet the requirement of systems biology represents new challenges as also opportunities and responsibility. In this review, systems biology-oriented micro-separation technologies are introduced for comprehensive profiling of genome, proteome and metabolome, characterization of biomolecules interaction and single cell analysis such as capillary electrophoresis, ultra-thin layer gel electrophoresis, micro-column liquid chromatography, and their multidimensional combinations, parallel integrations, microfabricated formats, and nano technology involvement. Future challenges and directions are also suggested.
Locating hardware faults in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

2010-01-12

Hardware faults location in a data communications network of a parallel computer. Such a parallel computer includes a plurality of compute nodes and a data communications network that couples the compute nodes for data communications and organizes the compute node as a tree. Locating hardware faults includes identifying a next compute node as a parent node and a root of a parent test tree, identifying for each child compute node of the parent node a child test tree having the child compute node as root, running a same test suite on the parent test tree and each child test tree, and identifying the parent compute node as having a defective link connected from the parent compute node to a child compute node if the test suite fails on the parent test tree and succeeds on all the child test trees.
National resource for computation in chemistry, phase I: evaluation and recommendations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1980-05-01

The National Resource for Computation in Chemistry (NRCC) was inaugurated at the Lawrence Berkeley Laboratory (LBL) in October 1977, with joint funding by the Department of Energy (DOE) and the National Science Foundation (NSF). The chief activities of the NRCC include: assembling a staff of eight postdoctoral computational chemists, establishing an office complex at LBL, purchasing a midi-computer and graphics display system, administering grants of computer time, conducting nine workshops in selected areas of computational chemistry, compiling a library of computer programs with adaptations and improvements, initiating a software distribution system, providing user assistance and consultation on request. This reportmore » presents assessments and recommendations of an Ad Hoc Review Committee appointed by the DOE and NSF in January 1980. The recommendations are that NRCC should: (1) not fund grants for computing time or research but leave that to the relevant agencies, (2) continue the Workshop Program in a mode similar to Phase I, (3) abandon in-house program development and establish instead a competitive external postdoctoral program in chemistry software development administered by the Policy Board and Director, and (4) not attempt a software distribution system (leaving that function to the QCPE). Furthermore, (5) DOE should continue to make its computational facilities available to outside users (at normal cost rates) and should find some way to allow the chemical community to gain occasional access to a CRAY-level computer.« less
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Choudhary, Alok Nidhi

1989-01-01

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

NASA Astrophysics Data System (ADS)

Qiu, Shi; Liu, Kuang; Eliasson, Veronica

2016-10-01

Geometrical shock dynamics (GSD) theory is an appealing method to predict the shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, to solve and optimize large scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, there is only one that has been implemented on parallel computers, with the purpose to analyze detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

1998-01-01

Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand written NAB Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools: an interactive computer aided parallelization too] that generates message passing code, 2) the Portland Group's HPF compiler and 3) using compiler directives with the native FORTAN77 compiler on the SGI Origin2000.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

2005-01-01

The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design/implement computer software for solving large-scale acoustic problems, arised from the unified frameworks of the finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should fully take advantages of multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective. the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper pre-conditioned strategies, unrolling strategies, and effective processors' communicating schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving series of structural, and acoustic (symmetrical and un-symmetrical) problems (in different computing platforms). Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
Parallel, Asynchronous Executive (PAX): System concepts, facilities, and architecture

NASA Technical Reports Server (NTRS)

Jones, W. H.

1983-01-01

The Parallel, Asynchronous Executive (PAX) is a software operating system simulation that allows many computers to work on a single problem at the same time. PAX is currently implemented on a UNIVAC 1100/42 computer system. Independent UNIVAC runstreams are used to simulate independent computers. Data are shared among independent UNIVAC runstreams through shared mass-storage files. PAX has achieved the following: (1) applied several computing processes simultaneously to a single, logically unified problem; (2) resolved most parallel processor conflicts by careful work assignment; (3) resolved by means of worker requests to PAX all conflicts not resolved by work assignment; (4) provided fault isolation and recovery mechanisms to meet the problems of an actual parallel, asynchronous processing machine. Additionally, one real-life problem has been constructed for the PAX environment. This is CASPER, a collection of aerodynamic and structural dynamic problem simulation routines. CASPER is not discussed in this report except to provide examples of parallel-processing techniques.
Applications of Parallel Computation in Micro-Mechanics and Finite Element Method

NASA Technical Reports Server (NTRS)

Tan, Hui-Qian

1996-01-01

This project discusses the application of parallel computations related with respect to material analyses. Briefly speaking, we analyze some kind of material by elements computations. We call an element a cell here. A cell is divided into a number of subelements called subcells and all subcells in a cell have the identical structure. The detailed structure will be given later in this paper. It is obvious that the problem is "well-structured". SIMD machine would be a better choice. In this paper we try to look into the potentials of SIMD machine in dealing with finite element computation by developing appropriate algorithms on MasPar, a SIMD parallel machine. In section 2, the architecture of MasPar will be discussed. A brief review of the parallel programming language MPL also is given in that section. In section 3, some general parallel algorithms which might be useful to the project will be proposed. And, combining with the algorithms, some features of MPL will be discussed in more detail. In section 4, the computational structure of cell/subcell model will be given. The idea of designing the parallel algorithm for the model will be demonstrated. Finally in section 5, a summary will be given.
Optics Program Modified for Multithreaded Parallel Computing

NASA Technical Reports Server (NTRS)

Lou, John; Bedding, Dave; Basinger, Scott

2006-01-01

A powerful high-performance computer program for simulating and analyzing adaptive and controlled optical systems has been developed by modifying the serial version of the Modeling and Analysis for Controlled Optical Systems (MACOS) program to impart capabilities for multithreaded parallel processing on computing systems ranging from supercomputers down to Symmetric Multiprocessing (SMP) personal computers. The modifications included the incorporation of OpenMP, a portable and widely supported application interface software, that can be used to explicitly add multithreaded parallelism to an application program under a shared-memory programming model. OpenMP was applied to parallelize ray-tracing calculations, one of the major computing components in MACOS. Multithreading is also used in the diffraction propagation of light in MACOS based on pthreads [POSIX Thread, (where "POSIX" signifies a portable operating system for UNIX)]. In tests of the parallelized version of MACOS, the speedup in ray-tracing calculations was found to be linear, or proportional to the number of processors, while the speedup in diffraction calculations ranged from 50 to 60 percent, depending on the type and number of processors. The parallelized version of MACOS is portable, and, to the user, its interface is basically the same as that of the original serial version of MACOS.
PyPele Rewritten To Use MPI

NASA Technical Reports Server (NTRS)

Hockney, George; Lee, Seungwon

2008-01-01

A computer program known as PyPele, originally written as a Pythonlanguage extension module of a C++ language program, has been rewritten in pure Python language. The original version of PyPele dispatches and coordinates parallel-processing tasks on cluster computers and provides a conceptual framework for spacecraft-mission- design and -analysis software tools to run in an embarrassingly parallel mode. The original version of PyPele uses SSH (Secure Shell a set of standards and an associated network protocol for establishing a secure channel between a local and a remote computer) to coordinate parallel processing. Instead of SSH, the present Python version of PyPele uses Message Passing Interface (MPI) [an unofficial de-facto standard language-independent application programming interface for message- passing on a parallel computer] while keeping the same user interface. The use of MPI instead of SSH and the preservation of the original PyPele user interface make it possible for parallel application programs written previously for the original version of PyPele to run on MPI-based cluster computers. As a result, engineers using the previously written application programs can take advantage of embarrassing parallelism without need to rewrite those programs.
Massively parallel information processing systems for space applications

NASA Technical Reports Server (NTRS)

Schaefer, D. H.

1979-01-01

NASA is developing massively parallel systems for ultra high speed processing of digital image data collected by satellite borne instrumentation. Such systems contain thousands of processing elements. Work is underway on the design and fabrication of the 'Massively Parallel Processor', a ground computer containing 16,384 processing elements arranged in a 128 x 128 array. This computer uses existing technology. Advanced work includes the development of semiconductor chips containing thousands of feedthrough paths. Massively parallel image analog to digital conversion technology is also being developed. The goal is to provide compact computers suitable for real-time onboard processing of images.

n-body simulations using message passing parallel computers.

NASA Astrophysics Data System (ADS)

Grama, A. Y.; Kumar, V.; Sameh, A.

The authors present new parallel formulations of the Barnes-Hut method for n-body simulations on message passing computers. These parallel formulations partition the domain efficiently incurring minimal communication overhead. This is in contrast to existing schemes that are based on sorting a large number of keys or on the use of global data structures. The new formulations are augmented by alternate communication strategies which serve to minimize communication overhead. The impact of these communication strategies is experimentally studied. The authors report on experimental results obtained from an astrophysical simulation on an nCUBE2 parallel computer.
An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

DOE PAGES

Mironov, Vladimir; Moskovsky, Alexander; D’Mello, Michael; ...

2017-10-04

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overallmore » memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel R Xeon PhiTM supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.« less
Numerical approaches to combustion modeling. Progress in Astronautics and Aeronautics. Vol. 135

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oran, E.S.; Boris, J.P.

1991-01-01

Various papers on numerical approaches to combustion modeling are presented. The topics addressed include; ab initio quantum chemistry for combustion; rate coefficient calculations for combustion modeling; numerical modeling of combustion of complex hydrocarbons; combustion kinetics and sensitivity analysis computations; reduction of chemical reaction models; length scales in laminar and turbulent flames; numerical modeling of laminar diffusion flames; laminar flames in premixed gases; spectral simulations of turbulent reacting flows; vortex simulation of reacting shear flow; combustion modeling using PDF methods. Also considered are: supersonic reacting internal flow fields; studies of detonation initiation, propagation, and quenching; numerical modeling of heterogeneous detonations, deflagration-to-detonationmore » transition to reactive granular materials; toward a microscopic theory of detonations in energetic crystals; overview of spray modeling; liquid drop behavior in dense and dilute clusters; spray combustion in idealized configurations: parallel drop streams; comparisons of deterministic and stochastic computations of drop collisions in dense sprays; ignition and flame spread across solid fuels; numerical study of pulse combustor dynamics; mathematical modeling of enclosure fires; nuclear systems.« less
Analysis of turbulent free jet hydrogen-air diffusion flames with finite chemical reaction rates

NASA Technical Reports Server (NTRS)

Sislian, J. P.

1978-01-01

The nonequilibrium flow field resulting from the turbulent mixing and combustion of a supersonic axisymmetric hydrogen jet in a supersonic parallel coflowing air stream is analyzed. Effective turbulent transport properties are determined using the (K-epsilon) model. The finite-rate chemistry model considers eight reactions between six chemical species, H, O, H2O, OH, O2, and H2. The governing set of nonlinear partial differential equations is solved by an implicit finite-difference procedure. Radial distributions are obtained at two downstream locations of variables such as turbulent kinetic energy, turbulent dissipation rate, turbulent scale length, and viscosity. The results show that these variables attain peak values at the axis of symmetry. Computed distributions of velocity, temperature, and mass fraction are also given. A direct analytical approach to account for the effect of species concentration fluctuations on the mean production rate of species (the phenomenon of unmixedness) is also presented. However, the use of the method does not seem justified in view of the excessive computer time required to solve the resulting system of equations.
A design methodology for portable software on parallel computers

NASA Technical Reports Server (NTRS)

Nicol, David M.; Miller, Keith W.; Chrisman, Dan A.

1993-01-01

This final report for research that was supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second difficulty is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently. Trying to keep one's software on the current high performance hardware, a software developer almost continually faces yet another expensive software transportation. The problem of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two. A more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research associated with the issues of software portability and high performance. The list of research tasks are specified in the proposal. The proposal 'A Design Methodology for Portable Software on Parallel Computers' is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof-of-concept for the Ph.D. dissertation. We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven.
Convergence issues in domain decomposition parallel computation of hovering rotor

NASA Astrophysics Data System (ADS)

Xiao, Zhongyun; Liu, Gang; Mou, Bin; Jiang, Xiong

2018-05-01

Implicit LU-SGS time integration algorithm has been widely used in parallel computation in spite of its lack of information from adjacent domains. When applied to parallel computation of hovering rotor flows in a rotating frame, it brings about convergence issues. To remedy the problem, three LU factorization-based implicit schemes (consisting of LU-SGS, DP-LUR and HLU-SGS) are investigated comparatively. A test case of pure grid rotation is designed to verify these algorithms, which show that LU-SGS algorithm introduces errors on boundary cells. When partition boundaries are circumferential, errors arise in proportion to grid speed, accumulating along with the rotation, and leading to computational failure in the end. Meanwhile, DP-LUR and HLU-SGS methods show good convergence owing to boundary treatment which are desirable in domain decomposition parallel computations.
How to make your own response boxes: A step-by-step guide for the construction of reliable and inexpensive parallel-port response pads from computer mice.

PubMed

Voss, Andreas; Leonhart, Rainer; Stahl, Christoph

2007-11-01

Psychological research is based in large parts on response latencies, which are often registered by keypresses on a standard computer keyboard. Recording response latencies with a standard keyboard is problematic because keypresses are buffered within the keyboard hardware before they are signaled to the computer, adding error variance to the recorded latencies. This can be circumvented by using external response pads connected to the computer's parallel port. In this article, we describe how to build inexpensive, reliable, and easy-to-use response pads with six keys from two standard computer mice that can be connected to the PC's parallel port. We also address the problem of recording data from the parallel port with different software packages under Microsoft's Windows XP.
Development and Formative Evaluation of Computer Simulated College Chemistry Experiments.

ERIC Educational Resources Information Center

Cavin, Claudia S.; Cavin, E. D.

1978-01-01

This article describes the design, preparation, and initial evaluation of a set of computer-simulated chemistry experiments. The experiments entailed the use of an atomic emission spectroscope and a single-beam visible absorption spectrophometer. (Author/IRT)
Research and Teaching: Computational Methods in General Chemistry--Perceptions of Programming, Prior Experience, and Student Outcomes

ERIC Educational Resources Information Center

Wheeler, Lindsay B.; Chiu, Jennie L.; Grisham, Charles M.

2016-01-01

This article explores how integrating computational tools into a general chemistry laboratory course can influence student perceptions of programming and investigates relationships among student perceptions, prior experience, and student outcomes.
The Computer Bulletin Board.

ERIC Educational Resources Information Center

Batt, Russell H., Ed.

1989-01-01

Describes two chemistry computer programs: (1) "Eureka: A Chemistry Problem Solver" (problem files may be written by the instructor, MS-DOS 2.0, IBM with 384K); and (2) "PC-File+" (database management, IBM with 416K and two floppy drives). (MVL)
Development of an Undergraduate Course in the Use of Digital Computers With Chemistry Instrumentation.

ERIC Educational Resources Information Center

Wilkins, Charles L.

Computer-assisted instruction (CAI) has proven useful in teaching chemistry instrumentation techniques to undergraduate students. The work completed at the time of this interim report has clearly shown that a general purpose laboratory computer system, equipped with suitable devices to allow direct data input from experiments, can be an effective…
Chemical Equilibrium, Unit 2: Le Chatelier's Principle. A Computer-Enriched Module for Introductory Chemistry. Student's Guide and Teacher's Guide.

ERIC Educational Resources Information Center

Jameson, A. Keith

Presented are the teacher's guide and student materials for one of a series of self-instructional, computer-based learning modules for an introductory, undergraduate chemistry course. The student manual for this unit on Le Chatelier's principle includes objectives, prerequisites, pretest, instructions for executing the computer program, and…
Using Free Computational Resources to Illustrate the Drug Design Process in an Undergraduate Medicinal Chemistry Course

ERIC Educational Resources Information Center

Rodrigues, Ricardo P.; Andrade, Saulo F.; Mantoani, Susimaire P.; Eifler-Lima, Vera L.; Silva, Vinicius B.; Kawano, Daniel F.

2015-01-01

Advances in, and dissemination of, computer technologies in the field of drug research now enable the use of molecular modeling tools to teach important concepts of drug design to chemistry and pharmacy students. A series of computer laboratories is described to introduce undergraduate students to commonly adopted "in silico" drug design…
The Use of Modular Computer-Based Lessons in a Modification of the Classical Introductory Course in Organic Chemistry.

ERIC Educational Resources Information Center

Stotter, Philip L.; Culp, George H.

An experimental course in organic chemistry utilized computer-assisted instructional (CAI) techniques. The CAI lessons provided tutorial drill and practice and simulated experiments and reactions. The Conversational Language for Instruction and Computing was used, along with a CDC 6400-6600 system; students scheduled and completed the lessons at…
CHEMEX; Understanding and Solving Problems in Chemistry. A Computer-Assisted Instruction Program for General Chemistry.

ERIC Educational Resources Information Center

Lower, Stephen K.

A brief overview of CHEMEX--a problem-solving, tutorial style computer-assisted instructional course--is provided and sample problems are offered. In CHEMEX, students receive problems in advance and attempt to solve them before moving through the computer program, which assists them in overcoming difficulties and serves as a review mechanism.…
The Impact of Learner's Prior Knowledge on Their Use of Chemistry Computer Simulations: A Case Study

ERIC Educational Resources Information Center

Liu, Han-Chin; Andre, Thomas; Greenbowe, Thomas

2008-01-01

It is complicated to design a computer simulation that adapts to students with different characteristics. This study documented cases that show how college students' prior chemistry knowledge level affected their interaction with peers and their approach to solving problems with the use of computer simulations that were designed to learn…
On some methods for improving time of reachability sets computation for the dynamic system control problem

NASA Astrophysics Data System (ADS)

Zimovets, Artem; Matviychuk, Alexander; Ushakov, Vladimir

2016-12-01

The paper presents two different approaches to reduce the time of computer calculation of reachability sets. First of these two approaches use different data structures for storing the reachability sets in the computer memory for calculation in single-threaded mode. Second approach is based on using parallel algorithms with reference to the data structures from the first approach. Within the framework of this paper parallel algorithm of approximate reachability set calculation on computer with SMP-architecture is proposed. The results of numerical modelling are presented in the form of tables which demonstrate high efficiency of parallel computing technology and also show how computing time depends on the used data structure.
Performance analysis of parallel branch and bound search with the hypercube architecture

NASA Technical Reports Server (NTRS)

Mraz, Richard T.

1987-01-01

With the availability of commercial parallel computers, researchers are examining new classes of problems which might benefit from parallel computing. This paper presents results of an investigation of the class of search intensive problems. The specific problem discussed is the Least-Cost Branch and Bound search method of deadline job scheduling. The object-oriented design methodology was used to map the problem into a parallel solution. While the initial design was good for a prototype, the best performance resulted from fine-tuning the algorithm for a specific computer. The experiments analyze the computation time, the speed up over a VAX 11/785, and the load balance of the problem when using loosely coupled multiprocessor system based on the hypercube architecture.
Dynamic modeling of parallel robots for computed-torque control implementation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Codourey, A.

1998-12-01

In recent years, increased interest in parallel robots has been observed. Their control with modern theory, such as the computed-torque method, has, however, been restrained, essentially due to the difficulty in establishing a simple dynamic model that can be calculated in real time. In this paper, a simple method based on the virtual work principle is proposed for modeling parallel robots. The mass matrix of the robot, needed for decoupling control strategies, does not explicitly appear in the formulation; however, it can be computed separately, based on kinetic energy considerations. The method is applied to the DELTA parallel robot, leadingmore » to a very efficient model that has been implemented in a real-time computed-torque control algorithm.« less
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

NASA Technical Reports Server (NTRS)

Dorband, John E.; Aburdene, Maurice F.

2002-01-01

Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.

Portability and Cross-Platform Performance of an MPI-Based Parallel Polygon Renderer

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1999-01-01

Visualizing the results of computations performed on large-scale parallel computers is a challenging problem, due to the size of the datasets involved. One approach is to perform the visualization and graphics operations in place, exploiting the available parallelism to obtain the necessary rendering performance. Over the past several years, we have been developing algorithms and software to support visualization applications on NASA's parallel supercomputers. Our results have been incorporated into a parallel polygon rendering system called PGL. PGL was initially developed on tightly-coupled distributed-memory message-passing systems, including Intel's iPSC/860 and Paragon, and IBM's SP2. Over the past year, we have ported it to a variety of additional platforms, including the HP Exemplar, SGI Origin2OOO, Cray T3E, and clusters of Sun workstations. In implementing PGL, we have had two primary goals: cross-platform portability and high performance. Portability is important because (1) our manpower resources are limited, making it difficult to develop and maintain multiple versions of the code, and (2) NASA's complement of parallel computing platforms is diverse and subject to frequent change. Performance is important in delivering adequate rendering rates for complex scenes and ensuring that parallel computing resources are used effectively. Unfortunately, these two goals are often at odds. In this paper we report on our experiences with portability and performance of the PGL polygon renderer across a range of parallel computing platforms.
The 'wired' universe of organic chemistry.

PubMed

Grzybowski, Bartosz A; Bishop, Kyle J M; Kowalczyk, Bartlomiej; Wilmer, Christopher E

2009-04-01

The millions of reactions performed and compounds synthesized by organic chemists over the past two centuries connect to form a network larger than the metabolic networks of higher organisms and rivalling the complexity of the World Wide Web. Despite its apparent randomness, the network of chemistry has a well-defined, modular architecture. The network evolves in time according to trends that have not changed since the inception of the discipline, and thus project into chemistry's future. Analysis of organic chemistry using the tools of network theory enables the identification of most 'central' organic molecules, and for the prediction of which and how many molecules will be made in the future. Statistical analyses based on network connectivity are useful in optimizing parallel syntheses, in estimating chemical reactivity, and more.
A compositional reservoir simulator on distributed memory parallel computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rame, M.; Delshad, M.

1995-12-31

This paper presents the application of distributed memory parallel computes to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. Amore » portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straight forward. Results of the distributed memory computing performance of Parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for same problems on a vector supercomputer is also presented.« less
Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Detrixhe, Miles, E-mail: mdetrixhe@engineering.ucsb.edu; University of California Santa Barbara, Santa Barbara, CA, 93106; Gibou, Frédéric, E-mail: fgibou@engineering.ucsb.edu

The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling,more » and show state-of-the-art speedup values for the fast sweeping method.« less
Hybrid parallel computing architecture for multiview phase shifting

NASA Astrophysics Data System (ADS)

Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun

2014-11-01

The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphic processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented in GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained paralleling computing. Experimental results verify that the developed system can perform 50 fps (frame per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the performance of the proposed technique using a NVIDIA GT560Ti graphics card rather than a sequential C in a 3.4 GHZ Inter Core i7 3770.
On efficiency of fire simulation realization: parallelization with greater number of computational meshes

NASA Astrophysics Data System (ADS)

Valasek, Lukas; Glasa, Jan

2017-12-01

Current fire simulation systems are capable to utilize advantages of high-performance computer (HPC) platforms available and to model fires efficiently in parallel. In this paper, efficiency of a corridor fire simulation on a HPC computer cluster is discussed. The parallel MPI version of Fire Dynamics Simulator is used for testing efficiency of selected strategies of allocation of computational resources of the cluster using a greater number of computational cores. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cluster node cores there are allocation strategies which provide more efficient calculations.
Programs for Fundamentals of Chemistry.

ERIC Educational Resources Information Center

Gallardo, Julio; Delgado, Steven

This document provides computer programs, written in BASIC PLUS, for presenting fundamental or remedial college chemistry students with chemical problems in a computer assisted instructional program. Programs include instructions, a sample run, and 14 separate practice sessions covering: mathematical operations, using decimals, solving…
[Advancements of computer chemistry in separation of Chinese medicine].

PubMed

Li, Lingjuan; Hong, Hong; Xu, Xuesong; Guo, Liwei

2011-12-01

Separating technique of Chinese medicine is not only a key technique in the field of Chinese medicine' s research and development, but also a significant step in the modernization of Chinese medicinal preparation. Computer chemistry can build model and look for the regulations from Chinese medicine system which is full of complicated data. This paper analyzed the applicability, key technology, basic mode and common algorithm of computer chemistry applied in the separation of Chinese medicine, introduced the mathematic mode and the setting methods of Extraction kinetics, investigated several problems which based on traditional Chinese medicine membrane procession, and forecasted the application prospect.
A Hybrid MPI/OpenMP Approach for Parallel Groundwater Model Calibration on Multicore Computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan

2010-01-01

Groundwater model calibration is becoming increasingly computationally time intensive. We describe a hybrid MPI/OpenMP approach to exploit two levels of parallelism in software and hardware to reduce calibration time on multicore computers with minimal parallelization effort. At first, HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for a uranium transport model with over a hundred species involving nearly a hundred reactions, and a field scale coupled flow and transport model. In the first application, a single parallelizable loop is identified to consume over 97% of the total computational time. With a few lines of OpenMP compiler directives inserted into the code,more » the computational time reduces about ten times on a compute node with 16 cores. The performance is further improved by selectively parallelizing a few more loops. For the field scale application, parallelizable loops in 15 of the 174 subroutines in HGC5 are identified to take more than 99% of the execution time. By adding the preconditioned conjugate gradient solver and BICGSTAB, and using a coloring scheme to separate the elements, nodes, and boundary sides, the subroutines for finite element assembly, soil property update, and boundary condition application are parallelized, resulting in a speedup of about 10 on a 16-core compute node. The Levenberg-Marquardt (LM) algorithm is added into HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, compute nodes at the number of adjustable parameters (when the forward difference is used for Jacobian approximation), or twice that number (if the center difference is used), are used to reduce the calibration time from days and weeks to a few hours for the two applications. This approach can be extended to global optimization scheme and Monte Carol analysis where thousands of compute nodes can be efficiently utilized.« less
Variable-Complexity Multidisciplinary Optimization on Parallel Computers

NASA Technical Reports Server (NTRS)

Grossman, Bernard; Mason, William H.; Watson, Layne T.; Haftka, Raphael T.

1998-01-01

This report covers work conducted under grant NAG1-1562 for the NASA High Performance Computing and Communications Program (HPCCP) from December 7, 1993, to December 31, 1997. The objective of the research was to develop new multidisciplinary design optimization (MDO) techniques which exploit parallel computing to reduce the computational burden of aircraft MDO. The design of the High-Speed Civil Transport (HSCT) air-craft was selected as a test case to demonstrate the utility of our MDO methods. The three major tasks of this research grant included: development of parallel multipoint approximation methods for the aerodynamic design of the HSCT, use of parallel multipoint approximation methods for structural optimization of the HSCT, mathematical and algorithmic development including support in the integration of parallel computation for items (1) and (2). These tasks have been accomplished with the development of a response surface methodology that incorporates multi-fidelity models. For the aerodynamic design we were able to optimize with up to 20 design variables using hundreds of expensive Euler analyses together with thousands of inexpensive linear theory simulations. We have thereby demonstrated the application of CFD to a large aerodynamic design problem. For the predicting structural weight we were able to combine hundreds of structural optimizations of refined finite element models with thousands of optimizations based on coarse models. Computations have been carried out on the Intel Paragon with up to 128 nodes. The parallel computation allowed us to perform combined aerodynamic-structural optimization using state of the art models of a complex aircraft configurations.
Line-plane broadcasting in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Berg, Jeremy E.; Blocksome, Michael A.; Smith, Brian E.

2010-06-08

Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
Line-plane broadcasting in a data communications network of a parallel computer

DOEpatents

Archer, Charles J.; Berg, Jeremy E.; Blocksome, Michael A.; Smith, Brian E.

2010-11-23

Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
Protein Engineering: Development of a Metal Ion Dependent Switch

DTIC Science & Technology

2017-05-22

Society of Chemistry Royal Society of Chemistry Biochemistry PNAS Escherichia coli Journal of Biotechnology Biochemistry Nature Protocols Journal of...Molecular Biology Biochemistry Royal Society of Chemistry Proteins: Structure, Function, and Bioinformatics Journal of Molecular Biology Biophysical...Biophysical Journal Protein Science Journal of Computational Chemistry Current Opinion in Chemical Biology Royal Society of Chemistry
Computational chemistry for NH 3 synthesis, hydrotreating, and NO x reduction: Three topics of special interest to Haldor Topsøe

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elnabawy, Ahmed O.; Rangarajan, Srinivas; Mavrikakis, Manos

Computational chemistry, especially density functional theory, has experienced a remarkable growth in terms of application over the last few decades. This is attributed to the improvements in theory and computing infrastructure that enable the analysis of systems of unprecedented size and detail at an affordable computational expense. In this perspective, we discuss recent progress and current challenges facing electronic structure theory in the context of heterogeneous catalysis. We specifically focus on the impact of computational chemistry in elucidating and designing catalytic systems in three topics of interest to Haldor Topsøe – ammonia, synthesis, hydrotreating, and NO x reduction. Furthermore, wemore » then discuss the common tools and concepts in computational catalysis that underline these topics and provide a perspective on the challenges and future directions of research in this area of catalysis research.« less
Computational chemistry for NH 3 synthesis, hydrotreating, and NO x reduction: Three topics of special interest to Haldor Topsøe

DOE PAGES

Elnabawy, Ahmed O.; Rangarajan, Srinivas; Mavrikakis, Manos

2015-06-05

Computational chemistry, especially density functional theory, has experienced a remarkable growth in terms of application over the last few decades. This is attributed to the improvements in theory and computing infrastructure that enable the analysis of systems of unprecedented size and detail at an affordable computational expense. In this perspective, we discuss recent progress and current challenges facing electronic structure theory in the context of heterogeneous catalysis. We specifically focus on the impact of computational chemistry in elucidating and designing catalytic systems in three topics of interest to Haldor Topsøe – ammonia, synthesis, hydrotreating, and NO x reduction. Furthermore, wemore » then discuss the common tools and concepts in computational catalysis that underline these topics and provide a perspective on the challenges and future directions of research in this area of catalysis research.« less
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)

NASA Technical Reports Server (NTRS)

Carroll, Chester C.; Owen, Jeffrey E.

1988-01-01

A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digitial computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACLS constructs. The execution times for all ACLS constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
A parallel implementation of an off-lattice individual-based model of multicellular populations

NASA Astrophysics Data System (ADS)

Harvey, Daniel G.; Fletcher, Alexander G.; Osborne, James M.; Pitt-Francis, Joe

2015-07-01

As computational models of multicellular populations include ever more detailed descriptions of biophysical and biochemical processes, the computational cost of simulating such models limits their ability to generate novel scientific hypotheses and testable predictions. While developments in microchip technology continue to increase the power of individual processors, parallel computing offers an immediate increase in available processing power. To make full use of parallel computing technology, it is necessary to develop specialised algorithms. To this end, we present a parallel algorithm for a class of off-lattice individual-based models of multicellular populations. The algorithm divides the spatial domain between computing processes and comprises communication routines that ensure the model is correctly simulated on multiple processors. The parallel algorithm is shown to accurately reproduce the results of a deterministic simulation performed using a pre-existing serial implementation. We test the scaling of computation time, memory use and load balancing as more processes are used to simulate a cell population of fixed size. We find approximate linear scaling of both speed-up and memory consumption on up to 32 processor cores. Dynamic load balancing is shown to provide speed-up for non-regular spatial distributions of cells in the case of a growing population.
Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

NASA Astrophysics Data System (ADS)

Hohl, A.; Delmelle, E. M.; Tang, W.

2015-07-01

Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octtree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data-points that are within the spatial and temporal kernel bandwidths. Then, we quantify computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable of other space-time analytical tests.
Accelerating EPI distortion correction by utilizing a modern GPU-based parallel computation.

PubMed

Yang, Yao-Hao; Huang, Teng-Yi; Wang, Fu-Nien; Chuang, Tzu-Chao; Chen, Nan-Kuei

2013-04-01

The combination of phase demodulation and field mapping is a practical method to correct echo planar imaging (EPI) geometric distortion. However, since phase dispersion accumulates in each phase-encoding step, the calculation complexity of phase modulation is Ny-fold higher than conventional image reconstructions. Thus, correcting EPI images via phase demodulation is generally a time-consuming task. Parallel computing by employing general-purpose calculations on graphics processing units (GPU) can accelerate scientific computing if the algorithm is parallelized. This study proposes a method that incorporates the GPU-based technique into phase demodulation calculations to reduce computation time. The proposed parallel algorithm was applied to a PROPELLER-EPI diffusion tensor data set. The GPU-based phase demodulation method reduced the EPI distortion correctly, and accelerated the computation. The total reconstruction time of the 16-slice PROPELLER-EPI diffusion tensor images with matrix size of 128 × 128 was reduced from 1,754 seconds to 101 seconds by utilizing the parallelized 4-GPU program. GPU computing is a promising method to accelerate EPI geometric correction. The resulting reduction in computation time of phase demodulation should accelerate postprocessing for studies performed with EPI, and should effectuate the PROPELLER-EPI technique for clinical practice. Copyright © 2011 by the American Society of Neuroimaging.
Super and parallel computers and their impact on civil engineering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kamat, M.P.

1986-01-01

This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.

Xyce

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thomquist, Heidi K.; Fixel, Deborah A.; Fett, David Brian

The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE and transient mode using standard analog (DAE) and/or device (PDE) device models including several age and radiation aware devices. It supports a variety of computing platforms (both serial and parallel) computers. Lastly, it uses a variety of modern solution algorithms dynamic parallel load-balancing and iterative solvers.
Computer Based Instructional Techniques in Undergraduate Introductory Organic Chemistry: Rationale, Developmental Techniques, Programming Strategies and Evaluation.

ERIC Educational Resources Information Center

Culp, G. H.; And Others

Over 100 interactive computer programs for use in general and organic chemistry at the University of Texas at Austin have been prepared. The rationale for the programs is based upon the belief that computer-assisted instruction (CAI) can improve education by, among other things, freeing teachers from routine tasks, measuring entry skills,…
MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

PubMed Central

Yang, Jie; Huang, Yuan; Xu, Lixiong; Li, Siguang; Qi, Man

2015-01-01

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation especially when the size of data is large. Nowadays, big data has received a momentum from both industry and academia. To fulfill the potentials of ANNs for big data applications, the computation process must be speeded up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications. Three data intensive scenarios are considered in the parallelization process in terms of the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation. PMID:26681933
Tutorial: Parallel Computing of Simulation Models for Risk Analysis.

PubMed

Reilly, Allison C; Staid, Andrea; Gao, Michael; Guikema, Seth D

2016-10-01

Simulation models are widely used in risk analysis to study the effects of uncertainties on outcomes of interest in complex problems. Often, these models are computationally complex and time consuming to run. This latter point may be at odds with time-sensitive evaluations or may limit the number of parameters that are considered. In this article, we give an introductory tutorial focused on parallelizing simulation code to better leverage modern computing hardware, enabling risk analysts to better utilize simulation-based methods for quantifying uncertainty in practice. This article is aimed primarily at risk analysts who use simulation methods but do not yet utilize parallelization to decrease the computational burden of these models. The discussion is focused on conceptual aspects of embarrassingly parallel computer code and software considerations. Two complementary examples are shown using the languages MATLAB and R. A brief discussion of hardware considerations is located in the Appendix. © 2016 Society for Risk Analysis.
MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning.

PubMed

Liu, Yang; Yang, Jie; Huang, Yuan; Xu, Lixiong; Li, Siguang; Qi, Man

2015-01-01

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation especially when the size of data is large. Nowadays, big data has received a momentum from both industry and academia. To fulfill the potentials of ANNs for big data applications, the computation process must be speeded up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications. Three data intensive scenarios are considered in the parallelization process in terms of the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation.
A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

PubMed

Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

2014-01-01

It is very time consuming to solve fractional differential equations. The computational complexity of two-dimensional fractional differential equation (2D-TFDE) with iterative implicit finite difference method is O(M(x)M(y)N(2)). In this paper, we present a parallel algorithm for 2D-TFDE and give an in-depth discussion about this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We do think that the parallel computing technology will become a very basic method for the computational intensive fractional applications in the near future.
Performance Characteristics of the Multi-Zone NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Haoqiang; VanderWijngaart, Rob F.

2003-01-01

We describe a new suite of computational benchmarks that models applications featuring multiple levels of parallelism. Such parallelism is often available in realistic flow computations on systems of grids, but had not previously been captured in bench-marks. The new suite, named NPB Multi-Zone, is extended from the NAS Parallel Benchmarks suite, and involves solving the application benchmarks LU, BT and SP on collections of loosely coupled discretization meshes. The solutions on the meshes are updated independently, but after each time step they exchange boundary value information. This strategy provides relatively easily exploitable coarse-grain parallelism between meshes. Three reference implementations are available: one serial, one hybrid using the Message Passing Interface (MPI) and OpenMP, and another hybrid using a shared memory multi-level programming model (SMP+OpenMP). We examine the effectiveness of hybrid parallelization paradigms in these implementations on three different parallel computers. We also use an empirical formula to investigate the performance characteristics of the multi-zone benchmarks.
Methods for design and evaluation of parallel computating systems (The PISCES project)

NASA Technical Reports Server (NTRS)

Pratt, Terrence W.; Wise, Robert; Haught, Mary JO

1989-01-01

The PISCES project started in 1984 under the sponsorship of the NASA Computational Structural Mechanics (CSM) program. A PISCES 1 programming environment and parallel FORTRAN were implemented in 1984 for the DEC VAX (using UNIX processes to simulate parallel processes). This system was used for experimentation with parallel programs for scientific applications and AI (dynamic scene analysis) applications. PISCES 1 was ported to a network of Apollo workstations by N. Fitzgerald.
A new parallel-vector finite element analysis software on distributed-memory computers

NASA Technical Reports Server (NTRS)

Qin, Jiangning; Nguyen, Duc T.

1993-01-01

A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed-memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. Block-skyline storage scheme along with vector-unrolling techniques are used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
Solving very large, sparse linear systems on mesh-connected parallel computers

NASA Technical Reports Server (NTRS)

Opsahl, Torstein; Reif, John

1987-01-01

The implementation of Pan and Reif's Parallel Nested Dissection (PND) algorithm on mesh connected parallel computers is described. This is the first known algorithm that allows very large, sparse linear systems of equations to be solved efficiently in polylog time using a small number of processors. How the processor bound of PND can be matched to the number of processors available on a given parallel computer by slowing down the algorithm by constant factors is described. Also, for the important class of problems where G(A) is a grid graph, a unique memory mapping that reduces the inter-processor communication requirements of PND to those that can be executed on mesh connected parallel machines is detailed. A description of an implementation on the Goodyear Massively Parallel Processor (MPP), located at Goddard is given. Also, a detailed discussion of data mappings and performance issues is given.
High order parallel numerical schemes for solving incompressible flows

NASA Technical Reports Server (NTRS)

Lin, Avi; Milner, Edward J.; Liou, May-Fun; Belch, Richard A.

1992-01-01

The use of parallel computers for numerically solving flow fields has gained much importance in recent years. This paper introduces a new high order numerical scheme for computational fluid dynamics (CFD) specifically designed for parallel computational environments. A distributed MIMD system gives the flexibility of treating different elements of the governing equations with totally different numerical schemes in different regions of the flow field. The parallel decomposition of the governing operator to be solved is the primary parallel split. The primary parallel split was studied using a hypercube like architecture having clusters of shared memory processors at each node. The approach is demonstrated using examples of simple steady state incompressible flows. Future studies should investigate the secondary split because, depending on the numerical scheme that each of the processors applies and the nature of the flow in the specific subdomain, it may be possible for a processor to seek better, or higher order, schemes for its particular subcase.
Parallelization of the FLAPW method

NASA Astrophysics Data System (ADS)

Canning, A.; Mannstadt, W.; Freeman, A. J.

2000-08-01

The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
PEM-PCA: a parallel expectation-maximization PCA face recognition architecture.

PubMed

Rujirakul, Kanokmon; So-In, Chakchai; Arnonkijpanich, Banchar

2014-01-01

Principal component analysis or PCA has been traditionally used as one of the feature extraction techniques in face recognition systems yielding high accuracy when requiring a small number of features. However, the covariance matrix and eigenvalue decomposition stages cause high computational complexity, especially for a large database. Thus, this research presents an alternative approach utilizing an Expectation-Maximization algorithm to reduce the determinant matrix manipulation resulting in the reduction of the stages' complexity. To improve the computational time, a novel parallel architecture was employed to utilize the benefits of parallelization of matrix computation during feature extraction and classification stages including parallel preprocessing, and their combinations, so-called a Parallel Expectation-Maximization PCA architecture. Comparing to a traditional PCA and its derivatives, the results indicate lower complexity with an insignificant difference in recognition precision leading to high speed face recognition systems, that is, the speed-up over nine and three times over PCA and Parallel PCA.
Partitioning problems in parallel, pipelined and distributed computing

NASA Technical Reports Server (NTRS)

Bokhari, S.

1985-01-01

The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple satellite system: partitioning multiple chain structured parallel programs, multiple arbitrarily structured serial programs and single tree structured parallel programs. In addition, the problems of partitioning chain structured parallel programs across chain connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

NASA Technical Reports Server (NTRS)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
Parallel Algorithms and Patterns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robey, Robert W.

2016-06-16

This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
Parallel Computing for Probabilistic Response Analysis of High Temperature Composites

NASA Technical Reports Server (NTRS)

Sues, R. H.; Lua, Y. J.; Smith, M. D.

1994-01-01

The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.
Beyond input-output computings: error-driven emergence with parallel non-distributed slime mold computer.

PubMed

Aono, Masashi; Gunji, Yukio-Pegio

2003-10-01

The emergence derived from errors is the key importance for both novel computing and novel usage of the computer. In this paper, we propose an implementable experimental plan for the biological computing so as to elicit the emergent property of complex systems. An individual plasmodium of the true slime mold Physarum polycephalum acts in the slime mold computer. Modifying the Elementary Cellular Automaton as it entails the global synchronization problem upon the parallel computing provides the NP-complete problem solved by the slime mold computer. The possibility to solve the problem by giving neither all possible results nor explicit prescription of solution-seeking is discussed. In slime mold computing, the distributivity in the local computing logic can change dynamically, and its parallel non-distributed computing cannot be reduced into the spatial addition of multiple serial computings. The computing system based on exhaustive absence of the super-system may produce, something more than filling the vacancy.
Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

PubMed

Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

2009-01-01

Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.
Access and visualization using clusters and other parallel computers

NASA Technical Reports Server (NTRS)

Katz, Daniel S.; Bergou, Attila; Berriman, Bruce; Block, Gary; Collier, Jim; Curkendall, Dave; Good, John; Husman, Laura; Jacob, Joe; Laity, Anastasia;

2003-01-01

JPL's Parallel Applications Technologies Group has been exploring the issues of data access and visualization of very large data sets over the past 10 or so years. this work has used a number of types of parallel computers, and today includes the use of commodity clusters. This talk will highlight some of the applications and tools we have developed, including how they use parallel computing resources, and specifically how we are using modern clusters. Our applications focus on NASA's needs; thus our data sets are usually related to Earth and Space Science, including data delivered from instruments in space, and data produced by telescopes on the ground.

Transect studies on pine forests along parallel 52° north, 12-32° east and along a pollution gradient in Poland: general assumptions

Treesearch

Alicja Breymeyer

1998-01-01

The responses of pine forest to changing climate and environmental chemistry were studied along two transects following the pollution and continentality gradients in Poland. One axis begins on the western border of Poland, crosses the country along the 52nd parallel, and ends on the eastern border of Poland in the area of Bialowieza National Park, Biosphere Reserve....
CMOS VLSI Layout and Verification of a SIMD Computer

NASA Technical Reports Server (NTRS)

Zheng, Jianqing

1996-01-01

A CMOS VLSI layout and verification of a 3 x 3 processor parallel computer has been completed. The layout was done using the MAGIC tool and the verification using HSPICE. Suggestions for expanding the computer into a million processor network are presented. Many problems that might be encountered when implementing a massively parallel computer are discussed.
CFD Analysis and Design Optimization Using Parallel Computers

NASA Technical Reports Server (NTRS)

Martinelli, Luigi; Alonso, Juan Jose; Jameson, Antony; Reuther, James

1997-01-01

A versatile and efficient multi-block method is presented for the simulation of both steady and unsteady flow, as well as aerodynamic design optimization of complete aircraft configurations. The compressible Euler and Reynolds Averaged Navier-Stokes (RANS) equations are discretized using a high resolution scheme on body-fitted structured meshes. An efficient multigrid implicit scheme is implemented for time-accurate flow calculations. Optimum aerodynamic shape design is achieved at very low cost using an adjoint formulation. The method is implemented on parallel computing systems using the MPI message passing interface standard to ensure portability. The results demonstrate that, by combining highly efficient algorithms with parallel computing, it is possible to perform detailed steady and unsteady analysis as well as automatic design for complex configurations using the present generation of parallel computers.
Optimistic barrier synchronization

NASA Technical Reports Server (NTRS)

Nicol, David M.

1992-01-01

Barrier synchronization is fundamental operation in parallel computation. In many contexts, at the point a processor enters a barrier it knows that it has already processed all the work required of it prior to synchronization. The alternative case, when a processor cannot enter a barrier with the assurance that it has already performed all the necessary pre-synchronization computation, is treated. The problem arises when the number of pre-sychronization messages to be received by a processor is unkown, for example, in a parallel discrete simulation or any other computation that is largely driven by an unpredictable exchange of messages. We describe an optimistic O(log sup 2 P) barrier algorithm for such problems, study its performance on a large-scale parallel system, and consider extensions to general associative reductions as well as associative parallel prefix computations.
Parallel grid generation algorithm for distributed memory computers

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Moitra, Anutosh

1994-01-01

A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and implementation of multiple levELs of parallelism on multiple instruction multiple data machines are indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations bAsed on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.
Efficient computation of hashes

NASA Astrophysics Data System (ADS)

Lopes, Raul H. C.; Franqueira, Virginia N. L.; Hobson, Peter R.

2014-06-01

The sequential computation of hashes at the core of many distributed storage systems and found, for example, in grid services can hinder efficiency in service quality and even pose security challenges that can only be addressed by the use of parallel hash tree modes. The main contributions of this paper are, first, the identification of several efficiency and security challenges posed by the use of sequential hash computation based on the Merkle-Damgard engine. In addition, alternatives for the parallel computation of hash trees are discussed, and a prototype for a new parallel implementation of the Keccak function, the SHA-3 winner, is introduced.
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, R.E.; Gustafson, J.L.; Montry, G.R.

1999-08-10

A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1999-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
Managing internode data communications for an uninitialized process in a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R

2014-05-20

A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes, MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior tomore » initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.« less
Managing internode data communications for an uninitialized process in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

2014-05-20

A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes, MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
[Computational chemistry in structure-based drug design].

PubMed

Cao, Ran; Li, Wei; Sun, Han-Zi; Zhou, Yu; Huang, Niu

2013-07-01

Today, the understanding of the sequence and structure of biologically relevant targets is growing rapidly and researchers from many disciplines, physics and computational science in particular, are making significant contributions to modern biology and drug discovery. However, it remains challenging to rationally design small molecular ligands with desired biological characteristics based on the structural information of the drug targets, which demands more accurate calculation of ligand binding free-energy. With the rapid advances in computer power and extensive efforts in algorithm development, physics-based computational chemistry approaches have played more important roles in structure-based drug design. Here we reviewed the newly developed computational chemistry methods in structure-based drug design as well as the elegant applications, including binding-site druggability assessment, large scale virtual screening of chemical database, and lead compound optimization. Importantly, here we address the current bottlenecks and propose practical solutions.
Implementing and analyzing the multi-threaded LP-inference

NASA Astrophysics Data System (ADS)

Bolotova, S. Yu; Trofimenko, E. V.; Leschinskaya, M. V.

2018-03-01

The logical production equations provide new possibilities for the backward inference optimization in intelligent production-type systems. The strategy of a relevant backward inference is aimed at minimization of a number of queries to external information source (either to a database or an interactive user). The idea of the method is based on the computing of initial preimages set and searching for the true preimage. The execution of each stage can be organized independently and in parallel and the actual work at a given stage can also be distributed between parallel computers. This paper is devoted to the parallel algorithms of the relevant inference based on the advanced scheme of the parallel computations “pipeline” which allows to increase the degree of parallelism. The author also provides some details of the LP-structures implementation.
A Parallel Processing Algorithm for Remote Sensing Classification

NASA Technical Reports Server (NTRS)

Gualtieri, J. Anthony

2005-01-01

A current thread in parallel computation is the use of cluster computers created by networking a few to thousands of commodity general-purpose workstation-level commuters using the Linux operating system. For example on the Medusa cluster at NASA/GSFC, this provides for super computing performance, 130 G(sub flops) (Linpack Benchmark) at moderate cost, $370K. However, to be useful for scientific computing in the area of Earth science, issues of ease of programming, access to existing scientific libraries, and portability of existing code need to be considered. In this paper, I address these issues in the context of tools for rendering earth science remote sensing data into useful products. In particular, I focus on a problem that can be decomposed into a set of independent tasks, which on a serial computer would be performed sequentially, but with a cluster computer can be performed in parallel, giving an obvious speedup. To make the ideas concrete, I consider the problem of classifying hyperspectral imagery where some ground truth is available to train the classifier. In particular I will use the Support Vector Machine (SVM) approach as applied to hyperspectral imagery. The approach will be to introduce notions about parallel computation and then to restrict the development to the SVM problem. Pseudocode (an outline of the computation) will be described and then details specific to the implementation will be given. Then timing results will be reported to show what speedups are possible using parallel computation. The paper will close with a discussion of the results.
PISCES: An environment for parallel scientific computation

NASA Technical Reports Server (NTRS)

Pratt, T. W.

1985-01-01

The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.
Parallel Preconditioning for CFD Problems on the CM-5

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

1994-01-01

Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver such as incomplete LU factorizations are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) are difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
LABORATORY AND COMPUTATIONAL INVESTIGATIONS OF THE ATMOSPHERIC CHEMISTRY OF KEY OXIDATION PRODUCTS CONTROLLING TROPOSPHERIC OZONE FORMATION

EPA Science Inventory

Major uncertainties remain in our ability to identify the key reactions and primary oxidation products of volatile hydrocarbons that contribute to ozone formation in the troposphere. To reduce these uncertainties, computational chemistry, mechanistic and process analysis techniqu...
Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

2000-01-01

HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).
Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Byun, Chansup; Kwak, Dochan (Technical Monitor)

2001-01-01

A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel super computers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach is demonstrated on several parallel computers.
Reactive Transport Modeling of Induced Calcite Precipitation Reaction Fronts in Porous Media Using A Parallel, Fully Coupled, Fully Implicit Approach

NASA Astrophysics Data System (ADS)

Guo, L.; Huang, H.; Gaston, D.; Redden, G. D.; Fox, D. T.; Fujita, Y.

2010-12-01

Inducing mineral precipitation in the subsurface is one potential strategy for immobilizing trace metal and radionuclide contaminants. Generating mineral precipitates in situ can be achieved by manipulating chemical conditions, typically through injection or in situ generation of reactants. How these reactants transport, mix and react within the medium controls the spatial distribution and composition of the resulting mineral phases. Multiple processes, including fluid flow, dispersive/diffusive transport of reactants, biogeochemical reactions and changes in porosity-permeability, are tightly coupled over a number of scales. Numerical modeling can be used to investigate the nonlinear coupling effects of these processes which are quite challenging to explore experimentally. Many subsurface reactive transport simulators employ a de-coupled or operator-splitting approach where transport equations and batch chemistry reactions are solved sequentially. However, such an approach has limited applicability for biogeochemical systems with fast kinetics and strong coupling between chemical reactions and medium properties. A massively parallel, fully coupled, fully implicit Reactive Transport simulator (referred to as “RAT”) based on a parallel multi-physics object-oriented simulation framework (MOOSE) has been developed at the Idaho National Laboratory. Within this simulator, systems of transport and reaction equations can be solved simultaneously in a fully coupled, fully implicit manner using the Jacobian Free Newton-Krylov (JFNK) method with additional advanced computing capabilities such as (1) physics-based preconditioning for solution convergence acceleration, (2) massively parallel computing and scalability, and (3) adaptive mesh refinements for 2D and 3D structured and unstructured mesh. The simulator was first tested against analytical solutions, then applied to simulating induced calcium carbonate mineral precipitation in 1D columns and 2D flow cells as analogs to homogeneous and heterogeneous porous media, respectively. In 1D columns, calcium carbonate mineral precipitation was driven by urea hydrolysis catalyzed by urease enzyme, and in 2D flow cells, calcium carbonate mineral forming reactants were injected sequentially, forming migrating reaction fronts that are typically highly nonuniform. The RAT simulation results for the spatial and temporal distributions of precipitates, reaction rates and major species in the system, and also for changes in porosity and permeability, were compared to both laboratory experimental data and computational results obtained using other reactive transport simulators. The comparisons demonstrate the ability of RAT to simulate complex nonlinear systems and the advantages of fully coupled approaches, over de-coupled methods, for accurate simulation of complex, dynamic processes such as engineered mineral precipitation in subsurface environments.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log2N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log2N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying-size and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log2N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.

Supercomputing in Aerospace

NASA Technical Reports Server (NTRS)

Kutler, Paul; Yee, Helen

1987-01-01

Topics addressed include: numerical aerodynamic simulation; computational mechanics; supercomputers; aerospace propulsion systems; computational modeling in ballistics; turbulence modeling; computational chemistry; computational fluid dynamics; and computational astrophysics.
Manyscale Computing for Sensor Processing in Support of Space Situational Awareness

NASA Astrophysics Data System (ADS)

Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.

2014-09-01

Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
Chemistry Notes.

ERIC Educational Resources Information Center

School Science Review, 1983

1983-01-01

Presents background information, laboratory procedures, classroom materials/activities, and chemistry experiments. Topics include sublimation, electronegativity, electrolysis, experimental aspects of strontianite, halide test, evaluation of present and future computer programs in chemistry, formula building, care of glass/saturated calomel…
DOE Office of Scientific and Technical Information (OSTI.GOV)

Demeure, I.M.

The research presented here is concerned with representation techniques and tools to support the design, prototyping, simulation, and evaluation of message-based parallel, distributed computations. The author describes ParaDiGM-Parallel, Distributed computation Graph Model-a visual representation technique for parallel, message-based distributed computations. ParaDiGM provides several views of a computation depending on the aspect of concern. It is made of two complementary submodels, the DCPG-Distributed Computing Precedence Graph-model, and the PAM-Process Architecture Model-model. DCPGs are precedence graphs used to express the functionality of a computation in terms of tasks, message-passing, and data. PAM graphs are used to represent the partitioning of a computationmore » into schedulable units or processes, and the pattern of communication among those units. There is a natural mapping between the two models. He illustrates the utility of ParaDiGM as a representation technique by applying it to various computations (e.g., an adaptive global optimization algorithm, the client-server model). ParaDiGM representations are concise. They can be used in documenting the design and the implementation of parallel, distributed computations, in describing such computations to colleagues, and in comparing and contrasting various implementations of the same computation. He then describes VISA-VISual Assistant, a software tool to support the design, prototyping, and simulation of message-based parallel, distributed computations. VISA is based on the ParaDiGM model. In particular, it supports the editing of ParaDiGM graphs to describe the computations of interest, and the animation of these graphs to provide visual feedback during simulations. The graphs are supplemented with various attributes, simulation parameters, and interpretations which are procedures that can be executed by VISA.« less
Parallelization and Visual Analysis of Multidimensional Fields: Application to Ozone Production, Destruction, and Transport in Three Dimensions

NASA Technical Reports Server (NTRS)

Schwan, Karsten; Alyea, Fred; Ribarsky, M. William; Trauner, Mary; Eisenhauer, Greg; Jean, Yves; Gu, Weiming; Wang, Ray; Waldrop, Jeffrey; Schroeder, Beth;

1996-01-01

The three-dimensional, spectral transport model used in the current project was first successfully integrated over climatological time scales by Dr. Guang Ping Lou for the simulation of atmospheric N2O using the United Kingdom Meteorological Office (UKMO) 4-dimensional, assimilated wind and temperature data set. A non-parallel, FORTRAN version of this integration using a fairly simple N2O chemistry package containing only photo-chemical reactions was used to verify our initial parallel model results. The integrations reproduced the gross features of the observed stratospheric climatological N2O distributions but also simulated the structure of the stratospheric Antarctic vortex and its evolution. Subsequently, Dr. Thomas Kindler, who produced much of the parallel version of our model, enlarged the N2O model chemistry package to include N2O reactions involving O(D-1) and also introduced assimilated wind data from NASA as well as UKMO. Initially, transport calculations without chemistry were run using Carbon-14 as a non-reactive tracer gas with the result that large differences in the transport properties of the two assimilated wind data sets were apparent from the resultant Carbon-14 distributions. Subsequent calculations for N2O, including its chemistry, with the two input winds data sets with verification from UARS satellite observations have refined the transport differences between the two such that the model's steering capabilities could be used to infer the correct climatological vertical velocity fields required to support the N2O observations. During this process, it was also discovered that both the NASA and the UKMO data contained spurious values in some of the higher frequency wave components, leading to incorrect local transport calculations and ultimately affecting the large scale properties of the model's N2O distributions, particularly at tropical latitudes. Subsequent model runs with wind data that had been filtered to remove some of the high frequency components produced much more realistic N2O distributions. During the past few months, the UKMO wind data base for a complete two-year period was processed into spectral form for model use. This new version of the input transport data base now includes complete temperature fields as well as the necessary wind data. This was done to facilitate advanced chemical calculations in the parallel model which often depend upon temperature. Additional UKMO data is being added as it becomes available.

Scan line graphics generation on the massively parallel processor

NASA Technical Reports Server (NTRS)

Dorband, John E.

1988-01-01

Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.
The paradigm compiler: Mapping a functional language for the connection machine

NASA Technical Reports Server (NTRS)

Dennis, Jack B.

1989-01-01

The Paradigm Compiler implements a new approach to compiling programs written in high level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.
Parallel computing in genomic research: advances and applications

PubMed Central

Ocaña, Kary; de Oliveira, Daniel

2015-01-01

Today’s genomic experiments have to process the so-called “biological big data” that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied for reducing the total processing time and to ease the management, treatment, and analyses of this data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing unit requires the expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists for processing their genomic experiments using HPC capabilities and parallelism techniques. This article brings a systematic review of literature that surveys the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that can be considered by scientists when running their genomic experiments to benefit from parallelism techniques and HPC capabilities. PMID:26604801
Parallel computing in genomic research: advances and applications.

PubMed

Ocaña, Kary; de Oliveira, Daniel

2015-01-01

Today's genomic experiments have to process the so-called "biological big data" that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied for reducing the total processing time and to ease the management, treatment, and analyses of this data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing unit requires the expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists for processing their genomic experiments using HPC capabilities and parallelism techniques. This article brings a systematic review of literature that surveys the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that can be considered by scientists when running their genomic experiments to benefit from parallelism techniques and HPC capabilities.
Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fijany, A.; Milman, M.; Redding, D.

1994-12-31

In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optic system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although, this represents an optimal computation, due the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm,more » designated as Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other Fast Poisson Solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized, architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.« less
Automated Generation of Message-Passing Programs: An Evaluation Using CAPTools

NASA Technical Reports Server (NTRS)

Hribar, Michelle R.; Jin, Haoqiang; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

Scientists at NASA Ames Research Center have been developing computational aeroscience applications on highly parallel architectures over the past ten years. During that same time period, a steady transition of hardware and system software also occurred, forcing us to expend great efforts into migrating and re-coding our applications. As applications and machine architectures become increasingly complex, the cost and time required for this process will become prohibitive. In this paper, we present the first set of results in our evaluation of interactive parallelization tools. In particular, we evaluate CAPTool's ability to parallelize computational aeroscience applications. CAPTools was tested on serial versions of the NAS Parallel Benchmarks and ARC3D, a computational fluid dynamics application, on two platforms: the SGI Origin 2000 and the Cray T3E. This evaluation includes performance, amount of user interaction required, limitations and portability. Based on these results, a discussion on the feasibility of computer aided parallelization of aerospace applications is presented along with suggestions for future work.
Connectionist Models and Parallelism in High Level Vision.

DTIC Science & Technology

1985-01-01

GRANT NUMBER(s) Jerome A. Feldman N00014-82-K-0193 9. PERFORMING ORGANIZATION NAME AND ADDRESS 10. PROGRAM ELEMENt. PROJECT, TASK Computer Science...Connectionist Models 2.1 Background and Overviev % Computer science is just beginning to look seriously at parallel computation : it may turn out that...the chair. The program includes intermediate level networks that compute more complex joints and ones that compute parallelograms in the image. These
Effect of Material Ion Exchanges on the Mechanical Stiffness Properties and Shear Deformation of Hydrated Cement Material Chemistry Structure C-S-H Jennit - A Computational Modeling Study

DTIC Science & Technology

2014-01-01

Study Material properties and performance are governed by material molecular chemistry structures and molecular level interactions. Methods to...understand relationships between the material properties and performance and their correlation to the molecular level chemistry and morphology, and thus...find ways of manipulating and adjusting matters at the atomistic level in order to improve material performance are required. A computational material
DOE Office of Scientific and Technical Information (OSTI.GOV)

None, None

The Second SIAM Conference on Computational Science and Engineering was held in San Diego from February 10-12, 2003. Total conference attendance was 553. This is a 23% increase in attendance over the first conference. The focus of this conference was to draw attention to the tremendous range of major computational efforts on large problems in science and engineering, to promote the interdisciplinary culture required to meet these large-scale challenges, and to encourage the training of the next generation of computational scientists. Computational Science & Engineering (CS&E) is now widely accepted, along with theory and experiment, as a crucial third modemore » of scientific investigation and engineering design. Aerospace, automotive, biological, chemical, semiconductor, and other industrial sectors now rely on simulation for technical decision support. For federal agencies also, CS&E has become an essential support for decisions on resources, transportation, and defense. CS&E is, by nature, interdisciplinary. It grows out of physical applications and it depends on computer architecture, but at its heart are powerful numerical algorithms and sophisticated computer science techniques. From an applied mathematics perspective, much of CS&E has involved analysis, but the future surely includes optimization and design, especially in the presence of uncertainty. Another mathematical frontier is the assimilation of very large data sets through such techniques as adaptive multi-resolution, automated feature search, and low-dimensional parameterization. The themes of the 2003 conference included, but were not limited to: Advanced Discretization Methods; Computational Biology and Bioinformatics; Computational Chemistry and Chemical Engineering; Computational Earth and Atmospheric Sciences; Computational Electromagnetics; Computational Fluid Dynamics; Computational Medicine and Bioengineering; Computational Physics and Astrophysics; Computational Solid Mechanics and Materials; CS&E Education; Meshing and Adaptivity; Multiscale and Multiphysics Problems; Numerical Algorithms for CS&E; Discrete and Combinatorial Algorithms for CS&E; Inverse Problems; Optimal Design, Optimal Control, and Inverse Problems; Parallel and Distributed Computing; Problem-Solving Environments; Software and Wddleware Systems; Uncertainty Estimation and Sensitivity Analysis; and Visualization and Computer Graphics.« less
GPU accelerated dynamic functional connectivity analysis for functional MRI data.

PubMed

Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu

2015-07-01

Recent advances in multi-core processors and graphics card based computational technologies have paved the way for an improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented for the acceleration of computationally-intensive problems in various computational science fields including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented in both Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. Another approach developed in this study to better utilize CUDA architecture is the block-based approach, where parallelization involves smaller parts of fMRI time-courses obtained by sliding-windows. Experimental results showed that the proposed parallel design solutions enabled by the GPUs significantly reduce the computation time for DFC analysis. Multicore implementation using OpenMP on 8-core processor provides up to 7.7× speed-up. GPU implementation using CUDA yielded substantial accelerations ranging from 18.5× to 157× speed-up once thread-based and block-based approaches were combined in the analysis. Proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerated the DFC analyses significantly. Developed algorithms make the DFC analyses more practical for multi-subject studies with more dynamic analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
Administering truncated receive functions in a parallel messaging interface

DOEpatents

Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

2014-12-09

Administering truncated receive functions in a parallel messaging interface (`PMI`) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.
Efficient Predictions of Excited State for Nanomaterials Using Aces 3 and 4

DTIC Science & Technology

2017-12-20

by first-principle methods in the software package ACES by using large parallel computers, growing tothe exascale. 15. SUBJECT TERMS Computer...modeling, excited states, optical properties, structure, stability, activation barriers first principle methods , parallel computing 16. SECURITY...2 Progress with new density functional methods
Efficient multi-objective calibration of a computationally intensive hydrologic model with parallel computing software in Python

USDA-ARS?s Scientific Manuscript database

With enhanced data availability, distributed watershed models for large areas with high spatial and temporal resolution are increasingly used to understand water budgets and examine effects of human activities and climate change/variability on water resources. Developing parallel computing software...
A Survey of Parallel Computing

DTIC Science & Technology

1988-07-01

Evaluating Two Massively Parallel Machines. Communications of the ACM .9, , , 176 BIBLIOGRAPHY 29, 8 (August), pp. 752-758. Gajski , D.D., Padua, D.A., Kuck...Computer Architecture, edited by Gajski , D. D., Milutinovic, V. M. Siegel, H. J. and Furht, B. P. IEEE Computer Society Press, Washington, D.C., pp. 387-407
Topical perspective on massive threading and parallelism.

PubMed

Farber, Robert M

2011-09-01

Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three-orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive-threading and massive-parallelism such as the 1.3 million threads Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world is how to capitalize on these new massively threaded computational architectures--especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelize with some insight into the future. Published by Elsevier Inc.

Software Issues at the User Interface

DTIC Science & Technology

1991-05-01

successful integration of parallel computers into mainstream scientific computing. Clearly a compiler is the most important software tool available to a...Computer Science University of Colorado Boulder, CO 80309 ABSTRACT We review software issues that are critical to the successful integration of parallel...The development of an optimizing compiler of this quality, addressing communicaton instructions as well as computational instructions is a major
U.S. EPA’s Computational Toxicology Program: Innovation Powered by Chemistry (Dalton State College presentation)

EPA Science Inventory

Invited presentation at Dalton College, Dalton, GA to the Alliance for Innovation & Sustainability, April 20, 2017. U.S. EPA’s Computational Toxicology Program: Innovation Powered by Chemistry It is estimated that tens of thousands of commercial and industrial chemicals are ...
Applied Computational Chemistry for the Blind and Visually Impaired

ERIC Educational Resources Information Center

Wedler, Henry B.; Cohen, Sarah R.; Davis, Rebecca L.; Harrison, Jason G.; Siebert, Matthew R.; Willenbring, Dan; Hamann, Christian S.; Shaw, Jared T.; Tantillo, Dean J.

2012-01-01

We describe accommodations that we have made to our applied computational-theoretical chemistry laboratory to provide access for blind and visually impaired students interested in independent investigation of structure-function relationships. Our approach utilizes tactile drawings, molecular model kits, existing software, Bash and Perl scripts…
From data to analysis: linking NWChem and Avogadro with the syntax and semantics of Chemical Markup Language.

PubMed

de Jong, Wibe A; Walker, Andrew M; Hanwell, Marcus D

2013-05-24

Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations. Integrating data requires semantically rich information. In this paper an end-to-end use of semantically rich data in computational chemistry is demonstrated utilizing the Chemical Markup Language (CML) framework. Semantically rich data is generated by the NWChem computational chemistry software with the FoX library and utilized by the Avogadro molecular editor for analysis and visualization. The NWChem computational chemistry software has been modified and coupled to the FoX library to write CML compliant XML data files. The FoX library was expanded to represent the lexical input files and molecular orbitals used by the computational chemistry software. Draft dictionary entries and a format for molecular orbitals within CML CompChem were developed. The Avogadro application was extended to read in CML data, and display molecular geometry and electronic structure in the GUI allowing for an end-to-end solution where Avogadro can create input structures, generate input files, NWChem can run the calculation and Avogadro can then read in and analyse the CML output produced. The developments outlined in this paper will be made available in future releases of NWChem, FoX, and Avogadro. The production of CML compliant XML files for computational chemistry software such as NWChem can be accomplished relatively easily using the FoX library. The CML data can be read in by a newly developed reader in Avogadro and analysed or visualized in various ways. A community-based effort is needed to further develop the CML CompChem convention and dictionary. This will enable the long-term goal of allowing a researcher to run simple "Google-style" searches of chemistry and physics and have the results of computational calculations returned in a comprehensible form alongside articles from the published literature.
From data to analysis: linking NWChem and Avogadro with the syntax and semantics of Chemical Markup Language

PubMed Central

2013-01-01

Background Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations. Integrating data requires semantically rich information. In this paper an end-to-end use of semantically rich data in computational chemistry is demonstrated utilizing the Chemical Markup Language (CML) framework. Semantically rich data is generated by the NWChem computational chemistry software with the FoX library and utilized by the Avogadro molecular editor for analysis and visualization. Results The NWChem computational chemistry software has been modified and coupled to the FoX library to write CML compliant XML data files. The FoX library was expanded to represent the lexical input files and molecular orbitals used by the computational chemistry software. Draft dictionary entries and a format for molecular orbitals within CML CompChem were developed. The Avogadro application was extended to read in CML data, and display molecular geometry and electronic structure in the GUI allowing for an end-to-end solution where Avogadro can create input structures, generate input files, NWChem can run the calculation and Avogadro can then read in and analyse the CML output produced. The developments outlined in this paper will be made available in future releases of NWChem, FoX, and Avogadro. Conclusions The production of CML compliant XML files for computational chemistry software such as NWChem can be accomplished relatively easily using the FoX library. The CML data can be read in by a newly developed reader in Avogadro and analysed or visualized in various ways. A community-based effort is needed to further develop the CML CompChem convention and dictionary. This will enable the long-term goal of allowing a researcher to run simple “Google-style” searches of chemistry and physics and have the results of computational calculations returned in a comprehensible form alongside articles from the published literature. PMID:23705910
A rapid parallelization of cone-beam projection and back-projection operator based on texture fetching interpolation

NASA Astrophysics Data System (ADS)

Xie, Lizhe; Hu, Yining; Chen, Yang; Shi, Luyao

2015-03-01

Projection and back-projection are the most computational consuming parts in Computed Tomography (CT) reconstruction. Parallelization strategies using GPU computing techniques have been introduced. We in this paper present a new parallelization scheme for both projection and back-projection. The proposed method is based on CUDA technology carried out by NVIDIA Corporation. Instead of build complex model, we aimed on optimizing the existing algorithm and make it suitable for CUDA implementation so as to gain fast computation speed. Besides making use of texture fetching operation which helps gain faster interpolation speed, we fixed sampling numbers in the computation of projection, to ensure the synchronization of blocks and threads, thus prevents the latency caused by inconsistent computation complexity. Experiment results have proven the computational efficiency and imaging quality of the proposed method.
Chemistry Division: Annual progress report for period ending March 31, 1987

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1987-08-01

This report is divided into the following sections: coal chemistry; aqueous chemistry at high temperatures and pressures; geochemistry of crustal processes to high temperatures and pressures; chemistry of advanced inorganic materials; structure and dynamics of advanced polymeric materials; chemistry of transuranium elements and compounds; separations chemistry; reactions and catalysis in molten salts; surface science related to heterogeneous catalysis; electron spectroscopy; chemistry related to nuclear waste disposal; computational modeling of security document printing; and special topics. (DLC)
Development of the Connected Chemistry as Formative Assessment Pedagogy for High School Chemistry Teaching

ERIC Educational Resources Information Center

Park, Mihwa; Liu, Xiufeng; Waight, Noemi

2017-01-01

This paper describes the development of Connected Chemistry as Formative Assessment (CCFA) pedagogy, which integrates three promising teaching and learning approaches, computer models, formative assessments, and learning progressions, to promote student understanding in chemistry. CCFA supports student learning in making connections among the…
Exploiting parallel computing with limited program changes using a network of microcomputers

NASA Technical Reports Server (NTRS)

Rogers, J. L., Jr.; Sobieszczanski-Sobieski, J.

1985-01-01

Network computing and multiprocessor computers are two discernible trends in parallel processing. The computational behavior of an iterative distributed process in which some subtasks are completed later than others because of an imbalance in computational requirements is of significant interest. The effects of asynchronus processing was studied. A small existing program was converted to perform finite element analysis by distributing substructure analysis over a network of four Apple IIe microcomputers connected to a shared disk, simulating a parallel computer. The substructure analysis uses an iterative, fully stressed, structural resizing procedure. A framework of beams divided into three substructures is used as the finite element model. The effects of asynchronous processing on the convergence of the design variables are determined by not resizing particular substructures on various iterations.
Massively parallel quantum computer simulator

NASA Astrophysics Data System (ADS)

De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

2007-01-01

We describe portable software to simulate universal quantum computers on massive parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as a IBM BlueGene/L, a IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, a SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as benchmark for testing high-end parallel computers.
Multithreaded Model for Dynamic Load Balancing Parallel Adaptive PDE Computations

NASA Technical Reports Server (NTRS)

Chrisochoides, Nikos

1995-01-01

We present a multithreaded model for the dynamic load-balancing of numerical, adaptive computations required for the solution of Partial Differential Equations (PDE's) on multiprocessors. Multithreading is used as a means of exploring concurrency in the processor level in order to tolerate synchronization costs inherent to traditional (non-threaded) parallel adaptive PDE solvers. Our preliminary analysis for parallel, adaptive PDE solvers indicates that multithreading can be used an a mechanism to mask overheads required for the dynamic balancing of processor workloads with computations required for the actual numerical solution of the PDE's. Also, multithreading can simplify the implementation of dynamic load-balancing algorithms, a task that is very difficult for traditional data parallel adaptive PDE computations. Unfortunately, multithreading does not always simplify program complexity, often makes code re-usability not an easy task, and increases software complexity.
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

NASA Astrophysics Data System (ADS)

Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

2016-07-01

Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
A sweep algorithm for massively parallel simulation of circuit-switched networks

NASA Technical Reports Server (NTRS)

Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.

1992-01-01

A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel IPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
Programming Probabilistic Structural Analysis for Parallel Processing Computer

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

1991-01-01

The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
Simulation of ozone production in a complex circulation region using nested grids

NASA Astrophysics Data System (ADS)

Taghavi, M.; Cautenet, S.; Foret, G.

2003-07-01

During ESCOMPTE precampaign (15 June to 10 July 2000), three days of intensive pollution (IOP0) have been observed and simulated. The comprehensive RAMS model, version 4.3, coupled online with a chemical module including 29 species, has been used to follow the chemistry of the zone polluted over southern France. This online method can be used because the code is paralleled and the SGI 3800 computer is very powerful. Two runs have been performed: run1 with one grid and run2 with two nested grids. The redistribution of simulated chemical species (ozone, carbon monoxide, sulphur dioxide and nitrogen oxides) was compared to aircraft measurements and surface stations. The 2-grid run has given substantially better results than the one-grid run only because the former takes the outer pollutants into account. This online method helps to explain dynamics and to retrieve the chemical species redistribution with a good agreement.
Simulation of ozone production in a complex circulation region using nested grids

NASA Astrophysics Data System (ADS)

Taghavi, M.; Cautenet, S.; Foret, G.

2004-06-01

During the ESCOMPTE precampaign (summer 2000, over Southern France), a 3-day period of intensive observation (IOP0), associated with ozone peaks, has been simulated. The comprehensive RAMS model, version 4.3, coupled on-line with a chemical module including 29 species, is used to follow the chemistry of the polluted zone. This efficient but time consuming method can be used because the code is installed on a parallel computer, the SGI 3800. Two runs are performed: run 1 with a single grid and run 2 with two nested grids. The simulated fields of ozone, carbon monoxide, nitrogen oxides and sulfur dioxide are compared with aircraft and surface station measurements. The 2-grid run looks substantially better than the run with one grid because the former takes the outer pollutants into account. This on-line method helps to satisfactorily retrieve the chemical species redistribution and to explain the impact of dynamics on this redistribution.
Quantifying chemical uncertainties in simulations of the ISM

NASA Astrophysics Data System (ADS)

Glover, Simon

2018-06-01

The ever-increasing power of large parallel computers now makes it possible to include increasingly sophisticated chemical models in three-dimensional simulations of the interstellar medium (ISM). This allows us to study the role that chemistry plays in the thermal balance of a realistically-structured, turbulent ISM, as well as enabling us to generated detailed synthetic observations of important atomic or molecular tracers. However, one major constraint on the accuracy of these models is the accuracy with which the input chemical rate coefficients are known. Uncertainties in these chemical rate coefficients inevitably introduce uncertainties into the model predictions. In this talk, I will review some of the methods we can use to quantify these uncertainties and to identify the key reactions where improved chemical data is most urgently required. I will also discuss a few examples, ranging from the local ISM to the high-redshift universe.
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data.

PubMed

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S; Windus, Theresa L; Dick-Perez, Marilu

2017-03-27

A newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides, important for metal extraction chemistry, are parametrized using ParFit. ParFit is in an open source program available for free on GitHub ( https://github.com/fzahari/ParFit ).
Terascale High-Fidelity Simulations of Turbulent Combustion with Detailed Chemistry: Spray Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rutland, Christopher J.

2009-04-26

The Terascale High-Fidelity Simulations of Turbulent Combustion (TSTC) project is a multi-university collaborative effort to develop a high-fidelity turbulent reacting flow simulation capability utilizing terascale, massively parallel computer technology. The main paradigm of the approach is direct numerical simulation (DNS) featuring the highest temporal and spatial accuracy, allowing quantitative observations of the fine-scale physics found in turbulent reacting flows as well as providing a useful tool for development of sub-models needed in device-level simulations. Under this component of the TSTC program the simulation code named S3D, developed and shared with coworkers at Sandia National Laboratories, has been enhanced with newmore » numerical algorithms and physical models to provide predictive capabilities for turbulent liquid fuel spray dynamics. Major accomplishments include improved fundamental understanding of mixing and auto-ignition in multi-phase turbulent reactant mixtures and turbulent fuel injection spray jets.« less
Numerical characteristics of quantum computer simulation

NASA Astrophysics Data System (ADS)

Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.

2016-12-01

The simulation of quantum circuits is significantly important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, thus the usage of modern high-performance parallel computations is relevant. As it is well known, arbitrary quantum computation in circuit model can be done by only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. We investigate the fact that the unique properties of quantum nature lead to the computational properties of the considered algorithms: the quantum parallelism make the simulation of quantum gates highly parallel, and on the other hand, quantum entanglement leads to the problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. Experimental part was made by using the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good base for the research and testing of the development methods for data intense parallel software, and considered methodology of the analysis can be successfully used for the improvement of the algorithms in quantum information science.

Parallel programming of industrial applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heroux, M; Koniges, A; Simon, H

1998-07-21

In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real applications codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from thesemore » applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN l-55860-54).« less
schwimmbad: A uniform interface to parallel processing pools in Python

NASA Astrophysics Data System (ADS)

Price-Whelan, Adrian M.; Foreman-Mackey, Daniel

2017-09-01

Many scientific and computing problems require doing some calculation on all elements of some data set. If the calculations can be executed in parallel (i.e. without any communication between calculations), these problems are said to be perfectly parallel. On computers with multiple processing cores, these tasks can be distributed and executed in parallel to greatly improve performance. A common paradigm for handling these distributed computing problems is to use a processing "pool": the "tasks" (the data) are passed in bulk to the pool, and the pool handles distributing the tasks to a number of worker processes when available. schwimmbad provides a uniform interface to parallel processing pools and enables switching easily between local development (e.g., serial processing or with multiprocessing) and deployment on a cluster or supercomputer (via, e.g., MPI or JobLib).
F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

NASA Technical Reports Server (NTRS)

DiNucci, David C.; Saini, Subhash (Technical Monitor)

1998-01-01

Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).
Processing data communications events by awakening threads in parallel active messaging interface of a parallel computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.

Processing data communications events in a parallel active messaging interface (`PAMI`) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for themore » context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.« less
A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

NASA Technical Reports Server (NTRS)

Shia, David; Mcmanus, Hugh L.

1995-01-01

A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.
NAS Requirements Checklist for Job Queuing/Scheduling Software

NASA Technical Reports Server (NTRS)

Jones, James Patton

1996-01-01

The increasing reliability of parallel systems and clusters of computers has resulted in these systems becoming more attractive for true production workloads. Today, the primary obstacle to production use of clusters of computers is the lack of a functional and robust Job Management System for parallel applications. This document provides a checklist of NAS requirements for job queuing and scheduling in order to make most efficient use of parallel systems and clusters for parallel applications. Future requirements are also identified to assist software vendors with design planning.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

1993-01-01

A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
Writing and Computing across the USM Chemistry Curriculum

NASA Astrophysics Data System (ADS)

Gordon, Nancy R.; Newton, Thomas A.; Rhodes, Gale; Ricci, John S.; Stebbins, Richard G.; Tracy, Henry J.

2001-01-01

The faculty of the University of Southern Maine believes the ability to communicate effectively is one of the most important skills required of successful chemists. To help students achieve that goal, the faculty has developed a Writing and Computer Program consisting of writing and computer assignments of gradually increasing sophistication for all our laboratory courses. The assignments build in complexity until, at the junior level, students are writing full journal-quality laboratory reports. Computer assignments also increase in difficulty as students attack more complicated subjects. We have found the program easy to initiate and our part-time faculty concurs as well. The Writing and Computing across the Curriculum Program also serves to unite the entire chemistry curriculum. We believe the program is helping to reverse what the USM chemistry faculty and other educators have found to be a steady deterioration in the writing skills of many of today's students.
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.

1999-10-14

Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact fines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-staticmore » solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Other issues discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work among a SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel coquting environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed ups for fixed problem size, a class of problems of immediate practical importance.« less
Transitioning NWChem to the Next Generation of Manycore Machines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bylaska, Eric J.; Apra, Edoardo; Kowalski, Karol

The NorthWest Chemistry (NWChem) modeling software is a popular molecular chemistry simulation software that was designed from the start to work on massively parallel processing supercomputers[6, 28, 49]. It contains an umbrella of modules that today includes Self Consistent Field (SCF), second order Mller-Plesset perturbation theory (MP2), Coupled Cluster, multi-conguration selfconsistent eld (MCSCF), selected conguration interaction (CI), tensor contraction engine (TCE) many body methods, density functional theory (DFT), time-dependent density functional theory (TDDFT), real time time-dependent density functional theory, pseudopotential plane-wave density functional theory (PSPW), band structure (BAND), ab initio molecular dynamics, Car-Parrinello molecular dynamics, classical molecular dynamics (MD), QM/MM,more » AIMD/MM, GIAO NMR, COSMO, COSMO-SMD, and RISM solvation models, free energy simulations, reaction path optimization, parallel in time, among other capabilities[ 22]. Moreover new capabilities continue to be added with each new release.« less
Biennial Conference on Chemical Education: Abstracts of Papers (9th, Bozeman, Montana, July 27-August 2, 1986).

ERIC Educational Resources Information Center

1986

This document includes summaries of conference presentations dealing with a wide variety of topics, including chemistry units for the elementary classroom, science experimentation in the secondary school, computer simulations, computer interfaces, videodisc technology, correspondence teaching of general chemistry, interdisciplinary energy courses,…
Using Computer Simulations in Chemistry Problem Solving

ERIC Educational Resources Information Center

Avramiotis, Spyridon; Tsaparlis, Georgios

2013-01-01

This study is concerned with the effects of computer simulations of two novel chemistry problems on the problem solving ability of students. A control-experimental group, equalized by pair groups (n[subscript Exp] = n[subscript Ctrl] = 78), research design was used. The students had no previous experience of chemical practical work. Student…
RESEARCH STRATEGIES FOR THE APPLICATION OF THE TECHNIQUES OF COMPUTATIONAL BIOLOGICAL CHEMISTRY TO ENVIRONMENTAL PROBLEMS

EPA Science Inventory

On October 25 and 26, 1984, the U.S. EPA sponsored a workshop to consider the potential applications of the techniques of computational biological chemistry to problems in environmental health. Eleven extramural scientists from the various related disciplines and a similar number...
PREDICTING CHEMICAL REACTIVITY OF HUMIC SUBSTANCES FOR MINERALS AND XENOBIOTICS: USE OF COMPUTATIONAL CHEMISTRY, SCANNING PROBE MICROSCOPY AND VIRTUAL REALITY

EPA Science Inventory

In this chapter we review the literature on scanning probe microscopy (SPM), virtual reality (VR), and computational chemistry and our earlier work dealing with modeling lignin, lignin-carbohydrate complexes (LCC), humic substances (HSs) and non-bonded organo-mineral interactions...
Computational Chemistry Studies on the Carbene Hydroxymethylene

ERIC Educational Resources Information Center

Marzzacco, Charles J.; Baum, J. Clayton

2011-01-01

A density functional theory computational chemistry exercise on the structure and vibrational spectrum of the carbene hydroxymethylene is presented. The potential energy curve for the decomposition reaction of the carbene to formaldehyde and the geometry of the transition state are explored. The results are in good agreement with recent…
Dissociation of the Ethyl Radical: An Exercise in Computational Chemistry

ERIC Educational Resources Information Center

Nassabeh, Nahal; Tran, Mark; Fleming, Patrick E.

2014-01-01

A set of exercises for use in a typical physical chemistry laboratory course are described, modeling the unimolecular dissociation of the ethyl radical to form ethylene and atomic hydrogen. Students analyze the computational results both qualitatively and quantitatively. Qualitative structural changes are compared to approximate predicted values…
Partitioning and packing mathematical simulation models for calculation on parallel computers

NASA Technical Reports Server (NTRS)

Arpasi, D. J.; Milner, E. J.

1986-01-01

The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.
Parallel/distributed direct method for solving linear systems

NASA Technical Reports Server (NTRS)

Lin, Avi

1990-01-01

A new family of parallel schemes for directly solving linear systems is presented and analyzed. It is shown that these schemes exhibit a near optimal performance and enjoy several important features: (1) For large enough linear systems, the design of the appropriate paralleled algorithm is insensitive to the number of processors as its performance grows monotonically with them; (2) It is especially good for large matrices, with dimensions large relative to the number of processors in the system; (3) It can be used in both distributed parallel computing environments and tightly coupled parallel computing systems; and (4) This set of algorithms can be mapped onto any parallel architecture without any major programming difficulties or algorithmical changes.
Vectorization for Molecular Dynamics on Intel Xeon Phi Corpocessors

NASA Astrophysics Data System (ADS)

Yi, Hongsuk

2014-03-01

Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512 bit vector registers for the high performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Terfoff potentials for covalent bond solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism computing. We combine thread-level parallelism using a tightly coupled thread-level and task-level parallelism with 512-bit vector register. The simulation results show that the parallel performance of SIMD implementations on Xeon Phi is apparently superior to their x86 CPU architecture.
Parallel and pipeline computation of fast unitary transforms

NASA Technical Reports Server (NTRS)

Fino, B. J.; Algazi, V. R.

1975-01-01

The letter discusses the parallel and pipeline organization of fast-unitary-transform algorithms such as the fast Fourier transform, and points out the efficiency of a combined parallel-pipeline processor of a transform such as the Haar transform, in which (2 to the n-th power) -1 hardware 'butterflies' generate a transform of order 2 to the n-th power every computation cycle.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.