parallel program written: Topics by Science.gov

Sample records for parallel program written

Application Portable Parallel Library

NASA Technical Reports Server (NTRS)

Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

1995-01-01

Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also includes heterogeneous collection of networked computers.) Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Adapting high-level language programs for parallel processing using data flow

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1988-01-01

EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.
PyPele Rewritten To Use MPI

NASA Technical Reports Server (NTRS)

Hockney, George; Lee, Seungwon

2008-01-01

A computer program known as PyPele, originally written as a Python-language extension module of a C++ language program, has been rewritten in pure Python language. The original version of PyPele dispatches and coordinates parallel-processing tasks on cluster computers and provides a conceptual framework for spacecraft-mission-design and -analysis software tools to run in an embarrassingly parallel mode. The original version of PyPele uses SSH (Secure Shell, a set of standards and an associated network protocol for establishing a secure channel between a local and a remote computer) to coordinate parallel processing. Instead of SSH, the present Python version of PyPele uses Message Passing Interface (MPI) [an unofficial de-facto standard language-independent application programming interface for message-passing on a parallel computer] while keeping the same user interface. The use of MPI instead of SSH and the preservation of the original PyPele user interface make it possible for parallel application programs written previously for the original version of PyPele to run on MPI-based cluster computers. As a result, engineers using the previously written application programs can take advantage of embarrassing parallelism without need to rewrite those programs.
Efficient partitioning and assignment on programs for multiprocessor execution

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1993-01-01

The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and the assignment of program elements are of great importance since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of the parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures with a shared memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large-scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for the approximation of the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product form solutions.
Parallel computer vision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Uhr, L.

1987-01-01

This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
Program For Parallel Discrete-Event Simulation

NASA Technical Reports Server (NTRS)

Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.

1991-01-01

User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing.

PubMed

Takeuchi, Toshiki; Yamada, Atsuo; Aoki, Takashi; Nishimura, Kunihiro

2016-01-01

Next-generation sequencing can determine DNA bases and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format and the compressed binary version (BAM) of it. SAMtools is a typical tool for dealing with files in the SAM/BAM format. SAMtools has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and can execute fast. However, SAMtools requires an additional implementation to be used in parallel with, for example, OpenMP (Open Multi-Processing) libraries. For the accumulation of next-generation sequencing data, a simple parallelization program, which can support cloud and PC cluster environments, is required. We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run in a Java runtime environment (e.g., Windows, Linux, Mac OS X) with Clojure. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.
Dockres: a computer program that analyzes the output of virtual screening of small molecules

PubMed Central

2010-01-01

Background: This paper describes a computer program named Dockres that is designed to analyze and summarize results of virtual screening of small molecules. The program is supplemented with utilities that support the screening process. Foremost among these utilities are scripts that run the virtual screening of a chemical library on a large number of processors in parallel. Methods: Dockres and some of its supporting utilities are written in Fortran-77; other utilities are written as C-shell scripts. They support the parallel execution of the screening. The current implementation of the program handles virtual screening with Autodock-3 and Autodock-4, but can be extended to work with the output of other programs. Results: Analysis of virtual screening by Dockres led to both active and selective lead compounds. Conclusions: Analysis of virtual screening was facilitated and enhanced by Dockres in both the authors' laboratories as well as laboratories elsewhere. PMID:20205801
Concurrency-based approaches to parallel programming

NASA Technical Reports Server (NTRS)

Kale, L.V.; Chrisochoides, N.; Kohl, J.; Yelick, K.

1995-01-01

The inevitable transition to parallel programming can be facilitated by appropriate tools, including languages and libraries. After describing the needs of applications developers, this paper presents three specific approaches aimed at development of efficient and reusable parallel software for irregular and dynamic-structured problems. A salient feature of all three approaches is their exploitation of concurrency within a processor. Benefits of individual approaches such as these can be leveraged by an interoperability environment which permits modules written using different approaches to co-exist in single applications.
IOPA: I/O-aware parallelism adaption for parallel programs

PubMed Central

Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei

2017-01-01

With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads. PMID:28278236
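The trade-off described above, where too many I/O threads saturate the device and too few leave it idle, can be illustrated with a simple hill-climbing controller. This is a minimal sketch under stated assumptions, not IOPA's actual algorithm or programming interface: `tune_threads`, the `measure` callback, and the synthetic throughput model are all hypothetical names introduced here for illustration.

```python
def tune_threads(measure, start=1, max_threads=64):
    """Hill-climb on thread count: keep adding threads while throughput improves.

    `measure(n)` is a hypothetical callback returning the observed throughput
    (for example in MB/s) when n I/O threads are active.
    """
    best_n = start
    best_tp = measure(start)
    n = start
    while n < max_threads:
        n += 1
        tp = measure(n)
        if tp <= best_tp:
            # No improvement: the I/O subsystem is saturated, stop growing.
            break
        best_n, best_tp = n, tp
    return best_n, best_tp


def synthetic_throughput(n, per_thread=100.0, device_limit=450.0, contention=15.0):
    """Assumed toy model: linear scaling up to the device limit, then a
    contention penalty for every thread beyond the saturation point."""
    ideal = per_thread * n
    if ideal <= device_limit:
        return ideal
    saturating = device_limit / per_thread  # threads needed to saturate the device
    return device_limit - contention * (n - saturating)


if __name__ == "__main__":
    print(tune_threads(synthetic_throughput))
```

In a real application `measure` would time actual reads or writes over a short window, and the controller would be re-run periodically because the best thread count differs between solid state and hard disk drives.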
IOPA: I/O-aware parallelism adaption for parallel programs.

PubMed

Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei

2017-01-01

With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads.
Transient Finite Element Computations on a Variable Transputer System

NASA Technical Reports Server (NTRS)

Smolinski, Patrick J.; Lapczyk, Ireneusz

1993-01-01

A parallel program to analyze transient finite element problems was written and implemented on a system of transputer processors. The program uses the explicit time integration algorithm which eliminates the need for equation solving, making it more suitable for parallel computations. An interprocessor communication scheme was developed for arbitrary two dimensional grid processor configurations. Several 3-D problems were analyzed on a system with a small number of processors.
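Why explicit time integration avoids equation solving can be seen in a small sketch. The code below is an illustration under assumptions, not the authors' program: it uses a 1-D chain of equal springs with a lumped (diagonal) mass, and the function names are hypothetical. With a diagonal mass, each acceleration is a plain division, and each node needs only its neighbours' displacements, which is what makes the scheme easy to split across processors.

```python
def internal_force(u, k):
    """Restoring force on each node of a 1-D spring chain fixed at the left end."""
    n = len(u)
    f = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0  # the wall sits at displacement 0
        f[i] -= k * (u[i] - left)
        if i + 1 < n:
            f[i] -= k * (u[i] - u[i + 1])
    return f


def explicit_step(u, v, m, k, dt):
    """One step of the central-difference scheme in velocity-Verlet form.

    The mass is lumped (one value per node), so a = f / m is element-wise
    and no linear system is solved.
    """
    a = [fi / mi for fi, mi in zip(internal_force(u, k), m)]
    v_half = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
    u_new = [ui + dt * vh for ui, vh in zip(u, v_half)]
    a_new = [fi / mi for fi, mi in zip(internal_force(u_new, k), m)]
    v_new = [vh + 0.5 * dt * an for vh, an in zip(v_half, a_new)]
    return u_new, v_new


def integrate(u, v, m, k, dt, steps):
    """Advance the chain by `steps` explicit time steps."""
    for _ in range(steps):
        u, v = explicit_step(u, v, m, k, dt)
    return u, v
```

The scheme is only conditionally stable, so `dt` has to stay below a limit set by the highest natural frequency of the mesh; that restriction is the price paid for skipping the solve.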
Program Correctness, Verification and Testing for Exascale (Corvette)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sen, Koushik; Iancu, Costin; Demmel, James W

The goal of this project is to provide tools to assess the correctness of parallel programs written using hybrid parallelism. There is a dire lack of both theoretical and engineering know-how in the area of finding bugs in hybrid or large scale parallel programs, which our research aims to change. In the project we have demonstrated novel approaches in several areas: 1. Low overhead automated and precise detection of concurrency bugs at scale. 2. Using low overhead bug detection tools to guide speculative program transformations for performance. 3. Techniques to reduce the concurrency required to reproduce a bug using partial program restart/replay. 4. Techniques to provide reproducible execution of floating point programs. 5. Techniques for tuning the floating point precision used in codes.
Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

NASA Technical Reports Server (NTRS)

Weeks, Cindy Lou

1986-01-01

Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
The paradigm compiler: Mapping a functional language for the connection machine

NASA Technical Reports Server (NTRS)

Dennis, Jack B.

1989-01-01

The Paradigm Compiler implements a new approach to compiling programs written in high level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.
Nemesis I: Parallel Enhancements to ExodusII

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hennigan, Gary L.; John, Matthew S.; Shadid, John N.

2006-03-28

NEMESIS I is an enhancement to the EXODUS II finite element database model used to store and retrieve data for unstructured parallel finite element analyses. NEMESIS I adds data structures which facilitate the partitioning of a scalar (standard serial) EXODUS II file onto parallel disk systems found on many parallel computers. Since the NEMESIS I application programming interface (API) can be used to append information to an existing EXODUS II database, any existing software that reads EXODUS II files can be used on files which contain NEMESIS I information. The NEMESIS I information is written and read via C or C++ callable functions which comprise the NEMESIS I API.
Parallel Ada benchmarks for the SVMS

NASA Technical Reports Server (NTRS)

Collard, Philippe E.

1990-01-01

The use of the parallel processing paradigm to design and develop faster and more reliable computers appears to clearly mark the future of information processing. NASA started the development of such an architecture: the Spaceborne VHSIC Multi-processor System (SVMS). Ada will be one of the languages used to program the SVMS. One of the unique characteristics of Ada is that it supports parallel processing at the language level through the tasking constructs. It is important for the SVMS project team to assess how efficiently the SVMS architecture will be implemented, as well as how efficiently the Ada environment will be ported to the SVMS. AUTOCLASS II, a Bayesian classifier written in Common Lisp, was selected as one of the benchmarks for SVMS configurations. The purpose of the R and D effort was to provide the SVMS project team with a version of AUTOCLASS II, written in Ada, that would make use of Ada tasking constructs as much as possible so as to constitute a suitable benchmark. Additionally, a set of programs was developed that would measure Ada tasking efficiency on parallel architectures as well as determine the critical parameters influencing tasking efficiency. All this was designed to provide the SVMS project team with a set of suitable tools in the development of the SVMS architecture.
Massively parallel sparse matrix function calculations with NTPoly

NASA Astrophysics Data System (ADS)

Dawson, William; Nakajima, Takahito

2018-04-01

We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
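The idea of computing a matrix function from a polynomial expansion built on sparse multiplication can be sketched in a few lines. This is an illustration only and does not use NTPoly's interface: the dictionary storage, the function names, the drop threshold, and the choice of a truncated Taylor series for the exponential are all assumptions made for brevity.

```python
def sparse_matmul(a, b, threshold=1e-12):
    """Multiply sparse matrices stored as {(row, col): value}.

    Entries whose magnitude falls below `threshold` are dropped so the
    product stays sparse.
    """
    b_rows = {}
    for (k, j), v in b.items():
        b_rows.setdefault(k, []).append((j, v))
    out = {}
    for (i, k), av in a.items():
        for j, bv in b_rows.get(k, ()):
            out[(i, j)] = out.get((i, j), 0.0) + av * bv
    return {ij: v for ij, v in out.items() if abs(v) > threshold}


def sparse_exp(a, n, terms=20):
    """Truncated Taylor series exp(A) ~ sum over k of A^k / k! for an n x n matrix."""
    result = {(i, i): 1.0 for i in range(n)}  # start from the identity
    term = dict(result)
    for k in range(1, terms):
        # term holds A^(k-1) / (k-1)!; one multiply and one scale give A^k / k!.
        term = {ij: v / k for ij, v in sparse_matmul(term, a).items()}
        if not term:
            break  # every remaining term is zero
        for ij, v in term.items():
            result[ij] = result.get(ij, 0.0) + v
    return result
```

The cost is dominated by the repeated sparse products, which is why a communication-avoiding multiplication routine matters in a distributed setting. Thresholding keeps the work proportional to the number of significant entries rather than to n squared.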
Time Warp Operating System, Version 2.5.1

NASA Technical Reports Server (NTRS)

Bellenot, Steven F.; Gieselman, John S.; Hawley, Lawrence R.; Peterson, Judy; Presley, Matthew T.; Reiher, Peter L.; Springer, Paul L.; Tupman, John R.; Wedel, John J., Jr.; Wieland, Frederick P.

1993-01-01

Time Warp Operating System, TWOS, is special purpose computer program designed to support parallel simulation of discrete events. Complete implementation of Time Warp software mechanism, which implements distributed protocol for virtual synchronization based on rollback of processes and annihilation of messages. Supports simulations and other computations in which both virtual time and dynamic load balancing used. Program utilizes underlying resources of operating system. Written in C programming language.
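The rollback idea behind the Time Warp mechanism can be shown with a toy logical process. This is a minimal sketch, not TWOS code: the class and its fields are hypothetical, the state is a single integer, and annihilation of already-sent messages (anti-messages) is left out entirely.

```python
class LogicalProcess:
    """Toy optimistic process that executes events as they arrive and rolls
    back when an event turns up with a timestamp in its past."""

    def __init__(self):
        self.lvt = 0          # local virtual time
        self.state = 0
        self.processed = []   # (timestamp, value) in execution order
        self.snapshots = []   # (lvt, state) saved before each event
        self.rollbacks = 0

    def receive(self, timestamp, value):
        redo = []
        if timestamp < self.lvt:
            # Straggler: undo every event with a later timestamp by
            # restoring the state saved just before it ran.
            self.rollbacks += 1
            while self.processed and self.processed[-1][0] > timestamp:
                redo.append(self.processed.pop())
                self.lvt, self.state = self.snapshots.pop()
        self._execute(timestamp, value)
        for ts, val in reversed(redo):
            # Re-execute the undone events in timestamp order.
            self._execute(ts, val)

    def _execute(self, timestamp, value):
        self.snapshots.append((self.lvt, self.state))
        # Deliberately order-dependent so a wrong order gives a wrong state.
        self.state = self.state * 2 + value
        self.lvt = timestamp
        self.processed.append((timestamp, value))
```

Delivering events with timestamps 10, 30, 20 triggers one rollback and leaves the process in the same state as delivering them in order 10, 20, 30. The user's event handler needs no synchronization logic of its own, because the saved snapshots do that work.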

Array distribution in data-parallel programs

NASA Technical Reports Server (NTRS)

Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Sheffler, Thomas J.

1994-01-01

We consider distribution at compile time of the array data in a distributed-memory implementation of a data-parallel program written in a language like Fortran 90. We allow dynamic redistribution of data and define a heuristic algorithmic framework that chooses distribution parameters to minimize an estimate of program completion time. We represent the program as an alignment-distribution graph. We propose a divide-and-conquer algorithm for distribution that initially assigns a common distribution to each node of the graph and successively refines this assignment, taking computation, realignment, and redistribution costs into account. We explain how to estimate the effect of distribution on computation cost and how to choose a candidate set of distributions. We present the results of an implementation of our algorithms on several test problems.
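How the choice of distribution changes communication cost can be seen with a toy estimate. The sketch below is not the paper's alignment-distribution graph algorithm; it assumes a one-dimensional array, a nearest-neighbour access pattern `a[i] = f(a[i-1])`, and hypothetical helper names, and it simply counts the reads that cross a processor boundary.

```python
def block_owner(i, n, p):
    """Owner of element i when n elements are split into p contiguous blocks."""
    block = -(-n // p)  # ceiling of n / p
    return i // block


def cyclic_owner(i, p):
    """Owner of element i when elements are dealt out round-robin."""
    return i % p


def remote_reads(n, owner):
    """Count reads of a[i-1] that live on a different processor than a[i]."""
    return sum(1 for i in range(1, n) if owner(i) != owner(i - 1))


if __name__ == "__main__":
    n, p = 16, 4
    print("block :", remote_reads(n, lambda i: block_owner(i, n, p)))
    print("cyclic:", remote_reads(n, lambda i: cyclic_owner(i, p)))
```

For this access pattern a block distribution communicates only at block edges, while a cyclic distribution communicates on almost every element. A cyclic layout can still win when the work per element is uneven, which is why a heuristic has to weigh computation cost against realignment and redistribution cost.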

COMP Superscalar, an interoperable programming framework

NASA Astrophysics Data System (ADS)

Badia, Rosa M.; Conejero, Javier; Diaz, Carlos; Ejarque, Jorge; Lezzi, Daniele; Lordan, Francesc; Ramon-Cortes, Cristian; Sirvent, Raul

2015-12-01

COMPSs is a programming framework that aims to facilitate the parallelization of existing applications written in Java, C/C++ and Python scripts. For that purpose, it offers a simple programming model based on sequential development in which the user is mainly responsible for (i) identifying the functions to be executed as asynchronous parallel tasks and (ii) annotating them with annotations or standard Python decorators. A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and spawning these tasks to the available resources, which can be nodes in a cluster, clouds or grids. In cloud environments, COMPSs provides scalability and elasticity features allowing the dynamic provision of resources.
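The programming model described above, in which sequential-looking code has selected functions run as asynchronous tasks and a runtime enforces data dependencies, can be imitated with the Python standard library. This sketch does not use the COMPSs API: the `task` decorator below is a hypothetical stand-in built on `concurrent.futures`, and it runs tasks on local threads rather than on cluster, cloud, or grid resources.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

_pool = ThreadPoolExecutor(max_workers=4)


def task(fn):
    """Hypothetical decorator: calling the function returns a Future at once."""
    @wraps(fn)
    def submit(*args):
        def run():
            # An argument that is itself a Future is a data dependency:
            # wait for it before running the task body.
            ready = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*ready)
        return _pool.submit(run)
    return submit


@task
def square(x):
    return x * x


@task
def add(a, b):
    return a + b


def main():
    # Reads like sequential code. The two squares may run concurrently,
    # and add starts its work only once both inputs are available.
    total = add(square(3), square(4))
    return total.result()


if __name__ == "__main__":
    print(main())
```

Because tasks are submitted in program order and the pool starts them first-in first-out, a task only ever waits on work that was queued before it, so this toy runtime does not deadlock on dependency chains.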
PCSIM: A Parallel Simulation Environment for Neural Circuits Fully Integrated with Python

PubMed Central

Pecevski, Dejan; Natschläger, Thomas; Schuch, Klaus

2008-01-01

The Parallel Circuit SIMulator (PCSIM) is a software package for simulation of neural circuits. It is primarily designed for distributed simulation of large scale networks of spiking point neurons. Although its computational core is written in C++, PCSIM's primary interface is implemented in the Python programming language, which is a powerful programming environment and allows the user to easily integrate the neural circuit simulator with data analysis and visualization tools to manage the full neural modeling life cycle. The main focus of this paper is to describe PCSIM's full integration into Python and the benefits thereof. In particular we will investigate how the automatically generated bidirectional interface and PCSIM's object-oriented modular framework enable the user to adopt a hybrid modeling approach: using and extending PCSIM's functionality either employing pure Python or C++ and thus combining the advantages of both worlds. Furthermore, we describe several supplementary PCSIM packages written in pure Python and tailored towards setting up and analyzing neural simulations. PMID:19543450
A portable MPI-based parallel vector template library

NASA Technical Reports Server (NTRS)

Sheffler, Thomas J.

1995-01-01

This paper discusses the design and implementation of a polymorphic collection library for distributed address-space parallel computers. The library provides a data-parallel programming model for C++ by providing three main components: a single generic collection class, generic algorithms over collections, and generic algebraic combining functions. Collection elements are the fourth component of a program written using the library and may be either of the built-in types of C or of user-defined types. Many ideas are borrowed from the Standard Template Library (STL) of C++, although a restricted programming model is proposed because of the distributed address-space memory model assumed. Whereas the STL provides standard collections and implementations of algorithms for uniprocessors, this paper advocates standardizing interfaces that may be customized for different parallel computers. Just as the STL attempts to increase programmer productivity through code reuse, a similar standard for parallel computers could provide programmers with a standard set of algorithms portable across many different architectures. The efficacy of this approach is verified by examining performance data collected from an initial implementation of the library running on an IBM SP-2 and an Intel Paragon.
A Portable MPI-Based Parallel Vector Template Library

NASA Technical Reports Server (NTRS)

Sheffler, Thomas J.

1995-01-01

This paper discusses the design and implementation of a polymorphic collection library for distributed address-space parallel computers. The library provides a data-parallel programming model for C++ by providing three main components: a single generic collection class, generic algorithms over collections, and generic algebraic combining functions. Collection elements are the fourth component of a program written using the library and may be either of the built-in types of C or of user-defined types. Many ideas are borrowed from the Standard Template Library (STL) of C++, although a restricted programming model is proposed because of the distributed address-space memory model assumed. Whereas the STL provides standard collections and implementations of algorithms for uniprocessors, this paper advocates standardizing interfaces that may be customized for different parallel computers. Just as the STL attempts to increase programmer productivity through code reuse, a similar standard for parallel computers could provide programmers with a standard set of algorithms portable across many different architectures. The efficacy of this approach is verified by examining performance data collected from an initial implementation of the library running on an IBM SP-2 and an Intel Paragon.
Parallel computation and the basis system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, G.R.

1993-05-01

A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to-use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communications costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis and Parallel Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.
Algorithms and programming tools for image processing on the MPP, part 2

NASA Technical Reports Server (NTRS)

Reeves, Anthony P.

1986-01-01

A number of algorithms were developed for image warping and pyramid image filtering. Techniques were investigated for the parallel processing of a large number of independent irregular shaped regions on the MPP. In addition some utilities for dealing with very long vectors and for sorting were developed. Documentation pages for the algorithms which are available for distribution are given. The performance of the MPP for a number of basic data manipulations was determined. From these results it is possible to predict the efficiency of the MPP for a number of algorithms and applications. The Parallel Pascal development system, which is a portable programming environment for the MPP, was improved and better documentation including a tutorial was written. This environment allows programs for the MPP to be developed on any conventional computer system; it consists of a set of system programs and a library of general purpose Parallel Pascal functions. The algorithms were tested on the MPP and a presentation on the development system was made to the MPP users group. The UNIX version of the Parallel Pascal System was distributed to a number of new sites.
Developing Information Power Grid Based Algorithms and Software

NASA Technical Reports Server (NTRS)

Dongarra, Jack

1998-01-01

This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.
Quasi-one-dimensional compressible flow across face seals and narrow slots. 2: Computer program

NASA Technical Reports Server (NTRS)

Zuk, J.; Smith, P. J.

1972-01-01

A computer program is presented for compressible fluid flow with friction across face seals and through narrow slots. The computer program carries out a quasi-one-dimensional flow analysis which is valid for laminar and turbulent flows under both subsonic and choked flow conditions for parallel surfaces. The program is written in FORTRAN IV. The input and output variables are in either the International System of Units (SI) or the U.S. customary system.
Automated Performance Prediction of Message-Passing Parallel Programs

NASA Technical Reports Server (NTRS)

Block, Robert J.; Sarukkai, Sekhar; Mehra, Pankaj; Woodrow, Thomas S. (Technical Monitor)

1995-01-01

The increasing use of massively parallel supercomputers to solve large-scale scientific problems has generated a need for tools that can predict scalability trends of applications written for these machines. Much work has been done to create simple models that represent important characteristics of parallel programs, such as latency, network contention, and communication volume. But many of these methods still require substantial manual effort to represent an application in the model's format. The MK toolkit described in this paper is the result of an on-going effort to automate the formation of analytic expressions of program execution time, with a minimum of programmer assistance. In this paper we demonstrate the feasibility of our approach, by extending previous work to detect and model communication patterns automatically, with and without overlapped computations. The predictions derived from these models agree, within reasonable limits, with execution times of programs measured on the Intel iPSC/860 and Paragon. Further, we demonstrate the use of MK in selecting optimal computational grain size and studying various scalability metrics.
Parallel computation and the Basis system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, G.R.

1992-12-16

A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to-use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communication costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis and Parallel Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.
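The master-and-workers pattern with domain decomposition and message passing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PROTOPAR code: it uses Python threads and queues in place of PVM, and the names `decompose`, `worker`, and `master` are hypothetical.

```python
import queue
import threading


def decompose(n_cells, n_workers):
    """Split range(n_cells) into contiguous, near-equal subdomains."""
    base, extra = divmod(n_cells, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def worker(tasks, results, field):
    """Receive subdomain messages until a None sentinel arrives."""
    while True:
        msg = tasks.get()
        if msg is None:
            break
        start, stop = msg
        # The "work" here is a partial sum over the assigned subdomain.
        results.put((start, sum(field[start:stop])))


def master(field, n_workers=4):
    """Send one subdomain to each worker and combine the replies."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results, field))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    bounds = decompose(len(field), n_workers)
    for b in bounds:
        tasks.put(b)
    for _ in threads:
        tasks.put(None)  # one shutdown message per worker
    partials = [results.get() for _ in bounds]
    for t in threads:
        t.join()
    return sum(value for _, value in partials)


if __name__ == "__main__":
    print(master(list(range(100))))
```

Keeping all communication inside `master` and `worker` mirrors the idea of encapsulating library details so that the transport could be swapped for another message-passing layer.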
Portable programming on parallel/networked computers using the Application Portable Parallel Library (APPL)

NASA Technical Reports Server (NTRS)

Quealy, Angela; Cole, Gary L.; Blech, Richard A.

1993-01-01

The Application Portable Parallel Library (APPL) is a subroutine-based library of communication primitives that is callable from applications written in FORTRAN or C. APPL provides a consistent programmer interface to a variety of distributed and shared-memory multiprocessor MIMD machines. The objective of APPL is to minimize the effort required to move parallel applications from one machine to another, or to a network of homogeneous machines. APPL encompasses many of the message-passing primitives that are currently available on commercial multiprocessor systems. This paper describes APPL (version 2.3.1) and its usage, reports the status of the APPL project, and indicates possible directions for the future. Several applications using APPL are discussed, as well as performance and overhead results.
IMa2p - Parallel MCMC and inference of ancient demography under the Isolation with Migration (IM) model

PubMed Central

Sethuraman, Arun; Hey, Jody

2015-01-01

IMa2 and related programs are used to study the divergence of closely related species and of populations within species. These methods are based on the sampling of genealogies using MCMC, and they can proceed quite slowly for larger data sets. We describe a parallel implementation, called IMa2p, that provides a nearly linear increase in genealogy sampling rate with the number of processors in use. IMa2p is written in OpenMPI and C++, and scales well for demographic analyses of a large number of loci and populations, which are difficult to study using the serial version of the program. PMID:26059786
Rubus: A compiler for seamless and extensible parallelism.

PubMed

Adnan, Muhammad; Aslam, Faisal; Nawaz, Zubair; Sarwar, Syed Mansoor

2017-01-01

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. 
By contrast, for a matrix multiplication benchmark, Rubus achieved an average execution speedup of 84 times on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program.
Rubus: A compiler for seamless and extensible parallelism

PubMed Central

Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

2017-01-01

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer’s expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. 
By contrast, for a matrix multiplication benchmark, Rubus achieved an average execution speedup of 84 times on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program. PMID:29211758
SIMOGEN - An Object-Oriented Language for Simulation

DTIC Science & Technology

1989-03-01

program generator must also be written in the same programming language. In this case, the C language was chosen, for the following main reasons...3), March 88. 4. PRESTO: A System for Object-Oriented Parallel Programming B N Bershad, E D Lazowska & H M Levy Software Practice and Experience, Vol...U.S. Department of Defence ANSI/MIL-STD 1815A. 7. Object-oriented Development Grady Booch Transactions on Software Engineering, February 86. 8. A
ng: What next-generation languages can teach us about HENP frameworks in the manycore era

NASA Astrophysics Data System (ADS)

Binet, Sébastien

2011-12-01

Current High Energy and Nuclear Physics (HENP) frameworks were written before multicore systems became widely deployed. A 'single-thread' execution model naturally emerged from that environment; however, this no longer fits the processing model at the dawn of the manycore era. Although previous work focused on minimizing the changes to be applied to the LHC frameworks (because of the data taking phase) while still trying to reap the benefits of the parallel-enhanced CPU architectures, this paper explores what new languages could bring to the design of the next-generation frameworks. Parallel programming is still in an intensive phase of R&D and no silver bullet exists despite the 30+ years of literature on the subject. Yet, several parallel programming styles have emerged: actors, message passing, communicating sequential processes, task-based programming, and data flow programming, to name a few. We present the prototyping of a next-generation framework in new and expressive languages (Python and Go) to investigate how code clarity and robustness are affected and what the downsides are of using languages younger than FORTRAN/C/C++.
Computer program for the load and trajectory analysis of two DOF bodies connected by an elastic tether: Users manual

NASA Technical Reports Server (NTRS)

Doyle, G. R., Jr.; Burbick, J. W.

1973-01-01

The derivation of the differential equations of motion of a 3 Degrees of Freedom body joined to a 3 Degrees of Freedom body by an elastic tether is presented. The tether is represented by a spring and dashpot in parallel. A computer program which integrates the equations of motion is also described. Although the derivation of the equations of motion is for a general system, the computer program is written for defining loads in large boosters recovered by parachutes.
Toroidal transformer design program with application to inverter circuitry

NASA Technical Reports Server (NTRS)

Dayton, J. A., Jr.

1972-01-01

Estimates of temperature, weight, efficiency, regulation, and final dimensions are included in the output of the computer program for the design of transformers for use in the basic parallel inverter. The program, written in FORTRAN 4, selects a tape wound toroidal magnetic core and, taking temperature, materials, core geometry, skin depth, and ohmic losses into account, chooses the appropriate wire sizes and number of turns for the center tapped primary and single secondary coils. Using the program, 2- and 4-kilovolt-ampere transformers are designed for frequencies from 200 to 3200 Hz and the efficiency of a basic transistor inverter is estimated.
The Automated Instrumentation and Monitoring System (AIMS) reference manual

NASA Technical Reports Server (NTRS)

Yan, Jerry; Hontalas, Philip; Listgarten, Sherry

1993-01-01

Whether a researcher is designing the 'next parallel programming paradigm,' another 'scalable multiprocessor' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of execution traces can help computer designers and software architects to uncover system behavior and to take advantage of specific application characteristics and hardware features. A software tool kit that facilitates performance evaluation of parallel applications on multiprocessors is described. The Automated Instrumentation and Monitoring System (AIMS) has four major software components: a source code instrumentor which automatically inserts active event recorders into the program's source code before compilation; a run time performance-monitoring library, which collects performance data; a trace file animation and analysis tool kit which reconstructs program execution from the trace file; and a trace post-processor which compensates for data collection overhead. Besides being used as a prototype for developing new techniques for instrumenting, monitoring, and visualizing parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware test beds to evaluate their impact on user productivity. Currently, AIMS instrumentors accept FORTRAN and C parallel programs written for Intel's NX operating system on the iPSC family of multicomputers. A run-time performance-monitoring library for the iPSC/860 is included in this release. We plan to release monitors for other platforms (such as PVM and TMC's CM-5) in the near future. Performance data collected can be graphically displayed on workstations (e.g. Sun Sparc and SGI) supporting X-Windows (in particular, X11R5, Motif 1.1.3).
Multiscale Simulations of Magnetic Island Coalescence

NASA Technical Reports Server (NTRS)

Dorelli, John C.

2010-01-01

We describe a new interactive parallel Adaptive Mesh Refinement (AMR) framework written in the Python programming language. This new framework, PyAMR, hides the details of parallel AMR data structures and algorithms (e.g., domain decomposition, grid partition, and inter-process communication), allowing the user to focus on the development of algorithms for advancing the solution of a system of partial differential equations on a single uniform mesh. We demonstrate the use of PyAMR by simulating the pairwise coalescence of magnetic islands using the resistive Hall MHD equations. Techniques for coupling different physics models on different levels of the AMR grid hierarchy are discussed.

PETSc Users Manual Revision 3.3

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.; Brown, J.; Buschelman, K.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear, nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, C++, Python, and MATLAB (sequential). PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background, experience programming in C, C++ or Fortran and experience using a debugger such as gdb or dbx, it may require a significant amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates may make the efficient implementation of many application codes simpler than “rolling them” yourself. For many tasks a package such as MATLAB is often the best tool; PETSc is not intended for the classes of problems for which effective MATLAB code can be written. PETSc also has a MATLAB interface, so portions of your code can be written in MATLAB to “try out” the PETSc solvers. 
The resulting code will not be scalable, however, because currently MATLAB is inherently not scalable. PETSc should not be used to attempt to provide a “parallel linear solver” in an otherwise sequential code. Certainly all parts of a previously sequential code need not be parallelized, but the matrix generation portion must be parallelized to expect any kind of reasonable performance. Do not expect to generate your matrix sequentially and then “use PETSc” to solve the linear system in parallel. Since PETSc is under continued development, small changes in usage and calling sequences of routines will occur. PETSc is supported; see the web site http://www.mcs.anl.gov/petsc for information on contacting support. A list of publications and web sites that feature work involving PETSc may be found at http://www.mcs.anl.gov/petsc/publications. We welcome any reports of corrections for this document.
PETSc Users Manual Revision 3.4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.; Brown, J.; Buschelman, K.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear, nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, C++, Python, and MATLAB (sequential). PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background, experience programming in C, C++ or Fortran and experience using a debugger such as gdb or dbx, it may require a significant amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates may make the efficient implementation of many application codes simpler than “rolling them” yourself. For many tasks a package such as MATLAB is often the best tool; PETSc is not intended for the classes of problems for which effective MATLAB code can be written. PETSc also has a MATLAB interface, so portions of your code can be written in MATLAB to “try out” the PETSc solvers. 
The resulting code will not be scalable, however, because currently MATLAB is inherently not scalable. PETSc should not be used to attempt to provide a “parallel linear solver” in an otherwise sequential code. Certainly all parts of a previously sequential code need not be parallelized, but the matrix generation portion must be parallelized to expect any kind of reasonable performance. Do not expect to generate your matrix sequentially and then “use PETSc” to solve the linear system in parallel. Since PETSc is under continued development, small changes in usage and calling sequences of routines will occur. PETSc is supported; see the web site http://www.mcs.anl.gov/petsc for information on contacting support. A list of publications and web sites that feature work involving PETSc may be found at http://www.mcs.anl.gov/petsc/publications. We welcome any reports of corrections for this document.
PETSc Users Manual Revision 3.5

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.; Abhyankar, S.; Adams, M.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear, nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, C++, Python, and MATLAB (sequential). PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background, experience programming in C, C++ or Fortran and experience using a debugger such as gdb or dbx, it may require a significant amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates may make the efficient implementation of many application codes simpler than “rolling them” yourself. For many tasks a package such as MATLAB is often the best tool; PETSc is not intended for the classes of problems for which effective MATLAB code can be written. PETSc also has a MATLAB interface, so portions of your code can be written in MATLAB to “try out” the PETSc solvers. 
The resulting code will not be scalable, however, because currently MATLAB is inherently not scalable. PETSc should not be used to attempt to provide a “parallel linear solver” in an otherwise sequential code. Certainly all parts of a previously sequential code need not be parallelized, but the matrix generation portion must be parallelized to expect any kind of reasonable performance. Do not expect to generate your matrix sequentially and then “use PETSc” to solve the linear system in parallel. Since PETSc is under continued development, small changes in usage and calling sequences of routines will occur. PETSc is supported; see the web site http://www.mcs.anl.gov/petsc for information on contacting support. A list of publications and web sites that feature work involving PETSc may be found at http://www.mcs.anl.gov/petsc/publications. We welcome any reports of corrections for this document.
The Adam language: Ada extended with support for multiway activities

NASA Technical Reports Server (NTRS)

Charlesworth, Arthur

1993-01-01

The Adam language is an extension of Ada that supports multiway activities, which are cooperative activities involving two or more processes. This support is provided by three new constructs: diva procedures, meet statements, and multiway accept statements. Diva procedures are recursive generic procedures having a particular restrictive syntax that facilitates translation for parallel computers. Meet statements and multiway accept statements provide two ways to express a multiway rendezvous, which is an n-way rendezvous generalizing Ada's 2-way rendezvous. While meet statements tend to have simpler rules than multiway accept statements, the latter approach is a more straightforward extension of Ada. The only nonnull statements permitted within meet statements and multiway accept statements are calls on instantiated diva procedures. A call on an instantiated diva procedure is also permitted outside a multiway rendezvous; thus sequential Adam programs using diva procedures can be written. Adam programs are translated into Ada programs appropriate for use on parallel computers.
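For readers unfamiliar with the idea of an n-way rendezvous, the sketch below gives a loose analogy in Python rather than Ada or Adam. It assumes that a barrier with a completion action is a fair stand-in for "n processes meet, one joint activity runs, then all continue"; it does not reproduce Adam's meet statements, multiway accept statements, or diva procedures, and the function name is hypothetical.

```python
import threading


def multiway_rendezvous(n_parties):
    """Let n threads meet at one point; a joint action runs once when all arrive."""
    log = []        # ordered record of what each participant did
    meetings = []   # one entry per completed rendezvous
    lock = threading.Lock()
    # The action is executed by exactly one thread once all parties are waiting.
    barrier = threading.Barrier(n_parties, action=lambda: meetings.append("met"))

    def process(name):
        with lock:
            log.append(("before", name))
        barrier.wait()  # blocks until all n_parties threads have arrived
        with lock:
            log.append(("after", name))

    threads = [threading.Thread(target=process, args=(i,)) for i in range(n_parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, meetings


if __name__ == "__main__":
    log, meetings = multiway_rendezvous(3)
    print(log)
    print(meetings)
```

With two parties this reduces to a simple pairwise meeting, which is the sense in which the n-way form generalizes the 2-way one.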
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

NASA Technical Reports Server (NTRS)

Cooke, Daniel; Rushton, Nelson

2013-01-01

With the introduction of new parallel architectures like the cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language; that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. 
In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less costly than development of comparable parallel code. Moreover, SequenceL not only automatically parallelizes the code, but since it is based on CSP-NT, it is provably race free, thus eliminating the largest quality challenge the parallelized software developer faces.
A practical guide to replica-exchange Wang-Landau simulations

NASA Astrophysics Data System (ADS)

Vogel, Thomas; Li, Ying Wai; Landau, David P.

2018-04-01

This paper is based on a series of tutorial lectures about the replica-exchange Wang-Landau (REWL) method given at the IX Brazilian Meeting on Simulational Physics (BMSP 2017). It provides a practical guide for the implementation of the method. A complete example code for a model system is available online. In this paper, we discuss the main parallel features of this code after a brief introduction to the REWL algorithm. The tutorial section is mainly directed at users who have written a single-walker Wang–Landau program already but might have just taken their first steps in parallel programming using the Message Passing Interface (MPI). In the last section, we answer “frequently asked questions” from users about the implementation of REWL for different scientific problems.
Pteros 2.0: Evolution of the fast parallel molecular analysis library for C++ and python.

PubMed

Yesylevskyy, Semen O

2015-07-15

Pteros is a high-performance open-source library for molecular modeling and analysis of molecular dynamics trajectories. Starting from version 2.0, Pteros is available for the C++ and Python programming languages with very similar interfaces. This makes it suitable for writing complex reusable programs in C++ and simple interactive scripts in Python alike. The new version improves the facilities for asynchronous trajectory reading and parallel execution of analysis tasks by introducing analysis plugins, which can be written in either C++ or Python in a completely uniform way. The high level of abstraction provided by analysis plugins greatly simplifies prototyping and implementation of complex analysis algorithms. Pteros is available for free under the Artistic License from http://sourceforge.net/projects/pteros/. © 2015 Wiley Periodicals, Inc.
Speech and Language and Language Translation (SALT)

DTIC Science & Technology

2012-12-01

Resources are classified as: Parallel Text Dictionaries Monolingual Text Other Dictionaries are further classified as: Text: can download entire...not clear how many are translated http://www.redsea-online.com/modules.php?name= dictionary Monolingual Text Monolingual Text; An Crubadan web...attached to a following word. A program could be written to detach the character د from unknown words, when the remaining word matches a dictionary
The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

DOE PAGES

O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; ...

1995-01-01

Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.
Binary tree eigen solver in finite element analysis

NASA Technical Reports Server (NTRS)

Akl, F. A.; Janetzke, D. C.; Kiraly, L. J.

1993-01-01

This paper presents a transputer-based binary tree eigensolver for the solution of the generalized eigenproblem in linear elastic finite element analysis. The algorithm is based on the method of recursive doubling, in which a parallel implementation of a number of associative operations on an arbitrary set having N elements takes on the order of O(log2 N) steps, compared to (N-1) steps if implemented sequentially. The hardware used in the implementation of the binary tree consists of 32 transputers. The algorithm is written in OCCAM, which is a high-level language developed with the transputers to address parallel programming constructs and to provide the communications between processors. The algorithm can be replicated to match the size of the binary tree transputer network. Parallel and sequential finite element analysis programs have been developed to solve for the set of the least-order eigenpairs using the modified subspace method. The speed-up obtained for a typical analysis problem indicates close agreement with the theoretical prediction given by the method of recursive doubling.
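The step count quoted for recursive doubling can be checked with a small simulation. The sketch below is a sequential Python emulation written for illustration, not the OCCAM implementation from the paper: each pass of the outer loop stands for one parallel step in which every element combines with a partner a fixed stride away, and the function name is hypothetical.

```python
import operator


def recursive_doubling_scan(values, op=operator.add):
    """Inclusive prefix combination of `values` under an associative `op`.

    Returns the prefix results and the number of simulated parallel steps.
    """
    result = list(values)
    n = len(result)
    steps = 0
    stride = 1
    while stride < n:
        prev = result[:]  # snapshot: all updates in a step read old values
        for i in range(stride, n):
            # On parallel hardware these updates would happen simultaneously.
            result[i] = op(prev[i - stride], prev[i])
        stride *= 2       # the partner distance doubles each step
        steps += 1
    return result, steps


if __name__ == "__main__":
    print(recursive_doubling_scan([1, 2, 3, 4, 5, 6, 7, 8]))
```

For N = 8 the loop runs three times, matching log2 N, whereas a sequential left-to-right accumulation needs N - 1 = 7 combinations to reach the final element.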
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

PubMed

Shrimankar, D D; Sathe, S R

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputers often consist of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. OpenMP programs, however, cannot be scaled beyond a single SMP node. Programs written in MPI can span more than one SMP node, but this programming paradigm incurs the overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead is significant even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting for a large variety of input data files. We have developed our own load balancing and cache optimization technique for the message passing model. Our experimental results show that these techniques give optimum performance of our parallel algorithm for various sizes of input parameters, such as sequence size and tile size, on a wide variety of multicore architectures.
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

PubMed Central

Shrimankar, D. D.; Sathe, S. R.

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputers often consist of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. OpenMP programs, however, cannot scale beyond a single SMP node. Programs written in MPI can span more than one SMP node, but that paradigm incurs the overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that communication overhead is significant even in OpenMP loop execution and increases with the number of participating cores. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for the message-passing model. Our experimental results show that these techniques give optimum performance of our parallel algorithm for various sizes of input parameters, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
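To make the role of the tile size concrete, the following is a minimal sketch of a tiled global-alignment score computation. It is not the authors' implementation: the function name, the scoring values (match, mismatch, gap), and the default tile size are assumptions chosen for illustration. The point is only that tiles on the same anti-diagonal "wave" depend solely on earlier waves, so they could be handed to different threads or processes.

```python
def alignment_score_tiled(a, b, tile=4, match=1, mismatch=-1, gap=-2):
    """Global alignment score, with the DP table filled tile by tile."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        score[i][0] = i * gap          # first column: only gaps
    for j in range(cols):
        score[0][j] = j * gap          # first row: only gaps

    def fill_tile(ti, tj):
        # Fill one tile in row-major order; it reads only cells from the
        # tiles above, to the left, and diagonally above-left.
        for i in range(1 + ti * tile, min(rows, 1 + (ti + 1) * tile)):
            for j in range(1 + tj * tile, min(cols, 1 + (tj + 1) * tile)):
                step = match if a[i - 1] == b[j - 1] else mismatch
                score[i][j] = max(score[i - 1][j - 1] + step,
                                  score[i - 1][j] + gap,
                                  score[i][j - 1] + gap)

    n_ti = -(-len(a) // tile)          # number of tile rows (ceiling)
    n_tj = -(-len(b) // tile)          # number of tile columns (ceiling)
    for wave in range(n_ti + n_tj - 1):
        # Tiles with ti + tj == wave are mutually independent; a parallel
        # version would distribute this inner loop across workers.
        for ti in range(max(0, wave - n_tj + 1), min(n_ti, wave + 1)):
            fill_tile(ti, wave - ti)
    return score[-1][-1]


print(alignment_score_tiled("ACGT", "ACGT"))  # 4
```

Small tiles expose more independent work per wave but add scheduling overhead, while large tiles improve cache reuse at the cost of parallelism, which is the kind of trade-off the abstract attributes to the tile-size parameter.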
Implementation and performance of FDPS: a framework for developing parallel particle simulation codes

NASA Astrophysics Data System (ADS)

Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

2016-08-01

We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which are necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions which are necessary for efficient parallel execution of particle-based simulations as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10^7) to 300 ms (N = 10^9).
These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
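For orientation, the "simple, sequential and unoptimized" O(N^2) gravitational interaction the abstract refers to might look like the sketch below. This is plain Python rather than FDPS code (FDPS itself is not shown in the abstract); the function name, the gravitational constant `G=1.0`, and the softening length `eps` are assumptions for illustration.

```python
import math


def accelerations(pos, mass, G=1.0, eps=1e-3):
    """Direct-sum gravitational accelerations: every pair is visited."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from particle i to particle j.
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            # Softened squared distance avoids a singularity at r = 0.
            r2 = sum(c * c for c in d) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


# Two unit masses one unit apart attract each other with unit acceleration.
print(accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0], eps=0.0))
```

The double loop is what makes the cost quadratic in N; a framework of the kind described replaces it with tree-based or neighbor-limited evaluation and adds domain decomposition and particle exchange around the user-supplied interaction.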
Parallel community climate model: Description and user's guide

DOE Office of Scientific and Technical Information (OSTI.GOV)

Drake, J.B.; Flanery, R.E.; Semeraro, B.D.

This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
Parallel Adaptive Mesh Refinement Library

NASA Technical Reports Server (NTRS)

Mac-Neice, Peter; Olson, Kevin

2005-01-01

Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.
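The block hierarchy described above can be pictured with a small two-dimensional sketch. This is illustrative Python, not PARAMESH's Fortran 90 interface; the class name `Block`, its fields, and the refinement criterion are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One square grid block; its children form a quad-tree."""
    x0: float
    y0: float
    size: float
    level: int
    children: list = field(default_factory=list)

    def refine(self, needs_refinement, max_level):
        # Split into four half-size children wherever the criterion asks
        # for more resolution, then recurse into the new children.
        if self.level < max_level and needs_refinement(self):
            half = self.size / 2
            self.children = [
                Block(self.x0 + dx * half, self.y0 + dy * half, half, self.level + 1)
                for dy in (0, 1) for dx in (0, 1)
            ]
            for child in self.children:
                child.refine(needs_refinement, max_level)

    def leaves(self):
        # Leaf blocks are the ones that actually carry the solution mesh.
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def near_feature(block, px=0.3, py=0.3):
    """Hypothetical criterion: refine blocks containing the point (px, py)."""
    return (block.x0 <= px < block.x0 + block.size
            and block.y0 <= py < block.y0 + block.size)


root = Block(0.0, 0.0, 1.0, 0)
root.refine(near_feature, max_level=3)
print(len(root.leaves()))  # 10
```

In three dimensions each split would produce eight children (an oct-tree); a parallel code would additionally distribute the leaf blocks among processors.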
Python based high-level synthesis compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radosław; Pozniak, Krzysztof; Romaniuk, Ryszard

2014-11-01

This paper presents a Python-based high-level synthesis (HLS) compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and maps it to VHDL. FPGAs combine many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible and can be reconfigured over the lifetime of the system. FPGAs therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Creating parallel programs implemented in FPGAs is not trivial. This article describes the design, implementation, and first results of the Python-based compiler.
Algorithmic synthesis using Python compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej

2015-09-01

This paper presents a Python-to-VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translates it to VHDL. FPGAs combine many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. Using a higher level of abstraction and a high-level synthesis compiler, implementation time can be reduced. The compiler has been implemented using the Python language. This article describes the design, implementation, and results of the tools.
jInv: A Modular and Scalable Framework for Electromagnetic Inverse Problems

NASA Astrophysics Data System (ADS)

Belliveau, P. T.; Haber, E.

2016-12-01

Inversion is a key tool in the interpretation of geophysical electromagnetic (EM) data. Three-dimensional (3D) EM inversion is very computationally expensive and practical software for inverting large 3D EM surveys must be able to take advantage of high performance computing (HPC) resources. It has traditionally been difficult to achieve those goals in a high level dynamic programming environment that allows rapid development and testing of new algorithms, which is important in a research setting. With those goals in mind, we have developed jInv, a framework for PDE constrained parameter estimation problems. jInv provides optimization and regularization routines, a framework for user defined forward problems, and interfaces to several direct and iterative solvers for sparse linear systems. The forward modeling framework provides finite volume discretizations of differential operators on rectangular tensor product meshes and tetrahedral unstructured meshes that can be used to easily construct forward modeling and sensitivity routines for forward problems described by partial differential equations. jInv is written in the emerging programming language Julia. Julia is a dynamic language targeted at the computational science community with a focus on high performance and native support for parallel programming. We have developed frequency and time-domain EM forward modeling and sensitivity routines for jInv. We will illustrate its capabilities and performance with two synthetic time-domain EM inversion examples. First, in airborne surveys, which use many sources, we achieve distributed memory parallelism by decoupling the forward and inverse meshes and performing forward modeling for each source on small, locally refined meshes. Secondly, we invert grounded source time-domain data from a gradient array style induced polarization survey using a novel time-stepping technique that allows us to compute data from different time-steps in parallel. 
These examples both show that it is possible to invert large scale 3D time-domain EM datasets within a modular, extensible framework written in a high-level, easy to use programming language.
[Not Available].

PubMed

Pecevski, Dejan; Natschläger, Thomas; Schuch, Klaus

2009-01-01

The Parallel Circuit SIMulator (PCSIM) is a software package for simulation of neural circuits. It is primarily designed for distributed simulation of large scale networks of spiking point neurons. Although its computational core is written in C++, PCSIM's primary interface is implemented in the Python programming language, which is a powerful programming environment and allows the user to easily integrate the neural circuit simulator with data analysis and visualization tools to manage the full neural modeling life cycle. The main focus of this paper is to describe PCSIM's full integration into Python and the benefits thereof. In particular we will investigate how the automatically generated bidirectional interface and PCSIM's object-oriented modular framework enable the user to adopt a hybrid modeling approach: using and extending PCSIM's functionality either employing pure Python or C++ and thus combining the advantages of both worlds. Furthermore, we describe several supplementary PCSIM packages written in pure Python and tailored towards setting up and analyzing neural simulations.
Parallel log structured file system collective buffering to achieve a compact representation of scientific and/or dimensional data

DOEpatents

Grider, Gary A.; Poole, Stephen W.

2015-09-01

Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.
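One way to picture "extracting a data pattern" and "generating a representation describing the data pattern" is the sketch below. It is an assumed, simplified interpretation, not the patented method: it treats the pattern as the list of file offsets one process writes to, and the function names and dictionary layout are hypothetical.

```python
def extract_pattern(offsets):
    """Describe a list of write offsets compactly when they are evenly strided."""
    if len(offsets) >= 2:
        stride = offsets[1] - offsets[0]
        if all(b - a == stride for a, b in zip(offsets, offsets[1:])):
            # Three numbers replace the whole list of offsets.
            return {"kind": "strided", "start": offsets[0],
                    "stride": stride, "count": len(offsets)}
    # Fallback: no regularity found, keep the offsets verbatim.
    return {"kind": "explicit", "offsets": list(offsets)}


def expand_pattern(pattern):
    """Recover the original offsets from the stored representation."""
    if pattern["kind"] == "strided":
        return [pattern["start"] + i * pattern["stride"]
                for i in range(pattern["count"])]
    return list(pattern["offsets"])


writes = [0, 4096, 8192, 12288]
print(extract_pattern(writes))
```

The representation is saved alongside the data, so a later read can locate any block from the three stored numbers instead of a per-write index.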

Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

NASA Astrophysics Data System (ADS)

Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

2014-06-01

This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
Preparation of Entangled Polymer Melts of Various Architecture for Coarse-Grained Models

DTIC Science & Technology

2011-09-01

Simulator (LAMMPS). This report presents a theory overview and a manual on how to use the method. 15. SUBJECT TERMS Ammunition, coarse-grained model...polymer builder, LAMMPS 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT UU 18. NUMBER OF PAGES 26 19a. NAME OF RESPONSIBLE PERSON...scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). Gel is an in-house written C program of coarse-grained polymer builder, and LAMMPS is
Users manual for the Chameleon parallel programming tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gropp, W.; Smith, B.

1993-06-01

Message passing is a common method for writing programs for distributed-memory parallel computers. Unfortunately, the lack of a standard for message passing has hampered the construction of portable and efficient parallel programs. In an attempt to remedy this problem, a number of groups have developed their own message-passing systems, each with its own strengths and weaknesses. Chameleon is a second-generation system of this type. Rather than replacing these existing systems, Chameleon is meant to supplement them by providing a uniform way to access many of these systems. Chameleon's goals are to (a) be very lightweight (low overhead), (b) be highly portable, and (c) help standardize program startup and the use of emerging message-passing operations such as collective operations on subsets of processors. Chameleon also provides a way to port programs written using PICL or Intel NX message passing to other systems, including collections of workstations. Chameleon is tracking the Message-Passing Interface (MPI) draft standard and will provide both an MPI implementation and an MPI transport layer. Chameleon provides support for heterogeneous computing by using p4 and PVM. Chameleon's support for homogeneous computing includes the portable libraries p4, PICL, and PVM and vendor-specific implementation for Intel NX, IBM EUI (SP-1), and Thinking Machines CMMD (CM-5). Support for Ncube and PVM 3.x is also under development.
PVM Wrapper

NASA Technical Reports Server (NTRS)

Katz, Daniel

2004-01-01

PVM Wrapper is a software library that makes it possible for code that utilizes the Parallel Virtual Machine (PVM) software library to run using the message-passing interface (MPI) software library, without needing to rewrite the entire code. PVM and MPI are the two most common software libraries used for applications that involve passing of messages among parallel computers. Since about 1996, MPI has been the de facto standard. Codes written when PVM was popular often feature patterns of {"initsend," "pack," "send"} and {"receive," "unpack"} calls. In many cases, these calls are not contiguous and one set of calls may even exist over multiple subroutines. These characteristics make it difficult to obtain equivalent functionality via a single MPI "send" call. Because PVM Wrapper is written to run with MPI-1.2, some PVM functions are not permitted and must be replaced - a task that requires some programming expertise. The "pvm_spawn" and "pvm_parent" function calls are not replaced, but a programmer can use "mpirun" and knowledge of the ranks of parent and child tasks with supplied macroinstructions to enable execution of codes that use "pvm_spawn" and "pvm_parent."
PDB Editor: a user-friendly Java-based Protein Data Bank file editor with a GUI.

PubMed

Lee, Jonas; Kim, Sung Hou

2009-04-01

The Protein Data Bank file format is the format most widely used by protein crystallographers and biologists to disseminate and manipulate protein structures. Despite this, there are few user-friendly software packages available to efficiently edit and extract raw information from PDB files. This limitation often leads to many protein crystallographers wasting significant time manually editing PDB files. PDB Editor, written in Java with a Swing GUI, allows the user to selectively search, select, extract and edit information in parallel. Furthermore, the program is a stand-alone application written in Java, which frees users from the hassles associated with platform/operating system-dependent installation and usage. PDB Editor can be downloaded from http://sourceforge.net/projects/pdbeditorjl/.
Katome: de novo DNA assembler implemented in rust

NASA Astrophysics Data System (ADS)

Neumann, Łukasz; Nowak, Robert M.; Kuśmirek, Wiktor

2017-08-01

Katome is a new de novo sequence assembler written in the Rust programming language, designed with respect to future parallelization of the algorithms, run time and memory usage optimization. The application uses new algorithms for the correct assembly of repetitive sequences. Performance and quality tests were performed on various data, comparing the new application to `dnaasm', `ABySS' and `Velvet' genome assemblers. Quality tests indicate that the new assembler creates more contigs than well-established solutions, but the contigs have better quality with regard to mismatches per 100kbp and indels per 100kbp. Additionally, benchmarks indicate that the Rust-based implementation outperforms `dnaasm', `ABySS' and `Velvet' assemblers, written in C++, in terms of assembly time. Lower memory usage in comparison to `dnaasm' is observed.
Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver

NASA Technical Reports Server (NTRS)

Baggag, Abdelkader; Atkins, Harold; Oezturan, Can; Keyes, David

1999-01-01

A computational aeroacoustics code based on the discontinuous Galerkin method is ported to several parallel platforms using MPI. The discontinuous Galerkin method is a compact high-order method that retains its accuracy and robustness on non-smooth unstructured meshes. In its semi-discrete form, the discontinuous Galerkin method can be combined with explicit time marching methods making it well suited to time accurate computations. The compact nature of the discontinuous Galerkin method also makes it well suited for distributed memory parallel platforms. The original serial code was written using an object-oriented approach and was previously optimized for cache-based machines. The port to parallel platforms was achieved simply by treating partition boundaries as a type of boundary condition. Code modifications were minimal because boundary conditions were abstractions in the original program. Scalability results are presented for the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. Slightly superlinear speedup is achieved on a fixed-size problem on the Origin, due to cache effects.
Reliable, Memory Speed Storage for Cluster Computing Frameworks

DTIC Science & Technology

2014-06-16

specification API that can capture computations in many of today’s popular data-parallel computing models, e.g., MapReduce and SQL. We also ported the Hadoop ...today’s big data workloads: • Immutable data: Data is immutable once written, since dominant underlying storage systems, such as HDFS [3], only support...network transfers, so reads can be data-local. • Program size vs. data size: In big data processing, the same operation is repeatedly applied on massive
Legacy model integration for enhancing hydrologic interdisciplinary research

NASA Astrophysics Data System (ADS)

Dozier, A.; Arabi, M.; David, O.

2013-12-01

Many challenges are introduced to interdisciplinary research in and around the hydrologic science community due to advances in computing technology and modeling capabilities in different programming languages, across different platforms and frameworks by researchers in a variety of fields with a variety of experience in computer programming. Many new hydrologic models as well as optimization, parameter estimation, and uncertainty characterization techniques are developed in scripting languages such as Matlab, R, Python, or in newer languages such as Java and the .Net languages, whereas many legacy models have been written in FORTRAN and C, which complicates inter-model communication for two-way feedbacks. However, most hydrologic researchers and industry personnel have little knowledge of the computing technologies that are available to address the model integration process. Therefore, the goal of this study is to address these new challenges by utilizing a novel approach based on a publish-subscribe-type system to enhance modeling capabilities of legacy socio-economic, hydrologic, and ecologic software. Enhancements include massive parallelization of executions and access to legacy model variables at any point during the simulation process by another program without having to compile all the models together into an inseparable 'super-model'. Thus, this study provides two-way feedback mechanisms between multiple different process models that can be written in various programming languages and can run on different machines and operating systems. Additionally, a level of abstraction is given to the model integration process that allows researchers and other technical personnel to perform more detailed and interactive modeling, visualization, optimization, calibration, and uncertainty analysis without requiring deep understanding of inter-process communication. 
To be compatible, a program must be written in a programming language with bindings to a common implementation of the message passing interface (MPI), which includes FORTRAN, C, Java, the .NET languages, Python, R, Matlab, and many others. The system is tested on a longstanding legacy hydrologic model, the Soil and Water Assessment Tool (SWAT), to observe and enhance speed-up capabilities for various optimization, parameter estimation, and model uncertainty characterization techniques, which is particularly important for computationally intensive hydrologic simulations. Initial results indicate that the legacy extension system significantly decreases developer time, computation time, and the cost of purchasing commercial parallel processing licenses, while enhancing interdisciplinary research by providing detailed two-way feedback mechanisms between various process models with minimal changes to legacy code.
PPM Receiver Implemented in Software

NASA Technical Reports Server (NTRS)

Gray, Andrew; Kang, Edward; Lay, Norman; Vilnrotter, Victor; Srinivasan, Meera; Lee, Clement

2010-01-01

A computer program has been written as a tool for developing optical pulse-position-modulation (PPM) receivers in which photodetector outputs are fed to analog-to-digital converters (ADCs) and all subsequent signal processing is performed digitally. The program can be used, for example, to simulate an all-digital version of the PPM receiver described in Parallel Processing of Broad-Band PPM Signals (NPO-40711), which appears elsewhere in this issue of NASA Tech Briefs. The program can also be translated into a design for digital PPM receiver hardware. The most notable innovation embodied in the software and the underlying PPM-reception concept is a digital processing subsystem that performs synchronization of PPM time slots, even though the digital processing is, itself, asynchronous in the sense that no attempt is made to synchronize it with the incoming optical signal a priori and there is no feedback to analog signal processing subsystems or ADCs. Functions performed by the software receiver include time-slot synchronization, symbol synchronization, coding preprocessing, and diagnostic functions. The program is written in the MATLAB and Simulink software system. The software receiver is highly parameterized and, hence, programmable: for example, slot- and symbol-synchronization filters have programmable bandwidths.
The impact of written information and counseling (WOMAN-PRO II Program) on symptom outcomes in women with vulvar neoplasia: A multicenter randomized controlled phase II study.

PubMed

Raphaelis, Silvia; Mayer, Hanna; Ott, Stefan; Mueller, Michael D; Steiner, Enikö; Joura, Elmar; Senn, Beate

2017-07-01

To determine whether written information and/or counseling based on the WOMAN-PRO II Program decreases symptom prevalence in women with vulvar neoplasia by a clinically relevant degree, and to explore the differences between the 2 interventions in symptom prevalence, symptom distress prevalence, and symptom experience. A multicenter randomized controlled parallel-group phase II trial with 2 interventions provided to patients after the initial diagnosis was performed in Austria and Switzerland. Women randomized to written information received a predefined set of leaflets concerning wound care and available healthcare services. Women allocated to counseling were additionally provided with 5 consultations by an Advanced Practice Nurse (APN) between the initial diagnosis and 6 months post-surgery that focused on symptom management, utilization of healthcare services, and health-related decision-making. Symptom outcomes were simultaneously measured 5 times to the counseling time points. A total of 49 women with vulvar neoplasia participated in the study. Symptom prevalence decreased in women with counseling by a clinically relevant degree, but not in women with written information. Sporadically, significant differences between the 2 interventions could be observed in individual items, but not in the total scales or subscales of the symptom outcomes. The results indicate that counseling may reduce symptom prevalence in women with vulvar neoplasia by a clinically relevant extent. The observed group differences between the 2 interventions slightly favor counseling over written information. The results justify testing the benefit of counseling thoroughly in a comparative phase III trial. Copyright © 2017 Elsevier Inc. All rights reserved.
Learning in Parallel: Using Parallel Corpora to Enhance Written Language Acquisition at the Beginning Level

ERIC Educational Resources Information Center

Bluemel, Brody

2014-01-01

This article illustrates the pedagogical value of incorporating parallel corpora in foreign language education. It explores the development of a Chinese/English parallel corpus designed specifically for pedagogical application. The corpus tool was created to aid language learners in reading comprehension and writing development by making foreign…
A Comparison of Verbal and Written Language in Alzheimer's Disease

ERIC Educational Resources Information Center

Groves-Wright, Kathy; Neils-Strunjas, Jean; Burnett, Rebecca; O'Neill, Mary Jane

2004-01-01

Few studies have examined characteristics of both verbal and written language of individuals with Alzheimer's disease (AD). This study used parallel measures (picture description, word fluency, spelling to dictation, and confrontational naming) to compare verbal and written language of individuals with mild AD, moderate AD, and normal controls (14…
RPython high-level synthesis

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Linczuk, Maciej

2016-09-01

The development of FPGA technology and the increasing complexity of applications in recent decades have forced compilers to move to higher abstraction levels. Compilers interpret an algorithmic description of a desired behavior written in High-Level Languages (HLLs) and translate it to Hardware Description Languages (HDLs). This paper presents an RPython-based high-level synthesis (HLS) compiler. The compiler takes the configuration parameters and maps an RPython program to VHDL. The VHDL code can then be used to program FPGA chips. Compared with other technologies, FPGAs have the potential to achieve far greater performance than software as a result of omitting the fetch-decode-execute operations of General Purpose Processors (GPPs) and introducing more parallel computation. This can be exploited by utilizing many resources at the same time. Creating parallel algorithms computed with FPGAs in pure HDL is difficult and time consuming. Implementation time can be greatly reduced with a high-level synthesis compiler. This article describes design methodologies and tools, implementation, and first results of the VHDL backend created for the RPython compiler.
Automatic selection of dynamic data partitioning schemes for distributed memory multicomputers

NASA Technical Reports Server (NTRS)

Palermo, Daniel J.; Banerjee, Prithviraj

1995-01-01

For distributed memory multicomputers such as the Intel Paragon, the IBM SP-2, the NCUBE/2, and the Thinking Machines CM-5, the quality of the data partitioning for a given application is crucial to obtaining high performance. This task has traditionally been the user's responsibility, but in recent years much effort has been directed to automating the selection of data partitioning schemes. Several researchers have proposed systems that are able to produce data distributions that remain in effect for the entire execution of an application. For complex programs, however, such static data distributions may be insufficient to obtain acceptable performance. The selection of distributions that dynamically change over the course of a program's execution adds another dimension to the data partitioning problem. In this paper, we present a technique that can be used to automatically determine which partitionings are most beneficial over specific sections of a program while taking into account the added overhead of performing redistribution. This system is being built as part of the PARADIGM (PARAllelizing compiler for DIstributed memory General-purpose Multicomputers) project at the University of Illinois. The complete system will provide a fully automated means to parallelize programs written in a serial programming model obtaining high performance on a wide range of distributed-memory multicomputers.
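The trade-off described above, a better distribution for a program section versus the cost of redistributing data to reach it, can be sketched as a small dynamic program. This is a generic shortest-path formulation written for illustration and should not be read as the PARADIGM algorithm; the function name, the per-phase cost tables, and the flat redistribution cost in the usage example are all assumptions.

```python
def choose_distributions(phase_costs, redistribution_cost):
    """Pick one data distribution per program phase, minimizing total cost.

    phase_costs: list of dicts mapping distribution name -> execution cost
                 of that phase under that distribution.
    redistribution_cost(a, b): cost of switching from distribution a to b.
    Returns (plan, total_cost).
    """
    def step(prev, cur, best):
        # Cost of arriving at `cur` from `prev`, including any redistribution.
        move = 0 if prev == cur else redistribution_cost(prev, cur)
        return best[prev] + move

    best = dict(phase_costs[0])   # cheapest total ending phase 0 in each distribution
    back = []                     # back-pointers for plan reconstruction
    for costs in phase_costs[1:]:
        new_best, choice = {}, {}
        for cur, cost in costs.items():
            prev = min(best, key=lambda p: step(p, cur, best))
            new_best[cur] = step(prev, cur, best) + cost
            choice[cur] = prev
        best = new_best
        back.append(choice)

    last = min(best, key=best.get)
    total = best[last]
    plan = [last]
    for choice in reversed(back):
        plan.append(choice[plan[-1]])
    plan.reverse()
    return plan, total


phases = [{"row": 10, "col": 30}, {"row": 40, "col": 5}, {"row": 12, "col": 28}]
print(choose_distributions(phases, lambda a, b: 8))    # (['row', 'col', 'row'], 43)
print(choose_distributions(phases, lambda a, b: 100))  # (['row', 'row', 'row'], 62)
```

With cheap redistribution the plan switches distributions between phases; when redistribution is expensive the best plan degenerates to a single static distribution, which is the case the earlier static systems cover.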
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

PubMed Central

2011-01-01

Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. 
PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples. PMID:21352538
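
The batching idea in the abstract above can be illustrated with a minimal sketch. It does not use PaPy's own API: the names `batched`, `parallel_stage`, and `run_pipeline` are hypothetical, the pipeline is a simple linear chain rather than a general directed acyclic graph, and a standard-library thread pool stands in for PaPy's pooled local or remote resources. Each stage pulls only one batch at a time from the stage before it, so a larger batch size gives more parallelism and a smaller one keeps fewer items in memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def parallel_stage(func, upstream, pool, batch_size):
    """Lazily apply `func` to `upstream`, one batch at a time, on a worker pool."""
    for batch in batched(upstream, batch_size):
        # Only one batch per stage is submitted to the pool at any moment.
        yield from pool.map(func, batch)


def run_pipeline(stages, inputs, workers=4, batch_size=2):
    """Chain stages so that each consumes the previous stage's output stream."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stream = inputs
        for func in stages:
            stream = parallel_stage(func, stream, pool, batch_size)
        # Consume the stream inside the `with` block, while the pool is alive.
        return list(stream)


def reverse_complement(seq):
    """Toy component: reverse-complement a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq):
    """Toy component: fraction of G and C bases in a DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)


results = run_pipeline([reverse_complement, gc_fraction], ["ACGT", "GGCC", "AATT"])
```

Here `results` holds one GC fraction per input sequence, in input order, because `pool.map` preserves ordering within each batch.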
A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines.

PubMed

Cieślik, Marcin; Mura, Cameron

2011-02-25

Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. 
PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples.
Children's Use of Evaluative Devices in Spoken and Written Narratives

ERIC Educational Resources Information Center

Drijbooms, Elise; Groen, Margriet A.; Verhoeven, Ludo

2017-01-01

This study investigated the development of evaluation in narratives from middle to late childhood, within the context of differentiating between spoken and written modalities. Two parallel forms of a picture story were used to elicit spoken and written narratives from fourth- and sixth-graders. It was expected that, in addition to an increase of…
Force user's manual: A portable, parallel FORTRAN

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

1990-01-01

The use of Force, a parallel, portable FORTRAN, on shared memory parallel computers is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray-YMP, Convex 220, Flex/32, Encore, Sequent, and Alliant computers on which it is installed.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

NASA Technical Reports Server (NTRS)

Reddy, C. J.

2000-01-01

PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to help users integrate PCSMS routines into their own codes.
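
The complex-to-real conversion can be illustrated with a small sketch. The abstract does not say which real form PCSMS builds, so the block layout below is an assumption: it is one standard real-equivalent form, in which the n x n complex system (Ar + i Ai)(xr + i xi) = br + i bi becomes a 2n x 2n real system with blocks [[Ar, -Ai], [Ai, Ar]]. The function names are hypothetical, and a dense Gaussian elimination stands in for the real sparse direct solver.

```python
def to_real_system(a, b):
    """Build the 2n x 2n real-equivalent form of the n x n complex system a x = b."""
    n = len(a)
    real_a = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            re, im = a[i][j].real, a[i][j].imag
            real_a[i][j] = re            # top-left block:      Re(A)
            real_a[i][j + n] = -im       # top-right block:    -Im(A)
            real_a[i + n][j] = im        # bottom-left block:   Im(A)
            real_a[i + n][j + n] = re    # bottom-right block:  Re(A)
    real_b = [z.real for z in b] + [z.imag for z in b]
    return real_a, real_b


def solve_real(a, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for a sparse solver)."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def solve_complex(a, b):
    """Solve a complex system by converting to real form and back again."""
    real_a, real_b = to_real_system(a, b)
    y = solve_real(real_a, real_b)
    n = len(b)
    # The first n entries are the real parts, the last n the imaginary parts.
    return [complex(y[i], y[i + n]) for i in range(n)]
```

The real system has twice the dimension of the complex one, which is the price paid for reusing an existing real solver unchanged.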

Parallel Algorithms for Least Squares and Related Computations.

DTIC Science & Technology

1991-03-22

for dense computations in linear algebra. The work has recently been published in a general reference book on parallel algorithms by SIAM. AFO SR...written his Ph.D. dissertation with the principal investigator. (See publication 6.) • Parallel Algorithms for Dense Linear Algebra Computations. Our...and describe and to put into perspective a selection of the more important parallel algorithms for numerical linear algebra. We give a major new
Communication Studies of DMP and SMP Machines

NASA Technical Reports Server (NTRS)

Sohn, Andrew; Biswas, Rupak; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

Understanding the interplay between machines and problems is key to obtaining high performance on parallel machines. This paper investigates the interplay between programming paradigms and communication capabilities of parallel machines. In particular, we explicate the communication capabilities of the IBM SP-2 distributed-memory multiprocessor and the SGI PowerCHALLENGEarray symmetric multiprocessor. Two benchmark problems of bitonic sorting and Fast Fourier Transform are selected for experiments. Communication-efficient algorithms are developed to exploit the overlapping capabilities of the machines. Programs are written in Message-Passing Interface for portability and identical codes are used for both machines. Various data sizes and message sizes are used to test the machines' communication capabilities. Experimental results indicate that the communication performance of the multiprocessors is consistent with the size of messages. The SP-2 is sensitive to message size but yields a much higher communication overlapping because of the communication co-processor. The PowerCHALLENGEarray is not highly sensitive to message size and yields a low communication overlapping. Bitonic sorting yields lower performance compared to FFT due to a smaller computation-to-communication ratio.
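
For readers unfamiliar with the first benchmark, the following is a minimal serial sketch of the bitonic sorting network. It is not the paper's communication-efficient MPI code, and the function name is hypothetical. It shows why the computation-to-communication ratio is low: every element takes part in one compare-exchange per (k, j) stage, and in a distributed version each compare-exchange whose partner lives on another process turns into a message.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two using the bitonic network."""
    data = list(values)
    n = len(data)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:   # handle each pair once
                    # Blocks of size k alternate between ascending and descending.
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

The network performs the same sequence of comparisons regardless of the input values, which is what makes it attractive for parallel machines.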
NLSEmagic: Nonlinear Schrödinger equation multi-dimensional Matlab-based GPU-accelerated integrators using compact high-order schemes

NASA Astrophysics Data System (ADS)

Caplan, R. M.

2013-04-01

We present a simple-to-use, yet powerful code package called NLSEmagic to numerically integrate the nonlinear Schrödinger equation in one, two, and three dimensions. NLSEmagic is a high-order finite-difference code package which utilizes graphic processing unit (GPU) parallel architectures. The codes running on the GPU are many times faster than their serial counterparts, and are much cheaper to run than on standard parallel clusters. The codes are developed with usability and portability in mind, and therefore are written to interface with MATLAB utilizing custom GPU-enabled C codes with the MEX-compiler interface. The packages are freely distributed, including user manuals and set-up files. Catalogue identifier: AEOJ_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOJ_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 124453 No. of bytes in distributed program, including test data, etc.: 4728604 Distribution format: tar.gz Programming language: C, CUDA, MATLAB. Computer: PC, MAC. Operating system: Windows, MacOS, Linux. Has the code been vectorized or parallelized?: Yes. Number of processors used: Single CPU, number of GPU processors dependent on chosen GPU card (max is currently 3072 cores on GeForce GTX 690). Supplementary material: Setup guide, Installation guide. RAM: Highly dependent on dimensionality and grid size. For typical medium-large problem size in three dimensions, 4GB is sufficient. Keywords: Nonlinear Schrödinger Equation, GPU, high-order finite difference, Bose-Einstein condensates. Classification: 4.3, 7.7. Nature of problem: Integrate solutions of the time-dependent one-, two-, and three-dimensional cubic nonlinear Schrödinger equation. 
Solution method: The integrators utilize a fully-explicit fourth-order Runge-Kutta scheme in time and both second- and fourth-order differencing in space. The integrators are written to run on NVIDIA GPUs and are interfaced with MATLAB including built-in visualization and analysis tools. Restrictions: The main restriction for the GPU integrators is the amount of RAM on the GPU as the code is currently only designed for running on a single GPU. Unusual features: Ability to visualize real-time simulations through the interaction of MATLAB and the compiled GPU integrators. Additional comments: Setup guide and Installation guide provided. Program has a dedicated web site at www.nlsemagic.com. Running time: A three-dimensional run with a grid dimension of 87×87×203 for 3360 time steps (100 non-dimensional time units) takes about one and a half minutes on a GeForce GTX 580 GPU card.
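
The solution method named above (explicit fourth-order Runge-Kutta in time, central differencing in space) can be sketched in one dimension. This is a plain CPU illustration, not NLSEmagic code. Several things are assumptions made for the example: the equation is written as psi_t = i (a psi_xx + s |psi|^2 psi), which is one common form of the cubic equation and may differ in sign convention from the package; the grid is periodic; only the second-order difference is shown; and all function names are hypothetical.

```python
import cmath


def rhs(psi, dx, a, s):
    """Right-hand side of psi_t = i * (a * psi_xx + s * |psi|^2 * psi), periodic grid."""
    n = len(psi)
    out = []
    for k in range(n):
        # Second-order central difference for psi_xx with periodic wrap-around.
        lap = (psi[(k + 1) % n] - 2 * psi[k] + psi[k - 1]) / dx**2
        out.append(1j * (a * lap + s * abs(psi[k]) ** 2 * psi[k]))
    return out


def rk4_step(psi, dt, dx, a, s):
    """Advance psi by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * k for b, k in zip(base, slope)]

    k1 = rhs(psi, dx, a, s)
    k2 = rhs(shifted(psi, k1, dt / 2), dx, a, s)
    k3 = rhs(shifted(psi, k2, dt / 2), dx, a, s)
    k4 = rhs(shifted(psi, k3, dt), dx, a, s)
    return [p + dt / 6 * (w + 2 * x + 2 * y + z)
            for p, w, x, y, z in zip(psi, k1, k2, k3, k4)]


def integrate(psi0, dt, steps, dx, a=1.0, s=1.0):
    """Apply `steps` Runge-Kutta steps of size dt to the initial state psi0."""
    psi = list(psi0)
    for _ in range(steps):
        psi = rk4_step(psi, dt, dx, a, s)
    return psi


# Sanity check on a spatially uniform state, which only rotates in phase:
# psi(t) = psi0 * exp(i * s * |psi0|^2 * t) under the assumed equation.
psi0 = [0.5 + 0j] * 16
psi1 = integrate(psi0, dt=0.001, steps=1000, dx=0.1)
exact = 0.5 * cmath.exp(0.25j)
```

Because the scheme is fully explicit, the time step has to shrink roughly with the square of the grid spacing to stay stable, which is one reason such integrators benefit from GPU throughput.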
CUBE: Information-optimized parallel cosmological N-body simulation code

NASA Astrophysics Data System (ADS)

Yu, Hao-Ran; Pen, Ue-Li; Wang, Xin

2018-05-01

CUBE, written in Coarray Fortran, is a particle-mesh based parallel cosmological N-body simulation code. The memory usage of CUBE can approach as low as 6 bytes per particle. Particle pairwise (PP) force, cosmological neutrinos, and a spherical overdensity (SO) halo finder are included.
Parametric analysis of hollow conductor parallel and coaxial transmission lines for high frequency space power distribution

NASA Technical Reports Server (NTRS)

Jeffries, K. S.; Renz, D. D.

1984-01-01

A parametric analysis was performed of transmission cables for transmitting electrical power at high voltage (up to 1000 V) and high frequency (10 to 30 kHz) for high power (100 kW or more) space missions. Large diameter (5 to 30 mm) hollow conductors were considered in closely spaced coaxial configurations and in parallel lines. Formulas were derived to calculate inductance and resistance for these conductors. Curves of cable conductance, mass, inductance, capacitance, resistance, power loss, and temperature were plotted for various conductor diameters, conductor thicknesses, and alternating current frequencies. An example 5 mm diameter coaxial cable with 0.5 mm conductor thickness was calculated to transmit 100 kW at 1000 Vac over 50 m with a power loss of 1900 W, an inductance of 1.45 µH, and a capacitance of 0.07 µF. The computer programs written for this analysis are listed in the appendix.
A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

1998-01-01

Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools, an interactive computer-aided parallelization tool that generates message passing code, 2) the Portland Group's HPF compiler, and 3) compiler directives with the native FORTRAN 77 compiler on the SGI Origin2000.
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data

DOE PAGES

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S.; ...

2017-02-23

Here, a newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides, important for metal extraction chemistry, are parametrized using ParFit.
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S.

Here, a newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides, important for metal extraction chemistry, are parametrized using ParFit.
Are written information or counseling (WOMAN-PRO II program) able to improve patient satisfaction and the delivery of health care of women with vulvar neoplasms? Secondary outcomes of a multicenter randomized controlled trial

PubMed

Gehrig, Larissa; Kobleder, Andrea; Werner, Birgit; Denhaerynck, Kris; Senn, Beate

2017-01-01

Background: Patients with vulvar neoplasms report a lack of information, missing support in self-management and a gap in delivery of health care. Aim: The aim of the study was to investigate whether written information or counseling based on the WOMAN-PRO II program is able to improve patient satisfaction and the delivery of health care from the health professional's perspective for women with vulvar neoplasms. Method: Patient satisfaction and the delivery of health care were investigated as two secondary outcomes in a multicenter randomized controlled parallel-group phase II study (Clinical Trial ID: NCT01986725). In total, 49 women from four hospitals (CH, AUT) completed the questionnaire PACIC-S11 after written information (n = 13) and counseling (n = 36). The delivery of health care was evaluated by ten Advanced Practice Nurses (APNs) using the G-ACIC before and after implementing counseling based on the WOMAN-PRO II program. Results: No significant differences were identified between the two groups (p = 0.25). Only a few aspects were rated highly by all women, such as the overall satisfaction (M = 80.3 %) and satisfaction with organization of care (M = 83.0 %). The evaluation of delivery of health care by APNs in women who received counseling improved significantly (p = 0.031). Conclusions: There are indications that both interventions might have improved patient satisfaction, and that counseling might have improved the delivery of health care. The aspects that were rated low in the PACIC-S11 and G-ACIC indicate possibilities to optimize the delivery of health care.
Experiences from Participants in Large-Scale Group Practice of the Maharishi Transcendental Meditation and TM-Sidhi Programs and Parallel Principles of Quantum Theory, Astrophysics, Quantum Cosmology, and String Theory: Interdisciplinary Qualitative Correspondences

NASA Astrophysics Data System (ADS)

Svenson, Eric Johan

Participants on the Invincible America Assembly in Fairfield, Iowa, and neighboring Maharishi Vedic City, Iowa, practicing Maharishi Transcendental Meditation(TM) (TM) and the TM-Sidhi(TM) programs in large groups, submitted written experiences that they had had during, and in some cases shortly after, their daily practice of the TM and TM-Sidhi programs. Participants were instructed to include in their written experiences only what they observed and to leave out interpretation and analysis. These experiences were then read by the author and compared with principles and phenomena of modern physics, particularly with quantum theory, astrophysics, quantum cosmology, and string theory as well as defining characteristics of higher states of consciousness as described by Maharishi Vedic Science. In all cases, particular principles or phenomena of physics and qualities of higher states of consciousness appeared qualitatively quite similar to the content of the given experience. These experiences are presented in an Appendix, in which the corresponding principles and phenomena of physics are also presented. These physics "commentaries" on the experiences were written largely in layman's terms, without equations, and, in nearly every case, with clear reference to the corresponding sections of the experiences to which a given principle appears to relate. An abundance of similarities were apparent between the subjective experiences during meditation and principles of modern physics. A theoretic framework for understanding these rich similarities may begin with Maharishi's theory of higher states of consciousness provided herein. We conclude that the consistency and richness of detail found in these abundant similarities warrants the further pursuit and development of such a framework.
ParFit: A Python-Based Object-Oriented Program for Fitting Molecular Mechanics Parameters to ab Initio Data.

PubMed

Zahariev, Federico; De Silva, Nuwan; Gordon, Mark S; Windus, Theresa L; Dick-Perez, Marilu

2017-03-27

A newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides, important for metal extraction chemistry, are parametrized using ParFit. ParFit is an open source program available for free on GitHub ( https://github.com/fzahari/ParFit ).
A survey of packages for large linear systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Kesheng; Milne, Brent

2000-02-11

This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package will best suit their needs, we do hope that our systematic evaluation provides essential unbiased information about the packages and that the evaluation process may serve as an example of how to evaluate these packages. The information contained here includes feature comparisons, usability evaluations and performance characterizations. This review is primarily focused on self-contained packages that can be easily integrated into an existing program and are capable of computing solutions to very large sparse linear systems of equations. More specifically, it concentrates on portable parallel linear system solution packages that provide iterative solution schemes and related preconditioning schemes because iterative methods are more frequently used than competing schemes such as direct methods. The eight packages evaluated are: Aztec, BlockSolve, ISIS++, LINSOL, P-SPARSLIB, PARASOL, PETSc, and PINEAPL. Among the eight portable parallel iterative linear system solvers reviewed, we recommend PETSc and Aztec for most application programmers because they have well-designed user interfaces, extensive documentation and very responsive user support. Both PETSc and Aztec are written in the C language and are callable from Fortran. For those users interested in using Fortran 90, PARASOL is a good alternative. ISIS++ is a good alternative for those who prefer the C++ language. Both PARASOL and ISIS++ are relatively new and are continuously evolving. Thus their user interfaces may change. In general, those packages written in Fortran 77 are more cumbersome to use because the user may need to directly deal with a number of arrays of varying sizes. 
Languages like C++ and Fortran 90 offer more convenient data encapsulation mechanisms which make it easier to implement a clean and intuitive user interface. In addition to reviewing these portable parallel iterative solver packages, we also provide a more cursory assessment of a range of related packages, from specialized parallel preconditioners to direct methods for sparse linear systems.
Reading through Films

ERIC Educational Resources Information Center

Raman, Madhavi Gayathri; Vijaya

2016-01-01

This paper captures the design of a comprehensive curriculum incorporating the four skills based exclusively on the use of parallel audio-visual and written texts. We discuss the use of authentic materials to teach English to Indian undergraduates aged 18 to 20 years. Specifically, we talk about the use of parallel reading (screen-play) and…
PAREMD: A parallel program for the evaluation of momentum space properties of atoms and molecules

NASA Astrophysics Data System (ADS)

Meena, Deep Raj; Gadre, Shridhar R.; Balanarayan, P.

2018-03-01

The present work describes a code for evaluating the electron momentum density (EMD), its moments and the associated Shannon information entropy for a multi-electron molecular system. The code works specifically for electronic wave functions obtained from traditional electronic structure packages such as GAMESS and GAUSSIAN. For the momentum space orbitals, the general expression for Gaussian basis sets in position space is analytically Fourier transformed to momentum space Gaussian basis functions. The molecular orbital coefficients of the wave function are taken as an input from the output file of the electronic structure calculation. The analytic expressions of EMD are evaluated over a fine grid and the accuracy of the code is verified by a normalization check and a numerical kinetic energy evaluation which is compared with the analytic kinetic energy given by the electronic structure package. Apart from electron momentum density, electron density in position space has also been integrated into this package. The program is written in C++ and is executed through a Shell script. It is also tuned for multicore machines with shared memory through OpenMP. The program has been tested for a variety of molecules and correlated methods such as CISD, Møller-Plesset second order (MP2) theory and density functional methods. For correlated methods, the PAREMD program uses natural spin orbitals as an input. The program has been benchmarked for a variety of Gaussian basis sets for different molecules showing a linear speedup on a parallel architecture.
Survival distributions impact the power of randomized placebo-phase design and parallel groups randomized clinical trials.

PubMed

Abrahamyan, Lusine; Li, Chuan Silvia; Beyene, Joseph; Willan, Andrew R; Feldman, Brian M

2011-03-01

The study evaluated the power of the randomized placebo-phase design (RPPD), a new design of randomized clinical trials (RCTs), compared with the traditional parallel groups design, assuming various response time distributions. In the RPPD, at some point, all subjects receive the experimental therapy, and the exposure to placebo is for only a short fixed period of time. For the study, an object-oriented simulation program was written in R. The power of the simulated trials was evaluated using six scenarios, where the treatment response times followed the exponential, Weibull, or lognormal distributions. The median response time was assumed to be 355 days for the placebo and 42 days for the experimental drug. Based on the simulation results, the sample size requirements to achieve the same level of power were different under different response time to treatment distributions. The scenario where the response times followed the exponential distribution had the highest sample size requirement. In most scenarios, the parallel groups RCT had higher power compared with the RPPD. The sample size requirement varies depending on the underlying hazard distribution. The RPPD requires more subjects to achieve a similar power to the parallel groups design. Copyright © 2011 Elsevier Inc. All rights reserved.
Can SEE-2 children understand ASL-using adults?

PubMed

Luetke-Stahlman, B

1990-01-01

Signing Exact English or SEE-2, is one of several invented sign systems currently being used with hearing-impaired children in the United States. The system parallels the morphology of written English and differs from American Sign Language both in terms of the configuration of many of the lexical signs and in grammatical word order. The current study investigated whether students accustomed to an invented sign system could comprehend ASL signed by deaf adults. One group of subjects was exposed to SEE-2 in their day school classrooms. The other group attended residential programs and was exposed to Signed English, PSE and ASL. Both groups observed three videotaped short stories and answered questions following each. Both groups answered approximately 25 percent of the written comprehension questions correctly; their mean scores did not differ significantly. Results of the study suggest that students exposed to SEE-2 and lacking experiences with deaf adults were able to comprehend ASL as well as their peers who attended residential schools.
Design of fuel cell powered data centers for sufficient reliability and availability

NASA Astrophysics Data System (ADS)

Ritchie, Alexa J.; Brouwer, Jacob

2018-04-01

It is challenging to design a sufficiently reliable fuel cell electrical system for use in data centers, which require 99.9999% uptime. Such a system could lower emissions and increase data center efficiency, but the reliability and availability of such a system must be analyzed and understood. Currently, extensive backup equipment is used to ensure electricity availability. The proposed design alternative uses multiple fuel cell systems each supporting a small number of servers to eliminate backup power equipment provided the fuel cell design has sufficient reliability and availability. Potential system designs are explored for the entire data center and for individual fuel cells. Reliability block diagram analysis of the fuel cell systems was accomplished to understand the reliability of the systems without repair or redundant technologies. From this analysis, it was apparent that redundant components would be necessary. A program was written in MATLAB to show that the desired system reliability could be achieved by a combination of parallel components, regardless of the number of additional components needed. Having shown that the desired reliability was achievable through some combination of components, a dynamic programming analysis was undertaken to assess the ideal allocation of parallel components.
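
The role of parallel redundancy can be made concrete with the textbook reliability formula for parallel blocks. This sketch is written in Python rather than the MATLAB used in the study, the function names are hypothetical, and it assumes independent, identical components of which any single one is enough to keep the system running. Under those assumptions n components of reliability r give a combined reliability of 1 - (1 - r)^n, so any target below 1 can be reached with enough components.

```python
import math


def parallel_reliability(r, n):
    """Reliability of n independent, identical components in parallel (any one suffices)."""
    return 1.0 - (1.0 - r) ** n


def min_parallel_units(r, target):
    """Smallest n for which n parallel components of reliability r meet the target."""
    if not 0.0 < r < 1.0:
        raise ValueError("component reliability must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # Solve (1 - r)^n <= 1 - target for n, then round up to a whole component.
    n = max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - r)))
    # Guard against floating-point rounding at the boundary.
    while parallel_reliability(r, n) < target:
        n += 1
    return n
```

For instance, `min_parallel_units(0.95, 0.999999)` asks how many 95-percent-reliable units are needed to reach the six-nines figure quoted in the abstract. This closed form only answers whether a target is reachable; the allocation across different component types is the job of the dynamic programming analysis the abstract mentions.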
Progress in Computational Simulation of Earthquakes

NASA Technical Reports Server (NTRS)

Donnellan, Andrea; Parker, Jay; Lyzenga, Gregory; Judd, Michele; Li, P. Peggy; Norton, Charles; Tisdale, Edwin; Granat, Robert

2006-01-01

GeoFEST(P) is a computer program written for use in the QuakeSim project, which is devoted to development and improvement of means of computational simulation of earthquakes. GeoFEST(P) models interacting earthquake fault systems from the fault-nucleation to the tectonic scale. The development of GeoFEST(P) has involved coupling of two programs: GeoFEST and the Pyramid Adaptive Mesh Refinement Library. GeoFEST is a message-passing-interface-parallel code that utilizes a finite-element technique to simulate evolution of stress, fault slip, and plastic/elastic deformation in realistic materials like those of faulted regions of the crust of the Earth. The products of such simulations are synthetic observable time-dependent surface deformations on time scales from days to decades. Pyramid Adaptive Mesh Refinement Library is a software library that facilitates the generation of computational meshes for solving physical problems. In an application of GeoFEST(P), a computational grid can be dynamically adapted as stress grows on a fault. Simulations on workstations using a few tens of thousands of stress and displacement finite elements can now be expanded to multiple millions of elements with greater than 98-percent scaled efficiency on many hundreds of parallel processors (see figure).
Practical multipeptide synthesis: dedicated software for the definition of multiple, overlapping peptides covering polypeptide sequences.

PubMed

Heegaard, P M; Holm, A; Hagerup, M

1993-01-01

A personal computer program for the conversion of linear amino acid sequences to multiple, small, overlapping peptide sequences has been developed. Peptide lengths and "jumps" (the distance between two consecutive overlapping peptides) are defined by the user. To facilitate the use of the program for parallel solid-phase chemical peptide syntheses for the synchronous production of multiple peptides, amino acids at each acylation step are laid out by the program in a convenient standard multi-well setup. Also, the total number of equivalents, as well as the derived amount in milligrams (depending on user-defined equivalent weights and molar surplus), of each amino acid are given. The program facilitates the implementation of multipeptide synthesis, e.g., for the elucidation of polypeptide structure-function relationships, and greatly reduces the risk of introducing mistakes at the planning step. It is written in Pascal and runs on any DOS-based personal computer. No special graphic display is needed.
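
The length and jump parameters can be illustrated with a short sketch. It is written in Python, not the Pascal of the original program, and the function names are hypothetical. Two choices are assumptions of the example rather than documented behavior of the program: an extra final peptide is added when the last regular window would leave C-terminal residues uncovered, and the per-step layout is a plain list instead of a multi-well plate. The step order follows solid-phase synthesis, which builds each peptide from its C-terminus toward its N-terminus.

```python
def overlapping_peptides(sequence, length, jump):
    """Return (start, peptide) pairs of fixed `length`, each offset by `jump` residues."""
    if length < 1 or jump < 1:
        raise ValueError("length and jump must be positive")
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, jump))
    # Assumption: add a final peptide so the C-terminal residues are always covered.
    if last_start >= 0 and starts[-1] != last_start:
        starts.append(last_start)
    # Report 1-based start positions, as is usual for residue numbering.
    return [(s + 1, sequence[s:s + length]) for s in starts]


def acylation_steps(peptides):
    """List, per coupling step, the residue added to each peptide (C-terminus first)."""
    length = len(peptides[0])
    return [[p[length - 1 - step] for p in peptides] for step in range(length)]


windows = overlapping_peptides("MKTAYIAKQRQ", length=4, jump=3)
steps = acylation_steps([peptide for _, peptide in windows])
```

Each entry of `steps` lists which amino acid goes into which reaction vessel at that coupling step, which is the information a multi-well layout presents.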
pcircle - A Suite of Scalable Parallel File System Tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

WANG, FEIYI

2015-10-01

Most file system software is written for conventional local file systems; it is serialized and cannot take advantage of a large-scale parallel file system. The "pcircle" software builds on the ubiquitous MPI in cluster computing environments and on the "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as asynchronous progress reporting, checkpoint and restart, and integrity checking.
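The idea of parallel checksumming can be illustrated with a small sketch. It does not use MPI or work stealing and is not pcircle's implementation; a thread pool from the Python standard library stands in for the parallel workers, and the function names, the chunk size, and the choice of SHA-1 over the ordered chunk digests are all assumptions made for illustration.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def chunk_digest(path, offset, size):
    """Return the SHA-1 digest of one byte range of a file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.sha1(f.read(size)).digest()


def parallel_checksum(path, chunk_size=4 * 1024 * 1024, workers=4):
    """Checksum a file chunk by chunk, then hash the ordered chunk digests."""
    offsets = range(0, os.path.getsize(path), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the combined
        # checksum does not depend on which worker finishes first.
        digests = list(pool.map(lambda off: chunk_digest(path, off, chunk_size), offsets))
    return hashlib.sha1(b"".join(digests)).hexdigest()
```

Because the chunk digests are combined in offset order, the result is the same for any number of workers, which is what makes the per-chunk work safe to distribute.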

The mathematical statement for the solving of the problem of N-version software system design

NASA Astrophysics Data System (ADS)

Kovalev, I. V.; Kovalev, D. I.; Zelenkov, P. V.; Voroshilova, A. A.

2015-10-01

N-version programming, as a methodology for the design of fault-tolerant software systems, allows successful solving of the mentioned tasks. The N-version programming approach turns out to be effective, since the system is constructed out of several versions of a software module that are executed in parallel. Those versions are written to meet the same specification but by different programmers. Developing an optimal structure for an N-version software system is a very complex optimization problem, which makes deterministic optimization methods inappropriate for solving it. In this view, exploiting heuristic strategies looks more rational. In the field of pseudo-Boolean optimization theory, the so-called method of varied probabilities (MVP) has been developed to solve problems with a large dimensionality.
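A minimal sketch of how several independently written versions of one module might be combined is given below. Majority voting is a common decision mechanism in N-version programming, but the abstract does not say which mechanism the authors assume, so the voter, the function names, and the toy integer square root versions are all hypothetical. The versions are called one after another here for simplicity rather than in parallel.

```python
from collections import Counter


def n_version_execute(versions, *args):
    """Run every version on the same input and return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(*args))
        except Exception:
            continue  # a crashing version simply does not vote
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value


# Three hypothetical versions written to the same specification
# (integer square root); the third one contains a fault.
def isqrt_float(n):
    return int(n ** 0.5)


def isqrt_loop(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_faulty(n):
    return n // 2


result = n_version_execute([isqrt_float, isqrt_loop, isqrt_faulty], 49)
```

With three versions, a single faulty version is outvoted by the two correct ones.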
Scalable Performance Environments for Parallel Systems

NASA Technical Reports Server (NTRS)

Reed, Daniel A.; Olson, Robert D.; Aydt, Ruth A.; Madhyastha, Tara M.; Birkett, Thomas; Jensen, David W.; Nazief, Bobby A. A.; Totty, Brian K.

1991-01-01

As parallel systems expand in size and complexity, the absence of performance tools for these parallel systems exacerbates the already difficult problems of application program and system software performance tuning. Moreover, given the pace of technological change, we can no longer afford to develop ad hoc, one-of-a-kind performance instrumentation software; we need scalable, portable performance analysis tools. We describe an environment prototype based on the lessons learned from two previous generations of performance data analysis software. Our environment prototype contains a set of performance data transformation modules that can be interconnected in user-specified ways. It is the responsibility of the environment infrastructure to hide details of module interconnection and data sharing. The environment is written in C++ with the graphical displays based on X windows and the Motif toolkit. It allows users to interconnect and configure modules graphically to form an acyclic, directed data analysis graph. Performance trace data are represented in a self-documenting stream format that includes internal definitions of data types, sizes, and names. The environment prototype supports the use of head-mounted displays and sonic data presentation in addition to the traditional use of visual techniques.
Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units

NASA Astrophysics Data System (ADS)

Corral, Roque; Gisbert, Fernando; Pueblas, Jesus

2017-02-01

The implementation of an edge-based three-dimensional Reynolds-averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processing unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.
Pythran: enabling static optimization of scientific Python programs

NASA Astrophysics Data System (ADS)

Guelton, Serge; Brunet, Pierrick; Amini, Mehdi; Merlini, Adrien; Corbillon, Xavier; Raynaud, Alan

2015-01-01

Pythran is an open source static compiler that turns modules written in a subset of Python language into native ones. Assuming that scientific modules do not rely much on the dynamic features of the language, it trades them for powerful, possibly inter-procedural, optimizations. These optimizations include detection of pure functions, temporary allocation removal, constant folding, Numpy ufunc fusion and parallelization, explicit thread-level parallelism through OpenMP annotations, false variable polymorphism pruning, and automatic vector instruction generation such as AVX or SSE. In addition to these compilation steps, Pythran provides a C++ runtime library that leverages the C++ STL to provide generic containers, and the Numeric Template Toolbox for Numpy support. It takes advantage of modern C++11 features such as variadic templates, type inference, move semantics and perfect forwarding, as well as classical idioms such as expression templates. Unlike the Cython approach, Pythran input code remains compatible with the Python interpreter. Output code is generally as efficient as the annotated Cython equivalent, if not more, but without the backward compatibility loss.
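The statement that Pythran input remains compatible with the Python interpreter can be illustrated with a small module. The `#pythran export` comment below follows the annotation style shown in Pythran's documentation; the exact type-specification syntax should be checked against the Pythran version in use, and the function `dot` is a made-up example. To the plain interpreter the annotation is an ordinary comment.

```python
#pythran export dot(float list, float list)
def dot(xs, ys):
    """Dot product written in a statically analyzable subset of Python."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total
```

The same file can be imported unchanged by the interpreter or handed to the compiler to produce a native module, which is the compatibility property the abstract contrasts with Cython.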
Halvade-RNA: Parallel variant calling from transcriptomic data using MapReduce.

PubMed

Decap, Dries; Reumers, Joke; Herzeel, Charlotte; Costanza, Pascal; Fostier, Jan

2017-01-01

Given the current cost-effectiveness of next-generation sequencing, the amount of DNA-seq and RNA-seq data generated is ever increasing. One of the primary objectives of NGS experiments is calling genetic variants. While highly accurate, most variant calling pipelines are not optimized to run efficiently on large data sets. However, as variant calling in genomic data has become common practice, several methods have been proposed to reduce runtime for DNA-seq analysis through the use of parallel computing. Determining the effectively expressed variants from transcriptomics (RNA-seq) data has only recently become possible, and as such does not yet benefit from efficiently parallelized workflows. We introduce Halvade-RNA, a parallel, multi-node RNA-seq variant calling pipeline based on the GATK Best Practices recommendations. Halvade-RNA makes use of the MapReduce programming model to create and manage parallel data streams on which multiple instances of existing tools such as STAR and GATK operate concurrently. Whereas the single-threaded processing of a typical RNA-seq sample requires ∼28h, Halvade-RNA reduces this runtime to ∼2h using a small cluster with two 20-core machines. Even on a single, multi-core workstation, Halvade-RNA can significantly reduce runtime compared to using multi-threading, thus providing for a more cost-effective processing of RNA-seq data. Halvade-RNA is written in Java and uses the Hadoop MapReduce 2.0 API. It supports a wide range of distributions of Hadoop, including Cloudera and Amazon EMR.
The Complexity of Parallel Algorithms,

DTIC Science & Technology

1985-11-01

programs have been written for sequential computers. Many people want compilers that will compile the code for parallel machines, to avoid...between two vertices. We also rely on parallel algorithms for maintaining data structures and manipulating graphs. We do not go into the details of these...paths and maintain connected components. The routine is: ExtendPath(r, Q, V) begin P ← ∅; s ← while there is a path in V - P from s to a vertex
Spiral: Automated Computing for Linear Transforms

NASA Astrophysics Data System (ADS)

Püschel, Markus

2010-09-01

Writing fast software has become extraordinarily difficult. For optimal performance, programs and their underlying algorithms have to be adapted to take full advantage of the platform's parallelism, memory hierarchy, and available instruction set. To make things worse, the best implementations are often platform-dependent and platforms are constantly evolving, which quickly renders libraries obsolete. We present Spiral, a domain-specific program generation system for important functionality used in signal processing and communication including linear transforms, filters, and other functions. Spiral completely replaces the human programmer. For a desired function, Spiral generates alternative algorithms, optimizes them, compiles them into programs, and intelligently searches for the best match to the computing platform. The main idea behind Spiral is a mathematical, declarative, domain-specific framework to represent algorithms and the use of rewriting systems to generate and optimize algorithms at a high level of abstraction. Experimental results show that the code generated by Spiral competes with, and sometimes outperforms, the best available human-written code.
Software for Use with Optoelectronic Measuring Tool

NASA Technical Reports Server (NTRS)

Ballard, Kim C.

2004-01-01

A computer program has been written to facilitate and accelerate the process of measurement by use of the apparatus described in "Optoelectronic Tool Adds Scale Marks to Photographic Images" (KSC-12201). The tool contains four laser diodes that generate parallel beams of light spaced apart at a known distance. The beams of light are used to project bright spots that serve as scale marks that become incorporated into photographic images (including film and electronic images). The sizes of objects depicted in the images can readily be measured by reference to the scale marks. The computer program is applicable to a scene that contains the laser spots and that has been imaged in a square pixel format that can be imported into a graphical user interface (GUI) generated by the program. It is assumed that the laser spots and the distance(s) to be measured all lie in the same plane and that the plane is perpendicular to the line of sight of the camera used to record the image.
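Under the stated assumptions (square pixels, with the spots and the measured distance in one plane perpendicular to the line of sight), the measurement reduces to a single scale factor. The sketch below illustrates that arithmetic only; the function names and the example coordinates are hypothetical and are not taken from the NASA program.

```python
import math


def pixel_distance(p, q):
    """Euclidean distance between two (x, y) pixel coordinates."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def measure(spot_a, spot_b, spot_spacing, point_1, point_2):
    """Convert a distance between two image points into real-world units.

    `spot_a` and `spot_b` are the pixel positions of two laser spots whose
    true separation `spot_spacing` is known.
    """
    spot_pixels = pixel_distance(spot_a, spot_b)
    if spot_pixels == 0:
        raise ValueError("laser spots coincide")
    scale = spot_spacing / spot_pixels  # real-world units per pixel
    return pixel_distance(point_1, point_2) * scale


# Spots 100 pixels apart with a known spacing of 50 units give 0.5 units
# per pixel, so a 50-pixel segment measures 25 units.
size = measure((100, 100), (200, 100), 50.0, (0, 0), (30, 40))
```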
MCdevelop - a universal framework for Stochastic Simulations

NASA Astrophysics Data System (ADS)

Slawinska, M.; Jadach, S.

2011-03-01

We present MCdevelop, a universal computer framework for developing and exploiting the wide class of Stochastic Simulations (SS) software. This powerful universal SS software development tool has been derived from a series of scientific projects for precision calculations in high energy physics (HEP), which feature a wide range of functionality in the SS software needed for advanced precision Quantum Field Theory calculations for the past LEP experiments and for the ongoing LHC experiments at CERN, Geneva. MCdevelop is a "spin-off" product of HEP to be exploited in other areas, while it will still serve to develop new SS software for HEP experiments. Typically SS involve independent generation of large sets of random "events", often requiring considerable CPU power. Since SS jobs usually do not share memory, they are easy to parallelize. The efficient development, testing and parallel running of SS software requires a convenient framework to develop software source code, deploy and monitor batch jobs, merge and analyse results from multiple parallel jobs, even before the production runs are terminated. Throughout the years of development of stochastic simulations for HEP, a sophisticated framework featuring all the above mentioned functionality has been implemented. MCdevelop represents its latest version, written mostly in C++ (GNU compiler gcc). It uses Autotools to build binaries (optionally managed within the KDevelop 3.5.3 Integrated Development Environment (IDE)). It uses the open-source ROOT package for histogramming, graphics and the mechanism of persistency for the C++ objects. MCdevelop helps to run multiple parallel jobs on any computer cluster with NQS-type batch system.
Program summary
Program title: MCdevelop
Catalogue identifier: AEHW_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHW_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 48 136
No. of bytes in distributed program, including test data, etc.: 355 698
Distribution format: tar.gz
Programming language: ANSI C++
Computer: Any computer system or cluster with C++ compiler and UNIX-like operating system.
Operating system: Most UNIX systems, Linux. The application programs were thoroughly tested under Ubuntu 7.04, 8.04 and CERN Scientific Linux 5.
Has the code been vectorised or parallelised?: Tools (scripts) for optional parallelisation on a PC farm are included.
RAM: 500 bytes
Classification: 11.3
External routines: ROOT package version 5.0 or higher (http://root.cern.ch/drupal/).
Nature of problem: Developing any type of stochastic simulation program for high energy physics and other areas.
Solution method: Object Oriented programming in C++ with added persistency mechanism, batch scripts for running on PC farms and Autotools.
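The reason independent stochastic-simulation jobs are easy to parallelize, and why their results can be merged before all jobs have finished, can be shown with a toy example. The sketch below is not MCdevelop code (which is C++ with ROOT); it is an illustrative Python stand-in in which each "job" is a Monte Carlo estimate of pi with its own seed, and the names `run_job` and `merge` are hypothetical.

```python
import random


def run_job(seed, n_events):
    """One independent job: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # each job owns its generator, nothing is shared
    hits = 0
    for _ in range(n_events):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return {"events": n_events, "hits": hits}


def merge(partials):
    """Merge the results of any subset of finished jobs into one estimate of pi."""
    events = sum(p["events"] for p in partials)
    hits = sum(p["hits"] for p in partials)
    return 4.0 * hits / events


# Here the jobs run one after another; on a cluster each call would be a
# separate batch job writing its partial result to a file.
partials = [run_job(seed, 20000) for seed in range(4)]
early_estimate = merge(partials[:2])  # usable before the other jobs end
final_estimate = merge(partials)
```

Because each job only contributes additive counts, merging is a sum and can be repeated whenever another job finishes.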
10 CFR 26.403 - Written policy and procedures.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 10 Energy 1 2013-01-01 2013-01-01 false Written policy and procedures. 26.403 Section 26.403 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS FFD Program for Construction § 26.403 Written policy and procedures. (a) Licensees and other entities who implement an FFD program under this...
10 CFR 26.403 - Written policy and procedures.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 10 Energy 1 2012-01-01 2012-01-01 false Written policy and procedures. 26.403 Section 26.403 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS FFD Program for Construction § 26.403 Written policy and procedures. (a) Licensees and other entities who implement an FFD program under this...
10 CFR 26.403 - Written policy and procedures.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 10 Energy 1 2014-01-01 2014-01-01 false Written policy and procedures. 26.403 Section 26.403 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS FFD Program for Construction § 26.403 Written policy and procedures. (a) Licensees and other entities who implement an FFD program under this...
10 CFR 26.403 - Written policy and procedures.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 10 Energy 1 2011-01-01 2011-01-01 false Written policy and procedures. 26.403 Section 26.403 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS FFD Program for Construction § 26.403 Written policy and procedures. (a) Licensees and other entities who implement an FFD program under this...
10 CFR 26.403 - Written policy and procedures.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 10 Energy 1 2010-01-01 2010-01-01 false Written policy and procedures. 26.403 Section 26.403 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS FFD Program for Construction § 26.403 Written policy and procedures. (a) Licensees and other entities who implement an FFD program under this...
User's guide to noise data acquisition and analysis programs for HP9845: Nicolet analyzers

NASA Technical Reports Server (NTRS)

Mcgary, M. C.

1982-01-01

A software interface package was written for use with a desktop computer and two models of single channel Fast Fourier analyzers. This software features a portable measurement and analysis system with several options. Two types of interface hardware can alternately be used in conjunction with the software. Either an IEEE-488 Bus interface or a 16-bit parallel system may be used. Two types of storage medium, either tape cartridge or floppy disc can be used with the software. Five types of data may be stored, plotted, and/or printed. The data types include time histories, narrow band power spectra, and narrow band, one-third octave band, or octave band sound pressure level. The data acquisition programming includes a front panel remote control option for the FFT analyzers. Data analysis options include choice of line type and pen color for plotting.
MetaQuant: a tool for the automatic quantification of GC/MS-based metabolome data.

PubMed

Bunk, Boyke; Kucklick, Martin; Jonas, Rochus; Münch, Richard; Schobert, Max; Jahn, Dieter; Hiller, Karsten

2006-12-01

MetaQuant is a Java-based program for the automatic and accurate quantification of GC/MS-based metabolome data. In contrast to other programs MetaQuant is able to quantify hundreds of substances simultaneously with minimal manual intervention. The integration of a self-acting calibration function allows the parallel and fast calibration for several metabolites simultaneously. Finally, MetaQuant is able to import GC/MS data in the common NetCDF format and to export the results of the quantification into Systems Biology Markup Language (SBML), Comma Separated Values (CSV) or Microsoft Excel (XLS) format. MetaQuant is written in Java and is available under an open source license. Precompiled packages for the installation on Windows or Linux operating systems are freely available for download. The source code as well as the installation packages are available at http://bioinformatics.org/metaquant
A new free and open source tool for space plasma modeling.

NASA Astrophysics Data System (ADS)

Honkonen, I. J.

2014-12-01

I will present a new distributed memory parallel, free and open source computational model for studying space plasma. The model is written in C++ with emphasis on good software development practices and code readability without sacrificing serial or parallel performance. As such the model could be especially useful for education, for learning both (magneto)hydrodynamics (MHD) and computational model development. By using latest features of the C++ standard (2011) it has been possible to develop a very modular program which improves not only the readability of code but also the testability of the model and decreases the effort required to make changes to various parts of the program. Major parts of the model, functionality not directly related to (M)HD, have been outsourced to other freely available libraries which has reduced the development time of the model significantly. I will present an overview of the code architecture as well as details of different parts of the model and will show examples of using the model including preparing input files and plotting results. A multitude of 1-, 2- and 3-dimensional test cases are included in the software distribution and the results of, for example, Kelvin-Helmholtz, bow shock, blast wave and reconnection tests, will be presented.
Design and optimization of a portable LQCD Monte Carlo code using OpenACC

NASA Astrophysics Data System (ADS)

Bonati, Claudio; Coscetti, Simone; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Calore, Enrico; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele

The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPUs, supporting a wide class of applications but delivering moderate computing performance, to many-core Graphics Processor Units (GPUs), exploiting aggressive data-parallelism and delivering higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing where code changes are very frequent, making it tedious and prone to error to keep different code versions aligned. In this work, we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how codes should be mapped onto the target architecture. We describe the implementation of a code fully written in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing with GPU-specific implementations and showing that a good level of performance-portability can be reached.
A molecular dynamics implementation of the 3D Mercedes-Benz water model

NASA Astrophysics Data System (ADS)

Hynninen, T.; Dias, C. L.; Mkrtchyan, A.; Heinonen, V.; Karttunen, M.; Foster, A. S.; Ala-Nissila, T.

2012-02-01

The three-dimensional Mercedes-Benz model was recently introduced to account for the structural and thermodynamic properties of water. It treats water molecules as point-like particles with four dangling bonds in tetrahedral coordination, representing H-bonds of water. Its conceptual simplicity renders the model attractive in studies where complex behaviors emerge from H-bond interactions in water, e.g., the hydrophobic effect. A molecular dynamics (MD) implementation of the model is non-trivial and we outline here the mathematical framework of its force-field. Useful routines written in modern Fortran are also provided. This open source code is free and can easily be modified to account for different physical contexts. The provided code allows both serial and MPI-parallelized execution.
Program summary
Program title: CASHEW (Coarse Approach Simulator for Hydrogen-bonding Effects in Water)
Catalogue identifier: AEKM_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEKM_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 20 501
No. of bytes in distributed program, including test data, etc.: 551 044
Distribution format: tar.gz
Programming language: Fortran 90
Computer: Program has been tested on desktop workstations and a Cray XT4/XT5 supercomputer.
Operating system: Linux, Unix, OS X
Has the code been vectorized or parallelized?: The code has been parallelized using MPI.
RAM: Depends on size of system, about 5 MB for 1500 molecules.
Classification: 7.7
External routines: A random number generator, Mersenne Twister (http://www.math.sci.hiroshima-u.ac.jp/m-mat/MT/VERSIONS/FORTRAN/mt95.f90), is used. A copy of the code is included in the distribution.
Nature of problem: Molecular dynamics simulation of a new geometric water model.
Solution method: New force-field for water molecules, velocity-Verlet integration, representation of molecules as rigid particles with rotations described using quaternion algebra.
Restrictions: Memory and CPU time limit the size of simulations.
Additional comments: Software web site: https://gitorious.org/cashew/.
Running time: Depends on the size of system. The sample tests provided only take a few seconds.
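The velocity-Verlet integration named in the solution method can be sketched for the simplest possible case. The example below is a one-dimensional point particle in Python, not the CASHEW Fortran code: it omits the Mercedes-Benz force-field and the quaternion treatment of rigid-body rotations, and the harmonic force and the function name are assumptions chosen only to show the update order.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate one particle in one dimension with the velocity-Verlet scheme."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt  # position update
        a_new = force(x) / mass             # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt      # velocity update with averaged acceleration
        a = a_new
    return x, v


# Hypothetical test system: harmonic oscillator with unit mass and spring constant.
x_end, v_end = velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 1000)
energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2  # should stay close to the initial 0.5
```

The scheme needs only one force evaluation per step and keeps the total energy bounded over long runs, which is why it is a standard choice in molecular dynamics.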
Crema

DTIC Science & Technology

2015-08-01

Crema FizzBuzz Program...8 Figure 4: Hello World program written in C...11 Figure 5: Hello World program written in Crema...KLEE Coverage for "Hello, World" Program...14 Table 2: Qmail State-space Explosion

Automatic Adaptation of Tunable Distributed Applications

DTIC Science & Technology

2001-01-01

size, weight, and battery life, with a single CPU, less memory, smaller hard disk, and lower bandwidth network connectivity. The power of PDAs is...wireless, and bluetooth [32] facilities; thus achieving different rates of data transmission. 1 With the trend of “write once, run everywhere...applications, a single component can execute on multiple processors (or machines) in parallel. These parallel applications, written in a specialized language
Heterogeneous scalable framework for multiphase flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morris, Karla Vanessa

2013-09-01

Two categories of challenges confront the developer of computational spray models: those related to the computation and those related to the physics. Regarding the computation, the trend towards heterogeneous, multi- and many-core platforms will require considerable re-engineering of codes written for the current supercomputing platforms. Regarding the physics, accurate methods for transferring mass, momentum and energy from the dispersed phase onto the carrier fluid grid have so far eluded modelers. Significant challenges also lie at the intersection between these two categories. To be competitive, any physics model must be expressible in a parallel algorithm that performs well on evolving computer platforms. This work created an application based on a software architecture where the physics and software concerns are separated in a way that adds flexibility to both. The developed spray-tracking package includes an application programming interface (API) that abstracts away the platform-dependent parallelization concerns, enabling the scientific programmer to write serial code that the API resolves into parallel processes and threads of execution. The project also developed the infrastructure required to provide similar APIs to other applications. The API allows object-oriented Fortran applications direct interaction with Trilinos to support memory management of distributed objects in central processing unit (CPU) and graphics processing unit (GPU) nodes for applications using C++.
JUPITER: Joint Universal Parameter IdenTification and Evaluation of Reliability - An Application Programming Interface (API) for Model Analysis

USGS Publications Warehouse

Banta, Edward R.; Poeter, Eileen P.; Doherty, John E.; Hill, Mary C.

2006-01-01

The Joint Universal Parameter IdenTification and Evaluation of Reliability Application Programming Interface (JUPITER API) improves the computer programming resources available to those developing applications (computer programs) for model analysis. The JUPITER API consists of eleven Fortran-90 modules that provide for encapsulation of data and operations on that data. Each module contains one or more entities: data, data types, subroutines, functions, and generic interfaces. The modules do not constitute computer programs themselves; instead, they are used to construct computer programs. Such computer programs are called applications of the API. The API provides common modeling operations for use by a variety of computer applications. The models being analyzed are referred to here as process models, and may, for example, represent the physics, chemistry, and(or) biology of a field or laboratory system. Process models commonly are constructed using published models such as MODFLOW (Harbaugh et al., 2000; Harbaugh, 2005), MT3DMS (Zheng and Wang, 1996), HSPF (Bicknell et al., 1997), PRMS (Leavesley and Stannard, 1995), and many others. The process model may be accessed by a JUPITER API application as an external program, or it may be implemented as a subroutine within a JUPITER API application. In either case, execution of the model takes place in a framework designed by the application programmer.
This framework can be designed to take advantage of any parallel processing capabilities possessed by the process model, as well as the parallel-processing capabilities of the JUPITER API. Model analyses for which the JUPITER API could be useful include, for example: Compare model results to observed values to determine how well the model reproduces system processes and characteristics. Use sensitivity analysis to determine the information provided by observations to parameters and predictions of interest. Determine the additional data needed to improve selected model predictions. Use calibration methods to modify parameter values and other aspects of the model. Compare predictions to regulatory limits. Quantify the uncertainty of predictions based on the results of one or many simulations using inferential or Monte Carlo methods. Determine how to manage the system to achieve stated objectives. The capabilities provided by the JUPITER API include, for example, communication with process models, parallel computations, compressed storage of matrices, and flexible input capabilities. The input capabilities use input blocks suitable for lists or arrays of data. The input blocks needed for one application can be included within one data file or distributed among many files. Data exchange between different JUPITER API applications or between applications and other programs is supported by data-exchange files. The JUPITER API has already been used to construct a number of applications. Three simple example applications are presented in this report. More complicated applications include the universal inverse code UCODE_2005 (Poeter et al., 2005), the multi-model analysis MMA (Eileen P. Poeter, Mary C. Hill, E.R. Banta, S.W. Mehl, and Steen Christensen, written commun., 2006), and a code named OPR_PPR (Matthew J. Tonkin, Claire R. Tiedeman, Mary C. Hill, and D. Matthew Ely, written commun., 2006). This report describes a set of underlying organizational concepts and complete specifics about the JUPITER API. While understanding the organizational concept presented is useful to understanding the modules, other organizational concepts can be used in applications constructed using the JUPITER API.
Controlling Laboratory Processes From A Personal Computer

NASA Technical Reports Server (NTRS)

Will, H.; Mackin, M. A.

1991-01-01

Computer program provides natural-language process control from IBM PC or compatible computer. Sets up process-control system that either runs without operator or run by workers who have limited programming skills. Includes three smaller programs. Two of them, written in FORTRAN 77, record data and control research processes. Third program, written in Pascal, generates FORTRAN subroutines used by other two programs to identify user commands with device-driving routines written by user. Also includes set of input data allowing user to define user commands to be executed by computer. Requires personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. Also requires FORTRAN 77 compiler and device drivers written by user.
Computation of Reacting Flows in Combustion Processes

NASA Technical Reports Server (NTRS)

Keith, Theo G., Jr.; Chen, Kuo-Huey

1997-01-01

The main objective of this research was to develop an efficient three-dimensional computer code for chemically reacting flows. The main computer code developed is ALLSPD-3D. The ALLSPD-3D computer program is developed for the calculation of three-dimensional, chemically reacting flows with sprays. The ALLSPD code employs a coupled, strongly implicit solution procedure for turbulent spray combustion flows. A stochastic droplet model and an efficient method for treatment of the spray source terms in the gas-phase equations are used to calculate the evaporating liquid sprays. The chemistry treatment in the code is general enough that an arbitrary number of reactions and species can be defined by the users. Also, it is written in generalized curvilinear coordinates with both multi-block and flexible internal blockage capabilities to handle complex geometries. In addition, for general industrial combustion applications, the code provides both dilution and transpiration cooling capabilities. The ALLSPD algorithm, which employs the preconditioning and eigenvalue rescaling techniques, is capable of providing efficient solutions for flows with a wide range of Mach numbers. Although written for three-dimensional flows in general, the code can be used for two-dimensional and axisymmetric flow computations as well. The code is written in such a way that it can be run on various computer platforms (supercomputers, workstations and parallel processors), and the GUI (Graphical User Interface) should provide a user-friendly tool for setting up and running the code.
Operation Corporate: Parallels of the Joint Operational Access Concept

DTIC Science & Technology

2013-04-04

Master of Military Studies Research Paper, September 2012 - April 2013. Operation Corporate...amphibious operations. Discussion: Many books, periodicals, and papers have been written about the Falklands War. Most of these cover a broad overview of the air, naval, and ground campaign, or are very...
A Tutorial on Parallel and Concurrent Programming in Haskell

NASA Astrophysics Data System (ADS)

Peyton Jones, Simon; Singh, Satnam

This practical tutorial introduces the features available in Haskell for writing parallel and concurrent programs. We first describe how to write semi-explicit parallel programs by using annotations to express opportunities for parallelism and to help control the granularity of parallelism for effective execution on modern operating systems and processors. We then describe the mechanisms provided by Haskell for writing explicitly parallel programs with a focus on the use of software transactional memory to help share information between threads. Finally, we show how nested data parallelism can be used to write deterministically parallel programs which allows programmers to use rich data types in data parallel programs which are automatically transformed into flat data parallel versions for efficient execution on multi-core processors.
MASPROP- MASS PROPERTIES OF A RIGID STRUCTURE

NASA Technical Reports Server (NTRS)

Hull, R. A.

1994-01-01

The computer program MASPROP was developed to rapidly calculate the mass properties of complex rigid structural systems. This program's basic premise is that complex systems can be adequately described by a combination of basic elementary structural shapes. Thirteen widely used basic structural shapes are available in this program. They are as follows: Discrete Mass, Cylinder, Truncated Cone, Torus, Beam (arbitrary cross section), Circular Rod (arbitrary cross section), Spherical Segment, Sphere, Hemisphere, Parallelepiped, Swept Trapezoidal Panel, Symmetric Trapezoidal Panels, and a Curved Rectangular Panel. MASPROP provides a designer with a simple technique that requires minimal input to calculate the mass properties of a complex rigid structure and should be useful in any situation where one needs to calculate the center of gravity and moments of inertia of a complex structure. Rigid body analysis is used to calculate mass properties. Mass properties are calculated about component axes that have been rotated to be parallel to the system coordinate axes. Then the system center of gravity is calculated and the mass properties are transferred to axes through the system center of gravity by using the parallel axis theorem. System weight, moments of inertia about the system origin, and the products of inertia about the system center of mass are calculated and printed. From the information about the system center of mass the principal axes of the system and the moments of inertia about them are calculated and printed. The only input required is simple geometric data describing the size and location of each element and the respective material density or weight of each element. This program is written in FORTRAN for execution on a CDC 6000 series computer with a central memory requirement of approximately 62K (octal) of 60 bit words. The development of this program was completed in 1978.
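The transfer step described above (component properties moved to axes through the system center of gravity with the parallel axis theorem) can be illustrated with a minimal Python sketch. It is not MASPROP's FORTRAN code: the `Element` type, the `system_properties` function, and the restriction to moments of inertia (no products of inertia) are assumptions made for illustration, and each element's axes are assumed to be already parallel to the system axes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical element record (not MASPROP's input format)."""
    mass: float
    cg: tuple       # (x, y, z) of the element center of gravity, system coordinates
    inertia: tuple  # (Ixx, Iyy, Izz) about the element's own center of gravity


def system_properties(elements):
    """Return (total mass, system CG, moments of inertia about the system CG)."""
    total = sum(e.mass for e in elements)
    # Mass-weighted average of the element centers of gravity.
    cg = tuple(sum(e.mass * e.cg[i] for e in elements) / total for i in range(3))
    moments = [0.0, 0.0, 0.0]
    for e in elements:
        dx, dy, dz = (e.cg[i] - cg[i] for i in range(3))
        # Parallel axis theorem: I = I_cg + m * d^2, where d is the
        # perpendicular distance between the two parallel axes.
        moments[0] += e.inertia[0] + e.mass * (dy * dy + dz * dz)
        moments[1] += e.inertia[1] + e.mass * (dx * dx + dz * dz)
        moments[2] += e.inertia[2] + e.mass * (dx * dx + dy * dy)
    return total, cg, tuple(moments)


# Two unit point masses placed symmetrically on the x axis.
parts = [
    Element(1.0, (-1.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
    Element(1.0, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
]
total_mass, system_cg, system_moments = system_properties(parts)
```

For the two point masses in the example, the center of gravity falls at the origin and only the moments about the y and z axes are nonzero, since both masses lie on the x axis.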
Evaluation of potential severe accidents during low power and shutdown operations at Surry, Unit 1: Analysis of core damage frequency from internal events during mid-loop operations, Appendices E (Sections E.1--E.8). Volume 2, Part 3A

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chu, T.L.; Musicki, Z.; Kohut, P.

1994-06-01

During 1989, the Nuclear Regulatory Commission (NRC) initiated an extensive program to carefully examine the potential risks during low power and shutdown operations. The program includes two parallel projects being performed by Brookhaven National Laboratory (BNL) and Sandia National Laboratories (SNL). Two plants, Surry (pressurized water reactor) and Grand Gulf (boiling water reactor), were selected as the plants to be studied. The objectives of the program are to assess the risks of severe accidents initiated during plant operational states other than full power operation and to compare the estimated core damage frequencies, important accident sequences and other qualitative and quantitative results with those accidents initiated during full power operation as assessed in NUREG-1150. The objective of this report is to document the approach utilized in the Surry plant and discuss the results obtained. A parallel report for the Grand Gulf plant is prepared by SNL. This study shows that the core-damage frequency during mid-loop operation at the Surry plant is comparable to that of power operation. The authors recognize that there is very large uncertainty in the human error probabilities in this study. This study identified that only a few procedures are available for mitigating accidents that may occur during shutdown. Procedures written specifically for shutdown accidents would be useful.
An IBM 370 assembly language program verifier

NASA Technical Reports Server (NTRS)

Maurer, W. D.

1977-01-01

The paper describes a program written in SNOBOL which verifies the correctness of programs written in assembly language for the IBM 360 and 370 series of computers. The motivation for using assembly language as a source language for a program verifier was the realization that many errors in programs are caused by misunderstanding or ignorance of the characteristics of specific computers. The proof of correctness of a program written in assembly language must take these characteristics into account. The program has been compiled and is currently running at the Center for Academic and Administrative Computing of The George Washington University.
chemf: A purely functional chemistry toolkit.

PubMed

Höck, Stefan; Riedl, Rainer

2012-12-20

Although programming in a type-safe and referentially transparent style offers several advantages over working with mutable data structures and side effects, this style of programming has not seen much use in chemistry-related software. Since functional programming languages were designed with referential transparency in mind, these languages offer a lot of support when writing immutable data structures and side-effects free code. We therefore started implementing our own toolkit based on the above programming paradigms in a modern, versatile programming language. We present our initial results with functional programming in chemistry by first describing an immutable data structure for molecular graphs together with a couple of simple algorithms to calculate basic molecular properties before writing a complete SMILES parser in accordance with the OpenSMILES specification. Along the way we show how to deal with input validation, error handling, bulk operations, and parallelization in a purely functional way. At the end we also analyze and improve our algorithms and data structures in terms of performance and compare it to existing toolkits both object-oriented and purely functional. All code was written in Scala, a modern multi-paradigm programming language with a strong support for functional programming and a highly sophisticated type system. We have successfully made the first important steps towards a purely functional chemistry toolkit. The data structures and algorithms presented in this article perform well while at the same time they can be safely used in parallelized applications, such as computer aided drug design experiments, without further adjustments. This stands in contrast to existing object-oriented toolkits where thread safety of data structures and algorithms is a deliberate design decision that can be hard to implement. 
Finally, the level of type-safety achieved by Scala highly increased the reliability of our code as well as the productivity of the programmers involved in this project.
chemf: A purely functional chemistry toolkit

PubMed Central

2012-01-01

Background: Although programming in a type-safe and referentially transparent style offers several advantages over working with mutable data structures and side effects, this style of programming has not seen much use in chemistry-related software. Since functional programming languages were designed with referential transparency in mind, these languages offer a lot of support when writing immutable data structures and side-effects free code. We therefore started implementing our own toolkit based on the above programming paradigms in a modern, versatile programming language. Results: We present our initial results with functional programming in chemistry by first describing an immutable data structure for molecular graphs together with a couple of simple algorithms to calculate basic molecular properties before writing a complete SMILES parser in accordance with the OpenSMILES specification. Along the way we show how to deal with input validation, error handling, bulk operations, and parallelization in a purely functional way. At the end we also analyze and improve our algorithms and data structures in terms of performance and compare it to existing toolkits both object-oriented and purely functional. All code was written in Scala, a modern multi-paradigm programming language with a strong support for functional programming and a highly sophisticated type system. Conclusions: We have successfully made the first important steps towards a purely functional chemistry toolkit. The data structures and algorithms presented in this article perform well while at the same time they can be safely used in parallelized applications, such as computer aided drug design experiments, without further adjustments. This stands in contrast to existing object-oriented toolkits where thread safety of data structures and algorithms is a deliberate design decision that can be hard to implement. Finally, the level of type-safety achieved by Scala highly increased the reliability of our code as well as the productivity of the programmers involved in this project. PMID:23253942
King of the 40th parallel - Discovery in the American West

USGS Publications Warehouse

Moore, James G.

2006-01-01

This book recounts the life and achievements of Clarence King, widely recognized as one of America’s most gifted intellectuals of the nineteenth century, and a legendary figure in the American West. King’s genius, singular accomplishments, and near-death adventures unfold in a narrative centered on his personal relationship with his lifelong friend and colleague, James Gardner. The two, upon completing their studies at Yale, traveled by wagon train across the continent and worked with the California Geological Survey. King went on to establish the Geological Exploration of the 40th Parallel, a government mapping program that stretched across the western mountain chains from California to Wyoming. This was the precursor to the U.S. Geological Survey (USGS). Founded in 1879, with Clarence King as its architect and first director, the USGS became the most important and influential science agency in the nation. The adventurous aspects of conducting geological fieldwork in the West, much of them documented by letters written by King and Gardner, punctuate a book copiously illustrated with historic maps and photographs showing localities and people important to the story.
A Concept for Run-Time Support of the Chapel Language

NASA Technical Reports Server (NTRS)

James, Mark

2006-01-01

A document presents a concept for run-time implementation of other concepts embodied in the Chapel programming language. (Now undergoing development, Chapel is intended to become a standard language for parallel computing that would surpass older such languages in both computational performance and the efficiency with which pre-existing code can be reused and new code written.) The aforementioned other concepts are those of distributions, domains, allocations, and access, as defined in a separate document called "A Semantic Framework for Domains and Distributions in Chapel" and linked to a language specification defined in another separate document called "Chapel Specification 0.3." The concept presented in the instant report is recognition that a data domain that was invented for Chapel offers a novel approach to distributing and processing data in a massively parallel environment. The concept is offered as a starting point for development of working descriptions of functions and data structures that would be necessary to implement interfaces to a compiler for transforming the aforementioned other concepts from their representations in Chapel source code to their run-time implementations.
Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.
TS-SRP/PACK - COMPUTER PROGRAMS TO CHARACTERIZE ALLOYS AND PREDICT CYCLIC LIFE USING THE TOTAL STRAIN VERSION OF STRAINRANGE PARTITIONING

NASA Technical Reports Server (NTRS)

Saltsman, J. F.

1994-01-01

TS-SRP/PACK is a set of computer programs for characterizing and predicting fatigue and creep-fatigue resistance of metallic materials in the high-temperature, long-life regime for isothermal and nonisothermal fatigue. The programs use the total strain version of the Strainrange Partitioning (TS-SRP). The user should be thoroughly familiar with the TS-SRP method before attempting to use any of these programs. The document for this program includes a theory manual as well as a detailed user's manual with a tutorial to guide the user in the proper use of TS-SRP. An extensive database has also been developed in a parallel effort. This database is an excellent source of high-temperature, creep-fatigue test data and can be used with other life-prediction methods as well. Five programs are included in TS-SRP/PACK along with the alloy database. The TABLE program is used to print the datasets, which are in NAMELIST format, in a reader friendly format. INDATA is used to create new datasets or add to existing ones. The FAIL program is used to characterize the failure behavior of an alloy as given by the constants in the strainrange-life relations used by the total strain version of SRP (TS-SRP) and the inelastic strainrange-based version of SRP. The program FLOW is used to characterize the flow behavior (the constitutive response) of an alloy as given by the constants in the flow equations used by TS-SRP. Finally, LIFE is used to predict the life of a specified cycle, using the constants characterizing failure and flow behavior determined by FAIL and FLOW. LIFE is written in interpretive BASIC to avoid compiling and linking every time the equation constants are changed. Four out of five programs in this package are written in FORTRAN 77 for IBM PC series and compatible computers running MS-DOS and are designed to read data using the NAMELIST format statement. The fifth is written in BASIC version 3.0 for IBM PC series and compatible computers running MS-DOS version 3.10. 
The executables require at least 239K of memory and DOS 3.1 or higher. To compile the source, a Lahey FORTRAN compiler is required. Source code modifications will be necessary if the compiler to be used does not support NAMELIST input. Probably the easiest revision to make is to use a list-directed READ statement. The standard distribution medium for this program is a set of two 5.25 inch 360K MS-DOS format diskettes. The contents of the diskettes are compressed using the PKWARE archiving tools. The utility to unarchive the files, PKUNZIP.EXE, is included. TS-SRP/PACK was developed in 1992.
Introduction to the Atari Computer. A Program Written in the Pilot Programming Language.

ERIC Educational Resources Information Center

Schlenker, Richard M.

Designed to be an introduction to the Atari microcomputers for beginners, the interactive computer program listed in this document is written in the Pilot programming language. Instructions are given for entering and storing the program in the computer memory for use by students. (MES)
Progress report on Nuclear Density project with Lawrence Livermore National Lab Year 2010

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, C W; Krastev, P; Ormand, W E

2011-03-11

The main goal for year 2010 was to improve parallelization of the configuration interaction code BIGSTICK, co-written by W. Erich Ormand (LLNL) and Calvin W. Johnson (SDSU), with the parallelization carried out primarily by Plamen Krastev, a postdoc at SDSU and funded in part by this grant. The central computational algorithm is the Lanczos algorithm, which consists of a matrix-vector multiplication (matvec), followed by a Gram-Schmidt reorthogonalization.
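The Lanczos iteration named above (a matrix-vector multiplication followed by Gram-Schmidt reorthogonalization) can be sketched in a few lines of plain Python. This is a generic textbook form, not BIGSTICK's implementation: the function names, the dense list-of-lists matrix, and the choice of full reorthogonalization against every earlier Lanczos vector are assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Dense matrix-vector product (the 'matvec' step)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lanczos(matrix, start, steps):
    """Return (alphas, betas, basis) for a symmetric matrix.

    alphas and betas are the diagonal and off-diagonal entries of the
    tridiagonal matrix built by the iteration; basis holds the
    orthonormal Lanczos vectors.
    """
    norm = math.sqrt(dot(start, start))
    basis = [[x / norm for x in start]]
    alphas, betas = [], []
    for _ in range(steps):
        w = matvec(matrix, basis[-1])
        alphas.append(dot(w, basis[-1]))
        # Gram-Schmidt: remove the components along all earlier vectors.
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        beta = math.sqrt(dot(w, w))
        if beta < 1e-12:  # invariant subspace reached, stop early
            break
        betas.append(beta)
        basis.append([wi / beta for wi in w])
    return alphas, betas, basis


# Small symmetric example whose eigenvalues are 1 and 3.
alphas, betas, basis = lanczos([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0], steps=2)
```

In a production code the matvec dominates the cost and is the natural target for parallelization, while the reorthogonalization needs global dot products across all processes.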
Reduze - Feynman integral reduction in C++

NASA Astrophysics Data System (ADS)

Studerus, C.

2010-07-01

Reduze is a computer program for reducing Feynman integrals to master integrals employing a Laporta algorithm. The program is written in C++ and uses classes provided by the GiNaC library to perform the simplifications of the algebraic prefactors in the system of equations. Reduze offers the possibility to run reductions in parallel.
Program summary
Program title: Reduze
Catalogue identifier: AEGE_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGE_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: yes
No. of lines in distributed program, including test data, etc.: 55 433
No. of bytes in distributed program, including test data, etc.: 554 866
Distribution format: tar.gz
Programming language: C++
Computer: All
Operating system: Unix/Linux
Number of processors used: The number of processors is problem dependent. More than one possible but not arbitrarily many.
RAM: Depends on the complexity of the system.
Classification: 4.4, 5
External routines: CLN (http://www.ginac.de/CLN/), GiNaC (http://www.ginac.de/)
Nature of problem: Solving large systems of linear equations with Feynman integrals as unknowns and rational polynomials as prefactors.
Solution method: Using a Gauss/Laporta algorithm to solve the system of equations.
Restrictions: Limitations depend on the complexity of the system (number of equations, number of kinematic invariants).
Running time: Depends on the complexity of the system.
EMAN2: an extensible image processing suite for electron microscopy.

PubMed

Tang, Guang; Peng, Liwei; Baldwin, Philip R; Mann, Deepinder S; Jiang, Wen; Rees, Ian; Ludtke, Steven J

2007-01-01

EMAN is a scientific image processing package with a particular focus on single particle reconstruction from transmission electron microscopy (TEM) images. It was first released in 1999, and new versions have been released typically 2-3 times each year since that time. EMAN2 has been under development for the last two years, with a completely refactored image processing library, and a wide range of features to make it much more flexible and extensible than EMAN1. The user-level programs are better documented, more straightforward to use, and written in the Python scripting language, so advanced users can modify the programs' behavior without any recompilation. A completely rewritten 3D transformation class simplifies translation between Euler angle standards and symmetry conventions. The core C++ library has over 500 functions for image processing and associated tasks, and it is modular with introspection capabilities, so programmers can add new algorithms with minimal effort and programs can incorporate new capabilities automatically. Finally, a flexible new parallelism system has been designed to address the shortcomings in the rigid system in EMAN1.

Parallel Eclipse Project Checkout

NASA Technical Reports Server (NTRS)

Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Powell, Mark W.; Bachmann, Andrew G.

2011-01-01

Parallel Eclipse Project Checkout (PEPC) is a program written to leverage parallelism and to automate the checkout process of plug-ins created in Eclipse RCP (Rich Client Platform). Eclipse plug-ins can be aggregated in a feature project. This innovation digests a feature description (XML file) and automatically checks out all of the plug-ins listed in the feature. This resolves the issue of manually checking out each plug-in required to work on the project. To minimize the amount of time necessary to check out the plug-ins, this program makes the plug-in checkouts parallel. After the feature is parsed, a checkout request for each plug-in in the feature is queued. These requests are handled by a thread pool with a configurable number of threads. By checking out the plug-ins in parallel, the checkout process is streamlined before getting started on the project. For instance, projects that took 30 minutes to check out now take less than 5 minutes. The effect is especially clear on a Mac, which has a network monitor displaying the bandwidth use. When running the client from a developer's home, the checkout process now saturates the bandwidth in order to get all the plug-ins checked out as fast as possible. For comparison, a checkout process that ranged from 8-200 Kbps from a developer's home is now able to saturate a pipe of 1.3 Mbps, resulting in significantly faster checkouts. The Eclipse IDE (integrated development environment) tries to build a project as soon as it is downloaded. As part of another optimization, this innovation programmatically tells Eclipse to stop building while checkouts are happening, which dramatically reduces lock contention and enables plug-ins to continue downloading until all of them finish. Furthermore, the software re-enables automatic building, and forces Eclipse to do a clean build once it finishes checking out all of the plug-ins. This software is fully generic and does not contain any NASA-specific code. It can be applied to any Eclipse-based repository with a similar structure. It also can apply build parameters and preferences automatically at the end of the checkout.
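The pattern described in this record (parse the feature description, queue one checkout request per plug-in, and serve the requests from a thread pool of configurable size) might look roughly like the following Python sketch. It is not the PEPC source, which is not shown here: the sample XML, the `plugin_ids`, `checkout`, and `checkout_all` names are hypothetical, the feature file is assumed to list plug-ins as `plugin` elements with an `id` attribute, and the checkout itself is a placeholder rather than a real repository call.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, simplified feature description used only for this sketch.
FEATURE_XML = """<feature id="example.feature">
  <plugin id="example.core"/>
  <plugin id="example.ui"/>
  <plugin id="example.net"/>
</feature>"""


def plugin_ids(feature_xml):
    """Return the ids of all plug-ins listed in the feature description."""
    root = ET.fromstring(feature_xml)
    return [plugin.get("id") for plugin in root.iter("plugin")]


def checkout(plugin_id):
    """Placeholder for a real version-control checkout of one plug-in."""
    return f"checked out {plugin_id}"


def checkout_all(feature_xml, max_workers=4):
    """Check out every plug-in in the feature using a pool of worker threads."""
    ids = plugin_ids(feature_xml)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even when tasks finish out of order.
        return list(pool.map(checkout, ids))


results = checkout_all(FEATURE_XML, max_workers=2)
```

Threads suit this workload because each checkout spends most of its time waiting on the network, so a modest pool can keep the link busy; `max_workers` plays the role of the configurable thread count.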
Generic command interpreter for robot controllers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Werner, J.

1991-04-09

Generic command interpreter programs have been written for robot controllers at Sandia National Laboratories (SNL). Each interpreter program resides on a robot controller and interfaces the controller with a supervisory program on another (host) computer. We call these interpreter programs monitors because they wait, monitoring a communication line, for commands from the supervisory program. These monitors are designed to interface with the object-oriented software structure of the supervisory programs. The functions of the monitor programs are written in each robot controller's native language but reflect the object-oriented functions of the supervisory programs. These functions and other specifics of the monitor programs written for three different robots at SNL will be discussed. 4 refs., 4 figs.
JSD: Parallel Job Accounting on the IBM SP2

NASA Technical Reports Server (NTRS)

Saphir, William; Jones, James Patton; Walter, Howard (Technical Monitor)

1995-01-01

The IBM SP2 is one of the most promising parallel computers for scientific supercomputing - it is fast and usually reliable. One of its biggest problems is a lack of robust and comprehensive system software. Among other things, this software allows a collection of Unix processes to be treated as a single parallel application. It does not, however, provide accounting for parallel jobs other than what is provided by AIX for the individual process components. Without parallel job accounting, it is not possible to monitor system use, measure the effectiveness of system administration strategies, or identify system bottlenecks. To address this problem, we have written jsd, a daemon that collects accounting data for parallel jobs. jsd records information in a format that is easily machine- and human-readable, allowing us to extract the most important accounting information with very little effort. jsd also notifies system administrators in certain cases of system failure.
Application programs written by using customizing tools of a computer-aided design system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, X.; Huang, R.; Juricic, D.

1995-12-31

Customizing tools of Computer-Aided Design Systems have been developed to such a degree as to become equivalent to powerful higher-level programming languages that are especially suitable for graphics applications. Two examples of application programs written by using AutoCAD's customizing tools are given in some detail to illustrate their power. One tool uses AutoLISP list-processing language to develop an application program that produces four views of a given solid model. The other uses AutoCAD Developmental System, based on program modules written in C, to produce an application program that renders a freehand sketch from a given CAD drawing.
GANDALF - Graphical Astrophysics code for N-body Dynamics And Lagrangian Fluids

NASA Astrophysics Data System (ADS)

Hubber, D. A.; Rosotti, G. P.; Booth, R. A.

2018-01-01

GANDALF is a new hydrodynamics and N-body dynamics code designed for investigating planet formation, star formation and star cluster problems. GANDALF is written in C++, parallelized with both OPENMP and MPI and contains a PYTHON library for analysis and visualization. The code has been written with a fully object-oriented approach to easily allow user-defined implementations of physics modules or other algorithms. The code currently contains implementations of smoothed particle hydrodynamics, meshless finite-volume and collisional N-body schemes, but can easily be adapted to include additional particle schemes. We present in this paper the details of its implementation, results from the test suite, serial and parallel performance results and discuss the planned future development. The code is freely available as an open source project on the code-hosting website github at https://github.com/gandalfcode/gandalf and is available under the GPLv2 license.
Linear ground-water flow, flood-wave response program for programmable calculators

USGS Publications Warehouse

Kernodle, John Michael

1978-01-01

Two programs are documented which solve a discretized analytical equation derived to determine head changes at a point in a one-dimensional ground-water flow system. The programs, written for programmable calculators, are in widely divergent but commonly encountered languages and serve to illustrate the adaptability of the linear model to use in situations where access to true computers is not possible or economical. The analytical method assumes a semi-infinite aquifer which is uniform in thickness and hydrologic characteristics, bounded on one side by an impermeable barrier and on the other parallel side by a fully penetrating stream in complete hydraulic connection with the aquifer. Ground-water heads may be calculated for points along a line which is perpendicular to the impermeable barrier and the fully penetrating stream. Head changes at the observation point are dependent on (1) the distance between that point and the impermeable barrier, (2) the distance between the line of stress (the stream) and the impermeable barrier, (3) aquifer diffusivity, (4) time, and (5) head changes along the line of stress. The primary application of the programs is to determine aquifer diffusivity by the flood-wave response technique. (Woodard-USGS)
Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers

NASA Technical Reports Server (NTRS)

Su, Ernesto; Lain, Antonio; Ramaswamy, Shankar; Palermo, Daniel J.; Hodges, Eugene W., IV; Banerjee, Prithviraj

1995-01-01

The PARADIGM compiler project provides an automated means to parallelize programs, written in a serial programming model, for efficient execution on distributed-memory multicomputers. A previous implementation of the compiler based on the PTD representation allowed symbolic array sizes, affine loop bounds and array subscripts, and variable number of processors, provided that arrays were single or multi-dimensionally block distributed. The techniques presented here extend the compiler to also accept multidimensional cyclic and block-cyclic distributions within a uniform symbolic framework. These extensions demand more sophisticated symbolic manipulation capabilities. A novel aspect of our approach is to meet this demand by interfacing PARADIGM with a powerful off-the-shelf symbolic package, Mathematica. This paper describes some of the Mathematica routines that perform various transformations, shows how they are invoked and used by the compiler to overcome the new challenges, and presents experimental results for code involving cyclic and block-cyclic arrays as evidence of the feasibility of the approach.
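The three array distributions mentioned above differ only in how an array index is mapped to the processor that owns it. The Python sketch below shows one common one-dimensional convention for each mapping; it is not taken from PARADIGM, and the function names and the ceiling-sized blocks in the block case are assumptions made for illustration.

```python
def block_owner(index, size, procs):
    """Block: contiguous chunks of ceil(size / procs) elements per processor."""
    block = -(-size // procs)  # ceiling division
    return index // block


def cyclic_owner(index, procs):
    """Cyclic: elements are dealt out one at a time, round-robin."""
    return index % procs


def block_cyclic_owner(index, block, procs):
    """Block-cyclic: blocks of `block` elements are dealt out round-robin."""
    return (index // block) % procs


# Owners of an 8-element array under each distribution.
block_map = [block_owner(i, 8, 4) for i in range(8)]
cyclic_map = [cyclic_owner(i, 4) for i in range(8)]
block_cyclic_map = [block_cyclic_owner(i, 2, 2) for i in range(8)]
```

Block-cyclic reduces to cyclic when the block size is 1 and to block when the block size is ceil(size / procs), which is one reason the three can share a single symbolic treatment.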
Clarity: An Open Source Manager for Laboratory Automation

PubMed Central

Delaney, Nigel F.; Echenique, José Rojas; Marx, Christopher J.

2013-01-01

Software to manage automated laboratories interfaces with hardware instruments, gives users a way to specify experimental protocols, and schedules activities to avoid hardware conflicts. In addition to these basics, modern laboratories need software that can run multiple different protocols in parallel and that can be easily extended to interface with a constantly growing diversity of techniques and instruments. We present Clarity: a laboratory automation manager that is hardware agnostic, portable, extensible and open source. Clarity provides critical features including remote monitoring, robust error reporting by phone or email, and full state recovery in the event of a system crash. We discuss the basic organization of Clarity; demonstrate an example of its implementation for the automated analysis of bacterial growth; and describe how the program can be extended to manage new hardware. Clarity is mature; well documented; actively developed; written in C# for the Common Language Infrastructure; and is free and open source software. These advantages set Clarity apart from currently available laboratory automation programs. PMID:23032169
Understanding and representing natural language meaning

NASA Astrophysics Data System (ADS)

Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.

1982-12-01

During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

PubMed

Nadkarni, P M; Miller, P L

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
Evaluation of potential severe accidents during low power and shutdown operations at Surry, Unit 1: Analysis of core damage frequency from internal events during mid-loop operations, Appendices A--D. Volume 2, Part 2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chu, T.L.; Musicki, Z.; Kohut, P.

1994-06-01

During 1989, the Nuclear Regulatory Commission (NRC) initiated an extensive program to carefully examine the potential risks during low power and shutdown operations. The program includes two parallel projects being performed by Brookhaven National Laboratory (BNL) and Sandia National Laboratories (SNL). Two plants, Surry (pressurized water reactor) and Grand Gulf (boiling water reactor), were selected as the plants to be studied. The objectives of the program are to assess the risks of severe accidents initiated during plant operational states other than full power operation and to compare the estimated core damage frequencies, important accident sequences and other qualitative and quantitative results with those accidents initiated during full power operation as assessed in NUREG-1150. The objective of this report is to document the approach utilized in the Surry plant and discuss the results obtained. A parallel report for the Grand Gulf plant is prepared by SNL. This study shows that the core-damage frequency during mid-loop operation at the Surry plant is comparable to that of power operation. We recognize that there is very large uncertainty in the human error probabilities in this study. This study identified that only a few procedures are available for mitigating accidents that may occur during shutdown. Procedures written specifically for shutdown accidents would be useful. This document, Volume 2, Pt. 2 provides appendices A through D of this report.
Application of CHAD hydrodynamics to shock-wave problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trease, H.E.; O`Rourke, P.J.; Sahota, M.S.

1997-12-31

CHAD is the latest in a sequence of continually evolving computer codes written to effectively utilize massively parallel computer architectures and the latest grid generators for unstructured meshes. Its applications range from automotive design issues such as in-cylinder and manifold flows of internal combustion engines, vehicle aerodynamics, underhood cooling and passenger compartment heating, ventilation, and air conditioning to shock hydrodynamics and materials modeling. CHAD solves the full unsteady Navier-Stokes equations with the k-epsilon turbulence model in three space dimensions. The code has four major features that distinguish it from the earlier KIVA code, also developed at Los Alamos. First, it is based on a node-centered, finite-volume method in which, like finite element methods, all fluid variables are located at computational nodes. The computational mesh efficiently and accurately handles all element shapes ranging from tetrahedra to hexahedra. Second, it is written in standard Fortran 90 and relies on automatic domain decomposition and a universal communication library written in standard C and MPI for unstructured grids to effectively exploit distributed-memory parallel architectures. Thus the code is fully portable to a variety of computing platforms such as uniprocessor workstations, symmetric multiprocessors, clusters of workstations, and massively parallel platforms. Third, CHAD utilizes a variable explicit/implicit upwind method for convection that improves computational efficiency in flows that have large velocity Courant number variations due to velocity or mesh size variations. Fourth, CHAD is designed to also simulate shock hydrodynamics involving multimaterial anisotropic behavior under high shear. The authors will discuss CHAD capabilities and show several sample calculations showing the strengths and weaknesses of CHAD.
Parallel programming with Easy Java Simulations

NASA Astrophysics Data System (ADS)

Esquembre, F.; Christian, W.; Belloni, M.

2018-01-01

Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we lower the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
5 CFR 370.105 - Written agreements.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 5 Administrative Personnel 1 2011-01-01 2011-01-01 false Written agreements. 370.105 Section 370... TECHNOLOGY EXCHANGE PROGRAM § 370.105 Written agreements. Before the detail begins, the agency and private sector organization must enter into a written agreement with the individual(s) detailed. The written...
5 CFR 370.105 - Written agreements.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 5 Administrative Personnel 1 2013-01-01 2013-01-01 false Written agreements. 370.105 Section 370... TECHNOLOGY EXCHANGE PROGRAM § 370.105 Written agreements. Before the detail begins, the agency and private sector organization must enter into a written agreement with the individual(s) detailed. The written...
Improving Written Expression of Seventh Grade Mildly Intellectually Disabled Students Utilizing a Basal Reading Program, Journal Writing and Computer Applications.

ERIC Educational Resources Information Center

Wimberly, Sabrenai R.

A practicum was designed to increase mildly intellectually disabled students' written communication skills by demonstrating functional written expression skills in daily assignments and in social communication. A sequenced reading and language program with the integration of journal writing and computer applications was utilized. Seventh- and…
Genetic Parallel Programming: design and implementation.

PubMed

Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

2006-01-01

This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

PubMed Central

Nadkarni, P. M.; Miller, P. L.

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations. PMID:1807632
Diversity in Libraries: Academic Residency Programs. Contributions in Librarianship and Information Science.

ERIC Educational Resources Information Center

Cogell, Raquel V., Ed.; Gruwell, Cindy A., Ed.

This book contains 15 essays written by 19 librarians who participated in minority residency programs in academic libraries and 5 essays written by 6 professionals who served as residency program administrators. The following essays are included: (1) "The University of California, Santa Barbara Fellowship--A Program in Transition" (Detrice…
Bilingual parallel programming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Foster, I.; Overbeek, R.

1990-01-01

Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seem to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.

Survey: Computer Usage in Design Courses.

ERIC Educational Resources Information Center

Henley, Ernest J.

1983-01-01

Presents results of a survey of chemical engineering departments regarding computer usage in senior design courses. Results are categorized according to: computer usage (use of process simulators, student-written programs, faculty-written or "canned" programs); costs (hard and soft money); and available software. Programs offered are…
Culminating Point and the 38th Parallel

DTIC Science & Technology

1994-01-01

Air War College, Air University. The Culminating Point and the 38th Parallel, by James L. Bryan, Lieutenant Colonel, USA ... securing the only attainable objective the following Spring. Why do this analysis on the Korean War when so much has already been written about it
Research in Computational Aeroscience Applications Implemented on Advanced Parallel Computing Systems

NASA Technical Reports Server (NTRS)

Wigton, Larry

1996-01-01

Improving the numerical linear algebra routines for use in new Navier-Stokes codes, specifically Tim Barth's unstructured grid code, with spin-offs to TRANAIR is reported. A fast distance calculation routine for Navier-Stokes codes using the new one-equation turbulence models is written. The primary focus of this work was devoted to improving matrix-iterative methods. New algorithms have been developed which activate the full potential of classical Cray-class computers as well as distributed-memory parallel computers.
41 CFR 60-1.40 - Affirmative action programs.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 41 Public Contracts and Property Management 1 2011-07-01 2009-07-01 true Affirmative action... written affirmative action program for each of its establishments, if it has 50 or more employees and: (i... each nonconstruction subcontractor to develop and maintain a written affirmative action program for...
41 CFR 60-1.40 - Affirmative action programs.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 41 Public Contracts and Property Management 1 2010-07-01 2010-07-01 true Affirmative action... written affirmative action program for each of its establishments, if it has 50 or more employees and: (i... each nonconstruction subcontractor to develop and maintain a written affirmative action program for...
A parallel strategy for implementing real-time expert systems using CLIPS

NASA Technical Reports Server (NTRS)

Ilyes, Laszlo A.; Villaseca, F. Eugenio; Delaat, John

1994-01-01

As evidenced by current literature, there appears to be a continued interest in the study of real-time expert systems. It is generally recognized that speed of execution is only one consideration when designing an effective real-time expert system. Some other features one must consider are the expert system's ability to perform temporal reasoning, handle interrupts, prioritize data, contend with data uncertainty, and perform context focusing as dictated by the incoming data to the expert system. This paper presents a strategy for implementing a real time expert system on the iPSC/860 hypercube parallel computer using CLIPS. The strategy takes into consideration not only the execution time of the software, but also those features which define a true real-time expert system. The methodology is then demonstrated using a practical implementation of an expert system which performs diagnostics on the Space Shuttle Main Engine (SSME). This particular implementation uses an eight node hypercube to process ten sensor measurements in order to simultaneously diagnose five different failure modes within the SSME. The main program is written in ANSI C and embeds CLIPS to better facilitate and debug the rule based expert system.
Recent advances and future prospects for Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B

2010-01-01

The history of Monte Carlo methods is closely linked to that of computers: the first known Monte Carlo program was written in 1947 for the ENIAC; a pre-release of the first Fortran compiler was used for Monte Carlo in 1957; Monte Carlo codes were adapted to vector computers in the 1980s, clusters and parallel computers in the 1990s, and teraflop systems in the 2000s. Recent advances include hierarchical parallelism, combining threaded calculations on multicore processors with message-passing among different nodes. With the advances in computing, Monte Carlo codes have evolved with new capabilities and new ways of use. Production codes such as MCNP, MVP, MONK, TRIPOLI and SCALE are now 20-30 years old (or more) and are very rich in advanced features. The former 'method of last resort' has now become the first choice for many applications. Calculations are now routinely performed on office computers, not just on supercomputers. Current research and development efforts are investigating the use of Monte Carlo methods on FPGAs, GPUs, and many-core processors. Other far-reaching research is exploring ways to adapt Monte Carlo methods to future exaflop systems that may have 1M or more concurrent computational processes.
A parallel and modular deformable cell Car-Parrinello code

NASA Astrophysics Data System (ADS)

Cavazzoni, Carlo; Chiarotti, Guido L.

1999-12-01

We have developed a modular parallel code implementing the Car-Parrinello [Phys. Rev. Lett. 55 (1985) 2471] algorithm including the variable cell dynamics [Europhys. Lett. 36 (1994) 345; J. Phys. Chem. Solids 56 (1995) 510]. Our code is written in Fortran 90, and makes use of some new programming concepts like encapsulation, data abstraction and data hiding. The code has a multi-layer hierarchical structure with tree-like dependencies among modules. The modules include not only the variables but also the methods acting on them, in an object-oriented fashion. The modular structure allows easier code maintenance, development and debugging, and is suitable for a developer team. The layer structure permits high portability. The code displays an almost linear speed-up over a wide range of processor counts, independently of the architecture. Super-linear speed-up is obtained with a "smart" Fast Fourier Transform (FFT) that uses the available memory on the single node (increasing for a fixed problem with the number of processing elements) as temporary buffer to store wave function transforms. This code has been used to simulate water and ammonia at giant planet conditions for systems as large as 64 molecules for ~50 ps.
Toward Petascale Biologically Plausible Neural Networks

NASA Astrophysics Data System (ADS)

Long, Lyle

This talk will describe an approach to achieving petascale neural networks. Artificial intelligence has been oversold for many decades. Computers in the beginning could only do about 16,000 operations per second. Computer processing power, however, has been doubling every two years thanks to Moore's law, and growing even faster due to massively parallel architectures. Finally, 60 years after the first AI conference we have computers on the order of the performance of the human brain (10^16 operations per second). The main issues now are algorithms, software, and learning. We have excellent models of neurons, such as the Hodgkin-Huxley model, but we do not know how the human neurons are wired together. With careful attention to efficient parallel computing, event-driven programming, table lookups, and memory minimization, massive scale simulations can be performed. The code that will be described was written in C++ and uses the Message Passing Interface (MPI). It uses the full Hodgkin-Huxley neuron model, not a simplified model. It also allows arbitrary network structures (deep, recurrent, convolutional, all-to-all, etc.). The code is scalable, and has, so far, been tested on up to 2,048 processor cores using 10^7 neurons and 10^9 synapses.
Optimization of Particle-in-Cell Codes on RISC Processors

NASA Technical Reports Server (NTRS)

Decyk, Viktor K.; Karmesin, Steve Roy; Boer, Aeint de; Liewer, Paulette C.

1996-01-01

General strategies are developed to optimize particle-in-cell codes written in Fortran for RISC processors which are commonly used on massively parallel computers. These strategies include data reorganization to improve cache utilization and code reorganization to improve efficiency of arithmetic pipelines.
Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers

NASA Astrophysics Data System (ADS)

Ogino, T.

High Performance Fortran (HPF) is one of the modern and common techniques for achieving high-performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code could be shown to be almost comparable to that of VPP Fortran. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation that permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
Programming Details for MDPLOT: A Program for Plotting Multi-Dimensional Data

Treesearch

W.L. Nance; B.H. Polmer; G.C. Keith

1975-01-01

The program is written in ASA FORTRAN IV and consists of the main program (MAIN) with 14 subroutines. Subroutines SETUP, PLOT, GRID, SCALE, and 01s are microfilm-dependent and therefore must be replaced with the equivalent routines written for the high resolution plotting device available at the user's installation. The calls to these subroutines are flagged...
Parallel mutual information estimation for inferring gene regulatory networks on GPUs

PubMed Central

2011-01-01

Background Mutual information is a measure of similarity between two variables. It has been widely used in various application domains including computational biology, machine learning, statistics, image processing, and financial computing. Previously used simple histogram based mutual information estimators lack the precision in quality compared to kernel based methods. The recently introduced B-spline function based mutual information estimation method is competitive to the kernel based methods in terms of quality but at a lower computational complexity. Results We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. The comparisons to existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. Conclusions CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup over sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:21672264
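As a rough illustration of the quantity being estimated, the following is a minimal sketch of the simple histogram-based estimator that the abstract contrasts with kernel and B-spline methods. It is not the CUDA-MI algorithm; the function name `mutual_information` and the `bins` parameter are assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of I(X;Y) in bits for two equal-length samples.

    Illustrative only: equal-width binning, no bias correction.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant data: fall back to one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over the occupied joint bins.
    return sum(
        (c / n) * math.log2((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in pxy.items()
    )
```

For a gene-network setting, `xs` and `ys` would be the expression profiles of two genes across the same experiments; each gene pair is independent of the others, which is what makes the all-pairs computation a natural fit for a GPU.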
Brainlab: A Python Toolkit to Aid in the Design, Simulation, and Analysis of Spiking Neural Networks with the NeoCortical Simulator.

PubMed

Drewes, Rich; Zou, Quan; Goodman, Philip H

2009-01-01

Neuroscience modeling experiments often involve multiple complex neural network and cell model variants, complex input stimuli and input protocols, followed by complex data analysis. Coordinating all this complexity becomes a central difficulty for the experimenter. The Python programming language, along with its extensive library packages, has emerged as a leading "glue" tool for managing all sorts of complex programmatic tasks. This paper describes a toolkit called Brainlab, written in Python, that leverages Python's strengths for the task of managing the general complexity of neuroscience modeling experiments. Brainlab was also designed to overcome the major difficulties of working with the NCS (NeoCortical Simulator) environment in particular. Brainlab is an integrated model-building, experimentation, and data analysis environment for the powerful parallel spiking neural network simulator system NCS.
Brainlab: A Python Toolkit to Aid in the Design, Simulation, and Analysis of Spiking Neural Networks with the NeoCortical Simulator

PubMed Central

Drewes, Rich; Zou, Quan; Goodman, Philip H.

2008-01-01

Neuroscience modeling experiments often involve multiple complex neural network and cell model variants, complex input stimuli and input protocols, followed by complex data analysis. Coordinating all this complexity becomes a central difficulty for the experimenter. The Python programming language, along with its extensive library packages, has emerged as a leading “glue” tool for managing all sorts of complex programmatic tasks. This paper describes a toolkit called Brainlab, written in Python, that leverages Python's strengths for the task of managing the general complexity of neuroscience modeling experiments. Brainlab was also designed to overcome the major difficulties of working with the NCS (NeoCortical Simulator) environment in particular. Brainlab is an integrated model-building, experimentation, and data analysis environment for the powerful parallel spiking neural network simulator system NCS. PMID:19506707
Partitioning problems in parallel, pipelined and distributed computing

NASA Technical Reports Server (NTRS)

Bokhari, S.

1985-01-01

The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple satellite system: partitioning multiple chain structured parallel programs, multiple arbitrarily structured serial programs and single tree structured parallel programs. In addition, the problems of partitioning chain structured parallel programs across chain connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.
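To make the chain-partitioning problem concrete, here is a simplified sketch: it splits a chain of module weights into contiguous blocks, one per processor, so that the heaviest block (the bottleneck) is as light as possible. It is a plain dynamic program, not the Sum-Bottleneck path algorithm of the abstract, and it ignores communication costs. The name `partition_chain` and the assumption of non-negative weights are introduced for this example.

```python
from itertools import accumulate


def partition_chain(weights, processors):
    """Split a chain into contiguous blocks, minimizing the heaviest block.

    Assumes non-negative weights, so using as many blocks as allowed is
    never worse. Returns (bottleneck, blocks).
    """
    n = len(weights)
    if n == 0 or processors < 1:
        raise ValueError("need at least one module and one processor")
    prefix = [0] + list(accumulate(weights))
    k_max = min(processors, n)
    inf = float("inf")
    # best[k][i]: smallest bottleneck for the first i modules in exactly k blocks.
    best = [[inf] * (n + 1) for _ in range(k_max + 1)]
    cut = [[0] * (n + 1) for _ in range(k_max + 1)]
    best[0][0] = 0
    for k in range(1, k_max + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):  # last block is weights[j:i]
                cost = max(best[k - 1][j], prefix[i] - prefix[j])
                if cost < best[k][i]:
                    best[k][i], cut[k][i] = cost, j
    # Walk the recorded cut points backwards to recover the blocks.
    blocks, i = [], n
    for k in range(k_max, 0, -1):
        j = cut[k][i]
        blocks.append(weights[j:i])
        i = j
    blocks.reverse()
    return best[k_max][n], blocks
```

For example, `partition_chain([4, 2, 7, 1, 3, 5], 3)` assigns the blocks `[4, 2]`, `[7, 1]` and `[3, 5]`, for a bottleneck of 8.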
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

NASA Technical Reports Server (NTRS)

Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
Transactional Analysis in Management.

ERIC Educational Resources Information Center

Hewson, Julie; Turner, Colin

Although Transactional Analysis (TA) has heavily influenced psychotherapy, little has been written to parallel that influence in areas of organization theory, organization behavior, or management studies. This book is intended primarily for people working in management roles. In part one, personal experiences are drawn upon to describe a fictional…
78 FR 64535 - Meeting of the CJIS Advisory Policy Board

Federal Register 2010, 2011, 2012, 2013, 2014

2013-10-29

... policy issues and appropriate technical and operational issues related to the programs administered by... Federal Officer (DFO). Any member of the public may file a written statement with the Board. Written... previously submitted written statements. Written comments should be provided to Mr. R. Scott Trent, DFO, at...
Event Reconstruction for Many-core Architectures using Java

DOE Office of Scientific and Technical Information (OSTI.GOV)

Graf, Norman A.; /SLAC

Although Moore's Law remains technically valid, the performance enhancements in computing which traditionally resulted from increased CPU speeds ended years ago. Chip manufacturers have chosen to increase the number of core CPUs per chip instead of increasing clock speed. Unfortunately, these extra CPUs do not automatically result in improvements in simulation or reconstruction times. To take advantage of this extra computing power requires changing how software is written. Event reconstruction is globally serial, in the sense that raw data has to be unpacked first, channels have to be clustered to produce hits before those hits are identified as belonging to a track or shower, tracks have to be found and fit before they are vertexed, etc. However, many of the individual procedures along the reconstruction chain are intrinsically independent and are perfect candidates for optimization using multi-core architecture. Threading is perhaps the simplest approach to parallelizing a program and Java includes a powerful threading facility built into the language. We have developed a fast and flexible reconstruction package (org.lcsim) written in Java that has been used for numerous physics and detector optimization studies. In this paper we present the results of our studies on optimizing the performance of this toolkit using multiple threads on many-core architectures.
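The pattern described above, a serial chain of stages within each event but independent events, can be sketched as follows. This is a Python analogue for illustration only, not org.lcsim code; the stage functions `unpack` and `cluster`, the raw-event format, and the clustering rule are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def unpack(raw):
    """Hypothetical stage 1: decode a raw record into channel numbers."""
    return [int(token) for token in raw.split()]


def cluster(channels):
    """Hypothetical stage 2: group adjacent channel numbers into hits."""
    hits = []
    for ch in sorted(channels):
        if hits and ch - hits[-1][-1] <= 1:
            hits[-1].append(ch)
        else:
            hits.append([ch])
    return hits


def reconstruct(raw):
    """Stages run in a fixed order within one event."""
    return cluster(unpack(raw))


def reconstruct_all(raw_events, workers=4):
    """Events do not depend on each other, so they are mapped over a pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, raw_events))
```

In standard CPython, threads do not speed up CPU-bound pure-Python work because of the global interpreter lock, so a process pool would likely be the practical choice there; the thread pool is used here only to mirror the threading approach the abstract describes.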

BOOK REVIEW: Numerical Recipes in C++: The Art of Scientific Computing (2nd edn); Numerical Recipes Example Book (C++) (2nd edn); Numerical Recipes Multi-Language Code CD ROM with LINUX or UNIX Single-Screen License Revised Version

NASA Astrophysics Data System (ADS)

Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P.

2003-05-01

The two Numerical Recipes books are marvellous. The principal book, The Art of Scientific Computing, contains program listings for almost every conceivable requirement, and it also contains a well written discussion of the algorithms and the numerical methods involved. The Example Book provides a complete driving program, with helpful notes, for nearly all the routines in the principal book. The first edition of Numerical Recipes: The Art of Scientific Computing was published in 1986 in two versions, one with programs in Fortran, the other with programs in Pascal. There were subsequent versions with programs in BASIC and in C. The second, enlarged edition was published in 1992, again in two versions, one with programs in Fortran (NR(F)), the other with programs in C (NR(C)). In 1996 the authors produced Numerical Recipes in Fortran 90: The Art of Parallel Scientific Computing as a supplement, called Volume 2, with the original (Fortran) version referred to as Volume 1. Numerical Recipes in C++ (NR(C++)) is another version of the 1992 edition. The numerical recipes are also available on a CD ROM: if you want to use any of the recipes, I would strongly advise you to buy the CD ROM. The CD ROM contains the programs in all the languages. When the first edition was published I bought it, and have also bought copies of the other editions as they have appeared. Anyone involved in scientific computing ought to have a copy of at least one version of Numerical Recipes, and there also ought to be copies in every library. If you already have NR(F), should you buy the NR(C++) and, if not, which version should you buy? In the preface to Volume 2 of NR(F), the authors say 'C and C++ programmers have not been far from our minds as we have written this volume, and we think that you will find that time spent in absorbing its principal lessons will be amply repaid in the future as C and C++ eventually develop standard parallel extensions'. 
In the preface and introduction to NR(C++), the authors point out some of the problems in the use of C++ in scientific computing. I have not found any mention of parallel computing in NR(C++). Fortran has quite a lot going for it. As someone who has used it in most of its versions from Fortran II, I have seen it develop and leave behind other languages promoted by various enthusiasts: who now uses Algol or Pascal? I think it unlikely that C++ will disappear: it was devised as a systems language, and can also be used for other purposes such as scientific computing. It is possible that Fortran will disappear, but Fortran has the strengths that it can develop, that there are extensive Fortran subroutine libraries, and that it has been developed for parallel computing. To argue with programmers as to which is the best language to use is sterile. If you wish to use C++, then buy NR(C++), but you should also look at volume 2 of NR(F). If you are a Fortran programmer, then make sure you have NR(F), volumes 1 and 2. But whichever language you use, make sure you have one version or the other, and the CD ROM. The Example Book provides listings of complete programs to run nearly all the routines in NR, frequently based on cases where an analytical solution is available. It is helpful when developing a new program incorporating an unfamiliar routine to see that routine actually working, and this is what the programs in the Example Book achieve. I started teaching computational physics before Numerical Recipes was published. If I were starting again, I would make heavy use of both The Art of Scientific Computing and of the Example Book. Every computational physics teaching laboratory should have both volumes: the programs in the Example Book are included on the CD ROM, but the extra commentary in the book itself is of considerable value. P Borcherds
An interactive parallel programming environment applied in atmospheric science

NASA Technical Reports Server (NTRS)

vonLaszewski, G.

1996-01-01

This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.
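The selection step described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the database layout, the configuration names, and the timings are all hypothetical, and the abstract does not say how IPPE actually ranks candidates. Here the configuration with the lowest mean recorded wall-clock time is chosen.

```python
# Hypothetical performance database:
# (software variant, hardware platform) -> recorded wall-clock times in seconds.
performance_db = {
    ("message_passing_v1", "workstation_cluster"): [412.0, 398.5, 405.2],
    ("message_passing_v1", "parallel_machine"): [150.3, 149.1],
    ("message_passing_v2", "parallel_machine"): [131.7, 140.2, 128.9],
}

def select_configuration(db):
    """Return the configuration with the lowest mean recorded wall-clock time."""
    return min(db, key=lambda config: sum(db[config]) / len(db[config]))

best = select_configuration(performance_db)
print(best)
```

A real system would presumably also account for problem size and current machine load; the sketch ignores both.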
New implementation of OGC Web Processing Service in Python programming language. PyWPS-4 and issues we are facing with processing of large raster data using OGC WPS

NASA Astrophysics Data System (ADS)

Čepický, Jáchym; Moreira de Sousa, Luís

2016-06-01

The OGC® Web Processing Service (WPS) Interface Standard provides rules for standardizing inputs and outputs (requests and responses) for geospatial processing services, such as polygon overlay. The standard also defines how a client can request the execution of a process, and how the output from the process is handled. It defines an interface that facilitates publishing of geospatial processes, client discovery of those processes, and binding of those processes into workflows. Data required by a WPS can be delivered across a network or they can be available at a server. PyWPS was one of the first implementations of OGC WPS on the server side. It is written in the Python programming language and it tries to connect to all existing tools for geospatial data analysis available on the Python platform. During the last two years, the PyWPS development team has written a new version (called PyWPS-4) completely from scratch. The analysis of large raster datasets poses several technical issues in implementing the WPS standard. The data format has to be defined and validated on the server side and binary data have to be encoded using some numeric representation. Pulling raster data from remote servers introduces security risks. In addition, running several processes in parallel has to be possible, so that system resources are used efficiently while preserving security. Here we discuss these topics and illustrate some of the solutions adopted within the PyWPS implementation.
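The cost of encoding binary raster data for transport can be made concrete with a short Python sketch. Base64 is used here only as one common text encoding for binary payloads; the abstract does not say which representation PyWPS-4 adopts, and the `raster_bytes` payload is a hypothetical stand-in for real raster data.

```python
import base64

# Hypothetical stand-in for a small binary raster payload (256 bytes).
raster_bytes = bytes(range(256))

# Encode to ASCII text so the payload can be embedded in a text-based request.
encoded = base64.b64encode(raster_bytes).decode("ascii")

# The receiving side decodes back to the original bytes.
decoded = base64.b64decode(encoded)

# Base64 emits 4 characters per 3 input bytes, so the payload grows by about a third.
overhead = len(encoded) / len(raster_bytes)
print(len(raster_bytes), len(encoded), round(overhead, 2))
```

That growth is one reason large rasters are often passed by reference (a URL) rather than inline, which in turn raises the remote-fetch security concerns mentioned above.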
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)

2001-01-01

This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.
Semantic Support and Parallel Parsing in Chinese

ERIC Educational Resources Information Center

Hsieh, Yufen; Boland, Julie E.

2015-01-01

Two eye-tracking experiments were conducted using written Chinese sentences that contained a multi-word ambiguous region. The goal was to determine whether readers maintained multiple interpretations throughout the ambiguous region or selected a single interpretation at the point of ambiguity. Within the ambiguous region, we manipulated the…
Space Tethers Design Criteria

NASA Technical Reports Server (NTRS)

Tomlin, Donald D.; Faile, Gwyn C.; Hayashida, Kazuo B.; Frost, Cynthia L.; Wagner, Carole Y.; Mitchell, Michael L.; Vaughn, Jason A.; Galuska, Michael J.

1998-01-01

The small expendable deployable system and tether satellite system programs did not have uniform written criteria for tethers. The JSC safety panel asked what criteria were used to design the tethers. Since none existed, criteria were written based on past experience for future tether programs.
7 CFR 225.13 - Appeal procedures.

Code of Federal Regulations, 2010 CFR

2010-01-01

... CHILD NUTRITION PROGRAMS SUMMER FOOD SERVICE PROGRAM State Agency Provisions § 225.13 Appeal procedures...) The appellant be allowed the opportunity to review any information upon which the action was based; (4... by filing written documentation with the review official. To be considered, written documentation...
Architecture Adaptive Computing Environment

NASA Technical Reports Server (NTRS)

Dorband, John E.

2006-01-01

Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
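The idea behind pre-computed paths can be illustrated outside aCe. The Python sketch below is not aCe syntax, and the names `precompute_shift_path` and `communicate` are hypothetical. It computes the source index for every destination element once, then reuses that table each time the same communication pattern is performed.

```python
def precompute_shift_path(n, offset):
    """Compute, once, which source element each destination element reads from."""
    return [(i - offset) % n for i in range(n)]

def communicate(values, path):
    """Apply a pre-computed path: destination i receives values[path[i]]."""
    return [values[src] for src in path]

path = precompute_shift_path(8, 1)   # index arithmetic done a single time
data = list(range(8))
for _ in range(3):                   # the same path is reused on every step
    data = communicate(data, path)
print(data)
```

The saving is the index arithmetic, which here is trivial but could involve routing decisions on a real machine.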
The Automatic Parallelisation of Scientific Application Codes Using a Computer Aided Parallelisation Toolkit

NASA Technical Reports Server (NTRS)

Ierotheou, C.; Johnson, S.; Leggett, P.; Cross, M.; Evans, E.; Jin, Hao-Qiang; Frumkin, M.; Yan, J.; Biegel, Bryan (Technical Monitor)

2001-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. Historically, the lack of a programming standard for using directives and the rather limited performance due to scalability have affected the take-up of this programming model approach. Significant progress has been made in hardware and software technologies, and as a result the performance of parallel programs with compiler directives has also improved. The introduction of an industrial standard for shared-memory programming with directives, OpenMP, has also addressed the issue of portability. In this study, we have extended the computer aided parallelization toolkit (developed at the University of Greenwich) to automatically generate OpenMP based parallel programs with nominal user assistance. We outline the way in which loop types are categorized and how efficient OpenMP directives can be defined and placed using the in-depth interprocedural analysis that is carried out by the toolkit. We also discuss the application of the toolkit on the NAS Parallel Benchmarks and a number of real-world application codes. This work not only demonstrates the great potential of using the toolkit to quickly parallelize serial programs but also the good performance achievable on up to 300 processors for hybrid message passing and directive-based parallelizations.
Practical Formal Verification of MPI and Thread Programs

NASA Astrophysics Data System (ADS)

Gopalakrishnan, Ganesh; Kirby, Robert M.

Large-scale simulation codes in science and engineering are written using the Message Passing Interface (MPI). Shared memory threads are widely used directly, or to implement higher level programming abstractions. Traditional debugging methods for MPI or thread programs are incapable of providing useful formal guarantees about coverage. They get bogged down in the sheer number of interleavings (schedules), often missing shallow bugs. In this tutorial we will introduce two practical formal verification tools: ISP (for MPI C programs) and Inspect (for Pthread C programs). Unlike other formal verification tools, ISP and Inspect run directly on user source codes (much like a debugger). They pursue only the relevant set of process interleavings, using our own customized Dynamic Partial Order Reduction algorithms. For a given test harness, DPOR allows these tools to guarantee the absence of deadlocks, instrumented MPI object leaks and communication races (using ISP), and shared memory races (using Inspect). ISP and Inspect have been used to verify large pieces of code: in excess of 10,000 lines of MPI/C for ISP in under 5 seconds, and about 5,000 lines of Pthread/C code in a few hours (and much faster with the use of a cluster or by exploiting special cases such as symmetry) for Inspect. We will also demonstrate the Microsoft Visual Studio and Eclipse Parallel Tools Platform integrations of ISP (these will be available on the LiveCD).
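The "sheer number of interleavings" is easy to quantify. As a worked example (not part of ISP or Inspect), the Python sketch below counts the distinct schedules of fully independent processes, each executing a fixed number of atomic steps, using the multinomial coefficient. The function name is hypothetical.

```python
from math import factorial

def count_interleavings(steps_per_process):
    """Number of distinct schedules: (sum of steps)! / product of (steps_i)!."""
    total = factorial(sum(steps_per_process))
    for steps in steps_per_process:
        total //= factorial(steps)
    return total

print(count_interleavings([2, 2]))      # two processes, two steps each
print(count_interleavings([5, 5, 5]))   # three processes, five steps each
```

Two processes with two steps each already admit 6 schedules, and three processes with five steps each admit 756756, which is why exhaustive testing is impractical and why reduction techniques that explore only the relevant schedules matter.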
Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

NASA Astrophysics Data System (ADS)

Kegel, Philipp; Schellmann, Maraike; Gorlatch, Sergei

We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.
Kinesiology Workbook and Laboratory Manual.

ERIC Educational Resources Information Center

Harris, Ruth W.

This manual is written for students in anatomy, kinesiology, or introductory biomechanics courses. The book is divided into two sections, a kinesiology workbook and a laboratory manual. The two sections parallel each other in content and format. Each is divided into three corresponding sections: (1) Anatomical bases for movement description; (2)…
The Effect of Critical Thinking Instruction on Verbal Descriptions of Music

ERIC Educational Resources Information Center

Johnson, Daniel C.

2011-01-01

The purpose of this study was to determine the effect of critical thinking instruction on music listening skills of fifth-grade students as measured by written responses to music listening. The researcher compared instruction that included opportunities for critical thinking (Critical Thinking Instruction, CTI) with parallel instruction without…
Validated Test Method 1316: Liquid-Solid Partitioning as a Function of Liquid-to-Solid Ratio in Solid Materials Using a Parallel Batch Procedure

EPA Pesticide Factsheets

Describes procedures written based on the assumption that they will be performed by analysts who are formally trained in at least the basic principles of chemical analysis and in the use of the subject technology.
Programming for Pioneer 12

NASA Technical Reports Server (NTRS)

Shem, B. C.

1985-01-01

Background on Pioneer probes 6 to 11 is given as well as an overview of the Pioneer Venus mission. A computer program was written in C language for analyzing radio signals from the Pioneer Venus orbiter. A second program was written to facilitate high gain antenna commands to move the antenna itself, to set the simulated spin period, and to set the attitude control system angle.
An object-oriented approach to nested data parallelism

NASA Technical Reports Server (NTRS)

Sheffler, Thomas J.; Chatterjee, Siddhartha

1994-01-01

This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are also collections, then there is the possibility for 'nested data parallelism.' Few current programming languages support nested data parallelism, however. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.
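Flattening can be sketched in a few lines of Python. This is a generic illustration of the idea, not the paper's C++ transformation, and all function names are hypothetical. A collection of collections is stored as one flat list plus the length of each segment, so a nested 'foreach' becomes a single elementwise operation over the flat data.

```python
def flatten(nested):
    """Represent a collection of collections as flat data plus segment lengths."""
    lengths = [len(inner) for inner in nested]
    flat = [x for inner in nested for x in inner]
    return flat, lengths

def unflatten(flat, lengths):
    """Rebuild the nested structure from flat data and segment lengths."""
    out, start = [], 0
    for n in lengths:
        out.append(flat[start:start + n])
        start += n
    return out

def nested_foreach(nested, fn):
    """Apply fn to every inner element via one flat elementwise pass."""
    flat, lengths = flatten(nested)
    flat_result = [fn(x) for x in flat]   # the only loop over the data
    return unflatten(flat_result, lengths)

result = nested_foreach([[1, 2, 3], [], [4, 5]], lambda x: x * x)
print(result)
```

The flat pass has uniform work per element regardless of how uneven the inner collections are, which is what makes the flattened form suitable for vector hardware.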
The BLAZE language: A parallel language for scientific programming

NASA Technical Reports Server (NTRS)

Mehrotra, P.; Vanrosendale, J.

1985-01-01

A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described, and it is shown how this language would be used in typical scientific programming.
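The two fine-grained constructs named above have close analogues in other languages. The Python sketch below is not Blaze syntax; it only mirrors the ideas. The list comprehension plays the role of a forall loop, since no iteration depends on another, and `reduce` with an associative operator plays the role of an accumulation operator.

```python
from functools import reduce
import operator

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

# forall-style loop: every iteration is independent, so the iterations
# could be executed in any order or simultaneously.
c = [a[i] * b[i] for i in range(len(a))]

# Accumulation: combining elements with an associative operator, which a
# compiler is free to evaluate as a parallel tree reduction.
dot = reduce(operator.add, c, 0.0)
print(c, dot)
```

Because the program text stays sequential in appearance, the choice of how much of this parallelism to exploit is left to the compiler.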
Moose: An Open-Source Framework to Enable Rapid Development of Collaborative, Multi-Scale, Multi-Physics Simulation Tools

NASA Astrophysics Data System (ADS)

Slaughter, A. E.; Permann, C.; Peterson, J. W.; Gaston, D.; Andrs, D.; Miller, J.

2014-12-01

The Idaho National Laboratory (INL)-developed Multiphysics Object Oriented Simulation Environment (MOOSE; www.mooseframework.org), is an open-source, parallel computational framework for enabling the solution of complex, fully implicit multiphysics systems. MOOSE provides a set of computational tools that scientists and engineers can use to create sophisticated multiphysics simulations. Applications built using MOOSE have computed solutions for chemical reaction and transport equations, computational fluid dynamics, solid mechanics, heat conduction, mesoscale materials modeling, geomechanics, and others. To facilitate the coupling of diverse and highly-coupled physical systems, MOOSE employs the Jacobian-free Newton-Krylov (JFNK) method when solving the coupled nonlinear systems of equations arising in multiphysics applications. The MOOSE framework is written in C++, and leverages other high-quality, open-source scientific software packages such as LibMesh, Hypre, and PETSc. MOOSE uses a "hybrid parallel" model which combines both shared memory (thread-based) and distributed memory (MPI-based) parallelism to ensure efficient resource utilization on a wide range of computational hardware. MOOSE-based applications are inherently modular, which allows for simulation expansion (via coupling of additional physics modules) and the creation of multi-scale simulations. Any application developed with MOOSE supports running (in parallel) any other MOOSE-based application. Each application can be developed independently, yet easily communicate with other applications (e.g., conductivity in a slope-scale model could be a constant input, or a complete phase-field micro-structure simulation) without additional code being written. This method of development has proven effective at INL and expedites the development of sophisticated, sustainable, and collaborative simulation tools.
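The "Jacobian-free" part of JFNK rests on one observation: a Krylov solver needs only products of the Jacobian with a vector, and such a product can be approximated by a finite difference of the residual. The Python sketch below illustrates that step on a hypothetical two-equation system. It is not MOOSE code, and the residual, the perturbation size `eps`, and the function names are assumptions made for the example.

```python
def residual(u):
    """Hypothetical nonlinear system F(u) = 0 with two unknowns."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) v as (F(u + eps v) - F(u)) / eps; J is never assembled."""
    perturbed = [ui + eps * vi for ui, vi in zip(u, v)]
    Fu, Fp = F(u), F(perturbed)
    return [(fp - f) / eps for fp, f in zip(Fp, Fu)]

u = [1.0, 2.0]
v = [1.0, -1.0]
approx = jacobian_vector_product(residual, u, v)

# For comparison, the analytic Jacobian at u is [[2x, 1], [1, 2y]] = [[2, 1], [1, 4]].
exact = [2.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 4.0 * v[1]]
print(approx, exact)
```

Each product costs one extra residual evaluation, which is why the approach suits tightly coupled multiphysics problems where forming the full Jacobian is expensive or awkward.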
IMPMOT user's manual. [written in FORTRAN 4]

NASA Technical Reports Server (NTRS)

Stewart, D. J.; Bishop, M. J.

1974-01-01

This user's manual describes the input and output variables as well as the job control language necessary to utilize the IMP-H apogee motor firing program, IMPMOT. The IMPMOT program can be executed as either a stand-alone program or as a member of the flight dynamics system. This program is used to determine the time and attitude at which to fire the IMP-H apogee boost motor. The IMPMOT program is written in FORTRAN 4 for use on the IBM 360 series computer.
Accelerating Monte Carlo simulations with an NVIDIA ® graphics processor

NASA Astrophysics Data System (ADS)

Martinsen, Paul; Blaschke, Johannes; Künnemeyer, Rainer; Jordan, Robert

2009-10-01

Modern graphics cards, commonly used in desktop computers, have evolved beyond a simple interface between processor and display to incorporate sophisticated calculation engines that can be applied to general purpose computing. The Monte Carlo algorithm for modelling photon transport in turbid media has been implemented on an NVIDIA® 8800 GT graphics card using the CUDA toolkit. The Monte Carlo method relies on following the trajectory of millions of photons through the sample, often taking hours or days to complete. The graphics-processor implementation, processing roughly 110 million scattering events per second, was found to run more than 70 times faster than a similar, single-threaded implementation on a 2.67 GHz desktop computer.
Program summary
Program title: Phoogle-C/Phoogle-G
Catalogue identifier: AEEB_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEB_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 51 264
No. of bytes in distributed program, including test data, etc.: 2 238 805
Distribution format: tar.gz
Programming language: C++
Computer: Designed for Intel PCs. Phoogle-G requires a NVIDIA graphics card with support for CUDA 1.1
Operating system: Windows XP
Has the code been vectorised or parallelized?: Phoogle-G is written for SIMD architectures
RAM: 1 GB
Classification: 21.1
External routines: Charles Karney Random number library. Microsoft Foundation Class library. NVIDIA CUDA library [1].
Nature of problem: The Monte Carlo technique is an effective algorithm for exploring the propagation of light in turbid media. However, accurate results require tracing the path of many photons within the media. The independence of photons naturally lends the Monte Carlo technique to implementation on parallel architectures. Generally, parallel computing can be expensive, but recent advances in consumer grade graphics cards have opened the possibility of high-performance desktop parallel-computing.
Solution method: In this pair of programmes we have implemented the Monte Carlo algorithm described by Prahl et al. [2] for photon transport in infinite scattering media to compare the performance of two readily accessible architectures: a standard desktop PC and a consumer grade graphics card from NVIDIA.
Restrictions: The graphics card implementation uses single precision floating point numbers for all calculations. Only photon transport from an isotropic point-source is supported. The graphics-card version has no user interface. The simulation parameters must be set in the source code. The desktop version has a simple user interface; however some properties can only be accessed through an ActiveX client (such as Matlab).
Additional comments: The random number library used has a LGPL (http://www.gnu.org/copyleft/lesser.html) licence.
Running time: Runtime can range from minutes to months depending on the number of photons simulated and the optical properties of the medium.
References: [1] http://www.nvidia.com/object/cuda_home.html. [2] S. Prahl, M. Keijzer, Sl. Jacques, A. Welch, SPIE Institute Series 5 (1989) 102.
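The structure of such a simulation can be shown with a minimal single-photon Python sketch. It is a simplified stand-in, not the Phoogle code: scattering is taken to be isotropic, absorption is handled by reducing a packet weight until a fixed cutoff, and the coefficients `mu_a` and `mu_s` and all names are assumptions chosen for illustration.

```python
import math
import random

def simulate_photon(mu_a, mu_s, rng, min_weight=1e-4):
    """Follow one photon packet from an isotropic point source in an infinite medium.

    Returns (number of scattering events, final distance from the source).
    """
    mu_t = mu_a + mu_s            # total interaction coefficient
    albedo = mu_s / mu_t          # fraction of weight surviving each interaction
    x = y = z = 0.0
    weight, events = 1.0, 0
    while weight > min_weight:
        # Isotropic direction: uniform cos(theta) in [-1, 1], uniform azimuth.
        cos_theta = 2.0 * rng.random() - 1.0
        sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
        phi = 2.0 * math.pi * rng.random()
        # Free path length sampled from an exponential distribution.
        step = -math.log(1.0 - rng.random()) / mu_t
        x += step * sin_theta * math.cos(phi)
        y += step * sin_theta * math.sin(phi)
        z += step * cos_theta
        weight *= albedo          # partial absorption at the interaction site
        events += 1
    return events, math.sqrt(x * x + y * y + z * z)

rng = random.Random(1)
events, distance = simulate_photon(mu_a=1.0, mu_s=9.0, rng=rng)
print(events, distance)
```

No photon reads or writes another photon's state, so each call could run on its own thread, which is the independence the summary relies on for the graphics-card version.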

Parallel compression of data chunks of a shared data object using a log-structured file system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-10-25

Techniques are provided for parallel compression of data chunks being written to a shared object. A client executing on a compute node or a burst buffer node in a parallel computing system stores a data chunk generated by the parallel computing system to a shared data object on a storage node by compressing the data chunk and providing the compressed data chunk to the storage node that stores the shared object. The client and storage node may employ Log-Structured File techniques. The compressed data chunk can be de-compressed by the client when the data chunk is read. A storage node stores a data chunk as part of a shared object by receiving a compressed version of the data chunk from a compute node and storing the compressed version of the data chunk to the shared data object on the storage node.
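The write and read paths described above can be mimicked in a small Python sketch. It is an illustration under stated assumptions, not the patented method: worker threads stand in for clients on separate compute nodes, an in-memory byte buffer with an (offset, length) index stands in for the log-structured shared object, and zlib is an arbitrary choice of compressor.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def write_chunks(chunks, workers=4):
    """Compress chunks in parallel, then append them to a log with an index."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        compressed = list(pool.map(zlib.compress, chunks))
    log, index = bytearray(), []
    for blob in compressed:
        index.append((len(log), len(blob)))   # where this chunk lives in the log
        log.extend(blob)                      # append-only write
    return bytes(log), index

def read_chunk(log, index, i):
    """Locate chunk i through the index and decompress it on the reader's side."""
    offset, length = index[i]
    return zlib.decompress(log[offset:offset + length])

chunks = [b"a" * 1000, b"bc" * 500, b"xyz"]
log, index = write_chunks(chunks)
print(len(log), index)
```

Because compressed chunks have unpredictable sizes, an append-only log with an index avoids having to reserve fixed offsets in the shared object.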
Global MHD simulation of magnetosphere using HPF

NASA Astrophysics Data System (ADS)

Ogino, T.

We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer. The MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code were shown to be almost comparable to those of the VPP Fortran code. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops with an efficiency of 76.5% using 56 PEs of Fujitsu VPP5000/56 in vector and parallel computation that permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
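The quoted figures can be cross-checked with simple arithmetic. The sketch below assumes that "efficiency" means sustained speed divided by aggregate peak speed, which the abstract implies by referring to catalog values but does not state outright. Since the speed is given as "over 400 Gflops", the derived numbers are lower bounds.

```python
achieved_gflops = 400.0   # "over 400 Gflops", so treated as a lower bound
efficiency = 0.765        # 76.5%, assumed to mean sustained / peak
pes = 56                  # processing elements used

implied_peak = achieved_gflops / efficiency    # aggregate peak implied by the figures
per_pe_peak = implied_peak / pes               # implied peak per processing element
per_pe_sustained = achieved_gflops / pes       # sustained speed per processing element

print(round(implied_peak, 1), round(per_pe_peak, 2), round(per_pe_sustained, 2))
```

Under that assumption the figures imply an aggregate peak of at least about 523 Gflops, or roughly 9.3 Gflops per PE, with each PE sustaining a little over 7 Gflops.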
Compulsory Education: Schools, Pupils, Teachers, Programs and Methods. Conference Papers for the 8th Session of the International Standing Conference for the History of Education (Parma, Italy, September 3-6, 1986). Volume II.

ERIC Educational Resources Information Center

Genovesi, Giovanni, Ed.

This second of four volumes on the history of compulsory education among the nations of Europe and the western hemisphere covers schools, pupils, teachers, programs, and methods. Of the volume's 16 selections, 13 are written in English and 3 are written in Italian. Most selections contain summaries; summaries of the Italian articles are written in…
Particle Laden Turbulence in a Radiation Environment Using a Portable High Performance Solver Based on the Legion Runtime System

NASA Astrophysics Data System (ADS)

Torres, Hilario; Iaccarino, Gianluca

2017-11-01

Soleil-X is a multi-physics solver being developed at Stanford University as a part of the Predictive Science Academic Alliance Program II. Our goal is to conduct high fidelity simulations of particle laden turbulent flows in a radiation environment for solar energy receiver applications as well as to demonstrate our readiness to effectively utilize next generation Exascale machines. The novel aspect of Soleil-X is that it is built upon the Legion runtime system to enable easy portability to different parallel distributed heterogeneous architectures while also being written entirely in high-level/high-productivity languages (Ebb and Regent). An overview of the Soleil-X software architecture will be given. Results from coupled fluid flow, Lagrangian point particle tracking, and thermal radiation simulations will be presented. Performance diagnostic tools and metrics corresponding to the same cases will also be discussed. US Department of Energy, National Nuclear Security Administration.
ClusCo: clustering and comparison of protein models.

PubMed

Jamroz, Michal; Kolinski, Andrzej

2013-02-22

The development, optimization and validation of protein modeling methods require efficient tools for structural comparison. Frequently, a large number of models need to be compared with the target native structure. The main reason for the development of Clusco software was to create a high-throughput tool for all-versus-all comparison, because calculating the similarity matrix is one of the bottlenecks in the protein modeling pipeline. Clusco is fast and easy-to-use software for high-throughput comparison of protein models with different similarity measures (cRMSD, dRMSD, GDT_TS, TM-Score, MaxSub, Contact Map Overlap) and clustering of the comparison results with standard methods: K-means Clustering or Hierarchical Agglomerative Clustering. The application was highly optimized and written in C/C++, including the code for parallel execution on CPU and GPU, which resulted in a significant speedup over similar clustering and scoring computation programs.
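An all-versus-all comparison can be sketched with one of the listed measures. The Python example below uses distance RMSD (dRMSD), which compares intra-model distances and therefore needs no superposition. It is an unoptimized illustration, not ClusCo's implementation; the function names and the three tiny "models" are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations

def drmsd(a, b):
    """Distance RMSD between two models given as equal-length lists of (x, y, z)."""
    pairs = list(combinations(range(len(a)), 2))
    total = 0.0
    for i, j in pairs:
        diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
        total += diff * diff
    return math.sqrt(total / len(pairs))

def all_versus_all(models):
    """Fill the symmetric similarity matrix; every entry is independent of the others."""
    n = len(models)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = drmsd(models[i], models[j])
    return matrix

models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],   # straight chain
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (7.0, 5.0, 5.0)],   # same chain, translated
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],   # bent chain
]
matrix = all_versus_all(models)
print(matrix)
```

The matrix has n(n-1)/2 independent entries, so the work grows quadratically with the number of models and parallelizes naturally, which is consistent with the bottleneck the abstract describes.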
Collectively loading programs in a multiple program multiple data environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
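The flow from load description to broadcast can be simulated in a few lines of Python. This sketch only models the bookkeeping, under stated assumptions: the load description is a plain dictionary, the "class route" is a sorted list of node ranks, the leader is taken to be the lowest rank on the route (the abstract says only that a collective operation selects it), and all names are hypothetical.

```python
def plan_loads(load_description):
    """Map each program to (leader, route); the route lists only nodes needing it."""
    plan = {}
    for program, nodes in load_description.items():
        route = sorted(nodes)
        plan[program] = (route[0], route)   # lowest rank as leader: an assumption
    return plan

def execute(plan, read_file):
    """Each leader reads its program once; the image is then sent along the route."""
    memory, reads = {}, 0
    for program, (leader, route) in plan.items():
        image = read_file(program)          # single file-system access per program
        reads += 1
        for node in route:                  # stands in for the broadcast
            memory.setdefault(node, {})[program] = image
    return memory, reads

load_description = {"solver": [0, 1, 2], "viz": [3, 2]}
plan = plan_loads(load_description)
memory, reads = execute(plan, lambda name: ("<" + name + " image>").encode())
print(plan, reads)
```

The point of the scheme is visible in the count: the file system is touched once per program rather than once per node.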
The BLAZE language - A parallel language for scientific programming

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Van Rosendale, John

1987-01-01

A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.
MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

NASA Astrophysics Data System (ADS)

Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

2018-02-01

We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
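The two benchmark quantities, and one plausible reason efficiency falls as processors are added, can be shown with a short Python sketch. The run durations are invented for illustration, and the round-robin assignment of runs to ranks is an assumption rather than a description of how MPI_XSTAR schedules work. Speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of ranks.

```python
def parallel_metrics(run_times, ranks):
    """Return (speedup, efficiency) when independent runs are dealt out round-robin."""
    loads = [0.0] * ranks
    for i, t in enumerate(run_times):
        loads[i % ranks] += t          # round-robin assignment (assumed)
    serial = sum(run_times)            # one process performs every run in turn
    parallel = max(loads)              # the job finishes when the busiest rank does
    speedup = serial / parallel
    return speedup, speedup / ranks

# Hypothetical durations, in seconds, of eight independent runs.
run_times = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
for ranks in (2, 4, 8):
    print(ranks, parallel_metrics(run_times, ranks))
```

With these invented durations the speedup rises from 1.8 to 3.0 to 4.5 while the efficiency falls from 0.9 to 0.75 to about 0.56, because the longest run increasingly dominates the elapsed time.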
Econ's optimal decision model of wheat production and distribution-documentation

NASA Technical Reports Server (NTRS)

1977-01-01

The report documents the computer programs written to implement the ECON optimal decision model. The programs were written in APL, an extremely compact and powerful language particularly well suited to this model, which makes extensive use of matrix manipulations. The algorithms used are presented and listings of and descriptive information on the APL programs used are given. Possible changes in input data are also given.
CASPER: A GENERALIZED PROGRAM FOR PLOTTING AND SCALING DATA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lietzke, M.P.; Smith, R.E.

A Fortran subroutine was written to scale floating-point data and generate a magnetic tape to plot it on the Calcomp 570 digital plotter. The routine permits a great deal of flexibility, and may be used with any type of FORTRAN or FAP calling program. A simple calling program was also written to permit the user to read in data from cards and plot it without any additional programming. Both the Fortran and binary decks are available. (auth)
Distriblets: Java-Based Distributed Computing on the Web.

ERIC Educational Resources Information Center

Finkel, David; Wills, Craig E.; Brennan, Brian; Brennan, Chris

1999-01-01

Describes a system, written in the Java programming language, for using the World Wide Web to distribute computational tasks to multiple hosts on the Web. Describes the programs written to carry out the load distribution, the structure of a "distriblet" class, and experiences in using this system. (Author/LRW)
76 FR 27274 - Defense Federal Acquisition Regulation Supplement; Technical Amendment

Federal Register 2010, 2011, 2012, 2013, 2014

2011-05-11

... procurement programs, it must provide written notice of the determination to the GSA Suspension and Debarment Official. List of Subjects in 48 CFR Part 209 Government procurement. Ynette R. Shelkin, Editor, Defense... suspended from procurement programs, it must provide written notice of the determination to the General...
Micro-PROUST.

ERIC Educational Resources Information Center

Johnson, W. Lewis; Soloway, Elliot

This detailed description of a microcomputer version of PROUST (Program Understander for Students), a knowledge-based system that finds nonsyntactic bugs in Pascal programs written by novice programmers, presents the inner workings of Micro-PROUST, which was written in Golden LISP for the IBM-PC (512K). The contents include: (1) a reprint of an…
10 CFR 26.27 - Written policy and procedures.

Code of Federal Regulations, 2011 CFR

2011-01-01

... NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Program Elements § 26.27 Written policy and... respond to an emergency, the procedure must— (A) Require a determination of fitness by breath alcohol... require him or her to be subject to this subpart, if the results of the determination of fitness indicate...
10 CFR 26.27 - Written policy and procedures.

Code of Federal Regulations, 2010 CFR

2010-01-01

... NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Program Elements § 26.27 Written policy and... respond to an emergency, the procedure must— (A) Require a determination of fitness by breath alcohol... require him or her to be subject to this subpart, if the results of the determination of fitness indicate...
76 FR 5821 - Earned Import Allowance Program: Evaluation of the Effectiveness of the Program for Certain...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-02-02

... Report AGENCY: United States International Trade Commission. ACTION: Notice of public hearing and opportunity to provide testimony and written comments in connection with the Commission's second annual report... date for the public hearing and deadlines for filing briefs and other written submissions, in...
Extraction and analysis of the image in the sight field of comparison goniometer to measure IR mirrors assembly

NASA Astrophysics Data System (ADS)

Wang, Zhi-shan; Zhao, Yue-jin; Li, Zhuo; Dong, Liquan; Chu, Xuhong; Li, Ping

2010-11-01

The comparison goniometer is widely used to measure and inspect small angles, angle differences, and the parallelism of two surfaces. However, a comparison goniometer is commonly read by the operator looking into its ocular with one eye, and reading an old goniometer equipped with only one adjustable ocular is difficult. In the fabrication of an IR reflecting mirrors assembly, a common comparison goniometer is used to measure the angle errors between two neighboring assembled mirrors. In this paper, an image-based quick reading technique is proposed for the comparison goniometer used to inspect the parallelism of mirrors in a mirrors assembly. One digital camera, one comparison goniometer, and one computer are used to construct a reading system; the image of the sight field in the comparison goniometer is extracted and recognized to obtain the angle positions of the reflection surfaces to be measured. To obtain the interval distance between the scale lines, a particular technique, the left-peak-first method, based on the local peak values of intensity in the true color image, is proposed. A program written in VC++6.0 has been developed to perform the color digital image processing.
I/O-Efficient Scientific Computation Using TPIE

NASA Technical Reports Server (NTRS)

Vengroff, Darren Erik; Vitter, Jeffrey Scott

1996-01-01

In recent years, input/output (I/O)-efficient algorithms for a wide variety of problems have appeared in the literature. However, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to support I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also the complex memory management that must be performed for I/O-efficient computation. In this paper we discuss applications of TPIE to problems in scientific computation. We discuss algorithmic issues underlying the design and implementation of the relevant components of TPIE and present performance results of programs written to solve a series of benchmark problems using our current TPIE prototype. Some of the benchmarks we present are based on the NAS parallel benchmarks while others are of our own creation. We demonstrate that the central processing unit (CPU) overhead required to manage I/O is small and that even with just a single disk, the I/O overhead of I/O-efficient computation ranges from negligible to the same order of magnitude as CPU time. We conjecture that if we use a number of disks in parallel this overhead can be all but eliminated.
SDS: A Framework for Scientific Data Services

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dong, Bin; Byna, Surendra; Wu, Kesheng

2013-10-31

Large-scale scientific applications typically write their data to parallel file systems with organizations designed to achieve fast write speeds. Analysis tasks frequently read the data in a pattern that is different from the write pattern, and therefore experience poor I/O performance. In this paper, we introduce a prototype framework for bridging the performance gap between write and read stages of data access from parallel file systems. We call this framework Scientific Data Services, or SDS for short. This initial implementation of SDS focuses on reorganizing previously written files into data layouts that benefit read patterns, and transparently directs read calls to the reorganized data. SDS follows a client-server architecture. The SDS Server manages partial or full replicas of reorganized datasets and serves SDS Clients' requests for data. The current version of the SDS client library supports HDF5 programming interface for reading data. The client library intercepts HDF5 calls and transparently redirects them to the reorganized data. The SDS client library also provides a querying interface for reading part of the data based on user-specified selective criteria. We describe the design and implementation of the SDS client-server architecture, and evaluate the response time of the SDS Server and the performance benefits of SDS.
5 CFR 370.105 - Written agreements.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 5 Administrative Personnel 1 2010-01-01 2010-01-01 false Written agreements. 370.105 Section 370.105 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS INFORMATION TECHNOLOGY EXCHANGE PROGRAM § 370.105 Written agreements. Before the detail begins, the agency and private...

Towards massively parallelized all-optical magnetic recording

NASA Astrophysics Data System (ADS)

Davies, C. S.; Janušonis, J.; Kimel, A. V.; Kirilyuk, A.; Tsukamoto, A.; Rasing, Th.; Tobey, R. I.

2018-06-01

We demonstrate an approach to parallel all-optical writing of magnetic domains using spatial and temporal interference of two ultrashort light pulses. We explore how the fluence and grating periodicity of the optical transient grating influence the size and uniformity of the written bits. Using a total incident optical energy of 3.5 μJ, we demonstrate the capability of simultaneously writing 102 spatially separated bits, each featuring a relevant lateral width of ~1 μm. We discuss viable routes to extend this technique to write individually addressable, sub-diffraction-limited magnetic domains in a wide range of materials.
SERODS optical data storage with parallel signal transfer

DOEpatents

Vo-Dinh, Tuan

2003-09-02

Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.
SERODS optical data storage with parallel signal transfer

DOEpatents

Vo-Dinh, Tuan

2003-06-24

Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.
Multiprocessor architecture: Synthesis and evaluation

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1990-01-01

Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Vanrosendale, John

1989-01-01

Distributed memory architectures offer high levels of performance and flexibility, but have proven awkward to program. Current languages for nonshared memory architectures provide a relatively low level programming environment, and are poorly suited to modular programming, and to the construction of libraries. A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. Tensor product array computations are focused on along with a simple but important class of numerical algorithms. The problem of programming 1-D kernel routines is focused on first, such as parallel tridiagonal solvers, and then how such parallel kernels can be combined to form parallel tensor product algorithms is examined.
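As a rough illustration of how a 1-D kernel can be combined into a tensor product algorithm, the sketch below uses a serial Thomas-algorithm tridiagonal solver as the kernel and applies it along the rows and then the columns of a 2-D grid. It is an assumption-laden sketch in plain Python, not the language primitives described in the report; all function names are hypothetical, and the solver assumes a well-conditioned (for example, diagonally dominant) matrix.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for T x = rhs; lower[0] and upper[-1] are unused."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_along_rows(lower, diag, upper, grid):
    """Each row solve is independent, so rows could be handled in parallel."""
    return [solve_tridiagonal(lower, diag, upper, row) for row in grid]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def tensor_product_solve(lower, diag, upper, grid):
    """Solve T U T^T = F by applying the 1-D kernel along rows, then columns."""
    step = solve_along_rows(lower, diag, upper, grid)
    return transpose(solve_along_rows(lower, diag, upper, transpose(step)))
```

On a distributed memory machine the transpose between the two sweeps is where interprocessor communication would typically occur; here it is an ordinary in-memory transpose.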
3D-PDR: Three-dimensional photodissociation region code

NASA Astrophysics Data System (ADS)

Bisbas, T. G.; Bell, T. A.; Viti, S.; Yates, J.; Barlow, M. J.

2018-03-01

3D-PDR is a three-dimensional photodissociation region code written in Fortran. It uses the Sundials package (written in C) to solve the set of ordinary differential equations and it is the successor of the one-dimensional PDR code UCL_PDR (ascl:1303.004). Using the HEALpix ray-tracing scheme (ascl:1107.018), 3D-PDR solves a three-dimensional escape probability routine and evaluates the attenuation of the far-ultraviolet radiation in the PDR and the propagation of FIR/submm emission lines out of the PDR. The code is parallelized (OpenMP) and can be applied to 1D and 3D problems.
Methods for design and evaluation of parallel computing systems (The PISCES project)

NASA Technical Reports Server (NTRS)

Pratt, Terrence W.; Wise, Robert; Haught, Mary JO

1989-01-01

The PISCES project started in 1984 under the sponsorship of the NASA Computational Structural Mechanics (CSM) program. A PISCES 1 programming environment and parallel FORTRAN were implemented in 1984 for the DEC VAX (using UNIX processes to simulate parallel processes). This system was used for experimentation with parallel programs for scientific applications and AI (dynamic scene analysis) applications. PISCES 1 was ported to a network of Apollo workstations by N. Fitzgerald.
Xyce parallel electronic simulator users guide, version 6.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users' guide, Version 6.0.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users guide, version 6.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
A compositional reservoir simulator on distributed memory parallel computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rame, M.; Delshad, M.

1995-12-31

This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
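The domain decomposition idea described above (an even split of grid cells across processors, with each subdomain extended so that neighbors can share stencil data) can be sketched in one dimension as follows. This is a minimal illustration under assumed conventions, not UTCHEM's set-up routine; the function names and the halo width of one cell are hypothetical.

```python
def partition(n_cells, n_procs):
    """Split n_cells into contiguous blocks whose sizes differ by at most one.

    Returns a list of half-open (start, stop) index ranges, one per processor.
    """
    base, extra = divmod(n_cells, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def with_halo(block, n_cells, width=1):
    """Extend a block by `width` ghost cells on each side, clipped to the grid."""
    lo, hi = block
    return (max(0, lo - width), min(n_cells, hi + width))


blocks = partition(10, 3)                       # owned cells per processor
extended = [with_halo(b, 10) for b in blocks]   # owned cells plus ghost cells
```

Each processor would update only the cells it owns and refresh its ghost cells from the adjacent processors before every stencil sweep.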
Alaska Native Land Claims. Workbook to Accompany Textbook.

ERIC Educational Resources Information Center

Hays, Lydia L.

Written as a companion to the secondary textbook, "Alaska Native Land Claims", this student workbook is organized via 9 units and 39 chapters which parallel the text's organizational format. Each unit presents unit goals and has anywhere from three to five subsections or chapters. Each titled chapter (e.g., Alaska's First Settlers)…
U.S. Army Aeromedical Research Laboratory Annual Progress Report FY 1986

DTIC Science & Technology

1986-10-01

19 Contracts ................................................. 19 Small Business Innovation...universities and businesses which parallels the research requirements of the laboratories under the USAMRDC command. Because of the scientific manpower...Software is being written to allow double entry verification of data. 2) Small business innovation research Each year, in compliance with the Small
TEACHING COMPOSITION. WHAT RESEARCH SAYS TO THE TEACHER, NUMBER 18.

ERIC Educational Resources Information Center

BURROWS, ALVINA T.

ALTHOUGH CHILDREN'S NEEDS FOR WRITTEN EXPRESSION PROBABLY PARALLEL THOSE OF ADULTS, THE REASON BEHIND CHILDREN'S CHOICE OF WRITING OVER SPEAKING IN GIVEN INSTANCES IS OPEN TO CONJECTURE. MOREOVER, THE COMMON ASSUMPTION BY TEACHERS THAT CHILDREN CAN AND SHOULD WRITE ABOUT PERSONAL INTERESTS OUGHT TO BE TEMPERED BY THE IDEA THAT MANY INTERESTS ARE…
The Modern Idea of the University.

ERIC Educational Resources Information Center

Thompson, Jo Ann Gerdeman

Recurrent themes in selected literature on American higher education written during 1962-1972 are analyzed and related to themes on the same subject addressed by selected Victorian essayists in 19th century England. Parallels in educational thought are used to illuminate some aspects of the nature of the debate over the role of higher education in…
The Effects of Examples on the Comprehension of Textbooks.

ERIC Educational Resources Information Center

Twohig, Paul T.

A study was conducted to examine learning from textbook sentences that provide examples. It was predicted that an example written succinctly and with a word order that emphasized its parallel semantic relations with the exemplified principle, such as "animals have parasites; dogs have fleas," would have a positive effect on idea comprehension. It…
Computer program for quasi-one-dimensional compressible flow with area change and friction - Application to gas film seals

NASA Technical Reports Server (NTRS)

Zuk, J.; Smith, P. J.

1974-01-01

A computer program is presented for compressible fluid flow with friction and area change. The program carries out a quasi-one-dimensional flow analysis which is valid for laminar and turbulent flows under both subsonic and choked flow conditions. The program was written to be applied to gas film seals. The area-change analysis should prove useful for choked flow conditions with small mean thickness, as well as for face seals where radial area change is significant. The program is written in FORTRAN 4.
Discrim: a computer program using an interactive approach to dissect a mixture of normal or lognormal distributions

USGS Publications Warehouse

Bridges, N.J.; McCammon, R.B.

1980-01-01

DISCRIM is an interactive computer graphics program that dissects mixtures of normal or lognormal distributions. The program was written in an effort to obtain a more satisfactory solution to the dissection problem than that offered by a graphical or numerical approach alone. It combines graphic and analytic techniques using a Tektronix terminal in a time-share computing environment. The main program and subroutines were written in the FORTRAN language.
Computer-aided programming for message-passing system; Problems and a solution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, M.Y.; Gajski, D.D.

1989-12-01

As the number of processors and the complexity of problems to be solved increase, programming multiprocessing systems becomes more difficult and error-prone. Program development tools are necessary since programmers are not able to develop complex parallel programs efficiently. Parallel models of computation, parallelization problems, and tools for computer-aided programming (CAP) are discussed. As an example, a CAP tool that performs scheduling and inserts communication primitives automatically is described. It also generates the performance estimates and other program quality measures to help programmers in improving their algorithms and programs.
OpenSWPC: an open-source integrated parallel simulation code for modeling seismic wave propagation in 3D heterogeneous viscoelastic media

NASA Astrophysics Data System (ADS)

Maeda, Takuto; Takemura, Shunsuke; Furumura, Takashi

2017-07-01

We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method in local-to-regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer for absorbing boundary condition. A hybrid-style programming using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and great portability, allowing excellent performance from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC), are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.

Parallel implementation of an adaptive and parameter-free N-body integrator

NASA Astrophysics Data System (ADS)

Pruett, C. David; Ingham, William H.; Herman, Ralph D.

2011-05-01

Previously, Pruett et al. (2003) [3] described an N-body integrator of arbitrarily high order M with an asymptotic operation count of O(MN). The algorithm's structure lends itself readily to data parallelization, which we document and demonstrate here in the integration of point-mass systems subject to Newtonian gravitation. High order is shown to benefit parallel efficiency. The resulting N-body integrator is robust, parameter-free, highly accurate, and adaptive in both time-step and order. Moreover, it exhibits linear speedup on distributed parallel processors, provided that each processor is assigned at least a handful of bodies. Program summary. Program title: PNB.f90; Catalogue identifier: AEIK_v1_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIK_v1_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html; No. of lines in distributed program, including test data, etc.: 3052; No. of bytes in distributed program, including test data, etc.: 68 600; Distribution format: tar.gz; Programming language: Fortran 90 and OpenMPI; Computer: All shared or distributed memory parallel processors; Operating system: Unix/Linux; Has the code been vectorized or parallelized?: The code has been parallelized but has not been explicitly vectorized; RAM: Dependent upon N; Classification: 4.3, 4.12, 6.5; Nature of problem: High accuracy numerical evaluation of trajectories of N point masses each subject to Newtonian gravitation; Solution method: Parallel and adaptive extrapolation in time via power series of arbitrary degree; Running time: 5.1 s for the demo program supplied with the package.
Parallel solution of sparse one-dimensional dynamic programming problems

NASA Technical Reports Server (NTRS)

Nicol, David M.

1989-01-01

Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.
ROTRAN 1 - SOLUTION OF EQUATIONS FOR ROTARY TRANSFORMERS

NASA Technical Reports Server (NTRS)

Salomon, P. M.

1994-01-01

ROTRAN1 is a computer program to calculate the impedance and current gain of a simple transformer. Inputs to the program are primary resistance, primary inductance, secondary (load) resistance, secondary inductance, and mutual inductance. ROTRAN1 was written in BASICA for execution on the IBM PC personal computer. It was written in 1986.
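For orientation, the quantities this record names can be computed from the standard coupled-inductor circuit equations. The sketch below assumes sinusoidal steady state, a secondary loop closed through the stated secondary (load) resistance, and an operating frequency supplied by the caller (the abstract does not list frequency as an input, so that parameter is an assumption). It is not a reconstruction of the ROTRAN1 BASICA source, and the function name is hypothetical.

```python
import math


def transformer_response(r1, l1, r2, l2, m, freq_hz):
    """Input impedance and current gain of a simple two-winding transformer.

    r1, l1: primary resistance (ohm) and inductance (H)
    r2, l2: secondary (load) resistance (ohm) and inductance (H)
    m:      mutual inductance (H)
    freq_hz is an assumed extra input for this sketch.
    """
    w = 2.0 * math.pi * freq_hz
    z_secondary = complex(r2, w * l2)
    # Secondary loop reflected into the primary: (w M)^2 / Z_secondary.
    z_in = complex(r1, w * l1) + (w * m) ** 2 / z_secondary
    # Secondary loop equation gives I2 / I1 = j w M / Z_secondary.
    current_gain = complex(0.0, w * m) / z_secondary
    return z_in, current_gain
```

Both results are complex numbers; `abs()` gives the magnitude and `cmath.phase()` the phase angle if a polar form is preferred.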
The Psychometric Properties of the Agricultural Hazardous Occupations Order Certification Training Program Written Examinations

ERIC Educational Resources Information Center

French, Brian F.; Breidenbach, Daniel H.; Field, William E.; Tormoehlen, Roger

2007-01-01

The written certification exam that accompanies the Gearing Up for Safety-Agricultural Production Safety Training for Youth curriculum was designed to partially meet the testing requirements of the Agricultural Hazardous Occupations Order (AgHOs) Certification Training Program. This curriculum and accompanying assessment tools are available for…
The Genre of Instructor Feedback in Doctoral Programs: A Corpus Linguistic Analysis

ERIC Educational Resources Information Center

Walters, Kelley Jo; Henry, Patricia; Vinella, Michael; Wells, Steve; Shaw, Melanie; Miller, James

2015-01-01

Providing transparent written feedback to doctoral students is essential to the learning process and preparation for the capstone. The purpose of this study was to conduct a qualitative exploration of faculty feedback on benchmark written assignments across multiple, online doctoral programs. The Corpus for this analysis included 236 doctoral…
76 FR 66309 - Pilot Program for Parallel Review of Medical Products; Correction

Federal Register 2010, 2011, 2012, 2013, 2014

2011-10-26

... DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Medicare and Medicaid Services [CMS-3180-N2] Food and Drug Administration [Docket No. FDA-2010-N-0308] Pilot Program for Parallel Review of Medical... technologies to participate in a program of parallel FDA-CMS review. The document was published with an...
ms2: A molecular simulation tool for thermodynamic properties

NASA Astrophysics Data System (ADS)

Deublein, Stephan; Eckl, Bernhard; Stoll, Jürgen; Lishchuk, Sergey V.; Guevara-Carrion, Gabriela; Glass, Colin W.; Merker, Thorsten; Bernreuther, Martin; Hasse, Hans; Vrabec, Jadran

2011-11-01

This work presents the molecular simulation program ms2 that is designed for the calculation of thermodynamic properties of bulk fluids in equilibrium consisting of small electro-neutral molecules. ms2 features the two main molecular simulation techniques, molecular dynamics (MD) and Monte-Carlo. It supports the calculation of vapor-liquid equilibria of pure fluids and multi-component mixtures described by rigid molecular models on the basis of the grand equilibrium method. Furthermore, it is capable of sampling various classical ensembles and yields numerous thermodynamic properties. To evaluate the chemical potential, Widom's test molecule method and gradual insertion are implemented. Transport properties are determined by equilibrium MD simulations following the Green-Kubo formalism. ms2 is designed to meet the requirements of academia and industry, particularly achieving short response times and straightforward handling. It is written in Fortran90 and optimized for a fast execution on a broad range of computer architectures, spanning from single processor PCs over PC-clusters and vector computers to high-end parallel machines. The standard Message Passing Interface (MPI) is used for parallelization and ms2 is therefore easily portable to different computing platforms. Feature tools facilitate the interaction with the code and the interpretation of input and output files. The accuracy and reliability of ms2 has been shown for a large variety of fluids in preceding work.
Program summary
Program title: ms2
Catalogue identifier: AEJF_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJF_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Special Licence supplied by the authors
No. of lines in distributed program, including test data, etc.: 82 794
No. of bytes in distributed program, including test data, etc.: 793 705
Distribution format: tar.gz
Programming language: Fortran90
Computer: The simulation tool ms2 is usable on a wide variety of platforms, from single processor machines over PC-clusters and vector computers to vector-parallel architectures. (Tested with Fortran compilers: gfortran, Intel, PathScale, Portland Group and Sun Studio.)
Operating system: Unix/Linux, Windows
Has the code been vectorized or parallelized?: Yes. Message Passing Interface (MPI) protocol. Scalability: Excellent scalability up to 16 processors for molecular dynamics and >512 processors for Monte-Carlo simulations.
RAM: ms2 runs on single processors with 512 MB RAM. The memory demand rises with increasing number of processors used per node and increasing number of molecules.
Classification: 7.7, 7.9, 12
External routines: Message Passing Interface (MPI)
Nature of problem: Calculation of application oriented thermodynamic properties for rigid electro-neutral molecules: vapor-liquid equilibria, thermal and caloric data as well as transport properties of pure fluids and multi-component mixtures.
Solution method: Molecular dynamics, Monte-Carlo, various classical ensembles, grand equilibrium method, Green-Kubo formalism.
Restrictions: No. The system size is user-defined. Typical problems addressed by ms2 can be solved by simulating systems containing typically 2000 molecules or less.
Unusual features: Feature tools are available for creating input files, analyzing simulation results and visualizing molecular trajectories.
Additional comments: Sample makefiles for multiple operation platforms are provided. Documentation is provided with the installation package and is available at http://www.ms-2.de.
Running time: The running time of ms2 depends on the problem set, the system size and the number of processes used in the simulation. Running four processes on a "Nehalem" processor, simulations calculating VLE data take between two and twelve hours, calculating transport properties between six and 24 hours.
Fortran interface layer of the framework for developing particle simulator FDPS

NASA Astrophysics Data System (ADS)

Namekata, Daisuke; Iwasawa, Masaki; Nitadori, Keigo; Tanikawa, Ataru; Muranushi, Takayuki; Wang, Long; Hosono, Natsuki; Nomura, Kentaro; Makino, Junichiro

2018-06-01

Numerical simulations based on particle methods have been widely used in various fields including astrophysics. To date, various versions of simulation software have been developed by individual researchers or research groups in each field, through a huge amount of time and effort, even though the numerical algorithms used are very similar. To improve the situation, we have developed a framework, called FDPS (Framework for Developing Particle Simulators), which enables researchers to develop massively parallel particle simulation codes for arbitrary particle methods easily. Until version 3.0, FDPS provided an API (application programming interface) for the C++ programming language only. This limitation comes from the fact that FDPS is developed using the template feature in C++, which is essential to support arbitrary data types of particle. However, there are many researchers who use Fortran to develop their codes. Thus, the previous versions of FDPS require such people to invest much time to learn C++. This is inefficient. To cope with this problem, we developed a Fortran interface layer in FDPS, which provides API for Fortran. In order to support arbitrary data types of particle in Fortran, we design the Fortran interface layer as follows. Based on a given derived data type in Fortran representing particle, a PYTHON script provided by us automatically generates a library that manipulates the C++ core part of FDPS. This library is seen as a Fortran module providing an API of FDPS from the Fortran side and uses C programs internally to interoperate Fortran with C++. In this way, we have overcome several technical issues when emulating a `template' in Fortran. Using the Fortran interface, users can develop all parts of their codes in Fortran. We show that the overhead of the Fortran interface part is sufficiently small and a code written in Fortran shows a performance practically identical to the one written in C++.
Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

PubMed

Maruyama, Yutaka; Yoshida, Norio; Tadano, Hiroto; Takahashi, Daisuke; Sato, Mitsuhisa; Hirata, Fumio

2014-07-05

A new three-dimensional reference interaction site model (3D-RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D-FFT) was developed, and tested on the RIKEN K supercomputer. The ordinary parallel 3D-RISM program has a limitation on the number of parallelizations because of the limitations of the slab-type 3D-FFT. The volumetric 3D-FFT relieves this limitation drastically. We tested the 3D-RISM calculation on the large and fine calculation cell (2048^3 grid points) on 16,384 nodes, each having eight CPU cores. The new 3D-RISM program achieved excellent scalability to the parallelization, running on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D-RISM program is effective for analyzing the hydration properties of large biomolecular systems. Copyright © 2014 Wiley Periodicals, Inc.
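The slab-type limitation mentioned above can be made concrete with a little arithmetic. The sketch below assumes the usual reading of a slab decomposition, in which each process owns whole planes of the grid along one axis, so no more processes than planes can do useful work. That reading is an assumption, since the abstract does not spell it out, and the function name is illustrative.

```python
GRID_POINTS_PER_AXIS = 2048   # from the abstract: 2048^3 grid points
NODES = 16_384                # from the abstract
CORES_PER_NODE = 8            # from the abstract

def max_slab_processes(points_per_axis: int) -> int:
    """Upper bound on useful processes when the grid is cut into whole planes."""
    return points_per_axis

if __name__ == "__main__":
    limit = max_slab_processes(GRID_POINTS_PER_AXIS)
    print("slab-type process limit:", limit)
    print("nodes used in the test :", NODES)
    print("cores used in the test :", NODES * CORES_PER_NODE)
    print("nodes per grid plane   :", NODES // limit)
```

Under this assumption the reported run used eight times more nodes than a slab decomposition could keep busy, which is consistent with the abstract's motivation for a volumetric decomposition.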
F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

NASA Technical Reports Server (NTRS)

DiNucci, David C.; Saini, Subhash (Technical Monitor)

1998-01-01

Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called F-Nets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).
Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

NASA Technical Reports Server (NTRS)

Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

1994-01-01

Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.
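To illustrate what distributing a data structure over memory modules means, the sketch below maps array indices to owning processors under the two standard HPF distribution formats, BLOCK and CYCLIC. The abstract does not name these formats, so their use here is an illustration, and the Python function names are hypothetical (HPF itself expresses this with directives in Fortran source).

```python
def block_owner(index: int, n: int, procs: int) -> int:
    """Owner of a 0-based index when n elements are split into contiguous blocks."""
    block_size = -(-n // procs)        # ceiling division
    return index // block_size

def cyclic_owner(index: int, procs: int) -> int:
    """Owner of a 0-based index when elements are dealt out round-robin."""
    return index % procs

def layout(n: int, procs: int, kind: str) -> list:
    """Owning processor of every element, for kind 'block' or 'cyclic'."""
    if kind == "block":
        return [block_owner(i, n, procs) for i in range(n)]
    if kind == "cyclic":
        return [cyclic_owner(i, procs) for i in range(n)]
    raise ValueError("kind must be 'block' or 'cyclic'")

if __name__ == "__main__":
    print("block :", layout(10, 4, "block"))
    print("cyclic:", layout(10, 4, "cyclic"))
```

A block layout keeps neighbouring elements on the same processor, which suits stencil-like access. A cyclic layout spreads work more evenly when the cost per element varies along the array.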
Using CLIPS in the domain of knowledge-based massively parallel programming

NASA Technical Reports Server (NTRS)

Dvorak, Jiri J.

1994-01-01

The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency in respect of parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering is discussed.
Spur, helical, and spiral bevel transmission life modeling

NASA Technical Reports Server (NTRS)

Savage, Michael; Rubadeux, Kelly L.; Coe, Harold H.; Coy, John J.

1994-01-01

A computer program, TLIFE, which estimates the life, dynamic capacity, and reliability of aircraft transmissions, is presented. The program enables comparisons of transmission service life at the design stage for optimization. A variety of transmissions may be analyzed including: spur, helical, and spiral bevel reductions as well as series combinations of these reductions. The basic spur and helical reductions include: single mesh, compound, and parallel path plus revert star and planetary gear trains. A variety of straddle and overhung bearing configurations on the gear shafts are possible as is the use of a ring gear for the output. The spiral bevel reductions include single and dual input drives with arbitrary shaft angles. The program is written in FORTRAN 77 and has been executed both in the personal computer DOS environment and on UNIX workstations. The analysis may be performed in either the SI metric or the English inch system of units. The reliability and life analysis is based on the two-parameter Weibull distribution lives of the component gears and bearings. The program output file describes the overall transmission and each constituent transmission, its components, and their locations, capacities, and loads. Primary output is the dynamic capacity and 90-percent reliability and mean lives of the unit transmissions and the overall system which can be used to estimate service overhaul frequency requirements. Two examples are presented to illustrate the information available for single element and series transmissions.
FastChem: A computer program for efficient complex chemical equilibrium calculations in the neutral/ionized gas phase with applications to stellar and planetary atmospheres

NASA Astrophysics Data System (ADS)

Stock, Joachim W.; Kitzmann, Daniel; Patzer, A. Beate C.; Sedlmayr, Erwin

2018-06-01

For the calculation of complex neutral/ionized gas phase chemical equilibria, we present a semi-analytical, versatile, and efficient computer program, called FastChem. The applied method is based on the solution of a system of coupled nonlinear (and linear) algebraic equations, namely the law of mass action and the element conservation equations including charge balance, in many variables. Specifically, the system of equations is decomposed into a set of coupled nonlinear equations in one variable each, which are solved analytically whenever feasible to reduce computation time. Notably, the electron density is determined by using the method of Nelder and Mead at low temperatures. The program is written in object-oriented C++ which makes it easy to couple the code with other programs, although a stand-alone version is provided. FastChem can be used in parallel or sequentially and is available under the GNU General Public License version 3 at https://github.com/exoclime/FastChem together with several sample applications. The code has been successfully validated against previous studies and its convergence behavior has been tested even for extreme physical parameter ranges down to 100 K and up to 1000 bar. FastChem converges stably and robustly even in the most demanding chemical situations, which sometimes posed extreme challenges for previous algorithms.
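A toy case shows what a nonlinear equation in one variable, solved analytically, can look like. The sketch below is not FastChem code. It assumes a gas containing only atomic H and molecular H2, a mass-action constant `k_h2` expressed in number-density units, and hypothetical function and parameter names. Mass action gives n_H2 = K n_H^2, and hydrogen conservation gives n_H + 2 n_H2 = N, which is a quadratic in n_H.

```python
import math

def solve_hydrogen(n_total: float, k_h2: float):
    """Return (n_H, n_H2) satisfying n_H + 2*k_h2*n_H**2 = n_total.

    Uses the positive root of the quadratic, written in a form that avoids
    cancellation when 8*k_h2*n_total is small.
    """
    disc = math.sqrt(1.0 + 8.0 * k_h2 * n_total)
    n_h = 2.0 * n_total / (1.0 + disc)
    n_h2 = k_h2 * n_h ** 2
    return n_h, n_h2

if __name__ == "__main__":
    n_h, n_h2 = solve_hydrogen(n_total=3.0, k_h2=1.0)
    print("n_H =", n_h, " n_H2 =", n_h2)
    print("conservation check:", n_h + 2.0 * n_h2)
```

In a real network each element's equation also contains terms from molecules shared with other elements, which is why the abstract describes the one-variable equations as coupled.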
Evolving binary classifiers through parallel computation of multiple fitness cases.

PubMed

Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni

2005-06-01

This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. Such an approach achieves high computation efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized explicitly using parallel computation in the case of cellular programming or implicitly taking advantage of the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
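The intrinsic parallelism of bitwise operators can be sketched by packing one bit per fitness case into a machine word, so that a single AND, XOR or NOT evaluates a Boolean expression on all cases at once. The expression and data below are hypothetical and are not taken from the paper.

```python
def pack(bits) -> int:
    """Pack a list of 0/1 values into an integer, case i in bit i."""
    word = 0
    for i, bit in enumerate(bits):
        if bit:
            word |= 1 << i
    return word

def unpack(word: int, n: int) -> list:
    """Inverse of pack for the lowest n bits."""
    return [(word >> i) & 1 for i in range(n)]

def candidate(a: int, b: int, c: int, mask: int) -> int:
    """Hypothetical evolved classifier: (a AND b) XOR (NOT c), all cases at once."""
    return (a & b) ^ (~c & mask)

def hits(output: int, target: int, n: int) -> int:
    """Number of fitness cases on which output and target agree."""
    mask = (1 << n) - 1
    return n - bin((output ^ target) & mask).count("1")

if __name__ == "__main__":
    n = 6
    mask = (1 << n) - 1
    a = pack([0, 0, 1, 1, 0, 1])
    b = pack([0, 1, 0, 1, 1, 1])
    c = pack([0, 0, 0, 1, 1, 0])
    out = candidate(a, b, c, mask)
    print(unpack(out, n))
    print(hits(out, pack([1, 1, 1, 1, 0, 1]), n), "of", n, "cases correct")
```

On a 64-bit processor the same idea evaluates 64 fitness cases per instruction without any explicit parallel hardware.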
Implementations of BLAST for parallel computers.

PubMed

Jülich, A

1995-02-01

The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared memory machine Cray Y-MP 8/864 and the distributed memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
Programming parallel architectures: The BLAZE family of languages

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush

1988-01-01

Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.
Exploiting Vector and Multicore Parallelism for Recursive, Data- and Task-Parallel Programs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ren, Bin; Krishnamoorthy, Sriram; Agrawal, Kunal

Modern hardware contains parallel execution resources that are well-suited for data parallelism (vector units) and task parallelism (multicores). However, most work on parallel scheduling focuses on one type of hardware or the other. In this work, we present a scheduling framework that allows for a unified treatment of task- and data-parallelism. Our key insight is an abstraction, task blocks, that uniformly handles data-parallel iterations and task-parallel tasks, allowing them to be scheduled on vector units or executed independently on multicores. Our framework allows us to define schedulers that can dynamically select between executing task blocks on vector units or multicores. We show that these schedulers are asymptotically optimal, and deliver the maximum amount of parallelism available in computation trees. To evaluate our schedulers, we develop program transformations that can convert mixed data- and task-parallel programs into task block-based programs. Using a prototype instantiation of our scheduling framework, we show that, on an 8-core system, we can simultaneously exploit vector and multicore parallelism to achieve 14× to 108× speedup over sequential baselines.
High-performance computing — an overview

NASA Astrophysics Data System (ADS)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
42 CFR 456.80 - Individual written plan of care.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 42 Public Health 4 2010-10-01 2010-10-01 false Individual written plan of care. 456.80 Section 456... (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Hospitals Plan of Care § 456.80 Individual written plan of care. (a) Before admission to a hospital or before authorization for...

76 FR 37048 - Louisiana; Final Authorization of State Hazardous Waste Management Program Revisions

Federal Register 2010, 2011, 2012, 2013, 2014

2011-06-24

... the preamble to the immediate final rule. Unless we get written comments which oppose this... your written comments by July 25, 2011. ADDRESSES: Send written comments to Alima Patterson, Region 6... hand delivery/courier; please follow the detailed instructions in the ADDRESSES section of the...
32 CFR 241.5 - Written agreements.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 32 National Defense 2 2011-07-01 2011-07-01 false Written agreements. 241.5 Section 241.5 National... PILOT PROGRAM FOR TEMPORARY EXCHANGE OF INFORMATION TECHNOLOGY PERSONNEL § 241.5 Written agreements. (a... to be assigned to ITEP must sign a three-party agreement. Prior to the agreement being signed the...
32 CFR 241.5 - Written agreements.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 32 National Defense 2 2013-07-01 2013-07-01 false Written agreements. 241.5 Section 241.5 National... PILOT PROGRAM FOR TEMPORARY EXCHANGE OF INFORMATION TECHNOLOGY PERSONNEL § 241.5 Written agreements. (a... agreement. Prior to the agreement being signed the relevant legal office for the DoD Component shall review...
25 CFR 286.18 - Written notice.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 25 Indians 1 2010-04-01 2010-04-01 false Written notice. 286.18 Section 286.18 Indians BUREAU OF INDIAN AFFAIRS, DEPARTMENT OF THE INTERIOR ECONOMIC ENTERPRISES INDIAN BUSINESS DEVELOPMENT PROGRAM § 286.18 Written notice. The applicant for a grant which is disapproved will be notified by letter, stating...
Transforming Biology Assessment with Machine Learning: Automated Scoring of Written Evolutionary Explanations

ERIC Educational Resources Information Center

Nehm, Ross H.; Ha, Minsu; Mayfield, Elijah

2012-01-01

This study explored the use of machine learning to automatically evaluate the accuracy of students' written explanations of evolutionary change. Performance of the Summarization Integrated Development Environment (SIDE) program was compared to human expert scoring using a corpus of 2,260 evolutionary explanations written by 565 undergraduate…
The Design and Evaluation of "CAPTools"--A Computer Aided Parallelization Toolkit

NASA Technical Reports Server (NTRS)

Yan, Jerry; Frumkin, Michael; Hribar, Michelle; Jin, Haoqiang; Waheed, Abdul; Johnson, Steve; Cross, Jark; Evans, Emyr; Ierotheou, Constantinos; Leggett, Pete;

1998-01-01

Writing applications for high performance computers is a challenging task. Although writing code by hand still offers the best performance, it is extremely costly and often not very portable. The Computer Aided Parallelization Tools (CAPTools) are a toolkit designed to help automate the mapping of sequential FORTRAN scientific applications onto multiprocessors. CAPTools consists of the following major components: an inter-procedural dependence analysis module that incorporates user knowledge; a 'self-propagating' data partitioning module driven via user guidance; an execution control mask generation and optimization module for the user to fine tune parallel processing of individual partitions; a program transformation/restructuring facility for source code clean up and optimization; a set of browsers through which the user interacts with CAPTools at each stage of the parallelization process; and a code generator supporting multiple programming paradigms on various multiprocessors. Besides describing the rationale behind the architecture of CAPTools, the parallelization process is illustrated via case studies involving structured and unstructured meshes. The programming process and the performance of the generated parallel programs are compared against other programming alternatives based on the NAS Parallel Benchmarks, ARC3D and other scientific applications. Based on these results, a discussion on the feasibility of constructing architectural independent parallel applications is presented.

Real-time implementations of image segmentation algorithms on shared memory multicore architecture: a survey (Conference Presentation)

NASA Astrophysics Data System (ADS)

Akil, Mohamed

2017-05-01

Real-time processing is becoming increasingly important in many image processing applications. Image segmentation is one of the most fundamental tasks in image analysis. As a consequence, many different approaches for image segmentation have been proposed. The watershed transform is a well-known image segmentation tool, and it is a very data-intensive task. To achieve acceleration and obtain real-time processing of watershed algorithms, parallel architectures and programming models for multicore computing have been developed. This paper surveys the approaches for parallel implementation of sequential watershed algorithms on multicore general purpose CPUs: homogeneous multicore processors with shared memory. To achieve an efficient parallel implementation, it is necessary to explore different strategies (parallelization/distribution/distributed scheduling) combined with different acceleration and optimization techniques to enhance parallelism. In this paper, we give a comparison of various parallelizations of sequential watershed algorithms on shared memory multicore architecture. We analyze the performance measurements of each parallel implementation and the impact of the different sources of overhead on the performance of the parallel implementations. In this comparison study, we also discuss the advantages and disadvantages of the parallel programming models. Thus, we compare OpenMP (an application programming interface for multiprocessing) with Pthreads (POSIX Threads) to illustrate the impact of each parallel programming model on the performance of the parallel implementations.
Scalable and portable visualization of large atomistic datasets

NASA Astrophysics Data System (ADS)

Sharma, Ashish; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya

2004-10-01

A scalable and portable code named Atomsviewer has been developed to interactively visualize a large atomistic dataset consisting of up to a billion atoms. The code uses a hierarchical view frustum-culling algorithm based on the octree data structure to efficiently remove atoms outside of the user's field-of-view. Probabilistic and depth-based occlusion-culling algorithms then select atoms, which have a high probability of being visible. Finally a multiresolution algorithm is used to render the selected subset of visible atoms at varying levels of detail. Atomsviewer is written in C++ and OpenGL, and it has been tested on a number of architectures including Windows, Macintosh, and SGI. Atomsviewer has been used to visualize tens of millions of atoms on a standard desktop computer and, in its parallel version, up to a billion atoms.
Program summary
Title of program: Atomsviewer
Catalogue identifier: ADUM
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADUM
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: 2.4 GHz Pentium 4/Xeon processor, professional graphics card; Apple G4 (867 MHz)/G5, professional graphics card
Operating systems under which the program has been tested: Windows 2000/XP, Mac OS 10.2/10.3, SGI IRIX 6.5
Programming languages used: C++, C and OpenGL
Memory required to execute with typical data: 1 gigabyte of RAM
High speed storage required: 60 gigabytes
No. of lines in the distributed program including test data, etc.: 550 241
No. of bytes in the distributed program including test data, etc.: 6 258 245
Number of bits in a word: Arbitrary
Number of processors used: 1
Has the code been vectorized or parallelized: No
Distribution format: tar gzip file
Nature of physical problem: Scientific visualization of atomic systems
Method of solution: Rendering of atoms using computer graphic techniques, culling algorithms for data minimization, and levels-of-detail for minimal rendering
Restrictions on the complexity of the problem: None
Typical running time: The program is interactive in its execution
Unusual features of the program: None
References: The conceptual foundation and subsequent implementation of the algorithms are found in [A. Sharma, A. Nakano, R.K. Kalia, P. Vashishta, S. Kodiyalam, P. Miller, W. Zhao, X.L. Liu, T.J. Campbell, A. Haas, Presence—Teleoperators and Virtual Environments 12 (1) (2003)].
Physics Structure Analysis of Parallel Waves Concept of Physics Teacher Candidate

NASA Astrophysics Data System (ADS)

Sarwi, S.; Supardi, K. I.; Linuwih, S.

2017-04-01

The aim of this research was to identify the structure of parallel conceptions of wave physics held by physics teacher candidates and the factors that influence the formation of those conceptions. The study used a qualitative method with a cross-sectional design. The subjects were five third-semester students of basic physics and six fifth-semester students of the wave course. Data were collected through think-aloud sessions and written tests. Quantitative data were analysed with a descriptive percentage technique. Answers reflecting belief and awareness were analysed with an explanatory analysis. The results include: 1) the structure of the concept can be displayed as a map containing the theoretical core, supplements to the theory, and phenomena that occur daily; 2) trends of parallel conceptions of wave physics were identified for stationary waves, resonance of sound, and the propagation of transverse electromagnetic waves; 3) parallel conceptions are influenced by less comprehensive reading of textbooks and by partial understanding of the knowledge that forms the structure of the theory.
Laser Metalworking Technology Transfer.

DTIC Science & Technology

1986-01-01

TI 59 programmable calculator/printer ... the one-dimensional heat flow model and should not be used for low processing speed. The program is written for use on a Texas Instruments TI 59 programmable calculator with ... speed range, and a three-dimensional model for the low speed ranges. The program is written for use on a Texas Instruments TI 59 programmable calculator
Sonoma County Plants (How the Indians Used Them).

ERIC Educational Resources Information Center

Clayton, Jane; Lloyd, Dick

Written for children, this guide to the plants found in Sonoma County, California includes sketches of 20 plants and descriptions of the way in which American Indians traditionally used them. Many of the plants presented here parallel those found on a wildlife walk at Spring Lake in Santa Rosa, California where outdoor education expeditions can be…
Reading Without the Left Ventral Occipito-Temporal Cortex

ERIC Educational Resources Information Center

Seghier, Mohamed L.; Neufeld, Nicholas H.; Zeidman, Peter; Leff, Alex P.; Mechelli, Andrea; Nagendran, Arjuna; Riddoch, Jane M.; Humphreys, Glyn W.; Price, Cathy J.

2012-01-01

The left ventral occipito-temporal cortex (LvOT) is thought to be essential for the rapid parallel letter processing that is required for skilled reading. Here we investigate whether rapid written word identification in skilled readers can be supported by neural pathways that do not involve LvOT. Hypotheses were derived from a stroke patient who…
The Internet for Teachers and School Library Media Specialists: Today's Applications, Tomorrow's Prospects.

ERIC Educational Resources Information Center

Valauskas, Edward J.; Ertel, Monica

This book is a collection of "success stories" written by teachers, media specialists, and school administrators who have developed their own facilities to bring the Internet to their students. The book's four main parts parallel the stages educators progress through when incorporating the Internet into the instructional process. Part 1,…
The Self-Perceptions of Young Men as Singers in Singaporean Pre-University Schools

ERIC Educational Resources Information Center

Freer, Patrick K.; Tan, Leonard

2014-01-01

The persistence of young men in choral singing activity has been widely studied in North America, with emerging parallel research in Europe (Freer, 2013; Harrison & Welch, 2012). There has been little such research in Asia. This study, of 12 young men enrolled in Singapore's pre-university schools, collected both written narratives and drawn…
Data reduction software for LORAN-C flight test evaluation

NASA Technical Reports Server (NTRS)

Fischer, J. P.

1979-01-01

A set of programs designed to be run on an IBM 370/158 computer to read the recorded time differences from the tape produced by the LORAN data collection system, convert them to latitude/longitude and produce various plotting input files is described. The programs were written so they may be tailored easily to meet the demands of a particular data reduction job. The tape reader program is written in 370 assembler language and the remaining programs are written in standard IBM FORTRAN-IV language. The tape reader program is dependent upon the recording format used by the data collection system and on the I/O macros used at the computing facility. The other programs are generally device-independent, although the plotting routines are dependent upon the plotting method used. The data reduction programs convert the recorded data to a more readily usable form: they convert the time difference (TD) numbers to latitude/longitude (lat/long), format a printed listing of the TDs, lat/long, reference times, and other information derived from the data, and produce data files which may be used for subsequent plotting.
LABORATORY PROCESS CONTROLLER USING NATURAL LANGUAGE COMMANDS FROM A PERSONAL COMPUTER

NASA Technical Reports Server (NTRS)

Will, H.

1994-01-01

The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.
Multiprocessor speed-up, Amdahl's Law, and the Activity Set Model of parallel program behavior

NASA Technical Reports Server (NTRS)

Gelenbe, Erol

1988-01-01

An important issue in the effective use of parallel processing is the estimation of the speed-up one may expect as a function of the number of processors used. Amdahl's Law has traditionally provided a guideline to this issue, although it appears excessively pessimistic in the light of recent experimental results. In this note, Amdahl's Law is amended by giving a greater importance to the capacity of a program to make effective use of parallel processing, but also recognizing the fact that imbalance of the workload of each processor is bound to occur. An activity set model of parallel program behavior is then introduced along with the corresponding parallelism index of a program, leading to upper and lower bounds to the speed-up.
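For reference, the classical form of Amdahl's Law that the note takes as its starting point can be sketched as below. This is only the textbook baseline, not the amended law or the activity set model from the report; the function name `amdahl_speedup` and its parameters are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Classical Amdahl's Law (textbook form, not the report's amended version).

    parallel_fraction: share of the run time that parallelizes perfectly (0..1).
    processors: number of processors used (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; the parallel part is divided among processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16):
        print(p, round(amdahl_speedup(0.9, p), 3))
```

With 90% of the work parallelizable, the speed-up can never exceed 10 however many processors are added, which is the pessimism the abstract refers to.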
Desktop computer graphics for RMS/payload handling flight design

NASA Technical Reports Server (NTRS)

Homan, D. J.

1984-01-01

A computer program, the Multi-Adaptive Drawings, Renderings and Similitudes (MADRAS) program, is discussed. The modeling program, written for a desktop computer system (the Hewlett-Packard 9845/C), is written in BASIC and uses modular construction of objects while generating both wire-frame and hidden-line drawings from any viewpoint. The dimensions and placement of objects are user definable. Once the hidden-line calculations are made for a particular viewpoint, the viewpoint may be rotated in pan, tilt, and roll without further hidden-line calculations. The use and results of this program are discussed.
Experiences with hypercube operating system instrumentation

NASA Technical Reports Server (NTRS)

Reed, Daniel A.; Rudolph, David C.

1989-01-01

The difficulties in conceptualizing the interactions among a large number of processors make it difficult both to identify the sources of inefficiencies and to determine how a parallel program could be made more efficient. This paper describes an instrumentation system that can trace the execution of distributed memory parallel programs by recording the occurrence of parallel program events. The resulting event traces can be used to compile summary statistics that provide a global view of program performance. In addition, visualization tools permit the graphic display of event traces. Visual presentation of performance data is particularly useful, indeed, necessary for large-scale parallel computers; the enormous volume of performance data mandates visual display.
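The step from raw event traces to summary statistics might look roughly like the following sketch. The record layout (`TraceEvent` with a time, a processor number, and an event kind) is a hypothetical format chosen for illustration; the paper's actual trace format is not given in the abstract.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple


class TraceEvent(NamedTuple):
    """Hypothetical trace record; the real system's format is not specified here."""
    time: float      # timestamp of the event
    processor: int   # node on which the event occurred
    kind: str        # e.g. "send", "recv", "compute"


def summarize(events: Iterable[TraceEvent]) -> Dict[int, Dict[str, int]]:
    """Count how often each kind of event occurred on each processor."""
    per_processor = defaultdict(Counter)
    for event in events:
        per_processor[event.processor][event.kind] += 1
    return {proc: dict(counts) for proc, counts in per_processor.items()}


if __name__ == "__main__":
    trace = [
        TraceEvent(0.0, 0, "send"),
        TraceEvent(0.1, 1, "recv"),
        TraceEvent(0.2, 0, "compute"),
    ]
    print(summarize(trace))
```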
Communications oriented programming of parallel iterative solutions of sparse linear systems

NASA Technical Reports Server (NTRS)

Patrick, M. L.; Pratt, T. W.

1986-01-01

Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.

Formatting scripts with computers and Extended BASIC.

PubMed

Menning, C B

1984-02-01

A computer program, written in the language of Extended BASIC, is presented which enables scripts, for educational media, to be quickly written in a nearly unformatted style. From the resulting script file, stored on magnetic tape or disk, the computer program formats the script into either a storyboard, a presentation, or a narrator's script. Script headings and page and paragraph numbers are automatic features in the word processing. Suggestions are given for making personal modifications to the computer program.
Analysis of spacecraft data

NASA Technical Reports Server (NTRS)

1984-01-01

A software program for the production and analysis of data from the Dynamics Explorer-A (DE-A) satellite was maintained and modified and new software initiated. A capability was developed to process DE-A plasma-wave instrument mission analysis files on the Tektronix 4027 color CRT, for which two programs were written. The algorithm for the calibration lookup table for the plasma-wave instrument data was modified and verified, and a production program to generate color FR-80 spectrograms was written.
Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

PubMed

Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

2013-10-24

Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (<100 ms) moved to the center only one saccade later. Our data on eye movement positions provide novel evidence of parallel programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.
77 FR 47797 - Arkansas: Final Authorization of State Hazardous Waste Management Program Revisions

Federal Register 2010, 2011, 2012, 2013, 2014

2012-08-10

... the preamble to the immediate final rule. Unless we get written comments which oppose this... comment. If you want to comment on this action, you must do so at this time. DATES: Send your written comments by September 10, 2012. ADDRESSES: Send written comments to Alima Patterson, Region 6, Regional...
A Comparison of Written, Vocal, and Video Feedback When Training Teachers

ERIC Educational Resources Information Center

Luck, Kally M.; Lerman, Dorothea C.; Wu, Wai-Ling; Dupuis, Danielle L.; Hussein, Louisa A.

2018-01-01

We compared the effectiveness of and preference for different feedback strategies when training six special education teachers during a 5-day summer training program. In Experiment 1, teachers received written or vocal feedback while learning to implement two different types of preference assessments. In Experiment 2, we compared either written or…
Discourse Features of Written Mexican Spanish: Current Research in Contrastive Rhetoric and Its Implications.

ERIC Educational Resources Information Center

Montano-Harmon, Maria Rosario

1991-01-01

Analyzes discourse features of compositions written in Spanish by secondary school students in Mexico, draws comparisons with those written in English by Anglo-American students in the United States, and discusses the implications of the results for teaching and evaluating composition skills in Spanish language programs. (29 references) (GLR)
A survey of parallel programming tools

NASA Technical Reports Server (NTRS)

Cheng, Doreen Y.

1991-01-01

This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.
Xyce Parallel Electronic Simulator Users' Guide Version 6.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
An object-oriented, coprocessor-accelerated model for ice sheet simulations

NASA Astrophysics Data System (ADS)

Seddik, H.; Greve, R.

2013-12-01

Recently, numerous models capable of modeling the thermo-dynamics of ice sheets have been developed within the ice sheet modeling community. Their capabilities have been characterized by a wide range of features with different numerical methods (finite difference or finite element), different implementations of the ice flow mechanics (shallow-ice, higher-order, full Stokes) and different treatments for the basal and coastal areas (basal hydrology, basal sliding, ice shelves). Shallow-ice models (SICOPOLIS, IcIES, PISM, etc) have been widely used for modeling whole ice sheets (Greenland and Antarctica) due to the relatively low computational cost of the shallow-ice approximation but higher order (ISSM, AIF) and full Stokes (Elmer/Ice) models have been recently used to model the Greenland ice sheet. The advance in processor speed and the decrease in cost for accessing large amount of memory and storage have undoubtedly been the driving force in the commoditization of models with higher capabilities, and the popularity of Elmer/Ice (http://elmerice.elmerfem.com) with an active user base is a notable representation of this trend. Elmer/Ice is a full Stokes model built on top of the multi-physics package Elmer (http://www.csc.fi/english/pages/elmer) which provides the full machinery for the complex finite element procedure and is fully parallel (mesh partitioning with OpenMPI communication). Elmer is mainly written in Fortran 90 and targets essentially traditional processors as the code base was not initially written to run on modern coprocessors (yet adding support for the recently introduced x86 based coprocessors is possible). Furthermore, a truly modular and object-oriented implementation is required for quick adaptation to fast evolving capabilities in hardware (Fortran 2003 provides an object-oriented programming model while not being clean and requiring a tricky refactoring of Elmer code). 
In this work, the object-oriented, coprocessor-accelerated finite element code Sainou is introduced. Sainou is an Elmer fork which is reimplemented in Objective C and used for experimenting with ice sheet models running on coprocessors, essentially GPU devices. GPUs are highly parallel processors that provide opportunities for fine-grained parallelization of the full Stokes problem using the standard OpenCL language (http://www.khronos.org/opencl/) to access the device. Sainou is built upon a collection of Objective C base classes that service a modular kernel (itself a base class) which provides the core methods to solve the finite element problem. An early implementation of Sainou will be presented with emphasis on the object architecture and the strategies of parallelizations. The computation of a simple heat conduction problem is used to test the implementation which also provides experimental support for running the global matrix assembly on GPU.
Backtracking and Re-execution in the Automatic Debugging of Parallelized Programs

NASA Technical Reports Server (NTRS)

Matthews, Gregory; Hood, Robert; Johnson, Stephen; Leggett, Peter; Biegel, Bryan (Technical Monitor)

2002-01-01

In this work we describe a new approach using relative debugging to find differences in computation between a serial program and a parallel version of that program. We use a combination of re-execution and backtracking in order to find the first difference in computation that may ultimately lead to an incorrect value that the user has indicated. In our prototype implementation we use static analysis information from a parallelization tool in order to perform the backtracking as well as the mapping required between serial and parallel computations.
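The core comparison behind relative debugging, locating the first point at which two runs disagree, can be illustrated with a minimal sketch. It assumes the serial and parallel values have already been mapped onto two aligned sequences; the backtracking, re-execution, and static-analysis mapping described in the abstract are not modelled, and the name `first_difference` is hypothetical.

```python
import math
from typing import Optional, Sequence


def first_difference(serial: Sequence[float],
                     parallel: Sequence[float],
                     rel_tol: float = 1e-9) -> Optional[int]:
    """Return the index of the first value where the two runs disagree.

    Assumes both sequences record the same variable at corresponding
    points of the computation. Returns None if no difference is found.
    """
    for index, (s_value, p_value) in enumerate(zip(serial, parallel)):
        if not math.isclose(s_value, p_value, rel_tol=rel_tol):
            return index
    return None


if __name__ == "__main__":
    print(first_difference([1.0, 2.0, 3.0], [1.0, 2.0, 3.5]))
```

A tolerance is used rather than exact equality because a parallel run may legitimately reorder floating-point operations.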
Single-Photon Computed Tomography With Large Position-Sensitive Phototubes*

NASA Astrophysics Data System (ADS)

Feldmann, John; Ranck, Amoreena; Saunders, Robert S.; Welsh, Robert E.; Bradley, Eric L.; Saha, Margaret S.; Kross, Brian; Majewski, Stan; Popov, Vladimir; Weisenberger, Andrew G.; Wojcik, Randolph

2000-10-01

Position-sensitive photomultiplier tubes (PSPMTs) coupled to pixelated CsI(Tl) scintillators have been used with parallel-hole collimators to view the metabolism in small animals of radiopharmaceuticals tagged with ^125I. We report here our preliminary results analyzed using a tomography program^1 written in IDL programming language. The PSPMTs are mounted on a rotating gantry so as to view the subject animal from any azimuth. Preliminary results to test the tomography algorithm have been obtained by placing a variety of plastic mouse-brain phantoms (loaded with Na^125I) in front of one of the detectors and rotating the phantom in steps through 360 degrees. Results of this simulation taken with a variety of collimator hole sizes will be compared and discussed. Extensions of this technique to the use of very small PSPMTs (Hamamatsu M-64) which are capable of a very close approach to those parts of the animal of greatest interest will be described. *Supported in part by The Department of Energy, The National Science Foundation, The American Diabetes Association, The Howard Hughes Foundation and The Jeffress Trust. 1. Tomography algorithm kindly provided by Dr. S. Meikle of The Royal Prince Albert Hospital, Sydney, Australia
Preventing Summer Learning Loss: Results of a Summer Literacy Program for Students from Low-SES Homes

ERIC Educational Resources Information Center

Bowers, Lisa M.; Schwarz, Ilsa

2018-01-01

Among the academic challenges faced by students from low-socioeconomic status (SES) homes is the loss of academic skills during the summer months. A total of 22 elementary students from low-SES homes participated in a summer program designed to improve oral and written narrative skills. We gathered oral and written narrative samples at the…
Expert System for Automated Design Synthesis

NASA Technical Reports Server (NTRS)

Rogers, James L., Jr.; Barthelemy, Jean-Francois M.

1987-01-01

Expert-system computer program EXADS developed to aid users of Automated Design Synthesis (ADS) general-purpose optimization program. EXADS aids engineer in determining best combination based on knowledge of specific problem and expert knowledge stored in knowledge base. Available in two interactive machine versions. IBM PC version (LAR-13687) written in IQ-LISP. DEC VAX version (LAR-13688) written in Franz-LISP.
Qualitative Analysis of Teachers' Written Self-Reflections after Implementation of a Social-Emotional Learning Program in Latvia

ERIC Educational Resources Information Center

Martinsone, Baiba; Damberga, Ilze

2017-01-01

The aim of the present study was to analyze teachers' written self-reflections after implementation of a social-emotional learning program, which was recently developed specifically for the sociocultural context of Latvia. The goal of the analysis was to examine how teachers reflect upon their own strengths and weaknesses in implementing the…
Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry

1998-01-01

This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
An OpenACC-Based Unified Programming Model for Multi-accelerator Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Jungwon; Lee, Seyong; Vetter, Jeffrey S

2015-01-01

This paper proposes a novel SPMD programming model of OpenACC. Our model integrates the different granularities of parallelism from vector-level parallelism to node-level parallelism into a single, unified model based on OpenACC. It allows programmers to write programs for multiple accelerators using a uniform programming model whether they are in shared or distributed memory systems. We implement a prototype of our model and evaluate its performance with a GPU-based supercomputer using three benchmark applications.
76 FR 67200 - Proposed National Toxicology Program (NTP) Review Process for the Report on Carcinogens: Request...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-10-31

... announcement of listening session. SUMMARY: The NTP invites written public comment on the proposed Report on... proposed process. DATES: The deadline for submission of written comments is November 30, 2011, and the...: Written comments should be sent to Dr. Ruth Lunn, Director, Office of the Report on Carcinogens, DNTP...
Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Hao-Qiang; anMey, Dieter; Hatay, Ferhat F.

2003-01-01

Clusters of SMP (Symmetric Multi-Processors) nodes provide support for a wide range of parallel programming paradigms. The shared address space within each node is suitable for OpenMP parallelization. Message passing can be employed within and across the nodes of a cluster. Multiple levels of parallelism can be achieved by combining message passing and OpenMP parallelization. Which programming paradigm is the best will depend on the nature of the given problem, the hardware components of the cluster, the network, and the available software. In this study we compare the performance of different implementations of the same CFD benchmark application, using the same numerical algorithm but employing different programming paradigms.
OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems.

PubMed

Stone, John E; Gohara, David; Shi, Guochun

2010-05-01

We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.
DB90: A Fortran Callable Relational Database Routine for Scientific and Engineering Computer Programs

NASA Technical Reports Server (NTRS)

Wrenn, Gregory A.

2005-01-01

This report describes a database routine called DB90 which is intended for use with scientific and engineering computer programs. The software is written in the Fortran 90/95 programming language standard with file input and output routines written in the C programming language. These routines should be completely portable to any computing platform and operating system that has Fortran 90/95 and C compilers. DB90 allows a program to supply relation names and up to 5 integer key values to uniquely identify each record of each relation. This permits the user to select records or retrieve data in any desired order.
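The record-addressing idea described above (a relation name plus up to 5 integer keys identifying each record) can be sketched as follows. This is an illustrative Python analogue under assumed names (`RelationStore`, `put`, `get`, `select`); it is not DB90's Fortran interface, which the abstract does not show.

```python
from typing import Any, Dict, List, Sequence, Tuple

MAX_KEYS = 5  # the report states up to 5 integer key values per record


class RelationStore:
    """Hypothetical in-memory store keyed by (relation name, integer keys)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, Tuple[int, ...]], Any] = {}

    @staticmethod
    def _check(keys: Sequence[int]) -> Tuple[int, ...]:
        keys = tuple(keys)
        if len(keys) > MAX_KEYS:
            raise ValueError(f"at most {MAX_KEYS} keys are allowed")
        if not all(isinstance(k, int) for k in keys):
            raise TypeError("keys must be integers")
        return keys

    def put(self, relation: str, keys: Sequence[int], data: Any) -> None:
        """Store (or overwrite) the record identified by relation and keys."""
        self._records[(relation, self._check(keys))] = data

    def get(self, relation: str, keys: Sequence[int]) -> Any:
        """Retrieve one record; raises KeyError if it does not exist."""
        return self._records[(relation, self._check(keys))]

    def select(self, relation: str) -> List[Tuple[Tuple[int, ...], Any]]:
        """Return all records of a relation, ordered by their key tuples."""
        rows = [(k, v) for (r, k), v in self._records.items() if r == relation]
        return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    store = RelationStore()
    store.put("NODE", (2, 1), "second")
    store.put("NODE", (1, 7), "first")
    print(store.select("NODE"))
```

Because records are addressed by key rather than by insertion position, they can be retrieved in any order the caller wants.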

42 CFR 456.180 - Individual written plan of care.

Code of Federal Regulations, 2010 CFR

2010-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Mental Hospitals Plan of Care § 456.180 Individual written plan of care. (a) Before admission to a mental hospital or...
Transportable Applications Environment Plus, Version 5.1

NASA Technical Reports Server (NTRS)

1994-01-01

Transportable Applications Environment Plus (TAE+) computer program providing integrated, portable programming environment for developing and running application programs based on interactive windows, text, and graphical objects. Enables both programmers and nonprogrammers to construct own custom application interfaces easily and to move interfaces and application programs to different computers. Used to define corporate user interface, with noticeable improvements in application developer's and end user's learning curves. Main components are: WorkBench, What You See Is What You Get (WYSIWYG) software tool for design and layout of user interface; and WPT (Window Programming Tools) Package, set of callable subroutines controlling user interface of application program. WorkBench and WPT's written in C++, and remaining code written in C.
Apple Macintosh programs for nucleic and protein sequence analyses.

PubMed Central

Bellon, B

1988-01-01

This paper describes a package of programs for handling and analyzing nucleic acid and protein sequences using the Apple Macintosh microcomputer. There are three important features of these programs: first, because of the now classical Macintosh interface the programs can be easily used by persons with little or no computer experience. Second, it is possible to save all the data, written in an editable scrolling text window or drawn in a graphic window, as files that can be directly used either as word processing documents or as picture documents. Third, sequences can be easily exchanged with any other computer. The package is composed of thirteen programs, written in Pascal programming language. PMID:2832832
Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Haoqiang; anMey, Dieter; Hatay, Ferhat F.

2003-01-01

With the advent of parallel hardware and software technologies users are faced with the challenge to choose a programming paradigm best suited for the underlying computer architecture. With the current trend in parallel computer architectures towards clusters of shared memory symmetric multi-processors (SMP), parallel programming techniques have evolved to support parallelism beyond a single level. Which programming paradigm is the best will depend on the nature of the given problem, the hardware architecture, and the available software. In this study we will compare different programming paradigms for the parallelization of a selected benchmark application on a cluster of SMP nodes. We compare the timings of different implementations of the same CFD benchmark application employing the same numerical algorithm on a cluster of Sun Fire SMP nodes. The rest of the paper is structured as follows: In section 2 we briefly discuss the programming models under consideration. We describe our compute platform in section 3. The different implementations of our benchmark code are described in section 4 and the performance results are presented in section 5. We conclude our study in section 6.
Matematica Para La Escuela Secundaria: Geometria (Parte 1). Traduccion Preliminar de la Edicion Inglesa Revisada. (Mathematics for High School: Geometry, Part 1. Preliminary Translation of the Revised English Edition).

ERIC Educational Resources Information Center

Allen, Frank B.; And Others

This is part one of a two-part SMSG mathematics text for high school students. Topics include plane geometry, real numbers, triangles and angles, congruence, construction, parallel lines, perpendicular lines, and parallelograms. The text is written in Spanish. (RH)
Rapid assessment of assignments using plagiarism detection software.

PubMed

Bischoff, Whitney R; Abrego, Patricia C

2011-01-01

Faculty members most often use plagiarism detection software to detect portions of students' written work that have been copied and/or not attributed to their authors. The rise in plagiarism has led to a parallel rise in software products designed to detect plagiarism. Some of these products are configurable for rapid assessment and teaching, as well as for plagiarism detection.
Distributed File System Utilities to Manage Large Datasets Version 0.5

DOE Office of Scientific and Technical Information (OSTI.GOV)

2014-05-21

FileUtils provides a suite of tools to manage large datasets typically created by large parallel MPI applications. They are written in C and use standard POSIX I/O calls. The current suite consists of tools to copy, compare, remove, and list. The tools provide dramatic speedup over existing Linux tools, which often run as a single process.
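The general idea of spreading a file-comparison job over several workers, rather than running it as a single process, can be sketched as below. This is a Python illustration using a thread pool and checksums; the actual tools are written in C, and their implementation is not described in this record. The names `file_digest` and `compare_trees` are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src: str, dst: str, workers: int = 4) -> List[str]:
    """List files under src that are missing from dst or differ in content."""
    src_root, dst_root = Path(src), Path(dst)
    rel_paths = sorted(p.relative_to(src_root)
                       for p in src_root.rglob("*") if p.is_file())

    def differs(rel: Path) -> bool:
        other = dst_root / rel
        return (not other.is_file()
                or file_digest(src_root / rel) != file_digest(other))

    # Each file is checked independently, so the work divides cleanly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(differs, rel_paths))
    return [rel.as_posix() for rel, bad in zip(rel_paths, flags) if bad]
```

Calling `compare_trees("/data/run1", "/backup/run1")` (paths are placeholders) would return the relative paths that need attention.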
The Affordability of University Education: A Perspective from Both Sides of the 49th Parallel

ERIC Educational Resources Information Center

Swail, Watson Scott

2004-01-01

This study was conducted to better understand the relative affordability of public university education in Canada and the United States. The report was written to answer two key questions: (1) How does access to university education in Canada compare to access in the US? and (2) How affordable is the Canadian university system compared to the…
Investigation of obstacle effect to improve conjugate heat transfer in backward facing step channel using fast simulation of incompressible flow

NASA Astrophysics Data System (ADS)

Nouri-Borujerdi, Ali; Moazezi, Arash

2018-01-01

The current study investigates the conjugate heat transfer characteristics for laminar flow in a backward facing step channel. All of the channel walls are insulated except the lower thick wall, which is held at a constant temperature. The upper wall includes an insulated obstacle perpendicular to the flow direction. The effects of obstacle height and location on the fluid flow and heat transfer are numerically explored for Reynolds numbers in the range 10 ≤ Re ≤ 300. The incompressible Navier-Stokes and thermal energy equations are solved simultaneously in the fluid region by the upwind compact finite difference scheme based on flux-difference splitting in conjunction with the artificial compressibility method. In the thick wall, the energy equation is given by the Laplace equation. A multi-block approach is used to perform parallel computing to reduce the CPU time. Each block is modeled separately by sharing boundary conditions with neighbors. The developed program for modeling was written in FORTRAN language with the OpenMP API. The results showed that the multi-block parallel computing method is a simple, robust scheme with high performance and high-order accuracy. Moreover, the results demonstrated that increasing the Reynolds number and obstacle height, as well as decreasing the horizontal distance between the obstacle and the step, improves the heat transfer.
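The multi-block idea, in which each block is solved separately and shares boundary values with its neighbours, can be illustrated on a much simpler problem. The sketch below applies Jacobi iteration to the one-dimensional Laplace equation split into blocks with ghost cells. It is a toy analogue in Python, not the authors' FORTRAN/OpenMP Navier-Stokes solver; all names and parameters are assumptions, and the blocks are updated in a serial loop where a real code would run them in parallel.

```python
from typing import List


def solve_laplace_multiblock(n_points: int = 9, n_blocks: int = 2,
                             left: float = 0.0, right: float = 1.0,
                             sweeps: int = 2000) -> List[float]:
    """Jacobi iteration for u'' = 0 on a 1D grid split into blocks."""
    interior = n_points - 2
    if interior < n_blocks:
        raise ValueError("each block needs at least one interior point")
    # Distribute interior points over the blocks as evenly as possible.
    sizes = [interior // n_blocks + (1 if b < interior % n_blocks else 0)
             for b in range(n_blocks)]
    # Block layout: [left ghost, interior values..., right ghost].
    blocks = [[0.0] * (size + 2) for size in sizes]

    for _ in range(sweeps):
        # Step 1: share boundary values. Ghost cells receive either the
        # physical boundary condition or the neighbour's edge value.
        for b, block in enumerate(blocks):
            block[0] = left if b == 0 else blocks[b - 1][-2]
            block[-1] = right if b == n_blocks - 1 else blocks[b + 1][1]
        # Step 2: update every block independently from its own data only.
        blocks = [
            [blk[0]]
            + [0.5 * (blk[i - 1] + blk[i + 1]) for i in range(1, len(blk) - 1)]
            + [blk[-1]]
            for blk in blocks
        ]

    solution = [left]
    for blk in blocks:
        solution.extend(blk[1:-1])
    solution.append(right)
    return solution


if __name__ == "__main__":
    print([round(u, 4) for u in solve_laplace_multiblock()])
```

Because step 2 reads only a block's own cells, the result does not depend on how many blocks the grid is divided into, which is what makes the decomposition safe to parallelize.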
A mechanism for efficient debugging of parallel programs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, B.P.; Choi, J.D.

1988-01-01

This paper addresses the design and implementation of an integrated debugging system for parallel programs running on shared memory multi-processors (SMMP). The authors describe the use of flowback analysis to provide information on causal relationships between events in a program's execution without re-executing the program for debugging. The authors introduce a mechanism called incremental tracing that, by using semantic analyses of the debugged program, makes the flowback analysis practical with only a small amount of trace generated during execution. They extend flowback analysis to apply to parallel programs and describe a method to detect race conditions in the interactions of the co-operating processes.
"I Like to Plan Events": A Document Analysis of Essays Written by Applicants to a Public Relations Program

ERIC Educational Resources Information Center

Taylor, Ronald E.

2016-01-01

A document analysis of 249 essays written during a 5-year period by applicants to a public relations program at a major state university in the southeast suggests that there are enduring reasons why students choose to major in public relations. Public relations is described as a major that allows for and encourages creative expression and that…
Mining Predictors of Success in Air Force Flight Training Regiments via Semantic Analysis of Instructor Evaluations

DTIC Science & Technology

2018-03-01

We apply our methodology to the criticism text written in the flight-training program student evaluations in order to construct a model that...factors.
Effectiveness of a Heritage Educational Program for the Acquisition of Oral and Written French and Tahitian in French Polynesia

ERIC Educational Resources Information Center

Nocus, Isabelle; Guimard, Philippe; Vernaudon, Jacques; Paia, Mirose; Cosnefroy, Olivier; Florin, Agnes

2012-01-01

The research examines the effects of a bilingual pedagogical program (French/Tahitian) on the acquisition of oral and written French as well as the Tahitian language itself in primary schools in French Polynesia. 125 children divided into an experimental group (partially schooled in Tahitian for 300 min per week) and a control group (schooled in…
77 FR 15343 - Oklahoma: Final Authorization of State Hazardous Waste Management Program Revisions

Federal Register 2010, 2011, 2012, 2013, 2014

2012-03-15

... written comments by April 16, 2012. ADDRESSES: Send written comments to Alima Patterson, Region 6... Patterson (214) 665-8533. SUPPLEMENTARY INFORMATION: For additional information, please see the immediate...
OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems

PubMed Central

Stone, John E.; Gohara, David; Shi, Guochun

2010-01-01

We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures. PMID:21037981
ETARA - EVENT TIME AVAILABILITY, RELIABILITY ANALYSIS

NASA Technical Reports Server (NTRS)

Viterna, L. A.

1994-01-01

The ETARA system was written to evaluate the performance of the Space Station Freedom Electrical Power System, but the methodology and software can be modified to simulate any system that can be represented by a block diagram. ETARA is an interactive, menu-driven reliability, availability, and maintainability (RAM) simulation program. Given a Reliability Block Diagram representation of a system, the program simulates the behavior of the system over a specified period of time using Monte Carlo methods to generate block failure and repair times as a function of exponential and/or Weibull distributions. ETARA can calculate availability parameters such as equivalent availability, state availability (percentage of time at a particular output state capability), continuous state duration and number of state occurrences. The program can simulate initial spares allotment and spares replenishment for a resupply cycle. The number of block failures is tabulated both individually and by block type. ETARA also records total downtime, repair time, and time waiting for spares. Maintenance man-hours per year and system reliability, with or without repair, at or above a particular output capability can also be calculated. The key to using ETARA is the development of a reliability or availability block diagram. The block diagram is a logical graphical illustration depicting the block configuration necessary for a function to be successfully accomplished. Each block can represent a component, a subsystem, or a system. The function attributed to each block is considered for modeling purposes to be either available or unavailable; there are no degraded modes of block performance. A block does not have to represent physically connected hardware in the actual system to be connected in the block diagram. The block needs only to have a role in contributing to an available system function.
ETARA can model the RAM characteristics of systems represented by multilayered, nesting block diagrams. There are no restrictions on the number of total blocks or on the number of blocks in a series, parallel, or M-of-N parallel subsystem. In addition, the same block can appear in more than one subsystem if such an arrangement is necessary for an accurate model. ETARA 3.3 is written in APL2 for IBM PC series computers or compatibles running MS-DOS and the APL2 interpreter. Hardware requirements for the APL2 system include 640K of RAM, 2Mb of extended memory, and an 80386 or 80486 processor with an 80x87 math co-processor. The standard distribution medium for this package is a set of two 5.25 inch 360K MS-DOS format diskettes. A sample executable is included. The executable contains licensed material from the APL2 for the IBM PC product which is program property of IBM; Copyright IBM Corporation 1988 - All rights reserved. It is distributed with IBM's permission. The contents of the diskettes are compressed using the PKWARE archiving tools. The utility to unarchive the files, PKUNZIP.EXE, is included. ETARA was developed in 1990 and last updated in 1991.
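The Monte Carlo approach described above can be sketched roughly as follows. This is not ETARA's APL2 code: it is a minimal Python illustration that assumes independent blocks connected in series, exponentially distributed failure and repair times, and no spares logic. The function names and the parameters (mean time between failures `mtbf`, mean time to repair `mttr`) are hypothetical.

```python
import random

def simulate_block(mtbf, mttr, horizon, rng):
    """Return (start, end) downtime intervals for one block over the horizon."""
    t, down = 0.0, []
    while t < horizon:
        t += rng.expovariate(1.0 / mtbf)      # time until the next failure
        if t >= horizon:
            break
        repair = rng.expovariate(1.0 / mttr)  # duration of the repair
        down.append((t, min(t + repair, horizon)))
        t += repair
    return down

def series_availability(blocks, horizon, trials, seed=0):
    """Estimate availability of a series system: up only when every block is up.

    blocks is a list of (mtbf, mttr) pairs; blocks are assumed independent.
    """
    rng = random.Random(seed)
    total_up = 0.0
    for _ in range(trials):
        intervals = sorted(iv for mtbf, mttr in blocks
                           for iv in simulate_block(mtbf, mttr, horizon, rng))
        # System downtime is the union of all block downtime intervals.
        downtime, cur_start, cur_end = 0.0, None, None
        for start, end in intervals:
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    downtime += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        if cur_end is not None:
            downtime += cur_end - cur_start
        total_up += horizon - downtime
    return total_up / (trials * horizon)
```

For a single block the estimate should approach the steady-state value mtbf / (mtbf + mttr) as the horizon grows. Parallel and M-of-N subsystems would need a different rule for combining block states than the interval union used here.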
A parallel program for numerical simulation of discrete fracture network and groundwater flow

NASA Astrophysics Data System (ADS)

Huang, Ting-Wei; Liou, Tai-Sheng; Kalatehjari, Roohollah

2017-04-01

The ability to model fluid flow in a Discrete Fracture Network (DFN) is critical to various applications such as exploration of reserves in geothermal and petroleum reservoirs, geological sequestration of carbon dioxide and final disposal of spent nuclear fuels. Although several commercial or academic DFN flow simulators are already available (e.g., FracMan and DFNWORKS), challenges in terms of computational efficiency and three-dimensional visualization still remain, which motivates this study to develop a new DFN and flow simulator. The new simulator, DFNbox, was written in C++ under a cross-platform software development framework provided by Qt. DFNbox integrates the following capabilities into a user-friendly drop-down menu interface: DFN simulation and clipping, 3D mesh generation, fracture data analysis, connectivity analysis, flow path analysis and steady-state groundwater flow simulation. All three-dimensional visualization graphics were developed using the free OpenGL API. As in other DFN simulators, fractures are conceptualized as a random point process in space, with stochastic characteristics represented by orientation, size, transmissivity and aperture. Fracture meshing was implemented by Delaunay triangulation for visualization but not for flow simulation purposes. The boundary element method was used for flow simulations, so that only the unknown head or flux along exterior and intersection boundaries is needed to solve the flow field in the DFN. Parallel computation was taken into account in developing DFNbox for those calculations where it is possible. For example, the time-consuming sequential code for fracture clipping calculations has been completely replaced by a highly efficient parallel one. This can greatly enhance computational efficiency, especially on multi-thread platforms. Furthermore, DFNbox has been successfully tested on Windows and Linux systems with equally good performance.
Revised Extended Grid Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martz, Roger L.

The Revised Eolus Grid Library (REGL) is a mesh-tracking library that was developed for use with the MCNP6™ computer code so that (radiation) particles can track on an unstructured mesh. The unstructured mesh is a finite element representation of any geometric solid model created with a state-of-the-art CAE/CAD tool. The mesh-tracking library is written using modern Fortran and programming standards; the library is Fortran 2003 compliant. The library was created with a defined application programmer interface (API) so that it could easily integrate with other particle tracking/transport codes. The library does not handle parallel processing via the message passing interface (MPI), but has been used successfully where the host code handles the MPI calls. The library is thread-safe and supports the OpenMP paradigm. As a library, all features are available through the API and overall a tight coupling between it and the host code is required. Features of the library are summarized with the following list: can accommodate first and second order 4, 5, and 6-sided polyhedra; any combination of element types may appear in a single geometry model; parts may not contain tetrahedra mixed with other element types; pentahedra and hexahedra can be together in the same part; robust handling of overlaps and gaps; tracks element-to-element to produce path length results at the element level; finds element numbers for a given mesh location; finds intersection points on element faces for the particle tracks; produces a data file for post processing results analysis; reads Abaqus .inp input (ASCII) files to obtain information for the global mesh-model; supports parallel input processing via MPI; and supports parallel particle transport by both MPI and OpenMP.
Genetic algorithms using SISAL parallel programming language

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tejada, S.

1994-05-06

Genetic algorithms are a mathematical optimization technique developed by John Holland at the University of Michigan [1]. The SISAL programming language possesses many of the characteristics desired to implement genetic algorithms. SISAL is a deterministic, functional programming language which is inherently parallel. Because SISAL is functional and based on mathematical concepts, genetic algorithms can be efficiently translated into the language. Several of the steps involved in genetic algorithms, such as mutation, crossover, and fitness evaluation, can be parallelized using SISAL. In this paper I will discuss the implementation and performance of parallel genetic algorithms in SISAL.
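The steps named above (fitness evaluation, crossover, mutation) can be illustrated with a small sketch. It is written in Python rather than SISAL, runs sequentially, and uses an assumed toy objective (counting 1 bits). All names and parameter values are hypothetical and are not taken from the paper.

```python
import random

def fitness(bits):
    """Toy objective (an assumption): the number of 1 bits."""
    return sum(bits)

def crossover(a, b, rng):
    """Single-point crossover of two equal-length bit lists."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(pop_size=20, length=16, generations=30, rate=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Each individual's fitness is independent of the others, so this
        # ranking step is one that a parallel implementation could distribute.
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - 1)
        ]
        pop = [ranked[0]] + children  # elitism: the best individual survives
    return max(pop, key=fitness)
```

Because each step is a pure function of its inputs and the random generator is seeded, repeated runs with the same seed give the same result, which loosely mirrors the determinism the abstract attributes to SISAL.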
An Expert System for the Development of Efficient Parallel Code

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Chun, Robert; Jin, Hao-Qiang; Labarta, Jesus; Gimenez, Judit

2004-01-01

We have built the prototype of an expert system to assist the user in the development of efficient parallel code. The system was integrated into the parallel programming environment that is currently being developed at NASA Ames. The expert system interfaces to tools for automatic parallelization and performance analysis. It uses static program structure information and performance data in order to automatically determine causes of poor performance and to make suggestions for improvements. In this paper we give an overview of our programming environment, describe the prototype implementation of our expert system, and demonstrate its usefulness with several case studies.

Optics Program Modified for Multithreaded Parallel Computing

NASA Technical Reports Server (NTRS)

Lou, John; Bedding, Dave; Basinger, Scott

2006-01-01

A powerful high-performance computer program for simulating and analyzing adaptive and controlled optical systems has been developed by modifying the serial version of the Modeling and Analysis for Controlled Optical Systems (MACOS) program to impart capabilities for multithreaded parallel processing on computing systems ranging from supercomputers down to Symmetric Multiprocessing (SMP) personal computers. The modifications included the incorporation of OpenMP, a portable and widely supported application programming interface that can be used to explicitly add multithreaded parallelism to an application program under a shared-memory programming model. OpenMP was applied to parallelize ray-tracing calculations, one of the major computing components in MACOS. Multithreading is also used in the diffraction propagation of light in MACOS, based on pthreads [POSIX threads, where "POSIX" stands for Portable Operating System Interface]. In tests of the parallelized version of MACOS, the speedup in ray-tracing calculations was found to be linear, or proportional to the number of processors, while the speedup in diffraction calculations ranged from 50 to 60 percent, depending on the type and number of processors. The parallelized version of MACOS is portable, and, to the user, its interface is basically the same as that of the original serial version of MACOS.
Improving Nutrition and Physical Activity Policies in Afterschool Programs: Results from a Group-Randomized Controlled Trial

PubMed Central

Kenney, Erica L.; Giles, Catherine M.; deBlois, Madeleine E.; Gortmaker, Steven L.; Chinfatt, Sherene; Cradock, Angie L.

2017-01-01

OBJECTIVE Afterschool programs can be health-promoting environments for children. Written policies positively influence nutrition and physical activity (PA) environments, but effective strategies for building staff capacity to write such policies have not been evaluated. This study measures the comprehensiveness of written nutrition, PA, and screen time policies in afterschool programs and assesses impact of the Out of School Nutrition and Physical Activity (OSNAP) intervention on key policies. METHODS Twenty afterschool programs in Boston, MA participated in a group-randomized, controlled trial from September 2010 to June 2011. Intervention program staff attended learning collaboratives focused on practice and policy change. The Out-of-School Time (OST) Policy Assessment Index evaluated written policies. Inter-rater reliability and construct validity of the measure and impact of the intervention on written policies were assessed. RESULTS The measure demonstrated moderate to excellent inter-rater reliability (Spearman’s r=0.53 to 0.97) and construct validity. OSNAP was associated with significant increases in standards-based policy statements surrounding snacks (+2.6, p=0.003), beverages (+2.3, p=0.008), screen time (+0.8, p=0.046), family communication (+2.2, p=0.002), and a summary index of OSNAP goals (+3.3, p=0.02). CONCLUSIONS OSNAP demonstrated success in building staff capacity to write health-promoting policy statements. Future research should focus on determining policy change impact on practices. PMID:24941286
Solving Integer Programs from Dependence and Synchronization Problems

DTIC Science & Technology

1993-03-01

Solving Integer Programs from Dependence and Synchronization Problems. Jaspal Subhlok. March 1993. CMU-CS-93-130. School of Computer Science...method is an exact and efficient way of solving integer programming problems arising in dependence and synchronization analysis of parallel programs... Keywords: exact dependence testing, integer programming, parallelizing compilers, parallel program analysis, synchronization analysis.
Highly Parallel Alternating Directions Algorithm for Time Dependent Problems

NASA Astrophysics Data System (ADS)

Ganzha, M.; Georgiev, K.; Lirkov, I.; Margenov, S.; Paprzycki, M.

2011-11-01

In our work, we consider the time dependent Stokes equation on a finite time interval and on a uniform rectangular mesh, written in terms of velocity and pressure. For this problem, a parallel algorithm based on a novel direction splitting approach is developed. Here, the pressure equation is derived from a perturbed form of the continuity equation, in which the incompressibility constraint is penalized in a negative norm induced by the direction splitting. The scheme used in the algorithm is composed of two parts: (i) velocity prediction, and (ii) pressure correction. This is a Crank-Nicolson-type two-stage time integration scheme for two and three dimensional parabolic problems in which the second-order derivative, with respect to each space variable, is treated implicitly while the other variable is made explicit at each time sub-step. In order to achieve good parallel performance, the solution of the Poisson problem for the pressure correction is replaced by solving a sequence of one-dimensional second order elliptic boundary value problems in each spatial direction. The parallel code is implemented using the standard MPI functions and tested on two modern parallel computer systems. The numerical tests performed demonstrate a good level of parallel efficiency and scalability of the studied direction-splitting-based algorithm.
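A one-dimensional second order elliptic boundary value problem on a uniform mesh typically leads to a tridiagonal linear system, which is cheap to solve. The Python sketch below shows this for an assumed model problem, u - u'' = f with u = 0 at both ends, discretized by central differences. The abstract does not give the exact operator, boundary conditions or solver used by the authors, so the model problem and the function names here are illustrative assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                 # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_1d_model(f_values, h):
    """Solve u - u'' = f at the interior nodes, with u = 0 at both ends."""
    n = len(f_values)
    off = -1.0 / h**2
    main = 1.0 + 2.0 / h**2
    return solve_tridiagonal([off] * n, [main] * n, [off] * n, f_values)
```

In a direction-splitting setting, many such independent one-dimensional solves (one per mesh line) can run concurrently, which is one reason the approach suits parallel machines.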
FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

PubMed

Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

2014-06-14

Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy is hampered. FastaValidator represents a platform-independent, standardized, light-weight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software which needs to parse quickly and accurately large amounts of sequence data. For end-users FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualifies it for large data sets such as those commonly produced by massive parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
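To make the idea of FASTA validation concrete, here is a minimal Python sketch of the kind of checks such a tool might perform. It is not FastaValidator's Java API. The rule set (headers start with ">", every header is followed by sequence data, sequence characters come from an IUPAC nucleotide alphabet plus a gap symbol) and all names are assumptions made for illustration.

```python
NUCLEOTIDES = set("ACGTUNRYKMSWBDHV-")  # assumed alphabet: IUPAC codes plus gap

def validate_fasta(lines, alphabet=NUCLEOTIDES):
    """Return a list of (line_number, message) problems; empty means valid."""
    problems = []
    have_header = False  # True once any header line has been seen
    seq_len = 0          # sequence characters seen since the last header
    header_line = 0      # line number of the most recent header
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if have_header and seq_len == 0:
                problems.append((header_line, "header without sequence"))
            if len(line) == 1:
                problems.append((number, "empty header"))
            have_header, seq_len, header_line = True, 0, number
        else:
            if not have_header:
                problems.append((number, "sequence before first header"))
            bad = set(line.upper()) - alphabet
            if bad:
                problems.append((number, "invalid characters: " + "".join(sorted(bad))))
            seq_len += len(line)
    if have_header and seq_len == 0:
        problems.append((header_line, "header without sequence"))
    if not have_header and not problems:
        problems.append((0, "no records found"))
    return problems
```

Because the function consumes any iterable of lines, it can be passed an open file object and so process large inputs one line at a time instead of loading them into memory.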
Status and future plans for open source QuickPIC

NASA Astrophysics Data System (ADS)

An, Weiming; Decyk, Viktor; Mori, Warren

2017-10-01

QuickPIC is a three dimensional (3D) quasi-static particle-in-cell (PIC) code developed based on the UPIC framework. It can be used for efficiently modeling plasma based accelerator (PBA) problems. With the quasi-static approximation, QuickPIC can use different time scales for calculating the beam (or laser) evolution and the plasma response, and a 3D plasma wake field can be simulated using a two-dimensional (2D) PIC code where the time variable is ξ = ct - z and z is the beam propagation direction. QuickPIC can be a thousand times faster than a normal PIC code when simulating the PBA. It uses an MPI/OpenMP hybrid parallel algorithm, which can be run on either a laptop or the largest supercomputer. The open source QuickPIC is an object-oriented program with high level classes written in Fortran 2003. It can be found at https://github.com/UCLA-Plasma-Simulation-Group/QuickPIC-OpenSource.git
Modeling Magnetic Properties in EZTB

NASA Technical Reports Server (NTRS)

Lee, Seungwon; vonAllmen, Paul

2007-01-01

A software module that calculates magnetic properties of a semiconducting material has been written for incorporation into, and execution within, the Easy (Modular) Tight-Binding (EZTB) software infrastructure. [EZTB is designed to model the electronic structures of semiconductor devices ranging from bulk semiconductors, to quantum wells, quantum wires, and quantum dots. EZTB implements an empirical tight-binding mathematical model of the underlying physics.] This module can model the effect of a magnetic field applied along any direction and does not require any adjustment of model parameters. The module has thus far been applied to study the performances of silicon-based quantum computers in the presence of magnetic fields and of miscut angles in quantum wells. The module is expected to assist experimentalists in fabricating a spin qubit in a Si/SiGe quantum dot. This software can be executed in almost any Unix operating system, utilizes parallel computing, and can be run as a Web-portal application program. The module has been validated by comparison of its predictions with experimental data available in the literature.
The FORCE - A highly portable parallel programming language

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
The FORCE: A highly portable parallel programming language

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low level machine dependencies and to build machine-independent high level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared memory multiprocessor executing them.
76 FR 19004 - Oklahoma: Final Authorization of State Hazardous Waste Management Program Revisions

Federal Register 2010, 2011, 2012, 2013, 2014

2011-04-06

... written comments by May 6, 2011. ADDRESSES: Send written comments to Alima Patterson, Region 6, Regional... Patterson (214) 665-8533. SUPPLEMENTARY INFORMATION: For additional information, please see the immediate...
Architectures for reasoning in parallel

NASA Technical Reports Server (NTRS)

Hall, Lawrence O.

1989-01-01

The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system, and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or fewer processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
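Python's standard `concurrent.futures` module offers a loosely comparable construct to the future described above: submitting a function call returns a future whose result can be collected later. The sketch below applies it to a toy forward-chained rule system. It is not Datapac or Multilisp code. The rule base and function names are hypothetical, and because of Python's global interpreter lock it illustrates the programming construct rather than real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rule base: (set of premises, conclusion).
RULES = [
    ({"a", "b"}, "c"),
    ({"c"}, "d"),
    ({"x"}, "y"),
]

def match(rule, facts):
    """Return the rule's conclusion if all its premises hold, else None."""
    premises, conclusion = rule
    return conclusion if premises <= facts else None

def forward_chain(facts, rules, workers=4):
    """Repeat parallel match cycles until no new fact is derived."""
    facts = set(facts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            snapshot = frozenset(facts)
            # Each submit returns a future; the rule matches run as
            # tasks concurrent with the spawning thread.
            futures = [pool.submit(match, rule, snapshot) for rule in rules]
            derived = {f.result() for f in futures} - {None} - facts
            if not derived:
                return facts
            facts |= derived
```

Starting from the facts `a` and `b`, the first cycle derives `c` and the second derives `d`; the rule with premise `x` never fires.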
Parallel Implementation of Triangular Cellular Automata for Computing Two-Dimensional Elastodynamic Response on Arbitrary Domains

NASA Astrophysics Data System (ADS)

Leamy, Michael J.; Springer, Adam C.

In this research we report parallel implementation of a Cellular Automata-based simulation tool for computing elastodynamic response on complex, two-dimensional domains. Elastodynamic simulation using Cellular Automata (CA) has recently been presented as an alternative, inherently object-oriented technique for accurately and efficiently computing linear and nonlinear wave propagation in arbitrarily-shaped geometries. The local, autonomous nature of the method should lead to straightforward and efficient parallelization. We address this notion on symmetric multiprocessor (SMP) hardware using a Java-based object-oriented CA code implementing triangular state machines (i.e., automata) and the MPI bindings written in Java (MPJ Express). We use MPJ Express to reconfigure our existing CA code to distribute a domain's automata to cores present on a dual quad-core shared-memory system (eight total processors). We note that this message passing parallelization strategy is directly applicable to cluster computing, which will be the focus of follow-on research. Results on the shared memory platform indicate nearly-ideal, linear speed-up. We conclude that the CA-based elastodynamic simulator is easily configured to run in parallel, and yields excellent speed-up on SMP hardware.
Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

DOE PAGES

Olivier, Stephen L.; de Supinski, Bronis R.; Schulz, Martin; ...

2013-01-01

Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
Distributed and parallel Ada and the Ada 9X recommendations

NASA Technical Reports Server (NTRS)

Volz, Richard A.; Goldsack, Stephen J.; Theriault, R.; Waldrop, Raymond S.; Holzbacher-Valero, A. A.

1992-01-01

Recently, the DoD has sponsored work towards a new version of Ada, intended to support the construction of distributed systems. The revised version, often called Ada 9X, will become the new standard sometime in the 1990s. It is intended that Ada 9X should provide language features giving limited support for distributed system construction. The requirements for such features are given. Many of the most advanced computer applications involve embedded systems that are comprised of parallel processors or networks of distributed computers. If Ada is to become the widely adopted language envisioned by many, it is essential that suitable compilers and tools be available to facilitate the creation of distributed and parallel Ada programs for these applications. The major language issues impacting distributed and parallel programming are reviewed, and some principles upon which distributed/parallel language systems should be built are suggested. Based upon these, alternative language concepts for distributed/parallel programming are analyzed.
Implementing Access to Data Distributed on Many Processors

NASA Technical Reports Server (NTRS)

James, Mark

2006-01-01

A reference architecture is defined for an object-oriented implementation of domains, arrays, and distributions written in the programming language Chapel. This technology primarily addresses domains that contain arrays that have regular index sets with the low-level implementation details being beyond the scope of this discussion. What is defined is a complete set of object-oriented operators that allows one to perform data distributions for domain arrays involving regular arithmetic index sets. What is unique is that these operators allow for the arbitrary regions of the arrays to be fragmented and distributed across multiple processors with a single point of access giving the programmer the illusion that all the elements are collocated on a single processor. Today's massively parallel High Productivity Computing Systems (HPCS) are characterized by a modular structure, with a large number of processing and memory units connected by a high-speed network. Locality of access as well as load balancing are primary concerns in these systems that are typically used for high-performance scientific computation. Data distributions address these issues by providing a range of methods for spreading large data sets across the components of a system. Over the past two decades, many languages, systems, tools, and libraries have been developed for the support of distributions. Since the performance of data parallel applications is directly influenced by the distribution strategy, users often resort to low-level programming models that allow fine-tuning of the distribution aspects affecting performance, but, at the same time, are tedious and error-prone. This technology presents a reusable design of a data-distribution framework for data parallel high-performance applications. Distributions are a means to express locality in systems composed of large numbers of processor and memory components connected by a network. 
Since distributions have a great effect on the performance of applications, it is important that the distribution strategy is flexible, so its behavior can change depending on the needs of the application. At the same time, high productivity concerns require that the user be shielded from error-prone, tedious details such as communication and synchronization.
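The single point of access described above can be illustrated with a small sketch. It is written in Python rather than Chapel, it models processors as plain lists inside one process, and it assumes a simple block distribution of a one-dimensional index set. The class names and the distribution rule are hypothetical and do not represent the reference architecture itself.

```python
class BlockDistribution:
    """Block-distribute global indices 0..n-1 over p processors (sketch)."""

    def __init__(self, n, p):
        self.n, self.p = n, p
        base, extra = divmod(n, p)
        # The first `extra` processors hold one element more than the rest.
        self.sizes = [base + (1 if r < extra else 0) for r in range(p)]
        self.starts = [0] * p
        for r in range(1, p):
            self.starts[r] = self.starts[r - 1] + self.sizes[r - 1]

    def owner(self, i):
        """Return (processor, local index) for global index i."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        base, extra = divmod(self.n, self.p)
        split = extra * (base + 1)  # indices below this lie in the larger blocks
        if i < split:
            r = i // (base + 1)
        else:
            r = extra + (i - split) // base
        return r, i - self.starts[r]


class DistributedArray:
    """Fragments stored per processor, accessed through one global index."""

    def __init__(self, n, p, fill=0):
        self.dist = BlockDistribution(n, p)
        self.fragments = [[fill] * size for size in self.dist.sizes]

    def __getitem__(self, i):
        r, k = self.dist.owner(i)
        return self.fragments[r][k]

    def __setitem__(self, i, value):
        r, k = self.dist.owner(i)
        self.fragments[r][k] = value
```

Code using `DistributedArray` reads and writes `a[i]` without knowing which fragment holds element `i`. In a real system the lookup in `owner` would decide whether the access is local or requires communication, and swapping in a different distribution class would change the placement without changing user code.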
Implementation and performance of parallel Prolog interpreter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wei, S.; Kale, L.V.; Balkrishna, R.

1988-01-01

In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE--OR process model which exploits both AND and OR parallelism in logic programs. It is machine independent as it runs on top of the chare-kernel--a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines including shared memory systems (an Alliant FX/8, a Sequent and a MultiMax) and a non-shared memory system (an Intel iPSC/32 hypercube), in addition to its performance on a multiprocessor simulation system.
Parallel Program Systems for the Analysis of Wave Processes in Elastic-Plastic, Granular, Porous and Multi-Blocky Media

NASA Astrophysics Data System (ADS)

Sadovskaya, Oxana; Sadovskii, Vladimir

2017-04-01

When modeling wave propagation in geomaterials (granular and porous media, soils and rocks), it is necessary to take into account the structural inhomogeneity of these materials. Parallel program systems have been developed for the numerical solution of 2D and 3D problems of the dynamics of deformable media with constitutive relationships of rather general form, on the basis of a universal mathematical model describing small strains of elastic, elastic-plastic, granular and porous materials. In the case of an elastic material, the model reduces to a system of equations, hyperbolic in the sense of Friedrichs, written in terms of velocities and stresses in a symmetric form. In the case of an elastic-plastic material, the model is a special formulation of the Prandtl-Reuss theory in the form of a variational inequality with one-sided constraints on the stress tensor. The model is generalized to describe granularity and the collapse of pores by means of the rheological approach, taking into account the different resistance of materials to tension and compression. Rotational motion of particles in the material microstructure is considered within the framework of a mathematical model of the Cosserat continuum. The computational domain may have a blocky structure, composed of an arbitrary number of layers, strips in a layer and blocks in a strip, made of different materials with self-consistent curvilinear interfaces. At the external boundaries of the computational domain, the main types of dissipative boundary conditions in terms of velocities, stresses or mixed boundary conditions can be given. A shock-capturing algorithm is proposed for implementation of the model on supercomputers with cluster architecture. It is based on the two-cyclic splitting method with respect to spatial variables and on special procedures of stress correction that take into account the plasticity, granularity or porosity of a material.
An explicit monotone ENO scheme is applied to solve the one-dimensional systems of equations at the stages of the splitting method. The computations are parallelized using the MPI library and the SPMD technology. Data exchange between processors occurs at the "predictor" step of the finite-difference scheme. The program systems allow simulation of the propagation of waves produced by external mechanical effects in a medium composed of an arbitrary number of heterogeneous blocks. Some computations of dynamic problems with and without taking into account the moment properties of a material were performed on clusters of ICM SB RAS (Krasnoyarsk) and JSCC RAS (Moscow). The parallel program systems 2Dyn_Granular, 3Dyn_Granular, 2Dyn_Cosserat, 3Dyn_Cosserat and 2Dyn_Blocks_MPI, for the numerical solution of 2D and 3D elastic-plastic problems of the dynamics of granular media and problems of the Cosserat elasticity theory, as well as for modeling dynamic processes in multi-blocky media with pliant viscoelastic, porous and fluid-saturated interlayers on cluster systems, were registered by Rospatent.
Nebo: An efficient, parallel, and portable domain-specific language for numerically solving partial differential equations

DOE PAGES

Earl, Christopher; Might, Matthew; Bagusetty, Abhishek; ...

2016-01-26

This study presents Nebo, a declarative domain-specific language embedded in C++ for discretizing partial differential equations for transport phenomena on multiple architectures. Application programmers use Nebo to write code that appears sequential but can be run in parallel, without editing the code. Currently Nebo supports single-thread execution, multi-thread execution, and many-core (GPU-based) execution. With single-thread execution, Nebo performs on par with code written by domain experts. With multi-thread execution, Nebo can linearly scale (with roughly 90% efficiency) up to 12 cores, compared to its single-thread execution. Moreover, Nebo’s many-core execution can be over 140x faster than its single-thread execution.
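The scaling figures quoted above can be related through the usual definition of parallel efficiency (speedup divided by core count). The short sketch below is only arithmetic on the numbers in the abstract; the function names are illustrative and are not part of Nebo.

```python
def parallel_efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Efficiency = speedup / cores, where speedup = t_serial / t_parallel."""
    return t_serial / (t_parallel * cores)


def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Invert the definition: speedup = efficiency * cores."""
    return efficiency * cores


# Roughly 90% efficiency on 12 cores corresponds to a speedup of about 10.8x
# over single-thread execution.
approx_speedup = speedup_from_efficiency(0.90, 12)
```

By the same definition, a run that takes 108 s on one thread and 10 s on 12 threads has an efficiency of 0.9.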
libvdwxc: a library for exchange-correlation functionals in the vdW-DF family

NASA Astrophysics Data System (ADS)

Hjorth Larsen, Ask; Kuisma, Mikael; Löfgren, Joakim; Pouillon, Yann; Erhart, Paul; Hyldgaard, Per

2017-09-01

We present libvdwxc, a general library for evaluating the energy and potential for the family of vdW-DF exchange-correlation functionals. libvdwxc is written in C and provides an efficient implementation of the vdW-DF method and can be interfaced with various general-purpose DFT codes. Currently, the Gpaw and Octopus codes implement interfaces to libvdwxc. The present implementation emphasizes scalability and parallel performance, and thereby enables ab initio calculations of nanometer-scale complexes. The numerical accuracy is benchmarked on the S22 test set whereas parallel performance is benchmarked on ligand-protected gold nanoparticles (Au144(SC11NH25)60) up to 9696 atoms.

Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Hood, Robert; Jost, Gabriele

2001-01-01

This viewgraph presentation provides information on support sources available for the automatic parallelization of computer programs. CAPTools, a support tool developed at the University of Greenwich, transforms, with user guidance, existing sequential Fortran code into parallel message passing code. Comparison routines are then run for debugging purposes, in essence ensuring that the code transformation was accurate.
Trends in the nursing doctoral comprehensive examination process: a national survey.

PubMed

Mawn, Barbara E; Goldberg, Shari

2012-01-01

The doctoral comprehensive or qualifying examination (CE/QE) is a traditional rite of passage into the community of scholars for the nursing profession. This exploratory, descriptive cross-sectional study examined trends in the process, timing, and methodology of comprehensive and qualifying examinations in nursing doctoral programs in the United States. Administrators from 45 schools responded to an online survey from 27 states across the country (37% response rate). Participants reported wide variations in the process. The most common method of implementation was the written take-home test (47%), two thirds of which had a subsequent oral examination. Eleven survey respondents (24%) reported using a form of the traditional written, timed, on-site examination; however, only 4 of these also followed up with an oral defense. Nine schools (20%) moved to a requirement for a written publishable paper; three schools consider the written proposal and its defense as the CE/QE. Approximately half had changed their policy in the past 5 years. With the increase in nursing doctor of philosophy programs over the past decade, information is needed to facilitate the development of methods to achieve program outcomes. An understanding of national CE/QE trends can provide a starting point for discussion and allow innovative ideas to meet the need of individual programs. Copyright © 2012 Elsevier Inc. All rights reserved.
Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.
HP-41CX Programs for HgCdTe Detectors and IR Systems.

DTIC Science & Technology

1987-10-01

Keywords: Pocket Computer; HgCdTe; PhotoSensor; Programs; Detectors; Analysis; HP-41; Infrared; IR Systems. Abstract: ... HgCdTe detectors, focal planes, and infrared systems. They have been written to run in a basic HP-41CV or HP-41CX with no card reader or additional ROMs... Programs have been written for the HP-41CX which aid in the analysis of HgCdTe detectors, focal planes, and infrared systems. They have been installed as a...
Tuning collective communication for Partitioned Global Address Space programming models

DOE PAGES

Nishtala, Rajesh; Zheng, Yili; Hargrove, Paul H.; ...

2011-06-12

Partitioned Global Address Space (PGAS) languages offer programmers the convenience of a shared memory programming style combined with locality control necessary to run on large-scale distributed memory systems. Even within a PGAS language programmers often need to perform global communication operations such as broadcasts or reductions, which are best performed as collective operations in which a group of threads work together to perform the operation. In this study we consider the problem of implementing collective communication within PGAS languages and explore some of the design trade-offs in both the interface and implementation. In particular, PGAS collectives have semantic issues that are different than in send–receive style message passing programs, and different implementation approaches that take advantage of the one-sided communication style in these languages. We present an implementation framework for PGAS collectives as part of the GASNet communication layer, which supports shared memory, distributed memory and hybrids. The framework supports a broad set of algorithms for each collective, over which the implementation may be automatically tuned. In conclusion, we demonstrate the benefit of optimized GASNet collectives using application benchmarks written in UPC, and demonstrate that the GASNet collectives can deliver scalable performance on a variety of state-of-the-art parallel machines including a Cray XT4, an IBM BlueGene/P, and a Sun Constellation system with InfiniBand interconnect.
Stakeholders' perceptions on competency and assessment program of entry-level pharmacists in developing countries.

PubMed

Asante, Isaac; Andoh, Irene; Muijtjens, Arno M M; Donkers, Jeroen

2017-05-01

To assess the stakeholders' perceptions on the competency of entry-level pharmacists and the use of written licensure examination as the primary assessment for licensure decisions on entry-level pharmacists who have completed the Pharmacy Internship Program 1 (PIP) in developing countries. A cross-sectional survey was conducted among stakeholders in which they completed a web-based 21-item pre-tested questionnaire to determine their views regarding the competency outcomes and assessment program for entry-level pharmacists. The stakeholders rated the entry-level pharmacists as possessing all competencies except research skills. Stakeholders suggested improvement of the program by defining the competency framework and training preceptors. However, stakeholders disagreed with using written examination as the primary assessment for licensure decisions and suggested the incorporation of other performance-based assessments like preceptors' assessment reports. Stakeholders are uncertain whether entry-level pharmacists in developing countries possess adequate research competencies and think their assessment program for licensure needs more than a written examination to assess all required competencies. Copyright © 2017 Elsevier Inc. All rights reserved.
Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

NASA Astrophysics Data System (ADS)

Work, Paul R.

1991-12-01

This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.
Mads.jl

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vesselinov, Velimir; O'Malley, Daniel; Lin, Youzuo

2016-07-01

Mads.jl (Model analysis and decision support in Julia) is a code that streamlines the process of using data and models for analysis and decision support. It is based on another open-source code developed at LANL and written in C/C++ (MADS; http://mads.lanl.gov; LA-CC-11-035). Mads.jl can work with external models of arbitrary complexity as well as built-in models of flow and transport in porous media. It enables a number of data- and model-based analyses including model calibration, sensitivity analysis, uncertainty quantification, and decision analysis. The code also can use a series of alternative adaptive computational techniques for Bayesian sampling, Monte Carlo, and Bayesian Information-Gap Decision Theory. The code is implemented in the Julia programming language, and has high-performance (parallel) and memory management capabilities. The code uses a series of third party modules developed by others. The code development will also include contributions to the existing third party modules written in Julia; these contributions will be important for the efficient implementation of the algorithm used by Mads.jl. The code also uses a series of LANL developed modules that are developed by Dan O'Malley; these modules will also be a part of the Mads.jl release. Mads.jl will be released under GPL V3 license. The code will be distributed as a Git repo at gitlab.com and github.com. Mads.jl manual and documentation will be posted at madsjulia.lanl.gov.
Users guide to E859 phoswich analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Costales, J.B.

1992-11-30

In this memo the authors describe the analysis path used to transform the phoswich data from raw data banks into cross sections suitable for publication. The primary purpose of this memo is not to document each analysis step in great detail but rather to point the reader to the Fortran code used and to point out the essential features of the analysis path. A flow chart which summarizes the various steps performed to massage the data from beginning to end is given. In general, each step corresponds to a Fortran program which was written to perform that particular task. The automation of the data analysis has been kept purposefully minimal in order to ensure the highest quality of the final product. However, tools have been developed which ease the non-automated steps. There are two major parallel routes for the data analysis: data reduction and acceptance determination using detailed GEANT Monte Carlo simulations. In this memo, the authors will first describe the data reduction up to the point where PHAD banks (Pass 1-like banks) are created. They will then describe the steps taken in the GEANT Monte Carlo route. Note that a detailed memo describing the methodology of the acceptance corrections has already been written. Therefore the discussion of the acceptance determination will be kept to a minimum and the reader will be referred to the other memo for further details. Finally, they will describe the cross section formation process and how final spectra are extracted.
42 CFR 8.30 - Transmission of written communications by reviewing official and calculation of deadlines.

Code of Federal Regulations, 2010 CFR

2010-10-01

... HEALTH AND HUMAN SERVICES GENERAL PROVISIONS CERTIFICATION OF OPIOID TREATMENT PROGRAMS Procedures for... Withdrawal of Approval of an Accreditation Body § 8.30 Transmission of written communications by reviewing...
42 CFR 8.30 - Transmission of written communications by reviewing official and calculation of deadlines.

Code of Federal Regulations, 2011 CFR

2011-10-01

... HEALTH AND HUMAN SERVICES GENERAL PROVISIONS CERTIFICATION OF OPIOID TREATMENT PROGRAMS Procedures for... Withdrawal of Approval of an Accreditation Body § 8.30 Transmission of written communications by reviewing...
Low latency and persistent data storage

DOEpatents

Fitch, Blake G; Franceschini, Michele M; Jagmohan, Ashish; Takken, Todd

2014-11-04

Persistent data storage is provided by a computer program product that includes computer program code configured for receiving a low latency store command that includes write data. The write data is written to a first memory device that is implemented by a nonvolatile solid-state memory technology characterized by a first access speed. It is acknowledged that the write data has been successfully written to the first memory device. The write data is written to a second memory device that is implemented by a volatile memory technology. At least a portion of the data in the first memory device is written to a third memory device when a predetermined amount of data has been accumulated in the first memory device. The third memory device is implemented by a nonvolatile solid-state memory technology characterized by a second access speed that is slower than the first access speed.
The Automated Instrumentation and Monitoring System (AIMS): Design and Architecture. 3.2

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Schmidt, Melisa; Schulbach, Cathy; Bailey, David (Technical Monitor)

1997-01-01

Whether a researcher is designing the 'next parallel programming paradigm', another 'scalable multiprocessor' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of such information can help computer and software architects to capture, and therefore exploit, behavioral variations among/within various parallel programs to take advantage of specific hardware characteristics. A software tool-set that facilitates performance evaluation of parallel applications on multiprocessors has been put together at NASA Ames Research Center under the sponsorship of NASA's High Performance Computing and Communications Program over the past five years. The Automated Instrumentation and Monitoring System (AIMS) has three major software components: a source code instrumentor which automatically inserts active event recorders into program source code before compilation; a run-time performance monitoring library which collects performance data; and a visualization tool-set which reconstructs program execution based on the data collected. Besides being used as a prototype for developing new techniques for instrumenting, monitoring and presenting parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Currently, the execution of FORTRAN and C programs on the Intel Paragon and PALM workstations can be automatically instrumented and monitored. Performance data thus collected can be displayed graphically on various workstations. The process of performance tuning with AIMS will be illustrated using various NAS Parallel Benchmarks. This report includes a description of the internal architecture of AIMS and a listing of the source code.
Geometrization of the Dirac theory of the electron

NASA Technical Reports Server (NTRS)

Fock, V.

1977-01-01

Using the concept of parallel displacement of a half vector, the Dirac equations are written in generally invariant form. The energy tensor is formed and both the macroscopic and quantum mechanical equations of motion are set up. The former have the usual form (the divergence of the energy tensor equals the Lorentz force), and the latter are essentially identical with those of the geodesic line.
What Multilevel Parallel Programs do when you are not Watching: A Performance Analysis Case Study Comparing MPI/OpenMP, MLP, and Nested OpenMP

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Labarta, Jesus; Gimenez, Judit

2004-01-01

With the current trend in parallel computer architectures towards clusters of shared memory symmetric multi-processors, parallel programming techniques have evolved that support parallelism beyond a single level. When comparing the performance of applications based on different programming paradigms, it is important to differentiate between the influence of the programming model itself and other factors, such as implementation specific behavior of the operating system (OS) or architectural issues. Rewriting a large scientific application in order to employ a new programming paradigm is usually a time consuming and error prone task. Before embarking on such an endeavor it is important to determine that there is really a gain that would not be possible with the current implementation. A detailed performance analysis is crucial to clarify these issues. The multilevel programming paradigms considered in this study are hybrid MPI/OpenMP, MLP, and nested OpenMP. The hybrid MPI/OpenMP approach is based on using MPI [7] for the coarse grained parallelization and OpenMP [9] for fine grained loop level parallelism. The MPI programming paradigm assumes a private address space for each process. Data is transferred by explicitly exchanging messages via calls to the MPI library. This model was originally designed for distributed memory architectures but is also suitable for shared memory systems. The second paradigm under consideration is MLP, which was developed by Taft. The approach is similar to MPI/OpenMP, using a mix of coarse grain process level parallelization and loop level OpenMP parallelization. As is the case with MPI, a private address space is assumed for each process. The MLP approach was developed for ccNUMA architectures and explicitly takes advantage of the availability of shared memory. A shared memory arena which is accessible by all processes is required. Communication is done by reading from and writing to the shared memory.
45 CFR 2540.620 - What are my rights if the Corporation determines that I have made a false or misleading statement?

Code of Federal Regulations, 2010 CFR

2010-10-01

... qualification to participate in, a Corporation-funded program, you will be hand delivered a written notice, or sent a written notice to your last known street address or e-mail address or that of your identified... be implemented. You will be given 10 calendar days to submit written materials in opposition to the...
A Comparison of Written Compositions of Head-Start Pupils with Non-Head-Start Pupils.

ERIC Educational Resources Information Center

Houston, David Ree

This study--a follow-up to one conducted by Giles in 1965-- compared the written compositions of fourth grade pupils who had been in Project Head Start in the summer of 1965 with those of comparable pupils not in the program to determine possible differences in their written language development. Seventy Negro students were divided by sex and…
Performance Evaluation in Network-Based Parallel Computing

NASA Technical Reports Server (NTRS)

Dezhgosha, Kamyar

1996-01-01

Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing network of SUN SPARC workstations with PVM (Parallel Virtual Machine), which is a software system for linking clusters of machines. Second, a set of three basic applications was selected. The applications consist of a parallel search, a parallel sort, and a parallel matrix multiplication. These application programs were implemented in the C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time, which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes is in many cases the restricting factor on performance. That is, coarse-grain parallelism, which requires less frequent communication between processes, will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps), which will allow us to extend our study to newer applications, performance metrics, and configurations.
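The conclusion about communication latency can be illustrated with a deliberately simple cost model. This is an assumption made for illustration, not the model used in the project: the elapsed time on p processes is taken to be the perfectly divided compute time plus a fixed latency per message, and all numbers below are invented.

```python
def predicted_time(work_s: float, procs: int,
                   messages_per_proc: int, latency_s: float) -> float:
    """Toy model: ideal compute share plus a fixed cost per message sent."""
    return work_s / procs + messages_per_proc * latency_s


def speedup(work_s: float, procs: int,
            messages_per_proc: int, latency_s: float) -> float:
    """Speedup relative to a serial run that needs no communication."""
    return work_s / predicted_time(work_s, procs, messages_per_proc, latency_s)


# Same 100 s of work on 4 processes with 10 ms latency per message.
coarse = speedup(100.0, 4, messages_per_proc=10, latency_s=0.01)
fine = speedup(100.0, 4, messages_per_proc=10_000, latency_s=0.01)
```

Under these assumed numbers the coarse-grain decomposition stays close to the ideal speedup of 4, while the fine-grain one is slower than the serial run, which is the qualitative behavior the abstract reports.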
GAUSSIAN BEAM LASER RESONATOR PROGRAM

NASA Technical Reports Server (NTRS)

Cross, P. L.

1994-01-01

In designing a laser cavity, the laser engineer is frequently concerned with more than the stability of the resonator. Other considerations include the size of the beam at various optical surfaces within the resonator or the performance of intracavity line-narrowing or other optical elements. Laser resonators obey the laws of Gaussian beam propagation, not geometric optics. The Gaussian Beam Laser Resonator Program models laser resonators using Gaussian ray trace techniques. It can be used to determine the propagation of radiation through laser resonators. The algorithm used in the Gaussian Beam Resonator program has three major components. First, the ray transfer matrix for the laser resonator must be calculated. Next calculations of the initial beam parameters, specifically, the beam stability, the beam waist size and location for the resonator input element, and the wavefront curvature and beam radius at the input surface to the first resonator element are performed. Finally the propagation of the beam through the optical elements is computed. The optical elements can be modeled as parallel plates, lenses, mirrors, dummy surfaces, or Gradient Index (GRIN) lenses. A Gradient Index lens is a good approximation of a laser rod operating under a thermal load. The optical system may contain up to 50 elements. In addition to the internal beam elements the optical system may contain elements external to the resonator. The Gaussian Beam Resonator program was written in Microsoft FORTRAN (Version 4.01). It was developed for the IBM PS/2 80-071 microcomputer and has been implemented on an IBM PC compatible under MS DOS 3.21. The program was developed in 1988 and requires approximately 95K bytes to operate.
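The three steps described above (round-trip ray transfer matrix, stability and initial beam parameters, then propagation) follow the standard textbook ABCD formalism. The sketch below covers the first two steps for a two-mirror cavity. It is written in Python, not the program's FORTRAN, the function names are hypothetical, and it is not a reconstruction of the NASA code.

```python
import math

Matrix = tuple  # ((A, B), (C, D))


def matmul(m: Matrix, n: Matrix) -> Matrix:
    """2x2 matrix product m @ n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def free_space(length: float) -> Matrix:
    return ((1.0, length), (0.0, 1.0))


def mirror(radius: float) -> Matrix:
    """Spherical mirror; radius > 0 for a concave mirror."""
    return ((1.0, 0.0), (-2.0 / radius, 1.0))


def round_trip(elements) -> Matrix:
    """Combine elements listed in the order the beam meets them."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for e in elements:
        m = matmul(e, m)  # later elements multiply from the left
    return m


def eigenmode(m: Matrix, wavelength: float):
    """Return (beam radius w, wavefront radius R) at the reference plane,
    or None if the resonator is not stable (|A + D| / 2 >= 1)."""
    (a, b), (_, d) = m
    half_trace = 0.5 * (a + d)
    if abs(half_trace) >= 1.0 or b == 0.0:
        return None
    # Self-consistent q: 1/q = (D - A)/(2B) - i*sqrt(1 - half_trace^2)/|B|,
    # with 1/q = 1/R - i*wavelength/(pi*w^2).
    w = math.sqrt(wavelength * abs(b) / (math.pi * math.sqrt(1.0 - half_trace**2)))
    inv_r = (d - a) / (2.0 * b)
    r = math.inf if inv_r == 0.0 else 1.0 / inv_r
    return w, r
```

For example, `round_trip([free_space(0.5), mirror(1.0), free_space(0.5), mirror(1.0)])` describes a symmetric cavity 0.5 m long with 1 m mirrors, with the reference plane just after the first mirror. Passing the result to `eigenmode` with a wavelength in metres gives the beam radius at that mirror.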
Calculating Trajectories And Orbits

NASA Technical Reports Server (NTRS)

Alderson, Daniel J.; Brady, Franklyn H.; Breckheimer, Peter J.; Campbell, James K.; Christensen, Carl S.; Collier, James B.; Ekelund, John E.; Ellis, Jordan; Goltz, Gene L.; Hintz, Gerarld R.;

1989-01-01

Double-Precision Trajectory Analysis Program, DPTRAJ, and Orbit Determination Program, ODP, developed and improved over years to provide highly reliable and accurate navigation capability for deep-space missions like Voyager. Each collection of programs working together to provide desired computational results. DPTRAJ, ODP, and supporting utility programs capable of handling massive amounts of data and performing various numerical calculations required for solving navigation problems associated with planetary fly-by and lander missions. Used extensively in support of NASA's Voyager project. DPTRAJ-ODP available in two machine versions. UNIVAC version, NPO-15586, written in FORTRAN V, SFTRAN, and ASSEMBLER. VAX/VMS version, NPO-17201, written in FORTRAN V, SFTRAN, PL/1 and ASSEMBLER.

WFIRST: Science from the Guest Investigator and Parallel Observation Programs

NASA Astrophysics Data System (ADS)

Postman, Marc; Nataf, David; Furlanetto, Steve; Milam, Stephanie; Robertson, Brant; Williams, Ben; Teplitz, Harry; Moustakas, Leonidas; Geha, Marla; Gilbert, Karoline; Dickinson, Mark; Scolnic, Daniel; Ravindranath, Swara; Strolger, Louis; Peek, Joshua; Marc Postman

2018-01-01

The Wide Field InfraRed Survey Telescope (WFIRST) mission will provide an extremely rich archival dataset that will enable a broad range of scientific investigations beyond the initial objectives of the proposed key survey programs. The scientific impact of WFIRST will thus be significantly expanded by a robust Guest Investigator (GI) archival research program. We will present examples of GI research opportunities ranging from studies of the properties of a variety of Solar System objects, surveys of the outer Milky Way halo, comprehensive studies of cluster galaxies, to unique and new constraints on the epoch of cosmic re-ionization and the assembly of galaxies in the early universe. WFIRST will also support the acquisition of deep wide-field imaging and slitless spectroscopic data obtained in parallel during campaigns with the coronagraphic instrument (CGI). These parallel wide-field imager (WFI) datasets can provide deep imaging data covering several square degrees at no impact to the scheduling of the CGI program. A competitively selected program of well-designed parallel WFI observation programs will, like the GI science above, maximize the overall scientific impact of WFIRST. We will give two examples of parallel observations that could be conducted during a proposed CGI program centered on a dozen nearby stars.
Parallelized direct execution simulation of message-passing parallel programs

NASA Technical Reports Server (NTRS)

Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

1994-01-01

As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

DOE PAGES

Radhakrishnan, Hari; Rouson, Damian W. I.; Morris, Karla; ...

2015-01-01

This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
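The binary-tree summation mentioned in this abstract can be sketched as follows. This is a serial Python emulation written for illustration, not the authors' coarray Fortran code; the function name `tree_sum` and the in-place pairing scheme are assumptions. The point it shows is that each round combines pairs that do not depend on one another, so with one worker per pair the number of sequential steps grows as the logarithm of the number of partial sums rather than linearly.

```python
def tree_sum(values):
    """Sum values by pairwise (binary-tree) combination.

    Returns (total, rounds). Within a round, every combine touches a
    distinct pair of slots, so the combines could run concurrently.
    """
    partial = list(values)
    n = len(partial)
    if n == 0:
        return 0, 0
    rounds = 0
    stride = 1
    while stride < n:
        # Combine slot i with slot i + stride; pairs in this loop are independent.
        for i in range(0, n - stride, 2 * stride):
            partial[i] += partial[i + stride]
        stride *= 2
        rounds += 1
    return partial[0], rounds


if __name__ == "__main__":
    total, rounds = tree_sum(range(1, 33))  # 32 partial sums, one per core
    print(total, rounds)
```

With 32 partial sums the tree needs 5 rounds, whereas a sequential accumulation needs 31 dependent additions.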
Programming Probabilistic Structural Analysis for Parallel Processing Computer

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

1991-01-01

The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
15 CFR 16.8 - Termination of participation.

Code of Federal Regulations, 2010 CFR

2010-01-01

... written decision to the participant in the event such participant does not appeal the proposed termination... responsibilities under this program with regard to a specific type of product by giving written notice to the... products of the type involved. ...
PCACE-Personal-Computer-Aided Cabling Engineering

NASA Technical Reports Server (NTRS)

Billitti, Joseph W.

1987-01-01

PCACE computer program developed to provide inexpensive, interactive system for learning and using engineering approach to interconnection systems. Basically database system that stores information as files of individual connectors and handles wiring information in circuit groups stored as records. Directly emulates typical manual engineering methods of handling data, thus making interface between user and program very natural. Apple version written in P-Code Pascal and IBM PC version of PCACE written in TURBO Pascal 3.0.
Trajectory Reconstruction Program Milestone 2/3 Report. Volume 1. Description and Overview

DTIC Science & Technology

1974-12-16

Simulation Data Generation Missile Trajectory Error Analysis Modularized Program Guidance and Targeting Multiple Vehicle Simulation IBM 360/370 Numerical...consists of vehicle simulation subprograms designed and written in FORTRAN for CDC 6600/7600, IBM 360/370, and UNIVAC 1108/1110 series computers. The overall...vehicle simulation subprograms designed and written in FORTRAN for CDC 6600/7600, IBM 360/370, and UNIVAC 1108/1110 series computers. The overall
Performance Implications of Synchronization Support for Parallel FORTRAN Programs

DTIC Science & Technology

1991-06-17

applications we used in this study are BDNA and FLO52. BDNA is a molecular dynamics simulator for biomolecules in water and it uses ordinary...parallelism structures and loop granularity. In the BDNA program, most of the parallel loops are not nested and the iterations are 200-1000 instructions long...are of concern. The BDNA curve in Figure 21 shows that for this program only 17% of all... [Figure: cumulative percentage plot with BDNA and FLO52 curves]
Parallelization of Program to Optimize Simulated Trajectories (POST3D)

NASA Technical Reports Server (NTRS)

Hammond, Dana P.; Korte, John J. (Technical Monitor)

2001-01-01

This paper describes the parallelization of the Program to Optimize Simulated Trajectories (POST3D). POST3D uses a gradient-based optimization algorithm that reaches an optimum design point by moving from one design point to the next. The gradient calculations required to complete the optimization process dominate the computational time and have been parallelized using a Single Program Multiple Data (SPMD) approach on a distributed memory NUMA (non-uniform memory access) architecture. The Origin2000 was used for the tests presented.
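Why gradient calculations parallelize well can be illustrated with a short sketch. It assumes, which the abstract does not state, that the gradient is estimated by finite differences, so that each component needs its own independent objective evaluations. The objective function, the step size, and the use of a Python thread pool in place of SPMD processes are all hypothetical stand-ins, not POST3D's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    """Hypothetical smooth objective standing in for a trajectory cost."""
    return sum((xi - 1.0) ** 2 for xi in x)


def partial_derivative(f, x, k, h=1e-6):
    """Central-difference estimate of df/dx_k (assumed scheme)."""
    up = list(x)
    up[k] += h
    down = list(x)
    down[k] -= h
    return (f(up) - f(down)) / (2.0 * h)


def parallel_gradient(f, x, workers=4):
    """Evaluate every gradient component as an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda k: partial_derivative(f, x, k), range(len(x))))


if __name__ == "__main__":
    print(parallel_gradient(objective, [0.0, 2.0, 3.0]))
```

Each worker runs the same code on a different design-variable index, which is the SPMD pattern in miniature; on a real distributed-memory machine the tasks would be separate processes rather than threads.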
Selective, Embedded, Just-In-Time Specialization (SEJITS): Portable Parallel Performance from Sequential, Productive, Embedded Domain-Specific Languages

DTIC Science & Technology

2012-12-01

identity operation SIMD Single instruction, multiple datastream parallel computing Scala A byte-compiled programming language featuring dynamic type...Specific Languages 5a. CONTRACT NUMBER FA8750-10-1-0191 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 61101E 6. AUTHOR(S) Armando Fox 5d...application performance, but usually must rely on efficiency programmers who are experts in explicit parallel programming to achieve it. Since such efficiency
Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

PubMed

Bellucci, Michael A; Coker, David F

2011-07-28

We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency are tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent. © 2011 American Institute of Physics
Concepts of Concurrent Programming

DTIC Science & Technology

1990-04-01

to the material presented. Carriero89 Carriero, N., and Gelernter, D. "How to Write Parallel Programs: A Guide to the Perplexed." ACM...between the architectures on which programs can be executed and the application domains from which problems are drawn. Our goal is to show how programs...Sept. 1989), 251-510. Abstract: There are four papers: 1. Programming Languages for Distributed Computing Systems (52); 2. How to Write Parallel
NavP: Structured and Multithreaded Distributed Parallel Programming

NASA Technical Reports Server (NTRS)

Pan, Lei; Xu, Jingling

2006-01-01

This slide presentation reviews some of the issues around distributed parallel programming. It compares and contrasts two methods of programming: Single Program Multiple Data (SPMD) and Navigational Programming (NavP). It then reviews the distributed sequential computing (DSC) method and the methodology of NavP. Case studies are presented. It also reviews the work that is being done to enable the NavP system.
High Performance Programming Using Explicit Shared Memory Model on Cray T3D

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Saini, Subhash; Grassi, Charles

1994-01-01

The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research Adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message-passing using PVM, and explicit shared memory model) are available to the users. However, at this time data parallel and work sharing programming models are not available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that the performance of neither standard PVM nor CRI's PVM exploits the hardware capabilities of the T3D. The reasons for the bad performance of PVM as a native message-passing library are presented. This is illustrated by the performance of NAS Parallel Benchmarks (NPB) programmed in explicit shared memory model on Cray T3D. In general, the performance of standard PVM is about 4 to 5 times less than that obtained by using the explicit shared memory model. This degradation in performance is also seen on the CM-5, where the performance of applications using the native message-passing library CMMD is also about 4 to 5 times less than that obtained using data parallel methods. The issues involved (such as barriers, synchronization, invalidating data cache, aligning data cache, etc.) while programming in explicit shared memory model are discussed. Comparative performance of NPB using explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, IBM-SP1, etc. is presented.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

NASA Astrophysics Data System (ADS)

Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

2013-08-01

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, $f$ (e.g., Verlet algorithm), is available to propagate the system from time $t_i$ (trajectory positions and velocities $x_i = (r_i, v_i)$) to time $t_{i+1}$ ($x_{i+1}$) by $x_{i+1} = f_i(x_i)$, the dynamics problem spanning an interval from $t_0 \ldots t_M$ can be transformed into a root finding problem, $F(X) = [x_i - f(x_{i-1})]_{i=1,M} = 0$, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of $F(X)$. The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution time/parallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. 
The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
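The root-finding formulation described in this abstract can be illustrated with a minimal sketch. It is not the authors' quasi-Newton scheme: it swaps in the simplest possible solver, a fixed-point sweep that sets every entry to f applied to the previous iterate's preceding entry, and it uses a one-dimensional harmonic oscillator with a velocity-Verlet step as a hypothetical stand-in for the MD integrator. All function names are assumptions. What it shows is that within one sweep every time-step entry can be updated independently, which is where the parallelism comes from.

```python
from concurrent.futures import ThreadPoolExecutor


def verlet_step(state, dt=0.01):
    """One velocity-Verlet step for a unit harmonic oscillator (force = -r)."""
    r, v = state
    v_half = v + 0.5 * dt * (-r)
    r_new = r + dt * v_half
    v_new = v_half + 0.5 * dt * (-r_new)
    return (r_new, v_new)


def serial_trajectory(x0, M, f=verlet_step):
    """Reference: propagate step by step, x_i = f(x_{i-1})."""
    xs = [x0]
    for _ in range(M):
        xs.append(f(xs[-1]))
    return xs


def residual_norm(X, f=verlet_step):
    """Max-norm of F(X) = [x_i - f(x_{i-1})] for i = 1..M."""
    return max(
        max(abs(a - b) for a, b in zip(X[i], f(X[i - 1])))
        for i in range(1, len(X))
    )


def parallel_in_time(x0, M, f=verlet_step, tol=1e-12, workers=4):
    """Solve F(X) = 0 by fixed-point sweeps; each sweep is parallel over i."""
    X = [x0] * (M + 1)  # initial guess: a constant trajectory
    sweeps = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while residual_norm(X, f) > tol:
            # Every new x_i depends only on the previous iterate's x_{i-1}.
            X = [x0] + list(pool.map(f, X[:-1]))
            sweeps += 1
    return X, sweeps


if __name__ == "__main__":
    X, sweeps = parallel_in_time((1.0, 0.0), 20)
    print(sweeps, X[-1])
```

This plain sweep fixes one more time step per iteration, so it needs up to M sweeps and gives no speedup by itself; the quasi-Newton and preconditioned schemes the abstract describes are what reduce the iteration count enough to pay off.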
Python to learn programming

NASA Astrophysics Data System (ADS)

Bogdanchikov, A.; Zhaparov, M.; Suliyev, R.

2013-04-01

Today we have a lot of programming languages that can realize our needs, but the most important question is how to teach programming to beginner students. In this paper we suggest using Python for this purpose, because it is a programming language that has neatly organized syntax and powerful tools to solve any task. Moreover it is very close to simple math thinking. Python is chosen as a primary programming language for freshmen in most leading universities. Writing code in Python is easy. In this paper we give some examples of program codes written in Java, C++ and Python, and we make a comparison between them. Firstly, this paper proposes advantages of Python in relation to C++ and Java. Then it shows the results of a comparison of short program codes written in the three languages, followed by a discussion on how students understand programming. Finally experimental results of students' success in programming courses are shown.
On program restructuring, scheduling, and communication for parallel processor systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Polychronopoulos, Constantine D.

1986-08-01

This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into a parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented.
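Loop coalescing, one of the restructuring techniques named in this abstract, can be sketched in a few lines. The example below is an illustration of the general idea only, assuming a perfectly nested pair of loops with fixed bounds; it is not Parafrase's transformation, and the function names are hypothetical. The nested loops are replaced by one loop over a flat index from which the original indices are recovered, so a scheduler has a single iteration space to divide among processors.

```python
def nested_iterations(n, m):
    """Original form: a doubly nested loop over (i, j)."""
    visited = []
    for i in range(n):
        for j in range(m):
            visited.append((i, j))
    return visited


def coalesced_iterations(n, m):
    """Coalesced form: one loop over k, with (i, j) recovered from k."""
    visited = []
    for k in range(n * m):
        i, j = divmod(k, m)  # i = k // m, j = k % m
        visited.append((i, j))
    return visited


if __name__ == "__main__":
    print(nested_iterations(2, 3) == coalesced_iterations(2, 3))
```

With a single flat index, the n * m iterations can be handed out in equal chunks even when n alone is smaller than the number of processors.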
20 CFR 632.260 - Worksite standards.

Code of Federal Regulations, 2010 CFR

2010-04-01

... EMPLOYMENT AND TRAINING PROGRAMS Summer Youth Employment and Training Programs § 632.260 Worksite standards... rules and regulations governing the summer program. (2) Such written agreements may be memoranda of...
20 CFR 632.260 - Worksite standards.

Code of Federal Regulations, 2011 CFR

2011-04-01

... EMPLOYMENT AND TRAINING PROGRAMS Summer Youth Employment and Training Programs § 632.260 Worksite standards... rules and regulations governing the summer program. (2) Such written agreements may be memoranda of...
20 CFR 632.260 - Worksite standards.

Code of Federal Regulations, 2012 CFR

2012-04-01

... EMPLOYMENT AND TRAINING PROGRAMS Summer Youth Employment and Training Programs § 632.260 Worksite standards... rules and regulations governing the summer program. (2) Such written agreements may be memoranda of...

Structured Design Language for Computer Programs

NASA Technical Reports Server (NTRS)

Pace, Walter H., Jr.

1986-01-01

BOX language used at all stages of program development. Developed to provide improved productivity in designing, coding, and maintaining computer programs. BOX system written in FORTRAN 77 for batch execution.
Statistical techniques applied to aerial radiometric surveys (STAARS): principal components analysis user's manual. [NURE program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.

1981-01-01

A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From this analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.
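The reduction of multivariate data to linear combinations can be illustrated with a small sketch that finds only the first principal component. It is not the STAARS program: the power-iteration method, the function name, and the toy data are assumptions chosen to keep the example dependency-free. The returned vector holds the weights of the linear combination of variates that captures the most variance.

```python
import math


def first_principal_component(rows, iters=200):
    """Return the unit vector of the dominant principal component.

    rows: list of equal-length numeric records (one per observation).
    Uses power iteration on the sample covariance matrix (assumed method).
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d  # starting vector; must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


if __name__ == "__main__":
    # Toy data lying along the direction (1, 2).
    print(first_principal_component([(0, 0), (1, 2), (2, 4), (3, 6)]))
```

A full analysis would extract the remaining components as well and then project each observation onto them before plotting histograms or flagging outliers.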
Computer programs for eddy-current defect studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pate, J. R.; Dodd, C. V.

Several computer programs to aid in the design of eddy-current tests and probes have been written. The programs, written in Fortran, deal in various ways with the response to defects exhibited by four types of probes: the pancake probe, the reflection probe, the circumferential boreside probe, and the circumferential encircling probe. Programs are included which calculate the impedance or voltage change in a coil due to a defect, which calculate and plot the defect sensitivity factor of a coil, and which invert calculated or experimental readings to obtain the size of a defect. The theory upon which the programs are based is the Burrows point defect theory, and thus the calculations of the programs will be more accurate for small defects. 6 refs., 21 figs.
A DNA sequence analysis package for the IBM personal computer.

PubMed Central

Lagrimini, L M; Brentano, S T; Donelson, J E

1984-01-01

We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Airport Landside. Volume IV. Appendix A. ALSIM AUXILIARY and MAIN Programs.

DOT National Transportation Integrated Search

1982-06-01

This Appendix describes the Program Logic of the Airport Landside Simulation Model (ALSIM) AUXILIARY and MAIN Programs. Both programs are written in GPSS-V. The AUXILIARY program is operated prior to the MAIN Program to create GPSS transactions repre...
Xyce Parallel Electronic Simulator Users Guide Version 6.2.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce Parallel Electronic Simulator Users Guide Version 6.4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
ZENO: N-body and SPH Simulation Codes

NASA Astrophysics Data System (ADS)

Barnes, Joshua E.

2011-02-01

The ZENO software package integrates N-body and SPH simulation codes with a large array of programs to generate initial conditions and analyze numerical simulations. Written in C, the ZENO system is portable between Mac, Linux, and Unix platforms. It is in active use at the Institute for Astronomy (IfA), at NRAO, and possibly elsewhere. ZENO programs can perform a wide range of simulation and analysis tasks. While many of these programs were first created for specific projects, they embody algorithms of general applicability and embrace a modular design strategy, so existing code is easily applied to new tasks. Major elements of the system include: Structured data file utilities facilitate basic operations on binary data, including import/export of ZENO data to other systems. Snapshot generation routines create particle distributions with various properties. Systems with user-specified density profiles can be realized in collisionless or gaseous form; multiple spherical and disk components may be set up in mutual equilibrium. Snapshot manipulation routines permit the user to sift, sort, and combine particle arrays, translate and rotate particle configurations, and assign new values to data fields associated with each particle. Simulation codes include both pure N-body and combined N-body/SPH programs: Pure N-body codes are available in both uniprocessor and parallel versions. SPH codes offer a wide range of options for gas physics, including isothermal, adiabatic, and radiating models. Snapshot analysis programs calculate temporal averages, evaluate particle statistics, measure shapes and density profiles, compute kinematic properties, and identify and track objects in particle distributions. Visualization programs generate interactive displays and produce still images and videos of particle distributions; the user may specify arbitrary color schemes and viewing transformations.
Modelling parallel programs and multiprocessor architectures with AXE

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Fineman, Charles E.

1991-01-01

AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
An Ada Linear-Algebra Software Package Modeled After HAL/S

NASA Technical Reports Server (NTRS)

Klumpp, Allan R.; Lawson, Charles L.

1990-01-01

New avionics software written more easily. Software package extends Ada programming language to include linear-algebra capabilities similar to those of HAL/S programming language. Designed for such avionics applications as Space Station flight software. In addition to built-in functions of HAL/S, package incorporates quaternion functions used in Space Shuttle and Galileo projects and routines from LINPACK solving systems of equations involving general square matrices. Contains two generic programs: one for floating-point computations and one for integer computations. Written on IBM/AT personal computer running under PC DOS, v.3.1.
Web Based Parallel Programming Workshop for Undergraduate Education.

ERIC Educational Resources Information Center

Marcus, Robert L.; Robertson, Douglass

Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web-based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…
UPEML: a machine-portable CDC Update emulator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mehlhorn, T.A.; Young, M.F.

1984-12-01

UPEML is a machine-portable CDC Update emulation program. UPEML is written in ANSI standard Fortran-77 and is relatively simple and compact. It is capable of emulating a significant subset of the standard CDC Update functions including program library creation and subsequent modification. Machine-portability is an essential attribute of UPEML. It was written primarily to facilitate the use of CDC-based scientific packages on alternate computer systems such as the VAX 11/780 and the IBM 3081.
Instrumentation, performance visualization, and debugging tools for multiprocessors

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

1991-01-01

The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging, and tuning parallel programs becomes intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
Testing New Programming Paradigms with NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

2000-01-01

Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developed with the aim of scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new efforts have been made to define new parallel programming paradigms. The best examples are HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy; however, due to the immaturity of compiler technology, its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3.
Optimization of memory and cache usage was applied to several benchmarks, notably BT and SP, resulting in better sequential performance. In order to overcome the lack of an HPF performance model and guide the development of the HPF codes, we employed an empirical performance model for several primitives found in the benchmarks. We encountered a few limitations of HPF, such as lack of support for the "REDISTRIBUTION" directive and no easy way to handle irregular computation. The parallelization with OpenMP directives was done at the outer-most loop level to achieve the largest granularity. The performance of six HPF and OpenMP benchmarks is compared with their MPI counterparts for the Class-A problem size in the figure on the next page. These results were obtained on an SGI Origin2000 (195MHz) with MIPSpro-f77 compiler 7.2.1 for OpenMP and MPI codes and PGI pghpf-2.4.3 compiler with MPI interface for HPF programs.
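The outermost-loop strategy described in this abstract can be illustrated with a small sketch. The original work used OpenMP directives in Fortran; the Python below is only an analogy. The grid, the `smooth_row` kernel, and the function names are hypothetical and are not taken from the benchmarks. Each task handles one iteration of the outer loop, so the inner loop stays inside a single unit of work and the granularity is as coarse as the loop nest allows.

```python
from concurrent.futures import ThreadPoolExecutor


def smooth_row(grid, i):
    """Inner loop: average each interior cell of row i with its four neighbours."""
    row = grid[i]
    if i == 0 or i == len(grid) - 1:
        return list(row)  # boundary rows are copied unchanged
    out = list(row)
    for j in range(1, len(row) - 1):
        out[j] = (grid[i - 1][j] + grid[i + 1][j] + row[j - 1] + row[j + 1]) / 4.0
    return out


def smooth_serial(grid):
    """Reference version: the outer loop runs sequentially."""
    return [smooth_row(grid, i) for i in range(len(grid))]


def smooth_parallel(grid, workers=4):
    """Distribute iterations of the outer loop (rows) across a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The input grid is only read, so rows can be computed independently.
        return list(pool.map(lambda i: smooth_row(grid, i), range(len(grid))))
```

In standard CPython the threads are unlikely to speed up a pure-Python kernel like this one; the sketch is meant to show the decomposition, not the performance.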
Fighting Islamic Terrorists With Democracy: A Critique

DTIC Science & Technology

2007-05-21

other hand, translations of the Bible from its Aramaic, Hebrew , and Koine Greek yield no such unlimited, violent imperatives requiring contemporary...pronouncements of a mere man who claimed to speak with the authority of Christ, and yet repeatedly contradicted the clear teachings of the Bible . Thus, the...faithful. And here is where the parallel between “fundamentalist” worshippers seeking to understand the Qur’an and the Bible —both written in tongues
SEEK: A FORTRAN optimization program using a feasible directions gradient search

NASA Technical Reports Server (NTRS)

Savage, M.

1995-01-01

This report describes the use of computer program 'SEEK' which works in conjunction with two user-written subroutines and an input data file to perform an optimization procedure on a user's problem. The optimization method uses a modified feasible directions gradient technique. SEEK is written in ANSI standard Fortran 77, has an object size of about 46K bytes, and can be used on a personal computer running DOS. This report describes the use of the program and discusses the optimizing method. The program use is illustrated with four example problems: a bushing design, a helical coil spring design, a gear mesh design, and a two-parameter Weibull life-reliability curve fit.
Parallel computation with the force

NASA Technical Reports Server (NTRS)

Jordan, H. F.

1985-01-01

A methodology, called the force, supports the construction of programs to be executed in parallel by a force of processes. The number of processes in the force is unspecified, but potentially very large. The force idea is embodied in a set of macros which produce multiprocessor FORTRAN code and has been studied on two shared memory multiprocessors of fairly different character. The method has simplified the writing of highly parallel programs within a limited class of parallel algorithms and is being extended to cover a broader class. The individual parallel constructs which comprise the force methodology are discussed. Of central concern are their semantics, implementation on different architectures and performance implications.
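The idea of one program executed by a force of processes whose size is not fixed in the code can be sketched as follows. The original system is a set of macros that generate FORTRAN; this Python analogy uses threads, and the specific constructs shown (a strided loop, a barrier, and a critical section) are assumptions about what such a methodology needs, not details taken from the abstract. The name `run_force` is hypothetical.

```python
import threading


def run_force(num_procs, n):
    """Run the same code on num_procs threads; return the sum of squares of 0..n-1."""
    partial = [0] * num_procs
    total = [0]
    barrier = threading.Barrier(num_procs)
    lock = threading.Lock()

    def member(me):
        # Strided loop: iterations are dealt out by process index, so the
        # code does not depend on any particular value of num_procs.
        for i in range(me, n, num_procs):
            partial[me] += i * i
        barrier.wait()  # no member continues until all partial sums are ready
        with lock:  # critical section guarding the shared total
            total[0] += partial[me]

    threads = [threading.Thread(target=member, args=(p,)) for p in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Calling `run_force(1, 10)` and `run_force(4, 10)` gives the same result, which is the property the sketch is meant to show: the program text is independent of the size of the force.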
78 FR 71708 - Notification of Application for Approval of a Railroad Safety Program Plan

Federal Register 2010, 2011, 2012, 2013, 2014

2013-11-29

... by a letter dated October 15, 2013, the Long Island Rail Road petitioned the Federal Railroad... accepting comments on the RSPP revision. A copy of the petition, as well as any written communications... written [[Page 71709
Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

NASA Technical Reports Server (NTRS)

Biegel, Bryan A. (Technical Monitor); Jost, G.; Jin, H.; Labarta, J.; Gimenez, J.; Caubet, J.

2003-01-01

Parallel programming paradigms include process level parallelism, thread level parallelization, and multilevel parallelism. This viewgraph presentation describes a detailed performance analysis of these paradigms for Shared Memory Architecture (SMA). This analysis uses the Paraver Performance Analysis System. The presentation includes diagrams of a flow of useful computations.
Chemical Education from Programs for Learning, Inc.

ERIC Educational Resources Information Center

Petrich, James A.

1981-01-01

This software review focuses on five concept-related packages of programs in the Apple version, which are viewed as well written in terms of both educational sophistication and programming expertise. (MP)

Monitoring and Acquisition Real-time System (MARS)

NASA Technical Reports Server (NTRS)

Holland, Corbin

2013-01-01

MARS is a graphical user interface (GUI) written in MATLAB and Java, allowing the user to configure and control the Scalable Parallel Architecture for Real-Time Acquisition and Analysis (SPARTAA) data acquisition system. SPARTAA not only acquires data, but also allows for complex algorithms to be applied to the acquired data in real time. The MARS client allows the user to set up and configure all settings regarding the data channels attached to the system, as well as have complete control over starting and stopping data acquisition. It provides a unique "Test" programming environment, allowing the user to create tests consisting of a series of alarms, each of which contains any number of data channels. Each alarm is configured with a particular algorithm, determining the type of processing that will be applied on each data channel and tested against a defined threshold. Tests can be uploaded to SPARTAA, thereby teaching it how to process the data. The uniqueness of MARS is in its ability to adapt easily to many test configurations. MARS sends and receives protocols via TCP/IP, which allows for quick integration into almost any test environment. The use of MATLAB and Java as the programming languages allows developers to integrate the software across multiple operating platforms.
Aligning Greek-English parallel texts

NASA Astrophysics Data System (ADS)

Galiotou, Eleni; Koronakis, George; Lazari, Vassiliki

2015-02-01

In this paper, we discuss issues concerning the alignment of parallel texts written in languages with different alphabets based on an experiment of aligning texts from the proceedings of the European Parliament in Greek and English. First, we describe our implementation of the k-vec algorithm and its application to the bilingual corpus. Then the output of the algorithm is used as a starting point for an alignment procedure at a sentence level which also takes into account mark-ups of meta-information. The results of the implementation are compared to those of the application of the Church and Gale alignment algorithm on the Europarl corpus. The conclusions of this comparison can give useful insights into the efficiency of alignment algorithms when applied to the particular bilingual corpus.
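The core idea behind k-vec can be sketched briefly. The abstract gives no implementation details, so the following is a minimal illustration under stated assumptions: each text is cut into k pieces of equal length, every word gets a binary vector marking the pieces it occurs in, and candidate translation pairs are scored by the mutual information of their vectors. The function names and the toy token lists are hypothetical.

```python
import math


def occurrence_vector(tokens, word, k):
    """Binary vector of length k: 1 if `word` occurs in that piece of the text."""
    vec = [0] * k
    for pos, tok in enumerate(tokens):
        if tok == word:
            vec[pos * k // len(tokens)] = 1
    return vec


def mutual_information(vec_s, vec_t):
    """log2 of P(s, t) / (P(s) * P(t)), estimated over the k pieces."""
    k = len(vec_s)
    both = sum(1 for s, t in zip(vec_s, vec_t) if s and t)
    if both == 0:
        return float("-inf")  # the two words never share a piece
    return math.log2(both * k / (sum(vec_s) * sum(vec_t)))


# Toy usage: "a" and "x" occur in the same pieces, "a" and "z" never do.
source = "a b c d a b e f".split()
target = "x y z w x y v u".split()
score = mutual_information(occurrence_vector(source, "a", 4),
                           occurrence_vector(target, "x", 4))
```

Because the vectors depend only on positions within each text, the method needs no shared alphabet, which is relevant for a Greek-English pair. A practical implementation would also filter pairs by a significance test before trusting the scores.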
76 FR 62808 - Pilot Program for Parallel Review of Medical Products

Federal Register 2010, 2011, 2012, 2013, 2014

2011-10-11

... voluntary participation in the pilot program, as well as the guiding principles the Agencies intend to... 57045), parallel review is intended to reduce the time between FDA marketing approval and CMS national...
Algorithms and programming tools for image processing on the MPP

NASA Technical Reports Server (NTRS)

Reeves, A. P.

1985-01-01

Topics addressed include: data mapping and rotational algorithms for the Massively Parallel Processor (MPP); Parallel Pascal language; documentation for the Parallel Pascal Development system; and a description of the Parallel Pascal language used on the MPP.
Execution models for mapping programs onto distributed memory parallel computers

NASA Technical Reports Server (NTRS)

Sussman, Alan

1992-01-01

The problem of exploiting the parallelism available in a program to efficiently employ the resources of the target machine is addressed. The problem is discussed in the context of building a mapping compiler for a distributed memory parallel machine. The paper describes using execution models to drive the process of mapping a program in the most efficient way onto a particular machine. Through analysis of the execution models for several mapping techniques for one class of programs, we show that the selection of the best technique for a particular program instance can make a significant difference in performance. On the other hand, the results of benchmarks from an implementation of a mapping compiler show that our execution models are accurate enough to select the best mapping technique for a given program.
Parallel Computing Strategies for Irregular Algorithms

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Trace-Driven Debugging of Message Passing Programs

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Hood, Robert; Lopez, Louis; Bailey, David (Technical Monitor)

1998-01-01

In this paper we report on features added to a parallel debugger to simplify the debugging of parallel message passing programs. These features include replay, setting consistent breakpoints based on interprocess event causality, a parallel undo operation, and communication supervision. These features all use trace information collected during the execution of the program being debugged. We used a number of different instrumentation techniques to collect traces. We also implemented trace displays using two different trace visualization systems. The implementation was tested on an SGI Power Challenge cluster and a network of SGI workstations.
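A breakpoint set that is consistent with interprocess event causality can be checked from trace data alone. The sketch below is an assumption about how such a check might look, not the debugger's actual implementation: the trace is reduced to a hypothetical list of messages with per-process event indices, and a proposed set of stopping points is rejected if any process would have received a message that its sender has not yet sent.

```python
def is_consistent_cut(messages, cut):
    """Check a proposed set of per-process breakpoints against a message trace.

    messages: list of (sender, send_index, receiver, recv_index), where the
              indices are 0-based positions in each process's event sequence.
    cut:      dict mapping process name to the number of events it has
              executed when it stops.
    """
    for sender, send_idx, receiver, recv_idx in messages:
        received_before_cut = recv_idx < cut[receiver]
        sent_before_cut = send_idx < cut[sender]
        if received_before_cut and not sent_before_cut:
            return False  # a receive without its matching send violates causality
    return True


# Toy trace: P0's third event sends a message that P1 receives as its second event.
trace = [("P0", 2, "P1", 1)]
```

With this trace, stopping P0 after three events and P1 after two is consistent, while stopping P0 after two events and P1 after two is not.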
Exploiting Symmetry on Parallel Architectures.

NASA Astrophysics Data System (ADS)

Stiller, Lewis Benjamin

1995-01-01

This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
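The simplest instance of multiplication by a group-equivariant matrix is the cyclic group, where the matrix is circulant and the Fourier transform over the group is the ordinary discrete Fourier transform. The thesis treats more general groups (dihedral, symmetric); the cyclic case below is chosen here only as an illustration, and all names are hypothetical. The sketch checks that transforming, multiplying pointwise, and transforming back gives the same result as the direct product.

```python
import cmath


def dft(x, inverse=False):
    """Plain O(n^2) discrete Fourier transform, kept simple for clarity."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def circulant_matvec_direct(c, x):
    """y = A x where A[i][j] = c[(i - j) mod n], computed entry by entry."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def circulant_matvec_fourier(c, x):
    """Same product via the transform: pointwise multiply, then invert."""
    prod = [a * b for a, b in zip(dft(c), dft(x))]
    return [v.real for v in dft(prod, inverse=True)]
```

Replacing the quadratic `dft` with a fast Fourier transform is what would make the second route faster than the direct one; the version here only demonstrates that the two agree.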
Development a computer codes to couple PWR-GALE output and PC-CREAM input

NASA Astrophysics Data System (ADS)

Kuntjoro, S.; Budi Setiawan, M.; Nursinta Adi, W.; Deswandri; Sunaryo, G. R.

2018-02-01

Radionuclide dispersion analysis is an important part of reactor safety analysis. From the analysis, the doses received by radiation workers and communities around a nuclear reactor can be obtained. The radionuclide dispersion analysis under normal operating conditions is carried out using the PC-CREAM code, which requires input data such as the source term and population distribution. The input data are derived from the output of another program, PWR-GALE, and from population distribution data written in a specific format. Compiling inputs for PC-CREAM manually requires high accuracy, because it involves large amounts of data in specific formats, and manual compilation often introduces errors. To minimize errors in input generation, a coupling program for PWR-GALE and PC-CREAM and a program for writing the population distribution in the PC-CREAM input format were created. This work was conducted to create the coupling program between PWR-GALE output and PC-CREAM input and the program to write population data in the required formats. Programming was done in the Python programming language, which has the advantages of being multiplatform, object-oriented, and interactive. The result of this work is software for coupling source term data and writing population distribution data, so that input to the PC-CREAM program can be prepared easily and formatting errors avoided. The source term coupling program for PWR-GALE and PC-CREAM is complete, so that PC-CREAM inputs for source term and distribution data can be created easily and in the desired format.
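A coupling step of the kind described here amounts to parsing one program's output and rewriting it in a strict layout. The abstract does not give either file format, so everything in the sketch below is an assumption: the input is taken to be lines of a nuclide name followed by a value, and the output is an invented fixed-width record. Neither layout is the real PWR-GALE or PC-CREAM format.

```python
def parse_source_term(text):
    """Parse hypothetical 'NUCLIDE VALUE' lines; skip blanks and '#' comments."""
    release = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        nuclide, value = line.split()
        release[nuclide] = float(value)
    return release


def format_records(release):
    """Emit one fixed-width record per nuclide (the layout is an assumption)."""
    return [f"{nuclide:<10s}{value:12.4E}"
            for nuclide, value in sorted(release.items())]


# Toy usage with made-up values.
sample = "# illustrative data only\nCs-137  1.5e-3\n\nI-131 2.0E+01\n"
records = format_records(parse_source_term(sample))
```

Generating the records from one formatting expression is the point of such a tool: column widths and number formats are fixed in a single place instead of being retyped by hand.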
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

NASA Astrophysics Data System (ADS)

Simunovic, S.; Zacharia, T.; Baltas, N.; Spalding, D. B.

1995-03-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
MPI implementation of PHOENICS: A general purpose computational fluid dynamics code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, S.; Zacharia, T.; Baltas, N.

1995-04-01

PHOENICS is a suite of computational analysis programs that are used for simulation of fluid flow, heat transfer, and dynamical reaction processes. The parallel version of the solver EARTH for the Computational Fluid Dynamics (CFD) program PHOENICS has been implemented using Message Passing Interface (MPI) standard. Implementation of MPI version of PHOENICS makes this computational tool portable to a wide range of parallel machines and enables the use of high performance computing for large scale computational simulations. MPI libraries are available on several parallel architectures making the program usable across different architectures as well as on heterogeneous computer networks. The Intel Paragon NX and MPI versions of the program have been developed and tested on massively parallel supercomputers Intel Paragon XP/S 5, XP/S 35, and Kendall Square Research, and on the multiprocessor SGI Onyx computer at Oak Ridge National Laboratory. The preliminary testing results of the developed program have shown scalable performance for reasonably sized computational domains.
Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

NASA Astrophysics Data System (ADS)

Rostrup, Scott; De Sterck, Hans

2010-12-01

Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM's Cell Processor and NVIDIA's CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time integration on clusters with Cell and GPU backends. The message passing interface (MPI) is used for communication between nodes at the coarsest level of parallelism. Optimizations of the simulation code at the several finer levels of parallelism that the data-parallel devices provide are described in terms of data layout, data flow and data-parallel instructions. Optimized Cell and GPU performance are compared with reference code performance on a single x86 central processing unit (CPU) core in single and double precision. We further compare the CPU, Cell and GPU platforms on a chip-to-chip basis, and compare performance on single cluster nodes with two CPUs, two Cell processors or two GPUs in a shared memory configuration (without MPI). We finally compare performance on clusters with 32 CPUs, 32 Cell processors, and 32 GPUs using MPI. Our GPU cluster results use NVIDIA Tesla GPUs with GT200 architecture, but some preliminary results on recently introduced NVIDIA GPUs with the next-generation Fermi architecture are also included. This paper provides computational scientists and engineers who are considering porting their codes to accelerator environments with insight into how structured grid based explicit algorithms can be optimized for clusters with Cell and GPU accelerators. It also provides insight into the speed-up that may be gained on current and future accelerator architectures for this class of applications. 
Program summary
Program title: SWsolver
Catalogue identifier: AEGY_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPL v3
No. of lines in distributed program, including test data, etc.: 59 168
No. of bytes in distributed program, including test data, etc.: 453 409
Distribution format: tar.gz
Programming language: C, CUDA
Computer: Parallel Computing Clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator.
Operating system: Linux
Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell Processors, and 1-32 NVIDIA GPUs.
RAM: Tested on Problems requiring up to 4 GB per compute node.
Classification: 12
External routines: MPI, CUDA, IBM Cell SDK
Nature of problem: MPI-parallel simulation of Shallow Water equations using high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell Processor, and NVIDIA GPU using CUDA.
Solution method: SWsolver provides 3 implementations of a high-resolution 2D Shallow Water equation solver on regular Cartesian grids, for CPU, Cell Processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster.
Additional comments: Sub-program numdiff is used for the test run.
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (IBM VERSION)

NASA Technical Reports Server (NTRS)

Manteufel, R.

1994-01-01

The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
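The two-file scheme described here, a keyword list for classifying statements and a weight table for combining the counts, can be sketched in a few lines. SAP itself is written in FORTRAN IV and its file formats are not given in the abstract, so the Python below is only an illustration: the keyword list, the weights, and the function names are hypothetical, and the parser handles only simple fixed-form lines (continuation lines are not treated specially).

```python
import re


def count_statements(source, keywords):
    """Count fixed-form FORTRAN statements by their leading keyword."""
    counts = {kw: 0 for kw in keywords}
    for line in source.splitlines():
        if not line.strip() or line[0] in "Cc*":
            continue  # blank line, or comment marked in column 1
        stmt = line[6:].strip().upper()  # columns 1-6 hold label and continuation mark
        for kw in keywords:
            if re.match(re.escape(kw) + r"\b", stmt):
                counts[kw] += 1
                break
    return counts


def complexity(counts, weights):
    """Weighted sum of the statement counts; unweighted keywords contribute 0."""
    return sum(weights.get(kw, 0.0) * n for kw, n in counts.items())


# Toy usage with made-up keywords and weights.
sample = "\n".join([
    "C     sample routine",
    "      DO 10 I = 1, 5",
    "      IF (X .GT. 0) GOTO 20",
    "   10 CONTINUE",
    "      CALL FOO(X)",
    "   20 CONTINUE",
])
counts = count_statements(sample, ["DO", "IF", "CALL", "CONTINUE"])
figure = complexity(counts, {"DO": 3.0, "IF": 2.0, "CALL": 1.0})
```

Keeping the keywords and weights outside the counting logic mirrors the flexibility the abstract describes: changing how complexity is defined means editing data, not code.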
SAP- FORTRAN STATIC SOURCE CODE ANALYZER PROGRAM (DEC VAX VERSION)

NASA Technical Reports Server (NTRS)

Merwarth, P. D.

1994-01-01

The FORTRAN Static Source Code Analyzer program, SAP, was developed to automatically gather statistics on the occurrences of statements and structures within a FORTRAN program and to provide for the reporting of those statistics. Provisions have been made for weighting each statistic and to provide an overall figure of complexity. Statistics, as well as figures of complexity, are gathered on a module by module basis. Overall summed statistics are also accumulated for the complete input source file. SAP accepts as input syntactically correct FORTRAN source code written in the FORTRAN 77 standard language. In addition, code written using features in the following languages is also accepted: VAX-11 FORTRAN, IBM S/360 FORTRAN IV Level H Extended; and Structured FORTRAN. The SAP program utilizes two external files in its analysis procedure. A keyword file allows flexibility in classifying statements and in marking a statement as either executable or non-executable. A statistical weight file allows the user to assign weights to all output statistics, thus allowing the user flexibility in defining the figure of complexity. The SAP program is written in FORTRAN IV for batch execution and has been implemented on a DEC VAX series computer under VMS and on an IBM 370 series computer under MVS. The SAP program was developed in 1978 and last updated in 1985.
User's guide: Programs for processing altimeter data over inland seas

NASA Technical Reports Server (NTRS)

Au, A. Y.; Brown, R. D.; Welker, J. E.

1989-01-01

The programs described were developed to process GEODYN-formatted satellite altimeter data, and to apply the processed results to predict geoid undulations and gravity anomalies of inland sea areas. These programs are written in standard FORTRAN 77 and are designed to run on the NSESCC IBM 3081(MVS) computer. Because of the experimental nature of these programs they are tailored to the geographical area analyzed. The attached program listings are customized for processing the altimeter data over the Black Sea. Users interested in the Caspian Sea data are expected to modify each program, although the required modifications are generally minor. Program control parameters are defined in the programs via PARAMETER statements and/or DATA statements. Other auxiliary parameters, such as labels, are hard-wired into the programs. Large data files are read in or written out through different input or output units. The program listings of these programs are accompanied by sample IBM job control language (JCL) images. Familiarity with IBM JCL and the TEMPLATE graphic package is assumed.
Learner perception of oral and written examinations in an international medical training program

PubMed Central

Weiner, Scott G.; Anderson, Philip D.; Irish, Julie; Ciottone, Greg; Pini, Riccardo; Grifoni, Stefano; Rosen, Peter; Ban, Kevin M.

2010-01-01

Background There are an increasing number of training programs in emergency medicine involving different countries or cultures. Many examination types, both oral and written, have been validated as useful assessment tools around the world; but learner perception of their use in the setting of cross-cultural training programs has not been described. Aims The goal of this study was to evaluate learner perception of four common examination methods in an international educational curriculum in emergency medicine. Methods Twenty-four physicians in a cross-cultural training program were surveyed to determine learner perception of four different examination methods: structured oral case simulations, multiple-choice tests, semi-structured oral examinations, and essay tests. We also describe techniques used and barriers faced. Results There was a 100% response rate. Learners reported that all testing methods were useful in measuring knowledge and clinical ability and should be used for accreditation and future training programs. They rated oral examinations as significantly more useful than written in measuring clinical abilities (p < 0.01). Compared to the other three types of examinations, learners ranked oral case simulations as the most useful examination method for assessing learners’ fund of knowledge and clinical ability (p < 0.01). Conclusions Physician learners in a cross-cultural, international training program perceive all four written and oral examination methods as useful, but rate structured oral case simulations as the most useful method for assessing fund of knowledge and clinical ability. Electronic supplementary material The online version of this article (doi:10.1007/s12245-009-0147-2) contains supplementary material, which is available to authorized users. PMID:20414377
ORCA Project: Research on high-performance parallel computer programming environments. Final report, 1 Apr-31 Mar 90

DOE Office of Scientific and Technical Information (OSTI.GOV)

Snyder, L.; Notkin, D.; Adams, L.

1990-03-31

This task relates to research on programming massively parallel computers. Previous work on the Ensamble concept of programming was extended and investigation into nonshared memory models of parallel computation was undertaken. Previous work on the Ensamble concept defined a set of programming abstractions and was used to organize the programming task into three distinct levels: composition of machine instructions, composition of processes, and composition of phases. It was applied to shared memory models of computations. During the present research period, these concepts were extended to nonshared memory models. During the present research period, one Ph.D. thesis was completed, and one book chapter and six conference proceedings were published.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

NASA Technical Reports Server (NTRS)

Dorband, John E.; Aburdene, Maurice F.

2002-01-01

Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
49 CFR 238.503 - Inspection, testing, and maintenance requirements.

Code of Federal Regulations, 2010 CFR

2010-10-01

... inspection, testing, or maintenance task under this part. (i) Standard procedures. The program under paragraph (a) of this section shall include the railroad's written standard procedures for performing all... this section shall contain the railroad's written procedures to ensure that all systems and components...

7 CFR 652.5 - Participant acquisition of technical services.

Code of Federal Regulations, 2010 CFR

2010-01-01

... technical service providers. (d) The Department may approve written agreements for technical assistance... identify in the particular program contract or written agreement the payment provisions for technical...
The parallel programming of voluntary and reflexive saccades.

PubMed

Walker, Robin; McSorley, Eugene

2006-06-01

A novel two-step paradigm was used to investigate the parallel programming of consecutive, stimulus-elicited ('reflexive') and endogenous ('voluntary') saccades. The mean latency of voluntary saccades, made following the first reflexive saccades in two-step conditions, was significantly reduced compared to that of voluntary saccades made in the single-step control trials. The latency of the first reflexive saccades was modulated by the requirement to make a second saccade: first saccade latency increased when a second voluntary saccade was required in the opposite direction to the first saccade, and decreased when a second saccade was required in the same direction as the first reflexive saccade. A second experiment confirmed the basic effect and also showed that a second reflexive saccade may be programmed in parallel with a first voluntary saccade. The results support the view that voluntary and reflexive saccades can be programmed in parallel on a common motor map.
Incremental Parallelization of Non-Data-Parallel Programs Using the Charon Message-Passing Library

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob F.

2000-01-01

Message passing is among the most popular techniques for parallelizing scientific programs on distributed-memory architectures. The reasons for its success are wide availability (MPI), efficiency, and full tuning control provided to the programmer. A major drawback, however, is that incremental parallelization, as offered by compiler directives, is not generally possible, because all data structures have to be changed throughout the program simultaneously. Charon remedies this situation through mappings between distributed and non-distributed data. It allows breaking up the parallelization into small steps, guaranteeing correctness at every stage. Several tools are available to help convert legacy codes into high-performance message-passing programs. They usually target data-parallel applications, whose loops carrying most of the work can be distributed among all processors without much dependency analysis. Others do a full dependency analysis and then convert the code virtually automatically. Even more toolkits are available that aid construction from scratch of message passing programs. None, however, allows piecemeal translation of codes with complex data dependencies (i.e. non-data-parallel programs) into message passing codes. The Charon library (available in both C and Fortran) provides incremental parallelization capabilities by linking legacy code arrays with distributed arrays. During the conversion process, non-distributed and distributed arrays exist side by side, and simple mapping functions allow the programmer to switch between the two in any location in the program. Charon also provides wrapper functions that leave the structure of the legacy code intact, but that allow execution on truly distributed data. 
Finally, the library provides a rich set of communication functions that support virtually all patterns of remote data demands in realistic structured grid scientific programs, including transposition, nearest-neighbor communication, pipelining, gather/scatter, and redistribution. At the end of the conversion process most intermediate Charon function calls will have been removed, the non-distributed arrays will have been deleted, and virtually the only remaining Charon function calls are the high-level, highly optimized communications. Distribution of the data is under complete control of the programmer, although a wide range of useful distributions is easily available through predefined functions. A crucial aspect of the library is that it does not allocate space for distributed arrays, but accepts programmer-specified memory. This has two major consequences. First, codes parallelized using Charon do not suffer from encapsulation; user data is always directly accessible. This provides high efficiency, and also retains the possibility of using message passing directly for highly irregular communications. Second, non-distributed arrays can be interpreted as (trivial) distributions in the Charon sense, which allows them to be mapped to truly distributed arrays, and vice versa. This is the mechanism that enables incremental parallelization. In this paper we provide a brief introduction of the library and then focus on the actual steps in the parallelization process, using some representative examples from, among others, the NAS Parallel Benchmarks. We show how a complicated two-dimensional pipeline (the prototypical non-data-parallel algorithm) can be constructed with ease. To demonstrate the flexibility of the library, we give examples of the stepwise, efficient parallel implementation of nonlocal boundary conditions common in aircraft simulations, as well as the construction of the sequence of grids required for multigrid.
Shuttle Inventory Management

NASA Technical Reports Server (NTRS)

1983-01-01

Shuttle Inventory Management System (SIMS) consists of series of integrated support programs providing supply support for both Shuttle program and Kennedy Space Center base operations. SIMS controls all supply activities and requirements from single point. Programs written in COBOL.
FORTRAN manpower account program

NASA Technical Reports Server (NTRS)

Strand, J. N.

1972-01-01

Computer program for determining manpower costs for full time, part time, and contractor personnel is discussed. Twelve different tables resulting from computer output are described. Program is written in FORTRAN 4 for IBM 360/65 computer.
Airplane stability calculations with a card programmable pocket calculator

NASA Technical Reports Server (NTRS)

Sherman, W. L.

1978-01-01

Programs are presented for calculating airplane stability characteristics with a card programmable pocket calculator. These calculations include eigenvalues of the characteristic equations of lateral and longitudinal motion as well as stability parameters such as the time to damp to one-half amplitude or the damping ratio. The effects of wind shear are included. Background information and the equations programmed are given. The programs are written for the International System of Units, the dimensional form of the stability derivatives, and stability axes. In addition to programs for stability calculations, an unusual and short program is included for the Euler transformation of coordinates used in airplane motions. The programs have been written for a Hewlett Packard HP-67 calculator. However, the use of this calculator does not constitute an endorsement of the product by the National Aeronautics and Space Administration.
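The stability parameters named in this abstract follow directly from an eigenvalue of the characteristic equation. The short Python sketch below is illustrative only and is not a transcription of the HP-67 programs; the function name `mode_characteristics` and the sample eigenvalue are assumptions made for the example. For an eigenvalue σ ± iω (σ in 1/s), it uses the standard relations: time to half amplitude ln 2 / |σ| for a stable mode, and damping ratio −σ / √(σ² + ω²).

```python
import math


def mode_characteristics(eigenvalue: complex) -> dict:
    """Return standard stability parameters for one eigenvalue (SI units).

    Hypothetical helper for illustration; not taken from the report.
    """
    sigma = eigenvalue.real          # growth/decay rate, 1/s
    natural_frequency = abs(eigenvalue)  # undamped natural frequency, rad/s
    return {
        "natural_frequency": natural_frequency,
        # Damping ratio is only defined for a nonzero eigenvalue.
        "damping_ratio": (-sigma / natural_frequency
                          if natural_frequency > 0 else float("nan")),
        # A mode decays only if the real part is negative.
        "time_to_half": (math.log(2) / -sigma
                         if sigma < 0 else float("inf")),
    }


# Hypothetical lightly damped oscillatory mode: sigma = -0.5 1/s, omega = 2 rad/s.
mode = mode_characteristics(complex(-0.5, 2.0))
```

An unstable mode (positive real part) never damps to half amplitude, which is why the sketch reports an infinite time in that case.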
Developing Your Evaluation Plans: A Critical Component of Public Health Program Infrastructure.

PubMed

Lavinghouze, S Rene; Snyder, Kimberly

A program's infrastructure is often cited as critical to public health success. The Component Model of Infrastructure (CMI) identifies evaluation as essential under the core component of engaged data. An evaluation plan is a written document that describes how to monitor and evaluate a program, as well as how to use evaluation results for program improvement and decision making. The evaluation plan clarifies how to describe what the program did, how it worked, and why outcomes matter. We use the Centers for Disease Control and Prevention's (CDC) "Framework for Program Evaluation in Public Health" as a guide for developing an evaluation plan. Just as using a roadmap facilitates progress on a long journey, a well-written evaluation plan can clarify the direction your evaluation takes and facilitate achievement of the evaluation's objectives.
78 FR 76628 - Pilot Program for Parallel Review of Medical Products; Extension of the Duration of the Program

Federal Register 2010, 2011, 2012, 2013, 2014

2013-12-18

...The Food and Drug Administration (FDA) and the Centers for Medicare and Medicaid Services (CMS) (the Agencies) are announcing the extension of the ``Pilot Program for Parallel Review of Medical Products.'' The Agencies have decided to continue the program as currently designed for an additional period of 2 years from the date of publication of this notice.
The application of generalized, cyclic, and modified numerical integration algorithms to problems of satellite orbit computation

NASA Technical Reports Server (NTRS)

Chesler, L.; Pierce, S.

1971-01-01

Generalized, cyclic, and modified multistep numerical integration methods are developed and evaluated for application to problems of satellite orbit computation. Generalized methods are compared with the presently utilized Cowell methods; new cyclic methods are developed for special second-order differential equations; and several modified methods are developed and applied to orbit computation problems. Special computer programs were written to generate coefficients for these methods, and subroutines were written which allow use of these methods with NASA's GEOSTAR computer program.
Theory and operation of the Gould 32/27 programs ABLE-2A and EBLE for the tropospheric air motion measurement system

NASA Technical Reports Server (NTRS)

Butler, C.

1986-01-01

Software development for the Tropospheric Air Motion Measurement Systems (TAMMS) is documented. In July/August the TAMMS was flown on the NASA/Goddard Flight Center Electra aircraft for 19 missions for the ABLE-2A (Amazon Boundary Layer Experiment) in Brazil. In December 1985, several flights were performed to assess the contamination and boundary layer of the Electra. Position data, flow angles, and pressure transducer measurements were recorded. The programs written for the ABLE-2A were modified due to timing considerations for this particular program. The 3-step programs written for EBLE (Electra Boundary Layer Experiment) are described. Power up and log-on procedures are discussed. A few editing techniques are described for modification of the programs.
GRAY: a program to calculate gray-body radiation heat-transfer view factors from black-body view factors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wong, R. L.

1976-06-14

Program GRAY is written to perform the matrix manipulations necessary to convert black-body radiation heat-transfer view factors to gray-body view factors as required by thermal analyzer codes. The black-body view factors contain only geometric relationships. Program GRAY allows the effects of multiple gray-body reflections to be included. The resulting effective gray-body factors can then be used with the corresponding fourth-power temperature differences to obtain the net radiative heat flux. The program is written to accept a matrix input or the card image output generated by the black-body view factor program CNVUFAC. The resulting card image output generated by GRAY is in a form usable by the TRUMP thermal analyzer.
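The abstract does not say which matrix formulation GRAY uses. As an illustration of the kind of conversion it describes, the sketch below uses one standard textbook formulation, Gebhart's absorption factors, for diffuse gray surfaces in a closed enclosure: B = (I − F·diag(ρ))⁻¹ · F·diag(ε), with reflectivity ρ = 1 − ε. The function name `gray_body_factors` and the two-surface example data are assumptions for this example and are not taken from the report.

```python
import numpy as np


def gray_body_factors(F, emissivity):
    """Gebhart absorption factors from black-body view factors.

    F[i, j] is the geometric (black-body) view factor from surface i to j.
    This is an assumed, standard formulation, not necessarily GRAY's own.
    """
    F = np.asarray(F, dtype=float)
    eps = np.asarray(emissivity, dtype=float)
    rho = 1.0 - eps                      # diffuse gray reflectivity
    n = F.shape[0]
    # Broadcasting scales column k by rho[k] (or eps[k]), i.e. F @ diag(.).
    # Solve (I - F diag(rho)) B = F diag(eps) for B.
    return np.linalg.solve(np.eye(n) - F * rho, F * eps)


# Hypothetical two-surface enclosure: a flat plate (1) facing a cavity (2)
# with twice its area, so F12 = 1 and, by reciprocity, F21 = 0.5.
F_black = np.array([[0.0, 1.0],
                    [0.5, 0.5]])
B = gray_body_factors(F_black, emissivity=[0.8, 0.5])
```

Two properties make a quick sanity check: for a closed enclosure each row of B sums to one, and with all emissivities equal to one the result reduces to the black-body view factors.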
Incremental comprehension of pitch relationships in written music: Evidence from eye movements.

PubMed

Hadley, Lauren V; Sturt, Patrick; Eerola, Tuomas; Pickering, Martin J

2017-03-17

To investigate how proficient pianists comprehend pitch relationships in written music when they first encounter it we conducted two experiments in which proficient pianists' eyes were tracked while they read and played single-line melodies. In Experiment 1, participants played at their own speed; in Experiment 2 they played with an external metronome. The melodies were either congruent or anomalous, with the anomaly involving one bar being shifted in pitch to alter the implied harmonic structure (e.g., non-resolution of a dominant). In both experiments, anomaly led to rapid disruption in participants' eye-movements in terms of regressions from the target bar, indicating that pianists process written pitch relationships online. This is particularly striking because in musical sight-reading eye movement behaviour is constrained by the concurrent performance. Both experiments also showed that anomaly induced pupil dilation. Together these results indicate that proficient pianists rapidly integrate the music that they read into the prior context, and that anomalies in terms of pitch relationships lead to processing difficulty. These findings parallel those of text reading, suggesting that structural processing involves similar constraints across domains.
Integrated Task and Data Parallel Programming

NASA Technical Reports Server (NTRS)

Grimshaw, A. S.

1998-01-01

This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program.
Additional 1995 Activities During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.
Integrated Task And Data Parallel Programming: Language Design

NASA Technical Reports Server (NTRS)

Grimshaw, Andrew S.; West, Emily A.

1998-01-01

This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program.
Additional 1995 Activities During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.
High performance Python for direct numerical simulations of turbulent flows

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Langtangen, Hans Petter

2016-06-01

Direct Numerical Simulation (DNS) of the Navier-Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython that is found to match the speed of C++. The solvers are written from scratch in Python, including the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores. A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition 7 lines of code are required to execute a transform.
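The data movement behind a slab-decomposed 3D FFT can be illustrated without a cluster. The sketch below is not the authors' code and does not use mpi4py: it simulates the ranks inside a single process with NumPy only, and the global transpose that an MPI all-to-all exchange would perform is replaced by plain array splitting and concatenation. The function name `slab_fft3d` and the array sizes are assumptions for this example.

```python
import numpy as np


def slab_fft3d(u, nprocs):
    """Single-process simulation of a slab-decomposed 3D FFT.

    Each simulated rank owns a slab of contiguous planes along axis 0.
    """
    n0, n1, _ = u.shape
    assert n0 % nprocs == 0 and n1 % nprocs == 0
    slabs = np.split(u, nprocs, axis=0)

    # Step 1: axes 1 and 2 are fully local to every slab, so transform them.
    slabs = [np.fft.fft2(s, axes=(1, 2)) for s in slabs]

    # Step 2: global transpose (stands in for an all-to-all exchange).
    # Rank r cuts its slab into nprocs blocks along axis 1 and "sends"
    # block q to rank q, which stacks the received blocks along axis 0.
    blocks = [np.split(s, nprocs, axis=1) for s in slabs]
    transposed = [
        np.concatenate([blocks[r][q] for r in range(nprocs)], axis=0)
        for q in range(nprocs)
    ]

    # Step 3: axis 0 is now local on every rank, so finish the transform.
    transposed = [np.fft.fft(s, axis=0) for s in transposed]

    # Gather the pieces only so the result can be compared with a serial FFT.
    return np.concatenate(transposed, axis=1)


rng = np.random.default_rng(0)
u = rng.standard_normal((8, 8, 4))
u_hat = slab_fft3d(u, nprocs=4)
```

Because the multidimensional FFT is separable, the result agrees with `np.fft.fftn(u)`; in a real distributed run the final gather would be omitted and each rank would keep its own piece of the spectrum.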
An approach to computing discrete adjoints for MPI-parallelized models applied to Ice Sheet System Model 4.11

NASA Astrophysics Data System (ADS)

Larour, Eric; Utke, Jean; Bovin, Anton; Morlighem, Mathieu; Perez, Gilberto

2016-11-01

Within the framework of sea-level rise projections, there is a strong need for hindcast validation of the evolution of polar ice sheets in a way that tightly matches observational records (from radar, gravity, and altimetry observations mainly). However, the computational requirements for making hindcast reconstructions possible are severe and rely mainly on the evaluation of the adjoint state of transient ice-flow models. Here, we look at the computation of adjoints in the context of the NASA/JPL/UCI Ice Sheet System Model (ISSM), written in C++ and designed for parallel execution with MPI. We present the adaptations required in the way the software is designed and written, but also generic adaptations in the tools facilitating the adjoint computations. We concentrate on the use of operator overloading coupled with the AdjoinableMPI library to achieve the adjoint computation of the ISSM. We present a comprehensive approach to (1) carry out type changing through the ISSM, hence facilitating operator overloading, (2) bind to external solvers such as MUMPS and GSL-LU, and (3) handle MPI-based parallelism to scale the capability. We demonstrate the success of the approach by computing sensitivities of hindcast metrics such as the misfit to observed records of surface altimetry on the northeastern Greenland Ice Stream, or the misfit to observed records of surface velocities on Upernavik Glacier, central West Greenland. We also provide metrics for the scalability of the approach, and the expected performance. This approach has the potential to enable a new generation of hindcast-validated projections that make full use of the wealth of datasets currently being collected, or already collected, in Greenland and Antarctica.
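Operator overloading, the technique this abstract relies on, can be shown in miniature. The sketch below is a generic reverse-mode (adjoint) differentiation toy in pure Python; it is not ISSM code and does not use the AdjoinableMPI library. The class `Var`, the function `backward`, and the scalar "misfit" are all hypothetical names and data chosen for illustration.

```python
class Var:
    """Scalar that records how it was computed so adjoints can be propagated."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuples of (parent Var, local derivative)
        self.grad = 0.0          # adjoint, filled in by backward()

    @staticmethod
    def _wrap(x):
        return x if isinstance(x, Var) else Var(x)

    def __add__(self, other):
        other = Var._wrap(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        other = Var._wrap(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = Var._wrap(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __radd__ = __add__
    __rmul__ = __mul__


def backward(output):
    """Propagate adjoints from output to its inputs in reverse topological order."""
    order, seen = [], set()

    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

    visit(output)
    output.grad = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad


# Hypothetical misfit J = (a * x - obs)^2 between a model value and an observation.
a, x = Var(2.0), Var(3.0)
residual = a * x - 5.0
J = residual * residual
backward(J)
```

Changing the numeric type from a plain float to a recording type such as `Var` is the "type changing" step in miniature: the model code itself stays the same, while every arithmetic operation now leaves the trace needed for the adjoint sweep.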
An Approach to Computing Discrete Adjoints for MPI-Parallelized Models Applied to the Ice Sheet System Model

NASA Astrophysics Data System (ADS)

Perez, G. L.; Larour, E. Y.; Morlighem, M.

2016-12-01

Within the framework of sea-level rise projections, there is a strong need for hindcast validation of the evolution of polar ice sheets in a way that tightly matches observational records (from radar and altimetry observations mainly). However, the computational requirements for making hindcast reconstructions possible are severe and rely mainly on the evaluation of the adjoint state of transient ice-flow models. Here, we look at the computation of adjoints in the context of the NASA/JPL/UCI Ice Sheet System Model, written in C++ and designed for parallel execution with MPI. We present the adaptations required in the way the software is designed and written but also generic adaptations in the tools facilitating the adjoint computations. We concentrate on the use of operator overloading coupled with the AdjoinableMPI library to achieve the adjoint computation of ISSM. We present a comprehensive approach to 1) carry out type changing through ISSM, hence facilitating operator overloading, 2) bind to external solvers such as MUMPS and GSL-LU and 3) handle MPI-based parallelism to scale the capability. We demonstrate the success of the approach by computing sensitivities of hindcast metrics such as the misfit to observed records of surface altimetry on the North-East Greenland Ice Stream, or the misfit to observed records of surface velocities on Upernavik Glacier, Central West Greenland. We also provide metrics for the scalability of the approach, and the expected performance. This approach has the potential of enabling a new generation of hindcast-validated projections that make full use of the wealth of datasets currently being collected, or already collected in Greenland and Antarctica, such as surface altimetry, surface velocities, and/or gravity measurements.
Automatic Management of Parallel and Distributed System Resources

NASA Technical Reports Server (NTRS)

Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.

1990-01-01

Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

2013-08-21

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time $t_i$ (trajectory positions and velocities $x_i = (r_i, v_i)$) to time $t_{i+1}$ ($x_{i+1}$) by $x_{i+1} = f_i(x_i)$, the dynamics problem spanning an interval from $t_0, \ldots, t_M$ can be transformed into a root finding problem, $F(X) = [x_i - f(x_{i-1})]_{i=1,M} = 0$, for the trajectory variables. The root finding problem is solved using a variety of optimization techniques, including quasi-Newton and preconditioned quasi-Newton optimization schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl+4H2O AIMD simulation at the MP2 level. The maximum speedup obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a distributed computing environment using very slow TCP/IP networks.
Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl+4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. By using these algorithms we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 seconds per time step to 6.9 seconds per time step.
Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations.

PubMed

Bylaska, Eric J; Weare, Jonathan Q; Weare, John H

2013-08-21

Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time $t_i$ (trajectory positions and velocities $x_i = (r_i, v_i)$) to time $t_{i+1}$ ($x_{i+1}$) by $x_{i+1} = f_i(x_i)$, the dynamics problem spanning an interval from $t_0, \ldots, t_M$ can be transformed into a root finding problem, $F(X) = [x_i - f(x_{i-1})]_{i=1,M} = 0$, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution time/parallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3.
The parallel in time algorithms can be implemented in a distributed computing environment using very slow transmission control protocol/Internet protocol networks. Scripts written in Python that make calls to a precompiled quantum chemistry package (NWChem) are demonstrated to provide an actual speedup of 8.2 for a 2.5 ps AIMD simulation of HCl + 4H2O at the MP2/6-31G* level. Implemented in this way these algorithms can be used for long time high-level AIMD simulations at a modest cost using machines connected by very slow networks such as WiFi, or in different time zones connected by the Internet. The algorithms can also be used with programs that are already parallel. Using these algorithms, we are able to reduce the cost of a MP2/6-311++G(2d,2p) simulation that had reached its maximum possible speedup in the parallelization of the electronic structure calculation from 32 s/time step to 6.9 s/time step.
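The root-finding formulation in this abstract can be illustrated with a minimal sketch. It is not the authors' quasi-Newton scheme: it assumes a 1D harmonic oscillator, a velocity-Verlet integrator, and the plainest possible solver, a fixed-point sweep in which every time slice is recomputed from the previous sweep's guess. The names `verlet_step`, `serial_trajectory` and `parallel_in_time` are hypothetical. Within one sweep the entries are independent, which is the part that could be handed to one processor per time step.

```python
def verlet_step(state, dt=0.1, omega=1.0):
    """One velocity-Verlet step f for a 1D harmonic oscillator; state = (r, v)."""
    r, v = state
    a = -omega ** 2 * r
    r_new = r + v * dt + 0.5 * a * dt * dt
    a_new = -omega ** 2 * r_new
    v_new = v + 0.5 * (a + a_new) * dt
    return (r_new, v_new)


def serial_trajectory(x0, n_steps, step=verlet_step):
    """Reference: propagate x_{i+1} = f(x_i) one step after another."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return traj


def parallel_in_time(x0, n_steps, step=verlet_step, tol=1e-12):
    """Solve F(X) = [x_i - f(x_{i-1})] = 0 by repeated sweeps over all slices."""
    traj = [x0] * (n_steps + 1)          # initial guess: every slice equals x0
    sweeps = 0
    for sweeps in range(1, n_steps + 1):
        # Each entry uses only the previous sweep, so this list
        # comprehension could be evaluated concurrently.
        new = [x0] + [step(traj[i - 1]) for i in range(1, n_steps + 1)]
        # |old_i - f(old_{i-1})| is the residual F(X) of the previous guess.
        residual = max(abs(p - q)
                       for xn, xo in zip(new, traj)
                       for p, q in zip(xn, xo))
        traj = new
        if residual < tol:
            break
    return traj, sweeps


if __name__ == "__main__":
    x0 = (1.0, 0.0)
    reference = serial_trajectory(x0, 20)
    solution, sweeps = parallel_in_time(x0, 20)
    print("sweeps used:", sweeps, "matches serial:", solution == reference)
```

This plain sweep recovers the serial trajectory, but in the worst case it needs as many sweeps as there are time steps, so by itself it gives no speedup. Schemes such as the quasi-Newton and preconditioned methods named in the abstract aim to reach convergence in far fewer sweeps.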

Describing, using 'recognition cones'. [parallel-series model with English-like computer program]

NASA Technical Reports Server (NTRS)

Uhr, L.

1973-01-01

A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.
PISCES: An environment for parallel scientific computation

NASA Technical Reports Server (NTRS)

Pratt, T. W.

1985-01-01

The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.
Programs for Fundamentals of Chemistry.

ERIC Educational Resources Information Center

Gallardo, Julio; Delgado, Steven

This document provides computer programs, written in BASIC PLUS, for presenting fundamental or remedial college chemistry students with chemical problems in a computer assisted instructional program. Programs include instructions, a sample run, and 14 separate practice sessions covering: mathematical operations, using decimals, solving…
42 CFR 8.33 - Written decision.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 1 2012-10-01 2012-10-01 false Written decision. 8.33 Section 8.33 Public Health PUBLIC HEALTH SERVICE, DEPARTMENT OF HEALTH AND HUMAN SERVICES GENERAL PROVISIONS CERTIFICATION OF OPIOID TREATMENT PROGRAMS Procedures for Review of Suspension or Proposed Revocation of OTP Certification, and of...
42 CFR 8.33 - Written decision.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 42 Public Health 1 2014-10-01 2014-10-01 false Written decision. 8.33 Section 8.33 Public Health PUBLIC HEALTH SERVICE, DEPARTMENT OF HEALTH AND HUMAN SERVICES GENERAL PROVISIONS CERTIFICATION OF OPIOID TREATMENT PROGRAMS Procedures for Review of Suspension or Proposed Revocation of OTP Certification, and of...
42 CFR 8.33 - Written decision.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 1 2013-10-01 2013-10-01 false Written decision. 8.33 Section 8.33 Public Health PUBLIC HEALTH SERVICE, DEPARTMENT OF HEALTH AND HUMAN SERVICES GENERAL PROVISIONS CERTIFICATION OF OPIOID TREATMENT PROGRAMS Procedures for Review of Suspension or Proposed Revocation of OTP Certification, and of...
5 CFR 2638.706 - Agency's written plan for annual ethics training.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 5 Administrative Personnel 3 2011-01-01 2011-01-01 false Agency's written plan for annual ethics training. 2638.706 Section 2638.706 Administrative Personnel OFFICE OF GOVERNMENT ETHICS GOVERNMENT ETHICS OFFICE OF GOVERNMENT ETHICS AND EXECUTIVE AGENCY ETHICS PROGRAM RESPONSIBILITIES Executive Agency Ethics...
5 CFR 2638.706 - Agency's written plan for annual ethics training.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Agency's written plan for annual ethics training. 2638.706 Section 2638.706 Administrative Personnel OFFICE OF GOVERNMENT ETHICS GOVERNMENT ETHICS OFFICE OF GOVERNMENT ETHICS AND EXECUTIVE AGENCY ETHICS PROGRAM RESPONSIBILITIES Executive Agency Ethics...
42 CFR 456.380 - Individual written plan of care.

Code of Federal Regulations, 2010 CFR

2010-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Intermediate Care Facilities Plan of Care § 456.380 Individual written plan of care. (a) Before admission to an ICF or before...) Activities; (v) Therapies; (vi) Social services; (vii) Diet; and (viii) Special procedures designed to meet...
42 CFR 456.380 - Individual written plan of care.

Code of Federal Regulations, 2011 CFR

2011-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Intermediate Care Facilities Plan of Care § 456.380 Individual written plan of care. (a) Before admission to an ICF or before...) Activities; (v) Therapies; (vi) Social services; (vii) Diet; and (viii) Special procedures designed to meet...
5 CFR 2638.706 - Agency's written plan for annual ethics training.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 5 Administrative Personnel 3 2014-01-01 2014-01-01 false Agency's written plan for annual ethics training. 2638.706 Section 2638.706 Administrative Personnel OFFICE OF GOVERNMENT ETHICS GOVERNMENT ETHICS OFFICE OF GOVERNMENT ETHICS AND EXECUTIVE AGENCY ETHICS PROGRAM RESPONSIBILITIES Executive Agency Ethics...
5 CFR 2638.706 - Agency's written plan for annual ethics training.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 5 Administrative Personnel 3 2012-01-01 2012-01-01 false Agency's written plan for annual ethics training. 2638.706 Section 2638.706 Administrative Personnel OFFICE OF GOVERNMENT ETHICS GOVERNMENT ETHICS OFFICE OF GOVERNMENT ETHICS AND EXECUTIVE AGENCY ETHICS PROGRAM RESPONSIBILITIES Executive Agency Ethics...
5 CFR 2638.706 - Agency's written plan for annual ethics training.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 5 Administrative Personnel 3 2013-01-01 2013-01-01 false Agency's written plan for annual ethics training. 2638.706 Section 2638.706 Administrative Personnel OFFICE OF GOVERNMENT ETHICS GOVERNMENT ETHICS OFFICE OF GOVERNMENT ETHICS AND EXECUTIVE AGENCY ETHICS PROGRAM RESPONSIBILITIES Executive Agency Ethics...
Generating performance portable geoscientific simulation code with Firedrake (Invited)

NASA Astrophysics Data System (ADS)

Ham, D. A.; Bercea, G.; Cotter, C. J.; Kelly, P. H.; Loriant, N.; Luporini, F.; McRae, A. T.; Mitchell, L.; Rathgeber, F.

2013-12-01

This presentation will demonstrate how a change in simulation programming paradigm can be exploited to deliver sophisticated simulation capability which is far easier to programme than are conventional models, is capable of exploiting different emerging parallel hardware, and is tailored to the specific needs of geoscientific simulation. Geoscientific simulation represents a grand challenge computational task: many of the largest computers in the world are tasked with this field, and the requirements of resolution and complexity of scientists in this field are far from being sated. However, single thread performance has stalled, even sometimes decreased, over the last decade, and has been replaced by ever more parallel systems: both as conventional multicore CPUs and in the emerging world of accelerators. At the same time, the needs of scientists to couple ever-more complex dynamics and parametrisations into their models makes the model development task vastly more complex. The conventional approach of writing code in low level languages such as Fortran or C/C++ and then hand-coding parallelism for different platforms by adding library calls and directives forces the intermingling of the numerical code with its implementation. This results in an almost impossible set of skill requirements for developers, who must simultaneously be domain science experts, numericists, software engineers and parallelisation specialists. Even more critically, it requires code to be essentially rewritten for each emerging hardware platform. Since new platforms are emerging constantly, and since code owners do not usually control the procurement of the supercomputers on which they must run, this represents an unsustainable development load. The Firedrake system, conversely, offers the developer the opportunity to write PDE discretisations in the high-level mathematical language UFL from the FEniCS project (http://fenicsproject.org). 
Non-PDE model components, such as parametrisations, can be written as short C kernels operating locally on the underlying mesh, with no explicit parallelism. The executable code is then generated in C, CUDA or OpenCL and executed in parallel on the target architecture. The system also offers features of special relevance to the geosciences. In particular, the large scale separation between the vertical and horizontal directions in many geoscientific processes can be exploited to offer the flexibility of unstructured meshes in the horizontal direction, without the performance penalty usually associated with those methods.
Program/Project Management Resources: A collection of 50 bibliographies focusing on continual improvement, reinventing government, and successful project management

NASA Technical Reports Server (NTRS)

Michaels, Jeffrey

1994-01-01

These Program/Project Management Resource Lists were originally written for the NASA project management community. Their purpose was to promote the use of the NASA Headquarters Library Program/Project Management Collection funded by NASA Headquarters Code FT, Training & Development Division, by offering introductions to the management topics studied by today's managers. Lists were also written at the request of NASA Headquarters Code T, Office of Continual improvements, and at the request of NASA members of the National Performance Review. This is the second edition of the compilation of these bibliographies; the first edition was printed in March 1994.
Eigensolver for a Sparse, Large Hermitian Matrix

NASA Technical Reports Server (NTRS)

Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris

2003-01-01

A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
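For readers unfamiliar with the Lanczos algorithm mentioned in this record, a minimal serial sketch follows. It is not the JPL program: it assumes a real symmetric matrix (a Hermitian matrix would need conjugated inner products), stores the matrix as a hypothetical `{row: {col: value}}` dictionary, and omits reorthogonalization and MPI entirely. The iteration reduces the matrix to a small tridiagonal matrix with diagonal `alphas` and off-diagonal `betas`; approximate eigenvalues are then taken from that small matrix.

```python
import math


def matvec(rows, x):
    """Multiply a sparse matrix stored as {row: {col: value}} by a vector."""
    return [sum(val * x[j] for j, val in rows.get(i, {}).items())
            for i in range(len(x))]


def lanczos(rows, q0, m):
    """Run up to m Lanczos steps; return the tridiagonal entries (alphas, betas)."""
    norm = math.sqrt(sum(v * v for v in q0))
    q = [v / norm for v in q0]
    q_prev = [0.0] * len(q)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(m):
        w = matvec(rows, q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        # Three-term recurrence: remove components along q and q_prev.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(wi * wi for wi in w))
        if beta < 1e-14 or len(alphas) == m:
            break                      # invariant subspace found, or done
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
    return alphas, betas


if __name__ == "__main__":
    # 4x4 second-difference matrix: 2 on the diagonal, -1 next to it.
    A = {0: {0: 2.0, 1: -1.0},
         1: {0: -1.0, 1: 2.0, 2: -1.0},
         2: {1: -1.0, 2: 2.0, 3: -1.0},
         3: {2: -1.0, 3: 2.0}}
    print(lanczos(A, [1.0, 0.0, 0.0, 0.0], 4))
```

Only matrix-vector products touch the matrix, which is why the method suits very large sparse problems: the matrix rows can be distributed across processes and each process multiplies its own rows.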
Projection of Teachers' Salaries for Contract Negotiations.

ERIC Educational Resources Information Center

Ott, Jack P.

1982-01-01

Lists and explains a computer program written in BASIC which calculates teacher salaries using a salary index. Modification of this payroll program is suggested as a student project in schools which teach computer programing. (JJD)
Computer program for afterheat temperature distribution for mobile nuclear power plant

NASA Technical Reports Server (NTRS)

Parker, W. G.; Vanbibber, L. E.

1972-01-01

ESATA computer program was developed to analyze thermal safety aspects of post-impacted mobile nuclear power plants. Program is written in FORTRAN 4 and designed for IBM 7094/7044 direct coupled system.
5 CFR 362.102 - Definitions.

Code of Federal Regulations, 2013 CFR

2013-01-01

... Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS PATHWAYS PROGRAMS General... written agreement between the agency and each Pathways Participant. Program Participant or Pathways Participant means any individual appointed under a Pathways Program. Qualifying educational institution means...
5 CFR 362.102 - Definitions.

Code of Federal Regulations, 2014 CFR

2014-01-01

... Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS PATHWAYS PROGRAMS General... written agreement between the agency and each Pathways Participant. Program Participant or Pathways Participant means any individual appointed under a Pathways Program. Qualifying educational institution means...

A study of low-cost reliable actuators for light aircraft. Part B: Appendices

NASA Technical Reports Server (NTRS)

Eijsink, H.; Rice, M.

1978-01-01

Computer programs written in FORTRAN are given for time response calculations on pneumatic and linear hydraulic actuators. The programs are self-explanatory with comment statements. Program output is also included.
Parallelization of elliptic solver for solving 1D Boussinesq model

NASA Astrophysics Data System (ADS)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver in solving 1D Boussinesq model is presented. Numerical solution of Boussinesq model is obtained by implementing a staggered grid scheme to continuity, momentum, and elliptic equation of Boussinesq model. Tridiagonal system emerging from numerical scheme of elliptic equation is solved by cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grids is varied from 2^8 to 2^14. Two test cases of numerical experiment, i.e. propagation of solitary and standing wave, are proposed to evaluate the parallel program. The numerical results are verified with analytical solution of solitary and standing wave. The best speedup of solitary and standing wave test cases is about 2.07 with 2^14 grids and 1.86 with 2^13 grids, respectively, which are executed by using 8 threads. Moreover, the best efficiency of parallel program is 76.2% and 73.5% for solitary and standing wave test cases, respectively.
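The cyclic reduction algorithm named in this abstract can be sketched serially as follows. This is an illustrative recursive version written for this listing, not the paper's OpenMP code. It assumes the tridiagonal system is given as four lists (sub-diagonal `a` with `a[0] = 0`, diagonal `b`, super-diagonal `c` with `c[-1] = 0`, right-hand side `d`) and that no pivoting is needed, for example because the matrix is diagonally dominant.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction."""
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate even-indexed unknowns from the odd-indexed
    # equations. The loop iterations are independent of one another, which
    # is the part a shared-memory implementation would run in parallel.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        has_next = i + 1 < n
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1]
                  - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1]
                  - (gamma * d[i + 1] if has_next else 0.0))

    # Solve the half-size system for the odd-indexed unknowns.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: each even-indexed unknown follows from its
    # neighbours; these iterations are independent as well.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


if __name__ == "__main__":
    n = 7
    a = [0.0] + [-1.0] * (n - 1)
    b = [4.0] * n
    c = [-1.0] * (n - 1) + [0.0]
    d = [1.0] * n
    print(cyclic_reduction(a, b, c, d))
```

Each level halves the number of unknowns, so a system of size 2^k is reduced in about k levels. The work available per level shrinks as the recursion deepens, which is one commonly cited reason why speedups for this kind of solver stay well below the thread count.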
3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.

1997-12-31

The aim of the work performed is to develop a 3D parallel program for numerical calculation of gas dynamics problem with heat conductivity on distributed memory computational systems (CS), satisfying the condition of numerical result independence from the number of processors involved. Two basically different approaches to the structure of massive parallel computations have been developed. The first approach uses the 3D data matrix decomposition reconstructed at temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on using a 3D data matrix decomposition not reconstructed during a temporal cycle. The program was developed on 8-processor CS MP-3 made in VNIIEF and was adapted to a massive parallel CS Meiko-2 in LLNL by joint efforts of VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different number of processors up to 256 and the efficiency of parallelization has been evaluated in dependence on processor number and their parameters.
Support for Debugging Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

2001-01-01

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.
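The core idea of relative debugging, comparing a trusted serial run with a parallel run to find where they first differ, can be shown with a small sketch. It does not reproduce the system described here, which uses dynamic instrumentation and distribution information from the parallelization tool. The trace layout below is a hypothetical stand-in: each checkpoint is a dictionary of variables, serial values are flat lists, and parallel values are lists of per-process blocks (a simple block distribution is assumed).

```python
import math


def first_divergence(serial_trace, parallel_trace, rel_tol=1e-9):
    """Return (checkpoint, variable, index) of the first mismatch, or None."""
    for step, (s_vars, p_vars) in enumerate(zip(serial_trace, parallel_trace)):
        for name in sorted(s_vars):
            serial_values = s_vars[name]
            # Gather the distributed blocks back into one global array.
            gathered = [v for block in p_vars[name] for v in block]
            for idx, (s, p) in enumerate(zip(serial_values, gathered)):
                if not math.isclose(s, p, rel_tol=rel_tol, abs_tol=1e-12):
                    return step, name, idx
    return None


if __name__ == "__main__":
    serial = [{"u": [1.0, 2.0, 3.0, 4.0]},
              {"u": [2.0, 4.0, 6.0, 8.0]}]
    # Two processes, each owning half of u; the last value is wrong
    # at the second checkpoint.
    parallel = [{"u": [[1.0, 2.0], [3.0, 4.0]]},
                {"u": [[2.0, 4.0], [6.0, 9.0]]}]
    print(first_divergence(serial, parallel))
```

Comparing with a tolerance rather than exact equality is a deliberate choice in this sketch, since a parallel run may legitimately reorder floating-point operations.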
Relative Debugging of Automatically Parallelized Programs

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Hood, Robert; Biegel, Bryan (Technical Monitor)

2002-01-01

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the computations begin to differ. If the original serial code is correct, errors due to parallelization will be isolated by the comparison. One of the primary goals of the system is to minimize the effort required of the user. To that end, the debugging system uses information produced by the parallelization tool to drive the comparison process. In particular, the debugging system relies on the parallelization tool to provide information about where variables may have been modified and how arrays are distributed across multiple processes. User effort is also reduced through the use of dynamic instrumentation. This allows us to modify the program execution without changing the way the user builds the executable. The use of dynamic instrumentation also permits us to compare the executions in a fine-grained fashion and only involve the debugger when a difference has been detected. This reduces the overhead of executing instrumentation.
UPEML Version 2.0: A machine-portable CDC Update emulator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mehlhorn, T.A.; Young, M.F.

1987-05-01

UPEML is a machine-portable CDC Update emulation program. UPEML is written in ANSI standard Fortran-77 and is relatively simple and compact. It is capable of emulating a significant subset of the standard CDC Update functions, including program library creation and subsequent modification. Machine-portability is an essential attribute of UPEML. UPEML was written primarily to facilitate the use of CDC-based scientific packages on alternate computer systems such as the VAX 11/780 and the IBM 3081. UPEML has also been successfully used on the multiprocessor ELXSI, on CRAYs under both COS and CTSS operating systems, on APOLLO workstations, and on the HP-9000. Version 2.0 includes enhanced error checking, full ASCII character support, a program library audit capability, and a partial update option in which only selected or modified decks are written to the compile file. Further enhancements include checks for overlapping corrections, processing of nested calls to common decks, and reads and addfiles from alternate input files.
Numerical Modeling of S-Wave Generation by Fracture Damage in Underground Nuclear Explosions

DTIC Science & Technology

2009-09-30

Element Package, ABAQUS. A user-defined subroutine, VUMAT, was written that incorporates the micro-mechanics based damage constitutive law described...dynamic damage evolution on the elastic and anelastic response. 2) whereas the Ashby/Sammis model was only applicable to the case where the initial cracks...are all parallel and the same size, we can now include a specified distribution of initial crack sizes with random azimuthal orientation about the
Visualization Software for VisIT Java Client

DOE Office of Scientific and Technical Information (OSTI.GOV)

Billings, Jay Jay; Smith, Robert W

The VisIT Java Client (JVC) library is a lightweight thin client that is designed and written purely in the native language of Java (the Python & JavaScript versions of the library use the same concept) and communicates with any new unmodified standalone version of VisIT, a high performance computing parallel visualization toolkit, over traditional or web sockets and dynamically determines capabilities of the running VisIT instance whether local or remote.
Paralex: An Environment for Parallel Programming in Distributed Systems

DTIC Science & Technology

1991-12-07

distributed systems is comparable to assembly language programming for traditional sequential systems - the user must resort to low-level primitives...to accomplish data encoding/decoding, communication, remote execution, synchronization, failure detection and recovery. It is our belief that...synchronization. Finally, composing parallel programs by interconnecting sequential computations allows automatic support for heterogeneity and fault tolerance
12 CFR 748.0 - Security program.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 12 Banks and Banking 7 2014-01-01 2014-01-01 false Security program. 748.0 Section 748.0 Banks and Banking NATIONAL CREDIT UNION ADMINISTRATION REGULATIONS AFFECTING CREDIT UNIONS SECURITY PROGRAM, REPORT....0 Security program. (a) Each federally insured credit union will develop a written security program...
12 CFR 748.0 - Security program.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 12 Banks and Banking 6 2011-01-01 2011-01-01 false Security program. 748.0 Section 748.0 Banks and Banking NATIONAL CREDIT UNION ADMINISTRATION REGULATIONS AFFECTING CREDIT UNIONS SECURITY PROGRAM, REPORT....0 Security program. (a) Each federally insured credit union will develop a written security program...
PROGRAMMED LEARNING IN CENTRAL AFRICAN CONTEXTS.

ERIC Educational Resources Information Center

HAWKRIDGE, D.G.

SINCE 1964, THE PROGRAMMED LEARNING CENTRE AT THE UNIVERSITY COLLEGE OF RHODESIA HAS BEEN INVESTIGATING THE POTENTIALITIES OF PROGRAMED LEARNING FOR CENTRAL AFRICA THROUGH A SERIES OF CONTROLLED EXPERIMENTS USING LOCALLY-WRITTEN AND PUBLISHED PROGRAMS. ASSESSMENT OF THE DESIRABILITY AND USEFULNESS OF PROGRAMS IN TEACHING, AND ASSESSMENT OF THEIR…
METALLURGICAL PROGRAMS: CALCULATION OF MASS FROM VOLUME, DENSITY OF MIXTURES, AND CONVERSION OF ATOMIC TO WEIGHT PERCENT

NASA Technical Reports Server (NTRS)

Degroh, H.

1994-01-01

The Metallurgical Programs include three simple programs which calculate solutions to problems common to metallurgical engineers and persons making metal castings. The first program calculates the mass of a binary ideal (alloy) given the weight fractions and densities of the pure components and the total volume. The second program calculates the densities of a binary ideal mixture. The third program converts the atomic percentages of a binary mixture to weight percentages. The programs use simple equations to assist the materials staff with routine calculations. The Metallurgical Programs are written in Microsoft QuickBASIC for interactive execution and have been implemented on an IBM PC-XT/AT operating MS-DOS 2.1 or higher with 256K bytes of memory. All instructions needed by the user appear as prompts as the software is used. Data is input using the keyboard only and output is via the monitor. The Metallurgical programs were written in 1987.
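The three calculations listed in this record can be sketched in a few lines. This is not the original QuickBASIC code. It assumes that "ideal" means the component volumes are additive, so the mixture density follows from 1/rho = w1/rho1 + w2/rho2, and it assumes the user supplies the atomic masses. All function names are hypothetical.

```python
def ideal_mixture_density(w1, rho1, rho2):
    """Density of a binary ideal mixture from weight fraction w1 (0..1).

    Assumes volumes are additive: 1/rho = w1/rho1 + (1 - w1)/rho2.
    """
    w2 = 1.0 - w1
    return 1.0 / (w1 / rho1 + w2 / rho2)


def mass_from_volume(volume, w1, rho1, rho2):
    """Mass of a binary ideal alloy occupying the given total volume."""
    return ideal_mixture_density(w1, rho1, rho2) * volume


def atomic_to_weight_percent(at1, mass1, mass2):
    """Convert atomic percent of component 1 to weight percents (wt1, wt2)."""
    at2 = 100.0 - at1
    total = at1 * mass1 + at2 * mass2
    return 100.0 * at1 * mass1 / total, 100.0 * at2 * mass2 / total


if __name__ == "__main__":
    # Illustrative numbers only, in consistent units (g/cm^3, cm^3, g/mol).
    print(ideal_mixture_density(0.5, 2.0, 4.0))
    print(mass_from_volume(3.0, 0.5, 2.0, 4.0))
    print(atomic_to_weight_percent(50.0, 10.0, 30.0))
```

Units only need to be consistent: a density in g/cm^3 and a volume in cm^3 give a mass in grams.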
Interfacing Computer Aided Parallelization and Performance Analysis

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)

2003-01-01

When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data information in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example for how the interface helps within the program development cycle of a benchmark code.
Documentation of a multiple-technique computer program for plotting major-ion composition of natural waters

USGS Publications Warehouse

Briel, L.I.

1993-01-01

A computer program was written to produce 6 different types of water-quality diagrams--Piper, Stiff, pie, X-Y, boxplot, and Piper 3-D--from the same file of input data. The Piper 3-D diagram is a new method that projects values from the surface of a Piper plot into a triangular prism to show how variations in chemical composition can be related to variations in other water-quality variables. This program is an analytical tool to aid in the interpretation of data. This program is interactive, and the user can select from a menu the type of diagram to be produced and a large number of individual features. Alternatively, these choices can be specified in the data file, which provides a batch mode for running the program. The program does not display water-quality diagrams directly; plots are written to a file. Four different plot-file formats are available: device-independent metafiles, Adobe PostScript graphics files, and two Hewlett-Packard graphics language formats (7475 and 7586). An ASCII data-table file is also produced to document the computed values. This program is written in Fortran '77 and uses graphics subroutines from either the PRIOR AGTK or the DISSPLA graphics library. The program has been implemented on Prime series 50 and Data General Aviion computers within the USGS; portability to other computing systems depends on the availability of the graphics library.
LDRD final report on massively-parallel linear programming : the parPCx system.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

2005-02-01

This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). 
We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.
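As background to the terms in this abstract, a linear program is commonly written in the standard form below, and primal-dual interior-point methods typically reduce each iteration to a sparse symmetric system of the normal-equations type. The notation is the textbook convention, assumed here for illustration; the abstract itself does not spell out which formulation parPCx uses.

```latex
% Standard-form linear program (textbook notation, assumed):
\min_{x \in \mathbb{R}^n} \; c^{\mathsf T} x
\quad \text{subject to} \quad A x = b, \qquad x \ge 0 .

% Typical per-iteration linear system in a primal-dual interior-point
% method (normal equations), with s the dual slack variables:
\left( A D A^{\mathsf T} \right) \Delta y = r ,
\qquad D = \operatorname{diag}\!\left( x_1 / s_1, \dots, x_n / s_n \right) .
```

Because the sparsity pattern of the system matrix is determined by the constraint matrix A, how the rows and columns of A are distributed across processors governs the parallel cost of assembling and solving this system.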
Computer programs for thermodynamic and transport properties of hydrogen

NASA Technical Reports Server (NTRS)

Hall, W. J.; Mc Carty, R. D.; Roder, H. M.

1968-01-01

Computer program subroutines provide the thermodynamic and transport properties of hydrogen in tabular form. The programs provide 18 combinations of input and output variables. This program is written in FORTRAN 4 for use on the IBM 7044 or CDC 3600 computers.
Opening New Doors. Friends through Writing.

ERIC Educational Resources Information Center

British Columbia Buildings Corp., Victoria.

This publication contains 37 stories, vignettes, and poems written by participants in the British Columbia Buildings Corporation Workplace Language Program. The pieces center on family, work, people, and places and were written by people who decided to learn how to improve their language skills. They represent slices of employees' lives as they…
49 CFR 238.109 - Training, qualification, and designation program.

Code of Federal Regulations, 2010 CFR

2010-10-01

... that must be performed on each type of equipment that the railroad operates; (2) Develop written... railroad may request earlier application of these requirements upon written notification to FRA's Associate...) Adopt a training curriculum that includes classroom and “hands-on” lessons designed to impart the skills...
Design of an Online Curriculum Promoting Transformative Learning in Post Professional Doctoral Students

ERIC Educational Resources Information Center

Provident, Ingrid; Salls, Joyce; Dolhi, Cathy; Schreiber, Jodi; Mattila, Amy; Eckel, Emily

2015-01-01

Written reflections of 113 occupational therapy clinical doctoral students who graduated from an online program between 2007 and 2013 were analyzed for themes which reflected transformative learning and characteristics of curricular design which promoted transformative learning. Qualitative analyses of written reflections were performed. Several…

Writing Is the Funnest Thing: Teaching Creative Writing.

ERIC Educational Resources Information Center

Witter, Janet; Emberlin, Don

1973-01-01

This curriculum bulletin discusses a program teaching creative writing to fifth and sixth grade children in an attempt to improve the quality of written English. These children wrote briefly every day throughout the school year. Every area of the written language curriculum was covered. Each student wrote letters, reports, stories, editorial…
Programmable Applications: Interpreter Meets Interface

DTIC Science & Technology

1991-10-01

ics program written for professional architects and designers, and including a huge library of files written in AutoLisp, a "design-enriched" Lisp... AutoLisp procedures). The choice of Lisp as a base language is a happy one for AutoCAD; the application has clearly benefitted from the contribution
42 CFR 456.380 - Individual written plan of care.

Code of Federal Regulations, 2012 CFR

2012-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Intermediate Care Facilities Plan of Care § 456.380 Individual written plan of care. (a) Before admission to an ICF or before... designed to meet the objectives of the plan of care; (5) Plans for continuing care, including review and...
42 CFR 456.380 - Individual written plan of care.

Code of Federal Regulations, 2014 CFR

2014-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Intermediate Care Facilities Plan of Care § 456.380 Individual written plan of care. (a) Before admission to an ICF or before... designed to meet the objectives of the plan of care; (5) Plans for continuing care, including review and...
42 CFR 456.380 - Individual written plan of care.

Code of Federal Regulations, 2013 CFR

2013-10-01

... SERVICES (CONTINUED) MEDICAL ASSISTANCE PROGRAMS UTILIZATION CONTROL Utilization Control: Intermediate Care Facilities Plan of Care § 456.380 Individual written plan of care. (a) Before admission to an ICF or before... designed to meet the objectives of the plan of care; (5) Plans for continuing care, including review and...
78 FR 24437 - Meeting of the CJIS Advisory Policy Board

Federal Register 2010, 2011, 2012, 2013, 2014

2013-04-25

... and appropriate technical and operational issues related to the programs administered by the FBI's... the Designated Federal Officer (DFO). Any member of the public may file a written statement with the Board. Written comments shall be focused on the APB's current issues under discussion and may not be...
77 FR 58870 - Meeting of the CJIS Advisory Policy Board

Federal Register 2010, 2011, 2012, 2013, 2014

2012-09-24

... policy issues and appropriate technical and operational issues related to the programs administered by... with approval of the Designated Federal Officer (DFO). Any member of the public may file a written statement with the Board. Written comments shall be focused on the APB's current issues under discussion and...
78 FR 42761 - Agency Information Collection Activities; Comment Request; Program for International Student...

Federal Register 2010, 2011, 2012, 2013, 2014

2013-07-17

... lacking is an empirical linkage between PISA and measures of successful transition from high school to... the comment period will not be accepted. Written requests for information or comments submitted by... respondents, including through the use of information technology. Please note that written comments received...
78 FR 44600 - Comment Request for Information Collection for Veterans Retraining Assistance Program Participant...

Federal Register 2010, 2011, 2012, 2013, 2014

2013-07-24

...: Written comments must be submitted to the office listed in the addresses section below on or before September 23, 2013. ADDRESSES: Submit written comments to Andrew Ridgeway, Office of Workforce Investment... contacting the office listed above. SUPPLEMENTARY INFORMATION: I. Background ETA seeks extension without...
A high-speed linear algebra library with automatic parallelism

NASA Technical Reports Server (NTRS)

Boucher, Michael L.

1994-01-01

Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely limited, even though there are numerous computationally demanding programs that would significantly benefit from parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.
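The idea of a serial programming model over parallel internals can be illustrated with a minimal Python sketch. This is not DSSLIB's interface, which the abstract does not describe; the function name matvec and its workers parameter are hypothetical. The caller makes one ordinary blocking call, and the division of work among threads stays hidden inside the routine.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec(matrix, vector, workers=4):
    """Multiply a list-of-rows matrix by a vector.

    Hypothetical illustration only: the caller sees a plain serial
    function, while each row's dot product is handed to a thread pool.
    """
    def row_dot(row):
        # Dot product of one matrix row with the vector.
        return sum(a * b for a, b in zip(row, vector))

    # The pool is created and torn down inside the call, so no
    # parallel state leaks into the caller's code.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with rows.
        return list(pool.map(row_dot, matrix))


if __name__ == "__main__":
    print(matvec([[1, 2], [3, 4]], [1, 1]))
```

In CPython, threads running pure-Python arithmetic do not speed this computation up, so the sketch shows only the shape of the interface, not a performance gain.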
Dual and parallel postdoctoral training programs: implications for the osteopathic medical profession.

PubMed

Burkhart, Diane N; Lischka, Terri A

2011-04-01

Students in colleges of osteopathic medicine have several options when considering postdoctoral training programs. In addition to training programs approved solely by the American Osteopathic Association or accredited solely by the Accreditation Council for Graduate Medical Education (ACGME), students can pursue programs accredited by both organizations (ie, dually accredited programs) or osteopathic programs that occur side-by-side with ACGME programs (ie, parallel programs). In the present article, we report on the availability and growth of these 2 training options and describe their benefits and drawbacks for trainees and the osteopathic medical profession as a whole.
Compendium of student papers : 2008 Undergraduate Transportation Scholars Program.

DOT National Transportation Integrated Search

2008-08-01

This report is a compilation of research papers written by students participating in the 2008 Undergraduate Transportation Scholars Program. The ten-week summer program, now in its eighteenth year, provides undergraduate students in Civil Enginee...
Calculating the Flow Field in a Radial Turbine Scroll

NASA Technical Reports Server (NTRS)

Baskharone, E.; Abdallah, S.; Hamed, A.; Tabakoff, W.

1983-01-01

Set of two computer programs calculates flow field in radial turbine scroll. Programs represent improvement in analyzing flow in radial turbine scrolls and provide designer with tools for designing better scrolls. Programs written in FORTRAN IV.
Software For Least-Squares And Robust Estimation

NASA Technical Reports Server (NTRS)

Jeffreys, William H.; Fitzpatrick, Michael J.; McArthur, Barbara E.; McCartney, James

1990-01-01

GAUSSFIT computer program includes full-featured programming language facilitating creation of mathematical models solving least-squares and robust-estimation problems. Programming language designed to make it easy to specify complex reduction models. Written in 100 percent C language.
Compendium of student papers : 2009 undergraduate transportation engineering fellows program.

DOT National Transportation Integrated Search

2009-10-01

This report is a compilation of research papers written by students participating in the 2009 Undergraduate Transportation Scholars Program. The ten-week summer program, now in its nineteenth year, provides undergraduate students in Civil Enginee...
HWNOISE program users' guide

DOT National Transportation Integrated Search

1992-02-01

HWNOISE is a VNTSC-developed user-friendly program written in Microsoft Fortran version 4.01 for the IBM PC/AT and compatibles to analyze acoustic data. This program is an integral part of the Federal Highway Administration's Mobile Noise Dat...
HWINPUT program users' guide

DOT National Transportation Integrated Search

1992-02-01

HWINPUT is a VNTSC-developed user-friendly program written in Microsoft Fortran version 4.01 for the IBM PC/AT. This program is an integral part of the Federal Highway Administration's Mobile Noise Data Gathering and Analysis Laboratory and is ...
76 FR 72766 - Proposed Collection; Comment Request for Form 8952

Federal Register 2010, 2011, 2012, 2013, 2014

2011-11-25

... 8952, Application for Voluntary Classification Settlement Program. DATES: Written comments should... SUPPLEMENTARY INFORMATION: Title: Application for Voluntary Classification Settlement... Classification Settlement Program. To participate in the program, taxpayers must meet certain eligibility...
Software For Genetic Algorithms

NASA Technical Reports Server (NTRS)

Wang, Lui; Bayer, Steve E.

1992-01-01

SPLICER computer program is genetic-algorithm software tool used to solve search and optimization problems. Provides underlying framework and structure for building genetic-algorithm application program. Written in Think C.
A computer program to calculate zeroes, extrema, and interval integrals for the associated Legendre functions. [for estimation of bounds of truncation error in spherical harmonic expansion of geopotential]

NASA Technical Reports Server (NTRS)

Payne, M. H.

1973-01-01

A computer program is described for the calculation of the zeroes of the associated Legendre functions, Pnm, and their derivatives, for the calculation of the extrema of Pnm and also the integral between pairs of successive zeroes. The program has been run for all n,m from (0,0) to (20,20) and selected cases beyond that for n up to 40. Up to (20,20), the program (written in double precision) retains nearly full accuracy, and indications are that up to (40,40) there is still sufficient precision (4-5 decimal digits for a 54-bit mantissa) for estimation of various bounds and errors involved in geopotential modelling, the purpose for which the program was written.
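The kind of calculation described above can be sketched in a few lines of Python. This is not the program from the report: it assumes the unnormalized Pnm with the Condon-Shortley phase, evaluated by the standard upward recurrence in degree, and it swaps in a grid scan with bisection for the zeroes and composite Simpson quadrature for the integral between successive zeroes. All function names and the samples, tol, and panels parameters are illustrative assumptions.

```python
import math


def assoc_legendre(n, m, x):
    """Unnormalized P_n^m(x) for 0 <= m <= n and |x| <= 1."""
    # Seed value: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    pmm = 1.0
    if m > 0:
        root = math.sqrt((1.0 - x) * (1.0 + x))
        odd = 1.0
        for _ in range(m):
            pmm *= -odd * root
            odd += 2.0
    if n == m:
        return pmm
    # P_{m+1}^m(x) = x (2m+1) P_m^m(x)
    pmmp1 = x * (2 * m + 1) * pmm
    # (l-m) P_l^m = x (2l-1) P_{l-1}^m - (l+m-1) P_{l-2}^m
    for l in range(m + 2, n + 1):
        pll = (x * (2 * l - 1) * pmmp1 - (l + m - 1) * pmm) / (l - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1


def interior_zeros(n, m, samples=2000, tol=1e-13):
    """Zeroes of P_n^m in the open interval (-1, 1), in ascending order."""
    # Offset grid so that x = 0 (a zero whenever n - m is odd) is not a node.
    xs = [-1.0 + 2.0 * (i + 0.5) / samples for i in range(samples)]
    zeros = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = assoc_legendre(n, m, a), assoc_legendre(n, m, b)
        if fa == 0.0:
            zeros.append(a)
        elif fa * fb < 0.0:
            # Sign change: refine the bracket by bisection.
            while b - a > tol:
                mid = 0.5 * (a + b)
                fm = assoc_legendre(n, m, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
    return zeros


def integral_between(n, m, a, b, panels=200):
    """Composite Simpson estimate of the integral of P_n^m over [a, b].

    panels must be even.
    """
    h = (b - a) / panels
    total = assoc_legendre(n, m, a) + assoc_legendre(n, m, b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * assoc_legendre(n, m, a + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    z = interior_zeros(20, 5)
    print(len(z), "interior zeroes of P(20,5)")
    print(integral_between(20, 5, z[0], z[1]))
```

The grid scan relies on the zeroes being farther apart than the grid spacing, so samples would need to grow with n. Extrema could be located the same way by applying the bisection step to a derivative of Pnm.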
