Effective Vectorization with OpenMP 4.5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Joseph N.; Hernandez, Oscar R.; Lopez, Matthew Graham
This paper describes how the Single Instruction Multiple Data (SIMD) model and its extensions in OpenMP work, and how these are implemented in different compilers. Modern processors are highly parallel computational machines which often include multiple processors capable of executing several instructions in parallel. Understanding SIMD and executing instructions in parallel allows the processor to achieve higher performance without increasing the power required to run it. SIMD instructions can significantly reduce the runtime of code by executing a single operation on large groups of data. The SIMD model is so integral to the processor's potential performance that, if SIMD is not utilized, less than half of the processor is ever actually used. Unfortunately, using SIMD instructions is a challenge in higher level languages because most programming languages do not have a way to describe them. Most compilers are capable of vectorizing code by using the SIMD instructions, but there are many code features important for SIMD vectorization that the compiler cannot determine at compile time. OpenMP attempts to solve this by extending the C/C++ and Fortran programming languages with compiler directives that express SIMD parallelism. OpenMP is used to pass hints to the compiler about the code to be executed in SIMD. This is a key resource for making optimized code, but it does not change whether or not the code can use SIMD operations. However, in many cases critical functions are limited by a poor understanding of how SIMD instructions are actually implemented, as SIMD can be implemented through vector instructions or simultaneous multi-threading (SMT). We have found that it is often the case that code cannot be vectorized, or is vectorized poorly, because the programmer does not have sufficient knowledge of how SIMD instructions work.
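One code feature a compiler often cannot prove at compile time is the distance of a loop-carried dependence. The sketch below is a plain Python model, not OpenMP itself: `scalar_loop` and `simd_loop` are hypothetical names, and `width` stands in for the number of lanes. It suggests why a directive can only assert that vectorization is safe, not make it safe: when the dependence distance is smaller than the chunk width, the chunked loop computes something different.

```python
def scalar_loop(a, offset):
    """Reference semantics: a[i] = a[i - offset] + 1, one iteration at a time."""
    a = list(a)
    for i in range(offset, len(a)):
        a[i] = a[i - offset] + 1
    return a


def simd_loop(a, offset, width):
    """The same loop executed in chunks of `width` lanes (a model, not real SIMD).

    Each chunk reads all of its inputs first and only then writes its
    results, mimicking a vector load followed by a vector store.
    """
    a = list(a)
    for start in range(offset, len(a), width):
        stop = min(start + width, len(a))
        loaded = [a[i - offset] for i in range(start, stop)]  # "vector load"
        for lane, i in enumerate(range(start, stop)):         # "vector store"
            a[i] = loaded[lane] + 1
    return a


data = [0] * 8
# Dependence distance 4 with 4 lanes: chunked and scalar results agree.
print(simd_loop(data, offset=4, width=4) == scalar_loop(data, offset=4))
# Dependence distance 1 with 4 lanes: the chunked loop reads stale values.
print(simd_loop(data, offset=1, width=4) == scalar_loop(data, offset=1))
```

In OpenMP terms, the `safelen` clause of the `simd` construct is the mechanism for telling the compiler how many iterations may safely run together in a case like the first one.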
Optimized scalar promotion with load and splat SIMD instructions
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2013-10-29
Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
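To make the term concrete: a splat replicates one scalar into every element of a vector so that it can take part in lane-wise operations. The following is a minimal sketch with Python lists standing in for vector registers. All function names are hypothetical, and it does not attempt to reproduce the placement analysis the abstract describes.

```python
def splat(scalar, width):
    """Replicate a scalar into every lane of a vector of the given width."""
    return [scalar] * width


def vector_mul(x, y):
    """Lane-wise multiply of two equal-length vectors."""
    return [a * b for a, b in zip(x, y)]


def vector_add(x, y):
    """Lane-wise add of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]


def scale_and_shift(vec, scale, shift):
    """Compute vec * scale + shift with the scalars promoted to vectors."""
    scale_v = splat(scale, len(vec))   # scalar promoted once, reused per lane
    shift_v = splat(shift, len(vec))
    return vector_add(vector_mul(vec, scale_v), shift_v)


print(scale_and_shift([1, 2, 3, 4], scale=2, shift=10))
```

Where a splat is placed matters because each one costs an instruction: promoting a scalar once and reusing the resulting vector is cheaper than splatting it again at every use.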
Optimized scalar promotion with load and splat SIMD instructions
Eichenberger, Alexandre E [Chappaqua, NY]; Gschwind, Michael K [Chappaqua, NY]; Gunnels, John A [Yorktown Heights, NY]
2012-08-28
Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
Evaluating local indirect addressing in SIMD processors
NASA Technical Reports Server (NTRS)
Middleton, David; Tomboulian, Sherryl
1989-01-01
In the design of parallel computers, there exists a tradeoff between the number and power of individual processors. The single instruction stream, multiple data stream (SIMD) model of parallel computers lies at one extreme of the resulting spectrum. The available hardware resources are devoted to creating the largest possible number of processors, and consequently each individual processor must use the fewest possible resources. Disagreement exists as to whether SIMD processors should be able to generate addresses individually into their local data memory, or all processors should access the same address. The tradeoff is examined between the increased capability and the reduced number of processors that occurs in this single instruction stream, multiple, locally addressed, data (SIMLAD) model. Factors are assembled that affect this design choice, and the SIMLAD model is compared with the bare SIMD and the MIMD models.
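The difference between the two addressing schemes can be sketched in a few lines. This is an illustrative model only: each inner list stands for one processor's local memory, and the function names are hypothetical.

```python
def load_shared_address(memories, address):
    """Bare SIMD: one broadcast address indexes every processor's local memory."""
    return [mem[address] for mem in memories]


def load_local_addresses(memories, addresses):
    """SIMLAD: each processor supplies its own address into its local memory."""
    return [mem[addr] for mem, addr in zip(memories, addresses)]


# Three processors, each with three words of local memory.
memories = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]
print(load_shared_address(memories, 1))
print(load_local_addresses(memories, [0, 2, 1]))
```

The second form is more capable (it supports table lookups and data-dependent indexing), but each processor then needs its own address-generation hardware, which is the resource cost the abstract weighs against processor count.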
Gschwind, Michael K [Chappaqua, NY]
2011-03-01
Mechanisms for implementing a floating point only single instruction multiple data instruction set architecture are provided. A processor is provided that comprises an issue unit, an execution unit coupled to the issue unit, and a vector register file coupled to the execution unit. The execution unit has logic that implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA). The floating point vector registers of the vector register file store both scalar and floating point values as vectors having a plurality of vector elements. The processor may be part of a data processing system.
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading.
Rahn, René; Budach, Stefan; Costanza, Pascal; Ehrhardt, Marcel; Hancox, Jonny; Reinert, Knut
2018-05-03
Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable for a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module by adding two layers of thread-level parallelization, where we a) distribute many independent alignments on multiple threads and b) inherently parallelize a single alignment computation using a work stealing approach producing a dynamic wavefront progressing along the minor diagonal. We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and use cases. The instruction set AVX512-BW (Byte and Word), available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than executing them with our previous sequential alignment module. The module is programmed in C++ using the SeqAn (Reinert et al., 2017) library and distributed with version 2.4 under the BSD license. We support SSE4, AVX2, AVX512 instructions and included UME::SIMD, a SIMD-instruction wrapper library, to extend our module for further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms.
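The inter-sequence layout can be illustrated with a small model in which every dynamic programming cell holds one value per alignment, so that cell (i, j) is updated for all alignments at once. The sketch below is not SeqAn's implementation. It assumes all pairs in a batch have the same shape, uses a simple global alignment with a linear gap cost, and the function name and scoring values are hypothetical.

```python
def batched_global_scores(pairs, match=1, mismatch=-1, gap=-1):
    """Score many same-shaped global alignments in lockstep, one per lane."""
    lanes = len(pairs)
    rows = len(pairs[0][0])
    cols = len(pairs[0][1])
    assert all(len(a) == rows and len(b) == cols for a, b in pairs)

    # prev[j] holds one value per lane for DP cell (i - 1, j).
    prev = [[j * gap] * lanes for j in range(cols + 1)]
    for i in range(1, rows + 1):
        curr = [[i * gap] * lanes]
        for j in range(1, cols + 1):
            cell = []
            # This inner loop over lanes models a single vector operation.
            for lane, (a, b) in enumerate(pairs):
                sub = match if a[i - 1] == b[j - 1] else mismatch
                cell.append(max(prev[j - 1][lane] + sub,   # diagonal
                                prev[j][lane] + gap,       # gap in b
                                curr[j - 1][lane] + gap))  # gap in a
            curr.append(cell)
        prev = curr
    return prev[cols]


print(batched_global_scores([("ACGT", "ACGT"), ("ACGT", "AGGT"), ("AAAA", "TTTT")]))
```

Because the lanes belong to different alignments, there is no dependence between them, which is what makes this layout straightforward to map onto SIMD registers.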
Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil
2013-04-04
The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.
NASA Astrophysics Data System (ADS)
Stone, Christopher P.; Alferman, Andrew T.; Niemeyer, Kyle E.
2018-05-01
Accurate and efficient methods for solving stiff ordinary differential equations (ODEs) are a critical component of turbulent combustion simulations with finite-rate chemistry. The ODEs governing the chemical kinetics at each mesh point are decoupled by operator-splitting allowing each to be solved concurrently. An efficient ODE solver must then take into account the available thread and instruction-level parallelism of the underlying hardware, especially on many-core coprocessors, as well as the numerical efficiency. A stiff Rosenbrock and a nonstiff Runge-Kutta ODE solver are both implemented using the single instruction, multiple thread (SIMT) and single instruction, multiple data (SIMD) paradigms within OpenCL. Both methods solve multiple ODEs concurrently within the same instruction stream. The performance of these parallel implementations was measured on three chemical kinetic models of increasing size across several multicore and many-core platforms. Two separate benchmarks were conducted to clearly determine any performance advantage offered by either method. The first benchmark measured the run-time of evaluating the right-hand-side source terms in parallel and the second benchmark integrated a series of constant-pressure, homogeneous reactors using the Rosenbrock and Runge-Kutta solvers. The right-hand-side evaluations with SIMD parallelism on the host multicore Xeon CPU and many-core Xeon Phi co-processor performed approximately three times faster than the baseline multithreaded C++ code. The SIMT parallel model on the host and Phi was 13%-35% slower than the baseline while the SIMT model on the NVIDIA Kepler GPU provided approximately the same performance as the SIMD model on the Phi. The runtimes for both ODE solvers decreased significantly with the SIMD implementations on the host CPU (2.5-2.7 ×) and Xeon Phi coprocessor (4.7-4.9 ×) compared to the baseline parallel code. 
The SIMT implementations on the GPU ran 1.5-1.6 times faster than the baseline multithreaded CPU code; however, this was significantly slower than the SIMD versions on the host CPU or the Xeon Phi. The performance difference between the three platforms was attributed to thread divergence caused by the adaptive step-sizes within the ODE integrators. Analysis showed that the wider vector width of the GPU incurs a higher level of divergence than the narrower Sandy Bridge or Xeon Phi. The significant performance improvement provided by the SIMD parallel strategy motivates further research into more ODE solver methods that are both SIMD-friendly and computationally efficient.
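The link between adaptive step sizes and divergence can be made concrete with a toy model: if lanes advance in lockstep, a group of lanes is busy until its slowest member finishes. The sketch below is an assumption-laden illustration, with a hypothetical function name and made-up step counts, not the analysis performed in the paper.

```python
def simd_efficiency(steps_per_lane, width):
    """Fraction of issued lane-steps that do useful work under lockstep execution.

    steps_per_lane[k] is the number of integrator steps ODE k needs;
    consecutive ODEs are packed into groups of `width` lanes.
    """
    useful = sum(steps_per_lane)
    issued = 0
    for start in range(0, len(steps_per_lane), width):
        group = steps_per_lane[start:start + width]
        issued += width * max(group)  # every lane waits for the slowest one
    return useful / issued


# Hypothetical step counts: one stiff cell needs far more steps than the rest.
steps = [4, 4, 4, 4, 10, 4, 4, 4]
for width in (1, 4, 8):
    print(width, round(simd_efficiency(steps, width), 3))
```

In this model a single expensive ODE idles more lanes as the group gets wider, which is consistent with the abstract's observation that the wider vector width incurs more divergence.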
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1994-01-01
In a computer having a large number of single-instruction multiple data (SIMD) processors, each of the SIMD processors has two sets of three individual processor elements controlled by a master control unit and interconnected among a plurality of register file units where data is stored. The register files input and output data in synchronism with a minor cycle clock under control of two slave control units controlling the register file units connected to respective ones of the two sets of processor elements. Depending upon which ones of the register file units are enabled to store or transmit data during a particular minor clock cycle, the processor elements within an SIMD processor are connected in rings or in pipeline arrays, and may exchange data with the internal bus or with neighboring SIMD processors through interface units controlled by respective ones of the two slave control units.
A flexible algorithm for calculating pair interactions on SIMD architectures
NASA Astrophysics Data System (ADS)
Páll, Szilárd; Hess, Berk
2013-12-01
Calculating interactions or correlations between pairs of particles is typically the most time-consuming task in particle simulation or correlation analysis. Straightforward implementations using a double loop over particle pairs have traditionally worked well, especially since compilers usually do a good job of unrolling the inner loop. In order to reach high performance on modern CPU and accelerator architectures, single-instruction multiple-data (SIMD) parallelization has become essential. Avoiding memory bottlenecks is also increasingly important and requires reducing the ratio of memory to arithmetic operations. Moreover, when pairs only interact within a certain cut-off distance, good SIMD utilization can only be achieved by reordering input and output data, which quickly becomes a limiting factor. Here we present an algorithm for SIMD parallelization based on grouping a fixed number of particles, e.g. 2, 4, or 8, into spatial clusters. Calculating all interactions between particles in a pair of such clusters improves data reuse compared to the traditional scheme and results in a more efficient SIMD parallelization. Adjusting the cluster size allows the algorithm to map to SIMD units of various widths. This flexibility not only enables fast and efficient implementation on current CPUs and accelerator architectures like GPUs or Intel MIC, but it also makes the algorithm future-proof. We present the algorithm with an application to molecular dynamics simulations, where we can also make use of the effective buffering the method introduces.
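A rough sketch of the cluster-pair idea follows. It is a simplified model rather than the published algorithm: clusters are formed in input order instead of spatially, the pair term is passed in as a plain function, and all names are hypothetical. The point it illustrates is that every particle pair between two clusters is evaluated, with the cut-off applied as a mask instead of by reordering data.

```python
import math


def make_clusters(particles, size):
    """Group particles into fixed-size clusters (here simply in input order)."""
    return [particles[k:k + size] for k in range(0, len(particles), size)]


def cluster_pair_sum(cluster_i, cluster_j, cutoff, pair_term):
    """Evaluate all interactions between two clusters, masking by cut-off."""
    total = 0.0
    for p in cluster_i:
        for q in cluster_j:  # fixed-size block: the part that maps to lanes
            r = math.dist(p, q)
            within = 0.0 < r < cutoff          # mask instead of a branchy list
            total += pair_term(r) if within else 0.0
    return total


particles = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5)]
c0, c1 = make_clusters(particles, size=2)
# With a constant pair term the sum just counts pairs inside the cut-off.
print(cluster_pair_sum(c0, c1, cutoff=2.0, pair_term=lambda r: 1.0))
```

Choosing the cluster size to match the SIMD width is what lets the same scheme target units of different widths, at the price of computing some masked-out pairs.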
NASA Technical Reports Server (NTRS)
Dorband, John E.
1987-01-01
Generating graphics to faithfully represent information can be a computationally intensive task. A way of using the Massively Parallel Processor to generate images by ray tracing is presented. This technique uses sort computation, a method of performing generalized routing interspersed with computation on a single-instruction-multiple-data (SIMD) computer.
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
NASA Astrophysics Data System (ADS)
Vincenti, H.; Lobet, M.; Lehe, R.; Sasanka, R.; Vay, J.-L.
2017-01-01
In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈ 20 pJ/word on-die to ≈ 10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute node that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performance on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2, 256-bit wide data registers). Results show a speed-up of 2× to 2.5× in double precision for particle shape factors of orders 1-3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512-bit register lengths (8 doubles/16 singles).
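The lane counts quoted at the end of the abstract follow from dividing the register width by the element width. A minimal worked example, with a hypothetical helper name:

```python
def lanes(register_bits, element_bits):
    """Number of elements of a given width that fit in one SIMD register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


print(lanes(512, 64))  # doubles per 512-bit register
print(lanes(512, 32))  # singles per 512-bit register
print(lanes(256, 64))  # doubles per 256-bit AVX2 register
```

By the same arithmetic a 256-bit AVX2 register holds 4 doubles, so the lane count alone allows at most a fourfold gain in double precision on Haswell; the reported speed-up of roughly 2 to 2.5 sits below that bound.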
Rapid prototyping and evaluation of programmable SIMD SDR processors in LISA
NASA Astrophysics Data System (ADS)
Chen, Ting; Liu, Hengzhu; Zhang, Botao; Liu, Dongpei
2013-03-01
With the development of international wireless communication standards, the computational requirements for baseband signal processors are increasing. Time-to-market pressure makes it impossible to completely redesign new processors for the evolving standards. Owing to their high flexibility and low power, software defined radio (SDR) digital signal processors have been proposed as a promising technology to replace traditional ASIC and FPGA designs. In addition, computation-intensive functions process large amounts of parallel data, which fosters the development of single instruction multiple data (SIMD) architectures in SDR platforms. A new way must therefore be found to prototype SDR processors efficiently. In this paper we present a bit- and cycle-accurate model of programmable SIMD SDR processors in the machine description language LISA. LISA is a language for instruction set architectures that allows rapid modeling at the architectural level. To evaluate our proposed processor, three common baseband functions (FFT, FIR digital filter, and matrix multiplication) have been mapped onto the SDR platform. Analytical results showed that the SDR processor achieved a maximum performance boost of 47.1% relative to the competing processor.
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
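Bit-serial arithmetic can be sketched as follows: each processing element owns a 1-bit full adder and a carry flag, and a multi-bit addition is performed one bit position per step while every element executes the same step. This is an illustrative model with hypothetical names, not the design described in the report.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def simd_bit_serial_add(xs, ys, bits):
    """Add xs[k] + ys[k] modulo 2**bits on every processing element in lockstep."""
    results = [0] * len(xs)
    carries = [0] * len(xs)
    for k in range(bits):              # one broadcast instruction per bit, LSB first
        for pe in range(len(xs)):      # all processing elements do the same step
            a = (xs[pe] >> k) & 1
            b = (ys[pe] >> k) & 1
            s, carries[pe] = full_adder(a, b, carries[pe])
            results[pe] |= s << k
    return results


print(simd_bit_serial_add([5, 15, 7], [3, 1, 8], bits=4))
```

An n-bit operation takes n steps on each element, so throughput comes from the number of elements working at once and not from the speed of any single one.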
Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors
NASA Astrophysics Data System (ADS)
Yi, Hongsuk
2014-03-01
Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Tersoff potentials for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism, combining tightly coupled thread-level and task-level parallelism with 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementations on Xeon Phi is apparently superior to that on the x86 CPU architecture.
Characterization of robotics parallel algorithms and mapping onto a reconfigurable SIMD machine
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Lin, C. T.
1989-01-01
The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems in the control of robot manipulators. Efficient parallel algorithms for these computations are discussed and analyzed. Their characteristics are identified and a scheme on the mapping of these algorithms to a reconfigurable parallel architecture is presented. Based on the characteristics including type of parallelism, degree of parallelism, uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is shown that most of the algorithms for robotic computations possess highly regular properties and some common structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single-instruction-stream multiple-data-stream (SIMD) computer with reconfigurable interconnection network. The model of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic procedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing sequence of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are illustrated and discussed.
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) type computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) type parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
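The tree construction (sort, then divide recursively along x, y and z in turn) can be sketched sequentially. This is a minimal illustration under stated assumptions: the particle count is a power of two, Python's built-in sort stands in for the parallel sorts, and the tuple-based node layout and function names are hypothetical.

```python
def build_tree(particles, axis=0):
    """Build a balanced binary tree by sorting and halving along cycling axes.

    particles: list of (x, y, z) tuples whose length is a power of two.
    Returns ("leaf", particle) or ("node", left_subtree, right_subtree).
    """
    if len(particles) == 1:
        return ("leaf", particles[0])
    ordered = sorted(particles, key=lambda p: p[axis])
    half = len(ordered) // 2
    next_axis = (axis + 1) % 3  # x -> y -> z -> x ...
    return ("node",
            build_tree(ordered[:half], next_axis),
            build_tree(ordered[half:], next_axis))


def leaves(tree):
    """Collect the particles stored at the leaves, left to right."""
    if tree[0] == "leaf":
        return [tree[1]]
    return leaves(tree[1]) + leaves(tree[2])


def depth(tree):
    """Number of levels below this node."""
    if tree[0] == "leaf":
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))


# Eight particles on the corners of a unit cube.
corners = [(float(i % 2), float((i // 2) % 2), float(i // 4)) for i in range(8)]
tree = build_tree(corners)
print(depth(tree), len(leaves(tree)))
```

Halving a sorted list at every level is what keeps the tree completely balanced: with 2^k particles every leaf sits at depth k and the lowest internal nodes each pair exactly two particles.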
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™
Gomes, Jeremias M.; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H.
2016-01-01
We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP’s irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations. PMID:27298591
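The wavefront pattern can be illustrated with a sequential, one-dimensional version of morphological reconstruction: a queue holds the active elements, and an element that raises a neighbour's value pushes that neighbour onto the queue. This is a sketch of the general queue-based pattern only. It is neither the original atomic-based parallel algorithm nor the SIMD variant proposed in the paper, and the function name is hypothetical.

```python
from collections import deque


def morphological_reconstruction(marker, mask):
    """Grey-scale reconstruction by dilation on a 1-D signal.

    The marker is clamped to the mask, then values spread to neighbours
    for as long as they can be raised without exceeding the mask.
    """
    out = [min(m, k) for m, k in zip(marker, mask)]
    queue = deque(range(len(out)))          # initially every element is active
    while queue:
        p = queue.popleft()
        for q in (p - 1, p + 1):            # 1-D neighbourhood
            if 0 <= q < len(out):
                candidate = min(out[p], mask[q])
                if out[q] < candidate:      # propagation condition
                    out[q] = candidate
                    queue.append(q)         # q joins the wavefront
    return out


mask = [1, 3, 3, 1, 5, 5, 1]
marker = [0, 3, 0, 0, 0, 0, 0]
print(morphological_reconstruction(marker, mask))
```

The update of `out[q]` is the step that needs protection when several wavefront elements run in parallel, which is where the atomic instructions mentioned in the abstract come in.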
Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™.
Gomes, Jeremias M; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H
2015-10-01
We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP's irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations.
Implementation of a parallel unstructured Euler solver on the CM-5
NASA Technical Reports Server (NTRS)
Morano, Eric; Mavriplis, D. J.
1995-01-01
An efficient unstructured 3D Euler solver is parallelized on a Thinking Machine Corporation Connection Machine 5, distributed memory computer with vectoring capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.
VASP-4096: a very high performance programmable device for digital media processing applications
NASA Astrophysics Data System (ADS)
Krikelis, Argy
2001-03-01
Over the past few years, technology drivers for microprocessors have changed significantly. Media data delivery and processing (such as telecommunications, networking, video processing, speech recognition and 3D graphics) is increasing in importance and will soon dominate the processing cycles consumed in computer-based systems. This paper presents the architecture of the VASP-4096 processor. VASP-4096 provides high media performance with low energy consumption by integrating associative SIMD parallel processing with embedded microprocessor technology. The major innovation in the VASP-4096 is the integration of thousands of processing units in a single chip that are capable of supporting software-programmable high-performance mathematical functions as well as abstract data processing. In addition to 4096 processing units, VASP-4096 integrates on a single chip a RISC controller that is an implementation of the SPARC architecture, 128 Kbytes of Data Memory, and I/O interfaces. The SIMD processing in VASP-4096 implements the ASProCore architecture, which is a proprietary implementation of SIMD processing, and operates at 266 MHz with program instructions issued by the RISC controller. The device also integrates a 64-bit synchronous main memory interface operating at 133 MHz (double-data rate), and a 64-bit 66 MHz PCI interface. VASP-4096, compared with other processor architectures that support media processing, offers true performance scalability, support for deterministic and non-deterministic data processing on a single device, and software programmability that can be reused in future chip generations.
A scalable SIMD digital signal processor for high-quality multifunctional printer systems
NASA Astrophysics Data System (ADS)
Kang, Hyeong-Ju; Choi, Yongwoo; Kim, Kimo; Park, In-Cheol; Kim, Jung-Wook; Lee, Eul-Hwan; Gahang, Goo-Soo
2005-01-01
This paper describes a high-performance scalable SIMD digital signal processor (DSP) developed for multifunctional printer systems. The DSP supports a variable number of datapaths to cover a wide range of performance and maintain a RISC-like pipeline structure. Many special instructions suitable for image processing algorithms are included in the DSP. Quad/dual instructions are introduced for 8-bit or 16-bit data, and bit-field extraction/insertion instructions are supported to process various data types. Conditional instructions are supported to deal with complex relative conditions efficiently. In addition, an intelligent DMA block is integrated to align data in the course of data reading. Experimental results show that the proposed DSP outperforms a high-end printer-system DSP by at least two times.
Highly-Parallel, Highly-Compact Computing Structures Implemented in Nanotechnology
NASA Technical Reports Server (NTRS)
Crawley, D. G.; Duff, M. J. B.; Fountain, T. J.; Moffat, C. D.; Tomlinson, C. D.
1995-01-01
In this paper, we describe work in which we are evaluating how the evolving properties of nano-electronic devices could best be utilized in highly parallel computing structures. Because of their combination of high performance, low power, and extreme compactness, such structures would have obvious applications in spaceborne environments, both for general mission control and for on-board data analysis. However, the anticipated properties of nano-devices mean that the optimum architecture for such systems is by no means certain. Candidates include single instruction multiple datastream (SIMD) arrays, neural networks, and multiple instruction multiple datastream (MIMD) assemblies.
A sweep algorithm for massively parallel simulation of circuit-switched networks
NASA Technical Reports Server (NTRS)
Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.
1992-01-01
A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel IPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
Integration, Development and Performance of the 500 TFLOPS Heterogeneous Cluster (Condor)
2012-08-01
PlayStation 3 for High Performance Cluster Computing” LAPACK Working Note 185, 2007. [4] Feng, W., X. Feng, and R. Ge, “Green Supercomputing Comes of...CONFERENCE PAPER (Post Print) 3. DATES COVERED (From - To) JUN 2010 – MAY 2013 4. TITLE AND SUBTITLE INTEGRATION, DEVELOPMENT AND PERFORMANCE OF...and streaming processing; the PlayStation 3 uses the IBM Cell BE processor, which adopts the multi-processor, single-instruction-multiple-data (SIMD
NASA Astrophysics Data System (ADS)
Tanikawa, Ataru; Yoshikawa, Kohji; Okamoto, Takashi; Nitadori, Keigo
2012-02-01
We present a high-performance N-body code for self-gravitating collisional systems accelerated with the aid of a new SIMD instruction set extension of the x86 architecture: Advanced Vector eXtensions (AVX), an enhanced version of the Streaming SIMD Extensions (SSE). With one processor core of Intel Core i7-2600 processor (8 MB cache and 3.40 GHz) based on Sandy Bridge micro-architecture, we implemented a fourth-order Hermite scheme with individual timestep scheme (Makino and Aarseth, 1992), and achieved the performance of ~20 giga floating point number operations per second (GFLOPS) for double-precision accuracy, which is two times and five times higher than that of the previously developed code implemented with the SSE instructions (Nitadori et al., 2006b), and that of a code implemented without any explicit use of SIMD instructions with the same processor core, respectively. We have parallelized the code by using the so-called NINJA scheme (Nitadori et al., 2006a), and achieved ~90 GFLOPS for a system containing more than N = 8192 particles with 8 MPI processes on four cores. We expect to achieve about 10 tera FLOPS (TFLOPS) for a self-gravitating collisional system with N ~ 10^5 on massively parallel systems with at most 800 cores with Sandy Bridge micro-architecture. This performance will be comparable to that of Graphic Processing Unit (GPU) cluster systems, such as the one with about 200 Tesla C1070 GPUs (Spurzem et al., 2010). This paper offers an alternative to collisional N-body simulations with GRAPEs and GPUs.
Scan line graphics generation on the massively parallel processor
NASA Technical Reports Server (NTRS)
Dorband, John E.
1988-01-01
Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.
A VLSI chip set for real time vector quantization of image sequences
NASA Technical Reports Server (NTRS)
Baker, Richard L.
1989-01-01
The architecture and implementation of a VLSI chip set that vector quantizes (VQ) image sequences in real time is described. The chip set forms a programmable Single-Instruction, Multiple-Data (SIMD) machine which can implement various vector quantization encoding structures. Its VQ codebook may contain an unlimited number of codevectors, N, having dimension up to K = 64. Under a weighted least squared error criterion, the engine locates at video rates the best code vector in full-searched or large tree-searched VQ codebooks. The ability to manipulate tree-structured codebooks, coupled with parallelism and pipelining, permits searches in as few as O(log N) cycles. A full codebook search results in O(N) performance, compared to O(KN) for a Single-Instruction, Single-Data (SISD) machine. With this VLSI chip set, an entire video code can be built on a single board that permits real-time experimentation with very large codebooks.
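The O(KN) cost quoted for a sequential machine can be seen in a plain full-codebook search. The following is only a minimal scalar sketch under assumed names (`nearest_codevector`, the toy codebook and the weights are hypothetical and not taken from the chip set). The outer loop visits N codevectors and the inner sum has K terms; a SIMD engine that evaluates the K terms in parallel leaves only the O(N) loop.

```python
def nearest_codevector(x, codebook, weights):
    """Full search: return (index, cost) of the codevector with the
    smallest weighted squared error against the input vector x."""
    best_index, best_cost = -1, float("inf")
    for index, code in enumerate(codebook):  # N candidates
        # K multiply-accumulate terms per candidate
        cost = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, code))
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index, best_cost


# Hypothetical toy data: N = 3 codevectors of dimension K = 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
index, cost = nearest_codevector([0.9, 1.2], codebook, [1.0, 1.0])
print(index, cost)
```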
NASA Astrophysics Data System (ADS)
Tanikawa, Ataru; Yoshikawa, Kohji; Nitadori, Keigo; Okamoto, Takashi
2013-02-01
We have developed a numerical software library for collisionless N-body simulations named "Phantom-GRAPE" which highly accelerates force calculations among particles by use of a new SIMD instruction set extension to the x86 architecture, Advanced Vector eXtensions (AVX), an enhanced version of the Streaming SIMD Extensions (SSE). In our library, not only Newton's forces, but also central forces with an arbitrary shape f(r), which has a finite cutoff radius rcut (i.e. f(r)=0 at r>rcut), can be quickly computed. In computing such central forces with an arbitrary force shape f(r), we refer to a pre-calculated look-up table. We also present a new scheme to create the look-up table whose binning is optimal to keep good accuracy in computing forces and whose size is small enough to avoid cache misses. Using an Intel Core i7-2600 processor, we measure the performance of our library for both Newton's forces and the arbitrarily shaped central forces. In the case of Newton's forces, we achieve 2×10^9 interactions per second with one processor core (or 75 GFLOPS if we count 38 operations per interaction), which is 20 times higher than the performance of an implementation without any explicit use of SIMD instructions, and 2 times higher than that with the SSE instructions. With four processor cores, we obtain the performance of 8×10^9 interactions per second (or 300 GFLOPS). In the case of the arbitrarily shaped central forces, we can calculate 1×10^9 and 4×10^9 interactions per second with one and four processor cores, respectively. The performance with one processor core is 6 times and 2 times higher than those of the implementations without any use of SIMD instructions and with the SSE instructions. These performances depend only weakly on the number of particles, irrespective of the force shape. This is in good contrast with the fact that the performance of force calculations accelerated by graphics processing units (GPUs) depends strongly on the number of particles.
This substantially weak dependence of the performance on the number of particles is well suited to collisionless N-body simulations, since these simulations are usually performed with sophisticated N-body solvers such as Tree- and TreePM-methods combined with an individual timestep scheme. We conclude that collisionless N-body simulations accelerated with our library have a significant advantage over those accelerated by GPUs, especially on massively parallel environments.
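To illustrate the idea of a pre-calculated look-up table for a force shape with a finite cutoff, here is a small sketch. It assumes uniform bins in r^2, which is a simplification chosen for this example; the library's own binning scheme is described as optimized and is not reproduced here. The function names and the sample force shape are hypothetical.

```python
def build_table(force, r_cut, bins):
    """Tabulate force(r) at the centres of uniform bins in r^2 (assumed layout)."""
    step = r_cut * r_cut / bins
    table = [force(((i + 0.5) * step) ** 0.5) for i in range(bins)]
    return table, step


def lookup(table, step, r2):
    """Return the tabulated force for squared distance r2, or 0 beyond the cutoff."""
    i = int(r2 / step)
    return table[i] if i < len(table) else 0.0


# Hypothetical force shape: inverse-square, cut off at r_cut = 2.
table, step = build_table(lambda r: 1.0 / (r * r), r_cut=2.0, bins=1024)
print(lookup(table, step, 1.5 ** 2), lookup(table, step, 5.0))
```

Indexing by r^2 avoids a square root per interaction, and a small table is more likely to stay in cache, which is the trade-off the abstract mentions between accuracy and table size.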
Introducing difference recurrence relations for faster semi-global alignment of long sequences.
Suzuki, Hajime; Kasahara, Masahiro
2018-02-19
The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .
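The reason differences fit in narrow integers can be checked on a small case. The sketch below is a simplification, not the algorithm of the paper or the libgaba API: it uses a global alignment matrix with a linear gap penalty and hypothetical scores (match +1, mismatch -1, gap -1). Under these assumptions each horizontal difference between adjacent cells lies between `gap` and `match - gap`, while the absolute cell values grow with sequence length.

```python
def dp_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Fill a global-alignment DP matrix with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        h[i][0] = i * gap
    for j in range(1, cols):
        h[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(h[i - 1][j - 1] + s, h[i - 1][j] + gap, h[i][j - 1] + gap)
    return h


def horizontal_differences(h):
    """Differences between each cell and its left neighbour."""
    return [row[j] - row[j - 1] for row in h for j in range(1, len(row))]


h = dp_matrix("ACGT" * 100, "AGGT" * 100)
values = [v for row in h for v in row]
diffs = horizontal_differences(h)
print("cell values span", min(values), "to", max(values))
print("differences span", min(diffs), "to", max(diffs))
```

With 400-base inputs the cell values already fall outside the signed 8-bit range, whereas the differences stay within a handful of values, so more of them fit into one SIMD register.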
2012-12-01
identity operation SIMD Single instruction, multiple datastream parallel computing Scala A byte-compiled programming language featuring dynamic type...Specific Languages 5a. CONTRACT NUMBER FA8750-10-1-0191 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 61101E 6. AUTHOR(S) Armando Fox 5d...application performance, but usually must rely on efficiency programmers who are experts in explicit parallel programming to achieve it. Since such efficiency
Fast Fourier Transform algorithm design and tradeoffs
NASA Technical Reports Server (NTRS)
Kamin, Ray A., III; Adams, George B., III
1988-01-01
The Fast Fourier Transform (FFT) is a mainstay of certain numerical techniques for solving fluid dynamics problems. The Connection Machine CM-2 is the target for an investigation into the design of multidimensional Single Instruction Stream/Multiple Data (SIMD) parallel FFT algorithms for high performance. Critical algorithm design issues are discussed, necessary machine performance measurements are identified and made, and the performance of the developed FFT programs is measured. Fast Fourier Transform programs are compared to the currently best Cray-2 FFT program.
A GPU-Parallelized Eigen-Based Clutter Filter Framework for Ultrasound Color Flow Imaging.
Chee, Adrian J Y; Yiu, Billy Y S; Yu, Alfred C H
2017-01-01
Eigen-filters with attenuation response adapted to clutter statistics in color flow imaging (CFI) have shown improved flow detection sensitivity in the presence of tissue motion. Nevertheless, their practical adoption in clinical use is not straightforward due to the high computational cost of solving eigendecompositions. Here, we provide a pedagogical description of how a real-time computing framework for eigen-based clutter filtering can be developed through a single-instruction, multiple data (SIMD) computing approach that can be implemented on a graphical processing unit (GPU). Emphasis is placed on the single-ensemble-based eigen-filtering approach (Hankel singular value decomposition), since it is algorithmically compatible with GPU-based SIMD computing. The key algebraic principles and the corresponding SIMD algorithm are explained, and annotations on how such an algorithm can be rationally implemented on the GPU are presented. Real-time efficacy of our framework was experimentally investigated on a single GPU device (GTX Titan X), and the computing throughput for varying scan depths and slow-time ensemble lengths was studied. Using our eigen-processing framework, real-time video-range throughput (24 frames/s) can be attained for CFI frames with full view in azimuth direction (128 scanlines), up to a scan depth of 5 cm (λ pixel axial spacing) for slow-time ensemble length of 16 samples. The corresponding CFI image frames, with respect to the ones derived from non-adaptive polynomial regression clutter filtering, yielded enhanced flow detection sensitivity in vivo, as demonstrated in a carotid imaging case example. These findings indicate that the GPU-enabled eigen-based clutter filtering can improve CFI flow detection performance in real time.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware
Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang
2009-01-01
The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
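One common way to map ȳ = Ax̄ + b̄ onto four-wide multiply-add instructions is to accumulate one matrix column at a time. The sketch below emulates that with tuples in place of hardware registers; `mad4` and `linear4` are hypothetical names, and this column-wise formulation is only an illustration of four-wide packing, not the reordering technique the paper proposes.

```python
def mad4(a, s, c):
    """Emulate one four-wide multiply-add: a * s + c in each lane."""
    return tuple(ai * s + ci for ai, ci in zip(a, c))


def linear4(columns, x, b):
    """Evaluate y = A x + b for a 4x4 matrix given as four column vectors."""
    y = b
    for column, xj in zip(columns, x):
        y = mad4(column, xj, y)  # one four-wide instruction per column
    return y


rows = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16)]
columns = list(zip(*rows))  # transpose so each entry is one column of A
y = linear4(columns, (1, 0, 2, 0), (1, 1, 1, 1))
print(y)  # (8, 20, 32, 44)
```

Written this way, a 4x4 product with an offset takes four multiply-adds with every lane doing useful work, whereas a row-by-row dot product would need a horizontal sum across lanes for each output.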
DOE Office of Scientific and Technical Information (OSTI.GOV)
Langer, Steven H.; Karlin, Ian; Marinak, Marty M.
HYDRA is used to simulate a variety of experiments carried out at the National Ignition Facility (NIF) [4] and other high energy density physics facilities. HYDRA has packages to simulate radiation transfer, atomic physics, hydrodynamics, laser propagation, and a number of other physics effects. HYDRA has over one million lines of code and includes both MPI and thread-level (OpenMP and pthreads) parallelism. This paper measures the performance characteristics of HYDRA using hardware counters on an IBM BlueGene/Q system. We report key ratios such as bytes/instruction and memory bandwidth for several different physics packages. The total number of bytes read and written per time step is also reported. We show that none of the packages which use significant time are memory bandwidth limited on a Blue Gene/Q. HYDRA currently issues very few SIMD instructions. The pressure on memory bandwidth will increase if high levels of SIMD instructions can be achieved.
Accelerating navigation in the VecGeom geometry modeller
NASA Astrophysics Data System (ADS)
Wenzel, Sandro; Zhang, Yang; for the VecGeom Developers
2017-10-01
The VecGeom geometry library is a relatively recent effort aiming to provide a modern and high performance geometry service for particle detector simulation in hierarchical detector geometries common to HEP experiments. One of its principal targets is the efficient use of vector SIMD hardware instructions to accelerate geometry calculations for single track as well as multi-track queries. Previously, excellent performance improvements compared to Geant4/ROOT could be reported for elementary geometry algorithms at the level of single shape queries. In this contribution, we will focus on the higher level navigation algorithms in VecGeom, which are the most important components as seen from the simulation engines. We will first report on our R&D effort and developments to implement SIMD enhanced data structures to speed up the well-known “voxelised” navigation algorithms, ubiquitously used for particle tracing in complex detector modules consisting of many daughter parts. Second, we will discuss complementary new approaches to improve navigation algorithms in HEP. These ideas are based on a systematic exploitation of static properties of the detector layout as well as automatic code generation and specialisation of the C++ navigator classes. Such specialisations reduce the overhead of generic- or virtual function based algorithms and enhance the effectiveness of the SIMD vector units. These novel approaches go well beyond the existing solutions available in Geant4 or TGeo/ROOT, achieve a significantly superior performance, and might be of interest for a wide range of simulation backends (GeantV, Geant4). We exemplify this with concrete benchmarks for the CMS and ALICE detectors.
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
Deploy Nalu/Kokkos algorithmic infrastructure with performance benchmarking.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Domino, Stefan P.; Ananthan, Shreyas; Knaus, Robert C.
The former Nalu interior heterogeneous algorithm design, which was originally designed to manage matrix assembly operations over all elemental topology types, has been modified to operate over homogeneous collections of mesh entities. This newly templated kernel design allows for removal of workset variable resize operations that were formerly required at each loop over a Sierra ToolKit (STK) bucket (nominally, 512 entities in size). Extensive usage of the Standard Template Library (STL) std::vector has been removed in favor of intrinsic Kokkos memory views. In this milestone effort, the transition to Kokkos as the underlying infrastructure to support performance and portability on many-core architectures has been deployed for key matrix algorithmic kernels. A unit-test driven design effort has developed a homogeneous entity algorithm that employs a team-based thread parallelism construct. The STK Single Instruction Multiple Data (SIMD) infrastructure is used to interleave data for improved vectorization. The collective algorithm design, which allows for concurrent threading and SIMD management, has been deployed for the core low-Mach element-based algorithm. Several tests to ascertain SIMD performance on Intel KNL and Haswell architectures have been carried out. The performance test matrix includes evaluation of both low- and higher-order methods. The higher-order low-Mach methodology builds on polynomial promotion of the core low-order control volume finite element method (CVFEM). Performance testing of the Kokkos-view/SIMD design indicates low-order matrix assembly kernel speed-up ranging between two and four times depending on mesh loading and node count. Better speedups are observed for higher-order meshes (currently only P=2 has been tested) especially on KNL. The increased workload per element on higher-order meshes benefits from the wide SIMD width on KNL machines.
Combining multiple threads with SIMD on KNL achieves a 4.6x speedup over the baseline, with assembly timings faster than that observed on Haswell architecture. The computational workload of higher-order meshes, therefore, seems ideally suited for the many-core architecture and justifies further exploration of higher-order on NGP platforms. A Trilinos/Tpetra-based multi-threaded GMRES preconditioned by symmetric Gauss Seidel (SGS) represents the core solver infrastructure for the low-Mach advection/diffusion implicit solves. The threaded solver stack has been tested on small problems on NREL's Peregrine system using the newly developed and deployed Kokkos-view/SIMD kernels. Efforts are underway to deploy the Tpetra-based solver stack on NERSC Cori system to benchmark its performance at scale on KNL machines.
NASA Astrophysics Data System (ADS)
Olson, Richard F.
2013-05-01
Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, high-level parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.
Implementation and analysis of a Navier-Stokes algorithm on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1988-01-01
The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Efficiently modeling neural networks on massively parallel computers
NASA Technical Reports Server (NTRS)
Farber, Robert M.
1993-01-01
Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent processor interprocessor communications. This paper will consider the simulation of only feed-forward neural networks, although this method is extendable to recurrent networks.
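The O(log(number of processors)) growth of a global summation comes from combining partial sums pairwise. The following is a sequential sketch of such a tree reduction, where each list slot stands in for one processor's partial sum; the function name and the simulation on a single list are assumptions for illustration and say nothing about how the CM-2 code performs the step.

```python
def tree_sum(values):
    """Sum a non-empty list by pairwise combination.

    Returns (total, steps); within one step all additions are independent,
    so on parallel hardware each step would cost one communication round.
    """
    data = list(values)
    stride, steps = 1, 0
    while stride < len(data):
        for i in range(0, len(data) - stride, 2 * stride):
            data[i] += data[i + stride]  # partner is `stride` slots away
        stride *= 2
        steps += 1
    return data[0], steps


print(tree_sum(range(1, 9)))  # (36, 3): 8 partial sums combine in 3 steps
```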
Using video-oriented instructions to speed up sequence comparison.
Wozniak, A
1997-04-01
This document presents an implementation of the well-known Smith-Waterman algorithm for comparison of proteic and nucleic sequences, using specialized video instructions. These instructions, SIMD-like in their design, make possible parallelization of the algorithm at the instruction level. Benchmarks on an ULTRA SPARC running at 167 MHz show a speed-up factor of two compared to the same algorithm implemented with integer instructions on the same machine. Performance reaches over 18 million matrix cells per second on a single processor, giving to our knowledge the fastest implementation of the Smith-Waterman algorithm on a workstation. The accelerated procedure was introduced in LASSAP--a LArge Scale Sequence compArison Package software developed at INRIA--which handles parallelism at higher level. On a SUN Enterprise 6000 server with 12 processors, a speed of nearly 200 million matrix cells per second has been obtained. A sequence of length 300 amino acids is scanned against SWISSPROT R33 (1,8531,385 residues) in 29 s. This procedure is not restricted to databank scanning. It applies to all cases handled by LASSAP (intra- and inter-bank comparisons, Z-score computation, etc.).
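For reference, the quantity being counted in "matrix cells per second" is one evaluation of the Smith-Waterman recurrence. The scalar sketch below uses a linear gap penalty and hypothetical scores (match +2, mismatch -1, gap -1) rather than the protein substitution matrices a databank scan would use, and it does not show how the video instructions are applied. Each cell depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal are mutually independent, which is the property that makes instruction-level parallelism possible in general.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            s = match if ca == cb else mismatch
            cell = max(0,                    # start a new local alignment
                       previous[j - 1] + s,  # extend along the diagonal
                       previous[j] + gap,    # gap in b
                       current[j - 1] + gap) # gap in a
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


print(smith_waterman_score("GATTACA", "TTAC"))  # 8: "TTAC" matches exactly
```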
Interaction sorting method for molecular dynamics on multi-core SIMD CPU architecture.
Matvienko, Sergey; Alemasov, Nikolay; Fomin, Eduard
2015-02-01
Molecular dynamics (MD) is widely used in computational biology for studying binding mechanisms of molecules, molecular transport, conformational transitions, protein folding, etc. The method is computationally expensive; thus, the demand for the development of novel, much more efficient algorithms is still high. Therefore, the new algorithm designed in 2007 and called interaction sorting (IS) clearly attracted interest, as it outperformed the most efficient MD algorithms. In this work, a new IS modification is proposed which allows the algorithm to utilize SIMD processor instructions. This paper shows that the improvement provides an additional gain in performance, 9% to 45% in comparison to the original IS method.
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.; Plaskacz, Edward J.
1989-01-01
The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine, is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14.
Nose, Atsushi; Yamazaki, Tomohiro; Katayama, Hironobu; Uehara, Shuji; Kobayashi, Masatsugu; Shida, Sayaka; Odahara, Masaki; Takamiya, Kenichi; Matsumoto, Shizunori; Miyashita, Leo; Watanabe, Yoshihiro; Izawa, Takashi; Muramatsu, Yoshinori; Nitta, Yoshikazu; Ishikawa, Masatoshi
2018-04-24
We have developed a high-speed vision chip using 3D stacking technology to address the increasing demand for high-speed vision chips in diverse applications. The chip comprises a 1/3.2-inch, 1.27 Mpixel, 500 fps (0.31 Mpixel, 1000 fps, 2 × 2 binning) vision chip with 3D-stacked column-parallel Analog-to-Digital Converters (ADCs) and 140 Giga Operation per Second (GOPS) programmable Single Instruction Multiple Data (SIMD) column-parallel PEs for new sensing applications. The 3D-stacked structure and column parallel processing architecture achieve high sensitivity, high resolution, and high-accuracy object positioning.
Flexible Language Constructs for Large Parallel Programs
Rosing, Matt; Schnabel, Robert
1994-01-01
The goal of the research described in this article is to develop flexible language constructs for writing large data parallel numerical programs for distributed memory (multiple instruction multiple data [MIMD]) multiprocessors. Previously, several models have been developed to support synchronization and communication. Models for global synchronization include single instruction multiple data (SIMD), single program multiple data (SPMD), and sequential programs annotated with data distribution statements. The two primary models for communication include implicit communication based on shared memory and explicit communication based on messages. None of these models by themselves seem sufficient to permit the natural and efficient expression of the variety of algorithms that occur in large scientific computations. In this article, we give an overview of a new language that combines many of these programming models in a clean manner. This is done in a modular fashion such that different models can be combined to support large programs. Within a module, the selection of a model depends on the algorithm and its efficiency requirements. In this article, we give an overview of the language and discuss some of the critical implementation details.
An Analysis of Instruction-Cached SIMD Computer Architecture
1993-12-01
Table of contents excerpt: ...Cache to Place Blocks; 4.5.4 Step 4: Schedule Cache Blocks; 4.5.5 Step 5: Store Cache Blocks; ... B.4 Scheduler; B.4.1 Basic Block Definition ...
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
Vincenti, H.; Lobet, M.; Lehe, R.; ...
2016-09-19
In current computer architectures, data movement (from die to network) is by far the most energy-consuming part of an algorithm (≈20 pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute node that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time-consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performance on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2, 256-bit-wide data registers). Results show a speed-up of ×2 to ×2.5 in double precision for particle shape factors of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512-bit register lengths (8 doubles/16 singles).
Program summary
Program Title: vec_deposition
Program Files doi: http://dx.doi.org/10.17632/nh77fv9k8c.1
Licensing provisions: BSD 3-Clause
Programming language: Fortran 90
External routines/libraries: OpenMP > 4.0
Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition, for which there is no efficient and portable vector algorithm.
Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization, which ensures portability across different architectures.
Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performance out of the vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performance. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g., Intel, GNU, or Cray) and provide proper alignment flags and compiler alignment directives (more details in README file).
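To illustrate why charge deposition is hard to vectorize, the following is a minimal scalar sketch of first-order (linear shape factor) deposition on a periodic 1D grid. It is not the vec_deposition routine or its data structure; the function name `deposit_charge_1d`, the periodic boundary, and the default unit grid spacing are assumptions made only for this illustration.

```python
import math

def deposit_charge_1d(positions, charges, n_cells, dx=1.0):
    """Deposit particle charges on a periodic 1D grid (order-1 shape factor)."""
    rho = [0.0] * n_cells
    for x, q in zip(positions, charges):
        s = x / dx
        i = math.floor(s)        # grid node immediately to the left of the particle
        w_right = s - i          # fraction of the charge given to the right node
        w_left = 1.0 - w_right   # fraction of the charge given to the left node
        # Each particle scatters into two grid nodes. Two particles in the same
        # cell update the same nodes, which is the write conflict that a naive
        # vectorization over particles would have to resolve.
        rho[i % n_cells] += q * w_left / dx
        rho[(i + 1) % n_cells] += q * w_right / dx
    return rho
```

For example, `deposit_charge_1d([0.25, 2.5], [1.0, 2.0], 4)` splits the first charge 0.75/0.25 between nodes 0 and 1 and the second charge evenly between nodes 2 and 3; the total deposited charge equals the sum of the particle charges.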
Fault Tolerant Parallel Implementations of Iterative Algorithms for Optimal Control Problems
1988-01-21
p/.V)] steps, but did not discuss any specific parallel implementation. Gajski [5] improved upon this result by performing the SIMD computation in ... N = p2. Our approach reduces to that of [5], except that Gajski presents the coefficient computation and partial solution phases as a single ... the SIMD algorithm presented by Gajski [5] can be most efficiently mapped to a unidirectional ring network with broadcasting capability. Based
A GPU-based large-scale Monte Carlo simulation method for systems with long-range interactions
NASA Astrophysics Data System (ADS)
Liang, Yihao; Xing, Xiangjun; Li, Yaohang
2017-06-01
In this work we present an efficient implementation of Canonical Monte Carlo simulation for Coulomb many body systems on graphics processing units (GPU). Our method takes advantage of the GPU Single Instruction, Multiple Data (SIMD) architectures, and adopts the sequential updating scheme of Metropolis algorithm. It makes no approximation in the computation of energy, and reaches a remarkable 440-fold speedup, compared with the serial implementation on CPU. We further use this method to simulate primitive model electrolytes, and measure very precisely all ion-ion pair correlation functions at high concentrations. From these data, we extract the renormalized Debye length, renormalized valences of constituent ions, and renormalized dielectric constants. These results demonstrate unequivocally physics beyond the classical Poisson-Boltzmann theory.
Line-drawing algorithms for parallel machines
NASA Technical Reports Server (NTRS)
Pang, Alex T.
1990-01-01
The fact that conventional line-drawing algorithms, when applied directly on parallel machines, can lead to very inefficient codes is addressed. It is suggested that instead of modifying an existing algorithm for a parallel machine, a more efficient implementation can be produced by going back to the invariants in the definition. Popular line-drawing algorithms are compared with two alternatives: distance to a line (a point is on the line if it is sufficiently close to it) and intersection with a line (a point is on the line if it is an intersection point). For massively parallel single-instruction-multiple-data (SIMD) machines (with thousands of processors and up), the alternatives provide viable line-drawing algorithms. Because of the pixel-per-processor mapping, their performance is independent of the line length and orientation.
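The "distance to a line" alternative can be sketched as a per-pixel test that every pixel evaluates independently, which is what a pixel-per-processor SIMD mapping would execute in lockstep. This is an illustrative sketch only: the function names, the clamping to the segment end points, and the half-pixel threshold are assumptions, not details taken from the report.

```python
def on_line(px, py, x0, y0, x1, y1, half_width=0.5):
    """True if pixel centre (px, py) lies within half_width of the segment."""
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: fall back to distance from a single point.
        return (px - x0) ** 2 + (py - y0) ** 2 <= half_width ** 2
    # Project the pixel onto the segment and clamp to its end points.
    t = ((px - x0) * dx + (py - y0) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    cx, cy = x0 + t * dx, y0 + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2 <= half_width ** 2

def draw_line(width, height, x0, y0, x1, y1):
    """Return a height x width grid of booleans marking the pixels on the line."""
    # Every pixel runs the same test with no dependence on its neighbours,
    # so the cost per pixel does not depend on line length or orientation.
    return [[on_line(x, y, x0, y0, x1, y1) for x in range(width)]
            for y in range(height)]
```

Calling `draw_line(4, 4, 0, 0, 3, 3)` marks exactly the four diagonal pixels. On a serial machine this does far more work than an incremental algorithm; the point of the sketch is only that the per-pixel work is uniform.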
A Fast Iterative Method for Solving the Eikonal Equation on Triangulated Surfaces
Fu, Zhisong; Jeong, Won-Ki; Pan, Yongsheng; Kirby, Robert M.; Whitaker, Ross T.
2012-01-01
This paper presents an efficient, fine-grained parallel algorithm for solving the Eikonal equation on triangular meshes. The Eikonal equation, and the broader class of Hamilton–Jacobi equations to which it belongs, have a wide range of applications from geometric optics and seismology to biological modeling and analysis of geometry and images. The ability to solve such equations accurately and efficiently provides new capabilities for exploring and visualizing parameter spaces and for solving inverse problems that rely on such equations in the forward model. Efficient solvers on state-of-the-art, parallel architectures require new algorithms that are not, in many cases, optimal, but are better suited to synchronous updates of the solution. In previous work [W. K. Jeong and R. T. Whitaker, SIAM J. Sci. Comput., 30 (2008), pp. 2512–2534], the authors proposed the fast iterative method (FIM) to efficiently solve the Eikonal equation on regular grids. In this paper we extend the fast iterative method to solve Eikonal equations efficiently on triangulated domains on the CPU and on parallel architectures, including graphics processors. We propose a new local update scheme that provides solutions of first-order accuracy for both architectures. We also propose a novel triangle-based update scheme and its corresponding data structure for efficient irregular data mapping to parallel single-instruction multiple-data (SIMD) processors. We provide detailed descriptions of the implementations on a single CPU, a multicore CPU with shared memory, and SIMD architectures with comparative results against state-of-the-art Eikonal solvers. PMID:22641200
PIC codes for plasma accelerators on emerging computer architectures (GPUs, Multicore/Manycore CPUs)
NASA Astrophysics Data System (ADS)
Vincenti, Henri
2016-03-01
The advent of exascale computers will enable 3D simulations of new laser-plasma interaction regimes that were previously out of reach of current petascale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potential of these new computing architectures. Indeed, achieving exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy-consuming part of an algorithm, future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute node ("fat nodes") that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, CPU machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPUs also have a reduced clock speed per core and can process Multiple Instructions on Multiple Data (MIMD). At the software level, Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for multicore/manycore CPUs) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high-performance skeleton PIC code PICSAR to achieve both good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the pseudo-spectral quasi-cylindrical code FBPIC on GPUs using the Numba Python compiler.
Insertion of operation-and-indicate instructions for optimized SIMD code
Eichenberger, Alexander E; Gara, Alan; Gschwind, Michael K
2013-06-04
Mechanisms are provided for inserting indicated instructions for tracking and indicating exceptions in the execution of vectorized code. A portion of first code is received for compilation. The portion of first code is analyzed to identify non-speculative instructions performing designated non-speculative operations in the first code that are candidates for replacement by replacement operation-and-indicate instructions that perform the designated non-speculative operations and further perform an indication operation for indicating any exception conditions corresponding to special exception values present in vector register inputs to the replacement operation-and-indicate instructions. The replacement is performed and second code is generated based on the replacement of the at least one non-speculative instruction. The data processing system executing the compiled code is configured to store special exception values in vector output registers, in response to a speculative instruction generating an exception condition, without initiating exception handling.
Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER.
Ferreira, Miguel; Roma, Nuno; Russo, Luis M S
2014-05-30
HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar's striped processing pattern with Intel SSE2 instruction set extension. A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model's size.
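For reference, the scalar recurrence that such SIMD schemes vectorize can be sketched as a plain log-space Viterbi decoder for a generic HMM. This is not the HMMER3 ViterbiFilter or the COPS implementation; the function name, the dictionary-based tables, and the generic (non-profile) HMM are assumptions made for illustration.

```python
def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best log-probability, best state path) for an observation sequence."""
    # score[s] holds the best log-probability of any path ending in state s.
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one table of back-pointers per observation after the first
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            # Best predecessor of s: maximise over all previous states.
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            pointers[s] = best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path
```

An inter-task scheme of the kind the abstract mentions would, roughly speaking, run this same recurrence for several independent sequences at once, one per SIMD lane, rather than vectorizing within a single sequence.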
Vectorization with SIMD extensions speeds up reconstruction in electron tomography.
Agulleiro, J I; Garzón, E M; García, I; Fernández, J J
2010-06-01
Electron tomography allows structural studies of cellular structures at molecular detail. Large 3D reconstructions are needed to meet the resolution requirements. The processing time to compute these large volumes may be considerable, and so high performance computing techniques have traditionally been used. This work presents a vector approach to tomographic reconstruction that relies on the exploitation of the SIMD extensions available in modern processors in combination with other single-processor optimization techniques. This approach succeeds in producing full resolution tomograms with an important reduction in processing time, as evaluated with the most common reconstruction algorithms, namely WBP and SIRT. The main advantage stems from the fact that this approach runs on standard computers without the need for specialized hardware, which facilitates the development, use and management of programs. Future trends in processor design open excellent opportunities for vector processing with processors' SIMD extensions in the field of 3D electron microscopy.
Flexible language constructs for large parallel programs
NASA Technical Reports Server (NTRS)
Rosing, Matthew; Schnabel, Robert
1993-01-01
The goal of the research described is to develop flexible language constructs for writing large data parallel numerical programs for distributed memory (MIMD) multiprocessors. Previously, several models have been developed to support synchronization and communication. Models for global synchronization include SIMD (Single Instruction Multiple Data), SPMD (Single Program Multiple Data), and sequential programs annotated with data distribution statements. The two primary models for communication include implicit communication based on shared memory and explicit communication based on messages. None of these models by themselves seem sufficient to permit the natural and efficient expression of the variety of algorithms that occur in large scientific computations. An overview of a new language that combines many of these programming models in a clean manner is given. This is done in a modular fashion such that different models can be combined to support large programs. Within a module, the selection of a model depends on the algorithm and its efficiency requirements. An overview of the language and discussion of some of the critical implementation details is given.
An ultra low energy biomedical signal processing system operating at near-threshold.
Hulzink, J; Konijnenburg, M; Ashouei, M; Breeschoten, A; Berset, T; Huisken, J; Stuyt, J; de Groot, H; Barat, F; David, J; Van Ginderdeuren, J
2011-12-01
This paper presents a voltage-scalable digital signal processing system designed for the use in a wireless sensor node (WSN) for ambulatory monitoring of biomedical signals. To fulfill the requirements of ambulatory monitoring, power consumption, which directly translates to the WSN battery lifetime and size, must be kept as low as possible. The proposed processing platform is an event-driven system with resources to run applications with different degrees of complexity in an energy-aware way. The architecture uses effective system partitioning to enable duty cycling, single instruction multiple data (SIMD) instructions, power gating, voltage scaling, multiple clock domains, multiple voltage domains, and extensive clock gating. It provides an alternative processing platform where the power and performance can be scaled to adapt to the application need. A case study on a continuous wavelet transform (CWT)-based heart-beat detection shows that the platform not only preserves the sensitivity and positive predictivity of the algorithm but also achieves the lowest energy/sample for ElectroCardioGram (ECG) heart-beat detection publicly reported today.
A graphics-card implementation of Monte-Carlo simulations for cosmic-ray transport
NASA Astrophysics Data System (ADS)
Tautz, R. C.
2016-05-01
A graphics card implementation of a test-particle simulation code is presented that is based on the CUDA extension of the C/C++ programming language. The original CPU version has been developed for the calculation of cosmic-ray diffusion coefficients in artificial Kolmogorov-type turbulence. In the new implementation, the magnetic turbulence generation, which is the most time-consuming part, is separated from the particle transport and is performed on a graphics card. In this article, the modification of the basic approach of integrating test particle trajectories to employ the SIMD (single instruction, multiple data) model is presented and verified. The efficiency of the new code is tested and several language-specific accelerating factors are discussed. For the example of isotropic magnetostatic turbulence, sample results are shown and a comparison to the results of the CPU implementation is performed.
The factorization of large composite numbers on the MPP
NASA Technical Reports Server (NTRS)
Mckurdy, Kathy J.; Wunderlich, Marvin C.
1987-01-01
The continued fraction method for factoring large integers (CFRAC) was an ideal algorithm to be implemented on a massively parallel computer such as the Massively Parallel Processor (MPP). After much effort, the first 60-digit number was factored on the MPP using about 6 1/2 hours of array time. Although this result added about 10 digits to the size of number that could be factored using CFRAC on a serial machine, it was already badly beaten by the implementation of Davis and Holdridge on the CRAY-1 using the quadratic sieve, an algorithm which is clearly superior to CFRAC for large numbers. An algorithm is illustrated which is ideally suited to the single instruction multiple data (SIMD) massively parallel architecture, and some of the modifications which were needed in order to make the parallel implementation effective and efficient are described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S.; Alam, Maksudul
A novel parallel algorithm is presented for generating random scale-free networks using the preferential-attachment model. The algorithm, named cuPPA, is custom-designed for single instruction multiple data (SIMD) style of parallel processing supported by modern processors such as graphical processing units (GPUs). To the best of our knowledge, our algorithm is the first to exploit GPUs, and also the fastest implementation available today, to generate scale-free networks using the preferential-attachment model. A detailed performance study is presented to understand the scalability and runtime characteristics of the cuPPA algorithm. In one of the best cases, when executed on an NVidia GeForce 1080 GPU, cuPPA generates a scale-free network of a billion edges in less than 2 seconds.
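The preferential-attachment model itself can be sketched sequentially as follows. This is a serial baseline for the model, not the cuPPA algorithm; the function name, the one-edge-per-new-node variant, and the fixed default seed are assumptions chosen to keep the example short and reproducible.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Generate edges (new_node, target) with one edge per new node."""
    rng = random.Random(seed)
    edges = [(1, 0)]      # start from a single edge between nodes 0 and 1
    endpoints = [1, 0]    # every node appears once per incident edge
    for new in range(2, n_nodes):
        # Picking uniformly from the endpoint list selects an existing node
        # with probability proportional to its current degree.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges
```

Each new node depends on the degrees produced by all earlier nodes, which is the sequential dependency a parallel generator has to work around.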
Jeong, Han Saem; Lee, Tae Hyub; Bang, Cho Hee; Kim, Jong-Ho; Hong, Soon Jun
2018-03-01
While both sepsis-induced myocardial dysfunction (SIMD) and stress-induced cardiomyopathy (SICMP) are common in patients with sepsis, the pathogenesis of the 2 diseases is different, and they require different treatment strategies. Thus, we aimed to investigate risk factors and outcomes between the 2 diseases. This retrospective study enrolled patients diagnosed with sepsis or septic shock, admitted to intensive care unit via emergency department in Korea University Anam Hospital, and who underwent transthoracic echocardiography within the first 24 hours of admission. In all, 25 patients with SIMD and 27 patients with SICMP were enrolled. Chronic obstructive pulmonary disease and a history of heart failure (HF) were more prevalent in both the SIMD and SICMP groups than in the control group. In the SIMD and SICMP groups, levels of inflammatory cytokines were similar. Serum troponin level was significantly elevated in the SICMP and SIMD group compared to the control group. N-terminal pro-brain natriuretic peptide (NT pro-BNP) level was significantly elevated in the SIMD group compared to the SICMP group or control group. The in-hospital mortality rate in the SIMD and SICMP group was about 40%, showing increased trends compared with the control group. The in-hospital mortality rate was significantly increased in SIMD group with EF<30% than in SICMP group with EF<30%. In multiple logistic regression analysis, a past history of diabetes mellitus (DM) and HF was significantly associated with the incidence of SIMD. Younger age, elevated levels of NT pro-BNP, and positive result of blood culture also showed significant odds ratio regard to the occurrence of SIMD. However, only elevated lactate and troponin level were positively associated with the incidence of SICMP. The SIMD and SICMP had different risk factors. The risk factors of SIMD were younger age, history of DM, history of HF, elevated NT pro-BNP, and positive result of blood culture.
The elevated levels of lactate and troponin were identified as risk factors of SICMP. More importantly, in-hospital mortality rate from SIMD and SICMP showed increased trend and worse outcome in SIMD group with reduced EF<30%. Thus, developing SIMD or SICMP reflected poor prognosis in sepsis or septic shock.
Jeong, Han Saem; Lee, Tae Hyub; Bang, Cho Hee; Kim, Jong-Ho; Hong, Soon Jun
2018-01-01
Abstract While both sepsis-induced myocardial dysfunction (SIMD) and stress-induced cardiomyopathy (SICMP) are common in patients with sepsis, the pathogenesis of the 2 diseases is different, and they require different treatment strategies. Thus, we aimed to investigate risk factors and outcomes between the 2 diseases. This retrospective study enrolled patients diagnosed with sepsis or septic shock, admitted to intensive care unit via emergency department in Korea University Anam Hospital, and who underwent transthoracic echocardiography within the first 24 hours of admission. In all, 25 patients with SIMD and 27 patients with SICMP were enrolled. Chronic obstructive pulmonary disease and a history of heart failure (HF) were more prevalent in both the SIMD and SICMP groups than in the control group. In the SIMD and SICMP groups, levels of inflammatory cytokines were similar. Serum troponin level was significantly elevated in the SICMP and SIMD group compared to the control group. N-terminal pro-brain natriuretic peptide (NT pro-BNP) level was significantly elevated in the SIMD group compared to the SICMP group or control group. The in-hospital mortality rate in the SIMD and SICMP group was about 40%, showing increased trends compared with the control group. The in-hospital mortality rate was significantly increased in SIMD group with EF<30% than in SICMP group with EF<30%. In multiple logistic regression analysis, a past history of diabetes mellitus (DM) and HF was significantly associated with the incidence of SIMD. Younger age, elevated levels of NT pro-BNP, and positive result of blood culture also showed significant odds ratio regard to the occurrence of SIMD. However, only elevated lactate and troponin level were positively associated with the incidence of SICMP. The SIMD and SICMP had different risk factors. The risk factors of SIMD were younger age, history of DM, history of HF, elevated NT pro-BNP, and positive result of blood culture. 
The elevated levels of lactate and troponin were identified as risk factors of SICMP. More importantly, in-hospital mortality rate from SIMD and SICMP showed increased trend and worse outcome in SIMD group with reduced EF<30%. Thus, developing SIMD or SICMP reflected poor prognosis in sepsis or septic shock. PMID:29595686
Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER
2014-01-01
Background HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar’s striped processing pattern with Intel SSE2 instruction set extension. Results A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. Conclusions The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model’s size. PMID:24884826
Sperl-Hillen, JoAnn; O'Connor, Patrick J; Ekstrom, Heidi L; Rush, William A; Asche, Stephen E; Fernandes, Omar D; Appana, Deepika; Amundson, Gerald H; Johnson, Paul E; Curran, Debra M
2014-12-01
To test a virtual case-based Simulated Diabetes Education intervention (SimDE) developed to teach primary care residents how to manage diabetes. Nineteen primary care residency programs, with 341 volunteer residents in all postgraduate years (PGY), were randomly assigned to a SimDE intervention group or control group (CG). The Web-based interactive educational intervention used computerized virtual patients who responded to provider actions through programmed simulation models. Eighteen distinct learning cases (L-cases) were assigned to SimDE residents over six months from 2010 to 2011. Impact was assessed using performance on four virtual assessment cases (A-cases), an objective knowledge test, and pre-post changes in self-assessed diabetes knowledge and confidence. Group comparisons were analyzed using generalized linear mixed models, controlling for clustering of residents within residency programs and differences in baseline knowledge. The percentages of residents appropriately achieving A-case composite clinical goals for glucose, blood pressure, and lipids were as follows: A-case 1: SimDE = 21.2%, CG = 1.8%, P = .002; A-case 2: SimDE = 15.7%, CG = 4.7%, P = .02; A-case 3: SimDE = 48.0%, CG = 10.4%, P < .001; and A-case 4: SimDE = 42.1%, CG = 18.7%, P = .004. The mean knowledge score and pre-post changes in self-assessed knowledge and confidence were significantly better for SimDE group than CG participants. A virtual case-based simulated diabetes education intervention improved diabetes management skills, knowledge, and confidence for primary care residents.
Pairwise Sequence Alignment Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Daily, PNNL
2015-05-20
Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific computing. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: reference implementations of all known vectorized sequence alignment approaches; implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms; implementations across all modern CPU instruction sets including AVX2 and KNC; and language interfaces for C/C++ and Python.
Empirical study of parallel LRU simulation algorithms
NASA Technical Reports Server (NTRS)
Carr, Eric; Nicol, David M.
1994-01-01
This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. Two other algorithms are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The two other SIMD algorithms are more complex, but have costs that are independent of the cache size. Both the second and third SIMD algorithms compute all stack distances; the second SIMD algorithm is completely general, whereas the third SIMD algorithm presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithms implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from execution of three SPEC benchmark programs.
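The stack distances mentioned in this abstract can be illustrated with a small serial sketch. It is not one of the five parallel algorithms from the paper; the function names and the example trace are hypothetical and chosen only for illustration. The point is that one pass that records every stack distance is enough to count hits for any cache size afterwards.

```python
def lru_stack_distances(trace):
    """Return the LRU stack distance of each reference in `trace`.

    The distance is the 1-based depth of the tag in the LRU stack at the
    time of the reference, or None for a first-time (cold) reference.
    """
    stack = []       # most recently used tag sits at index 0
    distances = []
    for tag in trace:
        if tag in stack:
            depth = stack.index(tag)
            distances.append(depth + 1)
            stack.pop(depth)
        else:
            distances.append(None)
        stack.insert(0, tag)   # the referenced tag becomes most recent
    return distances


def hits_for_cache_size(distances, size):
    """Count hits in a fully associative LRU cache with `size` lines.

    A reference hits exactly when its stack distance is at most `size`.
    """
    return sum(1 for d in distances if d is not None and d <= size)


# Hypothetical reference trace, used only to exercise the functions.
trace = ["a", "b", "c", "a", "b", "d", "a"]
distances = lru_stack_distances(trace)
```

Note that this serial version scans the stack on every reference, so its cost grows with the number of distinct tags held; avoiding that kind of dependence is what the more complex algorithms in the abstract are about.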
Geospace simulations using modern accelerator processor technology
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D. J.
2009-12-01
OpenGGCM (Open Geospace General Circulation Model) is a well-established numerical code simulating the Earth's space environment. The most compute-intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is currently limited by computational constraints on grid resolution. OpenGGCM has been ported to make use of the added computational power of modern accelerator-based processor architectures, in particular the Cell processor. The Cell architecture is a novel inhomogeneous multicore architecture capable of achieving up to 230 GFlops on a single chip. The University of New Hampshire recently acquired a PowerXCell 8i based computing cluster, and here we will report initial performance results of OpenGGCM. Realizing the high theoretical performance of the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallelization approach: On the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory-limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the SIMD FPUs in each SPE. Memory management needs to be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We use a modern technique, automatic code generation, which shields the application programmer from having to deal with all of the implementation details just described, keeping the code much more easily maintainable. Our preliminary results indicate excellent performance, a speed-up of a factor of 30 compared to the unoptimized version.
An Automated Parallel Image Registration Technique Based on the Correlation of Wavelet Features
NASA Technical Reports Server (NTRS)
LeMoigne, Jacqueline; Campbell, William J.; Cromp, Robert F.; Zukor, Dorothy (Technical Monitor)
2001-01-01
With the increasing importance of multiple platform/multiple remote sensing missions, fast and automatic integration of digital data from disparate sources has become critical to the success of these endeavors. Our work utilizes maxima of wavelet coefficients to form the basic features of a correlation-based automatic registration algorithm. Our wavelet-based registration algorithm is tested successfully with data from the National Oceanic and Atmospheric Administration (NOAA) Advanced Very High Resolution Radiometer (AVHRR) and the Landsat/Thematic Mapper (TM), which differ by translation and/or rotation. By the choice of high-frequency wavelet features, this method is similar to an edge-based correlation method, but by exploiting the multi-resolution nature of a wavelet decomposition, our method achieves higher computational speeds for comparable accuracies. This algorithm has been implemented on a Single Instruction Multiple Data (SIMD) massively parallel computer, the MasPar MP-2, as well as on the Cray T3D, the Cray T3E and a Beowulf cluster of Pentium workstations.
High performance in silico virtual drug screening on many-core processors.
McIntosh-Smith, Simon; Price, James; Sessions, Richard B; Ibarra, Amaurys A
2015-05-01
Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel's Xeon Phi and multi-core CPUs with SIMD instruction sets.
High performance in silico virtual drug screening on many-core processors
Price, James; Sessions, Richard B; Ibarra, Amaurys A
2015-01-01
Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets. PMID:25972727
Deprivation in relation to urgent suspicion of head and neck cancer referrals in Glasgow.
Zeitler, M; Fingland, P; Tikka, T; Douglas, C M; Montgomery, J
2018-06-01
To examine deprivation measured by the Scottish index of multiple deprivation (SIMD) and its relation to urgent suspicion of head and neck cancer referrals. A secondary aim was to examine the symptomatology generating urgent suspicion of cancer (USOC) referrals by SIMD category. All "urgent suspicion of cancer" referrals to the GGC ENT department over a one-year period, between 2015 and 2016, were reviewed. Information was recorded anonymously and included demographics and red flag referral symptoms. A total of 1998 patients were assessed; 43.4% (n = 867) were male. A total of 171 (8.6%) patients had primary head and neck cancer. A total of 61 patients had other types of cancer, giving an all-cause cancer rate of 11.6%. About 71.3% of patients with primary head and neck cancer (HNC) were male. SIMD1 was both the most common SIMD category observed and the most common category yielding a primary head and neck cancer diagnosis. Neck lump was the commonest symptom amongst all SIMD categories. A link between deprivation and USOC referrals has been established. A difference in gender distribution between referrals and HNC was observed: more females are referred, but a significantly higher number of patients with HNC are male. Neck lump is a very strong referral indicator for HNC and intermittent hoarseness is not. The findings from this analysis could be used to refine local referral patterns and priority of referral. © 2018 John Wiley & Sons Ltd.
A Parallel First-Order Linear Recurrence Solver.
1986-09-01
log2 -M)steps, but p M did not discuss any specific parallel implementation. Gajski [GAJ81] improved upon this result by performing the SIMD computation...solves a series of reduced recurrences of size p 2. However, when N = p 2, our approach reduces to that of I’- [GAJ81], except that Gajski presents the...existing SIMD algorithms to solve R<N,1>, the SIMD algorithm presented by Gajski [GAJ81] can be most efficiently mapped to a unidirectional ring
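For orientation, a first-order linear recurrence has the form x_i = a_i * x_(i-1) + b_i. The sketch below shows the generic recursive-doubling idea, which finishes in about log2(N) data-parallel steps. It is not Gajski's algorithm and not the solver described in this report; the function names and test values are hypothetical. Each pair (A[i], B[i]) represents the affine map x -> A[i]*x + B[i], and composing neighbouring maps at doubling strides yields every x_i.

```python
def solve_serial(a, b, x0):
    """Reference solution of x_i = a_i * x_(i-1) + b_i, one step at a time."""
    xs = []
    x = x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        xs.append(x)
    return xs


def solve_recursive_doubling(a, b, x0):
    """Solve the same recurrence in ceil(log2(N)) data-parallel sweeps."""
    n = len(a)
    A, B = list(a), list(b)
    step = 1
    while step < n:
        new_a, new_b = A[:], B[:]
        # Every i in this loop is independent of the others, so on SIMD
        # hardware each iteration could be handled by its own lane.
        for i in range(step, n):
            # Compose map i (applied last) with map i - step (applied first).
            new_a[i] = A[i] * A[i - step]
            new_b[i] = A[i] * B[i - step] + B[i]
        A, B = new_a, new_b
        step *= 2
    # (A[i], B[i]) now maps x0 directly to x_i.
    return [ai * x0 + bi for ai, bi in zip(A, B)]


# Hypothetical coefficients, used only to compare the two solvers.
a = [2, 3, 1, 2, 1]
b = [1, 0, 4, 1, 2]
serial = solve_serial(a, b, 1)
doubled = solve_recursive_doubling(a, b, 1)
```

The doubling version performs more total arithmetic than the serial loop; it trades extra work for a shorter chain of dependent steps, which is the usual bargain in SIMD recurrence solvers.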
Concurrent Probabilistic Simulation of High Temperature Composite Structural Response
NASA Technical Reports Server (NTRS)
Abdi, Frank
1996-01-01
A computational structural/material analysis and design tool which would meet industry's future demand for expedience and reduced cost is presented. This unique software 'GENOA' is dedicated to parallel and high speed analysis to perform probabilistic evaluation of high temperature composite response of aerospace systems. The development is based on detailed integration and modification of diverse fields of specialized analysis techniques and mathematical models to combine their latest innovative capabilities into a commercially viable software package. The technique is specifically designed to exploit the availability of processors to perform computationally intense probabilistic analysis assessing uncertainties in structural reliability analysis and composite micromechanics. The primary objectives which were achieved in performing the development were: (1) Utilization of the power of parallel processing and static/dynamic load balancing optimization to make the complex simulation of structure, material and processing of high temperature composite affordable; (2) Computational integration and synchronization of probabilistic mathematics, structural/material mechanics and parallel computing; (3) Implementation of an innovative multi-level domain decomposition technique to identify the inherent parallelism, and increasing convergence rates through high- and low-level processor assignment; (4) Creating the framework for Portable Paralleled architecture for the machine independent Multi Instruction Multi Data (MIMD), Single Instruction Multi Data (SIMD), hybrid and distributed workstation type of computers; and (5) Market evaluation. The results of the Phase-2 effort provide a good basis for continuation and warrant a Phase-3 government and industry partnership.
A fast CT reconstruction scheme for a general multi-core PC.
Zeng, Kai; Bai, Erwei; Wang, Ge
2007-01-01
Expensive computational cost is a severe limitation in CT reconstruction for clinical applications that need real-time feedback. A primary example is bolus-chasing computed tomography (CT) angiography (BCA) that we have been developing for the past several years. To accelerate the reconstruction process using the filtered backprojection (FBP) method, specialized hardware or graphics cards can be used. However, specialized hardware is expensive and not flexible. The graphics processing unit (GPU) in a current graphics card can only reconstruct images in a reduced precision and is not easy to program. In this paper, an acceleration scheme is proposed based on a multi-core PC. In the proposed scheme, several techniques are integrated, including utilization of geometric symmetry, optimization of data structures, single-instruction multiple-data (SIMD) processing, multithreaded computation, and an Intel C++ compiler. Our scheme maintains the original precision and involves no data exchange between the GPU and CPU. The merits of our scheme are demonstrated in numerical experiments against the traditional implementation. Our scheme achieves a speedup of about 40, which can be further improved severalfold using the latest quad-core processors.
Adaptive track scheduling to optimize concurrency and vectorization in GeantV
Apostolakis, J.; Bandieramonte, M.; Bitzes, G.; ...
2015-05-22
The GeantV project is focused on the R&D of new particle transport techniques to maximize parallelism on multiple levels, profiting from the use of both SIMD instructions and co-processors for the CPU-intensive calculations specific to this type of application. In our approach, vectors of tracks belonging to multiple events and matching different locality criteria must be gathered and dispatched to algorithms having vector signatures. While the transport propagates tracks and changes their individual states, data locality becomes harder to maintain. The scheduling policy has to be changed to maintain efficient vectors while keeping an optimal level of concurrency. The model has complex dynamics requiring tuning the thresholds to switch between the normal regime and special modes, i.e. prioritizing events to allow flushing memory, adding new events in the transport pipeline to boost locality, dynamically adjusting the particle vector size or switching between vector and single-track mode when vectorization causes only overhead. Lastly, this work requires a comprehensive study for optimizing these parameters to make the behaviour of the scheduler self-adapting, presenting here its initial results.
A Fast CT Reconstruction Scheme for a General Multi-Core PC
Zeng, Kai; Bai, Erwei; Wang, Ge
2007-01-01
Expensive computational cost is a severe limitation in CT reconstruction for clinical applications that need real-time feedback. A primary example is bolus-chasing computed tomography (CT) angiography (BCA) that we have been developing for the past several years. To accelerate the reconstruction process using the filtered backprojection (FBP) method, specialized hardware or graphics cards can be used. However, specialized hardware is expensive and not flexible. The graphics processing unit (GPU) in a current graphics card can only reconstruct images in a reduced precision and is not easy to program. In this paper, an acceleration scheme is proposed based on a multi-core PC. In the proposed scheme, several techniques are integrated, including utilization of geometric symmetry, optimization of data structures, single-instruction multiple-data (SIMD) processing, multithreaded computation, and an Intel C++ compiler. Our scheme maintains the original precision and involves no data exchange between the GPU and CPU. The merits of our scheme are demonstrated in numerical experiments against the traditional implementation. Our scheme achieves a speedup of about 40, which can be further improved severalfold using the latest quad-core processors. PMID:18256731
NASA Astrophysics Data System (ADS)
Poya, Roman; Gil, Antonio J.; Ortigosa, Rogelio
2017-07-01
The paper presents aspects of implementation of a new high performance tensor contraction framework for the numerical analysis of coupled and multi-physics problems on streaming architectures. In addition to explicit SIMD instructions and smart expression templates, the framework introduces domain specific constructs for the tensor cross product and its associated algebra recently rediscovered by Bonet et al. (2015, 2016) in the context of solid mechanics. The two key ingredients of the presented expression template engine are as follows. First, the capability to mathematically transform complex chains of operations to simpler equivalent expressions, while potentially avoiding routes with higher levels of computational complexity and, second, to perform a compile time depth-first or breadth-first search to find the optimal contraction indices of a large tensor network in order to minimise the number of floating point operations. For optimisations of tensor contraction such as loop transformation, loop fusion and data locality optimisations, the framework relies heavily on compile time technologies rather than source-to-source translation or JIT techniques. Every aspect of the framework is examined through relevant performance benchmarks, including the impact of data parallelism on the performance of isomorphic and nonisomorphic tensor products, the FLOP and memory I/O optimality in the evaluation of tensor networks, the compilation cost and memory footprint of the framework and the performance of tensor cross product kernels. The framework is then applied to finite element analysis of coupled electro-mechanical problems to assess the speed-ups achieved in kernel-based numerical integration of complex electroelastic energy functionals. In this context, domain-aware expression templates combined with SIMD instructions are shown to provide a significant speed-up over the classical low-level style programming techniques.
Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.
Bhandarkar, S M; Chirravuri, S; Arnold, J
1996-01-01
Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
Kalman filter tracking on parallel architectures
NASA Astrophysics Data System (ADS)
Cerati, G.; Elmer, P.; Krutelyov, S.; Lantz, S.; Lefebvre, M.; McDermott, K.; Riley, D.; Tadel, M.; Wittich, P.; Wurthwein, F.; Yagil, A.
2017-10-01
We report on the progress of our studies towards a Kalman filter track reconstruction algorithm with optimal performance on manycore architectures. The combinatorial structure of these algorithms is not immediately compatible with an efficient SIMD (or SIMT) implementation; the challenge for us is to recast the existing software so it can readily generate hundreds of shared-memory threads that exploit the underlying instruction set of modern processors. We show how the data and associated tasks can be organized in a way that is conducive to both multithreading and vectorization. We demonstrate very good performance on Intel Xeon and Xeon Phi architectures, as well as promising first results on Nvidia GPUs.
2008-07-31
Unlike the Lyrtech, each DSP on a Bittware board offers 3 MB of on-chip memory and 3 GFLOPs of 32-bit peak processing power. Based on the performance...Each NVIDIA 8800 Ultra features 576 GFLOPS on 128 612-MHz single-precision floating-point SIMD processors, arranged in 16 clusters of eight. Each
NASA Astrophysics Data System (ADS)
Mori, Kensaku; Suenaga, Yasuhito; Toriwaki, Jun-ichiro
2003-05-01
This paper describes a software-based fast volume rendering (VolR) method on a PC platform by using multimedia instructions, such as SIMD instructions, which are currently available in PCs' CPUs. This method achieves fast rendering speed through highly optimized software rather than an improved rendering algorithm. In volume rendering using a ray casting method, the system requires fast execution of the following processes: (a) interpolation of voxel or color values at sample points, (b) computation of normal vectors (gray-level gradient vectors), (c) calculation of shaded values obtained by dot-products of normal vectors and light source direction vectors, (d) memory access to a huge area, and (e) efficient ray skipping at translucent regions. The proposed software implements these fundamental processes in volume rendering by using special instruction sets for multimedia processing. The proposed software can generate virtual endoscopic images of a 3-D volume of 512x512x489 voxel size by volume rendering with perspective projection, specular reflection, and on-the-fly normal vector computation on a conventional PC without any special hardware at thirteen frames per second. Semi-translucent display is also possible.
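Steps (b) and (c) of this abstract can be made concrete with a scalar sketch. This is not the paper's SIMD implementation: the central-difference gradient, the ambient term, the function names and the tiny test volume are all assumptions made for illustration. In an optimized renderer the per-component arithmetic below is the kind of work that packs into multimedia/SIMD instructions.

```python
import math


def gradient(volume, x, y, z):
    """Central-difference gray-level gradient at an interior voxel.

    `volume` is indexed as volume[z][y][x]; the result doubles as the
    (unnormalized) surface normal used for shading.
    """
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    return (gx, gy, gz)


def shade(normal, light, ambient=0.1):
    """Diffuse shading from the dot product of unit normal and unit light.

    `light` is assumed to be a unit vector; `ambient` is a hypothetical
    floor so that unlit samples are not fully black.
    """
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        return ambient               # flat region: no usable normal
    unit = [c / length for c in normal]
    diffuse = max(0.0, sum(n * l for n, l in zip(unit, light)))
    return min(1.0, ambient + (1.0 - ambient) * diffuse)


# Hypothetical 3x3x3 test volume whose gray level equals the x coordinate.
volume = [[[float(x) for x in range(3)] for _ in range(3)] for _ in range(3)]
normal = gradient(volume, 1, 1, 1)
lit = shade(normal, (1.0, 0.0, 0.0))
unlit = shade(normal, (-1.0, 0.0, 0.0))
```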
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
NASA Astrophysics Data System (ADS)
Hadade, Ioan; di Mare, Luca
2016-08-01
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assert their efficiency for each distinct architecture. We report significant speedups for single thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
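The Roofline model referred to in this abstract bounds a kernel's attainable performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The sketch below uses hypothetical machine numbers, not figures from the paper, and the function names are chosen for illustration only.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity):
    """Attainable GFLOP/s for a kernel with the given arithmetic intensity.

    `intensity` is in FLOPs per byte moved to or from memory.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity above which a kernel becomes compute bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 100 GFLOP/s peak, 10 GB/s memory bandwidth.
memory_bound = roofline(100.0, 10.0, 2.0)    # limited by bandwidth
compute_bound = roofline(100.0, 10.0, 50.0)  # limited by peak compute
ridge = ridge_point(100.0, 10.0)
```

A kernel whose measured rate sits well below the value returned for its intensity still has headroom; one sitting on the bandwidth slope will not benefit from more SIMD arithmetic unless its data movement is reduced first.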
Jiang, Hanyu; Ganesan, Narayan
2016-02-27
The HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x uses a heuristic pipeline consisting of the MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, the P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Since the latest version is highly optimized for performance on modern multi-core CPUs with SSE capabilities, only a few acceleration attempts report speedup. However, the most compute-intensive tasks within the pipeline (viz., the MSV/SSV and P7Viterbi stages) still stand to benefit from the computational capabilities of massively parallel processors. A Multi-Tiered Parallel Framework (CUDAMPF) implemented on CUDA-enabled GPUs, presented here, offers a finer-grained parallelism for MSV/SSV and Viterbi algorithms. We couple the SIMT (Single Instruction Multiple Threads) mechanism with SIMD (Single Instruction Multiple Data) video instructions with warp-synchronism to achieve high-throughput processing and eliminate thread idling. We also propose a hardware-aware optimal allocation scheme of scarce resources like on-chip memory and caches in order to boost performance and scalability of CUDAMPF. In addition, runtime compilation via NVRTC, available with CUDA 7.0, is incorporated into the presented framework; it not only helps unroll the innermost loop to yield up to a 2- to 3-fold speedup over static compilation but also enables dynamic loading and switching of kernels depending on the query model size, in order to achieve optimal performance. CUDAMPF is designed as a hardware-aware parallel framework for accelerating computational hotspots within the hmmsearch pipeline as well as other sequence alignment applications. It achieves significant speedup by exploiting hierarchical parallelism on a single GPU and takes full advantage of limited resources based on their own performance features.
In addition to exceeding the performance of other acceleration attempts, comprehensive evaluations against high-end CPUs (Intel i5, i7 and Xeon) show that CUDAMPF yields up to 440 GCUPS for SSV, 277 GCUPS for MSV and 14.3 GCUPS for P7Viterbi, all with 100% accuracy, which translates to a maximum speedup of 37.5, 23.1 and 11.6-fold for MSV, SSV and P7Viterbi respectively. The source code is available at https://github.com/Super-Hippo/CUDAMPF.
Performance of GeantV EM Physics Models
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2017-10-01
The recent progress in parallel hardware architectures with deeper vector pipelines or many-core technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architectures. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
Geospace simulations on the Cell BE processor
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D.
2008-12-01
OpenGGCM (Open Geospace General Circulation Model) is an established numerical code that simulates the Earth's space environment. The most compute-intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is limited by computational constraints on grid resolution. We investigate porting of the MHD solver to the Cell BE architecture, a novel inhomogeneous multicore architecture capable of up to 230 GFlops per processor. Realizing this high performance on the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallel approach: On the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory-limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the vector/SIMD FPUs in each SPE. Memory management needs to be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We obtained excellent performance numbers, a speed-up of a factor of 25 compared to just using the main processor, while still keeping the numerical implementation details of the code maintainable.
Determination of performance characteristics of scientific applications on IBM Blue Gene/Q
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evangelinos, C.; Walkup, R. E.; Sachdeva, V.
The IBM Blue Gene®/Q platform presents scientists and engineers with a rich set of hardware features such as 16 cores per chip sharing a Level 2 cache, a wide SIMD (single-instruction, multiple-data) unit, a five-dimensional torus network, and hardware support for collective operations. Especially important is the feature related to cores that have four “hardware threads,” which makes it possible to hide latencies and obtain a high fraction of the peak issue rate from each core. All of these hardware resources present unique performance-tuning opportunities on Blue Gene/Q. We provide an overview of several important applications and solvers and study them on Blue Gene/Q using performance counters and Message Passing Interface profiles. We also discuss how Blue Gene/Q tools help us understand the interaction of the application with the hardware and software layers and provide guidance for optimization. Furthermore, on the basis of our analysis, we discuss code improvement strategies targeting Blue Gene/Q. Information about how these algorithms map to the Blue Gene® architecture is expected to have an impact on future system design as we move to the exascale era.
The AIS-5000 parallel processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmitt, L.A.; Wilson, S.S.
1988-05-01
The AIS-5000 is a commercially available massively parallel processor which has been designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. This architecture has better cost/performance characteristics than two-dimensional mesh-connected systems. The design of the processing elements and their interconnections as well as the software used to program the system allow a wide variety of algorithms and applications to be implemented. In this paper, the overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O pathways and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is presented. This model is supported by the AIS-5000 hardware and software and allows the system to be treated as a full-image-size, two-dimensional, mesh-connected parallel processor. Performance benchmarks are given for certain simple and complex functions.
fast_protein_cluster: parallel and optimized clustering of large-scale protein modeling data.
Hung, Ling-Hong; Samudrala, Ram
2014-06-15
fast_protein_cluster is a fast, parallel and memory efficient package used to cluster 60 000 sets of protein models (with up to 550 000 models per set) generated by the Nutritious Rice for the World project. fast_protein_cluster is an optimized and extensible toolkit that supports Root Mean Square Deviation after optimal superposition (RMSD) and Template Modeling score (TM-score) as metrics. RMSD calculations using a laptop CPU are 60× faster than qcprot and 3× faster than current graphics processing unit (GPU) implementations. New GPU code further increases the speed of RMSD and TM-score calculations. fast_protein_cluster provides novel k-means and hierarchical clustering methods that are up to 250× and 2000× faster, respectively, than Clusco, and identify significantly more accurate models than Spicker and Clusco. fast_protein_cluster is written in C++ using OpenMP for multi-threading support. Custom streaming Single Instruction Multiple Data (SIMD) extensions and advanced vector extension intrinsics code accelerate CPU calculations, and OpenCL kernels support AMD and Nvidia GPUs. fast_protein_cluster is available under the M.I.T. license. (http://software.compbio.washington.edu/fast_protein_cluster) © The Author 2014. Published by Oxford University Press.
Efficient, massively parallel eigenvalue computation
NASA Technical Reports Server (NTRS)
Huo, Yan; Schreiber, Robert
1993-01-01
In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation
2011-01-01
Background: The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. Results: A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Conclusions: Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance. PMID:21631914
Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.
Rognes, Torbjørn
2011-06-01
The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
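The inter-sequence idea in the abstract above (several database sequences compared in parallel to one query residue) can be illustrated with a minimal pure-Python sketch. This is not SWIPE's implementation: the function name `sw_scores_batch`, the simple match/mismatch scores, and the linear gap penalty are assumptions chosen for brevity, and the innermost loop over lanes merely stands in for what one SIMD instruction would do across all lanes at once.

```python
def sw_scores_batch(query, db_seqs, match=2, mismatch=-1, gap=2):
    """Best local alignment score of `query` against each sequence in `db_seqs`.

    Each database sequence occupies one "lane". All lanes visit the same
    dynamic-programming cell in lockstep, which is the access pattern an
    inter-sequence SIMD implementation relies on.
    Assumptions: linear gap penalty, and no "\0" character in `query`.
    """
    lanes = len(db_seqs)
    width = max((len(s) for s in db_seqs), default=0)
    # Pad shorter sequences with a character that never matches the query,
    # so every lane has the same length and padding cannot raise a score.
    padded = [s.ljust(width, "\0") for s in db_seqs]

    # prev[i][k] holds H(i, j-1) for lane k; cur[i][k] holds H(i, j).
    prev = [[0] * lanes for _ in range(len(query) + 1)]
    best = [0] * lanes
    for j in range(width):
        cur = [[0] * lanes for _ in range(len(query) + 1)]
        for i, q in enumerate(query, start=1):
            # One query residue is compared against one residue from every
            # lane; a SIMD unit would process this loop as a single step.
            for k in range(lanes):
                sub = match if padded[k][j] == q else mismatch
                h = max(0,
                        prev[i - 1][k] + sub,   # diagonal: (mis)match
                        prev[i][k] - gap,       # gap in the query
                        cur[i - 1][k] - gap)    # gap in the database sequence
                cur[i][k] = h
                if h > best[k]:
                    best[k] = h
        prev = cur
    return best
```

For example, `sw_scores_batch("ACGT", ["ACGT", "TTTT", "AC"])` scores three database sequences in one pass. Because every lane executes the same operations regardless of its data, this layout avoids the data-dependent control flow that makes intra-sequence vectorisation harder.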
EmptyHeaded: A Relational Engine for Graph Processing
Aberger, Christopher R.; Tu, Susan; Olukotun, Kunle; Ré, Christopher
2016-01-01
There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded’s design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3× worse performance on SSSP. PMID:28077912
Towards a high performance geometry library for particle-detector simulations
Apostolakis, J.; Bandieramonte, M.; Bitzes, G.; ...
2015-05-22
Thread-parallelization and single-instruction multiple data (SIMD) "vectorisation" of software components in HEP computing have become a necessity to fully benefit from current and future computing hardware. In this context, the Geant-Vector/GPU simulation project aims to re-engineer current software for the simulation of the passage of particles through detectors in order to increase the overall event throughput. As one of the core modules in this area, the geometry library plays a central role and vectorising its algorithms will be one of the cornerstones towards achieving good CPU performance. Here, we report on the progress made in vectorising the shape primitives, as well as in applying new C++ template based optimizations of existing code available in the Geant4, ROOT or USolids geometry libraries. We will focus on a presentation of our software development approach that aims to provide optimized code for all use cases of the library (e.g., single particle and many-particle APIs) and to support different architectures (CPU and GPU) while keeping the code base small, manageable and maintainable. We report on a generic and templated C++ geometry library as a continuation of the AIDA USolids project. As a result, the experience gained with these developments will be beneficial to other parts of the simulation software, such as for the optimization of the physics library, and possibly to other parts of the experiment software stack, such as reconstruction and analysis.
Towards a high performance geometry library for particle-detector simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apostolakis, J.; Bandieramonte, M.; Bitzes, G.
Thread-parallelization and single-instruction multiple data (SIMD) "vectorisation" of software components in HEP computing have become a necessity to fully benefit from current and future computing hardware. In this context, the Geant-Vector/GPU simulation project aims to re-engineer current software for the simulation of the passage of particles through detectors in order to increase the overall event throughput. As one of the core modules in this area, the geometry library plays a central role and vectorising its algorithms will be one of the cornerstones towards achieving good CPU performance. Here, we report on the progress made in vectorising the shape primitives, as well as in applying new C++ template based optimizations of existing code available in the Geant4, ROOT or USolids geometry libraries. We will focus on a presentation of our software development approach that aims to provide optimized code for all use cases of the library (e.g., single particle and many-particle APIs) and to support different architectures (CPU and GPU) while keeping the code base small, manageable and maintainable. We report on a generic and templated C++ geometry library as a continuation of the AIDA USolids project. As a result, the experience gained with these developments will be beneficial to other parts of the simulation software, such as for the optimization of the physics library, and possibly to other parts of the experiment software stack, such as reconstruction and analysis.
Hierarchical parallelisation of functional renormalisation group calculations - hp-fRG
NASA Astrophysics Data System (ADS)
Rohe, Daniel
2016-10-01
The functional renormalisation group (fRG) has evolved into a versatile tool in condensed matter theory for studying important aspects of correlated electron systems. Practical applications of the method often involve a high numerical effort, motivating the question of how far High Performance Computing (HPC) can leverage the approach. In this work we report on a multi-level parallelisation of the underlying computational machinery and show that this can speed up the code by several orders of magnitude. This in turn can extend the applicability of the method to otherwise inaccessible cases. We exploit three levels of parallelisation: distributed computing by means of Message Passing (MPI), shared-memory computing using OpenMP, and vectorisation by means of SIMD units (single-instruction-multiple-data). Results are provided for two distinct HPC platforms, namely the IBM-based BlueGene/Q system JUQUEEN and an Intel Sandy-Bridge-based development cluster. We discuss how certain issues and obstacles were overcome in the course of adapting the code. Most importantly, we conclude that this vast improvement can actually be accomplished by introducing only moderate changes to the code, such that this strategy may serve as a guideline for other researchers to likewise improve the efficiency of their codes.
Real-time road detection in infrared imagery
NASA Astrophysics Data System (ADS)
Andre, Haritini E.; McCoy, Keith
1990-09-01
Automatic road detection is an important part of many scene recognition applications. The extraction of roads provides a means of navigation and position update for remotely piloted vehicles or autonomous vehicles. Roads supply strong contextual information which can be used to improve the performance of automatic target recognition (ATR) systems by directing the search for targets and adjusting target classification confidences. This paper will describe algorithmic techniques for labeling roads in high-resolution infrared imagery. In addition, real-time implementation of this structural approach using a processor array based on the Martin Marietta Geometric Arithmetic Parallel Processor (GAPP™) chip will be addressed. The algorithm described is based on the hypothesis that a road consists of pairs of line segments separated by a distance "d" with opposite gradient directions (antiparallel). The general nature of the algorithm and its parallel implementation on a single instruction, multiple data (SIMD) machine are improvements over existing work. The algorithm seeks to identify line segments meeting the road hypothesis in a manner that performs well, even when the side of the road is fragmented due to occlusion or intersections. The use of geometrical relationships between line segments is a powerful yet flexible method of road classification which is independent of orientation. In addition, this approach can be used to nominate other types of objects with minor parametric changes.
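The road hypothesis stated in the abstract above (a pair of line segments separated by a distance "d" with opposite gradient directions) can be expressed as a small geometric test. The following Python sketch is only an illustration under stated assumptions: the segment representation (two endpoints plus a mean gradient vector), the function name `is_road_pair`, and both tolerance parameters are hypothetical and do not come from the paper.

```python
import math


def _unit(v):
    """Normalise a 2D vector (assumed to be non-zero)."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def is_road_pair(seg_a, seg_b, d, dist_tol=1.0, angle_tol_deg=10.0):
    """Check whether two segments satisfy the antiparallel road hypothesis.

    Each segment is ((x0, y0), (x1, y1), (gx, gy)): two endpoints and the
    mean intensity-gradient vector along the segment (hypothetical layout).
    """
    (a0, a1, grad_a), (b0, b1, grad_b) = seg_a, seg_b
    cos_tol = math.cos(math.radians(angle_tol_deg))

    # 1. The segments must be roughly parallel lines. Using the absolute
    #    value of the dot product ignores the order of the endpoints.
    ua = _unit((a1[0] - a0[0], a1[1] - a0[1]))
    ub = _unit((b1[0] - b0[0], b1[1] - b0[1]))
    if abs(ua[0] * ub[0] + ua[1] * ub[1]) < cos_tol:
        return False

    # 2. The gradients must point in roughly opposite directions.
    na, nb = _unit(grad_a), _unit(grad_b)
    if na[0] * nb[0] + na[1] * nb[1] > -cos_tol:
        return False

    # 3. The perpendicular distance from the midpoint of segment B to the
    #    line through segment A must be close to the expected road width d.
    mid = ((b0[0] + b1[0]) / 2.0, (b0[1] + b1[1]) / 2.0)
    sep = abs(ua[0] * (mid[1] - a0[1]) - ua[1] * (mid[0] - a0[0]))
    return abs(sep - d) <= dist_tol
```

Because the test uses only relative angles and a perpendicular distance, it does not depend on the absolute orientation of the pair, which is consistent with the orientation independence the abstract describes.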
Special-purpose computing for dense stellar systems
NASA Astrophysics Data System (ADS)
Makino, Junichiro
2007-08-01
I'll describe the current status of the GRAPE-DR project. The GRAPE-DR is the next-generation hardware for N-body simulation. Unlike previous GRAPE hardware, it is a programmable SIMD machine with a large number of simple processors integrated into a single chip. The GRAPE-DR chip consists of 512 simple processors and operates at a clock speed of 500 MHz, delivering a theoretical peak speed of 512/226 Gflops (single/double precision). As of August 2006, the first prototype board with the sample chip successfully passed the test we prepared. The full GRAPE-DR system will consist of 4096 chips, reaching a theoretical peak speed of 2 Pflops.
Horizontal vectorization of electron repulsion integrals.
Pritchard, Benjamin P; Chow, Edmond
2016-10-30
We present an efficient implementation of the Obara-Saika algorithm for the computation of electron repulsion integrals that utilizes vector intrinsics to calculate several primitive integrals concurrently in a SIMD vector. Initial benchmarks display a 2-4 times speedup with AVX instructions over comparable scalar code, depending on the basis set. Speedup over scalar code is found to be sensitive to the level of contraction of the basis set, and is best for (l_A l_B | l_C l_D) quartets when l_D = 0 or l_B = l_D = 0, which makes such a vectorization scheme particularly suitable for density fitting. The basic Obara-Saika algorithm, how it is vectorized, and the performance bottlenecks are analyzed and discussed. © 2016 Wiley Periodicals, Inc.
Kernel optimization for short-range molecular dynamics
NASA Astrophysics Data System (ADS)
Hu, Changjun; Wang, Xianmeng; Li, Jianjiang; He, Xinfu; Li, Shigang; Feng, Yangde; Yang, Shaofeng; Bai, He
2017-02-01
To optimize short-range force computations in Molecular Dynamics (MD) simulations, multi-threading and SIMD optimizations are presented in this paper. With respect to multi-threading optimization, a Partition-and-Separate-Calculation (PSC) method is designed to avoid write conflicts caused by using Newton's third law. Serial bottlenecks are eliminated with no additional memory usage. The method is implemented by using the OpenMP model. Furthermore, the PSC method is employed on Intel Xeon Phi coprocessors in both native and offload models. We also evaluate the performance of the PSC method under different thread affinities on the MIC architecture. For SIMD execution, we explain how the "if-clause" of the cutoff radius check influences the performance of the PSC method. The experimental results show that our PSC method is more efficient than some traditional methods. In double precision, our 256-bit SIMD implementation is about 3 times faster than the scalar version.
Introduction to a system for implementing neural net connections on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1988-01-01
Neural networks have attracted much interest recently, and using parallel architectures to simulate neural networks is a natural and necessary application. The SIMD model of parallel computation is chosen, because systems of this type can be built with large numbers of processing elements. However, such systems are not naturally suited to generalized communication. A method is proposed that allows an implementation of neural network connections on massively parallel SIMD architectures. The key to this system is an algorithm permitting the formation of arbitrary connections between the neurons. A feature is the ability to add new connections quickly. It also has error recovery ability and is robust over a variety of network topologies. Simulations of the general connection system, and its implementation on the Connection Machine, indicate that the time and space requirements are proportional to the product of the average number of connections per neuron and the diameter of the interconnection network.
Accelerated Adaptive MGS Phase Retrieval
NASA Technical Reports Server (NTRS)
Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang
2011-01-01
The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm to allow parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time. The exploit involves performing matrix calculations in nVidia graphic cards. The graphical processor unit (GPU) is hardware that is specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model is used, called CUDA, to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies, to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphic cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.
Steele, R J C; Kostourou, I; McClements, P; Watling, C; Libby, G; Weller, D; Brewster, D H; Black, R; Carey, F A; Fraser, C
2010-01-01
To assess the effect of gender, age and deprivation on key performance indicators in a colorectal cancer screening programme. Between March 2000 and May 2006 a demonstration pilot of biennial guaiac faecal occult blood test (gFOBT) colorectal screening was carried out in North-East Scotland for all individuals aged 50-69 years. The relevant populations were subdivided, by gender, into four age groups and into five deprivation categories according to the Scottish Index of Multiple Deprivation (SIMD), and key performance indicators analysed within these groups. In all rounds, uptake of the gFOBT increased with age (P < 0.001), decreased with increasing deprivation in both genders (P < 0.001), and was consistently higher in women than in men in all age and all SIMD groups. In addition, increasing deprivation was negatively associated with uptake of colonoscopy in men with a positive gFOBT (P < 0.001) although this effect was not observed in women. Positivity rates increased with age (P < 0.001) and increasing deprivation (P < 0.001) in both genders in all rounds, although they were higher in men than in women for all age and SIMD categories. Cancer detection rates increased with age (P < 0.001), were higher in men than in women in all age and SIMD categories, but were not consistently related to deprivation. In both genders, the positive predictive value (PPV) for cancer increased with age (P < 0.001) and decreased with increasing deprivation (P < 0.001) in all rounds and was consistently higher in men than in women in all age and SIMD categories. In this population-based colorectal screening programme gender, age, and deprivation had marked effects on key performance indicators, and this has implications both for the evaluation of screening programmes and for strategies designed to reduce inequalities.
On the Impact of Widening Vector Registers on Sequence Alignment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram
2016-09-22
Vector extensions, such as SSE, have been part of the x86 since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. In this paper, we demonstrate that the trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. We present a practically efficient SIMD implementation of a parallel scan based sequence alignment algorithm that can better exploit wider SIMD units. We conduct comprehensive workload and use case analyses to characterize the relative behavior of the striped and scan approaches and identify the best choice of algorithm based on input length and SIMD width.
Perry, John L; Dempster, Martin; McKay, Michael T
2017-01-01
A developing literature continues to testify to the relationship between higher socio-economic status (SES) and better academic attainment. However, the literature is complex in terms of the variety of SES and attainment indicators used. Against the backdrop of a Scottish Government initiative to close the attainment gap between higher and lower SES children, the present study examined the relationship between individual-level Scottish Index of Multiple Deprivation (SIMD) and National Lower Tariff Score in school children in the West of Scotland. Results showed a practically significant relationship between SIMD and Tariff Score. This relationship was partially mediated by higher academic self-efficacy, so that higher belief in academic competency partially mediated the SIMD-Tariff Score relationship. Further, this partial mediation was robust to the influence of gender, sensation seeking, level of school attendance and past month frequency of Heavy Episodic Drinking. It is suggested that increasing attendance and perceived academic competence are viable ways (among others) of attempting to close the attainment gap.
Perry, John L.; Dempster, Martin; McKay, Michael T.
2017-01-01
A developing literature continues to testify to the relationship between higher socio-economic status (SES) and better academic attainment. However, the literature is complex in terms of the variety of SES and attainment indicators used. Against the backdrop of a Scottish Government initiative to close the attainment gap between higher and lower SES children, the present study examined the relationship between individual-level Scottish Index of Multiple Deprivation (SIMD) and National Lower Tariff Score in school children in the West of Scotland. Results showed a practically significant relationship between SIMD and Tariff Score. This relationship was partially mediated by higher academic self-efficacy, so that higher belief in academic competency partially mediated the SIMD-Tariff Score relationship. Further, this partial mediation was robust to the influence of gender, sensation seeking, level of school attendance and past month frequency of Heavy Episodic Drinking. It is suggested that increasing attendance and perceived academic competence are viable ways (among others) of attempting to close the attainment gap. PMID:29163281
Flexbar 3.0 - SIMD and multicore parallelization.
Roehr, Johannes T; Dieterich, Christoph; Reinert, Knut
2017-09-15
High-throughput sequencing machines can process many samples in a single run. For Illumina systems, sequencing reads are barcoded with an additional DNA tag that is contained in the respective sequencing adapters. The recognition of barcode and adapter sequences is hence commonly needed for the analysis of next-generation sequencing data. Flexbar performs demultiplexing based on barcodes and adapter trimming for such data. The massive amounts of data generated on modern sequencing machines demand that this preprocessing is done as efficiently as possible. We present Flexbar 3.0, the successor of the popular program Flexbar. It now employs twofold parallelism: multi-threading and, additionally, SIMD vectorization. Both types of parallelism are used to speed up the computation of pair-wise sequence alignments, which are used for the detection of barcodes and adapters. Furthermore, new features were included to cover a wide range of applications. We evaluated the performance of Flexbar based on a simulated sequencing dataset. Our program outcompetes other tools in terms of speed and is among the best tools in the presented quality benchmark. Availability: https://github.com/seqan/flexbar. Contact: johannes.roehr@fu-berlin.de or knut.reinert@fu-berlin.de. © The Author (2017). Published by Oxford University Press.
Unstructured grids on SIMD torus machines
NASA Technical Reports Server (NTRS)
Bjorstad, Petter E.; Schreiber, Robert
1994-01-01
Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.
NASA Technical Reports Server (NTRS)
Manohar, Mareboyana; Tilton, James C.
1994-01-01
A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.
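One possible reading of the progressive scheme described above is a multi-level residual quantizer: each level applies full-search VQ to what the previous levels left unexplained, and the final residual is stored exactly so that reconstruction is lossless. The Python sketch below follows that reading. It is an assumption, not the authors' method; the function names, the integer vectors, and the hand-picked codebooks are hypothetical, and codebook training is omitted entirely.

```python
def nearest(codebook, vec):
    """Full search: index of the codeword with the smallest squared error."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def encode(vec, codebooks):
    """Quantize `vec` level by level.

    Returns one codeword index per level plus the final residual, which
    stands in for the losslessly compressed last level.
    """
    indices, residual = [], list(vec)
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, residual


def decode(indices, codebooks, residual=None, levels=None):
    """Reconstruct from the first `levels` indices (all levels by default).

    Passing `residual` together with all levels gives exact reconstruction.
    """
    levels = len(indices) if levels is None else levels
    out = [0] * len(codebooks[0][0])
    for i, codebook in zip(indices[:levels], codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    if residual is not None:
        out = [o + r for o, r in zip(out, residual)]
    return out
```

Decoding with `levels=1` gives a coarse approximation, adding levels refines it, and adding the stored residual recovers the input exactly. The number of levels and the codebook sizes are the trade-off parameters the abstract refers to; in this sketch the full search inside `nearest` is the step that a massively parallel SIMD machine would accelerate.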
A programmable computational image sensor for high-speed vision
NASA Astrophysics Data System (ADS)
Yang, Jie; Shi, Cong; Long, Xitian; Wu, Nanjian
2013-08-01
In this paper we present a programmable computational image sensor for high-speed vision. This computational image sensor contains four main blocks: an image pixel array, a massively parallel processing element (PE) array, a row processor (RP) array and a RISC core. The pixel-parallel PE array is responsible for transferring, storing and processing raw image data in a SIMD fashion with its own programming language. The RPs form a one-dimensional array of simplified RISC cores that can carry out complex arithmetic and logic operations. The PE array and RP array can complete a large amount of computation in few instruction cycles and therefore satisfy the requirements of low- and middle-level high-speed image processing. The RISC core controls the whole system operation and executes some high-level image processing algorithms. We utilize a simplified AHB bus as the system bus to connect our major components. A programming language and the corresponding tool chain for this computational image sensor have also been developed.
Tomo3D 2.0--exploitation of advanced vector extensions (AVX) for 3D reconstruction.
Agulleiro, Jose-Ignacio; Fernandez, Jose-Jesus
2015-02-01
Tomo3D is a program for fast tomographic reconstruction on multicore computers. Its high speed stems from code optimization, vectorization with Streaming SIMD Extensions (SSE), multithreading and optimization of disk access. Recently, Advanced Vector eXtensions (AVX) have been introduced in the x86 processor architecture. Compared to SSE, AVX double the number of simultaneous operations, thus pointing to a potential twofold gain in speed. However, in practice, achieving this potential is extremely difficult. Here, we provide a technical description and an assessment of the optimizations included in Tomo3D to take advantage of AVX instructions. Tomo3D 2.0 allows huge reconstructions to be calculated in standard computers in a matter of minutes. Thus, it will be a valuable tool for electron tomography studies with increasing resolution needs. Copyright © 2014 Elsevier Inc. All rights reserved.
Ralston, Kevin; Dundas, Ruth; Leyland, Alastair H
2014-07-08
There is a growing international literature assessing inequalities in health and mortality by area based measures. However, there are few works comparing measures available to inform research design. The analysis here seeks to begin to address this issue by assessing whether there are important differences in the relationship between deprivation and inequalities in mortality when measures that have been constructed at different time points are compared. We contrast whether the interpretation of inequalities in all-cause mortality between the years 2008-10 changes in Scotland if we apply the earliest (2004) and the 2009 + 1 releases of the Scottish Index of Multiple Deprivation (SIMD) to make this comparison. The 2004 release is based on data from 2001/2 and the 2009 + 1 release is based on data from 2008/9. The slope index of inequality (SII) and 1:10 ratio are used to summarise inequalities standardised by age/sex using population and mortality records. The 1:10 ratio suggests some differences in the magnitude of inequalities measured using SIMD at different time points. However, the SII shows much closer correspondence. Overall the findings show that substantive conclusions in relation to inequalities in all-cause mortality are little changed by the updated measure. This information is beneficial to researchers as the most recent measures are not always available. This adds to the body of literature showing stability in inequalities in health and mortality by geographical deprivation over time.
Applications of Parallel Computation in Micro-Mechanics and Finite Element Method
NASA Technical Reports Server (NTRS)
Tan, Hui-Qian
1996-01-01
This project discusses the application of parallel computation to material analyses. Briefly, we analyze a material through computations on elements. We call an element a cell here. A cell is divided into a number of subelements called subcells, and all subcells in a cell have identical structure. The detailed structure will be given later in this paper. Because the problem is "well-structured", a SIMD machine is a natural choice. In this paper we look into the potential of SIMD machines for finite element computation by developing appropriate algorithms on MasPar, a SIMD parallel machine. In section 2, the architecture of MasPar is discussed, along with a brief review of the parallel programming language MPL. In section 3, some general parallel algorithms which might be useful to the project are proposed and, in combination with these algorithms, some features of MPL are discussed in more detail. In section 4, the computational structure of the cell/subcell model is given, and the idea behind the design of the parallel algorithm for the model is demonstrated. Finally, section 5 gives a summary.
Serial multiplier arrays for parallel computation
NASA Technical Reports Server (NTRS)
Winters, Kel
1990-01-01
Arrays of systolic serial-parallel multiplier elements are proposed as an alternative to conventional SIMD mesh serial adder arrays for applications that are multiplication intensive and require few stored operands. The design and operation of a number of multiplier and array configurations featuring locality of connection, modularity, and regularity of structure are discussed. A design methodology combining top-down and bottom-up techniques is described to facilitate development of custom high-performance CMOS multiplier element arrays as well as rapid synthesis of simulation models and semicustom prototype CMOS components. Finally, a differential version of NORA dynamic circuits requiring a single-phase uncomplemented clock signal is introduced for this application.
A hybrid algorithm for parallel molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Mangiardi, Chris M.; Meyer, R.
2017-10-01
This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.
An efficient three-dimensional Poisson solver for SIMD high-performance-computing architectures
NASA Technical Reports Server (NTRS)
Cohl, H.
1994-01-01
We present an algorithm that solves the three-dimensional Poisson equation on a cylindrical grid. The technique uses a finite-difference scheme with operator splitting. This splitting maps the banded structure of the operator matrix into a two-dimensional set of tridiagonal matrices, which are then solved in parallel. Our algorithm couples FFT techniques with the well-known ADI (Alternating Direction Implicit) method for solving elliptic PDEs, and the implementation is extremely well suited for a massively parallel environment like the SIMD architecture of the MasPar MP-1. Due to the highly recursive nature of our problem, we believe that our method is highly efficient, as it avoids excessive interprocessor communication.
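The abstract above reduces the problem to many independent tridiagonal systems that are solved in parallel. As a rough illustration of the per-system work, here is a minimal scalar sketch of the standard Thomas algorithm. It is not taken from the paper; the function name and the convention that `lower[0]` and `upper[-1]` are ignored are assumptions made for this example. On a SIMD machine each processor could, in principle, run the same recurrence on its own system.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored (assumed convention for this sketch).
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 1D Laplacian-like system whose exact solution is [1, 2, 3].
    print(solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                            [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0]))
```

Because each system is independent, the outer loop over systems carries no dependencies; only the recurrence inside a single solve is sequential.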
Algorithm architecture co-design for ultra low-power image sensor
NASA Astrophysics Data System (ADS)
Laforest, T.; Dupret, A.; Verdant, A.; Lattard, D.; Villard, P.
2012-03-01
In the context of embedded video surveillance, stand-alone left-behind image sensors are used to detect events with a high level of confidence, but also with very low power consumption. Using a steady camera, motion detection algorithms based on background estimation to find regions in movement are simple to implement and computationally efficient. To reduce power consumption, the background is estimated using a down-sampled image formed of macropixels. In order to extend the class of moving objects to be detected, we propose an original mixed-mode architecture developed through an algorithm architecture co-design methodology. This programmable architecture is composed of a vector of SIMD processors. A basic RISC architecture was optimized in order to implement motion detection algorithms with a dedicated set of 42 instructions. Defining delta modulation as a calculation primitive has allowed algorithms to be implemented in a very compact way. Thereby, a 1920x1080@25fps CMOS image sensor performing integrated motion detection is proposed with a power estimation of 1.8 mW.
Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.
Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375 residue query sequence a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar’s ‘striped’ approach. When using only a single thread, parasail was 1.7 times faster than Rognes’s SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
Application of multigrid methods to the solution of liquid crystal equations on a SIMD computer
NASA Technical Reports Server (NTRS)
Farrell, Paul A.; Ruttan, Arden; Zeller, Reinhardt R.
1993-01-01
We will describe a finite difference code for computing the equilibrium configurations of the order-parameter tensor field for nematic liquid crystals in rectangular regions by minimization of the Landau-de Gennes Free Energy functional. The implementation of the free energy functional described here includes magnetic fields, quadratic gradient terms, and scalar bulk terms through the fourth order. Boundary conditions include the effects of strong surface anchoring. The target architectures for our implementation are SIMD machines, with interconnection networks which can be configured as 2 or 3 dimensional grids, such as the Wavetracer DTC. We also discuss the relative efficiency of a number of iterative methods for the solution of the linear systems arising from this discretization on such architectures.
PS3 CELL Development for Scientific Computation and Research
NASA Astrophysics Data System (ADS)
Christiansen, M.; Sevre, E.; Wang, S. M.; Yuen, D. A.; Liu, S.; Lyness, M. D.; Broten, M.
2007-12-01
The Cell processor is one of the most powerful processors on the market, and researchers in the earth sciences may find its parallel architecture to be very useful. A Cell processor, with 7 cores, can easily be obtained for experimentation by purchasing a PlayStation 3 (PS3) and installing Linux and the IBM SDK. Each core of the PS3 is capable of 25 GFLOPS, giving a potential limit of 150 GFLOPS when using all 6 SPUs (synergistic processing units) with vectorized algorithms. We have used the Cell's computational power to create a program which takes simulated tsunami datasets, parses them, and returns a colorized height field image using ray casting techniques. As expected, the time required to create an image is inversely proportional to the number of SPUs used. We believe that this trend will continue when multiple PS3s are chained using OpenMP functionality and are in the process of researching this. By using the Cell to visualize tsunami data, we have found that its greatest feature is its power. This fits well with the needs of the scientific community, where the limiting factor is time. Any algorithm, such as the heat equation, that can be subdivided into multiple parts can take advantage of the PS3 Cell's ability to split the computations across the 6 SPUs, reducing the required run time to one sixth. Further vectorization of the code can allow for 4 simultaneous floating point operations by using the SIMD (single instruction multiple data) capabilities of the SPU, increasing efficiency 24 times.
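The figures quoted in this abstract follow from simple multiplication, and a short sketch makes the assumptions explicit. The function names below are hypothetical and the model is an ideal upper bound: it assumes perfect load balance across SPUs, full use of every SIMD lane, and no communication or memory overhead.

```python
def peak_gflops(units, gflops_per_unit):
    """Aggregate peak rate if every unit runs at its own peak."""
    return units * gflops_per_unit


def ideal_speedup(units, simd_lanes=1):
    """Upper bound on speedup: thread-level times SIMD-level parallelism."""
    return units * simd_lanes


def ideal_runtime(serial_seconds, units, simd_lanes=1):
    """Run time under the ideal model; real codes will be slower."""
    return serial_seconds / ideal_speedup(units, simd_lanes)


if __name__ == "__main__":
    print(peak_gflops(6, 25.0))        # 6 SPUs at 25 GFLOPS each
    print(ideal_speedup(6, 4))         # 6 SPUs, 4 floats per SIMD operation
    print(ideal_runtime(60.0, 6))      # a 60 s serial job split six ways
```

With 6 SPUs at 25 GFLOPS the model gives the 150 GFLOPS limit stated above, and 6 SPUs with 4-wide SIMD give the factor of 24.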
Indirect addressing and load balancing for faster solution to Mandelbrot Set on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1989-01-01
SIMD computers with local indirect addressing allow programs to have queues and buffers, making certain kinds of problems much more efficient. Examined here are a class of problems characterized by computations on data points where the computation is identical, but the convergence rate is data dependent. Normally, in this situation, the algorithm time is governed by the maximum number of iterations required by each point. Using indirect addressing allows a processor to proceed to the next data point when it is done, reducing the overall number of iterations required to approach the mean convergence rate when a sufficiently large problem set is solved. Load balancing techniques can be applied for additional performance improvement. Simulations of this technique applied to solving Mandelbrot Sets indicate significant performance gains.
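The effect described in this abstract can be illustrated with a small cost model. This is a sketch under stated assumptions, not the author's simulation: `lockstep_steps` charges every batch of points the iteration count of its slowest point, while `queued_steps` lets each processor fetch the next point as soon as it finishes (a greedy assignment standing in for the queues enabled by indirect addressing). All function names are invented for the example.

```python
import heapq


def escape_iterations(c, max_iter=200):
    """Iterations of z -> z*z + c before |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def lockstep_steps(costs, lanes):
    """Each batch of `lanes` points runs until its slowest point is done."""
    return sum(max(costs[i:i + lanes]) for i in range(0, len(costs), lanes))


def queued_steps(costs, lanes):
    """Each lane takes the next point from a shared queue when it becomes idle."""
    finish = [0] * lanes          # time at which each lane becomes free
    heapq.heapify(finish)
    for cost in costs:
        start = heapq.heappop(finish)
        heapq.heappush(finish, start + cost)
    return max(finish)


if __name__ == "__main__":
    # Sample a coarse grid over the usual Mandelbrot window.
    points = [complex(-2.0 + 0.05 * i, -1.0 + 0.05 * j)
              for i in range(60) for j in range(40)]
    costs = [escape_iterations(p) for p in points]
    print("lockstep:", lockstep_steps(costs, 16))
    print("queued:  ", queued_steps(costs, 16))
```

In the lockstep model one slow point stalls its whole batch, so the total tends toward the maximum iteration count per batch; in the queued model the total tends toward the mean cost times the number of points divided by the number of lanes, which is the gain the abstract reports.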
2D-RBUC for efficient parallel compression of residuals
NASA Astrophysics Data System (ADS)
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with an efficient SIMD parallel decompression. The residuals originate from lossy or near lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and seamlessly reconstructed surface, the newly proposed method trades off small efficiency degradation for a non-negligible compression ratio (measured up to 91%) benefit.
Nicholson, L; Hotchin, H
2015-05-01
People with intellectual disabilities (ID) have high rates of psychiatric illness and are known to live in more deprived areas than the general population. This study investigated the relationship between area deprivation and contact with ID psychiatry. Psychiatric case notes and electronic records were used to identify all patients who had face-to-face contact with community ID psychiatric services over 1 year in the North East Community Health Partnership of Greater Glasgow and Clyde (estimated population 177,867). The Scottish Index of Multiple Deprivation (SIMD) was determined for the patient sample and for the general population living in the same area. Between 1 June 2012 and 1 June 2013, 184 patients were seen by ID psychiatry over a total of 553 contacts, with valid SIMD data for 179 patients and 543 contacts. Fifty-two per cent of patients (n = 93) lived in the most deprived SIMD decile, and 90.5% (n = 152) in the lowest 5 deciles. Compared with the general population, there were significantly more patients than expected living in the most deprived decile (Fisher's Exact test, P = 0.009) and in the most deprived 5 deciles (Fisher's Exact test, P = 0.001). The median number of contacts was 2 (interquartile range = 1-3). There was no significant association between the number of contacts and SIMD decile. Forty-eight point one per cent (n = 261) of all contacts were with patients living in the most deprived decile and 88.6% (n = 481) in the most deprived 5 deciles. This was significantly more than expected compared with general population data (Fisher's Exact test, P = 0.008 and Fisher's Exact test, P ≤ 0.001). In the area under study, contact with ID psychiatry was greater in more deprived areas. Given the high psychiatric morbidity of people with ID, if services do not adjust for deprivation, this may lead to further discrimination in an already disadvantaged population.
© 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Roofline model toolkit: A practical tool for architectural and program analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, Yu Jung; Williams, Samuel; Van Straalen, Brian
We present preliminary results of the Roofline Toolkit for multicore, many core, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable instrumented micro benchmarks implemented with Message Passing Interface (MPI), and OpenMP used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory managed mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
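For readers unfamiliar with the Roofline model mentioned in this abstract, its basic bound is that attainable performance is the smaller of the compute peak and the memory bandwidth multiplied by arithmetic intensity. The sketch below shows only that textbook formula; it is not code from the Roofline Toolkit, the function names are invented, and the machine numbers are hypothetical placeholders rather than measurements.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound: min(compute peak, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (flops/byte) at which a kernel stops being memory bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine: 200 GFLOP/s peak, 40 GB/s sustained bandwidth.
    peak, bandwidth = 200.0, 40.0
    for intensity in (0.25, 1.0, 5.0, 20.0):
        bound = attainable_gflops(peak, bandwidth, intensity)
        print(f"{intensity:5.2f} flops/byte -> {bound:6.1f} GFLOP/s")
    print("ridge point:", ridge_point(peak, bandwidth), "flops/byte")
```

A kernel whose intensity lies left of the ridge point is limited by memory traffic, so SIMD vectorization alone is unlikely to help it; to the right of the ridge point, the compute ceiling (including SIMD width) becomes the limit.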
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amadio, G.; et al.
An intensive R&D and programming effort is required to accomplish new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting in parallel particles in complex geometries exploiting instruction level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics model effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.
NASA Astrophysics Data System (ADS)
Drabik, Timothy J.; Lee, Sing H.
1986-11-01
The intrinsic parallelism characteristics of easily realizable optical SIMD arrays prompt their present consideration in the implementation of highly structured algorithms for the numerical solution of multidimensional partial differential equations and the computation of fast numerical transforms. Attention is given to a system, comprising several spatial light modulators (SLMs), an optical read/write memory, and a functional block, which performs simple, space-invariant shifts on images with sufficient flexibility to implement the fastest known methods for partial differential equations as well as a wide variety of numerical transforms in two or more dimensions. Either fixed or floating-point arithmetic may be used. A performance projection of more than 1 billion floating point operations/sec using SLMs with 1000 x 1000-resolution and operating at 1-MHz frame rates is made.
Route planning in a four-dimensional environment
NASA Technical Reports Server (NTRS)
Slack, M. G.; Miller, D. P.
1987-01-01
Robots must be able to function in the real world. The real world involves processes and agents that move independently of the actions of the robot, sometimes in an unpredictable manner. A real-time integrated route planning and spatial representation system for planning routes through dynamic domains is presented. The system will find the safest, most efficient route through space-time as described by a set of user-defined evaluation functions. Because the route planning algorithm is highly parallel and can run on an SIMD machine in O(p) time (p is the length of a path), the system will find real-time paths through unpredictable domains when used in an incremental mode. Spatial representation, an SIMD algorithm for route planning in a dynamic domain, and results from an implementation on a traditional computer architecture are discussed.
NASA Astrophysics Data System (ADS)
Francés, J.; Bleda, S.; Neipp, C.; Márquez, A.; Pascual, I.; Beléndez, A.
2013-03-01
The finite-difference time-domain method (FDTD) allows electromagnetic field distribution analysis as a function of time and space. The method is applied to analyze holographic volume gratings (HVGs) for the near-field distribution at optical wavelengths. Usually, this application requires the simulation of wide areas, which implies more memory and processing time. In this work, we propose a specific implementation of the FDTD method including several add-ons for a precise simulation of optical diffractive elements. Values in the near-field region are computed considering the illumination of the grating by means of a plane wave for different angles of incidence and including absorbing boundaries as well. We compare the results obtained by FDTD with those obtained using a matrix method (MM) applied to diffraction gratings. In addition, we have developed two optimized versions of the algorithm, for both CPU and GPU, in order to analyze the improvement of using the new NVIDIA Fermi GPU architecture versus a highly tuned multi-core CPU as a function of the simulation size. In particular, the optimized CPU implementation takes advantage of the arithmetic and data transfer streaming SIMD (single instruction multiple data) extensions (SSE) included explicitly in the code and also of multi-threading by means of OpenMP directives. A good agreement between the results obtained using both FDTD and MM methods is obtained, thus validating our methodology. Moreover, the performance of the GPU is compared to the SSE+OpenMP CPU implementation, and it is quantitatively determined that a highly optimized CPU program can be competitive for a wider range of simulation sizes, whereas GPU computing becomes more powerful for large-scale simulations.
PIMS: Memristor-Based Processing-in-Memory-and-Storage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cook, Jeanine
Continued progress in computing has augmented the quest for higher performance with a new quest for higher energy efficiency. This has led to the re-emergence of Processing-In-Memory (PIM) architectures that offer higher density and performance with some boost in energy efficiency. Past PIM work either integrated a standard CPU with a conventional DRAM to improve the CPU-memory link, or used a bit-level processor with Single Instruction Multiple Data (SIMD) control, but neither matched the energy consumption of the memory to the computation. We originally proposed to develop a new architecture derived from PIM that more effectively addressed energy efficiency for high performance scientific, data analytics, and neuromorphic applications. We also originally planned to implement a von Neumann architecture with arithmetic/logic units (ALUs) that matched the power consumption of an advanced storage array to maximize energy efficiency. Implementing this architecture in storage was our original idea, since by augmenting storage (instead of memory), the system could address both in-memory computation and applications that accessed larger data sets directly from storage, hence Processing-in-Memory-and-Storage (PIMS). However, as our research matured, we discovered several things that changed our original direction, the most important being that a PIM that implements a standard von Neumann-type architecture results in significant energy efficiency improvement, but only about an O(10) performance improvement. In addition to this, the emergence of new memory technologies moved us to proposing a non-von Neumann architecture, called Superstrider, implemented not in storage, but in a new DRAM technology called High Bandwidth Memory (HBM). HBM is a stacked DRAM technology that includes a logic layer where an architecture such as Superstrider could potentially be implemented.
Solving the Cauchy-Riemann equations on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations, a set of coupled first order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit-serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
Neuromorphic processing for next generation automotive control and diagnostics
NASA Technical Reports Server (NTRS)
Aranki, N.; Tawel, R.
2001-01-01
This paper describes the intra-layer architecture of a neuroprocessor organized in a SIMD configuration and the two challenging applications, misfire detection and engine idle speed control, that served as the focus of this effort.
Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments.
Daily, Jeff
2016-02-10
Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. A faster intra-sequence local pairwise alignment implementation is described and benchmarked, including new global and semi-global variants. Using a 375 residue query sequence a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 24-core processor system, the highest reported for an implementation based on Farrar's 'striped' approach. Rognes's SWIPE optimal database search application is still generally the fastest available at 1.2 to at best 2.4 times faster than Parasail for sequences shorter than 500 amino acids. However, Parasail was faster for longer sequences. For global alignments, Parasail's prefix scan implementation is generally the fastest, faster even than Farrar's 'striped' approach, however the opal library is faster for single-threaded applications. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. Applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
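To make the quantities in the Parasail abstracts concrete, the following is a plain scalar reference for the Smith-Waterman local alignment score together with the GCUPS metric. It is an illustration only: it does not use Parasail's API, it is not vectorized, and the simple match/mismatch scores with a linear gap penalty are assumptions chosen for brevity (real tools typically use substitution matrices and affine gaps).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        curr = [0]              # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Billions of table-cell updates per second for one database search."""
    return query_len * db_residues / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
    # Hypothetical timing: 375-residue query against 1e9 residues in 10 s.
    print(gcups(375, 1_000_000_000, 10.0))
```

Every cell depends on its left, upper, and upper-left neighbours, which is why SIMD implementations such as the striped and prefix-scan approaches named above rearrange the table rather than vectorizing the inner loop directly.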
A system for routing arbitrary directed graphs on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1987-01-01
There are many problems which can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph will be assigned to a processor in the machine. Algorithms are given that will be used to implement movement of data along the arcs of the graph. This architecture and algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be traversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal.
The 2nd Symposium on the Frontiers of Massively Parallel Computations
NASA Technical Reports Server (NTRS)
Mills, Ronnie (Editor)
1988-01-01
Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trędak, Przemysław, E-mail: przemyslaw.tredak@fuw.edu.pl; Rudnicki, Witold R.; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, ul. Pawińskiego 5a, 02-106 Warsaw
The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to the highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to a 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.
Parallel optimization algorithms and their implementation in VLSI design
NASA Technical Reports Server (NTRS)
Lee, G.; Feeley, J. J.
1991-01-01
Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.
Fisher, A; Craigie, A M; Macleod, M; Steele, R J C; Anderson, A S
2018-06-01
Although 45% of colorectal cancer (CRC) cases may be avoidable through appropriate lifestyle and weight management, health promotion interventions run the risk of widening health inequalities. The BeWEL randomised controlled trial assessed the impact of a diet and activity programme in overweight adults who were diagnosed with a colorectal adenoma, demonstrating a significantly greater weight loss at 12 months in intervention participants than in controls. The present study aimed to compare BeWEL intervention outcomes by participant deprivation status. The intervention group of the BeWEL trial (n = 163) was classified by the Scottish Index of Multiple Deprivation (SIMD) quintiles into 'more deprived' (SIMD 1-2, n = 58) and 'less deprived' (SIMD 3-5, n = 105). Socio-economic and lifestyle variables were compared at baseline to identify potential challenges to intervention adherence in the more deprived. Between group differences at 12 months in primary outcome (change in body weight) and secondary outcomes (cardiovascular risk factors, diet, physical activity, knowledge of CRC risk and psychosocial variables) were assessed by deprivation status. At baseline, education (P = 0.001), income (P < 0.001), spending on physical activity (P = 0.003) and success at previous weight loss attempts (P = 0.007) were significantly lower in the most deprived. At 12 months, no between group differences by deprivation status were detected for changes in primary and main secondary outcomes. Despite potential barriers faced by the more deprived participants, primary and most secondary outcomes were comparable between groups, indicating that this intervention is unlikely to worsen health inequalities and is equally effective across socio-economic groups. © 2017 The Authors. Journal of Human Nutrition and Dietetics published by John Wiley & Sons Ltd on behalf of British Dietetic Association.
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver
NASA Astrophysics Data System (ADS)
Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre
2014-06-01
This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
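The two performance figures quoted in this abstract can be cross-checked with a line of arithmetic. The sketch below takes the reported fraction of peak to be 40.74% and derives the node peak that this implies; the function name and the per-core breakdown are illustrative assumptions, not figures stated in the abstract.

```python
def implied_peak_gflops(sustained_gflops, fraction_of_peak):
    """Return the peak performance implied by a sustained rate and its share of peak."""
    return sustained_gflops / fraction_of_peak

# Figures taken from the abstract: 235 GFlops sustained at 40.74% of peak, 32 cores.
peak = implied_peak_gflops(235.0, 0.4074)
per_core = peak / 32  # assumes peak is spread evenly over the 32 cores

print(f"implied node peak: {peak:.1f} GFlops")
print(f"implied per-core peak: {per_core:.1f} GFlops")
```

The implied node peak comes out a little under 580 GFlops, or roughly 18 GFlops per core.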
Efficient implementation of the many-body Reactive Bond Order (REBO) potential on GPU
NASA Astrophysics Data System (ADS)
Trędak, Przemysław; Rudnicki, Witold R.; Majewski, Jacek A.
2016-09-01
The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to the highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to a 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.
CMOS VLSI Layout and Verification of a SIMD Computer
NASA Technical Reports Server (NTRS)
Zheng, Jianqing
1996-01-01
A CMOS VLSI layout and verification of a 3 x 3 processor parallel computer has been completed. The layout was done using the MAGIC tool and the verification using HSPICE. Suggestions for expanding the computer into a million processor network are presented. Many problems that might be encountered when implementing a massively parallel computer are discussed.
Overview and extensions of a system for routing directed graphs on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1988-01-01
Many problems can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from adjacent vertices. A method is given for parallelizing such problems on an SIMD machine model that uses only nearest neighbor connections for communication, and has no facility for local indirect addressing. Each vertex of the graph will be assigned to a processor in the machine. Rules for a labeling are introduced that support the use of a simple algorithm for movement of data along the edges of the graph. Additional algorithms are defined for addition and deletion of edges. Modifying or adding a new edge takes the same time as parallel traversal. This combination of architecture and algorithms defines a system that is relatively simple to build and can do fast graph processing. All edges can be traversed in parallel in time O(T), where T is empirically proportional to the average path length in the embedding times the average degree of the graph. Additionally, researchers present an extension to the above method which allows for enhanced performance by allowing some broadcasting capabilities.
Development of Improved Modeling and Analysis Techniques for Dynamics of Shell Structures
1991-07-24
Engineering Sciences and Center for Space Structures and Control, University of Colorado, Campus Box 429, Boulder, Colorado 80309 ...system architecture; third, to implement a decomposition/mapping procedure that matches as far as possible the layout of the processors to the...element computations. In particular, we address issues that are related to the processor memory size, to the SIMD architecture and to the fast
1994-06-01
algorithms for large, irreducibly coupled systems iteratively solve concurrent problems within different subspaces of a Hilbert space, or within different...effective on problems amenable to SIMD solution. Together with researchers at AT&T Bell Labs (Boris Lubachevsky, Albert Greenberg) we have developed...reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented. Greenberg, Albert G., Boris D
Scheins, J J; Vahedipour, K; Pietrzyk, U; Shah, N J
2015-12-21
For high-resolution, iterative 3D PET image reconstruction the efficient implementation of forward-backward projectors is essential to minimise the calculation time. Mathematically, the projectors are summarised as a system response matrix (SRM) whose elements define the contribution of image voxels to lines-of-response (LORs). In fact, the SRM easily comprises billions of non-zero matrix elements to evaluate the tremendous number of LORs as provided by state-of-the-art PET scanners. Hence, the performance of iterative algorithms, e.g. maximum-likelihood-expectation-maximisation (MLEM), suffers from severe computational problems due to the intensive memory access and huge number of floating point operations. Here, symmetries occupy a key role in terms of efficient implementation. They reduce the amount of independent SRM elements, thus allowing for a significant matrix compression according to the number of exploitable symmetries. With our previous work, the PET REconstruction Software TOolkit (PRESTO), very high compression factors (>300) are demonstrated by using specific non-Cartesian voxel patterns involving discrete polar symmetries. In this way, a pre-calculated memory-resident SRM using complex volume-of-intersection calculations can be achieved. However, our original ray-driven implementation suffers from addressing voxels, projection data and SRM elements in disfavoured memory access patterns. As a consequence, a rather limited numerical throughput is observed due to the massive waste of memory bandwidth and inefficient usage of cache respectively. In this work, an advantageous symmetry-driven evaluation of the forward-backward projectors is proposed to overcome these inefficiencies. The polar symmetries applied in PRESTO suggest a novel organisation of image data and LOR projection data in memory to enable an efficient single instruction multiple data vectorisation, i.e. simultaneous use of any SRM element for symmetric LORs. 
In addition, the calculation time is further reduced by using simultaneous multi-threading (SMT). A global speedup factor of 11 without SMT and above 100 with SMT has been achieved for the improved CPU-based implementation while obtaining equivalent numerical results.
CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment
Manavski, Svetlin A; Valle, Giorgio
2008-01-01
Background Searching for similarities in protein and DNA databases has become a routine procedure in Molecular Biology. The Smith-Waterman algorithm has been available for more than 25 years. It is based on a dynamic programming approach that explores all the possible alignments between two sequences; as a result it returns the optimal local alignment. Unfortunately, the computational cost is very high, requiring a number of operations proportional to the product of the length of two sequences. Furthermore, the exponential growth of protein and DNA databases makes the Smith-Waterman algorithm unrealistic for searching similarities in large sets of sequences. For these reasons heuristic approaches such as those implemented in FASTA and BLAST tend to be preferred, allowing faster execution times at the cost of reduced sensitivity. The main motivation of our work is to exploit the huge computational power of commonly available graphic cards, to develop high performance solutions for sequence alignment. Results In this paper we present what we believe is the fastest solution of the exact Smith-Waterman algorithm running on commodity hardware. It is implemented in the recently released CUDA programming environment by NVidia. CUDA allows direct access to the hardware primitives of the last-generation Graphics Processing Units (GPU) G80. Speeds of more than 3.5 GCUPS (Giga Cell Updates Per Second) are achieved on a workstation running two GeForce 8800 GTX. Exhaustive tests have been done to compare our implementation to SSEARCH and BLAST, running on a 3 GHz Intel Pentium IV processor. Our solution was also compared to a recently published GPU implementation and to a Single Instruction Multiple Data (SIMD) solution. These tests show that our implementation performs from 2 to 30 times faster than any other previous attempt available on commodity hardware. 
Conclusions The results show that graphic cards are now sufficiently advanced to be used as efficient hardware accelerators for sequence alignment. Their performance is better than any alternative available on commodity hardware platforms. The solution presented in this paper allows large scale alignments to be performed at low cost, using the exact Smith-Waterman algorithm instead of the largely adopted heuristic approaches. PMID:18387198
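The dynamic programming recurrence behind the Smith-Waterman algorithm described in this abstract can be sketched in a few lines. This is a minimal scalar illustration, not the CUDA implementation the paper reports: it assumes a simple match/mismatch score and a linear gap penalty, and the function name and default scores are hypothetical. The nested loops visit every cell of the score matrix once, which is the cost proportional to the product of the two sequence lengths that the abstract mentions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the optimal local alignment score of sequences a and b.

    Assumed scoring scheme (not from the paper): fixed match/mismatch
    scores and a linear gap penalty. Only two rows of the score matrix
    are kept in memory at any time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

print(smith_waterman_score("ACGT", "ACGT"))
print(smith_waterman_score("AAAA", "TTTT"))
```

Cells on the same anti-diagonal of the matrix do not depend on each other, which is the property that GPU and SIMD implementations generally exploit to update many cells at once.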
NASA Astrophysics Data System (ADS)
Scheins, J. J.; Vahedipour, K.; Pietrzyk, U.; Shah, N. J.
2015-12-01
For high-resolution, iterative 3D PET image reconstruction the efficient implementation of forward-backward projectors is essential to minimise the calculation time. Mathematically, the projectors are summarised as a system response matrix (SRM) whose elements define the contribution of image voxels to lines-of-response (LORs). In fact, the SRM easily comprises billions of non-zero matrix elements to evaluate the tremendous number of LORs as provided by state-of-the-art PET scanners. Hence, the performance of iterative algorithms, e.g. maximum-likelihood-expectation-maximisation (MLEM), suffers from severe computational problems due to the intensive memory access and huge number of floating point operations. Here, symmetries occupy a key role in terms of efficient implementation. They reduce the amount of independent SRM elements, thus allowing for a significant matrix compression according to the number of exploitable symmetries. With our previous work, the PET REconstruction Software TOolkit (PRESTO), very high compression factors (>300) are demonstrated by using specific non-Cartesian voxel patterns involving discrete polar symmetries. In this way, a pre-calculated memory-resident SRM using complex volume-of-intersection calculations can be achieved. However, our original ray-driven implementation suffers from addressing voxels, projection data and SRM elements in disfavoured memory access patterns. As a consequence, a rather limited numerical throughput is observed due to the massive waste of memory bandwidth and inefficient usage of cache respectively. In this work, an advantageous symmetry-driven evaluation of the forward-backward projectors is proposed to overcome these inefficiencies. The polar symmetries applied in PRESTO suggest a novel organisation of image data and LOR projection data in memory to enable an efficient single instruction multiple data vectorisation, i.e. simultaneous use of any SRM element for symmetric LORs. 
In addition, the calculation time is further reduced by using simultaneous multi-threading (SMT). A global speedup factor of 11 without SMT and above 100 with SMT has been achieved for the improved CPU-based implementation while obtaining equivalent numerical results.
Optical Interconnections for VLSI Computational Systems Using Computer-Generated Holography.
NASA Astrophysics Data System (ADS)
Feldman, Michael Robert
Optical interconnects for VLSI computational systems using computer generated holograms are evaluated in theory and experiment. It is shown that by replacing particular electronic connections with free-space optical communication paths, connection of devices on a single chip or wafer and between chips or modules can be improved. Optical and electrical interconnects are compared in terms of power dissipation, communication bandwidth, and connection density. Conditions are determined for which optical interconnects are advantageous. Based on this analysis, it is shown that by applying computer generated holographic optical interconnects to wafer scale fine grain parallel processing systems, dramatic increases in system performance can be expected. Some new interconnection networks, designed to take full advantage of optical interconnect technology, have been developed. Experimental Computer Generated Holograms (CGH's) have been designed, fabricated and subsequently tested in prototype optical interconnected computational systems. Several new CGH encoding methods have been developed to provide efficient high performance CGH's. One CGH was used to decrease the access time of a 1 kilobit CMOS RAM chip. Another was produced to implement the inter-processor communication paths in a shared memory SIMD parallel processor array.
The Effect of Single Gender Instruction on Eighth Grade Students' Mathematics Achievement
ERIC Educational Resources Information Center
Hammel, David Michael
2013-01-01
In the research study, this investigator utilized a non-experimental, causal-comparative design (ex post facto) with archival data to determine the real impact single gender instruction had on eighth grade students' mathematics achievement. The purpose of this study was to quantitatively analyze the benefits of single gender mathematics…
ERIC Educational Resources Information Center
Beaulieu, Barbara; And Others
This unit of instruction on selection and living styles for energy conservation in single-family and multi-family housing and mobile homes was designed for use by home economics teachers in Florida high schools and by home economics extension agents as they work with their clientele. It is one of a series of 11 instructional units (see note)…
Real-time SHVC software decoding with multi-threaded parallel processing
NASA Astrophysics Data System (ADS)
Gudumasu, Srinivas; He, Yuwen; Ye, Yan; He, Yong; Ryu, Eun-Seok; Dong, Jie; Xiu, Xiaoyu
2014-09-01
This paper proposes a parallel decoding framework for scalable HEVC (SHVC). Various optimization technologies are implemented on the basis of SHVC reference software SHM-2.0 to achieve real-time decoding speed for the two layer spatial scalability configuration. SHVC decoder complexity is analyzed with profiling information. The decoding process at each layer and the up-sampling process are designed in parallel and scheduled by a high level application task manager. Within each layer, multi-threaded decoding is applied to accelerate the layer decoding speed. Entropy decoding, reconstruction, and in-loop processing are pipeline designed with multiple threads based on groups of coding tree units (CTU). A group of CTUs is treated as a processing unit in each pipeline stage to achieve a better trade-off between parallelism and synchronization. Motion compensation, inverse quantization, and inverse transform modules are further optimized with SSE4 SIMD instructions. Simulations on a desktop with an Intel i7 processor 2600 running at 3.4 GHz show that the parallel SHVC software decoder is able to decode 1080p spatial 2x at up to 60 fps (frames per second) and 1080p spatial 1.5x at up to 50 fps for those bitstreams generated with SHVC common test conditions in the JCT-VC standardization group. The decoding performance at various bitrates with different optimization technologies and different numbers of threads are compared in terms of decoding speed and resource usage, including processor and memory.
Single instruction computer architecture and its application in image processing
NASA Astrophysics Data System (ADS)
Laplante, Phillip A.
1992-03-01
A single processing computer system using only half-adder circuits is described. In addition, it is shown that only a single hard-wired instruction is needed in the control unit to obtain a complete instruction set for this general purpose computer. Such a system has several advantages. First, it is intrinsically a RISC machine--in fact the 'ultimate RISC' machine. Second, because only a single type of logic element is employed, the entire computer system can be easily realized on a single, highly integrated chip. Finally, due to the homogeneous nature of the computer's logic elements, the computer has possible implementations as an optical or chemical machine. This in turn suggests possible paradigms for neural computing and artificial intelligence. After showing how we can implement a full-adder, min, max and other operations using the half-adder, we use an array of such full-adders to implement the dilation operation for two black and white images. Next we implement the erosion operation of two black and white images using a relative complement function and the properties of erosion and dilation. This approach was inspired by papers by van der Poel, in which a single instruction is used to furnish a complete set of general purpose instructions, and by Böhm-Jacopini, where it is shown that any problem can be solved using a Turing machine with one entry and one exit.
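The step from half-adders to a full-adder mentioned in this abstract can be illustrated with a short sketch. The construction below is the standard textbook one and is not necessarily the circuit used by the author; the function names and the little-endian bit-list representation are assumptions made for the example.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Full-adder built from three half-adders only."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    # c1 and c2 are never both 1 (c1 = 1 forces s1 = 0, hence c2 = 0),
    # so the sum output of a third half-adder behaves as an OR gate here.
    carry_out, _ = half_adder(c1, c2)
    return s2, carry_out

def ripple_add(x_bits, y_bits):
    """Add two equal-length little-endian bit lists with a chain of full-adders."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

# 3 + 5 with least significant bit first.
print(ripple_add([1, 1, 0], [1, 0, 1]))
```

Because every stage is the same element, an array of such adders is homogeneous in the sense the abstract describes.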
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duncan, J.B.
1997-01-07
This document provides specific test procedures and instructions to implement the test plan for the preparation and conduct of a cesium removal test, using Hanford Single Shell Tank Saltcake from tanks 241-BY-110, 241-U-108, 241-U-109, 241-A-101, and 241-S-102, in a bench-scale column. The cesium sorbent to be tested is crystalline silicotitanate. The test plan for which this provides instructions is WHC-SD-RE-TP-024, Hanford Single Shell Tank Saltcake Cesium Removal Test Plan.
Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program
2017-08-01
same function to be used with scalar inputs, input arrays of the same shape, or even input arrays of dimensionality in some cases. Most of the math ... math operations on values ● Split-apply-combine: similar to group-by operations in databases ● Join: combine two datasets using common columns 4.3.3...Numba - Continue to increase SIMD performance with support for fast math flags and improved support for AVX, Intel’s large vector
Parallel Computing: Some Activities in High Energy Physics
NASA Astrophysics Data System (ADS)
Willers, Ian
This paper examines some activities in High Energy Physics that utilise parallel computing. The topic includes all computing from the proposed SIMD front end detectors, the farming applications, high-powered RISC processors and the large machines in the computer centers. We start by looking at the motivation behind using parallelism for general purpose computing. The developments around farming are then described from its simplest form to the more complex system in Fermilab. Finally, there is a list of some developments that are happening close to the experiments.
Protecting Cryptographic Keys and Functions from Malware Attacks
2010-12-01
registers. modifies RSA private key signing in OpenSSL to use the technique. The resulting system has the following features: 1. No special hardware is...the above method based on OpenSSL, by exploiting the Streaming SIMD Extension (SSE) XMM registers of modern Intel and AMD x86-compatible CPUs [22...one can store a 2048-bit exponent.1 Our prototype is based on OpenSSL 0.9.8e, the Ubuntu 6.06 Linux distribution with a 2.6.15 kernel, and SSE2 which
SC'11 Poster: A Highly Efficient MGPT Implementation for LAMMPS; with Strong Scaling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oppelstrup, T; Stukowski, A; Marian, J
2011-12-07
The MGPT potential has been implemented as a drop in package to the general molecular dynamics code LAMMPS. We implement an improved communication scheme that shrinks the communication layer thickness, and increases the load balancing. This results in unprecedented strong scaling, and speedup continuing beyond 1/8 atom/core. In addition, we have optimized the small matrix linear algebra with generic blocking (for all processors) and specific SIMD intrinsics for vectorization on Intel, AMD, and BlueGene CPUs.
ERIC Educational Resources Information Center
Arthur, Ann M.; Davis, Dawn L.
2016-01-01
Double-dose instruction, in which instructional lessons are supplemented to provide additional instructional time, is a mechanism used in some schools for boosting outcomes in certain academic areas. The purpose of this study was to examine the effects of double-dose vocabulary instruction, relative to single-dose and business-as-usual control…
Single-Sex Mathematics Instruction in an Urban Independent School.
ERIC Educational Resources Information Center
Seitsinger, Anne M.; Barboza, Helen C.; Hird, Anne
An urban independent middle school grouped its 63 sixth and seventh graders into single-sex mathematics classes (SSMC) to improve girls' achievement in mathematics (AIM) and attitudes toward mathematics (ATM) with no negative impact on boys. Researchers analyzed AIM, ATM, and interactions/instruction. AIM measures included Metropolitan Achievement…
Missileborne Artificial Vision System (MAVIS)
NASA Technical Reports Server (NTRS)
Andes, David K.; Witham, James C.; Miles, Michael D.
1994-01-01
Several years ago when INTEL and China Lake designed the ETANN chip, analog VLSI appeared to be the only way to do high density neural computing. In the last five years, however, digital parallel processing chips capable of performing neural computation functions have evolved to the point of rough equality with analog chips in system level computational density. The Naval Air Warfare Center, China Lake, has developed a real time, hardware and software system designed to implement and evaluate biologically inspired retinal and cortical models. The hardware is based on the Adaptive Solutions Inc. massively parallel CNAPS system COHO boards. Each COHO board is a standard size 6U VME card featuring 256 fixed point, RISC processors running at 20 MHz in a SIMD configuration. Each COHO board has a companion board built to support a real time VSB interface to an imaging seeker, a NTSC camera, and to other COHO boards. The system is designed to have multiple SIMD machines each performing different corticomorphic functions. The system level software has been developed which allows a high level description of corticomorphic structures to be translated into the native microcode of the CNAPS chips. Corticomorphic structures are those neural structures with a form similar to that of the retina, the lateral geniculate nucleus, or the visual cortex. This real time hardware system is designed to be shrunk into a volume compatible with air launched tactical missiles. Initial versions of the software and hardware have been completed and are in the early stages of integration with a missile seeker.
ERIC Educational Resources Information Center
Ku, Kelly Y. L.; Ho, Irene T.; Hau, Kit-Tai; Lai, Eva C. M.
2014-01-01
Critical thinking is a unifying goal of modern education. While past research has mostly examined the efficacy of a single instructional approach to teaching critical thinking, recent literature has begun discussing mixed teaching approaches. The present study examines three modes of instruction, featuring the direct instruction approach and the…
The Effects of Morpheme and Prosody Instruction on Middle School Spelling
ERIC Educational Resources Information Center
Dornay, Margaret A.
2017-01-01
A single case design was used to investigate the impact of two types of instruction on middle school students' spelling. Phase 1 emphasized morphology awareness instruction (MAI) and phase 2 employed the addition of prosody awareness instruction (PAI). In order to compare the effects of MAI and PAI, spelling scores were gathered from eight…
ERIC Educational Resources Information Center
Mery, Yvonne; Newby, Jill; Peng, Ke
2012-01-01
This study investigates whether the type of instruction (a single face-to-face librarian-led instruction, instructor-led instruction, or an online IL course--the Online Research Lab) has an impact on student information literacy gains in a Freshman English Composition program. A performance-based assessment was carried out by analyzing…
Numerical study of the vortex tube reconnection using vortex particle method on many graphics cards
NASA Astrophysics Data System (ADS)
Kudela, Henryk; Kosior, Andrzej
2014-08-01
Vortex Particle Methods are one of the most convenient ways of tracking the evolution of vorticity. In this article we present a numerical recreation of a real-life experiment concerning the head-on collision of two vortex rings. In the experiment, the evolution and reconnection of the vortex structures is tracked with passive markers (paint particles), which in a viscous fluid do not follow the evolution of the vorticity field. In the numerical computations we show the difference between the vorticity evolution and the movement of passive markers. The agreement with the experiment was very good. Because computations on a single processor take a very long time, the Vortex-in-Cell (VIC) method was implemented on the multicore architecture of graphics cards (GPUs). Vortex Particle Methods are very well suited for parallel computations: there are myriads of particles in the flow, and the same equations of motion have to be solved for each of them, so the SIMD architecture used in GPUs seems to be a perfect fit. The main disadvantage in this case is the small amount of RAM. To overcome this problem we created a multi-GPU implementation of the VIC method. Some remarks on parallel computing are given in the article.
A configurable and low-power mixed signal SoC for portable ECG monitoring applications.
Kim, Hyejung; Kim, Sunyoung; Van Helleputte, Nick; Artes, Antonio; Konijnenburg, Mario; Huisken, Jos; Van Hoof, Chris; Yazicioglu, Refet Firat
2014-04-01
This paper describes a mixed-signal ECG System-on-Chip (SoC) that is capable of implementing configurable functionality with low power consumption for portable ECG monitoring applications. A low-voltage and high performance analog front-end extracts 3-channel ECG signals and a single channel electrode-tissue-impedance (ETI) measurement with high signal quality. This can be used to evaluate the quality of the ECG measurement and to filter motion artifacts. A custom digital signal processor consisting of a 4-way SIMD processor provides the configurability and advanced functionality like motion artifact removal and R peak detection. A built-in 12-bit analog-to-digital converter (ADC) is capable of adaptive sampling achieving a compression ratio of up to 7, and loop buffer integration reduces the power consumption for on-chip memory access. The SoC is implemented in a 0.18 μm CMOS process and consumes 32 μW from a 1.2 V supply while a heart beat detection application is running, and it is integrated in a wireless ECG monitoring system with Bluetooth protocol. Thanks to the ECG SoC, the overall system power consumption can be reduced significantly.
Access to chlamydia testing in remote and rural Scotland.
Hawkins, Katherine E; Thompson, Lucy; Wilson, Philip
2016-01-01
The aim of this study was to assess access to sexual health care in remote and rural settings using Chlamydia testing as a focus by measuring the extent of Chlamydia testing and positivity across the Scottish Highlands in relation to the Scottish Index of Multiple Deprivation Quintile (SIMD) and Urban Rural 8-fold index (UR8). Tests processed through Raigmore Hospital in Inverness, the main testing laboratory for microbiology tests in North and West and South and Mid Highlands, were studied. Where people are tested in relation to where they live was assessed, as well as the type of test they opt for. Also assessed was the rate of positivity in male and female patients in rural compared with urban settings using the Scottish Government UR8 and in relation to the SIMD. 9644 results were analysed. 77.2% of the results were for females and 22.4% for males. 8.1% of the results were positive and 84.4% were negative. There were proportionately more positive tests from the sexual health sources than from general practice. The proportion of men who had positive tests was almost double that for women (12.7% vs 6.6%) although men made up only 27.9% of the total number of tests. There was no significant difference in positivity when compared with UR8 index or SIMD. 37.7% of people living in the most rural areas (UR8 7-8) had their test performed in a more urban setting (UR8 1-6), and 20.4% people had their test performed in a very urban setting (UR8 1-2). Of these tests, there was a tendency for UR8 7-8 patients to be more likely to have a positive test if tested in an urban setting. These results are similar to previous results in other countries that suggest that Chlamydia positivity is similar in rural and urban settings. A large proportion of people living in more rurally classified areas, and perhaps those with a higher risk, have their test in a central setting, suggesting that they may be bypassing local resources to get a test. The reason for this is not clear. 
The results also show that men are more likely to have their test in a genitourinary setting as well as have proportionately more positive results. These results support the case for customising sexual health services to the most rural areas and suggest that providing an anonymous testing service in these areas might be beneficial, especially for men.
ERIC Educational Resources Information Center
Knight, G. William; And Others
1994-01-01
The first step in engineering the instruction of dental psychomotor skills, task analysis, is explained. A chart details the procedural, cognitive, desired-criteria, and desired-performance analysis of a single task, occlusal preparation for amalgam restoration with carious lesion. (MSE)
Repetition priming effects from attended vs. ignored single words in a semantic categorization task.
Ortells, Juan J; Fox, Elaine; Noguera, Carmen; Abad, María J F
2003-10-01
The present research examines priming effects from a centrally presented single-prime word to which participants were instructed to either attend or ignore. The prime word was followed by a single central target word to which participants made a semantic categorization (animate vs. inanimate) task. The main variables manipulated across experiments were attentional instructions (attend vs. ignore the prime word), presentation duration of the prime word (20, 50, 80 or 100 ms), prime-target stimulus onset asynchrony (SOA; 300 vs. 800 ms), and temporal presentation of instructions (before vs. after the prime word). The results showed (a) a consistent interaction between attentional instructions and repetition priming and (b) a qualitatively different ignored priming pattern as a function of prime duration: reduced positive priming (relative to the attend instruction) for prime exposures of 80 and 100 ms, and reliable negative priming for the shorter prime exposures of 20 and 50 ms. In addition (c), the differential priming pattern for attend and ignore trials was observed at a prime-target SOA of 800 ms (but not at a shorter 300-ms SOA) and only when instructions were presented before the prime word. Methodological and theoretical implications of the present findings for the extant negative priming literature are discussed.
Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications
NASA Astrophysics Data System (ADS)
Francés, J.; Otero, B.; Bleda, S.; Gallego, S.; Neipp, C.; Márquez, A.; Beléndez, A.
2015-06-01
The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of vibroacoustic problems and to study the propagation of longitudinal and transversal waves in stratified media. The potential of the scheme and the relevance of each acceleration strategy for massive computations in FDTD are demonstrated in this work. In this paper, we propose two new specific implementations of the two-dimensional scheme of the FDTD method using multi-CPU and multi-GPU, respectively. In the first implementation, an open source message passing interface (OMPI) has been included in order to fully exploit the resources of a biprocessor station with two Intel Xeon processors. Moreover, regarding the CPU code version, the Streaming SIMD Extensions (SSE) and the Advanced Vector Extensions (AVX) have been included, together with shared memory approaches that take advantage of multi-core platforms. The second implementation, called the multi-GPU code version, is based on Peer-to-Peer communications available in CUDA on two GPUs (NVIDIA GTX 670). Subsequently, this paper presents an accurate analysis of the influence of the different code versions, including shared memory approaches, vector instructions and multi-processors (both CPU and GPU), and compares them in order to delimit the degree of improvement obtained from distributed solutions based on multi-CPU and multi-GPU. The performance of both approaches was analysed, and it has been demonstrated that adding shared memory schemes to CPU computing substantially improves the performance of vector instructions, enlarging the range of simulation sizes that use the CPU cache memory efficiently. In this case GPU computing is roughly twice as fast as the fine-tuned CPU version, in both the one-node and two-node cases.
However, for massive computations explicit vector instructions are not worthwhile, since memory bandwidth is the limiting factor and the performance tends to be the same as that of the sequential version with auto-vectorisation and of the shared memory approach. In this scenario GPU computing is the best option since it provides homogeneous behaviour. More specifically, the speedup of GPU computing reaches an upper limit of 12 for both one and two GPUs, whereas the performance reaches peak values of 80 GFlops and 146 GFlops for one GPU and two GPUs, respectively. Finally, the method is applied to an earth crust profile in order to demonstrate the potential of our approach and the necessity of applying acceleration strategies in this type of application.
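To make the kind of update loop discussed in this abstract concrete, the following is a minimal sketch of one leapfrog step of a 1D staggered-grid acoustic FDTD scheme. It is not the authors' two-dimensional vibroacoustic code: the function name `fdtd_step`, the parameter names, and the fixed-pressure boundary are all assumptions chosen for illustration.

```python
def fdtd_step(p, v, rho, bulk_modulus, dx, dt):
    """Advance a 1D staggered-grid acoustic FDTD scheme by one leapfrog step.

    p holds pressure at integer grid points; v holds particle velocity at the
    half points between them, so len(v) == len(p) - 1. Both end pressures are
    held fixed (an assumed, deliberately simple boundary condition).
    """
    coef_v = dt / (rho * dx)
    coef_p = bulk_modulus * dt / dx

    # Velocity update: driven by the pressure difference across each half cell.
    new_v = [v[i] - coef_v * (p[i + 1] - p[i]) for i in range(len(v))]

    # Pressure update: driven by the divergence of the freshly updated velocity.
    new_p = list(p)
    for i in range(1, len(p) - 1):
        new_p[i] = p[i] - coef_p * (new_v[i] - new_v[i - 1])
    return new_p, new_v
```

Every grid point is updated with the same few arithmetic operations on neighbouring values, which is why such loops vectorise easily but, as the abstract notes, tend to become limited by memory bandwidth once the grid no longer fits in cache.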
Nonlinear Wave Simulation on the Xeon Phi Knights Landing Processor
NASA Astrophysics Data System (ADS)
Hristov, Ivan; Goranov, Goran; Hristova, Radoslava
2018-02-01
We consider a standing wave simulation that is interesting from a computational point of view, obtained by solving coupled 2D perturbed Sine-Gordon equations. We make an OpenMP realization which explores both the thread and SIMD levels of parallelism. We test the OpenMP program on two different energy-equivalent Intel architectures: 2× Xeon E5-2695 v2 processors (code-named "Ivy Bridge-EP") in the Hybrilit cluster, and a Xeon Phi 7250 processor (code-named "Knights Landing", KNL). The results show 2 times better performance on the KNL processor.
NASA Astrophysics Data System (ADS)
Barnaś, Dawid; Bieniasz, Lesław K.
2017-07-01
We have recently developed a vectorized Thomas solver for quasi-block tridiagonal linear algebraic equation systems using Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) in operations on dense blocks [D. Barnaś and L. K. Bieniasz, Int. J. Comput. Meth., accepted]. The acceleration caused by vectorization was observed for large block sizes, but was less satisfactory for small blocks. In this communication we report on another version of the solver, optimized for small blocks of size up to four rows and/or columns.
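For readers unfamiliar with the underlying method, here is a minimal scalar sketch of the Thomas algorithm for an ordinary tridiagonal system. It is not the vectorized quasi-block solver the abstract describes: the function name `thomas_solve` and the array layout are assumptions, the blocks are reduced to single numbers, and no pivoting is performed.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the (scalar) Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is done, so the matrix
    is assumed to be well conditioned (for example diagonally dominant).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward elimination: remove the sub-diagonal row by row.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution: recover the unknowns from the last row upwards.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

In a block variant, each scalar division becomes a solve with a small dense block and each product becomes a small matrix product; those dense-block operations are the natural place for SSE/AVX vectorization, which is consistent with the abstract's remark that small blocks benefit less.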
2014-09-30
portability is difficult to achieve on future supercomputers that use various types of accelerators (GPUs, Xeon Phi, SIMD, etc.). All of these...bottlenecks of NUMA. For example, in the CG code the state vector was originally stored as q(1:Nvar, 1:Npoin) where Nvar are the number of...a Global Grid Point (GGP) storage. On the other hand, in the DG code the state vector is typically stored as q(1:Nvar, 1:Npts, 1:Nelem) where
Meta-Analysis of Single-Case Research Design Studies on Instructional Pacing.
Tincani, Matt; De Mers, Marilyn
2016-11-01
More than four decades of research on instructional pacing has yielded varying and, in some cases, conflicting findings. The purpose of this meta-analysis was to synthesize single-case research design (SCRD) studies on instructional pacing to determine the relative benefits of brisker or slower pacing. Participants were children and youth with and without disabilities in educational settings, excluding higher education. Tau-U, a non-parametric statistic for analyzing data in SCRD studies, was used to determine effect size estimates. The article extraction yielded 13 instructional pacing studies meeting contemporary standards for high quality SCRD research. Eleven of the 13 studies reported small to large magnitude effects when two or more pacing parameters were compared, suggesting that instructional pacing is a robust instructional variable. Brisker instructional pacing with brief inter-trial interval (ITI) produced small increases in correct responding and medium to large reductions in challenging behavior compared with extended ITI. Slower instructional pacing with extended wait-time produced small increases in correct responding, but also produced small increases in challenging behavior compared with brief wait-time. Neither brief ITI nor extended wait-time meets recently established thresholds for evidence-based practice, highlighting the need for further instructional pacing research. © The Author(s) 2016.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shipman, Galen M.
These are the slides for a presentation on programming models in HPC, at the Los Alamos National Laboratory's Parallel Computing Summer School. The following topics are covered: Flynn's Taxonomy of computer architectures; single instruction single data; single instruction multiple data; multiple instruction multiple data; address space organization; definition of Trinity (Intel Xeon-Phi is a MIMD architecture); single program multiple data; multiple program multiple data; ExMatEx workflow overview; definition of a programming model, programming languages, runtime systems; programming model and environments; MPI (Message Passing Interface); OpenMP; Kokkos (Performance Portable Thread-Parallel Programming Model); Kokkos abstractions, patterns, policies, and spaces; RAJA, a systematic approach to node-level portability and tuning; overview of the Legion Programming Model; mapping tasks and data to hardware resources; interoperability: supporting task-level models; Legion S3D execution and performance details; workflow, integration of external resources into the programming model.
Adult Learning Theories: Implications for Online Instruction
ERIC Educational Resources Information Center
Arghode, Vishal; Brieger, Earl W.; McLean, Gary N.
2017-01-01
Purpose: This paper analyzes critically four selected learning theories and their role in online instruction for adults. Design/methodology/approach: A literature review was conducted to analyze the theories. Findings: The theory comparison revealed that no single theory encompasses the entirety of online instruction for adult learning; each…
NASA Astrophysics Data System (ADS)
Dorfner, Tobias; Förtsch, Christian; Boone, William; Neuhaus, Birgit J.
2017-09-01
A number of studies on single instructional quality features have been reported for mathematics and science instruction. For summarizing single instructional quality features, researchers have created a model of three basic dimensions (classroom management, supportive climate, and cognitive activation) of instructional quality mainly through observing mathematics instruction. Considering this model as valid for all subjects and as usable for describing instruction, we used it in this study which aimed to analyze characteristics of instructional quality in biology lessons of high-achieving and low-achieving classes, independently of content. Therefore, we used the data of three different previous video studies of biology instruction conducted in Germany. From each video study, we selected three high-achieving and three low-achieving classes (N = 18 teachers; 35 videos) for our multiple-case study, in which conspicuous characteristics of instructional quality features were qualitatively identified and qualitatively analyzed. The amount of these characteristics was counted in a quantitative way in all the videos. The characteristics we found could be categorized using the model of three basic dimensions of instructional quality despite some subject-specific differences for biology instruction. Our results revealed that many more characteristics were observable in high-achieving classes than in low-achieving classes. Thus, we believe that this model could be used to describe biology instruction independently of the content. We also make the claims about the qualities for biology instruction—working with concentration in a content-structured environment, getting challenged in higher order thinking, and getting praised for performance—that could have positive influence on students' achievement.
What Works Clearinghouse Study Review Guide Instructions for Reviewing Single-Case Designs Studies
ERIC Educational Resources Information Center
What Works Clearinghouse, 2016
2016-01-01
This document provides step-by-step instructions on how to complete the Study Review Guide (SRG, Version S3, V2) for single-case designs (SCDs). Reviewers will complete an SRG for every What Works Clearinghouse (WWC) review. A completed SRG should be a reviewer's independent assessment of the study, relative to the criteria specified in the review…
A Revised Embedded Planning Tool for Intensive Reading Instruction
ERIC Educational Resources Information Center
Wei, Yan; Lombardi, Allison; Simonsen, Brandi; Coyne, Michael; Faggella-Luby, Michael; Freeman, Jennifer; Kearns, Devin
2017-01-01
A single-subject AB multiple-baseline design across participants was utilized to investigate the effectiveness of the Revised Tier Three Instructional Planning (T-TIP) tool on teacher lesson planning, with a focus on corrective and elaborative feedback within intensive literacy instructional settings in secondary schools. Findings revealed that…
Making the Most of Instructional Coaches
ERIC Educational Resources Information Center
Kane, Britnie Delinger; Rosenquist, Brooks
2018-01-01
Although coaching holds great promise for professional development, instructional coaches are often asked to take on responsibilities that are not focused on improving instruction. The authors discuss a quantitative study of four school districts and a qualitative analysis of a single district that, together, reveal how hiring practices and school…
NASA Astrophysics Data System (ADS)
Elam, Jeanette H.
The purpose of this study was to compare the academic performance of students enrolled in coeducational instruction and single-gender instruction. Within this framework, the researcher examined class type, gender, and racial/ethnicity using the sixth grade CRCT scores of selected students in the areas of mathematics and science. The fifth-grade mathematics and science scores for the same population were used to control for prior knowledge. This study examined the academic achievement of students based on class type, gender, and racial/ethnicity in relation to academic achievement. The study included the CRCT scores for mathematics and science of 6th-grade students at the middle school level who were tested during the 2007--2008 school year. Many studies conducted in the past have stressed females performed better in mathematics and science, while others have stated males performed better in the same areas. Yet, other studies have found conflicting results. A large Australian study (1996), compared the academic performance of students at single-gender and coeducational schools. The conclusion of this study indicated that both males and females who were educated in single-gender classrooms scored significantly higher than did males and females in coeducational classes. A study conducted by Graham Able (2003) documented superior academic performance of students in single-gender schools, after controlling for socioeconomic class and other variables. Able's most significant finding was that the advantage of single-gender schooling was greater for males in terms of academic results than for females. This directly contradicted the educational myth that males performed better in classrooms if females were present. The sample in this study consisted of CRCT scores for 304 sixth-grade students from four different middle schools. Due to the racial composition of the sample, the study only focused on black and white students. 
School 1 and School 2 involved single-gender instruction while School 3 and School 4 involved coeducational instruction. A sample of eighty students was taken from each of the middle schools with single-gender instruction and a sample of 72 students was taken from each of the middle schools with coeducational instruction. Prior to conducting the study, an extensive application was filed with the local board of education to request permission to conduct research in the county. This process involved a detailed description of the sample, sampling procedures, sample size, staff members, grade levels, and background information for the study. The major findings in this study indicated that the coeducational students outperformed the single-gender students and the white students outperformed the black students. This study confirmed that white coeducational students performed significantly higher than the black coeducational students. It was also documented through this study that there was no significant difference between the performance of the single-gender black students and the single-gender white students. In contrast to the Australian study (1996), this study indicated that the coeducational students were outperforming the single-gender students. In comparison to the 2003 study by Able, the findings of this study showed single-gender instruction was greater for females in terms of higher academic achievement than for males. INDEX WORDS. Coeducational, Single-gender, Middle school students
Telishevesky, Yoel S; Levin, Liran; Ashkenazi, Malka
2012-01-01
The purpose of this study was to evaluate the effect of toothbrush design on the ability of parents to effectively brush their children's teeth. Parents of children (mean age=5.1±0.75 years old) from 4 kindergarten schools were randomly assigned to receive instruction in brushing their children's teeth using a manual single-headed toothbrush (2 schools) or a triple-headed toothbrush (2 schools). The parents' ability to brush their children's teeth was evaluated according to a novel toothbrush performing skill index (Ashkenazi Index), based on 2 criteria: (1) placement of the toothbrush on each tooth segment to be brushed ("reach"); and (2) completion of enough strokes on each segment ("stay"). One month after instruction, tooth-brushing ability was re-evaluated and plaque index of the children's teeth was assessed. One month after instruction, parents using the triple-headed toothbrush received significantly higher scores on the tooth-brushing performance index (~86%), than did those in the single-headed group (~61%; P=.001). The plaque index was significantly higher in the single-headed group (0.97±0.38) vs the triple-headed group (0.72±0.29; P<.01). The tooth-brushing performance index correlated negatively with the plaque index (P<.01). A triple-headed toothbrush promotes more consistent tooth-brushing by parents than does a single-headed toothbrush.
The Impact of Mode of Instructional Delivery on Second Language Teacher Self-Efficacy
ERIC Educational Resources Information Center
Kissau, Scott; Algozzine, Bob
2015-01-01
Research has called into question the suitability of fully-online instruction for certain teacher preparation courses. Methodology coursework, in particular, has been singled out in research as ill-suited to online instruction. Recent research, for example, involving second language (L2) teacher candidates has demonstrated that aspiring teachers…
Instructional Computing. An Action Guide for Educators.
ERIC Educational Resources Information Center
Dennis, J. Richard; Kansky, Robert J.
This book is directed to any educator who is interested in the use of the computer to improve classroom instruction. It is a book about the materials, human factors, and decision-making procedures that make up the instructional application of computers. This document's single goal is to promote educators' thoughtful selection and use of both…
Schlosser, Ralf W; Belfiore, Phillip J; Sigafoos, Jeff; Briesch, Amy M; Wendt, Oliver
2018-05-28
Evidence-based practice as a process requires the appraisal of research as a critical step. In the field of developmental disabilities, single-case experimental designs (SCEDs) figure prominently as a means for evaluating the effectiveness of non-reversible instructional interventions. Comparative SCEDs contrast two or more instructional interventions to document their relative effectiveness and efficiency. As such, these designs have great potential to inform evidence-based decision-making. To harness this potential, however, interventionists and authors of systematic reviews need tools to appraise the evidence generated by these designs. Our literature review revealed that existing tools do not adequately address the specific methodological considerations of comparative SCEDs that aim to compare instructional interventions of non-reversible target behaviors. The purpose of this paper is to introduce the Comparative Single-Case Experimental Design Rating System (CSCEDARS, "cedars") as a tool for appraising the internal validity of comparative SCEDs of two or more non-reversible instructional interventions. Pertinent literature will be reviewed to establish the need for this tool and to underpin the rationales for individual rating items. Initial reliability information will be provided as well. Finally, directions for instrument validation will be proposed. Copyright © 2018 Elsevier Ltd. All rights reserved.
Semantic priming effects from single words in a lexical decision task.
Noguera, Carmen; Ortells, Juan J; Abad, María J F; Carmona, Encarnación; Daza, M Teresa
2007-06-01
The present research examines the semantic priming effects of a centrally presented single prime word to which participants were instructed to either "attend and remember" or "ignore". The prime word was followed by a central probe target on which the participants made a lexical decision task. The main variables manipulated across experiments were prime duration (50 or 100 ms), the presence or absence of a mask following the prime, and the presence (or absence) and type of distractor stimulus (random set of consonants or pseudowords) on the probe display. There was a consistent interaction between the instructions and the semantic priming effects. Relative to the "attend and remember" instruction, an "ignore" instruction produced reduced positive priming from single primes presented for 100 ms, irrespective of the presence or absence of a prime mask, and regardless of whether the probe target was presented with or without distractors. Additionally, reliable negative priming was found from ignored primes presented for briefer durations (50 ms) and immediately followed by a mask. Methodological and theoretical implications of the present findings for the extant negative priming literature are discussed.
ERIC Educational Resources Information Center
Wosnitza, Marold; Volet, Simone
2014-01-01
This paper examines how distinct trajectories of change in students' general views of group work over the duration of one single group assignment could be explained by multidimensional aspects of their experience and the overall instructional context. Science (336) and Education (377) students involved in a semester-long group assignment…
Improving Quantum Gate Simulation using a GPU
NASA Astrophysics Data System (ADS)
Gutierrez, Eladio; Romero, Sergio; Trenas, Maria A.; Zapata, Emilio L.
2008-11-01
Due to the increasing computing power of graphics processing units (GPUs), they are becoming more and more popular for running general-purpose algorithms. As the simulation of quantum computers results in a problem with exponential complexity, it is advisable to perform a parallel computation, such as the one provided by the SIMD multiprocessors present in recent GPUs. In this paper, we focus on an important quantum algorithm, the quantum Fourier transform (QFT), in order to evaluate different parallelization strategies on a novel GPU architecture. Our implementation makes use of the new CUDA software/hardware architecture developed recently by NVIDIA.
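As an illustration of why gate-level simulation maps well onto SIMD hardware, the sketch below simulates the quantum Fourier transform on a plain state vector. It is a sequential, pure-Python sketch, not the authors' CUDA implementation; the function names, the little-endian qubit numbering (qubit 0 is the least significant bit of the amplitude index), and the sign convention of the transform are assumptions.

```python
import cmath
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a state vector (list of complex)."""
    stride = 1 << target
    out = list(state)
    for base in range(len(state)):
        if base & stride:
            continue  # each amplitude pair is handled once, from its bit-0 member
        a0, a1 = state[base], state[base | stride]
        out[base] = gate[0][0] * a0 + gate[0][1] * a1
        out[base | stride] = gate[1][0] * a0 + gate[1][1] * a1
    return out


def apply_controlled_phase(state, control, target, angle):
    """Multiply by exp(i*angle) every amplitude whose control and target bits are both 1."""
    phase = cmath.exp(1j * angle)
    cmask, tmask = 1 << control, 1 << target
    return [amp * phase if (i & cmask) and (i & tmask) else amp
            for i, amp in enumerate(state)]


def reverse_bits(index, num_qubits):
    """Return `index` with its lowest `num_qubits` bits in reversed order."""
    result = 0
    for _ in range(num_qubits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def qft(state, num_qubits):
    """Textbook QFT circuit: Hadamards, controlled phases, then a bit reversal."""
    s = 1.0 / math.sqrt(2.0)
    hadamard = [[s, s], [s, -s]]
    for q in range(num_qubits - 1, -1, -1):
        state = apply_single_qubit_gate(state, hadamard, q)
        for k in range(q - 1, -1, -1):
            state = apply_controlled_phase(state, k, q, math.pi / (1 << (q - k)))
    out = [0j] * len(state)
    for i, amp in enumerate(state):
        out[reverse_bits(i, num_qubits)] = amp
    return out
```

Each gate touches every amplitude (or pair of amplitudes) with the same arithmetic and no dependence between pairs, so the inner loops are the part a GPU would distribute across its SIMD lanes. The state vector has 2^n entries for n qubits, which is the exponential cost the abstract refers to.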
ERIC Educational Resources Information Center
Kelly, J. Terence; Anandam, Kamala
Miami-Dade Community College's Response System with Variable Prescriptions (RSVP) is an example of faculty-computer partnership directed toward individualizing instruction while managing up to 5,000 students in a single course, regardless of class format. Individualization of instruction is accomplished by RSVP by virtue of its potential for three…
ERIC Educational Resources Information Center
Peltier, Corey; Vannest, Kimberly J.
2018-01-01
The current study examines the effects of schema instruction on the problem-solving performance of four second-grade students with emotional and behavioral disorders. The existence of a functional relationship between the schema instruction intervention and problem-solving accuracy in mathematics is examined through a single case experiment using…
ERIC Educational Resources Information Center
Kolarcik, Tiffany Nicole
2013-01-01
This study explored how elementary educators implement iPad devices as instructional tools to enhance their language arts instruction. The study used a phenomenological qualitative design with a single-subject case study design coupled with an embedded rubric component. The researcher conducted in-depth, semi-structured interviews, classroom…
NASA Astrophysics Data System (ADS)
Parker, Lesley H.; Rennie, Léonie J.
2002-09-01
Debate continues over the benefits, or otherwise, of single-sex classes in science and mathematics, particularly for the performance of girls. Previous research and analyses of the circumstances surrounding the implementation of single-sex classes warn that the success of the strategy requires due consideration of the nature of the instructional environment for both boys and girls, together with appropriate support for the teachers involved. This article reports the circumstances under which teachers were able to implement gender-inclusive strategies in single-sex science classes in coeducational high schools and documents some of the difficulties faced. The study was part of the Single-Sex Education Pilot Project (SSEPP) in ten high schools in rural and urban Western Australia. Qualitative and quantitative data were gathered during the project from teachers, students and classroom observations. Overall, it was apparent that single-sex grouping created environments in which teachers could implement gender-inclusive science instructional strategies more readily and effectively than in mixed-sex settings. Teachers were able to address some of the apparent shortcomings of the students' previous education (specifically, the poor written and oral communication of boys and the limited experience of girls with 'hands-on' activities and open-ended problem solving). Further, in same-sex classrooms, sexual harassment which inhibited girls' learning was eliminated. The extent to which teachers were successful in implementing gender-inclusive instructional strategies, however, depended upon their prior commitment to the SSEPP as a whole, and upon the support or obstacles encountered from a variety of sources, including parents, the community, students, and non-SSEPP teachers.
Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A
2012-01-01
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
Ayres, Daniel L.; Darling, Aaron; Zwickl, Derrick J.; Beerli, Peter; Holder, Mark T.; Lewis, Paul O.; Huelsenbeck, John P.; Ronquist, Fredrik; Swofford, David L.; Cummings, Michael P.; Rambaut, Andrew; Suchard, Marc A.
2012-01-01
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software. PMID:21963610
Refocusing from a plenoptic camera within seconds on a mobile phone
NASA Astrophysics Data System (ADS)
Gómez-Cárdenes, Óscar; Marichal-Hernández, José G.; Rosa, Fernando L.; Lüke, Jonas P.; Fernández-Valdivia, Juan José; Rodríguez-Ramos, José M.
2014-05-01
Digital refocusing of a plenoptic image after exposure has been thoroughly studied in recent years, but few efforts have been made toward real-time implementation in a constrained environment such as that provided by current mobile phones and tablets. In this work we address this challenge, demonstrating that a complete focal stack, comprising 31 refocused planes from a (256×16)² plenoptic image, can be computed within seconds on a current SoC mobile phone platform. The choice of an appropriate algorithm is the key to success. In a previous work we developed an algorithm, the fast approximate 4D:3D discrete Radon transform, that performs this task with linear time complexity where others obtain quadratic or linearithmic time complexity. Moreover, that algorithm requires neither complex number transforms, trigonometric calculations, multiplications, nor floating-point numbers. Our algorithm has been ported to a multi-core ARM chip on an off-the-shelf tablet running Android. A careful implementation exploiting parallelism at several levels has been necessary. The final implementation takes advantage of multi-threading in native code and NEON SIMD instructions. As a result, our current implementation completes the refocusing task within seconds for a 16-megapixel image, much faster than previous attempts running on powerful PC platforms or dedicated hardware. The times consumed by the different stages of the digital refocusing are given and the strategies to achieve this result are discussed. Timing results are given for a variety of environments within the Android ecosystem, from the weaker/cheaper SoCs to the top of the line for 2013.
ERIC Educational Resources Information Center
Elam, Jeanette H.
2009-01-01
The purpose of this study was to compare the academic performance of students enrolled in coeducational instruction and single-gender instruction. Within this framework, the researcher examined class type, gender, and racial/ethnicity using the sixth grade CRCT scores of selected students in the areas of mathematics and science. The fifth-grade…
ERIC Educational Resources Information Center
Lancioni, Giulio E.; Singh, Nirbhay N.; O'Reilly, Mark F.; Sigafoos, Jeff; Oliva, Doretta; Smaldone, Angela; La Martire, Maria L.; Alberti, Gloria; Scigliuzzo, Francesca
2011-01-01
In a recent single-case study, we showed that a new verbal-instruction system, ensuring the automatic presentation of step instructions, was beneficial for promoting the task performance of a woman with multiple disabilities (including blindness). The present study was aimed at replicating and extending the aforementioned investigation with three…
ERIC Educational Resources Information Center
Vincent, Susan, Ed.
In multigrade instruction, children of at least a 2-year grade span and diverse ability levels are grouped in a single classroom and share experiences involving intellectual, academic, and social skills. "The Multigrade Classroom" is a seven-book series that provides an overview of current research on multigrade instruction, identifies…
Soft Toys as Instructional Technology in Higher Education: The Case of Llewelyn the Lynx
ERIC Educational Resources Information Center
Raye, Lee
2017-01-01
Scholarship on instructive technologies in higher education has emphasized the use of high-tech facilitative technologies for long-term use, and low-tech props to illustrate single topics. This paper, on the contrary, discusses the use of a long-term, low-tech instructional technology: Llewelyn the Lynx was a soft animal used to assist with…
ERIC Educational Resources Information Center
Reed, Deborah K.
2013-01-01
This study sought to determine the effects of explicit phonics instruction and sight word instruction on the letter-sound identification and word reading of 13- to 15-year-old English language learners in the eighth grade who were identified as having intellectual disabilities (ID). Using a randomized single-subject design, four Hispanic students…
Instructable autonomous agents. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Huffman, Scott Bradley
1994-01-01
In contrast to current intelligent systems, which must be laboriously programmed for each task they are meant to perform, instructable agents can be taught new tasks and associated knowledge. This thesis presents a general theory of learning from tutorial instruction and its use to produce an instructable agent. Tutorial instruction is a particularly powerful form of instruction, because it allows the instructor to communicate whatever kind of knowledge a student needs at whatever point it is needed. To exploit this broad flexibility, however, a tutorable agent must support a full range of interaction with its instructor to learn a full range of knowledge. Thus, unlike most machine learning tasks, which target deep learning of a single kind of knowledge from a single kind of input, tutorability requires a breadth of learning from a broad range of instructional interactions. The theory of learning from tutorial instruction presented here has two parts. First, a computational model of an intelligent agent, the problem space computational model, indicates the types of knowledge that determine an agent's performance, and thus, that should be acquirable via instruction. Second, a learning technique, called situated explanation, specifies how the agent learns general knowledge from instruction. The theory is embodied by an implemented agent, Instructo-Soar, built within the Soar architecture. Instructo-Soar is able to learn hierarchies of completely new tasks, to extend task knowledge to apply in new situations, and in fact to acquire every type of knowledge it uses during task performance - control knowledge, knowledge of operators' effects, state inferences, etc. - from interactive natural language instructions. This variety of learning occurs by applying the situated explanation technique to a variety of instructional interactions involving a variety of types of instructions (commands, statements, conditionals, etc.).
By taking seriously the requirements of flexible tutorial instruction, Instructo-Soar demonstrates a breadth of interaction and learning capabilities that goes beyond previous instructable systems, such as learning apprentice systems. Instructo-Soar's techniques could form the basis for future 'instructable technologies' that come equipped with basic capabilities, and can be taught by novice users to perform any number of desired tasks.
2014-01-01
Background Because of relatively small treatment numbers together with low adverse drug reaction (ADR) reporting rates the timely identification of ADRs affecting children and young people is problematic. The primary objective of this study was to assess the utility of unplanned medication discontinuation as a signal for possible ADRs in children and young people. Methods Using orlistat as an exemplar, all orlistat prescriptions issued to patients up to 18 years of age together with patient characteristics, prescription duration, co-prescribed medicines and recorded clinical (Read) codes were identified from the Primary Care Informatics Unit database between 1st Jan 2006-30th Nov 2009. Binary logistic regression was used to assess association between characteristics and discontinuation. Results During the study period, 79 patients were prescribed orlistat (81% female, median age 17 years). Unplanned medication discontinuation rates for orlistat were 52% and 77% at 1 and 3-months. Almost 20% of patients were co-prescribed an anti-depressant. One month unplanned medication discontinuation was significantly lower in the least deprived group (SIMD 1–2 compared to SIMD 9–10 OR 0.09 (95% CI0.01 – 0.83)) and those co-prescribed at least one other medication. At 3 months, discontinuation was higher in young people (≥17 yr versus, OR 3.07 (95% CI1.03 – 9.14)). Read codes were recorded for digestive, respiratory and urinary symptoms around the time of discontinuation for 24% of patients. Urinary retention was reported for 7.6% of patients. Conclusions Identification of unplanned medication discontinuation using large primary care datasets may be a useful tool for pharmacovigilance signal generation and detection of potential ADRs in children and young people. PMID:24594374
ERIC Educational Resources Information Center
Ulke-Kurkcuoglu, Burcu; Bozkurt, Funda; Cuhadar, Selmin
2015-01-01
This study aims to investigate the effectiveness of the instruction process provided through computer-assisted activity schedules in the instruction of on-schedule and role-play skills to children with autism spectrum disorder. Herein, a multiple probe design with probe conditions across participants among single subject designs was used. Four…
Model-based vision using geometric hashing
NASA Astrophysics Data System (ADS)
Akerman, Alexander, III; Patton, Ronald
1991-04-01
The Geometric Hashing technique developed by the NYU Courant Institute has been applied to various automatic target recognition applications. In particular, I-MATH has extended the hashing algorithm to perform automatic target recognition of synthetic aperture radar (SAR) imagery. For this application, the hashing is performed upon the geometric locations of dominant scatterers. In addition to being a robust model-based matching algorithm -- invariant under translation, scale, and 3D rotations of the target -- hashing is of particular utility because it can still perform effective matching when the target is partially obscured. Moreover, hashing is very amenable to a SIMD parallel processing architecture, and thus potentially implementable in real time.
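To make the voting idea behind geometric hashing concrete, the following is a minimal sketch in Python. It is not the I-MATH SAR implementation: it handles only 2D point sets under translation, rotation, and uniform scale (the abstract also mentions 3D rotations), and the function names, the `step` bin width, and the table layout are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import permutations


def basis_coords(p, b0, b1):
    """Express p in the frame that maps b0 to (0, 0) and b1 to (1, 0)."""
    dx, dy = b1[0] - b0[0], b1[1] - b0[1]
    vx, vy = p[0] - b0[0], p[1] - b0[1]
    norm2 = dx * dx + dy * dy  # assumes b0 and b1 are distinct points
    # Equivalent to the complex ratio (p - b0) / (b1 - b0), which cancels
    # translation, rotation, and uniform scale.
    return (vx * dx + vy * dy) / norm2, (vy * dx - vx * dy) / norm2


def quantize(coords, step):
    """Map invariant coordinates to a hash-table bin (hypothetical bin width)."""
    return round(coords[0] / step), round(coords[1] / step)


def build_table(models, step=0.1):
    """Offline stage: store every model point under every ordered basis pair."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k not in (i, j):
                    key = quantize(basis_coords(p, pts[i], pts[j]), step)
                    table[key].append((name, i, j))
    return table


def recognize(table, scene, step=0.1):
    """Online stage: try scene basis pairs and vote for (model, basis) entries."""
    best_name, best_votes = None, 0
    for i, j in permutations(range(len(scene)), 2):
        votes = defaultdict(int)
        for k, p in enumerate(scene):
            if k not in (i, j):
                key = quantize(basis_coords(p, scene[i], scene[j]), step)
                for entry in table.get(key, ()):
                    votes[entry] += 1
        for (name, _, _), count in votes.items():
            if count > best_votes:
                best_name, best_votes = name, count
    return best_name, best_votes
```

Because each scene basis pair is processed independently, the outer loop of `recognize` is the part that would map naturally onto parallel hardware. Matching still succeeds when some model points are missing from the scene, since the remaining points continue to vote for the same table entry.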
Fast and accurate de novo genome assembly from long uncorrected reads
Vaser, Robert; Sović, Ivan; Nagarajan, Niranjan
2017-01-01
The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction step can be omitted and that high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial-order alignment–based, stand-alone consensus module called Racon. Based on tests with PacBio and Oxford Nanopore data sets, we show that Racon coupled with miniasm enables consensus genomes with similar or better quality than state-of-the-art methods while being an order of magnitude faster. PMID:28100585
van Beek, Nathalie; Stegeman, Dick F; van den Noort, Josien C; H E J Veeger, DirkJan; Maas, Huub
2018-02-01
The fingers of the human hand cannot be controlled fully independently. This phenomenon may have a neurological as well as a mechanical basis. Despite previous studies, the neuromechanics of finger movements are not fully understood. The aims of this study were (1) to assess the activation and coactivation patterns of finger specific flexor and extensor muscle regions during instructed single finger flexion and (2) to determine the relationship between enslaved finger movements and respective finger muscle activation. In 9 healthy subjects (age 22-29), muscle activation was assessed during single finger flexion using a 90 surface electromyography electrode grid placed over the flexor digitorum superficialis (FDS) and the extensor digitorum (ED). We found (1) no significant differences in muscle activation timing between fingers, (2) considerable muscle activity in flexor and extensor regions associated with the non-instructed fingers and (3) no correlation between the muscle activations and corresponding movement of non-instructed fingers. A clear disparity was found between the movement pattern of the non-instructed fingers and the activity pattern of the corresponding muscle regions. This suggests that mechanical factors, such as intertendinous and myofascial connections, may also affect finger movement independency and need to be taken into consideration when studying finger movement. Copyright © 2017 Elsevier Ltd. All rights reserved.
Rittle-Johnson, Bethany; Fyfe, Emily R; Loehr, Abbey M
2016-12-01
Students, parents, teachers, and theorists often advocate for direct instruction on both concepts and procedures, but some theorists suggest that including instruction on procedures in combination with concepts may limit learning opportunities and student understanding. This study evaluated the effect of instruction on a math concept and procedure within the same lesson relative to a comparable amount of instruction on the concept alone. Direct instruction was provided before or after solving problems to evaluate whether the type of instruction interacted with the timing of instruction within a lesson. We worked with 180 second-grade children in the United States. In a randomized experiment, children received a classroom lesson on mathematical equivalence in one of four conditions that varied in instruction type (conceptual or combined conceptual and procedural) and in instruction order (instruction before or after solving problems). Children who received two iterations of conceptual instruction had better retention of conceptual and procedural knowledge than children who received both conceptual and procedural instruction in the same lesson. Order of instruction did not impact outcomes. Findings suggest that within a single lesson, spending more time on conceptual instruction may be more beneficial than time spent teaching a procedure when the goal is to promote more robust understanding of target concepts and procedures. © 2016 The British Psychological Society.
Adolescents' Chunking of Computer Programs.
ERIC Educational Resources Information Center
Magliaro, Susan; Burton, John K.
To investigate what children learn during computer programming instruction, students attending a summer computer camp were asked to recall either single lines or chunks of computer programs from either coherent or scrambled programs. The 16 subjects, ages 12 to 17, were divided into three instructional groups: (1) beginners, who were taught to…
Variable-Interval Sequenced-Action Camera (VINSAC). Dissemination Document No. 1.
ERIC Educational Resources Information Center
Ward, Ted
The 16 millimeter (mm) Variable-Interval Sequenced-Action Camera (VINSAC) is designed for inexpensive photographic recording of effective teacher instruction and use of instructional materials for teacher education and research purposes. The camera photographs single frames at preselected time intervals (.5 second to 20 seconds) which are…
Teaching Algebraic Equations to Middle School Students with Intellectual Disabilities
ERIC Educational Resources Information Center
Baker, Joshua N.; Rivera, Christopher J.; Morgan, Joseph John; Reese, Noelle
2015-01-01
The purpose of this study was to replicate similar instructional techniques of Jimenez, Browder, and Courtade (2008) using a single-subject multiple-probe across participants design to investigate the effects of task analytic instruction coupled with semi-concrete representations to teach linear algebraic equations to middle school students with…
How District Leaders Use Knowledge Management to Influence Principals' Instructional Leadership
ERIC Educational Resources Information Center
McGloughlin, Denise Marie
2016-01-01
The study of knowledge management, an integrated system of an organization's culture, conditions, and structure, as applied to educational institutions is limited. It was not known how district leaders use knowledge management to influence principals' instructional leadership performance. The purpose of this qualitative single-case study was to…
Analysis of area-time efficiency for an integrated focal plane architecture
NASA Astrophysics Data System (ADS)
Robinson, William H.; Wills, D. Scott
2003-05-01
Monolithic integration of photodetectors, analog-to-digital converters, digital processing, and data storage can improve the performance and efficiency of next-generation portable image products. Our approach combines these components into a single processing element, which is tiled to form a SIMD focal plane processor array with the capability to execute early image applications such as median filtering (noise removal), convolution (smoothing), and inside edge detection (segmentation). Digitizing and processing a pixel at the detection site presents new design challenges, including the allocation of silicon resources. This research investigates the area-time (AT²) efficiency by adjusting the number of Pixels-per-Processing Element (PPE). Area calculations are based upon hardware implementations of components scaled for 250 nm or 120 nm technology. The total execution time is calculated from the sequential execution of each application on a generic focal plane architectural simulator. For a Quad-CIF system resolution (176×144), results show that 1 PPE provides the optimal area-time efficiency (5.7 μs²·mm² for 250 nm, 1.7 μs²·mm² for 120 nm) but requires a large silicon chip (2072 mm² for 250 nm, 614 mm² for 120 nm). Increasing the PPE to 4 or 16 can reduce silicon area by 48% and 60% respectively (120 nm technology) while maintaining performance within real-time constraints.
Root, Jenny R; Stevenson, Bradley S; Davis, Luann Ley; Geddes-Hall, Jennifer; Test, David W
2017-02-01
Computer-assisted instruction (CAI) is growing in popularity and has demonstrated positive effects for students with disabilities, including those with autism spectrum disorder (ASD). In this review, criteria for group experimental and single case studies were used to determine quality (Horner et al., Exceptional Children 71:165-179, 2005; Gersten et al., Exceptional Children 71:149-164, 2005; National Technical Assistance Center on Transition Center 2015). Included studies of high and adequate quality were further analyzed in terms of content, context, and specific instructional practices. Based on the NTACT criteria, this systematic review has established CAI as an evidence-based practice for teaching academics to students with ASD with support from 10 single-case and two group design studies of high or adequate quality. Suggestions for future research and implications for practice are discussed.
Parallel algorithms for boundary value problems
NASA Technical Reports Server (NTRS)
Lin, Avi
1990-01-01
A general approach to solving boundary value problems numerically in a parallel environment is discussed. The basic algorithm consists of two steps: the local step, where all the P available processors work in parallel, and the global step, where one processor solves a tridiagonal linear system of order P. The main advantages of this approach are twofold. First, the suggested approach is very flexible, especially in the local step, and thus the algorithm can be used with any number of processors and with any SIMD or MIMD machine. Second, the communication complexity is very small, and thus the algorithm can be used just as easily with shared memory machines. Several examples of using this strategy are discussed.
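The abstract does not say how the order-P tridiagonal system in the global step is solved. A common serial choice is the Thomas algorithm, sketched below in Python as an assumption rather than as the author's method; the function name and the argument convention are illustrative.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is done, so the matrix
    is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: remove the sub-diagonal one row at a time.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution from the last unknown upward.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The cost is linear in the system size, so with one unknown per processor the global step stays cheap compared with the parallel local step.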
A sparse matrix algorithm on the Boolean vector machine
NASA Technical Reports Server (NTRS)
Wagner, Robert A.; Patrick, Merrell L.
1988-01-01
VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via a cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for sparse matrices on the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
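For readers unfamiliar with the kernel at the heart of such an iteration, here is a minimal serial sketch of a sparse matrix-vector product in Python. The compressed sparse row (CSR) layout and the function name are assumptions chosen for illustration; the abstract does not describe the storage scheme or the data distribution used on the BVM.

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    indptr[r]:indptr[r + 1] delimits the nonzeros of row r;
    indices holds their column numbers and data their values.
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        # Only the stored nonzeros of this row contribute to y[row].
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y
```

Each row's dot product is independent of the others, which is what makes this kernel a candidate for SIMD-style execution with one or more rows per processor.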
An update on the BQCD Hybrid Monte Carlo program
NASA Astrophysics Data System (ADS)
Haar, Taylor Ryan; Nakamura, Yoshifumi; Stüben, Hinnerk
2018-03-01
We present an update of BQCD, our Hybrid Monte Carlo program for simulating lattice QCD. BQCD is one of the main production codes of the QCDSF collaboration and is used by CSSM and in some Japanese finite temperature and finite density projects. Since the first publication of the code at Lattice 2010 the program has been extended in various ways. New features of the code include: dynamical QED, action modification in order to compute matrix elements by using Feynman-Hellmann theory, more trace measurements (like Tr(D^-n) for κ, c_SW and chemical potential reweighting), a more flexible integration scheme, polynomial filtering, term-splitting for RHMC, and a portable implementation of performance critical parts employing SIMD.
Abraham, Mark James; Murtola, Teemu; Schulz, Roland; ...
2015-07-15
GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
Electromagnetic Physics Models for Parallel Computing Architectures
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Benchmarking and performance analysis of the CM-2. [SIMD computer]
NASA Technical Reports Server (NTRS)
Myers, David W.; Adams, George B., II
1988-01-01
A suite of benchmarking routines testing communication, basic arithmetic operations, and selected kernel algorithms written in LISP and PARIS was developed for the CM-2. Experiment runs are automated via a software framework that sequences individual tests, allowing for unattended overnight operation. Multiple measurements are made and treated statistically to generate well-characterized results from the noisy values given by cm:time. The results obtained provide a comparison with similar, but less extensive, testing done on a CM-1. Tests were chosen to aid the algorithmist in constructing fast, efficient, and correct code on the CM-2, as well as gain insight into what performance criteria are needed when evaluating parallel processing machines.
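The practice of repeating a measurement and summarizing it statistically carries over to any noisy timer. The sketch below shows the idea in Python; `time.perf_counter` stands in for the CM-2's `cm:time`, and the function name, the default repeat count, and the choice of summary statistics are assumptions, not details taken from the paper.

```python
import statistics
import time


def benchmark(fn, repeats=15):
    """Time fn() several times and summarize the noisy samples.

    repeats must be at least 2 so that a standard deviation exists.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median": statistics.median(samples),  # robust to outliers
        "stdev": statistics.stdev(samples),    # spread of the measurements
        "min": min(samples),                   # best case observed
        "samples": samples,
    }
```

For example, `benchmark(lambda: sum(range(100_000)))` returns a dictionary whose median is usually a steadier figure to report than any single run.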
NASA Astrophysics Data System (ADS)
Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.
2017-11-01
The current trend in processor manufacturing focuses on multi-core architectures rather than on increasing clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created huge demand for data processing, and such throughput-intensive applications inherently contain data-level parallelism, which is well suited to SIMD-architecture-based GPUs. This paper reviews the architectural aspects of multi/many-core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.
Defining the Difference: Comparing Integrated and Traditional Single-Subject Lessons
ERIC Educational Resources Information Center
Zhbanova, Ksenia S.; Rule, Audrey C.; Montgomery, Sarah E.; Nielsen, Lynn E.
2010-01-01
Early childhood curricula should be authentic and child-centered, however, many teachers still rely on direct instruction lessons. To better define how an integrated curriculum meets the needs of students, this study examined teacher talk and actions during instructional activities with first and second graders under two conditions: (1)…
ERIC Educational Resources Information Center
Shaltz, Mark B.
An experiment was conducted that compared the teaching effectiveness of a computer assisted instructional module and a lecture-discussion. The module, Predator Functional Response (PFR), was developed as part of the SUMIT (Single-concept User-adaptable Microcomputer-based Instructional Technique) project. A class of 30 students was randomly…
The Perceived Impact of Mindfulness Instruction on Pre-Service Elementary Teachers
ERIC Educational Resources Information Center
Brown, Rachel
2017-01-01
This study explored the self-reported effects of mindfulness instruction on pre-service elementary teachers in the context of a literacy education course. Twenty female undergraduates participated in a study that occurred over the course of a single semester in one university. Analysis of the results indicated that students were not significantly…
Examining Learning Rates in the Evaluation of Academic Interventions That Target Reading Fluency
ERIC Educational Resources Information Center
Solomon, Benjamin G.; Poncy, Brian C.; Caravello, Devin J.; Schweiger, Emily M.
2018-01-01
The purpose of the current study is to determine whether single-case intervention studies targeting reading fluency, ranked by traditional outcome metrics (i.e., effect sizes derived from phase differences), were discrepant with rankings based on instructional efficiency, including growth per session and minutes of instruction. Converging with…
Epilogue: Reading Comprehension Is Not a Single Ability--Implications for Assessment and Instruction
ERIC Educational Resources Information Center
Kamhi, Alan G.; Catts, Hugh W.
2017-01-01
Purpose: In this epilogue, we review the 4 response articles and highlight the implications of a multidimensional view of reading for the assessment and instruction of reading comprehension. Method: We reiterate the problems with standardized tests of reading comprehension and discuss the advantages and disadvantages of recently developed…
ERIC Educational Resources Information Center
Marsicano, Richard T.; Morrison, Julie Q.; Moomaw, Sally C.; Fite, Nathan M.; Kluesener, Courtney M.
2015-01-01
The current study used a single-case design to examine two performance feedback conditions varying in intensity on the frequency of naturalistic math instruction in preschool classrooms during non-instructional times (transition, lunch, free play). Three Head Start teachers received professional development that combined information on four…
ERIC Educational Resources Information Center
Bishop, Crystal D.; Snyder, Patricia A.; Crow, Robert E.
2015-01-01
We used a multi-component single-subject experimental design across three preschool teachers to examine the effects of video self-monitoring with graduated training and feedback on the accuracy with which teachers monitored their implementation of embedded instructional learning trials. We also examined changes in teachers' implementation of…
ERIC Educational Resources Information Center
Datchuk, Shawn M.; Kubina, Richard M., Jr.
2017-01-01
The present study used a multiple-baseline, single-case experimental design to investigate the effects of a multicomponent intervention on construction of simple sentences and word sequences. The intervention entailed sequential delivery of sentence instruction and frequency building to a performance criterion and paragraph instruction.…
ERIC Educational Resources Information Center
Sarsar, Firat
2017-01-01
The purpose of this study was to investigate the effectiveness of Emotional Motivational Feedback Message (EMFEM) in an online learning environment. This exploratory research was conducted using mixed method single case study design. Participants were 15 undergraduate students enrolled in an instructional technology course in a large state…
Teaching Children with Language-Learning Disabilities to Plan and Revise Compare-Contrast Texts
ERIC Educational Resources Information Center
Shen, Mei; Troia, Gary A.
2018-01-01
This study used a multiple-probe, multiple-baseline single-case design to investigate the efficacy of planning, and then revising strategy instruction using self-regulated strategy development on the compare-contrast writing performance of three late elementary students with language-learning disabilities. After receiving the planning instruction,…
Characterization versus Narration: Drama's Role in Multimedia Instructional Software
ERIC Educational Resources Information Center
Cates, Ward Mitchell; Bishop, M. J.; Hung, Woei
2005-01-01
As part of an ongoing research program, the authors investigated the use of single-voiced narration and multi-voiced characterizations/monologues in a formative evaluation study of an instructional lesson on information processing. That lesson employed a design based on the use of content-related metaphors and a metaphorical graphical user…
Libraries and Instructional Materials Centers. Educational Facilities Review Series Number 13.
ERIC Educational Resources Information Center
Baas, Alan M.
The concept of the instructional materials center (IMC) has evolved in response to the limitations of the traditional single-resource library. The IMC is an organizational solution for integrating traditional library services with the variety of multimedia devices and materials necessary to contemporary educational practice. The concept grew from…
Unifying Computer-Based Assessment across Conceptual Instruction, Problem-Solving, and Digital Games
ERIC Educational Resources Information Center
Miller, William L.; Baker, Ryan S.; Rossi, Lisa M.
2014-01-01
As students work through online learning systems such as the Reasoning Mind blended learning system, they often are not confined to working within a single educational activity; instead, they work through various different activities such as conceptual instruction, problem-solving items, and fluency-building games. However, most work on assessing…
Epilogue: Reading Comprehension Is Not a Single Ability-Implications for Assessment and Instruction.
Kamhi, Alan G; Catts, Hugh W
2017-04-20
In this epilogue, we review the 4 response articles and highlight the implications of a multidimensional view of reading for the assessment and instruction of reading comprehension. We reiterate the problems with standardized tests of reading comprehension and discuss the advantages and disadvantages of recently developed authentic tests of reading comprehension. In the "Instruction" section, we review the benefits and limitations of strategy instruction and highlight suggestions from the response articles to improve content and language knowledge. We argue that the only compelling reason to administer a standardized test of reading comprehension is when these tests are necessary to qualify students for special education services. Instruction should be focused on content knowledge, language knowledge, and specific task and learning requirements. This instruction may entail the use of comprehension strategies, particularly those that are specific to the task and focus on integrating new knowledge with prior knowledge.
Johnson, Heather A.; Barrett, Laura
2017-01-01
Objective: The purpose of this study was to compare two pedagogical methods, active learning and passive instruction, to determine which is more useful in helping students to achieve the learning outcomes in a one-hour research skills instructional session. Methods: Two groups of high school students attended an instructional session to learn about consumer health resources and strategies to enhance their searching skills. The first group received passive instruction, and the second engaged in active learning. We assessed both groups’ learning using 2 methods with differing complexity. A total of 59 students attended the instructional sessions (passive instruction, n=28; active learning, n=31). Results: We found that the active learning group scored more favorably in four assessment categories. Conclusions: Active learning may help students engage with and develop a meaningful understanding of several resources in a single session. Moreover, when using a complex teaching strategy, librarians should be mindful to gauge learning using an equally complex assessment method. PMID: 28096745
The Effects of Single and Dual Coded Multimedia Instructional Methods on Chinese Character Learning
ERIC Educational Resources Information Center
Wang, Ling
2013-01-01
Learning Chinese characters is a difficult task for adult English native speakers due to the significant differences between the Chinese and English writing system. The visuospatial properties of Chinese characters have inspired the development of instructional methods using both verbal and visual information based on the Dual Coding Theory. This…
ERIC Educational Resources Information Center
Morley, Donald D.
2012-01-01
The vast majority of the research on student evaluation of instruction has assessed the reliability of groups of courses and yielded either a single reliability coefficient for the entire group, or grouped reliability coefficients for each student evaluation of teaching (SET) item. This manuscript argues that these practices constitute a form of…
ERIC Educational Resources Information Center
Vincent, Susan, Ed.
In multigrade instruction, children of at least a 2-year grade span and diverse ability levels are grouped in a single classroom and share experiences involving intellectual, academic, and social skills. "The Multigrade Classroom" is a seven-book series that provides an overview of current research on multigrade instruction, identifies key issues…
ERIC Educational Resources Information Center
Hart, John T., Jr.
2016-01-01
The purpose of this study was to examine the effects of Laban Effort Action (slash) instruction in an undergraduate conducting class on college wind ensemble members' ratings of conductors' gestural clarity. Participants--undergraduate and graduate wind ensemble members (N = 28)--rated 32 videos of eight undergraduate conducting students who had…
Science Instruction for Students with Emotional and Behavioral Disorders
ERIC Educational Resources Information Center
Therrien, William J.; Taylor, Jonte C.; Watt, Sarah; Kaldenberg, Erica R.
2014-01-01
This review examined classroom science instruction for students with emotional and behavioral disorders (EBD). A total of 11 group and single-subject studies were analyzed. Across all group studies, a conservatively calculated mean effect size of 0.471 was obtained indicating the interventions as a whole had at least a small to moderate impact on…
ERIC Educational Resources Information Center
Iowa State Dept. of Agriculture, Des Moines.
These instructional materials on agricultural diversification and marketing were developed for use by Iowa's vocational and technical agricultural instructors and extension personnel. This document is one of three manuals making up a single package. (The other two are Christmas Tree Production and Marketing and Sod Production and Marketing). The…
ERIC Educational Resources Information Center
Watson, Shevaun E.; Rex, Cathy; Markgraf, Jill; Kishel, Hans; Jennings, Eric; Hinnant, Kate
2013-01-01
The one-shot library instruction session has long been a mainstay for many information literacy programs. Identifying realistic learning goals, integrating active learning techniques, and conducting meaningful assessment for a single lesson all present challenges. Librarians and English faculty at one college campus confronted these challenges by…
ERIC Educational Resources Information Center
Root, Jenny R.; Stevenson, Bradley S.; Davis, Luann Ley; Geddes-Hall, Jennifer; Test, David W.
2017-01-01
Computer-assisted instruction (CAI) is growing in popularity and has demonstrated positive effects for students with disabilities, including those with autism spectrum disorder (ASD). In this review, criteria for group experimental and single case studies were used to determine quality (Horner et al., "Exceptional Children" 71:165-179,…
MULTI-CORE AND OPTICAL PROCESSOR RELATED APPLICATIONS RESEARCH AT OAK RIDGE NATIONAL LABORATORY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barhen, Jacob; Kerekes, Ryan A; ST Charles, Jesse Lee
2008-01-01
High-speed parallelization of common tasks holds great promise as a low-risk approach to achieving the significant increases in signal processing and computational performance required for next generation innovations in reconfigurable radio systems. Researchers at the Oak Ridge National Laboratory have been working on exploiting the parallelization offered by this emerging technology and applying it to a variety of problems. This paper will highlight recent experience with four different parallel processors applied to signal processing tasks that are directly relevant to signal processing required for SDR/CR waveforms. The first is the EnLight Optical Core Processor applied to matched filter (MF) correlation processing via fast Fourier transform (FFT) of broadband Doppler-sensitive waveforms (DSW) using active sonar arrays for target tracking. The second is the IBM CELL Broadband Engine applied to 2-D discrete Fourier transform (DFT) kernel for image processing and frequency domain processing. And the third is the NVIDIA graphical processor applied to document feature clustering. EnLight Optical Core Processor. Optical processing is inherently capable of high-parallelism that can be translated to very high performance, low power dissipation computing. The EnLight 256 is a small form factor signal processing chip (5x5 cm2) with a digital optical core that is being developed by an Israeli startup company. As part of its evaluation of foreign technology, ORNL's Center for Engineering Science Advanced Research (CESAR) had access to precursor EnLight 64 Alpha hardware for a preliminary assessment of capabilities in terms of large Fourier transforms for matched filter banks and on applications related to Doppler-sensitive waveforms. This processor is optimized for array operations, which it performs in fixed-point arithmetic at the rate of 16 TeraOPS at 8-bit precision. This is approximately 1000 times faster than the fastest DSP available today.
The optical core performs the matrix-vector multiplications, where the nominal matrix size is 256x256. The system clock is 125MHz. At each clock cycle, 128K multiply-and-add operations are carried out, which yields a peak performance of 16 TeraOPS (operations per second). IBM Cell Broadband Engine. The Cell processor is the extraordinary resulting product of 5 years of sustained, intensive R&D collaboration (involving over $400M investment) between IBM, Sony, and Toshiba. Its architecture comprises one multithreaded 64-bit PowerPC processor element (PPE) with VMX capabilities and two levels of globally coherent cache, and 8 synergistic processor elements (SPEs). Each SPE consists of a processor (SPU) designed for streaming workloads, local memory, and a globally coherent direct memory access (DMA) engine. Computations are performed in 128-bit wide single instruction multiple data streams (SIMD). An integrated high-bandwidth element interconnect bus (EIB) connects the nine processors and their ports to external memory and to system I/O. The Applied Software Engineering Research (ASER) Group at the ORNL is applying the Cell to a variety of text and image analysis applications. Research on Cell-equipped PlayStation3 (PS3) consoles has led to the development of a correlation-based image recognition engine that enables a single PS3 to process images at more than 10X the speed of state-of-the-art single-core processors. NVIDIA Graphics Processing Units. The ASER group is also employing the latest NVIDIA graphical processing units (GPUs) to accelerate clustering of thousands of text documents using recently developed clustering algorithms such as document flocking and affinity propagation.
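The peak-performance figure quoted in the abstract above can be checked with a short calculation. The sketch below is only a back-of-the-envelope check, not anything taken from the paper: it assumes that each element of the 256x256 matrix contributes one multiply and one add per clock cycle, which is one reading of "128K multiply-and-add operations". The function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the EnLight 256 peak-throughput figure.
# Assumption (not stated in the abstract): every element of the 256 x 256
# matrix contributes one multiply and one add per clock cycle.

MATRIX_DIM = 256          # nominal matrix size from the abstract
CLOCK_HZ = 125_000_000    # 125 MHz system clock from the abstract
OPS_PER_ELEMENT = 2       # assumed: one multiply plus one add


def peak_ops_per_second(dim, clock_hz, ops_per_element=OPS_PER_ELEMENT):
    """Return peak operations per second for a dim x dim matrix-vector core."""
    ops_per_cycle = dim * dim * ops_per_element
    return ops_per_cycle * clock_hz


ops_per_cycle = MATRIX_DIM * MATRIX_DIM * OPS_PER_ELEMENT
peak = peak_ops_per_second(MATRIX_DIM, CLOCK_HZ)

print(f"{ops_per_cycle // 1024}K operations per cycle")
print(f"{peak / 1e12:.1f} TeraOPS peak")
```

Under that assumption the per-cycle count comes to 128K operations and the product with the clock rate is about 16.4 x 10^12 operations per second, which is consistent with the 16 TeraOPS quoted in the abstract.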
Trussell, Jessica W; Nordhaus, Jason; Brusehaber, Alison; Amari, Brittany
2018-04-17
Deaf and hard-of-hearing (DHH) students have exhibited a morphological knowledge delay that begins in preschool and persists through college. Morphological knowledge is critical to vocabulary understanding and text comprehension in the science classroom. We investigated the effects of morphological instruction, commonly referred to as Word Detectives, on the morphological knowledge of college-age DHH students in a science course. We implemented a multiple probe across behaviors single-case experimental design study with nine student participants. The student participants attended the National Technical Institute for the Deaf. A functional relation was found between the morphological instruction and the student participants' improvement of morphological knowledge regarding the morphemes taught during instruction. These findings indicate that DHH students benefit from morphological instruction to build their vocabulary knowledge in content-area classrooms, such as science courses.
Hardware accelerator design for tracking in smart camera
NASA Astrophysics Data System (ADS)
Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil
2011-10-01
Smart cameras are important components in video analysis. For video analysis, smart cameras need to detect interesting moving objects, track such objects from frame to frame, and perform analysis of object tracks in real time. Therefore, real-time tracking is prominent in smart cameras. A software implementation of a tracking algorithm on a general purpose processor (like PowerPC) achieves only a low frame rate, far from real-time requirements. This paper presents a SIMD-based hardware accelerator designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on a Xilinx XUP Virtex-IIPro FPGA. The resulting frame rate is 30 frames per second for 250x200 resolution video in gray scale.
Electromagnetic physics models for parallel computing architectures
Amadio, G.; Ananya, A.; Apostolakis, J.; ...
2016-11-21
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Finally, the results of preliminary performance evaluation and physics validation are presented as well.
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre-processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
First experience of vectorizing electromagnetic physics models for detector simulation
NASA Astrophysics Data System (ADS)
Amadio, G.; Apostolakis, J.; Bandieramonte, M.; Bianchini, C.; Bitzes, G.; Brun, R.; Canal, P.; Carminati, F.; de Fine Licht, J.; Duhem, L.; Elvira, D.; Gheata, A.; Jun, S. Y.; Lima, G.; Novak, M.; Presbyterian, M.; Shadura, O.; Seghal, R.; Wenzel, S.
2015-12-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. The GeantV vector prototype for detector simulations has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth, parallelization needed to achieve optimal performance or memory access latency and speed. An additional challenge is to avoid the code duplication often inherent to supporting heterogeneous platforms. In this paper we present the first experience of vectorizing electromagnetic physics models developed for the GeantV project.
Prologue: Reading Comprehension Is Not a Single Ability.
Catts, Hugh W; Kamhi, Alan G
2017-04-20
In this initial article of the clinical forum on reading comprehension, we argue that reading comprehension is not a single ability that can be assessed by one or more general reading measures or taught by a small set of strategies or approaches. We present evidence for a multidimensional view of reading comprehension that demonstrates how it varies as a function of reader ability, text, and task. The implications of this view for instruction of reading comprehension are considered. Reading comprehension is best conceptualized with a multidimensional model. The multidimensionality of reading comprehension means that instruction will be more effective when tailored to student performance with specific texts and tasks.
Markopoulos, G; Rutherford, A; Cairns, C; Green, J
2010-08-01
Murnane and Phelps (1993) recommend word pair presentations in local environmental context (EC) studies to prevent associations being formed between successively presented items and their ECs and a consequent reduction in the EC effect. Two experiments were conducted to assess the veracity of this assumption. In Experiment 1, participants memorised single words or word pairs, or categorised them as natural or man made. Their free recall protocols were examined to assess any associations established between successively presented items. Fewest associations were observed when the item-specific encoding task (i.e., natural or man made categorisation of word referents) was applied to single words. These findings were examined further in Experiment 2, where the influence of encoding instructions and stimulus presentation on local EC dependent recognition memory was examined. Consistent with recognition dual-process signal detection model predictions and findings (e.g., Macken, 2002; Parks & Yonelinas, 2008), recollection sensitivity, but not familiarity sensitivity, was found to be local EC dependent. However, local EC dependent recognition was observed only after item-specific encoding instructions, irrespective of stimulus presentation. These findings and the existing literature suggest that the use of single word presentations and item-specific encoding enhances local EC dependent recognition.
ERIC Educational Resources Information Center
Britt, Alexander P.
2015-01-01
A single-subject, multiple-baseline across participants design was used to examine the functional relation between systematic instruction and the ability to complete a graphic organizer and recall facts about informational texts by students with significant development disabilities. Four high school students enrolled in an adapted academic program…
ERIC Educational Resources Information Center
Tozkoparam, Süleyman Burak; Kiliç, Muhammet Emre; Usta, Ertugrul
2015-01-01
The aim of this study is to determine the Technological Pedagogical Content Knowledge (TPACK) competencies of teacher candidates in the Turkish Teaching department of Mevlana (Rumi) University and the effect of the Instructional Technology and Material Design (ITMD) course on TPACK. The study is a quantitative, single-group pretest-posttest…
ERIC Educational Resources Information Center
Mutlu, Yilmaz; Akgün, Levent
2017-01-01
The aim of this study is to examine the effects of computer assisted instruction materials on approximate number skills of students with mathematics learning difficulties. The study was carried out with pretest-posttest quasi experimental method with a single subject. The participants of the study consist of a girl and two boys who attend 3rd…
Impact of Explicit Vocabulary Instruction on Writing Achievement of Upper-Intermediate EFL Learners
ERIC Educational Resources Information Center
Solati-Dehkordi, Seyed Amir; Salehi, Hadi
2016-01-01
The purpose of the present study is to examine the effects of explicit vocabulary instruction on L2 learners' writing skill and on their short- and long-term retention. To this end, a fill-in-the-blank test including 36 single words and 60 lexical phrases was administered to 30 female upper-intermediate EFL learners. The EFL…
The Differentiated School: Making Revolutionary Changes in Teaching and Learning
ERIC Educational Resources Information Center
Tomlinson, Carol Ann; Brimijoin, Kay; Narvaez, Lane
2008-01-01
While there are lots of books with information on why and how to differentiate instruction, here at last is a book that gets to the heart of why some efforts to differentiate flop while others lead to sweeping, positive results. When it's your job to support differentiated instruction in a single school or systemwide, you need the guidance from…
ERIC Educational Resources Information Center
Grabowski, Barbara
An intelligent videodisc system on which comprehensive instructional development research can be conducted has been developed. This integrated learning system combines all other existing media, except objects, using a videodisc, microcomputer, printer, single monitor, hard disc storage with CPU for random access digitized audio, and headphones.…
ERIC Educational Resources Information Center
Bolden, Felicia Mickles
2012-01-01
A persistent mathematics achievement gap between African American, Hispanic, and European American students at one elementary school was the focus of this investigation. The research questions of this single site case study involved understanding why an achievement gap exists and identifying the instructional strategies and best practices used to…
ERIC Educational Resources Information Center
Roehr-Brackin, Karen
2014-01-01
This article considers explicit knowledge and processes in second language (L2) learning from a usage-based theoretical perspective. It reports on the long-term development of a single instructed adult learner's use of two L2 constructions, the German Perfekt of "gehen" ("go," "walk") and "fahren"…
ERIC Educational Resources Information Center
Barton, Erin E.; Pustejovsky, James E.; Maggin, Daniel M.; Reichow, Brian
2017-01-01
The adoption of methods and strategies validated through rigorous, experimentally oriented research is a core professional value of special education. We conducted a systematic review and meta-analysis examining the experimental literature on Technology-Aided Instruction and Intervention (TAII) using research identified as part of the National…
Instruction on the Go: Reaching out to Students from the Academic Library
ERIC Educational Resources Information Center
Moorefield-Lang, Heather; Hall, Tracy
2015-01-01
The purpose of this paper is to describe how a series of one-shot or single class library instruction webinars were created for on-campus and distance education students at Virginia Tech, a land grant institution in rural southwestern Virginia. Virginia Tech's distance learning department on campus trained in Centra 7.6 software and the lead…
ERIC Educational Resources Information Center
Pavlenko, Aneta
2005-01-01
The focus of this paper is on the complex interaction between ideologies of language, gender and identity during the Americanisation era (1900-1924) in the USA. I will argue that the Americanisation movement had a "hidden curriculum" which singled out immigrant women--and in particular mothers--for specific kinds of English instruction.…
NASA Astrophysics Data System (ADS)
Mitchell, Sherese A.
This researcher investigated the long- and short-term retention of information using traditional instruction versus previously tested tactual resources versus innovative tactual resources on the achievement and attitudes of second-grade students in science. The processing of new and difficult knowledge has challenged many young children who tend to be kinesthetic or tactual learners. In compliance with the National Science Education Standards, students should be actively engaged in their own learning. Therefore, to boost student achievement in science, the use of tactual materials was implemented. The sample included 67 second-grade students drawn from three heterogeneously grouped classes in a low socio-economic neighborhood. It consisted of 30 females and 37 males of which 97 percent were African American, 2 percent were Hispanic, and 1 percent Other. Students were unaware of their diagnosed learning-style preference(s) during the instruction and assessment phases of the study. Therefore, students' knowledge of their learning-style preferences could not have had any impact on their achievement or attitudes. A counterbalanced research design was employed. During the first session, Group 1 was taught with previously tested tactual resources (Electroboards, Flip Chutes, Fact Wheels, and Fact Fans), and Group 3 was taught traditionally. During the second session of instruction, Group 1 received instruction with innovative tactual resources, Group 2 received traditional instruction, Group 3 received instruction with previously tested tactual resources. During the final session of instruction, Group 1 received traditional instruction, Group 2 received instruction with previously tested tactual resources, and Group 3 received instruction with innovative tactual resources. 
The results indicated that the use of tactual materials, regardless of whether they were previously tested or innovative, produced higher achievement gains and more positive attitudes than traditional instruction. The lowest gains in the General Linear Model procedure were in the traditional condition. An omnibus ANOVA revealed that the three conditions yielded significantly different results (F(2,65) = 4.66, p = .013). Pairwise analyses of mean differences indicated that the means of both the innovative tactual resources and previously tested tactual resources were significantly different from the means of the traditional condition, but were not significantly different from each other. A series of single factor t-tests was performed on the items of the attitude scale. Most ratings differed from 3.0. The single-sample t-tests indicated that all the ratings were significantly higher than 3.0. This result revealed that the traditional instruction was statistically less effective than the other two tactual treatments.
Support for hands-on optics immersions (Conference Presentation)
NASA Astrophysics Data System (ADS)
Spalding, Gabriel C.; McCann, Lowell I.
2016-09-01
The Advanced Laboratory Physics Association (ALPhA) is an official affiliate organization of the AAPT, supporting upper-level undergraduate instructional lab education in physics. The ALPhA Immersions program is intended to be an efficient use of an instructor's time: with expert colleague-mentors on hand they spend 2.5 days learning a key new instructional experiment (of their choice) well enough to confidently teach it to the students at their home institutions. At an ALPhA Immersion, participants work in groups of no more than three per experimental setup. Our follow-up surveys support the notion that this individualized, concentrated focus directly results in significant updating and improvement of undergraduate laboratory instruction in physics across the country. Such programs have the effect of encouraging investment, on the part of individual institutions. For example, we have disseminated ideas, training, and equipment for contemporary single-photon-based instructional labs dealing with core, contemporary issues in Quantum Mechanics. By the time this paper is presented, ALPhA will have delivered at least 420 single-photon detectors to a wide variety of educational institutions. We have also partnered with the non-profit Jonathan F. Reichert Foundation to support equipment acquisition by institutions participating in our wide variety of training programs.
Neef, N A; Lensbower, J; Hockersmith, I; DePalma, V; Gray, K
1990-01-01
We analyzed the role of the range of variation in training exemplars as a contextual variable influencing the effects of in vivo versus simulation training in producing generalized responding. Four mentally retarded adults received single case instruction, followed by general case instruction, on washing machine and dryer use; one task was taught using actual appliances (in vivo) and the other using simulation. In vivo and simulation training were counterbalanced across the two tasks for the 2 subject pairs, using a within-subjects Latin square design. With both paradigms, more errors were made after single case than after general case instruction during probe sessions with untrained washing machines and dryers. These results suggest that generalization errors were affected by the range of training exemplars and not by the use of simulated versus natural training stimuli. Although both general case simulation and general case in vivo training facilitated generalized performance of laundry skills, an analysis of training time and costs indicated that the former approach was more efficient. The study illustrates a methodology for studying complex interactions and guiding decisions on the optimal use of instructional alternatives. PMID:2074236
ERIC Educational Resources Information Center
Scoggins, Donna K.
2009-01-01
Single-sex education is an instructional innovation implemented to improve student academic achievement by teaching to the learning styles and interests of boys and/or girls. This ex post facto quantitative study examined the differences in academic achievement between single-sex education and coeducation classes on students' achievement in…
Beyond Comprehension Strategy Instruction: What's Next?
Elleman, Amy M; Compton, Donald L
2017-04-20
In this article, we respond to Catts and Kamhi's (2017) argument that reading comprehension is not a single ability. We provide a brief review of the impact of strategy instruction, the importance of knowledge in reading comprehension, and possible avenues for future research and practice. We agree with Catts and Kamhi's argument that reading comprehension is a complex endeavor and that current recommended practices do not reflect the complexity of the construct. Knowledge building, despite its important role in comprehension, has been relegated to a back seat in reading comprehension instruction. In the final section of the article, we outline possible avenues for research and practice (e.g., generative language instruction, dialogic approaches to knowledge building, analogical reasoning and disciplinary literacy, the use of graphics and media, inference instruction) for improving reading-comprehension outcomes. Reading comprehension is a complex ability, and comprehension instruction should reflect this complexity. If we want to have an impact on long-term growth in reading comprehension, we will need to expand our current repertoire of instructional methods to include approaches that support the acquisition and integration of knowledge across a variety of texts and topics.
The Unified Floating Point Vector Coprocessor for Reconfigurable Hardware
NASA Astrophysics Data System (ADS)
Kathiara, Jainik
There has been an increased interest recently in using embedded cores on FPGAs. Many of the applications that make use of these cores have floating point operations. Due to the complexity and expense of floating point hardware, these algorithms are usually converted to fixed point operations or implemented using floating-point emulation in software. As the technology advances, more and more homogeneous computational resources and fixed function embedded blocks are added to FPGAs and hence implementation of floating point hardware becomes a feasible option. In this research we have implemented a high performance, autonomous floating point vector Coprocessor (FPVC) that works independently within an embedded processor system. We have presented a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The Hybrid vector/SIMD computational model of FPVC results in greater overall performance for most applications along with improved peak performance compared to other approaches. By parameterizing vector length and the number of vector lanes, we can design an application specific FPVC and take optimal advantage of the FPGA fabric. For this research we have also initiated designing a software library for various computational kernels, each of which adapts FPVC's configuration and provide maximal performance. The kernels implemented are from the area of linear algebra and include matrix multiplication and QR and Cholesky decomposition. We have demonstrated the operation of FPVC on a Xilinx Virtex 5 using the embedded PowerPC.
Self-Sufficiency for Single-Parent Families: An Integrated Approach.
ERIC Educational Resources Information Center
Townley, Kim F.; And Others
1991-01-01
At the One-Parent Family Facility in Lexington, Kentucky, government agencies, businesses, and the university provide integrated services to meet single-parent needs. The residential, transitional living/learning program offers child development and nutrition instruction, health screening, vocational counseling, and educational planning. (SK)
Can One Lab Make a Difference?
ERIC Educational Resources Information Center
Abbott, David S.; Saul, Jeffery M.; Parker, George W.; Beichner, Robert J.
2000-01-01
Investigates whether replacing a single traditional laboratory activity with a widely-used, non-microcomputer-based laboratory, research-based activity could produce improved conceptual understanding of a topic in electricity. Shows that a single instructional experience utilizing the research-based tutorial materials is superior to a traditional…
Dupan, Sigrid S G; Stegeman, Dick F; Maas, Huub
2018-06-01
Single finger force tasks lead to unintended activation of the non-instructed fingers, commonly referred to as enslaving. Both neural and mechanical factors have been associated with this absence of finger individuality. This study investigates the amplitude modulation of both intrinsic and extrinsic finger muscles during single finger isometric force tasks. Twelve participants performed single finger flexion presses at 20% of maximum voluntary contraction, while simultaneously the electromyographic activity of several intrinsic and extrinsic muscles associated with all four fingers was recorded using 8 electrode pairs in the hand and two 30-electrode grids on the lower arm. The forces exerted by each of the fingers, in both flexion and extension direction, were recorded with individual force sensors. This study shows distinct activation patterns in intrinsic and extrinsic hand muscles. Intrinsic muscles exhibited individuation, where the agonistic and antagonistic muscles associated with the instructed fingers showed the highest activation. This activation in both agonistic and antagonistic muscles appears to facilitate finger stabilisation during the isometric force task. Extrinsic muscles show an activation independent from instructed finger in both agonistic and antagonistic muscles, which appears to be associated with stabilisation of the wrist, with an additional finger-dependent modulation only present in the agonistic extrinsic muscles. These results indicate distinct muscle patterns in intrinsic and extrinsic hand muscles during single finger isometric force pressing. We conclude that the finger specific activation of intrinsic muscles is not sufficient to fully counteract enslaving caused by the broad activation of the extrinsic muscles. Copyright © 2018 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Mann, Zennetta; McLaughlin, T. F.; Williams, Randy Lee; Derby, K. Mark; Everson, Mary
2012-01-01
The purpose of the present study was to evaluate the effects of Direct Instruction (DI) flashcard procedure, combined with strategies and rewards on multiplication fact accuracy of two elementary school-age students. A single subject replication design across three and four sets of multiplication facts was used to evaluate outcomes. The results…
ERIC Educational Resources Information Center
Owocki, Gretchen
2010-01-01
Children's needs differ so vastly that a single program designed to support numerous students can only do so much. More than anything else, students need to use professional expertise to unravel their needs and to plan instruction that is directly responsive. This book makes exemplary RTI possible in every reading classroom. The author gives you…
ERIC Educational Resources Information Center
Lichten, William
A three-part program investigated the use of computers at an inner-city high school. An attempt was made to introduce a digital computer for instructional purposes at the high school. A single portable teletype terminal and a simple programing language, BASIC, were used. It was found that a wide variety of students could benefit from this…
ERIC Educational Resources Information Center
O'Neal, Angeline N.
2013-01-01
In response to numerous mandates in the field of education, schools have found it imperative to ensure that teachers are incorporating effective instructional methods which meet the diverse needs of student populations within a single classroom. The co-teaching model of instruction is just one way educators have chosen to lead classroom…
ERIC Educational Resources Information Center
What Works Clearinghouse, 2013
2013-01-01
The study examined the impact of "POWERSOURCE"©, an intervention consisting of formative assessments, instructional resources, and professional development designed to help teachers provide individual instruction to their students in Algebra I. This study took place in seven districts in Arizona and California during the 2007-08 school…
Code of Federal Regulations, 2011 CFR
2011-10-01
... package No. of packages Bandage compress—4″ 1 Single 5 Bandage compress—2″ 4 do 2 Waterproof adhesive..., forceps, scissors, 12 safety pins 1, 1, 1, and 12, respectively Double 1 Wire splint 1 Single 1 Ammonia..., 61/2 gr tablets, vials of 20 5 Double 1 Sterile petrolatum gauze, 3″×18″ 4 Single 3 (c) Instructions...
NASA Astrophysics Data System (ADS)
Ryoo, Ji Hoon; Tai, Robert H.; Skeeles-Worley, Angela D.
2018-02-01
In longitudinal studies, measurement invariance is required to conduct substantive comparisons over time or across groups. In this study, we examined measurement invariance on a recently developed instrument capturing student preferences for seven instructional strategies related to science learning and career interest. We have labeled these seven instructional strategies as Collaborating, Competing, Caretaking, Creating/Making, Discovering, Performing, and Teaching. A better understanding of student preferences for particular instructional strategies can help educators, researchers, and policy makers deliberately tailor programmatic instructional structure to increase student persistence in the STEM pipeline. However, simply confirming the relationship between student preferences for science instructional strategies and their future career choices at a single time point is not sufficient to clarify our understanding of the relationship between instructional strategies and student persistence in the STEM pipeline, especially since preferences for instructional strategies are understood to vary over time. As such, we sought to develop a measure that invariantly captures student preference over a period of time: the Framework for Observing and Categorizing Instructional Strategies (FOCIS). We administered the FOCIS instrument over four semesters over two middle school grades to 1009 6th graders and 1021 7th graders and confirmed the longitudinal invariance of the FOCIS measure. This confirmation of longitudinal invariance will allow researchers to examine the relationship between student preference for certain instructional strategies and student persistence in the STEM pipeline.
Single-Sex Computer Classes: An Effective Alternative.
ERIC Educational Resources Information Center
Swain, Sandra L.; Harvey, Douglas M.
2002-01-01
Advocates single-sex computer instruction as a temporary alternative educational program to provide middle school and secondary school girls with access to computers, to present girls with opportunities to develop positive attitudes towards technology, and to make available a learning environment conducive to girls gaining technological skills.…
Mannewitz, A; Bock, J; Kreitz, S; Hess, A; Goldschmidt, J; Scheich, H; Braun, Katharina
2018-05-01
Learning can be categorized into cue-instructed and spontaneous learning types; however, so far, there is no detailed comparative analysis of specific brain pathways involved in these learning types. The aim of this study was to compare brain activity patterns during these learning tasks using the in vivo imaging technique of single photon-emission computed tomography (SPECT) of regional cerebral blood flow (rCBF). During spontaneous exploratory learning, higher levels of rCBF compared to cue-instructed learning were observed in motor control regions, including specific subregions of the motor cortex and the striatum, as well as in regions of sensory pathways including olfactory, somatosensory, and visual modalities. In addition, elevated activity was found in limbic areas, including specific subregions of the hippocampal formation, the amygdala, and the insula. The main difference between the two learning paradigms analyzed in this study was the higher rCBF observed in prefrontal cortical regions during cue-instructed learning when compared to spontaneous learning. Higher rCBF during cue-instructed learning was also observed in the anterior insular cortex and in limbic areas, including the ectorhinal and entorhinal cortexes, subregions of the hippocampus, subnuclei of the amygdala, and the septum. Many of the rCBF changes showed hemispheric lateralization. Taken together, our study is the first to compare partly lateralized brain activity patterns during two different types of learning.
Evaluation of language concordant, patient-centered drug label instructions.
Bailey, Stacy Cooper; Sarkar, Urmimala; Chen, Alice Hm; Schillinger, Dean; Wolf, Michael S
2012-12-01
Despite federal laws requiring language access in healthcare settings, most US pharmacies are unable to provide prescription (Rx) medication instructions to limited English proficient (LEP) patients in their native language. To evaluate the efficacy of health literacy-informed, multilingual Rx instructions (the ConcordantRx instructions) to improve Rx understanding, regimen dosing and regimen consolidation in comparison to standard, language-concordant Rx instructions. Randomized, experimental evaluation. Two hundred and two LEP adults speaking five non-English languages (Chinese, Korean, Russian, Spanish, Vietnamese), recruited from nine clinics and community organizations in San Francisco and Chicago. Subjects were randomized to review Rx bottles with either ConcordantRx or standard instructions. Proper demonstration of common prescription label instructions for single and multi-drug medication regimens. Regimen consolidation was assessed by determining how many times per day subjects would take medicine for a multi-drug regimen. Subjects receiving the ConcordantRx instructions demonstrated significantly greater Rx understanding, regimen dosing and regimen consolidation in comparison to those receiving standard instructions (incidence rate ratio [IRR]: 1.25, 95 % confidence interval [CI]: 1.06-1.48; P= 0.007 for Rx understanding, IRR: 1.19, 95 % CI: 1.03-1.39; P= 0.02 for regimen dosing and IRR: 0.76, 95 % CI: 0.64-0.90; P= 0.001 for regimen consolidation). In most cases, instruction type was the sole, independent predictor of outcomes in multivariate models controlling for relevant covariates. There is a need for standardized, multilingual Rx instructions that can be implemented in pharmacy practices to promote safe medication use among LEP patients. The ConcordantRx instructions represent an important step towards achieving this goal.
Kelly, Valerie E; Shumway-Cook, Anne
2014-01-01
Gait impairments are a common and consequential motor symptom in Parkinson's disease (PD). A cognitive strategy that incorporates instructions to concentrate on specific parameters of walking is an effective approach to gait rehabilitation for persons with PD during single-task and simple dual-task walking conditions. This study examined the ability to modify dual-task walking in response to instructions during a complex walking task in people with PD compared to healthy older adults (HOA). Eleven people with PD and twelve HOA performed a cognitive task while walking with either a usual base or a narrow base of support. Dual-task walking and cognitive task performance were characterized under two conditions: when participants were instructed to focus on walking and when they were instructed to focus on the cognitive task. During both usual base and narrow base walking, instructions affected cognitive task response latency, with slower performance when instructed to focus on walking compared to the cognitive task. Regardless of task or instructions, cognitive task performance was slower in participants with PD compared to HOA. During usual base walking, instructions influenced gait speed for both people with PD and HOA, with faster gait speed when instructed to focus on walking compared to the cognitive task. In contrast, during narrow base walking, instructions affected gait speed only for HOA, not for people with PD. This suggests that among people with PD the ability to modify walking in response to instructions depends on the complexity of the walking task.
ERIC Educational Resources Information Center
Shoulders, Catherine Woglom
2012-01-01
The purpose of this study was to determine the effects of a socioscientific issues-based instructional model on secondary agricultural education students' content knowledge, scientific reasoning ability, argumentation skills, and views of the nature of science. This study utilized a pre-experimental, single group pretest-posttest design to assess…
ERIC Educational Resources Information Center
What Works Clearinghouse, 2014
2014-01-01
A recent study, "The Effects of Cognitive Strategy Instruction on Math Problem Solving of Middle School Students of Varying Ability," examined the effectiveness of "Solve It!," a program intended to improve the problem-solving skills of seventh-grade math students. During the program, students are taught cognitive strategies of…
A GUIDE TO INSTRUCTIONAL TELEVISION.
ERIC Educational Resources Information Center
DIAMOND, ROBERT M., ED.
THIS IS A GUIDE DESIGNED AS A SINGLE REFERENCE FOR ADMINISTRATORS, TEACHERS, STUDENTS, AND LAYMEN INTERESTED IN TELEVISION FOR A SPECIFIC SCHOOL OR SCHOOL SYSTEM. FOUR EXAMPLES OF SINGLE-ROOM TELEVISION ARE GIVEN AND SUCCESSFUL APPLICATIONS OF STUDIO TELEVISION ARE PRESENTED. ITS USE IN GUIDANCE AND IN ADMINISTRATION IS EXPLAINED. THE PROBLEMS…
24 CFR 291.304 - Bidding process.
Code of Federal Regulations, 2012 CFR
2012-04-01
... DEVELOPMENT HUD-OWNED PROPERTIES DISPOSITION OF HUD-ACQUIRED SINGLE FAMILY PROPERTY Sale of HUD-Held Single... HUD in accordance with instructions in the bid package for a particular sale. (b) Effect of bid. By... bid package. Along with the bid, the bidder must submit an executed copy of the Loan Sale Agreement...
24 CFR 291.304 - Bidding process.
Code of Federal Regulations, 2014 CFR
2014-04-01
... DEVELOPMENT HUD-OWNED PROPERTIES DISPOSITION OF HUD-ACQUIRED SINGLE FAMILY PROPERTY Sale of HUD-Held Single... HUD in accordance with instructions in the bid package for a particular sale. (b) Effect of bid. By... bid package. Along with the bid, the bidder must submit an executed copy of the Loan Sale Agreement...
24 CFR 291.304 - Bidding process.
Code of Federal Regulations, 2010 CFR
2010-04-01
... DEVELOPMENT HUD-OWNED PROPERTIES DISPOSITION OF HUD-ACQUIRED SINGLE FAMILY PROPERTY Sale of HUD-Held Single... HUD in accordance with instructions in the bid package for a particular sale. (b) Effect of bid. By... bid package. Along with the bid, the bidder must submit an executed copy of the Loan Sale Agreement...
77 FR 37910 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-25
... Program Instructions (PIs). The training and data grants are governed by the "new grant" PI and the basic grant is governed by the "basic grant" PI. Current PIs require separate applications and program... and reporting processes by consolidating the PIs into one single PI and requiring one single...
77 FR 799 - Proposed Information Collection Activity; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-06
... separate Program Instructions (PIs). The training and data grants are governed by the "new grant" PI and the basic grant is governed by the "basic grant" PI. Current PIs require separate applications and... application and reporting processes by consolidating the PIs into one single PI and requiring one single...
76 FR 32213 - Proposed Information Collection Activity; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-03
... separate Program Instructions (PIs). The training and data grants are governed by the "new grant" PI and the basic grant is governed by the "basic grant" PI. Current PIs require separate applications and... application and reporting processes by consolidating the PIs into one single PI and requiring one single...
Parallel Treatments Design: A Nested Single Subject Design for Comparing Instructional Procedures.
ERIC Educational Resources Information Center
Gast, David L.; Wolery, Mark
1988-01-01
This paper describes the parallel treatments design, a nested single subject experimental design that combines two concurrently implemented multiple probe designs, allows control for effects of extraneous variables through counterbalancing, and replicates its effects across behaviors. Procedural guidelines for the design's use and issues related…
Presentation Software and the Single Computer.
ERIC Educational Resources Information Center
Brown, Cindy A.
1998-01-01
Shows how the "Kid Pix" software and a single multimedia computer can aid classroom instruction for kindergarten through second grade. Topics include using the computer as a learning center for small groups of students; making a "Kid Pix" slide show; using it as an electronic chalkboard; and creating curriculum-related…
Developing the Girl as a Leader
ERIC Educational Resources Information Center
Hembrow-Beach, Rose
2011-01-01
Single-sex educational environments can create young women who are engaged, active leaders. Girls receive differential treatment in combined-sex education environments. Girls often do not receive the encouragement or instruction to assume leadership. I want to identify the elements of single-sex education that foster female leadership and consider…
APRON: A Cellular Processor Array Simulation and Hardware Design Tool
NASA Astrophysics Data System (ADS)
Barr, David R. W.; Dudek, Piotr
2009-12-01
We present a software environment for the efficient simulation of cellular processor arrays (CPAs). This software (APRON) is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.
Applications of massively parallel computers in telemetry processing
NASA Technical Reports Server (NTRS)
El-Ghazawi, Tarek A.; Pritchard, Jim; Knoble, Gordon
1994-01-01
Telemetry processing refers to the reconstruction of full resolution raw instrumentation data with artifacts, of space and ground recording and transmission, removed. Being the first processing phase of satellite data, this process is also referred to as level-zero processing. This study is aimed at investigating the use of massively parallel computing technology in providing level-zero processing to spaceflights that adhere to the recommendations of the Consultative Committee on Space Data Systems (CCSDS). The workload characteristics, of level-zero processing, are used to identify processing requirements in high-performance computing systems. An example of level-zero functions on a SIMD MPP, such as the MasPar, is discussed. The requirements in this paper are based in part on the Earth Observing System (EOS) Data and Operation System (EDOS).
VanSuch, Monica; Naessens, James M; Stroebel, Robert J; Huddleston, Jeanne M; Williams, Arthur R
2006-01-01
Background: Most nationally standardised quality measures use widely accepted evidence‐based processes as their foundation, but the discharge instruction component of the United States standards of Joint Commission on Accreditation of Healthcare Organizations heart failure core measure appears to be based on expert opinion alone. Objective: To determine whether documentation of compliance with any or all of the six required discharge instructions is correlated with readmissions to hospital or mortality. Research design: A retrospective study at a single tertiary care hospital was conducted on randomly sampled patients hospitalised for heart failure from July 2002 to September 2003. Participants: Applying the Joint Commission on Accreditation of Healthcare Organizations criteria, 782 of 1121 patients were found eligible to receive discharge instructions. Eligibility was determined by age, principal diagnosis codes and discharge status codes. Measures: The primary outcome measures are time to death and time to readmission for heart failure or readmission for any cause and time to death. Results: In all, 68% of patients received all instructions, whereas 6% received no instructions. Patients who received all instructions were significantly less likely to be readmitted for any cause (p = 0.003) and for heart failure (p = 0.035) than those who missed at least one type of instruction. Documentation of discharge instructions is correlated with reduced readmission rates. However, there was no association between documentation of discharge instructions and mortality (p = 0.521). Conclusions: Including discharge instructions among other evidence‐based heart failure core measures appears justified. PMID: 17142589
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curran, L.
1988-03-03
Interest has been building in recent months over the imminent arrival of a new class of supercomputer, called the "supercomputer on a desk" or the single-user model. Most observers expected the first such product to come from either of two startups, Ardent Computer Corp. or Stellar Computer Inc. But a surprise entry has shown up. Apollo Computer Inc. is launching a new work station this week that racks up an impressive list of industry firsts as it puts supercomputer power at the disposal of a single user. The new series 10000 from the Chelmsford, Mass., company is built around a reduced-instruction-set architecture that the company calls Prism, for parallel reduced-instruction-set multiprocessor. This article describes the 10000 and Prism.
Development for SSV on a parallel processing system (PARAGON)
NASA Astrophysics Data System (ADS)
Gothard, Benny M.; Allmen, Mark; Carroll, Michael J.; Rich, Dan
1995-12-01
A goal of the surrogate semi-autonomous vehicle (SSV) program is to have multiple vehicles navigate autonomously and cooperatively with other vehicles. This paper describes the process and tools used in porting UGV/SSV (unmanned ground vehicle) autonomous mobility and target recognition algorithms from a SISD (single instruction single data) processor architecture (i.e., a Sun SPARC workstation running C/UNIX) to a MIMD (multiple instruction multiple data) parallel processor architecture (i.e., PARAGON-a parallel set of i860 processors running C/UNIX). It discusses the gains in performance and the pitfalls of such a venture. It also examines the merits of this processor architecture (based on this conceptual prototyping effort) and programming paradigm to meet the final SSV demonstration requirements.
ERIC Educational Resources Information Center
McPartland, James M.
This study tests the general hypothesis that there is no single best way to organize a middle school to meet the variety of needs of early adolescent students. Using data from a sample of 433 schools in the Pennsylvania Educational Quality Assessment, it examines the effects of self-contained classroom instruction and departmentalization on two…
Task Prioritization in Dual-Tasking: Instructions versus Preferences
Jansen, Reinier J.; van Egmond, René; de Ridder, Huib
2016-01-01
The role of task prioritization in performance tradeoffs during multi-tasking has received widespread attention. However, little is known on whether people have preferences regarding tasks, and if so, whether these preferences conflict with priority instructions. Three experiments were conducted with a high-speed driving game and an auditory memory task. In Experiment 1, participants did not receive priority instructions. Participants performed different sequences of single-task and dual-task conditions. Task performance was evaluated according to participants’ retrospective accounts on preferences. These preferences were reformulated as priority instructions in Experiments 2 and 3. The results showed that people differ in their preferences regarding task prioritization in an experimental setting, which can be overruled by priority instructions, but only after increased dual-task exposure. Additional measures of mental effort showed that performance tradeoffs had an impact on mental effort. The interpretation of these findings was used to explore an extension of Threaded Cognition Theory with Hockey’s Compensatory Control Model. PMID:27391779
Success Skills Curriculum for Single Parents and Displaced Homemakers. Curriculum Guide.
ERIC Educational Resources Information Center
Nash, Margaret A.; Norden, Tamara
This curriculum is designed to meet the unique needs of single parents and displaced homemakers who require additional skill building before entering the job market or a job training program. The curriculum, which is designed as a 36-hour program instruction, contains units on the following topics: taking responsibility for oneself (assessing…
7 CFR 800.86 - Inspection of shiplot, unit train, and lash barge grain in single lots.
Code of Federal Regulations, 2010 CFR
2010-01-01
... prescribed in the instructions. (b) Application procedure. Applications for the official inspection of... statistical acceptance sampling and inspection plan according to the provisions of this section and procedures... inspection as part of a single lot and accepted by a statistical acceptance sampling and inspection plan...
ERIC Educational Resources Information Center
Valenzuela, Vanessa V.; Gutierrez, Gabriel; Lambros, Katina M.
2014-01-01
An A-B single-case design assessed at-risk students' responsiveness to mathematics interventions. Four culturally and linguistically diverse second-grade students were given a Tier 2 standard protocol mathematics intervention that included number sense instruction, modeling procedures, guided math drill and practice of addition and subtraction…
Learning from Multimedia Presentations: The Effects of Graphical Realism and Voice Gender
ERIC Educational Resources Information Center
Rodicio, Hector Garcia
2012-01-01
Introduction: Most of the research on the design of multimedia instructional materials has addressed how to combine words and pictures to produce effective presentations whereas the development of single representations has received less attention. In this study we explored different ways of presenting single representations. Method: In Experiment…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-19
... Program Instructions (PIs). The training and data grants are governed by the "new grant" PI and the basic grant is governed by the "basic grant" PI. Current PIs require separate applications and program... and reporting processes by consolidating the PIs into one single PI and requiring one single...
Theme-Based Tests: Teaching in Context
ERIC Educational Resources Information Center
Anderson, Gretchen L.; Heck, Marsha L.
2005-01-01
Theme-based tests provide an assessment tool that instructs as well as provides a single general context for a broad set of biochemical concepts. A single story line connects the questions on the tests and models applications of scientific principles and biochemical knowledge in an extended scenario. Theme-based tests are based on a set of…
ERIC Educational Resources Information Center
Hsu, Jenq-Muh; Chang, Ting-Wen; Yu, Pao-Ta
2012-01-01
The teaching and learning environment in a traditional classroom typically includes a projection screen, a projector, and a computer within a digital interactive table. Instructors may apply multimedia learning materials using various information communication technologies to increase interaction effects. However, a single screen only displays a…
Impact of Single-Sex Instruction on Student Motivation to Learn Spanish
ERIC Educational Resources Information Center
Kissau, Scott; Quach, Lan; Wang, Chuang
2009-01-01
To increase male motivation to learn additional languages studies have suggested teaching males in single-sex second and foreign language classes (Carr & Pauwels, 2006; Chambers, 2005). Despite the reported benefits of this unique arrangement, a review of literature found no related research conducted in Canada or the United States. To address…
Female-Only Classes in a Rural Context: Self-Concept, Achievement, and Discourse
ERIC Educational Resources Information Center
Wilson, Hope E.; Gresham, Jeanie; Williams, Michelle; Whitley, Claudia; Partin, Jimmy
2013-01-01
Two middle schools in rural east Texas implemented an optional, single-sex program. Although previous studies have documented the effects of single-sex instruction, and recent educational innovations have focused on its benefits, little research has investigated its effects in rural contexts. This study found that for rural populations, patterns…
Tweed, E J; Allardice, G M; McLoone, P; Morrison, D S
2018-01-01
To investigate the relationship between socio-economic circumstances and cancer incidence in Scotland in recent years. Population-based study using cancer registry data. Data on incident cases of colorectal, lung, female breast, and prostate cancer diagnosed between 2001 and 2012 were obtained from a population-based cancer registry covering a population of approximately 2.5 million people in the West of Scotland. Socio-economic circumstances were assessed based on postcode of residence at diagnosis, using the Scottish Index of Multiple Deprivation (SIMD). For each cancer, crude and age-standardised incidence rates were calculated by quintile of SIMD score, and the number of excess cases associated with socio-economic deprivation was estimated. 93,866 cases met inclusion criteria, comprising 21,114 colorectal, 31,761 lung, 23,757 female breast, and 15,314 prostate cancers. Between 2001 and 2006, there was no consistent association between socio-economic circumstances and colorectal cancer incidence, but 2006-2012 saw an emerging deprivation gradient in both sexes. The incidence rate ratio (IRR) for colorectal cancer between most deprived and least deprived increased from 1.03 (95% confidence interval [CI] 0.91-1.16) to 1.24 (95% CI 1.11-1.39) during the study period. The incidence of lung cancer showed the strongest relationship with socio-economic circumstances, with inequalities widening across the study period among women from IRR 2.66 (95% CI 2.33-3.05) to 2.91 (95% CI 2.54-3.33) in 2001-03 and 2010-12, respectively. Breast and prostate cancer showed an inverse relationship with socio-economic circumstances, with lower incidence among people living in more deprived areas. Significant socio-economic inequalities remain in cancer incidence in the West of Scotland, and in some cases are increasing. In particular, this study has identified an emerging, previously unreported, socio-economic gradient in colorectal cancer incidence among women as well as men. 
Actions to prevent, mitigate, and undo health inequalities should be a public health priority. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Cohen, Herbert
The primary problem investigated was whether examining materials from a variety of perspectives enhances the development of projective spatial abilities more than examining materials from a single perspective. A secondary consideration dealt with gender effects. One hundred and five (56 females and 49 males) fifth grade students were randomly assigned to one of four groups. Two teachers taught two classes apiece: one receiving instruction encouraging examination of materials from a single perspective, the other from multiple perspectives. All four groups received instruction consisting of access to manipulatives (SCIIS, 2nd edition, Level 5). Instruction occurred twice a week, 45 minutes per session, for 6 weeks. The experimental design was the Solomon Four Group Design. A battery of 8 Piagetian-type tasks was used to assess possession of the projective groupings. The main and interactive effects of pretesting were determined to be negligible, while the treatment was determined to have a statistically significant effect on the development of projective spatial abilities. Gender was determined to have no direct effect on the dependent variables.
Highly flexible pulse programmer for NMR applications
NASA Technical Reports Server (NTRS)
Dart, J.; Burum, D. P.; Rhim, W. K.
1980-01-01
A pulse generator for NMR applications is described. Eighteen output channels are provided to allow use in single and double resonance experiments. Complex pulse sequences may be generated by loading instructions into a 256-word by 16-bit program memory. Features of the pulse generator include programmable time delays from 0.5 µs to 1000 s, branching and looping instructions, and the ability to be loaded and operated either manually or from a PDP-11/10 computer.
ERIC Educational Resources Information Center
Helms, Samuel Arthur
2010-01-01
This single subject case study followed a high school student and his use of a simulation of marine ecosystems. The study examined his metaworld, motivation, and learning before, during and after using the simulation. A briefing was conceptualized based on the literature on pre-instructional activities, advance organizers, and performance…
Rutger's CAM2000 chip architecture
NASA Technical Reports Server (NTRS)
Smith, Donald E.; Hall, J. Storrs; Miyake, Keith
1993-01-01
This report describes the architecture and instruction set of the Rutgers CAM2000 memory chip. The CAM2000 combines features of Associative Processing (AP), Content Addressable Memory (CAM), and Dynamic Random Access Memory (DRAM) in a single chip package that is not only DRAM compatible but capable of applying simple massively parallel operations to memory. This document reflects the current status of the CAM2000 architecture and is continually updated to reflect the current state of the architecture and instruction set.
Cognitive task analysis for instruction in single-injection ultrasound guided-regional anesthesia
NASA Astrophysics Data System (ADS)
Gucev, Gligor V.
Cognitive task analysis (CTA) is a methodology for eliciting knowledge from subject matter experts. CTA has been used to capture the cognitive processes, decision-making, and judgments that underlie expert behaviors. A review of the literature revealed that CTA has not yet been used to capture the knowledge required to perform ultrasound-guided regional anesthesia (UGRA). The purpose of this study was to utilize CTA to extract knowledge from UGRA experts and to determine whether instruction based on CTA of UGRA will produce results superior to the results of traditional training. This study adds to the knowledge base of CTA in being the first one to effectively capture the expert knowledge of UGRA. The derived protocol was used in a randomized, double-blinded experiment involving UGRA instruction to 39 novice learners. The results of this study strongly support the hypothesis that CTA-based instruction in UGRA is more effective than conventional clinical instruction, as measured by conceptual pre- and post-tests, performance of a simulated UGRA procedure, and time necessary for the task performance. This study adds to the number of studies that have proven the superiority of CTA-informed instruction. Finally, it produced several validated instruments that can be used in instructing and evaluating UGRA.
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-01-01
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module, as well as the pre- and post-filtering blocks and a flow reliability computation unit, is described in detail. Three versions of the optical flow module, with different numerical precision, working frequency and accuracy of results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
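The throughput and efficiency figures quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; it assumes that Full HD means 1,920 × 1,080 active pixels per frame and ignores blanking intervals, and the function and variable names are illustrative rather than taken from the article.

```python
def required_mpixels_per_s(width, height, fps):
    """Pixel rate needed for a video stream, in megapixels per second."""
    return width * height * fps / 1e6


# Pixel rate of a Full HD stream at 60 frames per second (assumed: active pixels only).
full_hd_rate = required_mpixels_per_s(1920, 1080, 60)

# Throughput reported in the abstract, in MPixels/s.
reported_throughput = 175.0

# Power draw implied by the reported performance and efficiency figures.
fixed_point_watts = 418 / 34      # GOPS divided by GOPS/W
floating_point_watts = 103 / 24   # GFLOPS divided by GFLOPS/W

print(f"Full HD @ 60 fps needs {full_hd_rate:.1f} MPixels/s")
print(f"Headroom at 175 MPixels/s: {reported_throughput - full_hd_rate:.1f} MPixels/s")
print(f"Implied power, fixed-point: {fixed_point_watts:.1f} W")
print(f"Implied power, floating-point: {floating_point_watts:.1f} W")
```

Under these assumptions the stream needs roughly 124 MPixels/s, which is below the reported 175 MPixels/s and is consistent with the claim that Full HD processing at 60 fps is possible.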
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-02-12
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module, as well as the pre- and post-filtering blocks and a flow reliability computation unit, is described in detail. Three versions of the optical flow module, with different numerical precision, working frequency and accuracy of results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems.
NASA Astrophysics Data System (ADS)
Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.
2016-05-01
In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
Split Decision. Public Schools Are Finding New Reasons to Segregate the Sexes
ERIC Educational Resources Information Center
Bixler, Mark
2005-01-01
The number of public schools offering single-sex instruction has risen from fewer than a dozen to 205 since 1997, with classrooms sprouting up in places such as Atlanta, New York, and Philadelphia, says Leonard Sax, a psychologist and physician who directs the National Association for Single Sex Public Education, in Maryland. The increase is…
Success Skills Curriculum for Teen Single Parents. Bulletin No. 96142.
ERIC Educational Resources Information Center
Hendon, Sarah, Ed.; And Others
This guide contains the materials required to teach a 36-hour program of competency-based instruction designed to meet the needs of teen single parents who require additional skill building before entering the job market or a job training program. The course is divided into 4 learning modules that cover 18 competencies as follows: taking…
ERIC Educational Resources Information Center
Elhai, Jon D.; Engdahl, Ryan M.; Palmieri, Patrick A.; Naifeh, James A.; Schweinle, Amy; Jacobs, Gerard A.
2009-01-01
The authors examined the effects of a methodological manipulation on the Posttraumatic Stress Disorder (PTSD) Checklist's factor structure: specifically, whether respondents were instructed to reference a single worst traumatic event when rating PTSD symptoms. Nonclinical, trauma-exposed participants were randomly assigned to 1 of 2 PTSD…
14 CFR Appendix A to Part 121 - First Aid Kits and Emergency Medical Kits
Code of Federal Regulations, 2013 CFR
2013-01-01
..., 50cc 1 Epinephrine 1:1000, single dose ampule or equivalent) 2 Diphenhydramine HCl injection, single dose ampule or equivalent 2 Nitroglycerin tablets 10 Basic instructions for use of the drugs in the kit 1 protective nonpermeable gloves or equivalent 1 pair 2. As of April 12, 2004, at least one approved...
14 CFR Appendix A to Part 121 - First Aid Kits and Emergency Medical Kits
Code of Federal Regulations, 2010 CFR
2010-01-01
..., 50cc 1 Epinephrine 1:1000, single dose ampule or equivalent) 2 Diphenhydramine HCl injection, single dose ampule or equivalent 2 Nitroglycerin tablets 10 Basic instructions for use of the drugs in the kit 1 protective nonpermeable gloves or equivalent 1 pair 2. As of April 12, 2004, at least one approved...
14 CFR Appendix A to Part 121 - First Aid Kits and Emergency Medical Kits
Code of Federal Regulations, 2014 CFR
2014-01-01
..., 50cc 1 Epinephrine 1:1000, single dose ampule or equivalent) 2 Diphenhydramine HCl injection, single dose ampule or equivalent 2 Nitroglycerin tablets 10 Basic instructions for use of the drugs in the kit 1 protective nonpermeable gloves or equivalent 1 pair 2. As of April 12, 2004, at least one approved...
14 CFR Appendix A to Part 121 - First Aid Kits and Emergency Medical Kits
Code of Federal Regulations, 2011 CFR
2011-01-01
..., 50cc 1 Epinephrine 1:1000, single dose ampule or equivalent) 2 Diphenhydramine HCl injection, single dose ampule or equivalent 2 Nitroglycerin tablets 10 Basic instructions for use of the drugs in the kit 1 protective nonpermeable gloves or equivalent 1 pair 2. As of April 12, 2004, at least one approved...
14 CFR Appendix A to Part 121 - First Aid Kits and Emergency Medical Kits
Code of Federal Regulations, 2012 CFR
2012-01-01
..., 50cc 1 Epinephrine 1:1000, single dose ampule or equivalent) 2 Diphenhydramine HCl injection, single dose ampule or equivalent 2 Nitroglycerin tablets 10 Basic instructions for use of the drugs in the kit 1 protective nonpermeable gloves or equivalent 1 pair 2. As of April 12, 2004, at least one approved...
EFL Learners' Multiple Documents Literacy: Effects of a Strategy-Directed Intervention Program
ERIC Educational Resources Information Center
Karimi, Mohammad Nabi
2015-01-01
There is a substantial body of L2 research documenting the central role of strategy instruction in reading comprehension. However, this line of research has been conducted mostly within the single text paradigm of reading research. With reading literacy undergoing a marked shift from single source reading to multiple documents literacy, little is…
Gender-Specific Instructional Strategies and Student Achievement in 5th Grade Classrooms
ERIC Educational Resources Information Center
Dickey, Millicent Whitener
2014-01-01
There are three purposes of this mixed methods phenomenological case study. First, the researcher attempted to determine if there is evidence that teachers in single-sex classes adjust the delivery of the academic content when compared to coeducational classes. Secondly, while trying to understand the phenomenon of learning in a single-sex…
Effect of the Sport Education Tactical Model on Coeducational and Single Gender Game Performance
ERIC Educational Resources Information Center
Pritchard, Tony; McCollum, Starla; Sundal, Jacqueline; Colquit, Gavin
2014-01-01
Physical education teachers are faced with a decision when teaching physical activities in schools. What type of instructional model should be used, and should classes be coeducational or single gender? The current study had two purposes. The first purpose investigated the effectiveness of the sport education tactical model (SETM) during game play…
This product is an LC/MS/MS single laboratory validated method for the determination of cylindrospermopsin and anatoxin-a in ambient waters. The product contains step-by-step instructions for sample preparation, analyses, preservation, sample holding time and QC protocols to ensu...
Nelson, Peter M; Demers, Joseph A; Christ, Theodore J
2014-06-01
This study details the initial development of the Responsive Environmental Assessment for Classroom Teachers (REACT). REACT was developed as a questionnaire to evaluate student perceptions of the classroom teaching environment. Researchers engaged in an iterative process to develop, field test, and analyze student responses on 100 rating-scale items. Participants included 1,465 middle school students across 48 classrooms in the Midwest. Item analysis, including exploratory and confirmatory factor analysis, was used to refine a 27-item scale with a second-order factor structure. Results support the interpretation of a single general dimension of the Classroom Teaching Environment with 6 subscale dimensions: Positive Reinforcement, Instructional Presentation, Goal Setting, Differentiated Instruction, Formative Feedback, and Instructional Enjoyment. Applications of REACT in research and practice are discussed along with implications for future research and the development of classroom environment measures. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Atzema, Clare L; Austin, Peter C; Wu, Libo; Brzozowski, Michael; Feldman, Michael J; McDonnell, Michael; Mazurik, Laurie
2013-01-01
Emergency department discharge instructions are variably understood by patients, and in the setting of emergency department crowding, innovations are needed to counteract shortened interaction times with the physician. We evaluated the effect of viewing an online video of diagnosis-specific discharge instructions on patient comprehension and recall of instructions. In this prospective, single-center, randomized controlled trial conducted between November 2011 and January 2012, we randomized emergency department patients who were discharged with one of 38 diagnoses to either view (after they left the emergency department) a vetted online video of diagnosis-specific discharge instructions, or to usual care. Patients were subsequently contacted by telephone and asked three standardized questions about their discharge instructions; one point was awarded for each correct answer. Using an intention-to-treat analysis, differences between groups were assessed using univariate testing, and with logistic regression that accounted for clustering on managing physician. A secondary outcome measure was patient satisfaction with the videos, on a 10-point scale. Among 133 patients enrolled, mean age was 46.1 (s.d. 21.5) and 55% were female. Patients in the video group had 19% higher mean scores (2.5, s.d. 0.7) than patients in the control group (2.1, s.d. 0.8) (p=0.002). After adjustment for patient age, sex, first language, triage acuity score, and clustering, the odds of achieving a fully correct score (3 out of 3) were 3.5 (95% CI, 1.7 to 7.2) times higher in the video group, compared to the control group. Among those who viewed the videos, median rating of the videos was 10 (IQR 8 to 10). In this single-center trial, patients who viewed an online video of their discharge instructions scored higher on their understanding of key concepts around their diagnosis and subsequent care. Those who viewed the videos found them to be a helpful addition to standard care.
ClinicalTrials.gov NCT01361932 http://clinicaltrials.gov/ct2/show/NCT01361932?term=nct01361932&rank=1.
7 CFR 3550.150 - OMB control number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 504 Origination and Section 306C Water and..., including time for reviewing instructions, searching existing data sources, gathering and maintaining the...
7 CFR 3550.150 - OMB control number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 504 Origination and Section 306C Water and..., including time for reviewing instructions, searching existing data sources, gathering and maintaining the...
7 CFR 3550.150 - OMB control number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 504 Origination and Section 306C Water and..., including time for reviewing instructions, searching existing data sources, gathering and maintaining the...
7 CFR 3550.150 - OMB control number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 504 Origination and Section 306C Water and..., including time for reviewing instructions, searching existing data sources, gathering and maintaining the...
7 CFR 3550.150 - OMB control number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 504 Origination and Section 306C Water and..., including time for reviewing instructions, searching existing data sources, gathering and maintaining the...
Stressing Design in Electronics Teaching
ERIC Educational Resources Information Center
Cuthbert, L. G.
1976-01-01
Advocates a strong emphasis on the teaching of the design of electronic circuits in undergraduate courses. An instructional paradigm involving the design and construction of a single-transistor amplifier is provided. (CP)
In silico FRET from simulated dye dynamics
NASA Astrophysics Data System (ADS)
Hoefling, Martin; Grubmüller, Helmut
2013-03-01
Single molecule fluorescence resonance energy transfer (smFRET) experiments probe molecular distances on the nanometer scale. In such experiments, distances are recorded from FRET transfer efficiencies via the Förster formula, $E = 1/(1 + (R/R_0)^6)$. The energy transfer however also depends on the mutual orientation of the two dyes used as distance reporter. Since this information is typically inaccessible in FRET experiments, one has to rely on approximations, which reduce the accuracy of these distance measurements. A common approximation is an isotropic and uncorrelated dye orientation distribution. To assess the impact of such approximations, we present the algorithms and implementation of a computational toolkit for the simulation of smFRET on the basis of molecular dynamics (MD) trajectory ensembles. In this study, the dye orientation dynamics, which are used to determine dynamic FRET efficiencies, are extracted from MD simulations. In a subsequent step, photons and bursts are generated using a Monte Carlo algorithm. The application of the developed toolkit on a poly-proline system demonstrated good agreement between smFRET simulations and experimental results and therefore confirms our computational method. Furthermore, it enabled the identification of the structural basis of measured heterogeneity. The presented computational toolkit is written in Python, available as open-source, applicable to arbitrary systems and can easily be extended and adapted to further problems. Catalogue identifier: AENV_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AENV_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPLv3, the bundled SIMD friendly Mersenne twister implementation [1] is provided under the SFMT-License. No. of lines in distributed program, including test data, etc.: 317880 No. 
of bytes in distributed program, including test data, etc.: 54774217 Distribution format: tar.gz Programming language: Python, Cython, C (ANSI C99). Computer: Any (see memory requirements). Operating system: Any OS with CPython distribution (e.g. Linux, MacOSX, Windows). Has the code been vectorised or parallelized?: Yes, in Ref. [2], 4 CPU cores were used. RAM: About 700MB per process for the simulation setup in Ref. [2]. Classification: 16.1, 16.7, 23. External routines: Calculation of Rκ2-trajectories from GROMACS [3] MD trajectories requires the GromPy Python module described in Ref. [4] or a GROMACS 4.6 installation. The md2fret program uses a standard Python interpreter (CPython) v2.6+ and < v3.0 as well as the NumPy module. The analysis examples require the Matplotlib Python module. Nature of problem: Simulation and interpretation of single molecule FRET experiments. Solution method: Combination of force-field based molecular dynamics (MD) simulating the dye dynamics and Monte Carlo sampling to obtain photon statistics of FRET kinetics. Additional comments: !!!!! The distribution file for this program is over 50 Mbytes and therefore is not delivered directly when download or Email is requested. Instead a html file giving details of how the program can be obtained is sent. !!!!! Running time: A single run in Ref. [2] takes about 10 min on a Quad Core Intel Xeon CPU W3520 2.67GHz with 6GB physical RAM References: [1] M. Saito, M. Matsumoto, SIMD-oriented fast Mersenne twister: a 128-bit pseudorandom number generator, in: A. Keller, S. Heinrich, H. Niederreiter (Eds.), Monte Carlo and Quasi-Monte Carlo Methods 2006, Springer; Berlin, Heidelberg, 2008, pp. 607-622. [2] M. Hoefling, N. Lima, D. Hänni, B. Schuler, C. A. M. Seidel, H. Grubmüller, Structural heterogeneity and quantitative FRET efficiency distributions of polyprolines through a hybrid atomistic simulation and Monte Carlo approach, PLoS ONE 6 (5) (2011) e19791. [3] D. V. D. Spoel, E. Lindahl, B. 
Hess, G. Groenhof, A. E. Mark, H. J. C. Berendsen, GROMACS: fast, flexible, and free., J Comput Chem 26 (16) (2005) 1701-1718. [4] R. Pool, A. Feenstra, M. Hoefling, R. Schulz, J. C. Smith, J. Heringa, Enabling grand-canonical Monte Carlo: Extending the flexibility of gromacs through the GromPy Python interface module, Journal of Chemical Theory and Computation 33 (12) (2012) 1207-1214.
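The relation between dye separation and transfer efficiency that underlies this abstract is the standard Förster formula, E = 1/(1 + (R/R_0)^6), where R is the donor-acceptor distance and R_0 is the Förster radius at which E = 0.5. The sketch below, in Python, evaluates the formula and its inverse. It is not part of the described toolkit: the function names are hypothetical, the value of R_0 is an arbitrary illustrative number, and a fixed R_0 corresponds to the isotropic-orientation approximation that the abstract sets out to test.

```python
def fret_efficiency(r, r0):
    """Foerster transfer efficiency E = 1 / (1 + (r / r0)**6)."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(e, r0):
    """Invert the Foerster formula: r = r0 * (1/E - 1)**(1/6)."""
    if not 0.0 < e <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


R0_NM = 5.4  # hypothetical Foerster radius in nm, chosen only for illustration

for r_nm in (3.0, 5.4, 8.0):
    e = fret_efficiency(r_nm, R0_NM)
    print(f"R = {r_nm:.1f} nm -> E = {e:.3f}")
```

Because of the sixth-power dependence, the efficiency falls steeply around R_0, which is why small errors in R_0 (for example from an incorrect orientation assumption) translate into noticeable distance errors.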
A 32-bit NMOS microprocessor with a large register file
NASA Astrophysics Data System (ADS)
Sherburne, R. W., Jr.; Katevenis, M. G. H.; Patterson, D. A.; Sequin, C. H.
1984-10-01
Two scaled versions of a 32-bit NMOS reduced instruction set computer CPU, called RISC II, have been implemented on two different processing lines using the simple Mead and Conway layout rules with lambda values of 2 and 1.5 microns (corresponding to drawn gate lengths of 4 and 3 microns), respectively. The design utilizes a small set of simple instructions in conjunction with a large register file in order to provide high performance. This approach has resulted in two surprisingly powerful single-chip processors.
Hypercluster Parallel Processor
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela
1992-01-01
Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
7 CFR 3550.50 - OMB control number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS General § 3550.50 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.50 - OMB control number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS General § 3550.50 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.50 - OMB control number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS General § 3550.50 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.50 - OMB control number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS General § 3550.50 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.50 - OMB control number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS General § 3550.50 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
Complex Instruction Set Quantum Computing
NASA Astrophysics Data System (ADS)
Sanders, G. D.; Kim, K. W.; Holton, W. C.
1998-03-01
In proposed quantum computers, electromagnetic pulses are used to implement logic gates on quantum bits (qubits). Gates are unitary transformations applied to coherent qubit wavefunctions and a universal computer can be created using a minimal set of gates. By applying many elementary gates in sequence, desired quantum computations can be performed. This reduced instruction set approach to quantum computing (RISC QC) is characterized by serial application of a few basic pulse shapes and a long coherence time. However, the unitary matrix of the overall computation is ultimately a unitary matrix of the same size as any of the elementary matrices. This suggests that we might replace a sequence of reduced instructions with a single complex instruction using an optimally tailored pulse. We refer to this approach as complex instruction set quantum computing (CISC QC). One trades the requirement for long coherence times for the ability to design and generate potentially more complex pulses. We consider a model system of coupled qubits interacting through nearest neighbor coupling and show that CISC QC can reduce the time required to perform quantum computations.
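The observation that a sequence of elementary gates collapses into one unitary matrix of the same size can be illustrated on a single qubit. The sketch below, in Python, multiplies three standard gates (Hadamard, Pauli-Z, Hadamard) and shows that the product equals the Pauli-X gate. This is an assumption-laden toy example: it does not model the coupled-qubit system or the pulse design discussed in the abstract, and the helper `matmul` is written here only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]   # Hadamard gate
Z = [[1, 0], [0, -1]]   # Pauli-Z gate
X = [[0, 1], [1, 0]]    # Pauli-X gate

# Three elementary gates applied in sequence: H first, then Z, then H.
combined = matmul(H, matmul(Z, H))

# The product is still a 2x2 matrix and matches X up to rounding error.
for row in combined:
    print([round(value, 12) for value in row])
```

In this toy case the three-gate sequence could in principle be replaced by a single operation implementing X, which mirrors the trade described above between many simple steps and one more complex step.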
The effect of instructions on postural-suprapostural interactions in three working memory tasks.
Burcal, Christopher J; Drabik, Evan C; Wikstrom, Erik A
2014-06-01
Examining postural control while simultaneously performing a cognitive, or suprapostural task, has shown a fairly consistent trend of improving postural control in young healthy adults and provides insight into postural control mechanisms used in everyday life. However, the role of attention driven by explicit verbal instructions while dual-tasking is less understood. Therefore, the purpose of this investigation is to determine the effects of explicit verbal instructions on the postural-suprapostural interactions among various domains of working memory. A total of 22 healthy young adults with a heterogeneous history of ankle sprains volunteered to participate (age: 22.2±5.1 years; n=10 history of ankle sprains, n=12 no history). Participants were asked to perform single-limb balance trials while performing three suprapostural tasks: backwards counting, random number generation, and the manikin test. In addition, each suprapostural task was completed under three conditions of instruction: no instructions, focus on the postural control task, focus on the suprapostural task. The results indicate a significant effect of instructions on postural control outcomes, with postural performance improving in the presence of instructions across all three cognitive tasks, each of which stresses different aspects of working memory. Further, postural-suprapostural interactions appear to be related to the direction or focus of an individual's attention, as instructions to focus on the suprapostural task resulted in the greatest postural control improvements. Thus, attention driven by explicit verbal instructions influences postural-suprapostural interactions as measured by a temporal-spatial postural control outcome, time-to-boundary, regardless of the suprapostural task performed. Copyright © 2014 Elsevier B.V. All rights reserved.
Academic Benefits of Peer Tutoring: A Meta-Analytic Review of Single-Case Research
ERIC Educational Resources Information Center
Bowman-Perrott, Lisa; Davis, Heather; Vannest, Kimberly; Williams, Lauren; Greenwood, Charles; Parker, Richard
2013-01-01
Peer tutoring is an instructional strategy that involves students helping each other learn content through repetition of key concepts. This meta-analysis examined effects of peer tutoring across 26 single-case research experiments for 938 students in Grades 1-12. The TauU effect size for 195 phase contrasts was 0.75 with a confidence interval of…
7 CFR 3550.100 - OMB control number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 502 Origination § 3550.100 OMB control... response, with an average of 1 1/2 hours per response, including time for reviewing instructions, searching...
7 CFR 3550.100 - OMB control number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 502 Origination § 3550.100 OMB control... response, with an average of 1 1/2 hours per response, including time for reviewing instructions, searching...
7 CFR 3550.100 - OMB control number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 502 Origination § 3550.100 OMB control... response, with an average of 1 1/2 hours per response, including time for reviewing instructions, searching...
7 CFR 3550.100 - OMB control number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 502 Origination § 3550.100 OMB control... response, with an average of 1 1/2 hours per response, including time for reviewing instructions, searching...
7 CFR 3550.200 - OMB control number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Regular Servicing § 3550.200 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.200 - OMB control number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Regular Servicing § 3550.200 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.100 - OMB control number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Section 502 Origination § 3550.100 OMB control... response, with an average of 1 1/2 hours per response, including time for reviewing instructions, searching...
7 CFR 3550.200 - OMB control number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Regular Servicing § 3550.200 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.200 - OMB control number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Regular Servicing § 3550.200 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
7 CFR 3550.200 - OMB control number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Regular Servicing § 3550.200 OMB control number. The... average of 1 1/2 hours per response, including time for reviewing instructions, searching existing data...
Latash, M L
1994-01-01
A method for reconstructing joint compliant characteristics during voluntary movements was applied to the analysis of oscillatory and unidirectional elbow flexion movements. In different series, the subjects were given one of the following instructions: (1) do not intervene voluntarily; (2) keep the trajectory; (3) in cases of perturbations, return back to the starting position as quickly as possible (only during unidirectional movements). Under the instruction 'keep trajectory', the apparent joint stiffness increased by 50% to 250%. During oscillatory movements, this was accompanied by a decrease in the maximal difference between the actual and equilibrium joint trajectories and, in several cases, led to a change in the phase relation between the two trajectories. The coefficients of correlation between joint torque and angle were very high (commonly, over 0.9) under the 'do not intervene' instruction. They dropped to about 0.6 under the 'keep trajectory' and to about 0.3 under the 'return back' instructions. Under these two instructions, the low values of the coefficients of correlation did not allow reconstruction of segments of equilibrium trajectories and joint stiffness values in all the subjects. The results provide further support for the lambda-version of the equilibrium-point hypothesis and for using the instruction 'do not intervene voluntarily' to obtain reproducible time patterns of the central motor command.
NASA Astrophysics Data System (ADS)
Foster, Hyacinth Carmen
Science educators and administrators support the idea that inquiry-based and didactic-based instructional strategies have varying effects on students' acquisition of science concepts. The research problem addressed whether incorporating the two approaches covered the learning requirements of all students in science classes, enabling them to meet state and national standards. The purpose of this quasi-experimental, posttest design research study was to determine if student learning and achievement in high school biology classes differed for each type of instructional method. Constructivism theory suggested that each learner creates knowledge over time because of the learner's interactions with the environment. The optimal teaching method, whether didactic (teacher-directed), inquiry-based, or a combination of the two approaches, becomes essential if students are to discover ways to learn information. The research question examined which form of instruction had a significant effect on student achievement in biology. The data analysis consisted of single-factor, independent-measures analysis of variance (ANOVA) that tested the hypotheses of the research study. Locally, the results indicated greater and statistically significant differences in standardized laboratory scores for students who were taught using the combination of the two approaches. Based on these results, biology instructors will gain new insights into ways of improving the instructional process. Social change may occur as the science curriculum leadership applies the combination of the two instructional approaches to improve acquisition of science concepts by biology students.
NASA Astrophysics Data System (ADS)
Badeau, Ryan; White, Daniel R.; Ibrahim, Bashirah; Ding, Lin; Heckler, Andrew F.
2017-12-01
The ability to solve physics problems that require multiple concepts from across the physics curriculum—"synthesis" problems—is often a goal of physics instruction. Three experiments were designed to evaluate the effectiveness of two instructional methods employing worked examples on student performance with synthesis problems; these instructional techniques, analogical comparison and self-explanation, have previously been studied primarily in the context of single-concept problems. Across three experiments with students from introductory calculus-based physics courses, both self-explanation and certain kinds of analogical comparison of worked examples significantly improved student performance on a target synthesis problem, with distinct improvements in recognition of the relevant concepts. More specifically, analogical comparison significantly improved student performance when the comparisons were invoked between worked synthesis examples. In contrast, similar comparisons between corresponding pairs of worked single-concept examples did not significantly improve performance. On a more complicated synthesis problem, self-explanation was significantly more effective than analogical comparison, potentially due to differences in how successfully students encoded the full structure of the worked examples. Finally, we find that the two techniques can be combined for additional benefit, with the trade-off of slightly more time on task.
7 CFR 3550.300 - OMB control number.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Post-Servicing Actions § 3550.300 OMB control number..., with an average of 1 1/2 hours per response, including time for review instructions, searching existing...
7 CFR 3550.300 - OMB control number.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Post-Servicing Actions § 3550.300 OMB control number..., with an average of 1 1/2 hours per response, including time for review instructions, searching existing...
7 CFR 3550.300 - OMB control number.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Post-Servicing Actions § 3550.300 OMB control number..., with an average of 1 1/2 hours per response, including time for review instructions, searching existing...
7 CFR 3550.300 - OMB control number.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Post-Servicing Actions § 3550.300 OMB control number..., with an average of 1 1/2 hours per response, including time for review instructions, searching existing...
7 CFR 3550.300 - OMB control number.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGRICULTURE DIRECT SINGLE FAMILY HOUSING LOANS AND GRANTS Post-Servicing Actions § 3550.300 OMB control number..., with an average of 1 1/2 hours per response, including time for review instructions, searching existing...
32 CFR 705.37 - Public affairs and public service awards.
Code of Federal Regulations, 2010 CFR
2010-07-01
... considered by the Distinguished Civilian Service Awards Panel. (See Civilian Manpower Management Instruction... of the Navy. (B) Be the result of a single outstanding project or program. (C) Have been accomplished...
32 CFR 705.37 - Public affairs and public service awards.
Code of Federal Regulations, 2011 CFR
2011-07-01
... considered by the Distinguished Civilian Service Awards Panel. (See Civilian Manpower Management Instruction... of the Navy. (B) Be the result of a single outstanding project or program. (C) Have been accomplished...
32 CFR 705.37 - Public affairs and public service awards.
Code of Federal Regulations, 2014 CFR
2014-07-01
... considered by the Distinguished Civilian Service Awards Panel. (See Civilian Manpower Management Instruction... of the Navy. (B) Be the result of a single outstanding project or program. (C) Have been accomplished...
32 CFR 705.37 - Public affairs and public service awards.
Code of Federal Regulations, 2012 CFR
2012-07-01
... considered by the Distinguished Civilian Service Awards Panel. (See Civilian Manpower Management Instruction... of the Navy. (B) Be the result of a single outstanding project or program. (C) Have been accomplished...
32 CFR 705.37 - Public affairs and public service awards.
Code of Federal Regulations, 2013 CFR
2013-07-01
... considered by the Distinguished Civilian Service Awards Panel. (See Civilian Manpower Management Instruction... of the Navy. (B) Be the result of a single outstanding project or program. (C) Have been accomplished...
48 CFR 52.212-1 - Instructions to Offerors-Commercial Items.
Code of Federal Regulations, 2014 CFR
2014-10-01
... be ordered from the Department of Defense Single Stock Point (DoDSSP) by— (i) Using the ASSIST... SAM records for identifying alternative Electronic Funds Transfer (EFT) accounts (see FAR Subpart 32...
48 CFR 52.212-1 - Instructions to Offerors-Commercial Items.
Code of Federal Regulations, 2012 CFR
2012-10-01
... ordered from the Department of Defense Single Stock Point (DoDSSP) by— (i) Using the ASSIST Shopping... CCR records for identifying alternative Electronic Funds Transfer (EFT) accounts (see FAR Subpart 32...
48 CFR 52.212-1 - Instructions to Offerors-Commercial Items.
Code of Federal Regulations, 2011 CFR
2011-10-01
... ordered from the Department of Defense Single Stock Point (DoDSSP) by— (i) Using the ASSIST Shopping... CCR records for identifying alternative Electronic Funds Transfer (EFT) accounts (see FAR Subpart 32...
Teaching Verbal Chains Using Flow Diagrams and Texts
ERIC Educational Resources Information Center
Holliday, William G.
1976-01-01
A discussion of the recent diagram and attention theory and research surprisingly suggests that a single flow diagram with instructive questions constitutes an effective learning medium in terms of verbal chaining. (Author)
Remember to blink: Reduced attentional blink following instructions to forget.
Taylor, Tracy L
2018-04-24
This study used rapid serial visual presentation (RSVP) to determine whether, in an item-method directed forgetting task, study word processing ends earlier for forget words than for remember words. The critical manipulation required participants to monitor an RSVP stream of black nonsense strings in which a single blue word was embedded. The next item to follow the word was a string of red fs that instructed the participant to forget the word or green rs that instructed the participant to remember the word. After the memory instruction, a probe string of black xs or os appeared at postinstruction positions 1-8. Accuracy in reporting the identity of the probe string revealed an attenuated attentional blink following instructions to forget. A yes-no recognition task that followed the study trials confirmed a directed forgetting effect, with better recognition of remember words than forget words. Considered in the context of control conditions that required participants to commit either all or none of the study words to memory, the pattern of probe identification accuracy following the directed forgetting task argues that an intention to forget releases limited-capacity attentional resources sooner than an instruction to remember, despite participants needing to maintain an ongoing rehearsal set in both cases.
Méndez, Lucía I; Crais, Elizabeth R; Castro, Dina C; Kainz, Kirsten
2015-02-01
This study examined the role of the language of vocabulary instruction in promoting English vocabulary in preschool Latino dual language learners (DLLs). The authors compared the effectiveness of delivering a single evidence-informed vocabulary approach using English as the language of vocabulary instruction (English culturally responsive [ECR]) versus using a bilingual modality that strategically combined Spanish and English (culturally and linguistically responsive [CLR]). Forty-two DLL Spanish-speaking preschoolers were randomly assigned to the ECR group (n=22) or CLR group (n=20). Thirty English words were presented during small-group shared readings in their preschools 3 times a week for 5 weeks. Multilevel models were used to examine group differences in postinstruction scores on 2 Spanish and 2 English vocabulary assessments at instruction end and follow-up. Children receiving instruction in the CLR bilingual modality had significantly higher posttest scores (than those receiving the ECR English-only instruction) on Spanish and English vocabulary assessments at instruction end and on the Spanish vocabulary assessment at follow-up, even after controlling for preinstruction scores. The results provide additional evidence of the benefits of strategically combining the first and second language to promote English and Spanish vocabulary development in this population. Future directions for research and clinical applications are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGhee, J.M.; Roberts, R.M.; Morel, J.E.
1997-06-01
A spherical harmonics research code (DANTE) has been developed which is compatible with parallel computer architectures. DANTE provides 3-D, multi-material, deterministic, transport capabilities using an arbitrary finite element mesh. The linearized Boltzmann transport equation is solved in a second order self-adjoint form utilizing a Galerkin finite element spatial differencing scheme. The core solver utilizes a preconditioned conjugate gradient algorithm. Other distinguishing features of the code include options for discrete-ordinates and simplified spherical harmonics angular differencing, an exact Marshak boundary treatment for arbitrarily oriented boundary faces, in-line matrix construction techniques to minimize memory consumption, and an effective diffusion based preconditioner for scattering dominated problems. Algorithm efficiency is demonstrated for a massively parallel SIMD architecture (CM-5), and compatibility with MPP multiprocessor platforms or workstation clusters is anticipated.
Dense and Sparse Matrix Operations on the Cell Processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel W.; Shalf, John; Oliker, Leonid
2005-05-01
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, using a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.
The MasPar MP-1 As a Computer Arithmetic Laboratory
Anuta, Michael A.; Lozier, Daniel W.; Turner, Peter R.
1996-01-01
This paper is a blueprint for the use of a massively parallel SIMD computer architecture for the simulation of various forms of computer arithmetic. The particular system used is a DEC/MasPar MP-1 with 4096 processors in a square array. This architecture has many advantages for such simulations due largely to the simplicity of the individual processors. Arithmetic operations can be spread across the processor array to simulate a hardware chip. Alternatively they may be performed on individual processors to allow simulation of a massively parallel implementation of the arithmetic. Compromises between these extremes permit speed-area tradeoffs to be examined. The paper includes a description of the architecture and its features. It then summarizes some of the arithmetic systems which have been, or are to be, implemented. The implementation of the level-index and symmetric level-index, LI and SLI, systems is described in some detail. An extensive bibliography is included. PMID:27805123
A biconjugate gradient type algorithm on massively parallel architectures
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Hochbruck, Marlis
1991-01-01
The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal have proposed a novel BCG type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation is presented of QMR based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products, except for one, can be computed in parallel at the end of each block; this is unlike the other standard Lanczos process where inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.
Implementation of an ADI method on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
The implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the FLEX/32 and CRAY/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Implementation of an ADI method on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
In this paper the implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are the MPP, an SIMD machine with 16-Kbit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the Flex/32 and Cray/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally conclusions are presented.
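As an illustration of the tridiagonal solves mentioned in the abstract above, here is a minimal Python sketch of Gaussian elimination specialised to a tridiagonal system (often called the Thomas algorithm). It is not taken from the paper: the function name `solve_tridiagonal`, the argument layout, and the absence of pivoting are all assumptions made for illustration, and the cyclic elimination variant used on the MPP is not shown.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by Gaussian elimination without pivoting.

    Hypothetical layout: lower[i], diag[i], upper[i] are the sub-, main-
    and super-diagonal entries of row i; lower[0] and upper[-1] are ignored.
    Assumes the matrix needs no pivoting (e.g. it is diagonally dominant).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small 3x3 example with diagonals (-1, 2, -1).
solution = solve_tridiagonal([0.0, -1.0, -1.0],
                             [2.0, 2.0, 2.0],
                             [-1.0, -1.0, 0.0],
                             [1.0, 0.0, 1.0])
```

In an ADI scheme each grid line along the implicit direction yields one such system, and the systems for different lines are independent of one another, which is what makes the method a candidate for the parallel machines the abstract compares. The forward and backward loops within a single solve are sequential.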
A heuristic for deriving the optimal number and placement of reconnaissance sensors
NASA Astrophysics Data System (ADS)
Nanda, S.; Weeks, J.; Archer, M.
2008-04-01
A key to mastering asymmetric warfare is the acquisition of accurate intelligence on adversaries and their assets in urban and open battlefields. To achieve this, one needs adequate numbers of tactical sensors placed in locations to optimize coverage, where optimality is realized by covering a given area of interest with the least number of sensors, or covering the largest possible subsection of an area of interest with a fixed set of sensors. Unfortunately, neither problem admits a polynomial time algorithm as a solution, and therefore, the placement of such sensors must utilize intelligent heuristics instead. In this paper, we present a scheme implemented on parallel SIMD processing architectures to yield significantly faster results, and that is highly scalable with respect to dynamic changes in the area of interest. Furthermore, the solution to the first problem immediately translates to serve as a solution to the latter if and when any sensors are rendered inoperable.
7 CFR 810.1402 - Definition of other terms.
Code of Federal Regulations, 2014 CFR
2014-01-01
... containing spots that, singly or in combination, cover 25.0 percent or less of the kernel. (4) Mixed sorghum... the 5/64 triangular-hole sieve according to procedures prescribed in FGIS instructions. (g) Heat...
7 CFR 810.1402 - Definition of other terms.
Code of Federal Regulations, 2013 CFR
2013-01-01
... containing spots that, singly or in combination, cover 25.0 percent or less of the kernel. (4) Mixed sorghum... the 5/64 triangular-hole sieve according to procedures prescribed in FGIS instructions. (g) Heat...
7 CFR 810.1402 - Definition of other terms.
Code of Federal Regulations, 2012 CFR
2012-01-01
... containing spots that, singly or in combination, cover 25.0 percent or less of the kernel. (4) Mixed sorghum... the 5/64 triangular-hole sieve according to procedures prescribed in FGIS instructions. (g) Heat...
48 CFR 52.212-1 - Instructions to Offerors-Commercial Items.
Code of Federal Regulations, 2013 CFR
2013-10-01
... ordered from the Department of Defense Single Stock Point (DoDSSP) by— (i) Using the ASSIST Shopping... for identifying alternative Electronic Funds Transfer (EFT) accounts (see FAR Subpart 32.11) for the...
ERIC Educational Resources Information Center
Gorton, Carolyn
This package of instructional materials is intended for use in preparing single parents and displaced homemakers for entry into the job market. The materials were developed for the ENCORE program--a 4-week, 48-hour, 3-days-per-week program focusing on employability skills, vocational assessment, personal development, shadowing in traditional and…
ERIC Educational Resources Information Center
Canada, Patricia Oxendine
2012-01-01
In response to the mandates of No Child Left Behind (NCLB), educators across the country struggle to close the gaps between males and females. Some of the physiological differences existing between the male and female brain suggest support for single-gender instruction, which is on the rise within this country as well as other parts of the world.…
ERIC Educational Resources Information Center
Phipps, Christa Brown
2017-01-01
Low income male preschoolers with externalizing behaviors have continued behavior issues throughout elementary school, middle school, high school, and into adulthood and create stress for their teachers. Because of this, it is important to detect externalizing behaviors early and implement an appropriate intervention. A single subject reversal…
Senator John Glenn training in Single Systems Trainer
1998-03-30
S98-08640 (6 April 1998) --- U.S. Sen. John H. Glenn Jr. (D.-Ohio) temporarily occupies the commander's station in a space shuttle instruction facility called the single systems trainer. The senator is training as a payload specialist for the STS-95 mission, scheduled for launch aboard the Space Shuttle Discovery later this year. The photo was taken by Joe McNally, National Geographic, for NASA.
Manipulation of cognitive load variables and impact on auscultation test performance.
Chen, Ruth; Grierson, Lawrence; Norman, Geoffrey
2015-10-01
Health profession educators have identified auscultation skill as a learning need for health professional students. This article explores the application of cognitive load theory (CLT) to designing cardiac and respiratory auscultation skill instruction for senior-level undergraduate nursing students. Three experiments assessed student auscultation performance following instructional manipulations of the three primary components of cognitive load: intrinsic, extraneous, and germane load. Study 1 evaluated the impact of intrinsic cognitive load by varying the number of diagnoses learned in one instruction session; Study 2 evaluated the impact of extraneous cognitive load by providing students with single or multiple examples of diagnoses during instruction; and Study 3 evaluated the impact of germane cognitive load by employing mixed or blocked sequences of diagnostic examples to students. Each of the three studies presents results that support CLT as explaining the influence of different types of cognitive processing on auscultation skill acquisition. We conclude with a discussion regarding CLT's usefulness as a framework for education and education research in the health professions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R
Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface ('PAMI') of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.
NASA Astrophysics Data System (ADS)
Förtsch, Christian; Dorfner, Tobias; Baumgartner, Julia; Werner, Sonja; von Kotzebue, Lena; Neuhaus, Birgit J.
2018-04-01
The German National Education Standards (NES) for biology were introduced in 2005. The content part of the NES emphasizes fostering conceptual knowledge. However, there are hardly any indications of what such an instructional implementation could look like. We introduce a theoretical framework of an instructional approach to foster students' conceptual knowledge as demanded in the NES (Fostering Conceptual Knowledge) including instructional practices derived from research on single core ideas, general psychological theories, and biology-specific features of instructional quality. First, we aimed to develop a rating manual, which is based on this theoretical framework. Second, we wanted to describe current German biology instruction according to this approach and to quantitatively analyze its effectiveness. And third, we aimed to provide qualitative examples of this approach to triangulate our findings. In a first step, we developed a theoretically devised rating manual to measure Fostering Conceptual Knowledge in videotaped lessons. Data for quantitative analysis included 81 videotaped biology lessons of 28 biology teachers from different German secondary schools. Six hundred forty students completed a questionnaire on their situational interest after each lesson and an achievement test. Results from multilevel modeling showed significant positive effects of Fostering Conceptual Knowledge on students' achievement and situational interest. For qualitative analysis, we contrasted instruction of four teachers, two with high and two with low student achievement and situational interest using the qualitative method of thematic analysis. Qualitative analysis revealed five main characteristics describing Fostering Conceptual Knowledge. Therefore, implementing Fostering Conceptual Knowledge in biology instruction seems promising. Examples of how to implement Fostering Conceptual Knowledge in instruction are shown and discussed.
Strategies to Overcome Negative Reading Habits of ABE Participants: A Case Study.
ERIC Educational Resources Information Center
Stanfel, Jane E.
1996-01-01
The author theorizes that her students, single-parent welfare recipients, develop negative reading habits to camouflage low literacy. She describes instructional strategies that fail to help and methods she found effective. (SK)
Teaching Cockpit Automation in the Classroom
NASA Technical Reports Server (NTRS)
Casner, Stephen M.
2003-01-01
This study explores the idea of teaching fundamental cockpit automation concepts and skills to aspiring professional pilots in a classroom setting, without the use of sophisticated aircraft or equipment simulators. Pilot participants from a local professional pilot academy completed eighteen hours of classroom instruction that placed a strong emphasis on understanding the underlying principles of cockpit automation systems and their use in a multi-crew cockpit. The instructional materials consisted solely of a single textbook. Pilots received no hands-on instruction or practice during their training. At the conclusion of the classroom instruction, pilots completed a written examination testing their mastery of what had been taught during the classroom meetings. Following the written exam, each pilot was given a check flight in a full-mission Level D simulator of a Boeing 747-400 aircraft. Pilots were given the opportunity to fly one practice leg, and were then tested on all concepts and skills covered in the class during a second leg. The results of the written exam and simulator checks strongly suggest that instruction delivered in a traditional classroom setting can lead to high levels of preparation without the need for expensive airplane or equipment simulators.
Does interactive instruction in introductory physics impact long-term outcomes for students?
NASA Astrophysics Data System (ADS)
Gordon, Vernita
Early college classroom experiences contribute greatly to students leaving STEM majors. Peer instruction is a research-based pedagogy in which students, in small groups in the classroom, discuss concepts and work short problems. A single study at Harvard found that taking peer-instruction introductory physics also increases persistence in science majors. To what degree, if at all, peer instruction helps retention and performance for STEM majors at large public institutions (like University of Texas, Austin) is not known. Here I describe the results of a retrospective pilot study comparing outcomes for students who took different sections of the same calculus-based introductory mechanics course in Fall 2012 and Fall 2014. Compared with traditional lecture sections, peer-instruction sections had a 50% lower drop rate, a 40% / 55% higher rate of enrollment in the 2nd/ 3rd courses in the sequence, and, for the Fall 2012 cohort, a 74% / 165% higher rate of graduating from UT Austin / the UT Austin College of Natural Sciences by Fall 2015. I will discuss weaknesses of this retrospective pilot study and present plans for an intentionally-designed study to be implemented beginning Fall 2017.
Speeding up tsunami wave propagation modeling
NASA Astrophysics Data System (ADS)
Lavrentyev, Mikhail; Romanenko, Alexey
2014-05-01
Trans-oceanic wave propagation is one of the most time- and CPU-consuming parts of the tsunami modeling process. The Method Of Splitting Tsunami (MOST) software package, developed at PMEL NOAA (Pacific Marine Environmental Laboratory of the National Oceanic and Atmospheric Administration, USA), is widely used to evaluate tsunami parameters. However, simulating trans-ocean wave propagation is slow: it takes up to 5 hours of CPU time to "drive" the wave from Chile (epicenter) to the coast of Japan, even on a rather coarse computational mesh. Accurate wave height prediction requires fine meshes, which leads to a dramatic increase in simulation time. Computation time is a critical parameter, as it takes only about 20 minutes for a tsunami wave to approach the coast of Japan after an earthquake at the Japan trench or Sagami trench (as it was after the Great East Japan Earthquake on March 11, 2011). MOST numerically solves the hyperbolic system for three unknown functions, namely the velocity vector and the wave height (shallow water approximation). The system can be split into two independent systems along orthogonal directions (splitting method). Each system can be treated independently. This calculation scheme is well suited to SIMD architectures and to GPUs. We adapted the MOST package to the GPU. Several numerical tests showed a 40x performance gain for an NVIDIA Tesla C2050 GPU vs. a single core of an Intel i7 processor. Results of numerical experiments were compared with other available simulation data. Results calculated on the GPU differ from the reference ones by 10^-3 cm of wave height when simulating 24 hours of wave propagation. This suggests that a real-time system for evaluating tsunami danger could be developed.
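The directional splitting mentioned in this abstract can be illustrated with a minimal sketch. It assumes plain advection with a first-order upwind update on a periodic grid, which is far simpler than the shallow-water system and is not the numerical scheme MOST uses; the names sweep_1d and split_step are hypothetical. The point is only that each row (and then each column) is updated without reference to the others, which is what makes the scheme a natural fit for SIMD lanes or GPU threads.

```python
def sweep_1d(line, courant):
    """One first-order upwind step on a periodic 1D line.

    `courant` is the assumed ratio speed * dt / dx and should lie in [0, 1].
    """
    # line[i - 1] wraps around at i == 0, giving periodic boundaries.
    return [line[i] - courant * (line[i] - line[i - 1]) for i in range(len(line))]


def split_step(grid, cx, cy):
    """Advance a 2D field (grid[row][col]) by one x-sweep followed by one y-sweep."""
    # x-sweep: every row is an independent 1D problem.
    after_x = [sweep_1d(row, cx) for row in grid]
    # y-sweep: every column is an independent 1D problem.
    columns = [sweep_1d(list(col), cy) for col in zip(*after_x)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    field = [[0.0] * 5 for _ in range(4)]
    field[2][3] = 1.0
    for row in split_step(field, 0.5, 0.25):
        print(row)
```

Because the 1D sweeps never read a neighbouring row or column, the two list comprehensions in split_step are the loops a compiler or GPU kernel would parallelize.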
ERIC Educational Resources Information Center
Cannon, Joanna E.; Guardino, Caroline; Antia, Shirin D.; Luckner, John L.
2015-01-01
The field of education of deaf and hard of hearing (DHH) students has a paucity of evidence-based practices (EBPs) to guide instruction. The authors discussed how the research methodology of single-case design (SCD) can be used to build EBPs through direct and systematic replication of studies. An overview of SCD research methods is presented,…
2017-01-26
Naval Research Laboratory Washington, DC 20375-5320 NRL/MR/5514--17-9692 High Resolution Bathymetry Estimation Improvement with Single Image Super...
Acceptability of unsupervised HPV self-sampling using written instructions.
Waller, J; McCaffery, K; Forrest, S; Szarewski, A; Cadman, L; Austin, J; Wardle, J
2006-01-01
The study measured the acceptability of self-sampling for human papillomavirus (HPV) testing in the context of cervical cancer screening. Women carried out self-sampling unsupervised, using a written instruction sheet. Participants were women attending either a family planning clinic or a primary care trust for routine cervical screening. Women (n = 902) carried out self-sampling for HPV testing and then a clinician did a routine cervical smear and HPV test. Immediately after having the two tests, participants completed a measure of acceptability for both tests, and answered questions about ease of using the instruction sheet and willingness to use self-sampling in the future. The majority of women found self-sampling more acceptable than the clinician-administered test, but there was a lack of confidence that the test had been done correctly. Significant demographic differences in attitudes were found, with married women having more favourable attitudes towards self-sampling than single women, and Asian women having more negative attitudes than women in other ethnic groups. Intention to use self-sampling in the future was very high across all demographic groups. Self-sampling for HPV testing was highly acceptable in this large and demographically diverse sample, and women were able to carry out the test alone, using simple written instructions. Consistent with previous studies, women were concerned about doing the test properly and this issue will need to be addressed if self-sampling is introduced. More work is needed to see whether the demographic differences we found are robust and to identify reasons for lower acceptability among single women and those from Asian background.
Virtual trajectories of single-joint movements performed under two basic strategies.
Latash, M L; Gottlieb, G L
1992-01-01
The framework of the equilibrium point hypothesis has been used to analyse motor control processes for single-joint movements. Virtual trajectories and joint stiffness were reconstructed for different movement speeds and distances when subjects were instructed either to move "as fast as possible" or to intentionally vary movement speed. These instructions are assumed to be associated with similar or different rates of change of hypothetical central control variables (corresponding to the speed-sensitive and speed-insensitive strategies). The subjects were trained to perform relatively slow, moderately fast and very fast (nominal movement times 800, 400 and 250 ms) single-joint elbow flexion movements against a constant extending torque bias. They were instructed to reproduce the motor command for a series of movements while ignoring possible changes in the external torque which could slowly and unpredictably increase, decrease, or remain constant. The total muscle torque was calculated as a sum of external and inertial components. Fast movements over different distances were made with the speed-insensitive strategy. They were characterized by an increase in joint stiffness near the midpoint of the movements which was relatively independent of movement amplitude. Their virtual trajectories had a non-monotonic N-shape. All three arms of the N-shape scaled with movement amplitude. Movements over one distance at different speeds were made with a speed-sensitive strategy. They demonstrated different patterns of virtual trajectories and joint stiffness that depended on movement speed. The N-shape became less apparent for moderately fast movements and virtually disappeared for the slow movements. Slow movements showed no visible increase in joint stiffness.(ABSTRACT TRUNCATED AT 250 WORDS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, T; Lin, H; Xu, X
Purpose: (1) To perform phase space (PS) based source modeling for Tomotherapy and Varian TrueBeam 6 MV Linacs, (2) to examine the accuracy and performance of the ARCHER Monte Carlo code on a heterogeneous computing platform with Many Integrated Core coprocessors (MIC, aka Xeon Phi) and GPUs, and (3) to explore software micro-optimization methods. Methods: The patient-specific source of Tomotherapy and Varian TrueBeam Linacs was modeled using the PS approach. For the helical Tomotherapy case, the PS data were calculated in our previous study (Su et al. 2014 41(7) Medical Physics). For the single-view Varian TrueBeam case, we analytically derived them from the raw patient-independent PS data in IAEA’s database, partial geometry information of the jaw and MLC, and the fluence map. The phantom was generated from DICOM images. The Monte Carlo simulation was performed by the ARCHER-MIC and GPU codes, which were benchmarked against a modified parallel DPM code. Software micro-optimization was systematically conducted and focused on SIMD vectorization of tight for-loops and data prefetch, with the ultimate goal of increasing 512-bit register utilization and reducing memory access latency. Results: Dose calculation was performed for two clinical cases, a Tomotherapy-based prostate cancer treatment and a TrueBeam-based left breast treatment. ARCHER was verified against the DPM code. The statistical uncertainty of the dose to the PTV was less than 1%. Using double precision, the total wall time of the multithreaded CPU code on an X5650 CPU was 339 seconds for the Tomotherapy case and 131 seconds for the TrueBeam case, while on three 5110P MICs it was reduced to 79 and 59 seconds, respectively. The single-precision GPU code on a K40 GPU took 45 seconds for the Tomotherapy dose calculation. Conclusion: We have extended ARCHER, the MIC- and GPU-based Monte Carlo dose engine, to Tomotherapy and TrueBeam dose calculations.
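As a rough worked example of the numbers in this abstract, the sketch below computes how many values fit in one 512-bit register and the wall-time ratios between the reported CPU and MIC runs. The element sizes (64-bit double, 32-bit single) are the standard IEEE 754 widths rather than something the abstract states, and the helper names lanes and speedup are hypothetical. The ratios compare different hardware, so they should not be read as the gain from vectorization alone.

```python
REGISTER_BITS = 512  # register width quoted in the abstract


def lanes(element_bits, register_bits=REGISTER_BITS):
    """Number of elements of the given width that fit in one SIMD register."""
    return register_bits // element_bits


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline wall time to optimized wall time."""
    return baseline_seconds / optimized_seconds


# Wall times in seconds as reported: (X5650 CPU, three 5110P MICs).
WALL_TIMES = {"Tomotherapy": (339.0, 79.0), "TrueBeam": (131.0, 59.0)}

if __name__ == "__main__":
    print("double-precision lanes:", lanes(64))
    print("single-precision lanes:", lanes(32))
    for case, (cpu, mic) in WALL_TIMES.items():
        print(f"{case}: {speedup(cpu, mic):.2f}x")
```

A tight loop that leaves most of those lanes idle uses only a fraction of the register, which is why the abstract names register utilization as the optimization target.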
Effects of single sex lab groups on physics self-efficacy, behavior, and academic performance
NASA Astrophysics Data System (ADS)
Hunt, Gary L.
The purpose of this study was to investigate the relationships between the gender composition of a laboratory group and student behaviors, self-efficacy, and quiz performance, within the college physics laboratory. A student population was chosen and subdivided into two groups, which were assigned either same-sex or coed laboratory teams while executing identical laboratory activities and instruction. Assessments were carried out prior to instruction, during the course, and at the end of one semester worth of instruction and laboratory activities. Students were assessed in three areas: behaviors exhibited during laboratory activities, self-efficacy, and scores on laboratory quizzes. Analyses considered the differences in outcomes after a single semester of physics laboratories that differed only in team gender organization. The results indicated that there were no statistically significant differences in behavior variable, self-efficacy or laboratory quiz scores between same sex teams and coed teams. There were also no statistically significant differences between genders, and no interaction effect present. In a post-hoc analysis of the individual behaviors data, it was noted that there is present a practical difference in the individual behaviors exhibited by males and females. This difference implies a difference in how males and females successfully engage in the laboratory activities.
... is not a means of abortion. Most medicine brands require a single dose of 1 pill. Some brands have 2 doses (1 pill followed by a ... pills together. Follow the instructions for each specific brand. The U.S. Food and Drug Administration (FDA) says ...
Understanding Students’ Ideas about the Geometry of the Universe
NASA Astrophysics Data System (ADS)
Coble, Kimberly A.; Conlon, Mallory; Bailey, Janelle M.
2017-06-01
As astronomers further develop an understanding of the geometry of the Universe, it is essential to study students’ ideas so that instructors can communicate the field’s current status more effectively to their students. In this study, we examine undergraduate students’ pre-instruction ideas in general education astronomy courses (ASTRO 101) at three institutions through pre-course surveys given during the first week of instruction [N ~ 265]. We also examine students’ post-instruction ideas at a single institution through exam questions [N ~ 75] and interviews. Responses are analyzed through an iterative process of identifying self-emergent themes. We examine not only what students think the curvature of the universe is, but also "how we know." We find that many students think the Universe is “round” or that we cannot measure its curvature. Additionally, popular visualizations may enforce incorrect ideas.
Code of Federal Regulations, 2010 CFR
2010-10-01
..., single-break, signal control circuits using a grounded common, and alternating current power distribution... TRANSPORTATION RULES, STANDARDS, AND INSTRUCTIONS GOVERNING THE INSTALLATION, INSPECTION, MAINTENANCE, AND REPAIR... General § 236.2 Grounds. Each circuit, the functioning of which affects the safety of train operations...
Ampullary Electroreceptors in Neurophysiological Instruction.
ERIC Educational Resources Information Center
Peters, R. C.; And Others
1988-01-01
Presents a model system designed for the electrophysiological investigation of single unit activity in intact anaesthetized animals. Illustrates how information is coded into action potential patterns by sense organs. Uses the ampullary electroreceptor of the brown bullhead catfish as an example. (Author/CW)
48 CFR 652.242-72 - Shipping Instructions.
Code of Federal Regulations, 2011 CFR
2011-10-01
... dimensions of lumber for struts, frame members, and single diagonal braces Up to 45 kg 19.05 × 57.15 mm 46 to... (b) Each box shall be lined with waterproof paper and shall be bound with 19.05 mm steel straps...
48 CFR 652.242-72 - Shipping Instructions.
Code of Federal Regulations, 2010 CFR
2010-10-01
... dimensions of lumber for struts, frame members, and single diagonal braces Up to 45 kg 19.05 × 57.15 mm 46 to... (b) Each box shall be lined with waterproof paper and shall be bound with 19.05 mm steel straps...
Lung Injury; Relates to Real-Time Endoscopic Monitoring of Single Cells Respiratory Health in Lung
2017-09-01
AWARD NUMBER: W81XWH-16-1-0253 TITLE: Lung Injury; Relates to Real-Time Endoscopic Monitoring of Single Cells Respiratory Health in Lung...
El Dakhakhny, A M; Hesham, M A; Hassan, T H; El Awady, S; Hanfy, M M
2014-07-01
Nowadays, health education has been elevated to a higher standing in healthcare systems in managing chronic illness; yet, this approach has not received sufficient support in developing countries as these societies still tend to the traditional stage of 'treatment after disease'. Adolescence is a critical period and voyage into adulthood can be more challenging for haemophilia teens. For teens with haemophilia, learning to care for their own disorder is a giant step forward in asserting their independence and preparation for adult life. We aimed to determine impact of health instructions on improving knowledge and practices of haemophilia A adolescents. An interventional study was conducted on 50 haemophilia A adolescents at outpatient clinic of Pediatric Hematology Unit of Zagazig University Hospitals. Three tools were used. The first was a structured interview sheet to evaluate patients' knowledge. The second was a clinical checklist to evaluate patients' practices. The third was health instructions program. Tools were developed by the researchers based on a thorough review of related literature and a full understanding of the needs of haemophilic adolescents. Evaluation of health instructions success was based on comparing scores of tool I and tool II before health instructions (pretest) and after health instructions immediately (posttest) and after 2 months (follow-up test). There was a significant improvement in knowledge and practices of haemophilia A adolescents in posttest and follow-up test compared to pretest. Health instructions have an impact on improving knowledge and practices of haemophilia A adolescents. © 2014 John Wiley & Sons Ltd.
Typology of after-hours care instructions for patients
Bordman, Risa; Bovett, Monica; Drummond, Neil; Crighton, Eric J.; Wheler, David; Moineddin, Rahim; White, David
2007-01-01
OBJECTIVE To develop a typology of after-hours care (AHC) instructions and to examine physician and practice characteristics associated with each type of instruction. DESIGN Cross-sectional telephone survey. Physicians’ offices were called during evenings and weekends to listen to their messages regarding AHC. All messages were categorized. Thematic analysis of a subset of messages was conducted to develop a typology of AHC instructions. Logistic regression analysis was used to identify associations between physician and practice characteristics and the instructions left for patients. SETTING Family practices in the greater Toronto area. PARTICIPANTS Stratified random sample of family physicians providing office-based primary care. MAIN OUTCOME MEASURES Form of response (eg, answering machine), content of message, and physician and practice characteristics. RESULTS Of 514 after-hours messages from family physicians’ offices, 421 were obtained from answering machines, 58 were obtained from answering services, 23 had no answer, 2 gave pager numbers, and 10 had other responses. Message content ranged from no AHC instructions to detailed advice; 54% of messages provided a single instruction, and the rest provided a combination of instructions. Content analysis identified 815 discrete instructions or types of response that were classified into 7 categories: 302 instructed patients to go to an emergency department; 122 provided direct contact with a physician; 115 told patients to go to a clinic; 94 left no directions; 76 suggested calling a housecall service; 45 suggested calling Telehealth; and 61 suggested other things. About 22% of messages only advised attending an emergency department, and 18% gave no advice at all. Physicians who were female, had Canadian certification in family medicine, held hospital privileges, or had attended a Canadian medical school were more likely to be directly available to their patients. 
CONCLUSION Important issues identified included the recommendation to use an emergency department as the sole source of AHC, practices providing no specific AHC instructions to their patients, and physicians’ lack of acceptance of Telehealth. To improve AHC, new initiatives should build upon the existing system, changes should be integrated, and there should be a range of AHC options for patients and physicians. PMID:17872681
Typology of after-hours care instructions for patients: telephone survey and multivariate analysis.
Bordman, Risa; Bovett, Monica; Drummond, Neil; Crighton, Eric J; Wheler, David; Moineddin, Rahim; White, David
2007-03-01
To develop a typology of after-hours care (AHC) instructions and to examine physician and practice characteristics associated with each type of instruction. Cross-sectional telephone survey. Physicians' offices were called during evenings and weekends to listen to their messages regarding AHC. All messages were categorized. Thematic analysis of a subset of messages was conducted to develop a typology of AHC instructions. Logistic regression analysis was used to identify associations between physician and practice characteristics and the instructions left for patients. Family practices in the greater Toronto area. Stratified random sample of family physicians providing office-based primary care. Form of response (eg, answering machine), content of message, and physician and practice characteristics. Of 514 after-hours messages from family physicians' offices, 421 were obtained from answering machines, 58 were obtained from answering services, 23 had no answer, 2 gave pager numbers, and 10 had other responses. Message content ranged from no AHC instructions to detailed advice; 54% of messages provided a single instruction, and the rest provided a combination of instructions. Content analysis identified 815 discrete instructions or types of response that were classified into 7 categories: 302 instructed patients to go to an emergency department; 122 provided direct contact with a physician; 115 told patients to go to a clinic; 94 left no directions; 76 suggested calling a housecall service; 45 suggested calling Telehealth; and 61 suggested other things. About 22% of messages only advised attending an emergency department, and 18% gave no advice at all. Physicians who were female, had Canadian certification in family medicine, held hospital privileges, or had attended a Canadian medical school were more likely to be directly available to their patients. 
Important issues identified included the recommendation to use an emergency department as the sole source of AHC, practices providing no specific AHC instructions to their patients, and physicians' lack of acceptance of Telehealth. To improve AHC, new initiatives should build upon the existing system, changes should be integrated, and there should be a range of AHC options for patients and physicians.
Smith, Michelle K.; Knight, Jennifer K.
2012-01-01
To help genetics instructors become aware of fundamental concepts that are persistently difficult for students, we have analyzed the evolution of student responses to multiple-choice questions from the Genetics Concept Assessment. In total, we examined pretest (before instruction) and posttest (after instruction) responses from 751 students enrolled in six genetics courses for either majors or nonmajors. Students improved on all 25 questions after instruction, but to varying degrees. Notably, there was a subgroup of nine questions for which a single incorrect answer, called the most common incorrect answer, was chosen by >20% of students on the posttest. To explore response patterns to these nine questions, we tracked individual student answers before and after instruction and found that particular conceptual difficulties about genetics are both more likely to persist and more likely to distract students than other incorrect ideas. Here we present an analysis of the evolution of these incorrect ideas to encourage instructor awareness of these genetics concepts and provide advice on how to address common conceptual difficulties in the classroom. PMID:22367036
DOE Office of Scientific and Technical Information (OSTI.GOV)
Endele, Max; Etzrodt, Martin; Schroeder, Timm, E-mail: timm.schroeder@bsse.ethz.ch
Hematopoiesis is the cumulative consequence of finely tuned signaling pathways activated through extrinsic factors, such as local niche signals and systemic hematopoietic cytokines. Whether extrinsic factors actively instruct the lineage choice of hematopoietic stem and progenitor cells or only selectively allow survival and proliferation of already intrinsically lineage-committed cells has been debated for decades. Recent results demonstrated that cytokines can instruct lineage choice. However, the precise function of individual cytokine-triggered signaling molecules in inducing cellular events like proliferation, lineage choice, and differentiation remains largely elusive. Signal transduction pathways activated by different cytokine receptors are highly overlapping, but support the production of distinct hematopoietic lineages. Cellular context, signaling dynamics, and the crosstalk of different signaling pathways determine the cellular response to a given extrinsic signal. New tools to manipulate and continuously quantify signaling events at the single cell level are therefore required to thoroughly interrogate how dynamic signaling networks yield a specific cellular response. Highlights: • Recent studies provided definite proof for lineage-instructive action of cytokines. • Signaling pathways involved in hematopoietic lineage instruction remain elusive. • New tools are emerging to quantitatively study dynamic signaling networks over time.
Na, Ji Young; Wilkinson, Krista M
2017-08-07
Children with Down syndrome often have more restricted emotion expression and recognition skills than their peers who are developing typically, and potentially fewer opportunities to learn these skills. This study investigated the effect of the Strategies for Talking about Emotions as PartnerS (STEPS) programme on parents' provision of opportunities for emotion communication using visual communication supports. The study used a single-subject multiple-baseline across participants design with three parent-child dyads. Shared book reading was used as the context for parent instruction and data collection. Parents increased their use of the emotion communication strategies immediately following an instructional session, and continued to use them for the remaining phases of the study. In turn, the children participated more actively in the discussion by making comments about emotions when parents provided more opportunities. The STEPS instructional programme is effective for improving parents' provision of opportunities for discussing emotions during storybook reading with children who have Down syndrome. All parents indicated that they would use the strategy during future reading activities. This paper discusses the results of the study and directions for future research.
Smith, Michelle K; Knight, Jennifer K
2012-05-01
To help genetics instructors become aware of fundamental concepts that are persistently difficult for students, we have analyzed the evolution of student responses to multiple-choice questions from the Genetics Concept Assessment. In total, we examined pretest (before instruction) and posttest (after instruction) responses from 751 students enrolled in six genetics courses for either majors or nonmajors. Students improved on all 25 questions after instruction, but to varying degrees. Notably, there was a subgroup of nine questions for which a single incorrect answer, called the most common incorrect answer, was chosen by >20% of students on the posttest. To explore response patterns to these nine questions, we tracked individual student answers before and after instruction and found that particular conceptual difficulties about genetics are both more likely to persist and more likely to distract students than other incorrect ideas. Here we present an analysis of the evolution of these incorrect ideas to encourage instructor awareness of these genetics concepts and provide advice on how to address common conceptual difficulties in the classroom.
Roberts, Megan Y; Kaiser, Ann P; Wolfe, Cathy E; Bryant, Julie D; Spidalieri, Alexandria M
2014-10-01
In this study, the authors examined the effects of the Teach-Model-Coach-Review instructional approach on caregivers' use of four enhanced milieu teaching (EMT) language support strategies and on their children's use of expressive language. Four caregiver-child dyads participated in a single-subject, multiple-baseline study. Children were between 24 and 42 months of age and had language impairment. Interventionists used the Teach-Model-Coach-Review instructional approach to teach caregivers to use matched turns, expansions, time delays, and milieu teaching prompts during 24 individualized clinic sessions. Caregiver use of each EMT language support strategy and child use of communication targets were the dependent variables. The caregivers demonstrated increases in their use of each EMT language support strategy after instruction. Generalization and maintenance of strategy use to the home was limited, indicating that teaching across routines is necessary to achieve maximal outcomes. All children demonstrated gains in their use of communication targets and in their performance on norm-referenced measures of language. The results indicate that the Teach-Model-Coach-Review instructional approach resulted in increased use of EMT language support strategies by caregivers. Caregiver use of these strategies was associated with positive changes in child language skills.
Re-examining the effects of verbal instructional type on early stage motor learning.
Bobrownicki, Ray; MacPherson, Alan C; Coleman, Simon G S; Collins, Dave; Sproule, John
2015-12-01
The present study investigated the differential effects of analogy and explicit instructions on early stage motor learning and movement in a modified high jump task. Participants were randomly assigned to one of three experimental conditions: analogy, explicit light (reduced informational load), or traditional explicit (large informational load). During the two-day learning phase, participants learned a novel high jump technique based on the 'scissors' style using the instructions for their respective conditions. For the single-day testing phase, participants completed both a retention test and task-relevant pressure test, the latter of which featured a rising high-jump-bar pressure manipulation. Although analogy learners demonstrated slightly more efficient technique and reported fewer technical rules on average, the differences between the conditions were not statistically significant. There were, however, significant differences in joint variability with respect to instructional type, as variability was lowest for the analogy condition during both the learning and testing phases, and as a function of block, as joint variability decreased for all conditions during the learning phase. Findings suggest that reducing the informational volume of explicit instructions may mitigate the deleterious effects on performance previously associated with explicit learning in the literature. Copyright © 2015 Elsevier B.V. All rights reserved.
Effects of Task Instruction on Autobiographical Memory Specificity in Young and Older Adults
Ford, Jaclyn Hennessey; Rubin, David C.; Giovanello, Kelly S.
2013-01-01
Older adults tend to retrieve autobiographical information that is overly general (i.e. not restricted to a single event, termed the overgenerality effect) relative to young adults’ specific memories. A vast majority of studies that have reported overgenerality effects explicitly instruct participants to retrieve specific memories, thereby requiring participants to maintain task goals, inhibit inappropriate responses, and control their memory search. Since these processes are impaired in healthy aging, it is important to determine whether such task instructions influence the magnitude of the overgenerality effect in older adults. In the current study, participants retrieved autobiographical memories during presentation of musical clips. Task instructions were manipulated to separate age-related differences in the specificity of underlying memory representations from age-related differences in following task instructions. Whereas young adults modulated memory specificity based on task demands, older adults did not. These findings suggest that reported rates of overgenerality in older adults’ memories may include age-related differences in memory representation, as well as differences in task compliance. Such findings provide a better understanding of the underlying cognitive mechanisms involved in age-related changes in autobiographical memory and may also be valuable for future research examining effects of overgeneral memory on general well-being. PMID:23915176
Malleable architecture generator for FPGA computing
NASA Astrophysics Data System (ADS)
Gokhale, Maya; Kaba, James; Marks, Aaron; Kim, Jang
1996-10-01
The malleable architecture generator (MARGE) is a tool set that translates high-level parallel C to configuration bit streams for field-programmable logic based computing systems. MARGE creates an application-specific instruction set and generates the custom hardware components required to perform exactly those computations specified by the C program. In contrast to traditional fixed-instruction processors, MARGE's dynamic instruction set creation provides for efficient use of hardware resources. MARGE processes intermediate code in which each operation is annotated by the bit lengths of the operands. Each basic block (sequence of straight line code) is mapped into a single custom instruction which contains all the operations and logic inherent in the block. A synthesis phase maps the operations comprising the instructions into register transfer level structural components and control logic which have been optimized to exploit functional parallelism and function unit reuse. As a final stage, commercial technology-specific tools are used to generate configuration bit streams for the desired target hardware. Technology-specific pre-placed, pre-routed macro blocks are utilized to implement as much of the hardware as possible. MARGE currently supports the Xilinx-based Splash-2 reconfigurable accelerator and National Semiconductor's CLAy-based parallel accelerator, MAPA. The MARGE approach has been demonstrated on systolic applications such as DNA sequence comparison.
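The step of mapping each basic block to one custom instruction can be sketched as follows. This is a toy illustration under stated assumptions: the intermediate representation is a hypothetical list of (op, operand) pairs with invented op names ("label", "jump", "branch"), and nothing here reflects MARGE's actual intermediate code or its bit-length annotations. It uses the textbook leader rule: a block starts at the first instruction, at any label, and right after any jump or branch.

```python
BRANCH_OPS = {"jump", "branch"}  # hypothetical control-transfer ops


def split_basic_blocks(instrs):
    """Partition a linear list of (op, operand) pairs into basic blocks."""
    leaders = {0} if instrs else set()
    for i, (op, _operand) in enumerate(instrs):
        if op == "label":
            leaders.add(i)  # a jump target starts a new block
        if op in BRANCH_OPS and i + 1 < len(instrs):
            leaders.add(i + 1)  # the instruction after a branch starts a block
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [instrs[s:e] for s, e in zip(starts, ends)]


def name_custom_instructions(blocks):
    """Give each block one name, standing in for one custom instruction."""
    return {f"CUSTOM_{k}": [op for op, _ in block] for k, block in enumerate(blocks)}


# Hypothetical straight-line program with one conditional branch.
PROGRAM = [
    ("load", "a"),
    ("add", "a, 1"),
    ("branch", "L1"),
    ("mul", "a, 2"),
    ("label", "L1"),
    ("store", "a"),
]

if __name__ == "__main__":
    for name, ops in name_custom_instructions(split_basic_blocks(PROGRAM)).items():
        print(name, ops)
```

In this sketch the six operations collapse into three named units; in the approach the abstract describes, each such unit would then go through synthesis into hardware components.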
Teacher spatial skills are linked to differences in geometry instruction.
Otumfuor, Beryl Ann; Carr, Martha
2017-12-01
Spatial skills have been linked to better performance in mathematics. The purpose of this study was to examine the relationship between teacher spatial skills and their instruction, including teacher content and pedagogical knowledge, use of pictorial representations, and use of gestures during geometry instruction. Fifty-six middle school teachers participated in the study. The teachers were administered spatial measures of mental rotations and spatial visualization. Next, a single geometry class was videotaped. Correlational analyses revealed that spatial skills significantly correlate with teacher's use of representational gestures and content and pedagogical knowledge during instruction of geometry. Spatial skills did not independently correlate with the use of pointing gestures or the use of pictorial representations. However, an interaction term between spatial skills and content and pedagogical knowledge did correlate significantly with the use of pictorial representations. Teacher experience as measured by the number of years of teaching and highest degree did not appear to affect the relationships among the variables with the exception of the relationship between spatial skills and teacher content and pedagogical knowledge. Teachers with better spatial skills are also likely to use representational gestures and to show better content and pedagogical knowledge during instruction. Spatial skills predict pictorial representation use only as a function of content and pedagogical knowledge. © 2017 The British Psychological Society.
ERIC Educational Resources Information Center
Lent, John
1984-01-01
This article describes a computer network system that connects several microcomputers to a single disk drive and one copy of software. Many schools are switching to networks as a cheaper and more efficient means of computer instruction. Teachers may be faced with copyright problems when reproducing programs. (DF)
Adding Another Dimension With Holography.
ERIC Educational Resources Information Center
McNair, Rita H.; Rice, Dale R.
1984-01-01
Provides instructions for preparing, processing, and viewing single-beam reflection holograms in science classrooms. Indicates that the process is simple to demonstrate and moderate in cost. A description of the required equipment (optics table, laser, mirrors, lens, filmholder/plateholder, recording materials, and darkroom chemicals/equipment) is…
van den Noort, Josien C.; van Beek, Nathalie; van der Kraan, Thomas; Veeger, DirkJan H. E. J.; Stegeman, Dick F.; Veltink, Peter H.; Maas, Huub
2016-01-01
The variability in the numerous tasks in which we use our hands is very large. However, independent movement control of individual fingers is limited. To assess the extent of finger independency during full-range finger flexion including all finger joints, we studied enslaving (movement in non-instructed fingers) and range of independent finger movement through the whole finger flexion trajectory in single and multi-finger movement tasks. Thirteen young healthy subjects performed single- and multi-finger movement tasks under two conditions: active flexion through the full range of movement with all fingers free to move and active flexion while the non-instructed finger(s) were restrained. Finger kinematics were measured using inertial sensors (PowerGlove), to assess enslaving and range of independent finger movement. Although all fingers showed enslaving movement to some extent, highest enslaving was found in adjacent fingers. Enslaving effects in ring and little finger were increased with movement of additional, non-adjacent fingers. The middle finger was the only finger affected by restriction in movement of non-instructed fingers. Each finger showed a range of independent movement before the non-instructed fingers started to move, which was largest for the index finger. The start of enslaving was asymmetrical for adjacent fingers. Little finger enslaving movement was affected by multi-finger movement. We conclude that no finger can move independently through the full range of finger flexion, although some degree of full independence is present for smaller movements. This range of independent movement is asymmetric and variable between fingers and between subjects. The presented results provide insight into the role of finger independency for different types of tasks and populations. PMID:27992598
Ten Eyck, Raymond P; Tews, Matthew; Ballester, John M; Hamilton, Glenn C
2010-06-01
To determine the impact of simulation-based instruction on student performance in the role of emergency department resuscitation team leader. A randomized, single-blinded, controlled study using an intention to treat analysis. Eighty-three fourth-year medical students enrolled in an emergency medicine clerkship were randomly allocated to two groups differing only by instructional format. Each student individually completed an initial simulation case, followed by a standardized curriculum of eight cases in either group simulation or case-based group discussion format before a second individual simulation case. A remote coinvestigator measured eight objective performance end points using digital recordings of all individual simulation cases. McNemar chi2, Pearson correlation, repeated measures multivariate analysis of variance, and follow-up analysis of variance were used for statistical evaluation. Sixty-eight students (82%) completed both initial and follow-up individual simulations. Eight students were lost from the simulation group and seven from the discussion group. The mean postintervention case performance was significantly better for the students allocated to simulation instruction compared with the group discussion students for four outcomes including a decrease in mean time to (1) order an intravenous line; (2) initiate cardiac monitoring; (3) order initial laboratory tests; and (4) initiate blood pressure monitoring. Paired comparisons of each student's initial and follow-up simulations demonstrated significant improvement in the same four areas, in mean time to order an abdominal radiograph and in obtaining an allergy history. A single simulation-based teaching session significantly improved student performance as a team leader. Additional simulation sessions provided further improvement compared with instruction provided in case-based group discussion format.
Simulation-based instruction of technical skills
NASA Technical Reports Server (NTRS)
Towne, Douglas M.; Munro, Allen
1991-01-01
A rapid intelligent tutoring development system (RAPIDS) was developed to facilitate the production of interactive, real-time graphical device models for use in instructing the operation and maintenance of complex systems. The tools allowed subject matter experts to produce device models by creating instances of previously defined objects and positioning them in the emerging device model. These simulation authoring functions, as well as those associated with demonstrating procedures and functional effects on the completed model, required no previous programming experience or use of frame-based instructional languages. Three large simulations were developed in RAPIDS, each involving more than a dozen screen-sized sections. Seven small, single-view applications were developed to explore the range of applicability. Three workshops were conducted to train others in the use of the authoring tools. Participants learned to employ the authoring tools in three to four days and were able to produce small working device models on the fifth day.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R
Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.
Jitendra, Asha K; Petersen-Brown, Shawna; Lein, Amy E; Zaslofsky, Anne F; Kunkel, Amy K; Jung, Pyung-Gang; Egan, Andrea M
2015-01-01
This study examined the quality of the research base related to strategy instruction priming the underlying mathematical problem structure for students with learning disabilities and those at risk for mathematics difficulties. We evaluated the quality of methodological rigor of 18 group research studies using the criteria proposed by Gersten et al. and 10 single case design (SCD) research studies using criteria suggested by Horner et al. and the What Works Clearinghouse. Results indicated that 14 group design studies met the criteria for high-quality or acceptable research, whereas SCD studies did not meet the standards for an evidence-based practice. Based on these findings, strategy instruction priming the mathematics problem structure is considered an evidence-based practice using only group design methodological criteria. Implications for future research and for practice are discussed. © Hammill Institute on Disabilities 2013.
Active learning of geometrical optics in high school: the ALOP approach
NASA Astrophysics Data System (ADS)
Alborch, Alejandra; Pandiella, Susana; Benegas, Julio
2017-09-01
A group comparison experiment of two high school classes with pre and post instruction testing has been carried out to study the suitability and advantages of using the active learning of optics and photonics (ALOP) curricula in high schools of developing countries. Two parallel, mixed gender, 12th grade classes of a high school run by the local university were chosen. One course was randomly selected to follow the experimental instruction, based on teacher and student activities contained in the ALOP Manual. The other course followed the traditional, teacher-centered, instruction previously practiced. Conceptual knowledge of the characteristics of image formation by plane mirrors and single convergent and divergent lenses was measured by applying, in both courses, the multiple-choice test, light and optics conceptual evaluation (LOCE). Measurement before instruction showed that initial knowledge was almost null, and therefore equivalent, in both courses. After instruction testing showed that the conceptual knowledge of students following the ALOP curricula more than doubled that achieved by students in the control course, a situation maintained throughout the six conceptual dimensions tested by the 34 questions of the LOCE test used in this experiment. Using a 60% performance level on the LOCE test as the threshold of satisfactory performance, most (about 90%) of the experimental group achieved this level—independent of initial knowledge, while no student following traditional instruction reached this level of understanding. Some considerations and recommendations for prospective users are also included.
Marti, A M; Harris, B T; Metz, M J; Morton, D; Scarfe, W C; Metz, C J; Lin, W-S
2017-08-01
With increasing use of digital scanning with restorative procedures in the dental office, it becomes necessary that educational institutions adopt instructional methodology for introducing this technology together with conventional impression techniques. To compare the time differences between instructing dental students on digital scanning (DS) (LAVA C.O.S. digital impression system) and a conventional impression technique (CI) (polyvinyl siloxane), and to compare students' attitudes and beliefs towards both techniques. Volunteer sophomore dental students (n = 25) with no prior experience in clinical impressions were recruited and IRB consent obtained. Participants responded to a pre-and post-exposure questionnaire. Participants were instructed on the use of both DS and CI for a single tooth full coverage crown restoration using a consecutive sequence of video lecture, investigator-led demonstration and independent impression exercise. The time necessary for each step (minutes) was recorded. Statistical significance was calculated using dependent t-tests (time measurements) and 2-sample Mann-Whitney (questionnaire responses). The time spent teaching students was greater for DS than CI for video lecture (15.95 and 10.07 min, P = 0.0000), demonstration time (9.06 and 4.70 min, P = 0.0000) and impression time (18.17 and 8.59 min, P = 0.0000). Prior to the instruction and practice, students considered themselves more familiar with CI (3.96) than DS (1.96) (P = 0.0000). After the instruction and practice, participants reported CI technique proved significantly easier than expected (pre-instruction: 3.52 and post-instruction: 4.08, P = 0.002). However, overall participants' perception of ease of use for DS was not influenced by this instruction and practice experience (pre-instruction: 3.84 and post-instruction: 3.56, P = 0.106). 
Despite the results, 96% of participants expressed an expectation that DS will become their predominant impression technique during their careers. Dental students with no clinical experience have high expectations for digital scanning, and despite their initial difficulty, expect it to become their primary impression technique during their professional futures. The instructional time necessary for introducing DS into the curriculum is significantly greater than CI in both classroom (lecture) and clinical simulation settings (investigator-led demonstration). © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
TEMP, TEXAS EDUCATIONAL MICROWAVE PROJECT. FINAL REPORT.
ERIC Educational Resources Information Center
SCHENKKAN, R.F.; AND OTHERS
A PILOT EFFORT TO LINK 11 INSTITUTIONS OF HIGHER LEARNING BY MICROWAVE TRANSMITTERS FOR INSTRUCTIONAL PURPOSES WAS DEMONSTRATED. THIS MICROWAVE LINKAGE PROVIDED A SINGLE CLOSED-CIRCUIT TELEVISION SYSTEM THROUGHOUT THE PARTICIPATING INSTITUTIONS. THROUGH THE FORMATION OF THE "ELECTRONIC CAMPUS," THE FACULTY RESOURCES OF ALL 11 SCHOOLS…
29 CFR 1926.12 - Reorganization Plan No. 14 of 1950.
Code of Federal Regulations, 2013 CFR
2013-07-01
... purposes. Research and related purposes means research, research training, surveys, or demonstrations in... research, research training, surveys, or demonstrations in the field of sectarian instruction or the... mortgage insurance on single family or multifamily housing with experimental design of materials. (ix) War...
29 CFR 1926.12 - Reorganization Plan No. 14 of 1950.
Code of Federal Regulations, 2012 CFR
2012-07-01
... purposes. Research and related purposes means research, research training, surveys, or demonstrations in... research, research training, surveys, or demonstrations in the field of sectarian instruction or the... mortgage insurance on single family or multifamily housing with experimental design of materials. (ix) War...
29 CFR 1926.12 - Reorganization Plan No. 14 of 1950.
Code of Federal Regulations, 2014 CFR
2014-07-01
... purposes. Research and related purposes means research, research training, surveys, or demonstrations in... research, research training, surveys, or demonstrations in the field of sectarian instruction or the... mortgage insurance on single family or multifamily housing with experimental design of materials. (ix) War...
29 CFR 1926.12 - Reorganization Plan No. 14 of 1950.
Code of Federal Regulations, 2011 CFR
2011-07-01
... purposes. Research and related purposes means research, research training, surveys, or demonstrations in... research, research training, surveys, or demonstrations in the field of sectarian instruction or the... mortgage insurance on single family or multifamily housing with experimental design of materials. (ix) War...
ERIC Educational Resources Information Center
Allegheny County Community Coll., Pittsburgh, PA.
Instructional objectives and performance requirements are outlined in this course guide for Welding IV, a competency-based course in advanced arc welding offered at the Community College of Allegheny County to provide students with proficiency in: (1) single vee groove welding using code specifications established by the American Welding Society…
MICROBIAL LABORATORY GUIDANCE MANUAL FOR THE ...
The Long-Term 2 Enhanced Surface Water Treatment Rule Laboratory Instruction Manual will be a compilation of all information needed by laboratories and field personnel to collect, analyze, and report the microbiological data required under the rule. The manual will provide laboratories with a single source of information that currently is available from various sources including the latest versions of Methods 1622 and 1623, including all approved, equivalent modifications; the procedures for E.coli methods approved for use under the LT2ESWTR; lists of vendor sources; data recording forms; data reporting requirements; information on the Laboratory Quality Assurance Evaluation Program for the Analysis of Cryptosporidium in Water; and sample collection procedures. Although most of this information is available elsewhere, a single, comprehensive compendium containing this information is needed to aid utilities and laboratories performing the sampling and analysis activities required under the LT2 rule. This manual will serve as an instruction manual for laboratories to use when collecting data for Crypto, E. coli and turbidity.
Zhao, Li; Xing, Xiao; Guo, Xuhong; Liu, Zehua; He, Yang
2014-10-01
A brain-computer interface (BCI) is a system that uses electroencephalogram (EEG) signals to enable communication and control between humans and computers or other electronic equipment. This paper describes the working principle of a wireless smart home system based on BCI technology. We elicited the steady-state visual evoked potential (SSVEP) using a single-chip microcomputer and a visual stimulator composed of LED lamps. Then, by building a power spectral transformation on the LabVIEW platform, we processed the EEG signals recorded under different stimulation frequencies in real time and converted them into different instructions. These instructions were received by wireless transceiver equipment to control household appliances, achieving intelligent control of the specified devices. The experimental results showed that the accuracy for the 10 subjects reached 100% and that the average control time for a single device was 4 seconds, so the design achieved the original purpose of the smart home system.
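The frequency-to-instruction step in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' LabVIEW implementation: the function names, the sampling rate, and the frequency-to-command table are all assumptions made for the example. It estimates signal power at each candidate stimulation frequency with a single-bin discrete Fourier sum and returns the command attached to the strongest frequency.

```python
import math


def power_at(signal, freq_hz, sample_rate_hz):
    """Estimate signal power at one frequency using a single-bin DFT sum."""
    re = 0.0
    im = 0.0
    for n, s in enumerate(signal):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        re += s * math.cos(angle)
        im -= s * math.sin(angle)
    return (re * re + im * im) / len(signal)


def classify(signal, sample_rate_hz, commands):
    """Pick the command whose stimulation frequency carries the most power.

    commands is a hypothetical dict mapping stimulus frequency (Hz) to an
    instruction name; it is not taken from the paper.
    """
    best = max(commands, key=lambda f: power_at(signal, f, sample_rate_hz))
    return commands[best]


# Synthetic one-second "EEG" trace dominated by a 10 Hz response.
RATE = 256
trace = [math.sin(2.0 * math.pi * 10.0 * n / RATE) for n in range(RATE)]
table = {8.0: "light", 10.0: "fan", 12.0: "tv"}
print(classify(trace, RATE, table))
```

A real SSVEP pipeline would also need filtering, harmonics, and a rejection threshold for the no-stimulus case; none of that is modeled here.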
Adopting reform-based pedagogy in post-secondary microbiology education
NASA Astrophysics Data System (ADS)
Bonner, Jeffery W.
Current emphasis on improving student learning and retention in post-secondary science education can potentially motivate veteran faculty to reconsider what is often a traditional, instructor-centered instructional model. Alternative models that foster a student-centered classroom environment are more aligned with research on how students learn. These models often incorporate active-learning opportunities that engage students in ways that passively taking notes in an instructor-centered classroom cannot. Although evidence is mounting that active-learning is an effective strategy for improving student learning and attitude, university professors, without formal pedagogical knowledge and training, can face uncertainty about where to start and how to implement these strategies. The research presented here was conducted in two parts under the same context during one semester of a post-secondary microbiology course. First, a quantitative study was conducted to compare collaborative and individual completion of a reform-based instructional strategy that utilized a student-centered, active-learning component. Students were evaluated on learning, critical thinking, and epistemological beliefs about biology. Results indicated no significant differences between treatment groups. Interestingly, the impact of active-learning implementations had positive effects on students' epistemological beliefs. This was a finding contradicting previous research in which epistemological beliefs became more novice-like in science majors enrolled in courses without an active-learning component. Study two represents one case in which a professor with a traditional instructional model became motivated to pursue instructional change in his introductory microbiology course. A single-case qualitative study was conducted to document the professor's initial effort at instructional reform. 
Results indicated that his utilization and understanding of reform-based instructional strategies improved over the course of one semester. Furthermore, this sustained effort of reform resulted in positive opinions developed by the professor regarding the use of reform-based instructional strategies in the future.
NASA Astrophysics Data System (ADS)
Brooks, John
A problem facing science educators is determining the most effective means of science instruction so that students will meet or exceed the new rigorous standards. The theoretical framework for this study was based on reform and research efforts that have informed science teachers that using constructivism is the best method of science instruction. The purpose of this study was to investigate how the constructivist method of science instruction affected student achievement and student motivation in a sixth grade science classroom. The guiding research question involved understanding which method of science instruction would be most effective at improving student achievement in science. Other sub-questions included the factors that contribute to student motivation in science and the method of science instruction students receive that affects motivation to learn science. Quantitative data were collected using a pre-test and post-test single group design. T-test and ANCOVA were used to test quantitative hypotheses. Qualitative data were collected using student reflective journals and classroom discussions. Students' perspectives were transcribed, coded and used to further inform quantitative findings. The findings of this study supported the recommendations made by science reformists that the best method of science instruction was a constructivist method. This study also found that participant comments favored constructivist taught classes. The implications for social change at the local level included potential increases in student achievement in science and possibly increased understanding that can facilitate similar changes at other schools. From a global perspective, constructivist-oriented methods might result in students becoming more interested in majoring in science at the college level and in becoming part of a scientifically literate work force.
LHCb Kalman Filter cross architecture studies
NASA Astrophysics Data System (ADS)
Cámpora Pérez, Daniel Hugo
2017-10-01
The 2020 upgrade of the LHCb detector will vastly increase the rate of collisions the Online system needs to process in software, in order to filter events in real time. 30 million collisions per second will pass through a selection chain, where each step is executed conditional to its prior acceptance. The Kalman Filter is a fit applied to all reconstructed tracks which, due to its time characteristics and early execution in the selection chain, consumes 40% of the whole reconstruction time in the current trigger software. This makes the Kalman Filter a time-critical component as the LHCb trigger evolves into a full software trigger in the Upgrade. I present a new Kalman Filter algorithm for LHCb that can efficiently make use of any kind of SIMD processor, and its design is explained in depth. Performance benchmarks are compared between a variety of hardware architectures, including x86_64 and Power8, and the Intel Xeon Phi accelerator, and the suitability of said architectures to efficiently perform the LHCb Reconstruction process is determined.
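To illustrate why a Kalman fit suits SIMD hardware, here is a toy sketch under stated assumptions: it uses a scalar state per track and a hypothetical function name, and it is not the LHCb algorithm. The point is the data layout. Each quantity is stored as its own array across tracks (structure of arrays), so the same arithmetic is applied to every track in lockstep, which is the pattern a vectorizing compiler or SIMD unit exploits.

```python
def kalman_update_batch(x, p, z, r):
    """Apply one scalar Kalman measurement update to every track in lockstep.

    x, p: per-track state estimates and their variances.
    z, r: per-track measurements and measurement variances.
    Returns new (x, p) lists; the inputs are left unchanged.
    """
    new_x = []
    new_p = []
    for xi, pi, zi, ri in zip(x, p, z, r):
        gain = pi / (pi + ri)                 # Kalman gain for this track
        new_x.append(xi + gain * (zi - xi))   # pull estimate toward measurement
        new_p.append((1.0 - gain) * pi)       # variance shrinks after update
    return new_x, new_p


states, variances = kalman_update_batch(
    [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 3.0]
)
print(states, variances)
```

The loop body has no branches and no dependence between tracks, so each iteration could map to one SIMD lane; a production fit would use full state vectors and covariance matrices per track.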
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
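The Eulerian storage scheme described above, in which particle data migrate between local memories as particles cross the grid, can be sketched as below. This is only an illustration under assumptions: unit-width cells, a dictionary standing in for per-processor local memory, free-streaming motion with no fields, and a hypothetical function name.

```python
def step(cells, nx, ny, dt):
    """Advance particles and re-bin them by cell with periodic boundaries.

    cells maps (ix, iy) to a list of particles [x, y, vx, vy]; each key
    stands in for one processor's local memory. Cell width is assumed to be 1.
    """
    moved = {}
    for particles in cells.values():
        for x, y, vx, vy in particles:
            x = (x + vx * dt) % nx            # wrap around in x
            y = (y + vy * dt) % ny            # wrap around in y
            # The extra modulo guards against rounding up to exactly nx or ny.
            key = (int(x) % nx, int(y) % ny)
            moved.setdefault(key, []).append([x, y, vx, vy])
    return moved


grid = {(127, 0): [[127.5, 0.5, 1.0, 0.0]], (0, 3): [[0.25, 3.5, -0.5, 0.0]]}
after = step(grid, 128, 128, 1.0)
print(sorted(after))
```

In the first entry the particle leaves the right edge and reappears in cell (0, 0); in the second it leaves the left edge and lands in cell (127, 3).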
Highly parallel sparse Cholesky factorization
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Schreiber, Robert
1990-01-01
Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
Multitasking runtime systems for the Cedar Multiprocessor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guzzi, M.D.
1986-07-01
The programming of a MIMD machine is more complex than for SISD and SIMD machines. The multiple computational resources of the machine must be made available to the programming language compiler and to the programmer so that multitasking programs may be written. This thesis will explore the additional complexity of programming a MIMD machine, the Cedar Multiprocessor specifically, and the multitasking runtime system necessary to provide multitasking resources to the user. First, the problem will be well defined: the Cedar machine, its operating system, the programming language, and multitasking concepts will be described. Second, a solution to the problem, called macrotasking, will be proposed. This solution provides multitasking facilities to the programmer at a very coarse level with many visible machine dependencies. Third, an alternate solution, called microtasking, will be proposed. This solution provides multitasking facilities of a much finer grain. This solution does not depend so rigidly on the specific architecture of the machine. Finally, the two solutions will be compared for effectiveness. 12 refs., 16 figs.
ERIC Educational Resources Information Center
Yee, Kevin; Hargis, Jace
2010-01-01
This article discusses the benefits of screencasts and their instructional uses. Well-known for some years to advanced technology users, Screen Capture Software (SCS) offers the promise of recording action on the computer desktop together with voiceover narration, all combined into a single movie file that can be shared, emailed, or uploaded.…
47 CFR 54.502 - Eligible services.
Code of Federal Regulations, 2012 CFR
2012-10-01
... SERVICE Universal Service Support for Schools and Libraries § 54.502 Eligible services. (a) Supported... comprise a single library branch. Discounts are not available for internal connections in non-instructional buildings of a school or school district, or in administrative buildings of a library, to the extent that a...
Observation of the Intrinsic Bandgap Behavior in As-Grown Epitaxial Twisted Graphene (Postprint)
2015-01-06
the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing this...intensity, which becomes comparable to that of G band intensity and is now a symmetric, single Lorentzian peak. The third difference is the 2D band peak...spectrum with SiC signal subtracted. (c,d) Variation of Raman C60/GF FLG spectra as a function of C60 deposition time. A single Lorentzian 2D band was
NASA Astrophysics Data System (ADS)
Ford, Gregory Scott
2007-12-01
Title. Effect of computer-aided instruction versus traditional modes on student PT's learning musculoskeletal special tests. Problem. Lack of quantitative evidence to support the use of computer-aided instruction (CAI) in PT education for both the cognitive and psychomotor domains and lack of qualitative support as to an understanding why CAI may or may not be effective. Design. 3 group single-blind pre-test, immediate post-test, final post-test repeated measures with qualitative survey for the CAI group. Methods. Subjects were randomly assigned to CAI, live demonstration or textbook learning groups. Three novel special tests were instructed. Analysis of performance on written and practical examinations was conducted across the 3 repeated measures. A qualitative survey was completed by the CAI group post intervention. Results. CAI is equally as effective as live demonstration and textbook learning of musculoskeletal special tests in the cognitive domain, however, CAI was superior to live demonstration and textbook instruction at final post-testing. Significance. The significance of this research is that a gap in the literature of PT education needs to be bridged as it pertains to the effect of CAI on learning in both the cognitive and psychomotor domains as well as attempt to understand why CAI results in certain student performance. The methods of this study allowed for a wide range of generalizability to any and all PT programs across the country.
Science teacher orientations and PCK across science topics in grade 9 earth science
NASA Astrophysics Data System (ADS)
Campbell, Todd; Melville, Wayne; Goodwin, Dawne
2017-07-01
While the literature is replete with studies examining teacher knowledge and pedagogical content knowledge (PCK), few studies have investigated how science teacher orientations (STOs) shape classroom instruction. Therefore, this research explores the interplay between STOs and the topic specificity of PCK across two science topics within a grade 9 earth science course. Through interviews and observations of one teacher's classroom across two sequentially taught topics, this research contests the notion that teachers hold a single way of conceptualising science teaching and learning. In this, we consider if multiple ontologies can provide potential explanatory power for characterising instructional enactments. In earlier work with the teacher in this study, using generic interview prompts and general discussions about science teaching and learning, we accepted the existence of a unitary STO and its promise of consistent reformed instruction in the classroom. However, upon close examination of instruction focused on different science topics, evidence was found to demonstrate the explanatory power of multiple ontologies for shaping characteristically different epistemological constructions across science topics. This research points to the need for care in generalising about teacher practice, as it reveals that a teacher's practice, and orientation, can vary, dependent on the context and science topics taught.
ERIC Educational Resources Information Center
Koza, Julia Eklund
1993-01-01
Reports on an analysis of gender-related references appearing in the "Music Supervisors' Journal" from 1914-24. Finds that both coeducational and single-sex musical organizations were discussed and that vocal and instrumental instruction for boys and girls was advocated. (CFR)
Screen Layout Design: Research into the Overall Appearance of the Screen.
ERIC Educational Resources Information Center
Grabinger, R. Scott
1989-01-01
Examines the current state of research into the visual effects of screen designs used in computer-assisted instruction and suggests areas for future efforts. Topics discussed include technical elements and comprehensibility elements in layout design; single element and multiple element research methodologies; dependent variables; and learning…
Developing Instructional Leaders through Assistant Principals' Academy: A Partnership for Success
ERIC Educational Resources Information Center
Gurley, D. Keith; Anast-May, Linda; Lee, H. T.
2015-01-01
This article describes findings from a single-case qualitative study of a unique 2-year professional development academy for practicing assistant principals designed and implemented in partnership between school district personnel and university educational leadership faculty members. The study was conducted based on the theoretical framework of…
Effects of Collaborative Online Learning on EFL Learners' Writing Performance and Self-Efficacy
ERIC Educational Resources Information Center
Tai, Hung-Cheng
2016-01-01
This study explored the effects of collaborative writing instruction on undergraduate nursing students' writing performance and self-efficacy beliefs within an online learning system. A single-group experimental study utilized two instruments, the NCEEC (National College Entrance Examination Center) writing grading criteria (the SRCT) and a…
One Approach to Teaching the Specific Language Disabled Adult Language Arts.
ERIC Educational Resources Information Center
Peterson, Binnie L.
1981-01-01
One approach never before used in adult language arts instruction--the Slingerland Simultaneous Multisensory Technique--has been found useful for specific language disabled adults in multisensory programs at Anchorage Community College. The Slingerland method builds from single sight, sound, and feel of letters through combinations, encoding,…
Conceptually Based Vocabulary Intervention: Second Graders' Development of Vocabulary Words
ERIC Educational Resources Information Center
Dimling, Lisa M.
2010-01-01
An instructional strategy was investigated that addressed the needs of deaf and hard of hearing students through a conceptually based sign language vocabulary intervention. A single-subject multiple-baseline design was used to determine the effects of the vocabulary intervention on word recognition, production, and comprehension. Six students took…
Achieving Balance: Secondary Physical Education Gender-Grouping Options
ERIC Educational Resources Information Center
Gabbei, Ritchie
2004-01-01
This article provides options and a rationale for expanding gender-grouping considerations to include single-gender, coed, and combination strategies for instruction in secondary physical education classes. This rationale is based on empirical evidence that suggests that female students are denied equal opportunity to achieve learning goals during…
Narrated Animated Solution Videos in a Mastery Setting
ERIC Educational Resources Information Center
Schroeder, Noah; Gladding, Gary; Gutmann, Brianne; Stelzer, Timothy
2015-01-01
Narrated animated solution videos were implemented in a clinical study that compared a mastery setting that employed repeated cycles of testing with instructional support to a group that had a single opportunity to experience the materials. The mastery setting students attempted sequential questions sets on a topic, with animated solutions between…
Say again? How complexity and format of air traffic control instructions affect pilot recall
DOT National Transportation Integrated Search
1999-01-01
This study compared the recall of ATC information presented in either grouped or sequential format in a part-task simulation. It also tested the effect of complexity of ATC clearances on recall, that is, how many pieces of information a single tr...
Expanding Literacy for Learners with Intellectual Disabilities: The Role of Supported eText
ERIC Educational Resources Information Center
Douglas, Karen H.; Ayres, Kevin M.; Langone, John; Bell, Virginia; Meade, Cara
2009-01-01
A series of single-subject experiments were conducted to evaluate the effects of presentational, translational, illustrative, instructional, and summarizing supports on the reading and listening comprehension of students with moderate intellectual disabilities. The specific eText supports under investigation included digitized voice and…
ERIC Educational Resources Information Center
Gage, Nicholas A.; Grasley-Boy, Nicolette M.; MacSuga-Gage, Ashley S.
2018-01-01
Effective classroom instruction is contingent upon successful classroom management. Unfortunately, not all teachers successfully manage classroom behavior and need in-service professional development. In this study, we replicated a targeted professional development approach that included a brief one-on-one training session and emailed visual…
Writing for Distance Education. Samples Booklet.
ERIC Educational Resources Information Center
International Extension Coll., Cambridge (England).
Approaches to the format, design, and layout of printed instructional materials for distance education are illustrated in 36 samples designed to accompany the manual, "Writing for Distance Education." Each sample is presented on a single page with a note pointing out its key features. Features illustrated include use of typescript layout, a comic…
ERIC Educational Resources Information Center
Rex, Jim; Chadwell, David
2009-01-01
Public schools are offering more choices because educators increasingly have come to believe that a broader instructional menu brings positive results for everyone involved. The days of parents simply signing up their children at the neighborhood school for a one-size-fits-all curriculum are nearly over. In South Carolina, parents in high-choice…
Educational Media and Technology Yearbook, 1993. Volume 19.
ERIC Educational Resources Information Center
Ely, Donald P., Ed.; Minor, Barbara B., Ed.
This yearbook is designed to provide media and instructional technology professionals with an up-to-date, single-source overview and assessment of the field of educational technology. It offers organized access to the hot topics, trends, issues, and advancements in the field, with comprehensive coverage of developments in theory, hardware,…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-02
... Procedures C. Review of Single-Voltage External Power Supply Test Procedure D. Multiple-Voltage External...) Deletions of Existing Definitions (b) Revisions to Existing Definitions (c) Additions of New Definitions 4. Test Apparatus and General Instructions (a) Confidence Intervals (b) Temperature (c) AC Input Voltage...
Cognitive Factors in Sexual Arousal: The Role of Distraction
ERIC Educational Resources Information Center
Geer, James H.; Fuhr, Robert
1976-01-01
Four groups of male undergraduates were instructed to perform complex cognitive operations when randomly presented single digits of a dichotic listening paradigm. An erotic tape recording was played into the nonattended ear. Sexual arousal varied directly as a function of the complexity of the distracting cognitive operations. (Author)
Relating French Immersion Teacher Practices to Better Student Oral Production
ERIC Educational Resources Information Center
Haj-Broussard, Michelle; Olson Beal, Heather K.; Boudreaux, Nicole
2017-01-01
This study examined seven Louisiana kindergarten immersion teachers' practices to evaluate students' oral target language production and compare the oral production elicited when different instructional practices were used over a single semester. Three rounds of three 20-minute observations in three different contexts--circle time, direct…
The Neural Basis of Cognitive Control: Response Selection and Inhibition
ERIC Educational Resources Information Center
Goghari, Vina M.; MacDonald, Angus W., III
2009-01-01
The functional neuroanatomy of tasks that recruit different forms of response selection and inhibition has, to our knowledge, never been directly addressed in a single fMRI study using similar stimulus-response paradigms where differences between scanning time and sequence, stimuli, and experimenter instructions were minimized. Twelve right-handed…
Muscle Responses to Stimulation of the Tadpole Tail
ERIC Educational Resources Information Center
Funkhouser, Anne
1976-01-01
Describes use of tail muscles and spinal cord in the tadpole as an alternative source for muscle-and-nerve experiments. Includes explanation of simple dissection and preparation of tadpole; instructions for experiments such as threshold, strength of stimulus, frequency of stimulus, single twitch, tetanus, fatigue, effects of temperature on…
Continuing the Classroom Community: Suggestions for Using Online Discussion Boards
ERIC Educational Resources Information Center
Jewell, Vivian
2005-01-01
A considerable use of technology to supplement classroom instruction could improve student learning. A high school teacher reveals the ways in which the use of online discussions of literature assignments increases student participation by extending dialogue beyond the physical space and time of a single class.
14 CFR 135.421 - Additional maintenance requirements.
Code of Federal Regulations, 2011 CFR
2011-01-01
... programs, or a program approved by the Administrator, for each aircraft engine, propeller, rotor, and each... instructions set forth by the manufacturer as required by this chapter for the aircraft, aircraft engine, propeller, rotor or item of emergency equipment. (c) For each single engine aircraft to be used in passenger...
14 CFR 135.421 - Additional maintenance requirements.
Code of Federal Regulations, 2014 CFR
2014-01-01
... programs, or a program approved by the Administrator, for each aircraft engine, propeller, rotor, and each... instructions set forth by the manufacturer as required by this chapter for the aircraft, aircraft engine, propeller, rotor or item of emergency equipment. (c) For each single engine aircraft to be used in passenger...
14 CFR 135.421 - Additional maintenance requirements.
Code of Federal Regulations, 2012 CFR
2012-01-01
... programs, or a program approved by the Administrator, for each aircraft engine, propeller, rotor, and each... instructions set forth by the manufacturer as required by this chapter for the aircraft, aircraft engine, propeller, rotor or item of emergency equipment. (c) For each single engine aircraft to be used in passenger...
14 CFR 135.421 - Additional maintenance requirements.
Code of Federal Regulations, 2010 CFR
2010-01-01
... programs, or a program approved by the Administrator, for each aircraft engine, propeller, rotor, and each... instructions set forth by the manufacturer as required by this chapter for the aircraft, aircraft engine, propeller, rotor or item of emergency equipment. (c) For each single engine aircraft to be used in passenger...
14 CFR 135.421 - Additional maintenance requirements.
Code of Federal Regulations, 2013 CFR
2013-01-01
... programs, or a program approved by the Administrator, for each aircraft engine, propeller, rotor, and each... instructions set forth by the manufacturer as required by this chapter for the aircraft, aircraft engine, propeller, rotor or item of emergency equipment. (c) For each single engine aircraft to be used in passenger...
Basic Reading Instruction for Students in Automotive Occupations. Student's Handbook.
ERIC Educational Resources Information Center
General Behavioral Systems, Inc., Torrance, CA.
The basic reading course outlined in this student handbook emphasizes the decoding process. The contents consist of a letter-and-sound spelling chart and 87 course modules which are based on single-letter and letter-combination sounds. Many of the modules include exercises, and some contain reading material. (JM)
Learning through Aviation. Final Report.
ERIC Educational Resources Information Center
Conway, Lee
This study summarizes the effects of an educational experiment which used a light, single engine airplane to generate basic instructional and behavioral changes in an inner city junior high school class. The project involved 25 disadvantaged area, 13-year-old boys and their parents, four regular staff teachers, two pilot instructors and a college…
Mindful Listening Instruction: Does It Make a Difference
ERIC Educational Resources Information Center
Anderson, William Todd
2013-01-01
This study examines the effect of mindfulness on student listening. Mindfulness is defined as "the process of noticing novel distinctions." Fifth grade students (N = 38) at a single school participated in this study, which used a posttest-only, random selection experimental design. The independent variable was exposure to mindful…
The Imperfect Art of Designing Online Courses
ERIC Educational Resources Information Center
Berrett, Dan
2012-01-01
Growing pressure to provide more virtual instruction is spurring efforts to design large courses that balance standardization of content with flexibility for instructors. Each course uses a common template, which sets out lesson objectives, lecture material, practice activities, and assessments. The process results in a single version of each…
ERIC Educational Resources Information Center
Texas Tech Univ., Lubbock. Home Economics Curriculum Center.
This student activities book is designed to allow students to examine the multiple roles of contemporary dual-earner couples. It examines the concerns of the two-career couple and the sharing of male and female responsibilities as well as certain roles of employed single adults. Intended for use independently or with classroom instruction, this…
Choice and Effects of Instrument Sound in Aural Training
ERIC Educational Resources Information Center
Loh, Christian Sebastian
2007-01-01
A musical note produced through the vibration of a single string is psychoacoustically simpler/purer than that produced via multiple-string vibration. Does the psychoacoustics of instrument sound have any effect on learning outcomes in music instruction? This study investigated the effect of two psychoacoustically distinct instrument sounds on…
Building Software Development Capacity to Advance the State of Educational Technology
ERIC Educational Resources Information Center
Luterbach, Kenneth J.
2013-01-01
Educational technologists may advance the state of the field by increasing capacity to develop software tools and instructional applications. Presently, few academic programs in educational technology require even a single computer programming course. Further, the educational technologists who develop software generally work independently or in…
More than Words: Probing the Terms Undergraduate Students Use to Describe Their Instructors
ERIC Educational Resources Information Center
Kendall, K. Denise; Schussler, Elisabeth E.
2013-01-01
Undergraduates often use single words to describe their instructors, including "boring," "enthusiastic," and "organized," but what instructional behaviors cause students to use these words? This study utilized interviews and an online survey to ask students to translate commonly used instructor descriptions into their…
Advanced Electronics Systems 1, Industrial Electronics 3: 9327.03.
ERIC Educational Resources Information Center
Dade County Public Schools, Miami, FL.
The 135 clock-hour course for the 12th year consists of outlines for blocks of instruction on transistor applications to basic circuits, principles of single sideband communications, maintenance practices, preparation for FCC licenses, application of circuits to advanced electronic systems, nonsinusoidal wave shapes, multivibrators, and blocking…
Learning to Leverage Children's Multiple Mathematical Knowledge Bases in Mathematics Instruction
ERIC Educational Resources Information Center
Turner, Erin E.; Foote, Mary Q.; Stoehr, Kathleen Jablon; McDuffie, Amy Roth; Aguirre, Julia Maria; Bartell, Tonya Gau; Drake, Corey
2016-01-01
In this article, the authors explore prospective elementary teachers' engagement with and reflection on activities they conducted to learn about a single child from their practicum classroom. Through these activities, prospective teachers learned about their child's mathematical thinking and the interests, competencies, and resources she or he…
Principals: Don't Settle for Rolling the Boulder
ERIC Educational Resources Information Center
Hess, Frederick M.
2013-01-01
By encouraging a single-minded focus on instructional leadership, the training, socializing, and mentoring of school leaders has unwittingly fostered a culture of caged leadership. Leaders are expected to succeed via culture, capacity building, coaching, and consensus--no matter the obstacles in their path. Those are all good things, the author…
Computer Graphics and Metaphorical Elaboration for Learning Science Concepts.
ERIC Educational Resources Information Center
ChanLin, Lih-Juan; Chan, Kung-Chi
This study explores the instructional impact of using computer multimedia to integrate metaphorical verbal information into graphical representations of biotechnology concepts. The combination of text and graphics into a single metaphor makes concepts dual-coded, and therefore more comprehensible and memorable for the student. Visual stimuli help…
An Artificial Intelligence Tutor: A Supplementary Tool for Teaching and Practicing Braille
ERIC Educational Resources Information Center
McCarthy, Tessa; Rosenblum, L. Penny; Johnson, Benny G.; Dittel, Jeffrey; Kearns, Devin M.
2016-01-01
Introduction: This study evaluated the usability and effectiveness of an artificial intelligence Braille Tutor designed to supplement the instruction of students with visual impairments as they learned to write braille contractions. Methods: A mixed-methods design was used, which incorporated a single-subject, adapted alternating treatments design…
Classroom-Based Measurement and Portfolio Assessment.
ERIC Educational Resources Information Center
Nolet, Victor
1992-01-01
Portfolio assessment involves collecting multiple forms of data to support inferences about student performance in skill or content areas that cannot be sampled directly by a single measure. Portfolio assessment can help to clarify and individualize instructional goals for regular and special education students as well as suggest research and…