Sample records for parallel design study

  1. A Parallel Trade Study Architecture for Design Optimization of Complex Systems

    NASA Technical Reports Server (NTRS)

    Kim, Hongman; Mullins, James; Ragon, Scott; Soremekun, Grant; Sobieszczanski-Sobieski, Jaroslaw

    2005-01-01

    Design of a successful product requires evaluating many design alternatives in a limited design cycle time. This can be achieved through leveraging design space exploration tools and available computing resources on the network. This paper presents a parallel trade study architecture to integrate trade study clients and computing resources on a network using Web services. The parallel trade study solution is demonstrated to accelerate design of experiments, genetic algorithm optimization, and a cost as an independent variable (CAIV) study for a space system application.

  2. The Importance of Considering Differences in Study Design in Network Meta-analysis: An Application Using Anti-Tumor Necrosis Factor Drugs for Ulcerative Colitis.

    PubMed

    Cameron, Chris; Ewara, Emmanuel; Wilson, Florence R; Varu, Abhishek; Dyrda, Peter; Hutton, Brian; Ingham, Michael

    2017-11-01

    Adaptive trial designs present a methodological challenge when performing network meta-analysis (NMA), as data from such adaptive trial designs differ from conventional parallel design randomized controlled trials (RCTs). We aim to illustrate the importance of considering study design when conducting an NMA. Three NMAs comparing anti-tumor necrosis factor drugs for ulcerative colitis were compared and the analyses replicated using Bayesian NMA. The NMA comprised 3 RCTs comparing 4 treatments (adalimumab 40 mg, golimumab 50 mg, golimumab 100 mg, infliximab 5 mg/kg) and placebo. We investigated the impact of incorporating differences in the study design among the 3 RCTs and presented 3 alternative methods on how to convert outcome data derived from one form of adaptive design to more conventional parallel RCTs. Combining RCT results without considering variations in study design resulted in effect estimates that were biased against golimumab. In contrast, using the 3 alternative methods to convert outcome data from one form of adaptive design to a format more consistent with conventional parallel RCTs facilitated more transparent consideration of differences in study design. This approach is more likely to yield appropriate estimates of comparative efficacy when conducting an NMA, which includes treatments that use an alternative study design. RCTs based on adaptive study designs should not be combined with traditional parallel RCT designs in NMA. We have presented potential approaches to convert data from one form of adaptive design to more conventional parallel RCTs to facilitate transparent and less-biased comparisons.

  3. A Comparison of Parallelism in Interface Designs for Computer-Based Learning Environments

    ERIC Educational Resources Information Center

    Min, Rik; Yu, Tao; Spenkelink, Gerd; Vos, Hans

    2004-01-01

    In this paper we discuss an experiment that was carried out with a prototype, designed in conformity with the concept of parallelism and the Parallel Instruction theory (the PI theory). We designed this prototype with five different interfaces, and ran an empirical study in which 18 participants completed an abstract task. The five basic designs…

  4. ARTS III/Parallel Processor Design Study

    DOT National Transportation Integrated Search

    1975-04-01

    It was the purpose of this design study to investigate the feasibility, suitability, and cost-effectiveness of augmenting the ARTS III failsafe/failsoft multiprocessor system with a form of parallel processor to accomodate a large growth in air traff...

  5. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of Delta project (which objective is to provide a facility to allow rapid prototyping of parallelized compilers that can target toward different machine architectures) is summarized. Included are the surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  6. A CS1 pedagogical approach to parallel thinking

    NASA Astrophysics Data System (ADS)

    Rague, Brian William

    Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of any initial three-week CS1 level module when compared with student comprehension levels just prior to starting the course. Survey results measured during the ninth week of the course reveal that performance levels remained high compared to pre-course performance scores. A second result produced by this study reveals no statistically significant interaction effect between the intervention method and student performance as measured by the evaluation instrument over three separate testing periods. However, visual inspection of survey score trends and the low p-value generated by the interaction analysis (0.062) indicate that further studies may verify improved concept retention levels for the lecture w/PAT group.

  7. Slice-selective RF pulses for in vivo B1+ inhomogeneity mitigation at 7 tesla using parallel RF excitation with a 16-element coil.

    PubMed

    Setsompop, Kawin; Alagappan, Vijayanand; Gagoski, Borjan; Witzel, Thomas; Polimeni, Jonathan; Potthast, Andreas; Hebrank, Franz; Fontius, Ulrich; Schmitt, Franz; Wald, Lawrence L; Adalsteinsson, Elfar

    2008-12-01

    Slice-selective RF waveforms that mitigate severe B1+ inhomogeneity at 7 Tesla using parallel excitation were designed and validated in a water phantom and human studies on six subjects using a 16-element degenerate stripline array coil driven with a butler matrix to utilize the eight most favorable birdcage modes. The parallel RF waveform design applied magnitude least-squares (MLS) criteria with an optimized k-space excitation trajectory to significantly improve profile uniformity compared to conventional least-squares (LS) designs. Parallel excitation RF pulses designed to excite a uniform in-plane flip angle (FA) with slice selection in the z-direction were demonstrated and compared with conventional sinc-pulse excitation and RF shimming. In all cases, the parallel RF excitation significantly mitigated the effects of inhomogeneous B1+ on the excitation FA. The optimized parallel RF pulses for human B1+ mitigation were only 67% longer than a conventional sinc-based excitation, but significantly outperformed RF shimming. For example the standard deviations (SDs) of the in-plane FA (averaged over six human studies) were 16.7% for conventional sinc excitation, 13.3% for RF shimming, and 7.6% for parallel excitation. This work demonstrates that excitations with parallel RF systems can provide slice selection with spatially uniform FAs at high field strengths with only a small pulse-duration penalty. (c) 2008 Wiley-Liss, Inc.

  8. Parallel Processing at the High School Level.

    ERIC Educational Resources Information Center

    Sheary, Kathryn Anne

    This study investigated the ability of high school students to cognitively understand and implement parallel processing. Data indicates that most parallel processing is being taught at the university level. Instructional modules on C, Linux, and the parallel processing language, P4, were designed to show that high school students are highly…

  9. Testing for carryover effects after cessation of treatments: a design approach.

    PubMed

    Sturdevant, S Gwynn; Lumley, Thomas

    2016-08-02

    Recently, trials addressing noisy measurements with diagnosis occurring by exceeding thresholds (such as diabetes and hypertension) have been published which attempt to measure carryover - the impact that treatment has on an outcome after cessation. The design of these trials has been criticised and simulations have been conducted which suggest that the parallel-designs used are not adequate to test this hypothesis; two solutions are that either a differing parallel-design or a cross-over design could allow for diagnosis of carryover. We undertook a systematic simulation study to determine the ability of a cross-over or a parallel-group trial design to detect carryover effects on incident hypertension in a population with prehypertension. We simulated blood pressure and focused on varying criteria to diagnose systolic hypertension. Using the difference in cumulative incidence hypertension to analyse parallel-group or cross-over trials resulted in none of the designs having acceptable Type I error rate. Under the null hypothesis of no carryover the difference is well above the nominal 5 % error rate. When a treatment is effective during the intervention period, reliable testing for a carryover effect is difficult. Neither parallel-group nor cross-over designs using the difference in cumulative incidence appear to be a feasible approach. Future trials should ensure their design and analysis is validated by simulation.

  10. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

    1995-01-01

    High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.

  11. A design methodology for portable software on parallel computers

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Miller, Keith W.; Chrisman, Dan A.

    1993-01-01

    This final report for research that was supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second difficulty is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently. Trying to keep one's software on the current high performance hardware, a software developer almost continually faces yet another expensive software transportation. The problem of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two. A more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research associated with the issues of software portability and high performance. The list of research tasks are specified in the proposal. The proposal 'A Design Methodology for Portable Software on Parallel Computers' is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof-of-concept for the Ph.D. dissertation. We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven.

  12. Performance Evaluation of Parallel Branch and Bound Search with the Intel iPSC (Intel Personal SuperComputer) Hypercube Computer.

    DTIC Science & Technology

    1986-12-01

    17 III. Analysis of Parallel Design ................................................ 18 Parallel Abstract Data ...Types ........................................... 18 Abstract Data Type .................................................. 19 Parallel ADT...22 Data -Structure Design ........................................... 23 Object-Oriented Design

  13. Type synthesis for 4-DOF parallel press mechanism using GF set theory

    NASA Astrophysics Data System (ADS)

    He, Jun; Gao, Feng; Meng, Xiangdun; Guo, Weizhong

    2015-07-01

    Parallel mechanisms is used in the large capacity servo press to avoid the over-constraint of the traditional redundant actuation. Currently, the researches mainly focus on the performance analysis for some specific parallel press mechanisms. However, the type synthesis and evaluation of parallel press mechanisms is seldom studied, especially for the four degrees of freedom(DOF) press mechanisms. The type synthesis of 4-DOF parallel press mechanisms is carried out based on the generalized function(GF) set theory. Five design criteria of 4-DOF parallel press mechanisms are firstly proposed. The general procedure of type synthesis of parallel press mechanisms is obtained, which includes number synthesis, symmetrical synthesis of constraint GF sets, decomposition of motion GF sets and design of limbs. Nine combinations of constraint GF sets of 4-DOF parallel press mechanisms, ten combinations of GF sets of active limbs, and eleven combinations of GF sets of passive limbs are synthesized. Thirty-eight kinds of press mechanisms are presented and then different structures of kinematic limbs are designed. Finally, the geometrical constraint complexity( GCC), kinematic pair complexity( KPC), and type complexity( TC) are proposed to evaluate the press types and the optimal press type is achieved. The general methodologies of type synthesis and evaluation for parallel press mechanism are suggested.

  14. Fast I/O for Massively Parallel Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew T.

    1996-01-01

    The two primary goals for this report were the design, contruction and modeling of parallel disk arrays for scientific visualization and animation, and a study of the IO requirements of highly parallel applications. In addition, further work in parallel display systems required to project and animate the very high-resolution frames resulting from our supercomputing simulations in ocean circulation and compressible gas dynamics.

  15. HPCC Methodologies for Structural Design and Analysis on Parallel and Distributed Computing Platforms

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1998-01-01

    In this grant, we have proposed a three-year research effort focused on developing High Performance Computation and Communication (HPCC) methodologies for structural analysis on parallel processors and clusters of workstations, with emphasis on reducing the structural design cycle time. Besides consolidating and further improving the FETI solver technology to address plate and shell structures, we have proposed to tackle the following design related issues: (a) parallel coupling and assembly of independently designed and analyzed three-dimensional substructures with non-matching interfaces, (b) fast and smart parallel re-analysis of a given structure after it has undergone design modifications, (c) parallel evaluation of sensitivity operators (derivatives) for design optimization, and (d) fast parallel analysis of mildly nonlinear structures. While our proposal was accepted, support was provided only for one year.

  16. Minimum envelope roughness pulse design for reduced amplifier distortion in parallel excitation.

    PubMed

    Grissom, William A; Kerr, Adam B; Stang, Pascal; Scott, Greig C; Pauly, John M

    2010-11-01

    Parallel excitation uses multiple transmit channels and coils, each driven by independent waveforms, to afford the pulse designer an additional spatial encoding mechanism that complements gradient encoding. In contrast to parallel reception, parallel excitation requires individual power amplifiers for each transmit channel, which can be cost prohibitive. Several groups have explored the use of low-cost power amplifiers for parallel excitation; however, such amplifiers commonly exhibit nonlinear memory effects that distort radio frequency pulses. This is especially true for pulses with rapidly varying envelopes, which are common in parallel excitation. To overcome this problem, we introduce a technique for parallel excitation pulse design that yields pulses with smoother envelopes. We demonstrate experimentally that pulses designed with the new technique suffer less amplifier distortion than unregularized pulses and pulses designed with conventional regularization.

  17. Work stealing for GPU-accelerated parallel programs in a global address space framework: WORK STEALING ON GPU-ACCELERATED SYSTEMS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

    Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a functionmore » of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain.« less

  18. Work stealing for GPU-accelerated parallel programs in a global address space framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arafat, Humayun; Dinan, James; Krishnamoorthy, Sriram

    Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a functionmore » of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain« less

  19. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  20. [Design and study of parallel computing environment of Monte Carlo simulation for particle therapy planning using a public cloud-computing infrastructure].

    PubMed

    Yokohama, Noriya

    2013-07-01

    This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.

  1. Survival distributions impact the power of randomized placebo-phase design and parallel groups randomized clinical trials.

    PubMed

    Abrahamyan, Lusine; Li, Chuan Silvia; Beyene, Joseph; Willan, Andrew R; Feldman, Brian M

    2011-03-01

    The study evaluated the power of the randomized placebo-phase design (RPPD)-a new design of randomized clinical trials (RCTs), compared with the traditional parallel groups design, assuming various response time distributions. In the RPPD, at some point, all subjects receive the experimental therapy, and the exposure to placebo is for only a short fixed period of time. For the study, an object-oriented simulation program was written in R. The power of the simulated trials was evaluated using six scenarios, where the treatment response times followed the exponential, Weibull, or lognormal distributions. The median response time was assumed to be 355 days for the placebo and 42 days for the experimental drug. Based on the simulation results, the sample size requirements to achieve the same level of power were different under different response time to treatment distributions. The scenario where the response times followed the exponential distribution had the highest sample size requirement. In most scenarios, the parallel groups RCT had higher power compared with the RPPD. The sample size requirement varies depending on the underlying hazard distribution. The RPPD requires more subjects to achieve a similar power to the parallel groups design. Copyright © 2011 Elsevier Inc. All rights reserved.

  2. Symplectic molecular dynamics simulations on specially designed parallel computers.

    PubMed

    Borstnik, Urban; Janezic, Dusanka

    2005-01-01

    We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.

  3. Research on Parallel Three Phase PWM Converters base on RTDS

    NASA Astrophysics Data System (ADS)

    Xia, Yan; Zou, Jianxiao; Li, Kai; Liu, Jingbo; Tian, Jun

    2018-01-01

    Converters parallel operation can increase capacity of the system, but it may lead to potential zero-sequence circulating current, so the control of circulating current was an important goal in the design of parallel inverters. In this paper, the Real Time Digital Simulator (RTDS) is used to model the converters parallel system in real time and study the circulating current restraining. The equivalent model of two parallel converters and zero-sequence circulating current(ZSCC) were established and analyzed, then a strategy using variable zero vector control was proposed to suppress the circulating current. For two parallel modular converters, hardware-in-the-loop(HIL) study based on RTDS and practical experiment were implemented, results prove that the proposed control strategy is feasible and effective.

  4. Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

    NASA Astrophysics Data System (ADS)

    Work, Paul R.

    1991-12-01

    This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.

  5. A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations

    NASA Technical Reports Server (NTRS)

    Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw

    2005-01-01

    A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel e ciency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.

  6. An embedded multi-core parallel model for real-time stereo imaging

    NASA Astrophysics Data System (ADS)

    He, Wenjing; Hu, Jian; Niu, Jingyu; Li, Chuanrong; Liu, Guangyu

    2018-04-01

    The real-time processing based on embedded system will enhance the application capability of stereo imaging for LiDAR and hyperspectral sensor. The task partitioning and scheduling strategies for embedded multiprocessor system starts relatively late, compared with that for PC computer. In this paper, aimed at embedded multi-core processing platform, a parallel model for stereo imaging is studied and verified. After analyzing the computing amount, throughout capacity and buffering requirements, a two-stage pipeline parallel model based on message transmission is established. This model can be applied to fast stereo imaging for airborne sensors with various characteristics. To demonstrate the feasibility and effectiveness of the parallel model, a parallel software was designed using test flight data, based on the 8-core DSP processor TMS320C6678. The results indicate that the design performed well in workload distribution and had a speed-up ratio up to 6.4.

  7. Displacement and deformation measurement for large structures by camera network

    NASA Astrophysics Data System (ADS)

    Shang, Yang; Yu, Qifeng; Yang, Zhen; Xu, Zhiqiang; Zhang, Xiaohu

    2014-03-01

    A displacement and deformation measurement method for large structures by a series-parallel connection camera network is presented. By taking the dynamic monitoring of a large-scale crane in lifting operation as an example, a series-parallel connection camera network is designed, and the displacement and deformation measurement method by using this series-parallel connection camera network is studied. The movement range of the crane body is small, and that of the crane arm is large. The displacement of the crane body, the displacement of the crane arm relative to the body and the deformation of the arm are measured. Compared with a pure series or parallel connection camera network, the designed series-parallel connection camera network can be used to measure not only the movement and displacement of a large structure but also the relative movement and deformation of some interesting parts of the large structure by a relatively simple optical measurement system.

  8. Aerodynamic Shape Optimization of Supersonic Aircraft Configurations via an Adjoint Formulation on Parallel Computers

    NASA Technical Reports Server (NTRS)

    Reuther, James; Alonso, Juan Jose; Rimlinger, Mark J.; Jameson, Antony

    1996-01-01

    This work describes the application of a control theory-based aerodynamic shape optimization method to the problem of supersonic aircraft design. The design process is greatly accelerated through the use of both control theory and a parallel implementation on distributed memory computers. Control theory is employed to derive the adjoint differential equations whose solution allows for the evaluation of design gradient information at a fraction of the computational cost required by previous design methods. The resulting problem is then implemented on parallel distributed memory architectures using a domain decomposition approach, an optimized communication schedule, and the MPI (Message Passing Interface) Standard for portability and efficiency. The final result achieves very rapid aerodynamic design based on higher order computational fluid dynamics methods (CFD). In our earlier studies, the serial implementation of this design method was shown to be effective for the optimization of airfoils, wings, wing-bodies, and complex aircraft configurations using both the potential equation and the Euler equations. In our most recent paper, the Euler method was extended to treat complete aircraft configurations via a new multiblock implementation. Furthermore, during the same conference, we also presented preliminary results demonstrating that this basic methodology could be ported to distributed memory parallel computing architectures. In this paper, our concern will be to demonstrate that the combined power of these new technologies can be used routinely in an industrial design environment by applying it to the case study of the design of typical supersonic transport configurations. A particular difficulty of this test case is posed by the propulsion/airframe integration.

  9. Parallel optimization algorithms and their implementation in VLSI design

    NASA Technical Reports Server (NTRS)

    Lee, G.; Feeley, J. J.

    1991-01-01

    Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.

  10. TECA: A Parallel Toolkit for Extreme Climate Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prabhat, Mr; Ruebel, Oliver; Byna, Surendra

    2012-03-12

    We present TECA, a parallel toolkit for detecting extreme events in large climate datasets. Modern climate datasets expose parallelism across a number of dimensions: spatial locations, timesteps and ensemble members. We design TECA to exploit these modes of parallelism and demonstrate a prototype implementation for detecting and tracking three classes of extreme events: tropical cyclones, extra-tropical cyclones and atmospheric rivers. We process a modern TB-sized CAM5 simulation dataset with TECA, and demonstrate good runtime performance for the three case studies.

  11. Supersonic civil airplane study and design: Performance and sonic boom

    NASA Technical Reports Server (NTRS)

    Cheung, Samson

    1995-01-01

    Since aircraft configuration plays an important role in aerodynamic performance and sonic boom shape, the configuration of the next generation supersonic civil transport has to be tailored to meet high aerodynamic performance and low sonic boom requirements. Computational fluid dynamics (CFD) can be used to design airplanes to meet these dual objectives. The work and results in this report are used to support NASA's High Speed Research Program (HSRP). CFD tools and techniques have been developed for general usages of sonic boom propagation study and aerodynamic design. Parallel to the research effort on sonic boom extrapolation, CFD flow solvers have been coupled with a numeric optimization tool to form a design package for aircraft configuration. This CFD optimization package has been applied to configuration design on a low-boom concept and an oblique all-wing concept. A nonlinear unconstrained optimizer for Parallel Virtual Machine has been developed for aerodynamic design and study.

  12. Experimental Studies Of Pilot Performance At Collision Avoidance During Closely Spaced Parallel Approaches

    NASA Technical Reports Server (NTRS)

    Pritchett, Amy R.; Hansman, R. John

    1997-01-01

    Efforts to increase airport capacity include studies of aircraft systems that would enable simultaneous approaches to closely spaced parallel runway in Instrument Meteorological Conditions (IMC). The time-critical nature of a parallel approach results in key design issues for current and future collision avoidance systems. Two part-task flight simulator studies have examined the procedural and display issues inherent in such a time-critical task, the interaction of the pilot with a collision avoidance system, and the alerting criteria and avoidance maneuvers preferred by subjects.

  13. Centrifugal multiplexing fixed-volume dispenser on a plastic lab-on-a-disk for parallel biochemical single-end-point assays

    PubMed Central

    La, Moonwoo; Park, Sang Min; Kim, Dong Sung

    2015-01-01

    In this study, a multiple sample dispenser for precisely metered fixed volumes was successfully designed, fabricated, and fully characterized on a plastic centrifugal lab-on-a-disk (LOD) for parallel biochemical single-end-point assays. The dispenser, namely, a centrifugal multiplexing fixed-volume dispenser (C-MUFID) was designed with microfluidic structures based on the theoretical modeling about a centrifugal circumferential filling flow. The designed LODs were fabricated with a polystyrene substrate through micromachining and they were thermally bonded with a flat substrate. Furthermore, six parallel metering and dispensing assays were conducted at the same fixed-volume (1.27 μl) with a relative variation of ±0.02 μl. Moreover, the samples were metered and dispensed at different sub-volumes. To visualize the metering and dispensing performances, the C-MUFID was integrated with a serpentine micromixer during parallel centrifugal mixing tests. Parallel biochemical single-end-point assays were successfully conducted on the developed LOD using a standard serum with albumin, glucose, and total protein reagents. The developed LOD could be widely applied to various biochemical single-end-point assays which require different volume ratios of the sample and reagent by controlling the design of the C-MUFID. The proposed LOD is feasible for point-of-care diagnostics because of its mass-producible structures, reliable metering/dispensing performance, and parallel biochemical single-end-point assays, which can identify numerous biochemical. PMID:25610516

  14. Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew (Editor); Kerr, Christopher L. (Editor)

    1998-01-01

    This report contains the abstracts and technical papers from the Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications, held June 15-18, 1998, in Scottsdale, Arizona. The purpose of the workshop is to bring together software developers in meteorology and oceanography to discuss software engineering and code design issues for parallel architectures, including Massively Parallel Processors (MPP's), Parallel Vector Processors (PVP's), Symmetric Multi-Processors (SMP's), Distributed Shared Memory (DSM) multi-processors, and clusters. Issues to be discussed include: (1) code architectures for current parallel models, including basic data structures, storage allocation, variable naming conventions, coding rules and styles, i/o and pre/post-processing of data; (2) designing modular code; (3) load balancing and domain decomposition; (4) techniques that exploit parallelism efficiently yet hide the machine-related details from the programmer; (5) tools for making the programmer more productive; and (6) the proliferation of programming models (F--, OpenMP, MPI, and HPF).

  15. Conceptual design of a hybrid parallel mechanism for mask exchanging of TMT

    NASA Astrophysics Data System (ADS)

    Wang, Jianping; Zhou, Hongfei; Li, Kexuan; Zhou, Zengxiang; Zhai, Chao

    2015-10-01

    Mask exchange system is an important part of the Multi-Object Broadband Imaging Echellette (MOBIE) on the Thirty Meter Telescope (TMT). To solve the problem of stiffness changing with the gravity vector of the mask exchange system in the MOBIE, the hybrid parallel mechanism design method was introduced into the whole research. By using the characteristics of high stiffness and precision of parallel structure, combined with large moving range of serial structure, a conceptual design of a hybrid parallel mask exchange system based on 3-RPS parallel mechanism was presented. According to the position requirements of the MOBIE, the SolidWorks structure model of the hybrid parallel mask exchange robot was established and the appropriate installation position without interfering with the related components and light path in the MOBIE of TMT was analyzed. Simulation results in SolidWorks suggested that 3-RPS parallel platform had good stiffness property in different gravity vector directions. Furthermore, through the research of the mechanism theory, the inverse kinematics solution of the 3-RPS parallel platform was calculated and the mathematical relationship between the attitude angle of moving platform and the angle of ball-hinges on the moving platform was established, in order to analyze the attitude adjustment ability of the hybrid parallel mask exchange robot. The proposed conceptual design has some guiding significance for the design of mask exchange system of the MOBIE on TMT.

  16. Optimizing trial design in pharmacogenetics research: comparing a fixed parallel group, group sequential, and adaptive selection design on sample size requirements.

    PubMed

    Boessen, Ruud; van der Baan, Frederieke; Groenwold, Rolf; Egberts, Antoine; Klungel, Olaf; Grobbee, Diederick; Knol, Mirjam; Roes, Kit

    2013-01-01

    Two-stage clinical trial designs may be efficient in pharmacogenetics research when there is some but inconclusive evidence of effect modification by a genomic marker. Two-stage designs allow to stop early for efficacy or futility and can offer the additional opportunity to enrich the study population to a specific patient subgroup after an interim analysis. This study compared sample size requirements for fixed parallel group, group sequential, and adaptive selection designs with equal overall power and control of the family-wise type I error rate. The designs were evaluated across scenarios that defined the effect sizes in the marker positive and marker negative subgroups and the prevalence of marker positive patients in the overall study population. Effect sizes were chosen to reflect realistic planning scenarios, where at least some effect is present in the marker negative subgroup. In addition, scenarios were considered in which the assumed 'true' subgroup effects (i.e., the postulated effects) differed from those hypothesized at the planning stage. As expected, both two-stage designs generally required fewer patients than a fixed parallel group design, and the advantage increased as the difference between subgroups increased. The adaptive selection design added little further reduction in sample size, as compared with the group sequential design, when the postulated effect sizes were equal to those hypothesized at the planning stage. However, when the postulated effects deviated strongly in favor of enrichment, the comparative advantage of the adaptive selection design increased, which precisely reflects the adaptive nature of the design. Copyright © 2013 John Wiley & Sons, Ltd.

  17. Parallel Hybrid Gas-Electric Geared Turbofan Engine Conceptual Design and Benefits Analysis

    NASA Technical Reports Server (NTRS)

    Lents, Charles; Hardin, Larry; Rheaume, Jonathan; Kohlman, Lee

    2016-01-01

    The conceptual design of a parallel gas-electric hybrid propulsion system for a conventional single aisle twin engine tube and wing vehicle has been developed. The study baseline vehicle and engine technology are discussed, followed by results of the hybrid propulsion system sizing and performance analysis. The weights analysis for the electric energy storage & conversion system and thermal management system is described. Finally, the potential system benefits are assessed.

  18. Aerodynamic Shape Optimization of Supersonic Aircraft Configurations via an Adjoint Formulation on Parallel Computers

    NASA Technical Reports Server (NTRS)

    Reuther, James; Alonso, Juan Jose; Rimlinger, Mark J.; Jameson, Antony

    1996-01-01

    This work describes the application of a control theory-based aerodynamic shape optimization method to the problem of supersonic aircraft design. The design process is greatly accelerated through the use of both control theory and a parallel implementation on distributed memory computers. Control theory is employed to derive the adjoint differential equations whose solution allows for the evaluation of design gradient information at a fraction of the computational cost required by previous design methods (13, 12, 44, 38). The resulting problem is then implemented on parallel distributed memory architectures using a domain decomposition approach, an optimized communication schedule, and the MPI (Message Passing Interface) Standard for portability and efficiency. The final result achieves very rapid aerodynamic design based on higher order computational fluid dynamics methods (CFD). In our earlier studies, the serial implementation of this design method (19, 20, 21, 23, 39, 25, 40, 41, 42, 43, 9) was shown to be effective for the optimization of airfoils, wings, wing-bodies, and complex aircraft configurations using both the potential equation and the Euler equations (39, 25). In our most recent paper, the Euler method was extended to treat complete aircraft configurations via a new multiblock implementation. Furthermore, during the same conference, we also presented preliminary results demonstrating that the basic methodology could be ported to distributed memory parallel computing architectures [241. In this paper, our concem will be to demonstrate that the combined power of these new technologies can be used routinely in an industrial design environment by applying it to the case study of the design of typical supersonic transport configurations. A particular difficulty of this test case is posed by the propulsion/airframe integration.

  19. Transmission Index Research of Parallel Manipulators Based on Matrix Orthogonal Degree

    NASA Astrophysics Data System (ADS)

    Shao, Zhu-Feng; Mo, Jiao; Tang, Xiao-Qiang; Wang, Li-Ping

    2017-11-01

    Performance index is the standard of performance evaluation, and is the foundation of both performance analysis and optimal design for the parallel manipulator. Seeking the suitable kinematic indices is always an important and challenging issue for the parallel manipulator. So far, there are extensive studies in this field, but few existing indices can meet all the requirements, such as simple, intuitive, and universal. To solve this problem, the matrix orthogonal degree is adopted, and generalized transmission indices that can evaluate motion/force transmissibility of fully parallel manipulators are proposed. Transmission performance analysis of typical branches, end effectors, and parallel manipulators is given to illustrate proposed indices and analysis methodology. Simulation and analysis results reveal that proposed transmission indices possess significant advantages, such as normalized finite (ranging from 0 to 1), dimensionally homogeneous, frame-free, intuitive and easy to calculate. Besides, proposed indices well indicate the good transmission region and relativity to the singularity with better resolution than the traditional local conditioning index, and provide a novel tool for kinematic analysis and optimal design of fully parallel manipulators.

  20. Design and realization of photoelectric instrument binocular optical axis parallelism calibration system

    NASA Astrophysics Data System (ADS)

    Ying, Jia-ju; Chen, Yu-dan; Liu, Jie; Wu, Dong-sheng; Lu, Jun

    2016-10-01

    The maladjustment of photoelectric instrument binocular optical axis parallelism will affect the observe effect directly. A binocular optical axis parallelism digital calibration system is designed. On the basis of the principle of optical axis binocular photoelectric instrument calibration, the scheme of system is designed, and the binocular optical axis parallelism digital calibration system is realized, which include four modules: multiband parallel light tube, optical axis translation, image acquisition system and software system. According to the different characteristics of thermal infrared imager and low-light-level night viewer, different algorithms is used to localize the center of the cross reticle. And the binocular optical axis parallelism calibration is realized for calibrating low-light-level night viewer and thermal infrared imager.

  1. The Design and Evaluation of "CAPTools"--A Computer Aided Parallelization Toolkit

    NASA Technical Reports Server (NTRS)

    Yan, Jerry; Frumkin, Michael; Hribar, Michelle; Jin, Haoqiang; Waheed, Abdul; Johnson, Steve; Cross, Jark; Evans, Emyr; Ierotheou, Constantinos; Leggett, Pete; hide

    1998-01-01

    Writing applications for high performance computers is a challenging task. Although writing code by hand still offers the best performance, it is extremely costly and often not very portable. The Computer Aided Parallelization Tools (CAPTools) are a toolkit designed to help automate the mapping of sequential FORTRAN scientific applications onto multiprocessors. CAPTools consists of the following major components: an inter-procedural dependence analysis module that incorporates user knowledge; a 'self-propagating' data partitioning module driven via user guidance; an execution control mask generation and optimization module for the user to fine tune parallel processing of individual partitions; a program transformation/restructuring facility for source code clean up and optimization; a set of browsers through which the user interacts with CAPTools at each stage of the parallelization process; and a code generator supporting multiple programming paradigms on various multiprocessors. Besides describing the rationale behind the architecture of CAPTools, the parallelization process is illustrated via case studies involving structured and unstructured meshes. The programming process and the performance of the generated parallel programs are compared against other programming alternatives based on the NAS Parallel Benchmarks, ARC3D and other scientific applications. Based on these results, a discussion on the feasibility of constructing architectural independent parallel applications is presented.

  2. Parallel Ray Tracing Using the Message Passing Interface

    DTIC Science & Technology

    2007-09-01

    software is available for lens design and for general optical systems modeling. It tends to be designed to run on a single processor and can be very...Cameron, Senior Member, IEEE Abstract—Ray-tracing software is available for lens design and for general optical systems modeling. It tends to be designed to...National Aeronautics and Space Administration (NASA), optical ray tracing, parallel computing, parallel pro- cessing, prime numbers, ray tracing

  3. A Framework for Understanding Community Colleges' Organizational Capacity for Data Use: A Convergent Parallel Mixed Methods Study

    ERIC Educational Resources Information Center

    Kerrigan, Monica Reid

    2014-01-01

    This convergent parallel design mixed methods case study of four community colleges explores the relationship between organizational capacity and implementation of data-driven decision making (DDDM). The article also illustrates purposive sampling using replication logic for cross-case analysis and the strengths and weaknesses of quantitizing…

  4. Parallel auto-correlative statistics with VTK.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  5. Parallel algorithms for placement and routing in VLSI design. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Brouwer, Randall Jay

    1991-01-01

    The computational requirements for high quality synthesis, analysis, and verification of very large scale integration (VLSI) designs have rapidly increased with the fast growing complexity of these designs. Research in the past has focused on the development of heuristic algorithms, special purpose hardware accelerators, or parallel algorithms for the numerous design tasks to decrease the time required for solution. Two new parallel algorithms are proposed for two VLSI synthesis tasks, standard cell placement and global routing. The first algorithm, a parallel algorithm for global routing, uses hierarchical techniques to decompose the routing problem into independent routing subproblems that are solved in parallel. Results are then presented which compare the routing quality to the results of other published global routers and which evaluate the speedups attained. The second algorithm, a parallel algorithm for cell placement and global routing, hierarchically integrates a quadrisection placement algorithm, a bisection placement algorithm, and the previous global routing algorithm. Unique partitioning techniques are used to decompose the various stages of the algorithm into independent tasks which can be evaluated in parallel. Finally, results are presented which evaluate the various algorithm alternatives and compare the algorithm performance to other placement programs. Measurements are presented on the parallel speedups available.

  6. Parallel digital forensics infrastructure.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liebrock, Lorie M.; Duggan, David Patrick

    2009-10-01

    This report documents the architecture and implementation of a Parallel Digital Forensics infrastructure. This infrastructure is necessary for supporting the design, implementation, and testing of new classes of parallel digital forensics tools. Digital Forensics has become extremely difficult with data sets of one terabyte and larger. The only way to overcome the processing time of these large sets is to identify and develop new parallel algorithms for performing the analysis. To support algorithm research, a flexible base infrastructure is required. A candidate architecture for this base infrastructure was designed, instantiated, and tested by this project, in collaboration with New Mexicomore » Tech. Previous infrastructures were not designed and built specifically for the development and testing of parallel algorithms. With the size of forensics data sets only expected to increase significantly, this type of infrastructure support is necessary for continued research in parallel digital forensics. This report documents the implementation of the parallel digital forensics (PDF) infrastructure architecture and implementation.« less

  7. A Parallel Saturation Algorithm on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Ezekiel, Jonathan; Siminiceanu

    2007-01-01

    Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of ring events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the ring of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.

  8. High order parallel numerical schemes for solving incompressible flows

    NASA Technical Reports Server (NTRS)

    Lin, Avi; Milner, Edward J.; Liou, May-Fun; Belch, Richard A.

    1992-01-01

    The use of parallel computers for numerically solving flow fields has gained much importance in recent years. This paper introduces a new high order numerical scheme for computational fluid dynamics (CFD) specifically designed for parallel computational environments. A distributed MIMD system gives the flexibility of treating different elements of the governing equations with totally different numerical schemes in different regions of the flow field. The parallel decomposition of the governing operator to be solved is the primary parallel split. The primary parallel split was studied using a hypercube like architecture having clusters of shared memory processors at each node. The approach is demonstrated using examples of simple steady state incompressible flows. Future studies should investigate the secondary split because, depending on the numerical scheme that each of the processors applies and the nature of the flow in the specific subdomain, it may be possible for a processor to seek better, or higher order, schemes for its particular subcase.

  9. A design concept of parallel elasticity extracted from biological muscles for engineered actuators.

    PubMed

    Chen, Jie; Jin, Hongzhe; Iida, Fumiya; Zhao, Jie

    2016-08-23

    Series elastic actuation that takes inspiration from biological muscle-tendon units has been extensively studied and used to address the challenges (e.g. energy efficiency, robustness) existing in purely stiff robots. However, there also exists another form of passive property in biological actuation, parallel elasticity within muscles themselves, and our knowledge of it is limited: for example, there is still no general design strategy for the elasticity profile. When we look at nature, on the other hand, there seems a universal agreement in biological systems: experimental evidence has suggested that a concave-upward elasticity behaviour is exhibited within the muscles of animals. Seeking to draw possible design clues for elasticity in parallel with actuators, we use a simplified joint model to investigate the mechanisms behind this biologically universal preference of muscles. Actuation of the model is identified from general biological joints and further reduced with a specific focus on muscle elasticity aspects, for the sake of easy implementation. By examining various elasticity scenarios, one without elasticity and three with elasticity of different profiles, we find that parallel elasticity generally exerts contradictory influences on energy efficiency and disturbance rejection, due to the mechanical impedance shift thus caused. The trade-off analysis between them also reveals that concave parallel elasticity is able to achieve a more advantageous balance than linear and convex ones. It is expected that the results could contribute to our further understanding of muscle elasticity and provide a theoretical guideline on how to properly design parallel elasticity behaviours for engineering systems such as artificial actuators and robotic joints.

  10. Integrated Task And Data Parallel Programming: Language Design

    NASA Technical Reports Server (NTRS)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    his research investigates the combination of task and data parallel language constructs within a single programming language. There are an number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program m. Additional 1995 Activities During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  11. Parallel Treatments Design: A Nested Single Subject Design for Comparing Instructional Procedures.

    ERIC Educational Resources Information Center

    Gast, David L.; Wolery, Mark

    1988-01-01

    This paper describes the parallel treatments design, a nested single subject experimental design that combines two concurrently implemented multiple probe designs, allows control for effects of extraneous variables through counterbalancing, and replicates its effects across behaviors. Procedural guidelines for the design's use and issues related…

  12. Software Design for Real-Time Systems on Parallel Computers: Formal Specifications.

    DTIC Science & Technology

    1996-04-01

    This research investigated the important issues related to the analysis and design of real - time systems targeted to parallel architectures. In...particular, the software specification models for real - time systems on parallel architectures were evaluated. A survey of current formal methods for...uniprocessor real - time systems specifications was conducted to determine their extensibility in specifying real - time systems on parallel architectures. In

  13. Affordance Access Matters: Preschool Children's Learning Progressions While Interacting with Touch-Screen Mathematics Apps

    ERIC Educational Resources Information Center

    Bullock, Emma P.; Shumway, Jessica F.; Watts, Christina M.; Moyer-Packenham, Patricia S.

    2017-01-01

    The purpose of this study was to contribute to the research on mathematics app use by very young children, and specifically mathematics apps for touch-screen mobile devices that contain virtual manipulatives. The study used a convergent parallel mixed methods design, in which quantitative and qualitative data were collected in parallel, analyzed…

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Janjusic, Tommy; Kartsaklis, Christos

    Memory scalability is an enduring problem and bottleneck that plagues many parallel codes. Parallel codes designed for High Performance Systems are typically designed over the span of several, and in some instances 10+, years. As a result, optimization practices which were appropriate for earlier systems may no longer be valid and thus require careful optimization consideration. Specifically, parallel codes whose memory footprint is a function of their scalability must be carefully considered for future exa-scale systems. In this paper we present a methodology and tool to study the memory scalability of parallel codes. Using our methodology we evaluate an applicationmore » s memory footprint as a function of scalability, which we coined memory efficiency, and describe our results. In particular, using our in-house tools we can pinpoint the specific application components which contribute to the application s overall memory foot-print (application data- structures, libraries, etc.).« less

  15. Multi-GPU parallel algorithm design and analysis for improved inversion of probability tomography with gravity gradiometry data

    NASA Astrophysics Data System (ADS)

    Hou, Zhenlong; Huang, Danian

    2017-09-01

    In this paper, we make a study on the inversion of probability tomography (IPT) with gravity gradiometry data at first. The space resolution of the results is improved by multi-tensor joint inversion, depth weighting matrix and the other methods. Aiming at solving the problems brought by the big data in the exploration, we present the parallel algorithm and the performance analysis combining Compute Unified Device Architecture (CUDA) with Open Multi-Processing (OpenMP) based on Graphics Processing Unit (GPU) accelerating. In the test of the synthetic model and real data from Vinton Dome, we get the improved results. It is also proved that the improved inversion algorithm is effective and feasible. The performance of parallel algorithm we designed is better than the other ones with CUDA. The maximum speedup could be more than 200. In the performance analysis, multi-GPU speedup and multi-GPU efficiency are applied to analyze the scalability of the multi-GPU programs. The designed parallel algorithm is demonstrated to be able to process larger scale of data and the new analysis method is practical.

  16. A multioutput LLC-type parallel resonant converter

    NASA Astrophysics Data System (ADS)

    Liu, Rui; Lee, C. Q.; Upadhyay, Anand K.

    1992-07-01

    When an LLC-type parallel resonant converter (LLC-PRC) operates above resonant frequency, the switching transistors can be turned off at zero voltage. Further study reveals that the LLC-PRC possesses the advantage of lower converter voltage gain as compared with the conventional PRC. Based on analytic results, a complete set of design curves is obtained, from which a systematic design procedure is developed. Experimental results from a 150 W 150 kHz multioutput LLC-type PRC power supply are presented.

  17. Parent-Child Parallel-Group Intervention for Childhood Aggression in Hong Kong

    ERIC Educational Resources Information Center

    Fung, Annis L. C.; Tsang, Sandra H. K. M.

    2006-01-01

    This article reports the original evidence-based outcome study on parent-child parallel group-designed Anger Coping Training (ACT) program for children aged 8-10 with reactive aggression and their parents in Hong Kong. This research program involved experimental and control groups with pre- and post-comparison. Quantitative data collection…

  18. Design Sketches For Optical Crossbar Switches Intended For Large-Scale Parallel Processing Applications

    NASA Astrophysics Data System (ADS)

    Hartmann, Alfred; Redfield, Steve

    1989-04-01

    This paper discusses design of large-scale (1000x 1000) optical crossbar switching networks for use in parallel processing supercom-puters. Alternative design sketches for an optical crossbar switching network are presented using free-space optical transmission with either a beam spreading/masking model or a beam steering model for internodal communications. The performances of alternative multiple access channel communications protocol-unslotted and slotted ALOHA and carrier sense multiple access (CSMA)-are compared with the performance of the classic arbitrated bus crossbar of conventional electronic parallel computing. These comparisons indicate an almost inverse relationship between ease of implementation and speed of operation. Practical issues of optical system design are addressed, and an optically addressed, composite spatial light modulator design is presented for fabrication to arbitrarily large scale. The wide range of switch architecture, communications protocol, optical systems design, device fabrication, and system performance problems presented by these design sketches poses a serious challenge to practical exploitation of highly parallel optical interconnects in advanced computer designs.

  19. A Next-Generation Parallel File System Environment for the OLCF

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dillow, David A; Fuller, Douglas; Gunasekaran, Raghul

    2012-01-01

    When deployed in 2008/2009 the Spider system at the Oak Ridge National Laboratory s Leadership Computing Facility (OLCF) was the world s largest scale Lustre parallel file system. Envisioned as a shared parallel file system capable of delivering both the bandwidth and capacity requirements of the OLCF s diverse computational environment, Spider has since become a blueprint for shared Lustre environments deployed worldwide. Designed to support the parallel I/O requirements of the Jaguar XT5 system and other smallerscale platforms at the OLCF, the upgrade to the Titan XK6 heterogeneous system will begin to push the limits of Spider s originalmore » design by mid 2013. With a doubling in total system memory and a 10x increase in FLOPS, Titan will require both higher bandwidth and larger total capacity. Our goal is to provide a 4x increase in total I/O bandwidth from over 240GB=sec today to 1TB=sec and a doubling in total capacity. While aggregate bandwidth and total capacity remain important capabilities, an equally important goal in our efforts is dramatically increasing metadata performance, currently the Achilles heel of parallel file systems at leadership. We present in this paper an analysis of our current I/O workloads, our operational experiences with the Spider parallel file systems, the high-level design of our Spider upgrade, and our efforts in developing benchmarks that synthesize our performance requirements based on our workload characterization studies.« less

  20. National Combustion Code: Parallel Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

    2000-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.

  1. Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Moitra, Stuti

    1996-01-01

    Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2 machines. Measured results are reported in terms of execution time and speedup. Analytical study are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to address implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.

  2. File-access characteristics of parallel scientific workloads

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David; Purakayastha, Apratim; Best, Michael; Ellis, Carla Schlatter

    1995-01-01

    Phenomenal improvements in the computational performance of multiprocessors have not been matched by comparable gains in I/O system performance. This imbalance has resulted in I/O becoming a significant bottleneck for many scientific applications. One key to overcoming this bottleneck is improving the performance of parallel file systems. The design of a high-performance parallel file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of parallel file systems have been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describe the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast these two workloads for an understanding of their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide parallel file-system design.

  3. A Robust and Scalable Software Library for Parallel Adaptive Refinement on Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Lou, John Z.; Norton, Charles D.; Cwik, Thomas A.

    1999-01-01

    The design and implementation of Pyramid, a software library for performing parallel adaptive mesh refinement (PAMR) on unstructured meshes, is described. This software library can be easily used in a variety of unstructured parallel computational applications, including parallel finite element, parallel finite volume, and parallel visualization applications using triangular or tetrahedral meshes. The library contains a suite of well-designed and efficiently implemented modules that perform operations in a typical PAMR process. Among these are mesh quality control during successive parallel adaptive refinement (typically guided by a local-error estimator), parallel load-balancing, and parallel mesh partitioning using the ParMeTiS partitioner. The Pyramid library is implemented in Fortran 90 with an interface to the Message-Passing Interface (MPI) library, supporting code efficiency, modularity, and portability. An EM waveguide filter application, adaptively refined using the Pyramid library, is illustrated.

  4. Cross-over studies underestimate energy compensation: The example of sucrose-versus sucralose-containing drinks.

    PubMed

    Gadah, Nouf S; Brunstrom, Jeffrey M; Rogers, Peter J

    2016-12-01

    The vast majority of preload-test-meal studies that have investigated the effects on energy intake of disguised nutrient or other food/drink ingredient manipulations have used a cross-over design. We argue that this design may underestimate the effect of the manipulation due to carry-over effects. To test this we conducted comparable cross-over (n = 69) and parallel-groups (n = 48) studies testing the effects of sucrose versus low-calorie sweetener (sucralose) in a drink preload on test-meal energy intake. The parallel-groups study included a baseline day in which only the test meal was consumed. Energy intake in that meal was used to control for individual differences in energy intake in the analysis of the effects of sucrose versus sucralose on energy intake on the test day. Consistent with our prediction, the effect of consuming sucrose on subsequent energy intake was greater when measured in the parallel-groups study than in the cross-over study (respectively 64% versus 36% compensation for the 162 kcal difference in energy content of the sucrose and sucralose drinks). We also included a water comparison group in the parallel-groups study (n = 24) and found that test-meal energy intake did not differ significantly between the water and sucralose conditions. Together, these results confirm that consumption of sucrose in a drink reduces subsequent energy intake, but by less than the energy content of the drink, whilst drink sweetness does not increase food energy intake. Crucially, though, the studies demonstrate that study design affects estimated energy compensation. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  5. Design of a massively parallel computer using bit serial processing elements

    NASA Technical Reports Server (NTRS)

    Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

    1995-01-01

    A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.

  6. Concurrent Collections (CnC): A new approach to parallel programming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Knobe, Kathleen

    2010-05-07

    A common approach in designing parallel languages is to provide some high level handles to manipulate the use of the parallel platform. This exposes some aspects of the target platform, for example, shared vs. distributed memory. It may expose some but not all types of parallelism, for example, data parallelism but not task parallelism. This approach must find a balance between the desire to provide a simple view for the domain expert and provide sufficient power for tuning. This is hard for any given architecture and harder if the language is to apply to a range of architectures. Either simplicitymore » or power is lost. Instead of viewing the language design problem as one of providing the programmer with high level handles, we view the problem as one of designing an interface. On one side of this interface is the programmer (domain expert) who knows the application but needs no knowledge of any aspects of the platform. On the other side of the interface is the performance expert (programmer or program) who demands maximal flexibility for optimizing the mapping to a wide range of target platforms (parallel / serial, shared / distributed, homogeneous / heterogeneous, etc.) but needs no knowledge of the domain. Concurrent Collections (CnC) is based on this separation of concerns. The talk will present CnC and its benefits. About the speaker. Kathleen Knobe has focused throughout her career on parallelism especially compiler technology, runtime system design and language design. She worked at Compass (aka Massachusetts Computer Associates) from 1980 to 1991 designing compilers for a wide range of parallel platforms for Thinking Machines, MasPar, Alliant, Numerix, and several government projects. In 1991 she decided to finish her education. After graduating from MIT in 1997, she joined Digital Equipment’s Cambridge Research Lab (CRL). She stayed through the DEC/Compaq/HP mergers and when CRL was acquired and absorbed by Intel. She currently works in the Software and Services Group / Technology Pathfinding and Innovation.« less

  7. Concurrent Collections (CnC): A new approach to parallel programming

    ScienceCinema

    Knobe, Kathleen

    2018-04-16

    A common approach in designing parallel languages is to provide some high level handles to manipulate the use of the parallel platform. This exposes some aspects of the target platform, for example, shared vs. distributed memory. It may expose some but not all types of parallelism, for example, data parallelism but not task parallelism. This approach must find a balance between the desire to provide a simple view for the domain expert and provide sufficient power for tuning. This is hard for any given architecture and harder if the language is to apply to a range of architectures. Either simplicity or power is lost. Instead of viewing the language design problem as one of providing the programmer with high level handles, we view the problem as one of designing an interface. On one side of this interface is the programmer (domain expert) who knows the application but needs no knowledge of any aspects of the platform. On the other side of the interface is the performance expert (programmer or program) who demands maximal flexibility for optimizing the mapping to a wide range of target platforms (parallel / serial, shared / distributed, homogeneous / heterogeneous, etc.) but needs no knowledge of the domain. Concurrent Collections (CnC) is based on this separation of concerns. The talk will present CnC and its benefits. About the speaker. Kathleen Knobe has focused throughout her career on parallelism especially compiler technology, runtime system design and language design. She worked at Compass (aka Massachusetts Computer Associates) from 1980 to 1991 designing compilers for a wide range of parallel platforms for Thinking Machines, MasPar, Alliant, Numerix, and several government projects. In 1991 she decided to finish her education. After graduating from MIT in 1997, she joined Digital Equipment’s Cambridge Research Lab (CRL). She stayed through the DEC/Compaq/HP mergers and when CRL was acquired and absorbed by Intel. She currently works in the Software and Services Group / Technology Pathfinding and Innovation.

  8. Parallel workflow tools to facilitate human brain MRI post-processing

    PubMed Central

    Cui, Zaixu; Zhao, Chenxi; Gong, Gaolang

    2015-01-01

    Multi-modal magnetic resonance imaging (MRI) techniques are widely applied in human brain studies. To obtain specific brain measures of interest from MRI datasets, a number of complex image post-processing steps are typically required. Parallel workflow tools have recently been developed, concatenating individual processing steps and enabling fully automated processing of raw MRI data to obtain the final results. These workflow tools are also designed to make optimal use of available computational resources and to support the parallel processing of different subjects or of independent processing steps for a single subject. Automated, parallel MRI post-processing tools can greatly facilitate relevant brain investigations and are being increasingly applied. In this review, we briefly summarize these parallel workflow tools and discuss relevant issues. PMID:26029043

  9. Integrated configurable equipment selection and line balancing for mass production with serial-parallel machining systems

    NASA Astrophysics Data System (ADS)

    Battaïa, Olga; Dolgui, Alexandre; Guschinsky, Nikolai; Levin, Genrikh

    2014-10-01

    Solving equipment selection and line balancing problems together allows better line configurations to be reached and avoids local optimal solutions. This article considers jointly these two decision problems for mass production lines with serial-parallel workplaces. This study was motivated by the design of production lines based on machines with rotary or mobile tables. Nevertheless, the results are more general and can be applied to assembly and production lines with similar structures. The designers' objectives and the constraints are studied in order to suggest a relevant mathematical model and an efficient optimization approach to solve it. A real case study is used to validate the model and the developed approach.

  10. Parallel integer sorting with medium and fine-scale parallelism

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.

  11. Foreign Language/Area Studies Enhancement Project. Final Report.

    ERIC Educational Resources Information Center

    Felker, William; Fuller, Clark

    The Foreign Language/Area Studies Enhancement Program at Central State University (Ohio) is an experience-centered work and study program in Africa designed to give students training in language, culture, and technology. It parallels and supports the university's northern Senegal water management project designed to promote self-sufficiency among…

  12. Adaptive multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model

    NASA Astrophysics Data System (ADS)

    Navarro, Cristóbal A.; Huang, Wei; Deng, Youjin

    2016-08-01

    This work presents an adaptive multi-GPU Exchange Monte Carlo approach for the simulation of the 3D Random Field Ising Model (RFIM). The design is based on a two-level parallelization. The first level, spin-level parallelism, maps the parallel computation as optimal 3D thread-blocks that simulate blocks of spins in shared memory with minimal halo surface, assuming a constant block volume. The second level, replica-level parallelism, uses multi-GPU computation to handle the simulation of an ensemble of replicas. CUDA's concurrent kernel execution feature is used in order to fill the occupancy of each GPU with many replicas, providing a performance boost that is more notorious at the smallest values of L. In addition to the two-level parallel design, the work proposes an adaptive multi-GPU approach that dynamically builds a proper temperature set free of exchange bottlenecks. The strategy is based on mid-point insertions at the temperature gaps where the exchange rate is most compromised. The extra work generated by the insertions is balanced across the GPUs independently of where the mid-point insertions were performed. Performance results show that spin-level performance is approximately two orders of magnitude faster than a single-core CPU version and one order of magnitude faster than a parallel multi-core CPU version running on 16-cores. Multi-GPU performance is highly convenient under a weak scaling setting, reaching up to 99 % efficiency as long as the number of GPUs and L increase together. The combination of the adaptive approach with the parallel multi-GPU design has extended our possibilities of simulation to sizes of L = 32 , 64 for a workstation with two GPUs. Sizes beyond L = 64 can eventually be studied using larger multi-GPU systems.

  13. Parallel particle impactor - novel size-selective particle sampler for accurate fractioning of inhalable particles

    NASA Astrophysics Data System (ADS)

    Trakumas, S.; Salter, E.

    2009-02-01

    Adverse health effects due to exposure to airborne particles are associated with particle deposition within the human respiratory tract. Particle size, shape, chemical composition, and the individual physiological characteristics of each person determine to what depth inhaled particles may penetrate and deposit within the respiratory tract. Various particle inertial classification devices are available to fractionate airborne particles according to their aerodynamic size to approximate particle penetration through the human respiratory tract. Cyclones are most often used to sample thoracic or respirable fractions of inhaled particles. Extensive studies of different cyclonic samplers have shown, however, that the sampling characteristics of cyclones do not follow the entire selected convention accurately. In the search for a more accurate way to assess worker exposure to different fractions of inhaled dust, a novel sampler comprising several inertial impactors arranged in parallel was designed and tested. The new design includes a number of separated impactors arranged in parallel. Prototypes of respirable and thoracic samplers each comprising four impactors arranged in parallel were manufactured and tested. Results indicated that the prototype samplers followed closely the penetration characteristics for which they were designed. The new samplers were found to perform similarly for liquid and solid test particles; penetration characteristics remained unchanged even after prolonged exposure to coal mine dust at high concentration. The new parallel impactor design can be applied to approximate any monotonically decreasing penetration curve at a selected flow rate. Personal-size samplers that operate at a few L/min as well as area samplers that operate at higher flow rates can be made based on the suggested design. Performance of such samplers can be predicted with high accuracy employing well-established impaction theory.

  14. Genetic Parallel Programming: design and implementation.

    PubMed

    Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

    2006-01-01

    This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

  15. Design considerations for parallel graphics libraries

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  16. Designing Feature and Data Parallel Stochastic Coordinate Descent Method forMatrix and Tensor Factorization

    DTIC Science & Technology

    2016-05-11

    AFRL-AFOSR-JP-TR-2016-0046 Designing Feature and Data Parallel Stochastic Coordinate Descent Method for Matrix and Tensor Factorization U Kang Korea...maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or   any other aspect...Designing Feature and Data Parallel Stochastic Coordinate Descent Method for Matrix and Tensor Factorization 5a.  CONTRACT NUMBER 5b.  GRANT NUMBER FA2386

  17. Exploiting Symmetry on Parallel Architectures.

    NASA Astrophysics Data System (ADS)

    Stiller, Lewis Benjamin

    1995-01-01

    This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry -exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and discovered it a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.

  18. Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce.

    PubMed

    Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan

    2016-01-01

    A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.

  19. The Tera Multithreaded Architecture and Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Bokhari, Shahid H.; Mavriplis, Dimitri J.

    1998-01-01

    The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine grained multithreading. The main memory is shared, hardware randomized and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2 processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared memory machine), running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data issues that would be of paramount importance in other parallel architectures.

  20. Coupling between structure and liquids in a parallel stage space shuttle design

    NASA Technical Reports Server (NTRS)

    Kana, D. D.; Ko, W. L.; Francis, P. H.; Nagy, A.

    1972-01-01

    A study was conducted to determine the influence of liquid propellants on the dynamic loads for space shuttle vehicles. A parallel-stage configuration model was designed and tested to determine the influence of liquid propellants on coupled natural modes. A forty degree-of-freedom analytical model was also developed for predicting these modes. Currently available analytical models were used to represent the liquid contributions, even though coupled longitudinal and lateral motions are present in such a complex structure. Agreement between the results was found in the lower few modes.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, C.

    Almost every computer architect dreams of achieving high system performance with low implementation costs. A multigauge machine can reconfigure its data-path width, provide parallelism, achieve better resource utilization, and sometimes can trade computational precision for increased speed. A simple experimental method is used here to capture the main characteristics of multigauging. The measurements indicate evidence of near-optimal speedups. Adapting these ideas in designing parallel processors incurs low costs and provides flexibility. Several operational aspects of designing a multigauge machine are discussed as well. Thus, this research reports the technical, economical, and operational feasibility studies of multigauging.

  2. Design of on-board parallel computer on nano-satellite

    NASA Astrophysics Data System (ADS)

    You, Zheng; Tian, Hexiang; Yu, Shijie; Meng, Li

    2007-11-01

    This paper provides one scheme of the on-board parallel computer system designed for the Nano-satellite. Based on the development request that the Nano-satellite should have a small volume, low weight, low power cost, and intelligence, this scheme gets rid of the traditional one-computer system and dual-computer system with endeavor to improve the dependability, capability and intelligence simultaneously. According to the method of integration design, it employs the parallel computer system with shared memory as the main structure, connects the telemetric system, attitude control system, and the payload system by the intelligent bus, designs the management which can deal with the static tasks and dynamic task-scheduling, protect and recover the on-site status and so forth in light of the parallel algorithms, and establishes the fault diagnosis, restoration and system restructure mechanism. It accomplishes an on-board parallel computer system with high dependability, capability and intelligence, a flexible management on hardware resources, an excellent software system, and a high ability in extension, which satisfies with the conception and the tendency of the integration electronic design sufficiently.

  3. Branson: A Mini-App for Studying Parallel IMC, Version 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Alex

    This code solves the gray thermal radiative transfer (TRT) equations in parallel using simple opacities and Cartesian meshes. Although Branson solves the TRT equations it is not designed to model radiation transport: Branson contains simple physics and does not have a multigroup treatment, nor can it use physical material data. The opacities have are simple polynomials in temperature there is a limited ability to specify complex geometries and sources. Branson was designed only to capture the computational demands of production IMC codes, especially in large parallel runs. It was also intended to foster collaboration with vendors, universities and other DOEmore » partners. Branson is similar in character to the neutron transport proxy-app Quicksilver from LLNL, which was recently open-sourced.« less

  4. CFD Analysis and Design Optimization Using Parallel Computers

    NASA Technical Reports Server (NTRS)

    Martinelli, Luigi; Alonso, Juan Jose; Jameson, Antony; Reuther, James

    1997-01-01

    A versatile and efficient multi-block method is presented for the simulation of both steady and unsteady flow, as well as aerodynamic design optimization of complete aircraft configurations. The compressible Euler and Reynolds Averaged Navier-Stokes (RANS) equations are discretized using a high resolution scheme on body-fitted structured meshes. An efficient multigrid implicit scheme is implemented for time-accurate flow calculations. Optimum aerodynamic shape design is achieved at very low cost using an adjoint formulation. The method is implemented on parallel computing systems using the MPI message passing interface standard to ensure portability. The results demonstrate that, by combining highly efficient algorithms with parallel computing, it is possible to perform detailed steady and unsteady analysis as well as automatic design for complex configurations using the present generation of parallel computers.

  5. ProperCAD: A portable object-oriented parallel environment for VLSI CAD

    NASA Technical Reports Server (NTRS)

    Ramkumar, Balkrishna; Banerjee, Prithviraj

    1993-01-01

    Most parallel algorithms for VLSI CAD proposed to date have one important drawback: they work efficiently only on machines that they were designed for. As a result, algorithms designed to date are dependent on the architecture for which they are developed and do not port easily to other parallel architectures. A new project under way to address this problem is described. A Portable object-oriented parallel environment for CAD algorithms (ProperCAD) is being developed. The objectives of this research are (1) to develop new parallel algorithms that run in a portable object-oriented environment (CAD algorithms using a general purpose platform for portable parallel programming called CARM is being developed and a C++ environment that is truly object-oriented and specialized for CAD applications is also being developed); and (2) to design the parallel algorithms around a good sequential algorithm with a well-defined parallel-sequential interface (permitting the parallel algorithm to benefit from future developments in sequential algorithms). One CAD application that has been implemented as part of the ProperCAD project, flat VLSI circuit extraction, is described. The algorithm, its implementation, and its performance on a range of parallel machines are discussed in detail. It currently runs on an Encore Multimax, a Sequent Symmetry, Intel iPSC/2 and i860 hypercubes, a NCUBE 2 hypercube, and a network of Sun Sparc workstations. Performance data for other applications that were developed are provided: namely test pattern generation for sequential circuits, parallel logic synthesis, and standard cell placement.

  6. Charon Toolkit for Parallel, Implicit Structured-Grid Computations: Functional Design

    NASA Technical Reports Server (NTRS)

    VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

    1997-01-01

    In a previous report the design concepts of Charon were presented. Charon is a toolkit that aids engineers in developing scientific programs for structured-grid applications to be run on MIMD parallel computers. It constitutes an augmentation of the general-purpose MPI-based message-passing layer, and provides the user with a hierarchy of tools for rapid prototyping and validation of parallel programs, and subsequent piecemeal performance tuning. Here we describe the implementation of the domain decomposition tools used for creating data distributions across sets of processors. We also present the hierarchy of parallelization tools that allows smooth translation of legacy code (or a serial design) into a parallel program. Along with the actual tool descriptions, we will present the considerations that led to the particular design choices. Many of these are motivated by the requirement that Charon must be useful within the traditional computational environments of Fortran 77 and C. Only the Fortran 77 syntax will be presented in this report.

  7. Integrated Joule switches for the control of current dynamics in parallel superconducting strips

    NASA Astrophysics Data System (ADS)

    Casaburi, A.; Heath, R. M.; Cristiano, R.; Ejrnaes, M.; Zen, N.; Ohkubo, M.; Hadfield, R. H.

    2018-06-01

    Understanding and harnessing the physics of the dynamic current distribution in parallel superconducting strips holds the key to creating next generation sensors for single molecule and single photon detection. Non-uniformity in the current distribution in parallel superconducting strips leads to low detection efficiency and unstable operation, preventing the scale up to large area sensors. Recent studies indicate that non-uniform current distributions occurring in parallel strips can be understood and modeled in the framework of the generalized London model. Here we build on this important physical insight, investigating an innovative design with integrated superconducting-to-resistive Joule switches to break the superconducting loops between the strips and thus control the current dynamics. Employing precision low temperature nano-optical techniques, we map the uniformity of the current distribution before- and after the resistive strip switching event, confirming the effectiveness of our design. These results provide important insights for the development of next generation large area superconducting strip-based sensors.

  8. Technical report analysis and design: Study of solid rocket motors for a space shuttle booster, volume 2, book 1, supplement 1

    NASA Technical Reports Server (NTRS)

    1972-01-01

    An analysis and design effort was conducted as part of the study of solid rocket motor for a space shuttle booster. The 156-inch-diameter, parallel burn solid rocket motor was selected as its baseline because it is transportable and is the most cost-effective, reliable system that has been developed and demonstrated. The basic approach was to concentrate on the selected baseline design, and to draw from the baseline sufficient data to describe the alternate approaches also studied. The following conclusions were reached with respect to technical feasibility of the use of solid rocket booster motors for the space shuttle vehicle: (1) The 156-inch, parallel-burn baseline SRM design meets NASA's study requirements while incorporating conservative safety factors. (2) The solid rocket motor booster represents a cost-effective approach. (3) Baseline costs are conservative and are based on a demonstrated design. (4) Recovery and reuse are feasible and offer substantial cost savings. (5) Abort can be accomplished successfully. (6) Ecological effects are acceptable.

  9. Supercomputing on massively parallel bit-serial architectures

    NASA Technical Reports Server (NTRS)

    Iobst, Ken

    1985-01-01

    Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.

  10. Process optimization using combinatorial design principles: parallel synthesis and design of experiment methods.

    PubMed

    Gooding, Owen W

    2004-06-01

    The use of parallel synthesis techniques with statistical design of experiment (DoE) methods is a powerful combination for the optimization of chemical processes. Advances in parallel synthesis equipment and easy to use software for statistical DoE have fueled a growing acceptance of these techniques in the pharmaceutical industry. As drug candidate structures become more complex at the same time that development timelines are compressed, these enabling technologies promise to become more important in the future.

  11. Distributed Parallel Processing and Dynamic Load Balancing Techniques for Multidisciplinary High Speed Aircraft Design

    NASA Technical Reports Server (NTRS)

    Krasteva, Denitza T.

    1998-01-01

    Multidisciplinary design optimization (MDO) for large-scale engineering problems poses many challenges (e.g., the design of an efficient concurrent paradigm for global optimization based on disciplinary analyses, expensive computations over vast data sets, etc.) This work focuses on the application of distributed schemes for massively parallel architectures to MDO problems, as a tool for reducing computation time and solving larger problems. The specific problem considered here is configuration optimization of a high speed civil transport (HSCT), and the efficient parallelization of the embedded paradigm for reasonable design space identification. Two distributed dynamic load balancing techniques (random polling and global round robin with message combining) and two necessary termination detection schemes (global task count and token passing) were implemented and evaluated in terms of effectiveness and scalability to large problem sizes and a thousand processors. The effect of certain parameters on execution time was also inspected. Empirical results demonstrated stable performance and effectiveness for all schemes, and the parametric study showed that the selected algorithmic parameters have a negligible effect on performance.

  12. An Analysis of Social, Literary and Technological Sources Used by Classroom Teachers in Social Studies Courses

    ERIC Educational Resources Information Center

    Fidan, Nuray Kurtdede; Ergün, Mustafa

    2016-01-01

    In this study, social, literary and technological sources used by classroom teachers in social studies courses are analyzed in terms of frequency. The study employs mixed methods research and is designed following the convergent parallel design. In the qualitative part of the study, phenomenological method was used and in the quantitative…

  13. Layout as Political Expression: Visual Literacy and the Peruvian Press.

    ERIC Educational Resources Information Center

    Barnhurst, Kevin G.

    Newspaper layout and design studies ignore politics, and most studies of newspaper politics ignore visual design. News layout is generally thought to be a set of neutral, efficient practices. This study suggests that the political position of Peruvian newspapers parallels their visual presentation of terrorism. The liberal "La Republica"…

  14. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are an number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  15. Pilot Non-Conformance to Alerting System Commands During Closely Spaced Parallel Approaches

    NASA Technical Reports Server (NTRS)

    Pritchett, Amy R.; Hansman, R. John

    1997-01-01

    Pilot non-conformance to alerting system commands has been noted in general and to a TCAS-like collision avoidance system in a previous experiment. This paper details two experiments studying collision avoidance during closely-spaced parallel approaches in instrument meteorological conditions (IMC), and specifically examining possible causal factors of, and design solutions to, pilot non-conformance.

  16. Variable-Complexity Multidisciplinary Optimization on Parallel Computers

    NASA Technical Reports Server (NTRS)

    Grossman, Bernard; Mason, William H.; Watson, Layne T.; Haftka, Raphael T.

    1998-01-01

    This report covers work conducted under grant NAG1-1562 for the NASA High Performance Computing and Communications Program (HPCCP) from December 7, 1993, to December 31, 1997. The objective of the research was to develop new multidisciplinary design optimization (MDO) techniques which exploit parallel computing to reduce the computational burden of aircraft MDO. The design of the High-Speed Civil Transport (HSCT) air-craft was selected as a test case to demonstrate the utility of our MDO methods. The three major tasks of this research grant included: development of parallel multipoint approximation methods for the aerodynamic design of the HSCT, use of parallel multipoint approximation methods for structural optimization of the HSCT, mathematical and algorithmic development including support in the integration of parallel computation for items (1) and (2). These tasks have been accomplished with the development of a response surface methodology that incorporates multi-fidelity models. For the aerodynamic design we were able to optimize with up to 20 design variables using hundreds of expensive Euler analyses together with thousands of inexpensive linear theory simulations. We have thereby demonstrated the application of CFD to a large aerodynamic design problem. For the predicting structural weight we were able to combine hundreds of structural optimizations of refined finite element models with thousands of optimizations based on coarse models. Computations have been carried out on the Intel Paragon with up to 128 nodes. The parallel computation allowed us to perform combined aerodynamic-structural optimization using state of the art models of a complex aircraft configurations.

  17. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

    PubMed Central

    Hemming, Karla; Taljaard, Monica

    2016-01-01

    Objectives To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808

  18. Performance of GeantV EM Physics Models

    NASA Astrophysics Data System (ADS)

    Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.

    2017-10-01

    The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.

  19. GPU accelerated dynamic functional connectivity analysis for functional MRI data.

    PubMed

    Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu

    2015-07-01

    Recent advances in multi-core processors and graphics card based computational technologies have paved the way for an improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented for the acceleration of computationally-intensive problems in various computational science fields including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented in both Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. Another approach developed in this study to better utilize CUDA architecture is the block-based approach, where parallelization involves smaller parts of fMRI time-courses obtained by sliding-windows. Experimental results showed that the proposed parallel design solutions enabled by the GPUs significantly reduce the computation time for DFC analysis. Multicore implementation using OpenMP on 8-core processor provides up to 7.7× speed-up. GPU implementation using CUDA yielded substantial accelerations ranging from 18.5× to 157× speed-up once thread-based and block-based approaches were combined in the analysis. Proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerated the DFC analyses significantly. Developed algorithms make the DFC analyses more practical for multi-subject studies with more dynamic analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Design of a dataway processor for a parallel image signal processing system

    NASA Astrophysics Data System (ADS)

    Nomura, Mitsuru; Fujii, Tetsuro; Ono, Sadayasu

    1995-04-01

    Recently, demands for high-speed signal processing have been increasing especially in the field of image data compression, computer graphics, and medical imaging. To achieve sufficient power for real-time image processing, we have been developing parallel signal-processing systems. This paper describes a communication processor called 'dataway processor' designed for a new scalable parallel signal-processing system. The processor has six high-speed communication links (Dataways), a data-packet routing controller, a RISC CORE, and a DMA controller. Each communication link operates at 8-bit parallel in a full duplex mode at 50 MHz. Moreover, data routing, DMA, and CORE operations are processed in parallel. Therefore, sufficient throughput is available for high-speed digital video signals. The processor is designed in a top- down fashion using a CAD system called 'PARTHENON.' The hardware is fabricated using 0.5-micrometers CMOS technology, and its hardware is about 200 K gates.

  1. Effects of imbalanced currents on large-format LiFePO4/graphite batteries systems connected in parallel

    NASA Astrophysics Data System (ADS)

    Shi, Wei; Hu, Xiaosong; Jin, Chao; Jiang, Jiuchun; Zhang, Yanru; Yip, Tony

    2016-05-01

    With the development and popularization of electric vehicles, it is urgent and necessary to develop effective management and diagnosis technology for battery systems. In this work, we design a parallel battery model, according to equivalent circuits of parallel voltage and branch current, to study effects of imbalanced currents on parallel large-format LiFePO4/graphite battery systems. Taking a 60 Ah LiFePO4/graphite battery system manufactured by ATL (Amperex Technology Limited, China) as an example, causes of imbalanced currents in the parallel connection are analyzed using our model, and the associated effect mechanisms on long-term stability of each single battery are examined. Theoretical and experimental results show that continuously increasing imbalanced currents during cycling are mainly responsible for the capacity fade of LiFePO4/graphite parallel batteries. It is thus a good way to avoid fast performance fade of parallel battery systems by suppressing variations of branch currents.

  2. Xyce parallel electronic simulator users guide, version 6.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas; Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models; Device models that are specifically tailored to meet Sandia's needs, including some radiationaware devices (for Sandia users only); and Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase-a message passing parallel implementation-which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less

  3. Xyce parallel electronic simulator users' guide, Version 6.0.1.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandias needs, including some radiationaware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase a message passing parallel implementation which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less

  4. Xyce parallel electronic simulator users guide, version 6.0.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandias needs, including some radiationaware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase a message passing parallel implementation which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less

  5. A Design Verification of the Parallel Pipelined Image Processings

    NASA Astrophysics Data System (ADS)

    Wasaki, Katsumi; Harai, Toshiaki

    2008-11-01

    This paper presents a case study of the design and verification of a parallel and pipe-lined image processing unit based on an extended Petri net, which is called a Logical Colored Petri net (LCPN). This is suitable for Flexible-Manufacturing System (FMS) modeling and discussion of structural properties. LCPN is another family of colored place/transition-net(CPN) with the addition of the following features: integer value assignment of marks, representation of firing conditions as marks' value based formulae, and coupling of output procedures with transition firing. Therefore, to study the behavior of a system modeled with this net, we provide a means of searching the reachability tree for markings.

  6. Comparison between four dissimilar solar panel configurations

    NASA Astrophysics Data System (ADS)

    Suleiman, K.; Ali, U. A.; Yusuf, Ibrahim; Koko, A. D.; Bala, S. I.

    2017-12-01

    Several studies on photovoltaic systems focused on how it operates and energy required in operating it. Little attention is paid on its configurations, modeling of mean time to system failure, availability, cost benefit and comparisons of parallel and series-parallel designs. In this research work, four system configurations were studied. Configuration I consists of two sub-components arranged in parallel with 24 V each, configuration II consists of four sub-components arranged logically in parallel with 12 V each, configuration III consists of four sub-components arranged in series-parallel with 8 V each, and configuration IV has six sub-components with 6 V each arranged in series-parallel. Comparative analysis was made using Chapman Kolmogorov's method. The derivation for explicit expression of mean time to system failure, steady state availability and cost benefit analysis were performed, based on the comparison. Ranking method was used to determine the optimal configuration of the systems. The results of analytical and numerical solutions of system availability and mean time to system failure were determined and it was found that configuration I is the optimal configuration.

  7. Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce

    PubMed Central

    Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan

    2016-01-01

    A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987

  8. National Combustion Code Parallel Performance Enhancements

    NASA Technical Reports Server (NTRS)

    Quealy, Angela; Benyo, Theresa (Technical Monitor)

    2002-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. The unstructured grid, reacting flow code uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC code to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This report describes recent parallel processing modifications to NCC that have improved the parallel scalability of the code, enabling a two hour turnaround for a 1.3 million element fully reacting combustion simulation on an SGI Origin 2000.

  9. A Parallel Genetic Algorithm for Automated Electronic Circuit Design

    NASA Technical Reports Server (NTRS)

    Lohn, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris; Norvig, Peter (Technical Monitor)

    2000-01-01

    We describe a parallel genetic algorithm (GA) that automatically generates circuit designs using evolutionary search. A circuit-construction programming language is introduced and we show how evolution can generate practical analog circuit designs. Our system allows circuit size (number of devices), circuit topology, and device values to be evolved. We present experimental results as applied to analog filter and amplifier design tasks.

  10. Parallel steady state studies on a milliliter scale accelerate fed-batch bioprocess design for recombinant protein production with Escherichia coli.

    PubMed

    Schmideder, Andreas; Cremer, Johannes H; Weuster-Botz, Dirk

    2016-11-01

    In general, fed-batch processes are applied for recombinant protein production with Escherichia coli (E. coli). However, state of the art methods for identifying suitable reaction conditions suffer from severe drawbacks, i.e. direct transfer of process information from parallel batch studies is often defective and sequential fed-batch studies are time-consuming and cost-intensive. In this study, continuously operated stirred-tank reactors on a milliliter scale were applied to identify suitable reaction conditions for fed-batch processes. Isopropyl β-d-1-thiogalactopyranoside (IPTG) induction strategies were varied in parallel-operated stirred-tank bioreactors to study the effects on the continuous production of the recombinant protein photoactivatable mCherry (PAmCherry) with E. coli. Best-performing induction strategies were transferred from the continuous processes on a milliliter scale to liter scale fed-batch processes. Inducing recombinant protein expression by dynamically increasing the IPTG concentration to 100 µM led to an increase in the product concentration of 21% (8.4 g L -1 ) compared to an implemented high-performance production process with the most frequently applied induction strategy by a single addition of 1000 µM IPGT. Thus, identifying feasible reaction conditions for fed-batch processes in parallel continuous studies on a milliliter scale was shown to be a powerful, novel method to accelerate bioprocess design in a cost-reducing manner. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1426-1435, 2016. © 2016 American Institute of Chemical Engineers.

  11. A comparative study of serial and parallel aeroelastic computations of wings

    NASA Technical Reports Server (NTRS)

    Byun, Chansup; Guruswamy, Guru P.

    1994-01-01

    A procedure for computing the aeroelasticity of wings on parallel multiple-instruction, multiple-data (MIMD) computers is presented. In this procedure, fluids are modeled using Euler equations, and structures are modeled using modal or finite element equations. The procedure is designed in such a way that each discipline can be developed and maintained independently by using a domain decomposition approach. In the present parallel procedure, each computational domain is scalable. A parallel integration scheme is used to compute aeroelastic responses by solving fluid and structural equations concurrently. The computational efficiency issues of parallel integration of both fluid and structural equations are investigated in detail. This approach, which reduces the total computational time by a factor of almost 2, is demonstrated for a typical aeroelastic wing by using various numbers of processors on the Intel iPSC/860.

  12. Multidisciplinary Optimization Methods for Aircraft Preliminary Design

    NASA Technical Reports Server (NTRS)

    Kroo, Ilan; Altus, Steve; Braun, Robert; Gage, Peter; Sobieski, Ian

    1994-01-01

    This paper describes a research program aimed at improved methods for multidisciplinary design and optimization of large-scale aeronautical systems. The research involves new approaches to system decomposition, interdisciplinary communication, and methods of exploiting coarse-grained parallelism for analysis and optimization. A new architecture, that involves a tight coupling between optimization and analysis, is intended to improve efficiency while simplifying the structure of multidisciplinary, computation-intensive design problems involving many analysis disciplines and perhaps hundreds of design variables. Work in two areas is described here: system decomposition using compatibility constraints to simplify the analysis structure and take advantage of coarse-grained parallelism; and collaborative optimization, a decomposition of the optimization process to permit parallel design and to simplify interdisciplinary communication requirements.

  13. Xyce parallel electronic simulator : users' guide.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-artmore » algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.« less

  14. Parallel transmission RF pulse design for eddy current correction at ultra high field.

    PubMed

    Zheng, Hai; Zhao, Tiejun; Qian, Yongxian; Ibrahim, Tamer; Boada, Fernando

    2012-08-01

    Multidimensional spatially selective RF pulses have been used in MRI applications such as B₁ and B₀ inhomogeneities mitigation. However, the long pulse duration has limited their practical applications. Recently, theoretical and experimental studies have shown that parallel transmission can effectively shorten pulse duration without sacrificing the quality of the excitation pattern. Nonetheless, parallel transmission with accelerated pulses can be severely impeded by hardware and/or system imperfections. One of such imperfections is the effect of the eddy current field. In this paper, we first show the effects of the eddy current field on the excitation pattern and then report an RF pulse the design method to correct eddy current fields caused by the RF coil and the gradient system. Experimental results on a 7 T human eight-channel parallel transmit system show substantial improvements on excitation patterns with the use of eddy current correction. Moreover, the proposed model-based correction method not only demonstrates comparable excitation patterns as the trajectory measurement method, but also significantly improves time efficiency. Copyright © 2012. Published by Elsevier Inc.

  15. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael

    1994-01-01

    Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.

  16. General design approach and practical realization of decoupling matrices for parallel transmission coils.

    PubMed

    Mahmood, Zohaib; McDaniel, Patrick; Guérin, Bastien; Keil, Boris; Vester, Markus; Adalsteinsson, Elfar; Wald, Lawrence L; Daniel, Luca

    2016-07-01

    In a coupled parallel transmit (pTx) array, the power delivered to a channel is partially distributed to other channels because of coupling. This power is dissipated in circulators resulting in a significant reduction in power efficiency. In this study, a technique for designing robust decoupling matrices interfaced between the RF amplifiers and the coils is proposed. The decoupling matrices ensure that most forward power is delivered to the load without loss of encoding capabilities of the pTx array. The decoupling condition requires that the impedance matrix seen by the power amplifiers is a diagonal matrix whose entries match the characteristic impedance of the power amplifiers. In this work, the impedance matrix of the coupled coils is diagonalized by a successive multiplication by its eigenvectors. A general design procedure and software are developed to generate automatically the hardware that implements diagonalization using passive components. The general design method is demonstrated by decoupling two example parallel transmit arrays. Our decoupling matrices achieve better than -20 db decoupling in both cases. A robust framework for designing decoupling matrices for pTx arrays is presented and validated. The proposed decoupling strategy theoretically scales to any arbitrary number of channels. Magn Reson Med 76:329-339, 2016. © 2015 Wiley Periodicals, Inc. © 2015 Wiley Periodicals, Inc.

  17. Comparison of intervention effects in split-mouth and parallel-arm randomized controlled trials: a meta-epidemiological study

    PubMed Central

    2014-01-01

    Background Split-mouth randomized controlled trials (RCTs) are popular in oral health research. Meta-analyses frequently include trials of both split-mouth and parallel-arm designs to derive combined intervention effects. However, carry-over effects may induce bias in split- mouth RCTs. We aimed to assess whether intervention effect estimates differ between split- mouth and parallel-arm RCTs investigating the same questions. Methods We performed a meta-epidemiological study. We systematically reviewed meta- analyses including both split-mouth and parallel-arm RCTs with binary or continuous outcomes published up to February 2013. Two independent authors selected studies and extracted data. We used a two-step approach to quantify the differences between split-mouth and parallel-arm RCTs: for each meta-analysis. First, we derived ratios of odds ratios (ROR) for dichotomous data and differences in standardized mean differences (∆SMD) for continuous data; second, we pooled RORs or ∆SMDs across meta-analyses by random-effects meta-analysis models. Results We selected 18 systematic reviews, for 15 meta-analyses with binary outcomes (28 split-mouth and 28 parallel-arm RCTs) and 19 meta-analyses with continuous outcomes (28 split-mouth and 28 parallel-arm RCTs). Effect estimates did not differ between split-mouth and parallel-arm RCTs (mean ROR, 0.96, 95% confidence interval 0.52–1.80; mean ∆SMD, 0.08, -0.14–0.30). Conclusions Our study did not provide sufficient evidence for a difference in intervention effect estimates derived from split-mouth and parallel-arm RCTs. Authors should consider including split-mouth RCTs in their meta-analyses with suitable and appropriate analysis. PMID:24886043

  18. Fringe Capacitance of a Parallel-Plate Capacitor.

    ERIC Educational Resources Information Center

    Hale, D. P.

    1978-01-01

    Describes an experiment designed to measure the forces between charged parallel plates, and determines the relationship among the effective electrode area, the measured capacitance values, and the electrode spacing of a parallel plate capacitor. (GA)

  19. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    PubMed

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

  20. A tool for simulating parallel branch-and-bound methods

    NASA Astrophysics Data System (ADS)

    Golubeva, Yana; Orlov, Yury; Posypkin, Mikhail

    2016-01-01

    The Branch-and-Bound method is known as one of the most powerful but very resource consuming global optimization methods. Parallel and distributed computing can efficiently cope with this issue. The major difficulty in parallel B&B method is the need for dynamic load redistribution. Therefore design and study of load balancing algorithms is a separate and very important research topic. This paper presents a tool for simulating parallel Branchand-Bound method. The simulator allows one to run load balancing algorithms with various numbers of processors, sizes of the search tree, the characteristics of the supercomputer's interconnect thereby fostering deep study of load distribution strategies. The process of resolution of the optimization problem by B&B method is replaced by a stochastic branching process. Data exchanges are modeled using the concept of logical time. The user friendly graphical interface to the simulator provides efficient visualization and convenient performance analysis.

  1. Novel hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization estimation method for population pharmacokinetic data analysis.

    PubMed

    Ng, C M

    2013-10-01

    The development of a population PK/PD model, an essential component for model-based drug development, is both time- and labor-intensive. A graphical-processing unit (GPU) computing technology has been proposed and used to accelerate many scientific computations. The objective of this study was to develop a hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization (MCPEM) estimation algorithm for population PK data analysis. A hybrid GPU-CPU implementation of the MCPEM algorithm (MCPEMGPU) and identical algorithm that is designed for the single CPU (MCPEMCPU) were developed using MATLAB in a single computer equipped with dual Xeon 6-Core E5690 CPU and a NVIDIA Tesla C2070 GPU parallel computing card that contained 448 stream processors. Two different PK models with rich/sparse sampling design schemes were used to simulate population data in assessing the performance of MCPEMCPU and MCPEMGPU. Results were analyzed by comparing the parameter estimation and model computation times. Speedup factor was used to assess the relative benefit of parallelized MCPEMGPU over MCPEMCPU in shortening model computation time. The MCPEMGPU consistently achieved shorter computation time than the MCPEMCPU and can offer more than 48-fold speedup using a single GPU card. The novel hybrid GPU-CPU implementation of parallelized MCPEM algorithm developed in this study holds a great promise in serving as the core for the next-generation of modeling software for population PK/PD analysis.

  2. Placebo Response in Repetitive Transcranial Magnetic Stimulation Trials of Treatment of Auditory Hallucinations in Schizophrenia: A Meta-Analysis

    PubMed Central

    Dollfus, Sonia; Lecardeur, Laurent; Morello, Rémy; Etard, Olivier

    2016-01-01

    Several meta-analyses have assessed the response of patients with schizophrenia with auditory verbal hallucinations (AVH) to treatment with repetitive transcranial magnetic stimulation (rTMS); however, the placebo response has never been explored. Typically observed in a therapeutic trial, the placebo effect may have a major influence on the effectiveness of rTMS. The purpose of this meta-analysis is to evaluate the magnitude of the placebo effect observed in controlled studies of rTMS treatment of AVH, and to determine factors that can impact the magnitude of this placebo effect, such as study design considerations and the type of sham used. The study included twenty-one articles concerning 303 patients treated by sham rTMS. A meta-analytic method was applied to obtain a combined, weighted effect size, Hedges’s g. The mean weighted effect size of the placebo effect across these 21 studies was 0.29 (P < .001). Comparison of the parallel and crossover studies revealed distinct results for each study design; placebo has a significant effect size in the 13 parallel studies (g = 0.44, P < 10−4), but not in the 8 crossover studies (g = 0.06, P = .52). In meta-analysis of the 13 parallel studies, the 45° position coil showed the highest effect size. Our results demonstrate that placebo effect should be considered a major source of bias in the assessment of rTMS efficacy. These results fundamentally inform the design of further controlled studies, particularly with respect to studies of rTMS treatment in psychiatry. PMID:26089351

  3. Electromagnetic pulse coupling through an aperture into a two-parallel-plate region

    NASA Technical Reports Server (NTRS)

    Rahmat-Samii, Y.

    1978-01-01

    Analysis of electromagnetic-pulse (EMP) penetration via apertures into cavities is an important study in designing hardened systems. In this paper, an integral equation procedure is developed for determining the frequency and consequently the time behavior of the field inside a two-parallel-plate region excited through an aperture by an EMP. Some discussion of the numerical results is also included in the paper for completeness.

  4. Learning in Parallel: Using Parallel Corpora to Enhance Written Language Acquisition at the Beginning Level

    ERIC Educational Resources Information Center

    Bluemel, Brody

    2014-01-01

    This article illustrates the pedagogical value of incorporating parallel corpora in foreign language education. It explores the development of a Chinese/English parallel corpus designed specifically for pedagogical application. The corpus tool was created to aid language learners in reading comprehension and writing development by making foreign…

  5. Sampling Designs in Qualitative Research: Making the Sampling Process More Public

    ERIC Educational Resources Information Center

    Onwuegbuzie, Anthony J.; Leech, Nancy L.

    2007-01-01

    The purpose of this paper is to provide a typology of sampling designs for qualitative researchers. We introduce the following sampling strategies: (a) parallel sampling designs, which represent a body of sampling strategies that facilitate credible comparisons of two or more different subgroups that are extracted from the same levels of study;…

  6. Portable parallel stochastic optimization for the design of aeropropulsion components

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Rhodes, G. S.

    1994-01-01

    This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as review of portable, parallel programming environments. The first effort was to implement the MSO methodology for a problem using the portable parallel programming language, Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be well-applied towards large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed-Civil Transport, and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.

  7. The effect of earthquake on architecture geometry with non-parallel system irregularity configuration

    NASA Astrophysics Data System (ADS)

    Teddy, Livian; Hardiman, Gagoek; Nuroji; Tudjono, Sri

    2017-12-01

    Indonesia is an area prone to earthquake that may cause casualties and damage to buildings. The fatalities or the injured are not largely caused by the earthquake, but by building collapse. The collapse of the building is resulted from the building behaviour against the earthquake, and it depends on many factors, such as architectural design, geometry configuration of structural elements in horizontal and vertical plans, earthquake zone, geographical location (distance to earthquake center), soil type, material quality, and construction quality. One of the geometry configurations that may lead to the collapse of the building is irregular configuration of non-parallel system. In accordance with FEMA-451B, irregular configuration in non-parallel system is defined to have existed if the vertical lateral force-retaining elements are neither parallel nor symmetric with main orthogonal axes of the earthquake-retaining axis system. Such configuration may lead to torque, diagonal translation and local damage to buildings. It does not mean that non-parallel irregular configuration should not be formed on architectural design; however the designer must know the consequence of earthquake behaviour against buildings with irregular configuration of non-parallel system. The present research has the objective to identify earthquake behaviour in architectural geometry with irregular configuration of non-parallel system. The present research was quantitative with simulation experimental method. It consisted of 5 models, where architectural data and model structure data were inputted and analyzed using the software SAP2000 in order to find out its performance, and ETAB2015 to determine the eccentricity occurred. The output of the software analysis was tabulated, graphed, compared and analyzed with relevant theories. For areas of strong earthquake zones, avoid designing buildings which wholly form irregular configuration of non-parallel system. If it is inevitable to design a building with building parts containing irregular configuration of non-parallel system, make it more rigid by forming a triangle module, and use the formula.A good collaboration is needed between architects and structural experts in creating earthquake architecture.

  8. Parallel computing works

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of manymore » computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.« less

  9. Generation and investigation of terahertz Airy beam realized using parallel-plate waveguides

    NASA Astrophysics Data System (ADS)

    Wu, Mengru; Lang, Tingting; Shi, Guohua; Han, Zhanghua

    2018-03-01

    In this paper, the launching of Airy beam in the terahertz region using waveguiding structures was proposed, designed and numerically characterized. By properly designing the waveguide slit width and the packing number in different sections of parallel-plate waveguides (PPWGs) array, arbitrary phase delay and lateral position-dependent amplitude transmission through the structure, required to realize the target Airy beam profile, can be easily fulfilled. Airy beams working at the frequency of 0.3 THz with good non-diffracting, self-bending, and self-healing features are demonstrated. This study represents a new alternative to scattering-based metasurface structures, and can be utilized in many modern applications.

  10. File concepts for parallel I/O

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1989-01-01

    The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations is proposed, organizations are suggested, using multiple storage devices. Problem areas are also identified and discussed.

  11. Xyce Parallel Electronic Simulator : users' guide, version 2.0.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont

    2004-06-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability the current state-of-the-art in the following areas: {sm_bullet} Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. {sm_bullet} Improved performance for allmore » numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. {sm_bullet} Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. {sm_bullet} A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI). {sm_bullet} Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing of computing platforms. These include serial, shared-memory and distributed-memory parallel implementation - which allows it to run efficiently on the widest possible number parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. To this end, the device package in the Xyce These input formats include standard analytical models, behavioral models look-up Parallel Electronic Simulator is designed to support a variety of device model inputs. tables, and mesh-level PDE device models. Combined with this flexible interface is an architectural design that greatly simplifies the addition of circuit models. One of the most important feature of Xyce is in providing a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia now has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods) research and development can be performed. Ultimately, these capabilities are migrated to end users.« less

  12. Optimization under uncertainty of parallel nonlinear energy sinks

    NASA Astrophysics Data System (ADS)

    Boroson, Ethan; Missoum, Samy; Mattei, Pierre-Olivier; Vergez, Christophe

    2017-04-01

    Nonlinear Energy Sinks (NESs) are a promising technique for passively reducing the amplitude of vibrations. Through nonlinear stiffness properties, a NES is able to passively and irreversibly absorb energy. Unlike the traditional Tuned Mass Damper (TMD), NESs do not require a specific tuning and absorb energy over a wider range of frequencies. Nevertheless, they are still only efficient over a limited range of excitations. In order to mitigate this limitation and maximize the efficiency range, this work investigates the optimization of multiple NESs configured in parallel. It is well known that the efficiency of a NES is extremely sensitive to small perturbations in loading conditions or design parameters. In fact, the efficiency of a NES has been shown to be nearly discontinuous in the neighborhood of its activation threshold. For this reason, uncertainties must be taken into account in the design optimization of NESs. In addition, the discontinuities require a specific treatment during the optimization process. In this work, the objective of the optimization is to maximize the expected value of the efficiency of NESs in parallel. The optimization algorithm is able to tackle design variables with uncertainty (e.g., nonlinear stiffness coefficients) as well as aleatory variables such as the initial velocity of the main system. The optimal design of several parallel NES configurations for maximum mean efficiency is investigated. Specifically, NES nonlinear stiffness properties, considered random design variables, are optimized for cases with 1, 2, 3, 4, 5, and 10 NESs in parallel. The distributions of efficiency for the optimal parallel configurations are compared to distributions of efficiencies of non-optimized NESs. It is observed that the optimization enables a sharp increase in the mean value of efficiency while reducing the corresponding variance, thus leading to more robust NES designs.

  13. Differential Draining of Parallel-Fed Propellant Tanks in Morpheus and Apollo Flight

    NASA Technical Reports Server (NTRS)

    Hurlbert, Eric; Guardado, Hector; Hernandez, Humberto; Desai, Pooja

    2015-01-01

    Parallel-fed propellant tanks are an advantageous configuration for many spacecraft. Parallel-fed tanks allow the center of gravity (cg) to be maintained over the engine(s), as opposed to serial-fed propellant tanks which result in a cg shift as propellants are drained from tank one tank first opposite another. Parallel-fed tanks also allow for tank isolation if that is needed. Parallel tanks and feed systems have been used in several past vehicles including the Apollo Lunar Module. The design of the feedsystem connecting the parallel tank is critical to maintain balance in the propellant tanks. The design must account for and minimize the effect of manufacturing variations that could cause delta-p or mass flowrate differences, which would lead to propellant imbalance. Other sources of differential draining will be discussed. Fortunately, physics provides some self-correcting behaviors that tend to equalize any initial imbalance. The question concerning whether or not active control of propellant in each tank is required or can be avoided or not is also important to answer. In order to provide data on parallel-fed tanks and differential draining in flight for cryogenic propellants (as well as any other fluid), a vertical test bed (flying lander) for terrestrial use was employed. The Morpheus vertical test bed is a parallel-fed propellant tank system that uses passive design to keep the propellant tanks balanced. The system is operated in blow down. The Morpheus vehicle was instrumented with a capacitance level sensor in each propellant tank in order to measure the draining of propellants in over 34 tethered and 12 free flights. Morpheus did experience an approximately 20 lb/m imbalance in one pair of tanks. The cause of this imbalance will be discussed. This paper discusses the analysis, design, flight simulation vehicle dynamic modeling, and flight test of the Morpheus parallel-fed propellant. The Apollo LEM data is also examined in this summary report of the flight data.

  14. National Combustion Code: Parallel Performance

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2001-01-01

    This report discusses the National Combustion Code (NCC). The NCC is an integrated system of codes for the design and analysis of combustion systems. The advanced features of the NCC meet designers' requirements for model accuracy and turn-around time. The fundamental features at the inception of the NCC were parallel processing and unstructured mesh. The design and performance of the NCC are discussed.

  15. Ropes: Support for collective opertions among distributed threads

    NASA Technical Reports Server (NTRS)

    Haines, Matthew; Mehrotra, Piyush; Cronk, David

    1995-01-01

    Lightweight threads are becoming increasingly useful in supporting parallelism and asynchronous control structures in applications and language implementations. Recently, systems have been designed and implemented to support interprocessor communication between lightweight threads so that threads can be exploited in a distributed memory system. Their use, in this setting, has been largely restricted to supporting latency hiding techniques and functional parallelism within a single application. However, to execute data parallel codes independent of other threads in the system, collective operations and relative indexing among threads are required. This paper describes the design of ropes: a scoping mechanism for collective operations and relative indexing among threads. We present the design of ropes in the context of the Chant system, and provide performance results evaluating our initial design decisions.

  16. Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation- aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.« less

  17. A master-slave parallel hybrid multi-objective evolutionary algorithm for groundwater remediation design under general hydrogeological conditions

    NASA Astrophysics Data System (ADS)

    Wu, J.; Yang, Y.; Luo, Q.; Wu, J.

    2012-12-01

    This study presents a new hybrid multi-objective evolutionary algorithm, the niched Pareto tabu search combined with a genetic algorithm (NPTSGA), whereby the global search ability of niched Pareto tabu search (NPTS) is improved by the diversification of candidate solutions arose from the evolving nondominated sorting genetic algorithm II (NSGA-II) population. Also, the NPTSGA coupled with the commonly used groundwater flow and transport codes, MODFLOW and MT3DMS, is developed for multi-objective optimal design of groundwater remediation systems. The proposed methodology is then applied to a large-scale field groundwater remediation system for cleanup of large trichloroethylene (TCE) plume at the Massachusetts Military Reservation (MMR) in Cape Cod, Massachusetts. Furthermore, a master-slave (MS) parallelization scheme based on the Message Passing Interface (MPI) is incorporated into the NPTSGA to implement objective function evaluations in distributed processor environment, which can greatly improve the efficiency of the NPTSGA in finding Pareto-optimal solutions to the real-world application. This study shows that the MS parallel NPTSGA in comparison with the original NPTS and NSGA-II can balance the tradeoff between diversity and optimality of solutions during the search process and is an efficient and effective tool for optimizing the multi-objective design of groundwater remediation systems under complicated hydrogeologic conditions.

  18. Parallel elastic elements improve energy efficiency on the STEPPR bipedal walking robot

    DOE PAGES

    Mazumdar, Anirban; Spencer, Steven J.; Hobart, Clinton; ...

    2016-11-23

    This study describes how parallel elastic elements can be used to reduce energy consumption in the electric motor driven, fully-actuated, STEPPR bipedal walking robot without compromising or significantly limiting locomotive behaviors. A physically motivated approach is used to illustrate how selectively-engaging springs for hip adduction and ankle flexion predict benefits for three different flat ground walking gaits: human walking, human-like robot walking and crouched robot walking. Based on locomotion data, springs are designed and substantial reductions in power consumption are demonstrated using a bench dynamometer. These lessons are then applied to STEPPR (Sandia Transmission-Efficient Prototype Promoting Research), a fully actuatedmore » bipedal robot designed to explore the impact of tailored joint mechanisms on walking efficiency. Featuring high-torque brushless DC motors, efficient low-ratio transmissions, and high fidelity torque control, STEPPR provides the ability to incorporate novel joint-level mechanisms without dramatically altering high level control. Unique parallel elastic designs are incorporated into STEPPR, and walking data shows that hip adduction and ankle flexion springs significantly reduce the required actuator energy at those joints for several gaits. These results suggest that parallel joint springs offer a promising means of supporting quasi-static joint torques due to body mass during walking, relieving motors of the need to support these torques and substantially improving locomotive energy efficiency.« less

  19. Parallel elastic elements improve energy efficiency on the STEPPR bipedal walking robot

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mazumdar, Anirban; Spencer, Steven J.; Hobart, Clinton

    This study describes how parallel elastic elements can be used to reduce energy consumption in the electric motor driven, fully-actuated, STEPPR bipedal walking robot without compromising or significantly limiting locomotive behaviors. A physically motivated approach is used to illustrate how selectively-engaging springs for hip adduction and ankle flexion predict benefits for three different flat ground walking gaits: human walking, human-like robot walking and crouched robot walking. Based on locomotion data, springs are designed and substantial reductions in power consumption are demonstrated using a bench dynamometer. These lessons are then applied to STEPPR (Sandia Transmission-Efficient Prototype Promoting Research), a fully actuatedmore » bipedal robot designed to explore the impact of tailored joint mechanisms on walking efficiency. Featuring high-torque brushless DC motors, efficient low-ratio transmissions, and high fidelity torque control, STEPPR provides the ability to incorporate novel joint-level mechanisms without dramatically altering high level control. Unique parallel elastic designs are incorporated into STEPPR, and walking data shows that hip adduction and ankle flexion springs significantly reduce the required actuator energy at those joints for several gaits. These results suggest that parallel joint springs offer a promising means of supporting quasi-static joint torques due to body mass during walking, relieving motors of the need to support these torques and substantially improving locomotive energy efficiency.« less

  20. Growth: How Much is Too Much? Student Book. Social Studies Module (9th-10th Grade Social Studies). Revised Edition.

    ERIC Educational Resources Information Center

    Georgia Univ., Athens. Coll. of Education.

    This learning module is designed to integrate environmental education into ninth- and tenth-grade social studies courses. The module and a parallel module designed for chemistry classes were pilot tested in Gwinnett County, Georgia in 1975-76. The module is divided into four parts. The first part alerts students to the serious problems that growth…

  1. Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

    2000-01-01

    HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).

  2. Parallelization of Program to Optimize Simulated Trajectories (POST3D)

    NASA Technical Reports Server (NTRS)

    Hammond, Dana P.; Korte, John J. (Technical Monitor)

    2001-01-01

    This paper describes the parallelization of the Program to Optimize Simulated Trajectories (POST3D). POST3D uses a gradient-based optimization algorithm that reaches an optimum design point by moving from one design point to the next. The gradient calculations required to complete the optimization process, dominate the computational time and have been parallelized using a Single Program Multiple Data (SPMD) on a distributed memory NUMA (non-uniform memory access) architecture. The Origin2000 was used for the tests presented.

  3. Options for Parallelizing a Planning and Scheduling Algorithm

    NASA Technical Reports Server (NTRS)

    Clement, Bradley J.; Estlin, Tara A.; Bornstein, Benjamin D.

    2011-01-01

    Space missions have a growing interest in putting multi-core processors onboard spacecraft. For many missions processing power significantly slows operations. We investigate how continual planning and scheduling algorithms can exploit multi-core processing and outline different potential design decisions for a parallelized planning architecture. This organization of choices and challenges helps us with an initial design for parallelizing the CASPER planning system for a mesh multi-core processor. This work extends that presented at another workshop with some preliminary results.

  4. Parallelization of Catalytic Packed-Bed Microchannels with Pressure-Drop Microstructures for Gas-Liquid Multiphase Reactions

    NASA Astrophysics Data System (ADS)

    Murakami, Sunao; Ohtaki, Kenichiro; Matsumoto, Sohei; Inoue, Tomoya

    2012-06-01

    High-throughput and stable treatments are required to achieve the practical production of chemicals with microreactors. However, the flow maldistribution to the paralleled microchannels has been a critical problem in achieving the productive use of multichannel microreactors for multiphase flow conditions. In this study, we newly designed and fabricated a glass four-channel catalytic packed-bed microreactor for the scale-up of gas-liquid multiphase chemical reactions. We embedded microstructures generating high pressure losses at the upstream side of each packed bed, and experimentally confirmed the efficacy of the microstructures in decreasing the maldistribution of the gas-liquid flow to the parallel microchannels.

  5. Progress on the Multiphysics Capabilities of the Parallel Electromagnetic ACE3P Simulation Suite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kononenko, Oleksiy

    2015-03-26

    ACE3P is a 3D parallel simulation suite that is being developed at SLAC National Accelerator Laboratory. Effectively utilizing supercomputer resources, ACE3P has become a key tool for the coupled electromagnetic, thermal and mechanical research and design of particle accelerators. Based on the existing finite-element infrastructure, a massively parallel eigensolver is developed for modal analysis of mechanical structures. It complements a set of the multiphysics tools in ACE3P and, in particular, can be used for the comprehensive study of microphonics in accelerating cavities ensuring the operational reliability of a particle accelerator.

  6. Scalable Unix commands for parallel processors : a high-performance implementation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ong, E.; Lusk, E.; Gropp, W.

    2001-06-22

    We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results on a 256-node Linux cluster. The Parallel Unix Commands are open source and freely available.

  7. Parallel language constructs for tensor product computations on loosely coupled architectures

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Van Rosendale, John

    1989-01-01

    A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.

  8. 77 FR 38840 - Submission for OMB Review; Comment Request: Child Health Disparities Substudy for the National...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-29

    ... needed for a study of this size and complexity, the NCS was designed to include a preliminary pilot study... parallel with the Main Study. At every phase of the NCS, the multiple methodological studies conducted...

  9. Critical appraisal of arguments for the delayed-start design proposed as alternative to the parallel-group randomized clinical trial design in the field of rare disease.

    PubMed

    Spineli, Loukia M; Jenz, Eva; Großhennig, Anika; Koch, Armin

    2017-08-17

    A number of papers have proposed or evaluated the delayed-start design as an alternative to the standard two-arm parallel group randomized clinical trial (RCT) design in the field of rare disease. However the discussion is felt to lack a sufficient degree of consideration devoted to the true virtues of the delayed start design and the implications either in terms of required sample-size, overall information, or interpretation of the estimate in the context of small populations. To evaluate whether there are real advantages of the delayed-start design particularly in terms of overall efficacy and sample size requirements as a proposed alternative to the standard parallel group RCT in the field of rare disease. We used a real-life example to compare the delayed-start design with the standard RCT in terms of sample size requirements. Then, based on three scenarios regarding the development of the treatment effect over time, the advantages, limitations and potential costs of the delayed-start design are discussed. We clarify that delayed-start design is not suitable for drugs that establish an immediate treatment effect, but for drugs with effects developing over time, instead. In addition, the sample size will always increase as an implication for a reduced time on placebo resulting in a decreased treatment effect. A number of papers have repeated well-known arguments to justify the delayed-start design as appropriate alternative to the standard parallel group RCT in the field of rare disease and do not discuss the specific needs of research methodology in this field. The main point is that a limited time on placebo will result in an underestimated treatment effect and, in consequence, in larger sample size requirements compared to those expected under a standard parallel-group design. This also impacts on benefit-risk assessment.

  10. 77 FR 9666 - National Institute of Child Health and Human Development; New Proposed Collection; Comment...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-17

    ... preparation needed for a study of this size and complexity, the NCS was designed to include a preliminary... parallel with the Main Study. At every phase of the NCS, the multiple methodological studies conducted...

  11. Analytical Study on Thermal and Mechanical Design of Printed Circuit Heat Exchanger

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoon, Su-Jong; Sabharwall, Piyush; Kim, Eung-Soo

    2013-09-01

    The analytical methodologies for the thermal design, mechanical design and cost estimation of printed circuit heat exchanger are presented in this study. In this study, three flow arrangements of parallel flow, countercurrent flow and crossflow are taken into account. For each flow arrangement, the analytical solution of temperature profile of heat exchanger is introduced. The size and cost of printed circuit heat exchangers for advanced small modular reactors, which employ various coolants such as sodium, molten salts, helium, and water, are also presented.

  12. Design Method of Digital Optimal Control Scheme and Multiple Paralleled Bridge Type Current Amplifier for Generating Gradient Magnetic Fields in MRI Systems

    NASA Astrophysics Data System (ADS)

    Watanabe, Shuji; Takano, Hiroshi; Fukuda, Hiroya; Hiraki, Eiji; Nakaoka, Mutsuo

    This paper deals with a digital control scheme of multiple paralleled high frequency switching current amplifier with four-quadrant chopper for generating gradient magnetic fields in MRI (Magnetic Resonance Imaging) systems. In order to track high precise current pattern in Gradient Coils (GC), the proposal current amplifier cancels the switching current ripples in GC with each other and designed optimum switching gate pulse patterns without influences of the large filter current ripple amplitude. The optimal control implementation and the linear control theory in GC current amplifiers have affinity to each other with excellent characteristics. The digital control system can be realized easily through the digital control implementation, DSPs or microprocessors. Multiple-parallel operational microprocessors realize two or higher paralleled GC current pattern tracking amplifier with optimal control design and excellent results are given for improving the image quality of MRI systems.

  13. Skewed steel bridges, part ii : cross-frame and connection design to ensure brace effectiveness : technical summary.

    DOT National Transportation Integrated Search

    2017-08-01

    Skewed bridges in Kansas are often designed such that the cross-frames are carried parallel to the skew angle up to 40, while many other states place cross-frames perpendicular to the girder for skew angles greater than 20. Skewed-parallel cross-...

  14. Skewed steel bridges, part ii : cross-frame and connection design to ensure brace effectiveness : final report.

    DOT National Transportation Integrated Search

    2017-08-01

    Skewed bridges in Kansas are often designed such that the cross-frames are carried parallel to the skew angle up to 40, while many other states place cross-frames perpendicular to the girder for skew angles greater than 20. Skewed-parallel cross-...

  15. Design of a switch matrix gate/bulk driver controller for thin film lithium microbatteries using microwave SOI technology

    NASA Technical Reports Server (NTRS)

    Whitacre, J.; West, W. C.; Mojarradi, M.; Sukumar, V.; Hess, H.; Li, H.; Buck, K.; Cox, D.; Alahmad, M.; Zghoul, F. N.; hide

    2003-01-01

    This paper presents a design approach to help attain any random grouping pattern between the microbatteries. In this case, the result is an ability to charge microbatteries in parallel and to discharge microbatteries in parallel or pairs of microbatteries in series.

  16. Simplified Parallel Domain Traversal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Erickson III, David J

    2011-01-01

    Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep bymore » performing teleconnection analysis across ensemble runs of terascale atmospheric CO{sub 2} and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.« less

  17. PARAMESH: A Parallel Adaptive Mesh Refinement Community Toolkit

    NASA Technical Reports Server (NTRS)

    MacNeice, Peter; Olson, Kevin M.; Mobarry, Clark; deFainchtein, Rosalinda; Packer, Charles

    1999-01-01

    In this paper, we describe a community toolkit which is designed to provide parallel support with adaptive mesh capability for a large and important class of computational models, those using structured, logically cartesian meshes. The package of Fortran 90 subroutines, called PARAMESH, is designed to provide an application developer with an easy route to extend an existing serial code which uses a logically cartesian structured mesh into a parallel code with adaptive mesh refinement. Alternatively, in its simplest use, and with minimal effort, it can operate as a domain decomposition tool for users who want to parallelize their serial codes, but who do not wish to use adaptivity. The package can provide them with an incremental evolutionary path for their code, converting it first to uniformly refined parallel code, and then later if they so desire, adding adaptivity.

  18. Optimal Design of Passive Power Filters Based on Pseudo-parallel Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Li, Pei; Li, Hongbo; Gao, Nannan; Niu, Lin; Guo, Liangfeng; Pei, Ying; Zhang, Yanyan; Xu, Minmin; Chen, Kerui

    2017-05-01

    The economic costs together with filter efficiency are taken as targets to optimize the parameter of passive filter. Furthermore, the method of combining pseudo-parallel genetic algorithm with adaptive genetic algorithm is adopted in this paper. In the early stages pseudo-parallel genetic algorithm is introduced to increase the population diversity, and adaptive genetic algorithm is used in the late stages to reduce the workload. At the same time, the migration rate of pseudo-parallel genetic algorithm is improved to change with population diversity adaptively. Simulation results show that the filter designed by the proposed method has better filtering effect with lower economic cost, and can be used in engineering.

  19. Language Classification using N-grams Accelerated by FPGA-based Bloom Filters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jacob, A; Gokhale, M

    N-Gram (n-character sequences in text documents) counting is a well-established technique used in classifying the language of text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom Filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (HAIL algorithm) that uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x comparable software and 1.45x the competing hardware design.

  20. INVITED TOPICAL REVIEW: Parallel magnetic resonance imaging

    NASA Astrophysics Data System (ADS)

    Larkman, David J.; Nunes, Rita G.

    2007-04-01

    Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed.

  1. Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.

    PubMed

    Tao, Liang; Kwan, Hon Keung

    2012-07-01

    Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which make them attractive for real-time image processing.

  2. An Efficient Multiblock Method for Aerodynamic Analysis and Design on Distributed Memory Systems

    NASA Technical Reports Server (NTRS)

    Reuther, James; Alonso, Juan Jose; Vassberg, John C.; Jameson, Antony; Martinelli, Luigi

    1997-01-01

    The work presented in this paper describes the application of a multiblock gridding strategy to the solution of aerodynamic design optimization problems involving complex configurations. The design process is parallelized using the MPI (Message Passing Interface) Standard such that it can be efficiently run on a variety of distributed memory systems ranging from traditional parallel computers to networks of workstations. Substantial improvements to the parallel performance of the baseline method are presented, with particular attention to their impact on the scalability of the program as a function of the mesh size. Drag minimization calculations at a fixed coefficient of lift are presented for a business jet configuration that includes the wing, body, pylon, aft-mounted nacelle, and vertical and horizontal tails. An aerodynamic design optimization is performed with both the Euler and Reynolds Averaged Navier-Stokes (RANS) equations governing the flow solution and the results are compared. These sample calculations establish the feasibility of efficient aerodynamic optimization of complete aircraft configurations using the RANS equations as the flow model. There still exists, however, the need for detailed studies of the importance of a true viscous adjoint method which holds the promise of tackling the minimization of not only the wave and induced components of drag, but also the viscous drag.

  3. A strategy for reducing turnaround time in design optimization using a distributed computer system

    NASA Technical Reports Server (NTRS)

    Young, Katherine C.; Padula, Sharon L.; Rogers, James L.

    1988-01-01

    There is a need to explore methods for reducing lengthly computer turnaround or clock time associated with engineering design problems. Different strategies can be employed to reduce this turnaround time. One strategy is to run validated analysis software on a network of existing smaller computers so that portions of the computation can be done in parallel. This paper focuses on the implementation of this method using two types of problems. The first type is a traditional structural design optimization problem, which is characterized by a simple data flow and a complicated analysis. The second type of problem uses an existing computer program designed to study multilevel optimization techniques. This problem is characterized by complicated data flow and a simple analysis. The paper shows that distributed computing can be a viable means for reducing computational turnaround time for engineering design problems that lend themselves to decomposition. Parallel computing can be accomplished with a minimal cost in terms of hardware and software.

  4. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C; Parson, W; Phillips, C

    2016-07-01

    The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. PISCES: An environment for parallel scientific computation

    NASA Technical Reports Server (NTRS)

    Pratt, T. W.

    1985-01-01

    The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.

  6. Parallelizing Timed Petri Net simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1993-01-01

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.

  7. SNSPD with parallel nanowires (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Ejrnaes, Mikkel; Parlato, Loredana; Gaggero, Alessandro; Mattioli, Francesco; Leoni, Roberto; Pepe, Giampiero; Cristiano, Roberto

    2017-05-01

    Superconducting nanowire single-photon detectors (SNSPDs) have shown to be promising in applications such as quantum communication and computation, quantum optics, imaging, metrology and sensing. They offer the advantages of a low dark count rate, high efficiency, a broadband response, a short time jitter, a high repetition rate, and no need for gated-mode operation. Several SNSPD designs have been proposed in literature. Here, we discuss the so-called parallel nanowires configurations. They were introduced with the aim of improving some SNSPD property like detection efficiency, speed, signal-to-noise ratio, or photon number resolution. Although apparently similar, the various parallel designs are not the same. There is no one design that can improve the mentioned properties all together. In fact, each design presents its own characteristics with specific advantages and drawbacks. In this work, we will discuss the various designs outlining peculiarities and possible improvements.

  8. Algorithms for the Construction of Parallel Tests by Zero-One Programming. Project Psychometric Aspects of Item Banking No. 7. Research Report 86-7.

    ERIC Educational Resources Information Center

    Boekkooi-Timminga, Ellen

    Nine methods for automated test construction are described. All are based on the concepts of information from item response theory. Two general kinds of methods for the construction of parallel tests are presented: (1) sequential test design; and (2) simultaneous test design. Sequential design implies that the tests are constructed one after the…

  9. Design and implementation of a novel modal space active force control concept for spatial multi-DOF parallel robotic manipulators actuated by electrical actuators.

    PubMed

    Yang, Chifu; Zhao, Jinsong; Li, Liyi; Agrawal, Sunil K

    2018-01-01

    Robotic spine brace based on parallel-actuated robotic system is a new device for treatment and sensing of scoliosis, however, the strong dynamic coupling and anisotropy problem of parallel manipulators result in accuracy loss of rehabilitation force control, including big error in direction and value of force. A novel active force control strategy named modal space force control is proposed to solve these problems. Considering the electrical driven system and contact environment, the mathematical model of spatial parallel manipulator is built. The strong dynamic coupling problem in force field is described via experiments as well as the anisotropy problem of work space of parallel manipulators. The effects of dynamic coupling on control design and performances are discussed, and the influences of anisotropy on accuracy are also addressed. With mass/inertia matrix and stiffness matrix of parallel manipulators, a modal matrix can be calculated by using eigenvalue decomposition. Making use of the orthogonality of modal matrix with mass matrix of parallel manipulators, the strong coupled dynamic equations expressed in work space or joint space of parallel manipulator may be transformed into decoupled equations formulated in modal space. According to this property, each force control channel is independent of others in the modal space, thus we proposed modal space force control concept which means the force controller is designed in modal space. A modal space active force control is designed and implemented with only a simple PID controller employed as exampled control method to show the differences, uniqueness, and benefits of modal space force control. Simulation and experimental results show that the proposed modal space force control concept can effectively overcome the effects of the strong dynamic coupling and anisotropy problem in the physical space, and modal space force control is thus a very useful control framework, which is better than the current joint space control and work space control. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  10. Design of object-oriented distributed simulation classes

    NASA Technical Reports Server (NTRS)

    Schoeffler, James D. (Principal Investigator)

    1995-01-01

    Distributed simulation of aircraft engines as part of a computer aided design package is being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for 'Numerical Propulsion Simulation System'. NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT 'Actor' model of a concurrent object and uses 'connectors' to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.

  11. Design of Object-Oriented Distributed Simulation Classes

    NASA Technical Reports Server (NTRS)

    Schoeffler, James D.

    1995-01-01

    Distributed simulation of aircraft engines as part of a computer aided design package being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for "Numerical Propulsion Simulation System". NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT "Actor" model of a concurrent object and uses "connectors" to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.

  12. Efficient partitioning and assignment on programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and the assignment of program elements are of great importance since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of the parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics, developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures with a shared memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for the approximation of the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product form solutions.

  13. VISUAL AND AUDIO PRESENTATION IN MACHINE PROGRAMED INSTRUCTION. FINAL REPORT.

    ERIC Educational Resources Information Center

    ALLEN, WILLIAM H.

    THIS STUDY WAS PART OF A LARGER RESEARCH PROGRAM AIMED TOWARD DEVELOPMENT OF PARADIGMS OF MESSAGE DESIGN. OBJECTIVES OF THREE PARALLEL EXPERIMENTS WERE TO EVALUATE INTERACTIONS OF PRESENTATION MODE, PROGRAM TYPE, AND CONTENT AS THEY AFFECT LEARNER CHARACTERISTICS. EACH EXPERIMENT USED 18 TREATMENTS IN A FACTORIAL DESIGN WITH RANDOMLY SELECTED…

  14. Parallel Algorithms for Switching Edges in Heterogeneous Graphs.

    PubMed

    Bhuiyan, Hasanuzzaman; Khan, Maleq; Chen, Jiangzhuo; Marathe, Madhav

    2017-06-01

    An edge switch is an operation on a graph (or network) where two edges are selected randomly and one of their end vertices are swapped with each other. Edge switch operations have important applications in graph theory and network analysis, such as in generating random networks with a given degree sequence, modeling and analyzing dynamic networks, and in studying various dynamic phenomena over a network. The recent growth of real-world networks motivates the need for efficient parallel algorithms. The dependencies among successive edge switch operations and the requirement to keep the graph simple (i.e., no self-loops or parallel edges) as the edges are switched lead to significant challenges in designing a parallel algorithm. Addressing these challenges requires complex synchronization and communication among the processors leading to difficulties in achieving a good speedup by parallelization. In this paper, we present distributed memory parallel algorithms for switching edges in massive networks. These algorithms provide good speedup and scale well to a large number of processors. A harmonic mean speedup of 73.25 is achieved on eight different networks with 1024 processors. One of the steps in our edge switch algorithms requires the computation of multinomial random variables in parallel. This paper presents the first non-trivial parallel algorithm for the problem, achieving a speedup of 925 using 1024 processors.

  15. Parallel Algorithms for Switching Edges in Heterogeneous Graphs☆

    PubMed Central

    Khan, Maleq; Chen, Jiangzhuo; Marathe, Madhav

    2017-01-01

    An edge switch is an operation on a graph (or network) where two edges are selected randomly and one of their end vertices are swapped with each other. Edge switch operations have important applications in graph theory and network analysis, such as in generating random networks with a given degree sequence, modeling and analyzing dynamic networks, and in studying various dynamic phenomena over a network. The recent growth of real-world networks motivates the need for efficient parallel algorithms. The dependencies among successive edge switch operations and the requirement to keep the graph simple (i.e., no self-loops or parallel edges) as the edges are switched lead to significant challenges in designing a parallel algorithm. Addressing these challenges requires complex synchronization and communication among the processors leading to difficulties in achieving a good speedup by parallelization. In this paper, we present distributed memory parallel algorithms for switching edges in massive networks. These algorithms provide good speedup and scale well to a large number of processors. A harmonic mean speedup of 73.25 is achieved on eight different networks with 1024 processors. One of the steps in our edge switch algorithms requires the computation of multinomial random variables in parallel. This paper presents the first non-trivial parallel algorithm for the problem, achieving a speedup of 925 using 1024 processors. PMID:28757680

  16. Radiofrequency pulse design in parallel transmission under strict temperature constraints.

    PubMed

    Boulant, Nicolas; Massire, Aurélien; Amadon, Alexis; Vignaud, Alexandre

    2014-09-01

    To gain radiofrequency (RF) pulse performance by directly addressing the temperature constraints, as opposed to the specific absorption rate (SAR) constraints, in parallel transmission at ultra-high field. The magnitude least-squares RF pulse design problem under hard SAR constraints was solved repeatedly by using the virtual observation points and an active-set algorithm. The SAR constraints were updated at each iteration based on the result of a thermal simulation. The numerical study was performed for an SAR-demanding and simplified time of flight sequence using B1 and ΔB0 maps obtained in vivo on a human brain at 7T. The proposed adjustment of the SAR constraints combined with an active-set algorithm provided higher flexibility in RF pulse design within a reasonable time. The modifications of those constraints acted directly upon the thermal response as desired. Although further confidence in the thermal models is needed, this study shows that RF pulse design under strict temperature constraints is within reach, allowing better RF pulse performance and faster acquisitions at ultra-high fields at the cost of higher sequence complexity. Copyright © 2013 Wiley Periodicals, Inc.

  17. Students as Game Designers vs. "Just" Players: Comparison of Two Different Approaches to Location-Based Games Implementation into School Curricula

    ERIC Educational Resources Information Center

    Slussareff, Michaela; Bohácková, Petra

    2016-01-01

    This paper compares two kinds of educational treatment within location-based game approach; learning by playing a location-based game and learning by designing a location-based game. Two parallel elementary school classes were included in our study (N = 27; age 14-15). The "designers" class took part in the whole process of game design…

  18. Use of Parallel Micro-Platform for the Simulation the Space Exploration

    NASA Astrophysics Data System (ADS)

    Velasco Herrera, Victor Manuel; Velasco Herrera, Graciela; Rosano, Felipe Lara; Rodriguez Lozano, Salvador; Lucero Roldan Serrato, Karen

    The purpose of this work is to create a parallel micro-platform, that simulates the virtual movements of a space exploration in 3D. One of the innovations presented in this design consists of the application of a lever mechanism for the transmission of the movement. The development of such a robot is a challenging task very different of the industrial manipulators due to a totally different target system of requirements. This work presents the study and simulation, aided by computer, of the movement of this parallel manipulator. The development of this model has been developed using the platform of computer aided design Unigraphics, in which it was done the geometric modeled of each one of the components and end assembly (CAD), the generation of files for the computer aided manufacture (CAM) of each one of the pieces and the kinematics simulation of the system evaluating different driving schemes. We used the toolbox (MATLAB) of aerospace and create an adaptive control module to simulate the system.

  19. A Generic Mesh Data Structure with Parallel Applications

    ERIC Educational Resources Information Center

    Cochran, William Kenneth, Jr.

    2009-01-01

    High performance, massively-parallel multi-physics simulations are built on efficient mesh data structures. Most data structures are designed from the bottom up, focusing on the implementation of linear algebra routines. In this thesis, we explore a top-down approach to design, evaluating the various needs of many aspects of simulation, not just…

  20. Sequential color video to parallel color video converter

    NASA Technical Reports Server (NTRS)

    1975-01-01

    The engineering design, development, breadboard fabrication, test, and delivery of a breadboard field sequential color video to parallel color video converter is described. The converter was designed for use onboard a manned space vehicle to eliminate a flickering TV display picture and to reduce the weight and bulk of previous ground conversion systems.

  1. Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.

  2. Generalized Philosophy of Alerting with Applications for Parallel Approach Collision Prevention

    NASA Technical Reports Server (NTRS)

    Winder, Lee F.; Kuchar, James K.

    2000-01-01

    The goal of the research was to develop formal guidelines for the design of hazard avoidance systems. An alerting system is automation designed to reduce the likelihood of undesirable outcomes that are due to rare failures in a human-controlled system. It accomplishes this by monitoring the system, and issuing warning messages to the human operators when thought necessary to head off a problem. On examination of existing and recently proposed logics for alerting it appears that few commonly accepted principles guide the design process. Different logics intended to address the same hazards may take disparate forms and emphasize different aspects of performance, because each reflects the intuitive priorities of a different designer. Because performance must be satisfactory to all users of an alerting system (implying a universal meaning of acceptable performance) and not just one designer, a proposed logic often undergoes significant piecemeal modification before gamma general acceptance. This report is an initial attempt to clarify the common performance goals by which an alerting system is ultimately judged. A better understanding of these goals will hopefully allow designers to reach the final logic in a quicker, more direct and repeatable manner. As a case study, this report compares three alerting logics for collision prevention during independent approaches to parallel runways, and outlines a fourth alternative incorporating elements of the first three, but satisfying stated requirements. Three existing logics for parallel approach alerting are described. Each follows from different intuitive principles. The logics are presented as examples of three "philosophies" of alerting system design.

  3. Fast parallel algorithm for slicing STL based on pipeline

    NASA Astrophysics Data System (ADS)

    Ma, Xulong; Lin, Feng; Yao, Bo

    2016-05-01

    In Additive Manufacturing field, the current researches of data processing mainly focus on a slicing process of large STL files or complicated CAD models. To improve the efficiency and reduce the slicing time, a parallel algorithm has great advantages. However, traditional algorithms can't make full use of multi-core CPU hardware resources. In the paper, a fast parallel algorithm is presented to speed up data processing. A pipeline mode is adopted to design the parallel algorithm. And the complexity of the pipeline algorithm is analyzed theoretically. To evaluate the performance of the new algorithm, effects of threads number and layers number are investigated by a serial of experiments. The experimental results show that the threads number and layers number are two remarkable factors to the speedup ratio. The tendency of speedup versus threads number reveals a positive relationship which greatly agrees with the Amdahl's law, and the tendency of speedup versus layers number also keeps a positive relationship agreeing with Gustafson's law. The new algorithm uses topological information to compute contours with a parallel method of speedup. Another parallel algorithm based on data parallel is used in experiments to show that pipeline parallel mode is more efficient. A case study at last shows a suspending performance of the new parallel algorithm. Compared with the serial slicing algorithm, the new pipeline parallel algorithm can make full use of the multi-core CPU hardware, accelerate the slicing process, and compared with the data parallel slicing algorithm, the new slicing algorithm in this paper adopts a pipeline parallel model, and a much higher speedup ratio and efficiency is achieved.

  4. The effect of cell design and test criteria on the series/parallel performance of nickel cadmium cells and batteries

    NASA Technical Reports Server (NTRS)

    Halpert, G.; Webb, D. A.

    1983-01-01

    Three batteries were operated in parallel from a common bus during charge and discharge. SMM utilized NASA Standard 20AH cells and batteries, and LANDSAT-D NASA 50AH cells and batteries of a similar design. Each battery consisted of 22 series connected cells providing the nominal 28V bus. The three batteries were charged in parallel using the voltage limit/current taper mode wherein the voltage limit was temperature compensated. Discharge occurred on the demand of the spacecraft instruments and electronics. Both flights were planned for three to five year missions. The series/parallel configuration of cells and batteries for the 3-5 yr mission required a well controlled product with built-in reliability and uniformity. Examples of how component, cell and battery selection methods affect the uniformity of the series/parallel operation of the batteries both in testing and in flight are given.

  5. Research of influence of open-winding faults on properties of brushless permanent magnets motor

    NASA Astrophysics Data System (ADS)

    Bogusz, Piotr; Korkosz, Mariusz; Powrózek, Adam; Prokop, Jan; Wygonik, Piotr

    2017-12-01

    The paper presents an analysis of influence of selected fault states on properties of brushless DC motor with permanent magnets. The subject of study was a BLDC motor designed by the authors for unmanned aerial vehicle hybrid drive. Four parallel branches per each phase were provided in the discussed 3-phase motor. After open-winding fault in single or few parallel branches, a further operation of the motor can be continued. Waveforms of currents, voltages and electromagnetic torque were determined in discussed fault states based on the developed mathematical and simulation models. Laboratory test results concerning an influence of open-windings faults in parallel branches on properties of BLDC motor were presented.

  6. Performance analysis of parallel branch and bound search with the hypercube architecture

    NASA Technical Reports Server (NTRS)

    Mraz, Richard T.

    1987-01-01

    With the availability of commercial parallel computers, researchers are examining new classes of problems which might benefit from parallel computing. This paper presents results of an investigation of the class of search intensive problems. The specific problem discussed is the Least-Cost Branch and Bound search method of deadline job scheduling. The object-oriented design methodology was used to map the problem into a parallel solution. While the initial design was good for a prototype, the best performance resulted from fine-tuning the algorithm for a specific computer. The experiments analyze the computation time, the speed up over a VAX 11/785, and the load balance of the problem when using loosely coupled multiprocessor system based on the hypercube architecture.

  7. Scalable descriptive and correlative statistics with Titan.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thompson, David C.; Pebay, Philippe Pierre

    This report summarizes the existing statistical engines in VTK/Titan and presents the parallel versions thereof which have already been implemented. The ease of use of these parallel engines is illustrated by the means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; then, this theoretical property is verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.

  8. Study of solid rocket motor for a space shuttle booster

    NASA Technical Reports Server (NTRS)

    1972-01-01

    The study of solid rocket motors for a space shuttle booster was directed toward definition of a parallel-burn shuttle booster using two 156-in.-dia solid rocket motors. The study effort was organized into the following major task areas: system studies, preliminary design, program planning, and program costing.

  9. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

  10. Modularized Parallel Neutron Instrument Simulation on the TeraGrid

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Meili; Cobb, John W; Hagen, Mark E

    2007-01-01

    In order to build a bridge between the TeraGrid (TG), a national scale cyberinfrastructure resource, and neutron science, the Neutron Science TeraGrid Gateway (NSTG) is focused on introducing productive HPC usage to the neutron science community, primarily the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). Monte Carlo simulations are used as a powerful tool for instrument design and optimization at SNS. One of the successful efforts of a collaboration team composed of NSTG HPC experts and SNS instrument scientists is the development of a software facility named PSoNI, Parallelizing Simulations of Neutron Instruments. Parallelizing the traditional serialmore » instrument simulation on TeraGrid resources, PSoNI quickly computes full instrument simulation at sufficient statistical levels in instrument de-sign. Upon SNS successful commissioning, to the end of 2007, three out of five commissioned instruments in SNS target station will be available for initial users. Advanced instrument study, proposal feasibility evalua-tion, and experiment planning are on the immediate schedule of SNS, which pose further requirements such as flexibility and high runtime efficiency on fast instrument simulation. PSoNI has been redesigned to meet the new challenges and a preliminary version is developed on TeraGrid. This paper explores the motivation and goals of the new design, and the improved software structure. Further, it describes the realized new fea-tures seen from MPI parallelized McStas running high resolution design simulations of the SEQUOIA and BSS instruments at SNS. A discussion regarding future work, which is targeted to do fast simulation for automated experiment adjustment and comparing models to data in analysis, is also presented.« less

  11. MARBLE: A system for executing expert systems in parallel

    NASA Technical Reports Server (NTRS)

    Myers, Leonard; Johnson, Coe; Johnson, Dean

    1990-01-01

    This paper details the MARBLE 2.0 system which provides a parallel environment for cooperating expert systems. The work has been done in conjunction with the development of an intelligent computer-aided design system, ICADS, by the CAD Research Unit of the Design Institute at California Polytechnic State University. MARBLE (Multiple Accessed Rete Blackboard Linked Experts) is a system of C Language Production Systems (CLIPS) expert system tool. A copied blackboard is used for communication between the shells to establish an architecture which supports cooperating expert systems that execute in parallel. The design of MARBLE is simple, but it provides support for a rich variety of configurations, while making it relatively easy to demonstrate the correctness of its parallel execution features. In its most elementary configuration, individual CLIPS expert systems execute on their own processors and communicate with each other through a modified blackboard. Control of the system as a whole, and specifically of writing to the blackboard is provided by one of the CLIPS expert systems, an expert control system.

  12. A Systems Approach to Scalable Transportation Network Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perumalla, Kalyan S

    2006-01-01

    Emerging needs in transportation network modeling and simulation are raising new challenges with respect to scal-ability of network size and vehicular traffic intensity, speed of simulation for simulation-based optimization, and fidel-ity of vehicular behavior for accurate capture of event phe-nomena. Parallel execution is warranted to sustain the re-quired detail, size and speed. However, few parallel simulators exist for such applications, partly due to the challenges underlying their development. Moreover, many simulators are based on time-stepped models, which can be computationally inefficient for the purposes of modeling evacuation traffic. Here an approach is presented to de-signing a simulator with memory andmore » speed efficiency as the goals from the outset, and, specifically, scalability via parallel execution. The design makes use of discrete event modeling techniques as well as parallel simulation meth-ods. Our simulator, called SCATTER, is being developed, incorporating such design considerations. Preliminary per-formance results are presented on benchmark road net-works, showing scalability to one million vehicles simu-lated on one processor.« less

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, C.; Yu, G.; Wang, K.

    The physical designs of the new concept reactors which have complex structure, various materials and neutronic energy spectrum, have greatly improved the requirements to the calculation methods and the corresponding computing hardware. Along with the widely used parallel algorithm, heterogeneous platforms architecture has been introduced into numerical computations in reactor physics. Because of the natural parallel characteristics, the CPU-FPGA architecture is often used to accelerate numerical computation. This paper studies the application and features of this kind of heterogeneous platforms used in numerical calculation of reactor physics through practical examples. After the designed neutron diffusion module based on CPU-FPGA architecturemore » achieves a 11.2 speed up factor, it is proved to be feasible to apply this kind of heterogeneous platform into reactor physics. (authors)« less

  14. Dynamic file-access characteristics of a production parallel scientific workload

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1994-01-01

    Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.

  15. Developing Information Power Grid Based Algorithms and Software

    NASA Technical Reports Server (NTRS)

    Dongarra, Jack

    1998-01-01

    This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.

  16. Rotor systems research aircraft predesign study. Volume 3: Predesign report

    NASA Technical Reports Server (NTRS)

    Schmidt, S. A.; Linden, A. W.

    1972-01-01

    The features of two aircraft designs were selected to be included in the single RSRA configuration. A study was conducted for further preliminary design and a more detailed analysis of development plans and costs. An analysis was also made of foreseeable technical problems and risks, identification of parallel research which would reduce risks and/or add to the basic capability of the aircraft, and a draft aircraft specification.

  17. Multiprocessor graphics computation and display using transputers

    NASA Technical Reports Server (NTRS)

    Ellis, Graham K.

    1988-01-01

    A package of two-dimensional graphics routines was developed to run on a transputer-based parallel processing system. These routines were designed to enable applications programmers to easily generate and display results from the transputer network in a graphic format. The graphics procedures were designed for the lowest possible network communication overhead for increased performance. The routines were designed for ease of use and to present an intuitive approach to generating graphics on the transputer parallel processing system.

  18. Architecture and design of a 500-MHz gallium-arsenide processing element for a parallel supercomputer

    NASA Technical Reports Server (NTRS)

    Fouts, Douglas J.; Butner, Steven E.

    1991-01-01

    The design of the processing element of GASP, a GaAs supercomputer with a 500-MHz instruction issue rate and 1-GHz subsystem clocks, is presented. The novel, functionally modular, block data flow architecture of GASP is described. The architecture and design of a GASP processing element is then presented. The processing element (PE) is implemented in a hybrid semiconductor module with 152 custom GaAs ICs of eight different types. The effects of the implementation technology on both the system-level architecture and the PE design are discussed. SPICE simulations indicate that parts of the PE are capable of being clocked at 1 GHz, while the rest of the PE uses a 500-MHz clock. The architecture utilizes data flow techniques at a program block level, which allows efficient execution of parallel programs while maintaining reasonably good performance on sequential programs. A simulation study of the architecture indicates that an instruction execution rate of over 30,000 MIPS can be attained with 65 PEs.

  19. Evaluation of peristaltic micromixers for highly integrated microfluidic systems

    PubMed Central

    Kim, Duckjong; Rho, Hoon Suk; Jambovane, Sachin; Shin, Soojeong; Hong, Jong Wook

    2016-01-01

    Microfluidic devices based on the multilayer soft lithography allow accurate manipulation of liquids, handling reagents at the sub-nanoliter level, and performing multiple reactions in parallel processors by adapting micromixers. Here, we have experimentally evaluated and compared several designs of micromixers and operating conditions to find design guidelines for the micromixers. We tested circular, triangular, and rectangular mixing loops and measured mixing performance according to the position and the width of the valves that drive nanoliters of fluids in the micrometer scale mixing loop. We found that the rectangular mixer is best for the applications of highly integrated microfluidic platforms in terms of the mixing performance and the space utilization. This study provides an improved understanding of the flow behaviors inside micromixers and design guidelines for micromixers that are critical to build higher order fluidic systems for the complicated parallel bio/chemical processes on a chip. PMID:27036809

  20. Evaluation of peristaltic micromixers for highly integrated microfluidic systems

    NASA Astrophysics Data System (ADS)

    Kim, Duckjong; Rho, Hoon Suk; Jambovane, Sachin; Shin, Soojeong; Hong, Jong Wook

    2016-03-01

    Microfluidic devices based on the multilayer soft lithography allow accurate manipulation of liquids, handling reagents at the sub-nanoliter level, and performing multiple reactions in parallel processors by adapting micromixers. Here, we have experimentally evaluated and compared several designs of micromixers and operating conditions to find design guidelines for the micromixers. We tested circular, triangular, and rectangular mixing loops and measured mixing performance according to the position and the width of the valves that drive nanoliters of fluids in the micrometer scale mixing loop. We found that the rectangular mixer is best for the applications of highly integrated microfluidic platforms in terms of the mixing performance and the space utilization. This study provides an improved understanding of the flow behaviors inside micromixers and design guidelines for micromixers that are critical to build higher order fluidic systems for the complicated parallel bio/chemical processes on a chip.

  1. On the impact of communication complexity in the design of parallel numerical algorithms

    NASA Technical Reports Server (NTRS)

    Gannon, D.; Vanrosendale, J.

    1984-01-01

    This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation.

  2. On the impact of communication complexity on the design of parallel numerical algorithms

    NASA Technical Reports Server (NTRS)

    Gannon, D. B.; Van Rosendale, J.

    1984-01-01

    This paper describes two models of the cost of data movement in parallel numerical alorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In this second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm-independent upper bounds on system performance are derived for several problems that are important to scientific computation.

  3. Design of a MIMD neural network processor

    NASA Astrophysics Data System (ADS)

    Saeks, Richard E.; Priddy, Kevin L.; Pap, Robert M.; Stowell, S.

    1994-03-01

    The Accurate Automation Corporation (AAC) neural network processor (NNP) module is a fully programmable multiple instruction multiple data (MIMD) parallel processor optimized for the implementation of neural networks. The AAC NNP design fully exploits the intrinsic sparseness of neural network topologies. Moreover, by using a MIMD parallel processing architecture one can update multiple neurons in parallel with efficiency approaching 100 percent as the size of the network increases. Each AAC NNP module has 8 K neurons and 32 K interconnections and is capable of 140,000,000 connections per second with an eight processor array capable of over one billion connections per second.

  4. Methods for design and evaluation of parallel computating systems (The PISCES project)

    NASA Technical Reports Server (NTRS)

    Pratt, Terrence W.; Wise, Robert; Haught, Mary JO

    1989-01-01

    The PISCES project started in 1984 under the sponsorship of the NASA Computational Structural Mechanics (CSM) program. A PISCES 1 programming environment and parallel FORTRAN were implemented in 1984 for the DEC VAX (using UNIX processes to simulate parallel processes). This system was used for experimentation with parallel programs for scientific applications and AI (dynamic scene analysis) applications. PISCES 1 was ported to a network of Apollo workstations by N. Fitzgerald.

  5. Design of an Input-Parallel Output-Parallel LLC Resonant DC-DC Converter System for DC Microgrids

    NASA Astrophysics Data System (ADS)

    Juan, Y. L.; Chen, T. R.; Chang, H. M.; Wei, S. E.

    2017-11-01

    Compared with the centralized power system, the distributed modularized power system is composed of several power modules with lower power capacity to provide a totally enough power capacity for the load demand. Therefore, the current stress of the power components in each module can then be reduced, and the flexibility of system setup is also enhanced. However, the parallel-connected power modules in the conventional system are usually controlled to equally share the power flow which would result in lower efficiency in low loading condition. In this study, a modular power conversion system for DC micro grid is developed with 48 V dc low voltage input and 380 V dc high voltage output. However, in the developed system control strategy, the numbers of power modules enabled to share the power flow is decided according to the output power at lower load demand. Finally, three 350 W power modules are constructed and parallel-connected to setup a modular power conversion system. From the experimental results, compared with the conventional system, the efficiency of the developed power system in the light loading condition is greatly improved. The modularized design of the power system can also decrease the power loss ratio to the system capacity.

  6. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. The interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley's file structure and application interface, as well as an application that has been implemented using that interface.

  7. Parallel transmission RF pulse design with strict temperature constraints.

    PubMed

    Deniz, Cem M; Carluccio, Giuseppe; Collins, Christopher

    2017-05-01

    RF safety in parallel transmission (pTx) is generally ensured by imposing specific absorption rate (SAR) limits during pTx RF pulse design. There is increasing interest in using temperature to ensure safety in MRI. In this work, we present a local temperature correlation matrix formalism and apply it to impose strict constraints on maximum absolute temperature in pTx RF pulse design for head and hip regions. Electromagnetic field simulations were performed on the head and hip of virtual body models. Temperature correlation matrices were calculated for four different exposure durations ranging between 6 and 24 min using simulated fields and body-specific constants. Parallel transmission RF pulses were designed using either SAR or temperature constraints, and compared with each other and unconstrained RF pulse design in terms of excitation fidelity and safety. The use of temperature correlation matrices resulted in better excitation fidelity compared with the use of SAR in parallel transmission RF pulse design (for the 6 min exposure period, 8.8% versus 21.0% for the head and 28.0% versus 32.2% for the hip region). As RF exposure duration increases (from 6 min to 24 min), the benefit of using temperature correlation matrices on RF pulse design diminishes. However, the safety of the subject is always guaranteed (the maximum temperature was equal to 39°C). This trend was observed in both head and hip regions, where the perfusion rates are very different. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Aerostructural analysis and design optimization of composite aircraft

    NASA Astrophysics Data System (ADS)

    Kennedy, Graeme James

    High-performance composite materials exhibit both anisotropic strength and stiffness properties. These anisotropic properties can be used to produce highly-tailored aircraft structures that meet stringent performance requirements, but these properties also present unique challenges for analysis and design. New tools and techniques are developed to address some of these important challenges. A homogenization-based theory for beams is developed to accurately predict the through-thickness stress and strain distribution in thick composite beams. Numerical comparisons demonstrate that the proposed beam theory can be used to obtain highly accurate results in up to three orders of magnitude less computational time than three-dimensional calculations. Due to the large finite-element model requirements for thin composite structures used in aerospace applications, parallel solution methods are explored. A parallel direct Schur factorization method is developed. The parallel scalability of the direct Schur approach is demonstrated for a large finite-element problem with over 5 million unknowns. In order to address manufacturing design requirements, a novel laminate parametrization technique is presented that takes into account the discrete nature of the ply-angle variables, and ply-contiguity constraints. This parametrization technique is demonstrated on a series of structural optimization problems including compliance minimization of a plate, buckling design of a stiffened panel and layup design of a full aircraft wing. The design and analysis of composite structures for aircraft is not a stand-alone problem and cannot be performed without multidisciplinary considerations. A gradient-based aerostructural design optimization framework is presented that partitions the disciplines into distinct process groups. An approximate Newton-Krylov method is shown to be an efficient aerostructural solution algorithm and excellent parallel scalability of the algorithm is demonstrated. An induced drag optimization study is performed to compare the trade-off between wing weight and induced drag for wing tip extensions, raked wing tips and winglets. The results demonstrate that it is possible to achieve a 43% induced drag reduction with no weight penalty, a 28% induced drag reduction with a 10% wing weight reduction, or a 20% wing weight reduction with a 5% induced drag penalty from a baseline wing obtained from a structural mass-minimization problem with fixed aerodynamic loads.

  9. State-Based Curriculum-Making: The Illinois Learning Standards

    ERIC Educational Resources Information Center

    Westbury, Ian

    2016-01-01

    This case study of the development of the "Illinois Learning Standards" of 1997 parallels a study of the development of the Norwegian compulsory school curriculum of 1997, "Laereplanverket 1997." The pair of case studies is designed to explore the administration of state-based curriculum-making and, in particular, the use of…

  10. Optimal resonance configuration for ultrasonic wireless power transmission to millimeter-sized biomedical implants.

    PubMed

    Miao Meng; Kiani, Mehdi

    2016-08-01

    In order to achieve efficient wireless power transmission (WPT) to biomedical implants with millimeter (mm) dimensions, ultrasonic WPT links have recently been proposed. Operating both transmitter (Tx) and receiver (Rx) ultrasonic transducers at their resonance frequency (fr) is key in improving power transmission efficiency (PTE). In this paper, different resonance configurations for Tx and Rx transducers, including series and parallel resonance, have been studied to help the designers of ultrasonic WPT links to choose the optimal resonance configuration for Tx and Rx that maximizes PTE. The geometries for disk-shaped transducers of four different sets of links, operating at series-series, series-parallel, parallel-series, and parallel-parallel resonance configurations in Tx and Rx, have been found through finite-element method (FEM) simulation tools for operation at fr of 1.4 MHz. Our simulation results suggest that operating the Tx transducer with parallel resonance increases PTE, while the resonance configuration of the mm-sized Rx transducer highly depends on the load resistance, Rl. For applications that involve large Rl in the order of tens of kΩ, a parallel resonance for a mm-sized Rx leads to higher PTE, while series resonance is preferred for Rl in the order of several kΩ and below.

  11. Exploration of a Permanent Magnet Synchronous Generator with Compensated Reactance Windings in Parallel Rod Configuration

    NASA Astrophysics Data System (ADS)

    Lyan, Oleg; Jankunas, Valdas; Guseinoviene, Eleonora; Pašilis, Aleksas; Senulis, Audrius; Knolis, Audrius; Kurt, Erol

    2018-02-01

    In this study, a permanent magnet synchronous generator (PMSG) topology with compensated reactance windings in parallel rod configuration is proposed to reduce the armature reactance X L and to achieve higher efficiency of PMSG. The PMSG was designed using iron-cored bifilar coil topology to overcome problems of market-dominant rotary type generators. Often the problem is a comparatively high armature reactance X L, which is usually bigger than armature resistance R a. Therefore, the topology is proposed to partially compensate or negligibly reduce the PMSG reactance. The study was performed by using finite element method (FEM) analysis and experimental investigation. FEM analysis was used to investigate magnetic field flux distribution and density in PMSG. The PMSG experimental analyses of no-load losses and electromotive force versus frequency (i.e., speed) was performed. Also terminal voltage, power output and efficiency relation with load current at different frequencies have been evaluated. The reactance of PMSG has low value and a linear relation with operating frequency. The low reactance gives a small variation of efficiency (from 90% to 95%) in a wide range of load (from 3 A to 10 A) and operation frequency (from 44 Hz to 114 Hz). The comparison of PMSG characteristics with parallel and series winding connection showed insignificant power variation. The research results showed that compensated reactance winding in parallel rod configuration in PMSG design provides lower reactance and therefore, higher efficiency under wider load and frequency variation.

  12. P-Hint-Hunt: a deep parallelized whole genome DNA methylation detection tool.

    PubMed

    Peng, Shaoliang; Yang, Shunyun; Gao, Ming; Liao, Xiangke; Liu, Jie; Yang, Canqun; Wu, Chengkun; Yu, Wenqiang

    2017-03-14

    The increasing studies have been conducted using whole genome DNA methylation detection as one of the most important part of epigenetics research to find the significant relationships among DNA methylation and several typical diseases, such as cancers and diabetes. In many of those studies, mapping the bisulfite treated sequence to the whole genome has been the main method to study DNA cytosine methylation. However, today's relative tools almost suffer from inaccuracies and time-consuming problems. In our study, we designed a new DNA methylation prediction tool ("Hint-Hunt") to solve the problem. By having an optimal complex alignment computation and Smith-Waterman matrix dynamic programming, Hint-Hunt could analyze and predict the DNA methylation status. But when Hint-Hunt tried to predict DNA methylation status with large-scale dataset, there are still slow speed and low temporal-spatial efficiency problems. In order to solve the problems of Smith-Waterman dynamic programming and low temporal-spatial efficiency, we further design a deep parallelized whole genome DNA methylation detection tool ("P-Hint-Hunt") on Tianhe-2 (TH-2) supercomputer. To the best of our knowledge, P-Hint-Hunt is the first parallel DNA methylation detection tool with a high speed-up to process large-scale dataset, and could run both on CPU and Intel Xeon Phi coprocessors. Moreover, we deploy and evaluate Hint-Hunt and P-Hint-Hunt on TH-2 supercomputer in different scales. The experimental results illuminate our tools eliminate the deviation caused by bisulfite treatment in mapping procedure and the multi-level parallel program yields a 48 times speed-up with 64 threads. P-Hint-Hunt gain a deep acceleration on CPU and Intel Xeon Phi heterogeneous platform, which gives full play of the advantages of multi-cores (CPU) and many-cores (Phi).

  13. A multi-component parallel-plate flow chamber system for studying the effect of exercise-induced wall shear stress on endothelial cells.

    PubMed

    Wang, Yan-Xia; Xiang, Cheng; Liu, Bo; Zhu, Yong; Luan, Yong; Liu, Shu-Tian; Qin, Kai-Rong

    2016-12-28

    In vivo studies have demonstrated that reasonable exercise training can improve endothelial function. To confirm the key role of wall shear stress induced by exercise on endothelial cells, and to understand how wall shear stress affects the structure and the function of endothelial cells, it is crucial to design and fabricate an in vitro multi-component parallel-plate flow chamber system which can closely replicate exercise-induced wall shear stress waveforms in artery. The in vivo wall shear stress waveforms from the common carotid artery of a healthy volunteer in resting and immediately after 30 min acute aerobic cycling exercise were first calculated by measuring the inner diameter and the center-line blood flow velocity with a color Doppler ultrasound. According to the above in vivo wall shear stress waveforms, we designed and fabricated a parallel-plate flow chamber system with appropriate components based on a lumped parameter hemodynamics model. To validate the feasibility of this system, human umbilical vein endothelial cells (HUVECs) line were cultured within the parallel-plate flow chamber under abovementioned two types of wall shear stress waveforms and the intracellular actin microfilaments and nitric oxide (NO) production level were evaluated using fluorescence microscope. Our results show that the trends of resting and exercise-induced wall shear stress waveforms, especially the maximal, minimal and mean wall shear stress as well as oscillatory shear index, generated by the parallel-plate flow chamber system are similar to those acquired from the common carotid artery. In addition, the cellular experiments demonstrate that the actin microfilaments and the production of NO within cells exposed to the two different wall shear stress waveforms exhibit different dynamic behaviors; there are larger numbers of actin microfilaments and higher level NO in cells exposed in exercise-induced wall shear stress condition than resting wall shear stress condition. The parallel-plate flow chamber system can well reproduce wall shear stress waveforms acquired from the common carotid artery in resting and immediately after exercise states. Furthermore, it can be used for studying the endothelial cells responses under resting and exercise-induced wall shear stress environments in vitro.

  14. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    NASA Technical Reports Server (NTRS)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180, Re(sub tau) = 180, has shown good agreement with existing data.

  15. Stage-by-Stage and Parallel Flow Path Compressor Modeling for a Variable Cycle Engine

    NASA Technical Reports Server (NTRS)

    Kopasakis, George; Connolly, Joseph W.; Cheng, Larry

    2015-01-01

    This paper covers the development of stage-by-stage and parallel flow path compressor modeling approaches for a Variable Cycle Engine. The stage-by-stage compressor modeling approach is an extension of a technique for lumped volume dynamics and performance characteristic modeling. It was developed to improve the accuracy of axial compressor dynamics over lumped volume dynamics modeling. The stage-by-stage compressor model presented here is formulated into a parallel flow path model that includes both axial and rotational dynamics. This is done to enable the study of compressor and propulsion system dynamic performance under flow distortion conditions. The approaches utilized here are generic and should be applicable for the modeling of any axial flow compressor design.

  16. Maximal clique enumeration with data-parallel primitives

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lessley, Brenton; Perciano, Talita; Mathai, Manish

    The enumeration of all maximal cliques in an undirected graph is a fundamental problem arising in several research areas. We consider maximal clique enumeration on shared-memory, multi-core architectures and introduce an approach consisting entirely of data-parallel operations, in an effort to achieve efficient and portable performance across different architectures. We study the performance of the algorithm via experiments varying over benchmark graphs and architectures. Overall, we observe that our algorithm achieves up to a 33-time speedup and 9-time speedup over state-of-the-art distributed and serial algorithms, respectively, for graphs with higher ratios of maximal cliques to total cliques. Further, we attainmore » additional speedups on a GPU architecture, demonstrating the portable performance of our data-parallel design.« less

  17. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    NASA Astrophysics Data System (ADS)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  18. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-12-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.

  19. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    PubMed Central

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520

  20. Flat-plate photovoltaic array design optimization

    NASA Technical Reports Server (NTRS)

    Ross, R. G., Jr.

    1980-01-01

    An analysis is presented which integrates the results of specific studies in the areas of photovoltaic structural design optimization, optimization of array series/parallel circuit design, thermal design optimization, and optimization of environmental protection features. The analysis is based on minimizing the total photovoltaic system life-cycle energy cost including repair and replacement of failed cells and modules. This approach is shown to be a useful technique for array optimization, particularly when time-dependent parameters such as array degradation and maintenance are involved.

  1. Parallel multipoint recording of aligned and cultured neurons on corresponding Micro Channel Array toward on-chip cell analysis.

    PubMed

    Tonomura, W; Moriguchi, H; Jimbo, Y; Konishi, S

    2008-01-01

    This paper describes an advanced Micro Channel Array (MCA) so as to record neuronal network at multiple points simultaneously. Developed MCA is designed for neuronal network analysis which has been studied by co-authors using MEA (Micro Electrode Arrays) system. The MCA employs the principle of the extracellular recording. Presented MCA has the following advantages. First of all, the electrodes integrated around individual micro channels are electrically isolated for parallel multipoint recording. Sucking and clamping of cells through micro channels is expected to improve the cellular selectivity and S/N ratio. In this study, hippocampal neurons were cultured on the developed MCA. As a result, the spontaneous and evoked spike potential could be recorded by sucking and clamping the cells at multiple points. Herein, we describe the successful experimental results together with the design and fabrication of the advanced MCA toward on-chip analysis of neuronal network.

  2. The engine design engine. A clustered computer platform for the aerodynamic inverse design and analysis of a full engine

    NASA Technical Reports Server (NTRS)

    Sanz, J.; Pischel, K.; Hubler, D.

    1992-01-01

    An application for parallel computation on a combined cluster of powerful workstations and supercomputers was developed. A Parallel Virtual Machine (PVM) is used as message passage language on a macro-tasking parallelization of the Aerodynamic Inverse Design and Analysis for a Full Engine computer code. The heterogeneous nature of the cluster is perfectly handled by the controlling host machine. Communication is established via Ethernet with the TCP/IP protocol over an open network. A reasonable overhead is imposed for internode communication, rendering an efficient utilization of the engaged processors. Perhaps one of the most interesting features of the system is its versatile nature, that permits the usage of the computational resources available that are experiencing less use at a given point in time.

  3. Parallel-Vector Algorithm For Rapid Structural Anlysis

    NASA Technical Reports Server (NTRS)

    Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

    1993-01-01

    New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.

  4. DCL System Using Deep Learning Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals

    DTIC Science & Technology

    2012-09-30

    platform (HPC) was developed, called the HPC-Acoustic Data Accelerator, or HPC-ADA for short. The HPC-ADA was designed based on fielded systems [1-4...software (Detection cLassificaiton for MAchine learning - High Peformance Computing). The software package was designed to utilize parallel and...Sedna [7] and is designed using a parallel architecture2, allowing existing algorithms to distribute to the various processing nodes with minimal changes

  5. Design of miniature type parallel coupled microstrip hairpin filter in UHF range

    NASA Astrophysics Data System (ADS)

    Hasan, Adib Belhaj; Rahman, Maj Tarikur; Kahhar, Azizul; Trina, Tasnim; Saha, Pran Kanai

    2017-12-01

    A microstrip parallel coupled line bandpass filter is designed in UHF range and the filter size is reduced by microstrip hairpin structure. The FR4 substrate is used as base material of the filter. The filter is analyzed by both ADS and CST design studio in the frequency range of 500 MHz to 650 MHz. The Bandwidth is found 13.27% with a center frequency 570 MHz. Simulation from both ADS and CST shows a very good agreement of performance of the filter.

  6. Xyce Parallel Electronic Simulator Users' Guide Version 6.7.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel com- puting platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one tomore » develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation- aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright c 2002-2017 Sandia Corporation. All rights reserved. Trademarks Xyce TM Electronic Simulator and Xyce TM are trademarks of Sandia Corporation. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. All other trademarks are property of their respective owners. Contacts World Wide Web http://xyce.sandia.gov https://info.sandia.gov/xyce (Sandia only) Email xyce@sandia.gov (outside Sandia) xyce-sandia@sandia.gov (Sandia only) Bug Reports (Sandia only) http://joseki-vm.sandia.gov/bugzilla http://morannon.sandia.gov/bugzilla« less

  7. Veteran Teacher Engagement in Site-Based Professional Development: A Mixed Methods Study

    ERIC Educational Resources Information Center

    Houston, Biaze L.

    2016-01-01

    This research study examined how teachers self-report their levels of engagement, which factors they believe contribute most to their engagement, and which assumptions of andragogy most heavily influence teacher engagement in site-based professional development. This study employed a convergent parallel mixed methods design to study veteran…

  8. Xyce Parallel Electronic Simulator Users' Guide Version 6.8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been de- signed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel com- puting platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows onemore » to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation- aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase$-$ a message passing parallel implementation $-$ which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less

  9. Multirate parallel distributed compensation of a cluster in wireless sensor and actor networks

    NASA Astrophysics Data System (ADS)

    Yang, Chun-xi; Huang, Ling-yun; Zhang, Hao; Hua, Wang

    2016-01-01

    The stabilisation problem for one of the clusters with bounded multiple random time delays and packet dropouts in wireless sensor and actor networks is investigated in this paper. A new multirate switching model is constructed to describe the feature of this single input multiple output linear system. According to the difficulty of controller design under multi-constraints in multirate switching model, this model can be converted to a Takagi-Sugeno fuzzy model. By designing a multirate parallel distributed compensation, a sufficient condition is established to ensure this closed-loop fuzzy control system to be globally exponentially stable. The solution of the multirate parallel distributed compensation gains can be obtained by solving an auxiliary convex optimisation problem. Finally, two numerical examples are given to show, compared with solving switching controller, multirate parallel distributed compensation can be obtained easily. Furthermore, it has stronger robust stability than arbitrary switching controller and single-rate parallel distributed compensation under the same conditions.

  10. Development and evaluation of a study design typology for human research.

    PubMed

    Carini, Simona; Pollock, Brad H; Lehmann, Harold P; Bakken, Suzanne; Barbour, Edward M; Gabriel, Davera; Hagler, Herbert K; Harper, Caryn R; Mollah, Shamim A; Nahm, Meredith; Nguyen, Hien H; Scheuermann, Richard H; Sim, Ida

    2009-11-14

    A systematic classification of study designs would be useful for researchers, systematic reviewers, readers, and research administrators, among others. As part of the Human Studies Database Project, we developed the Study Design Typology to standardize the classification of study designs in human research. We then performed a multiple observer masked evaluation of active research protocols in four institutions according to a standardized protocol. Thirty-five protocols were classified by three reviewers each into one of nine high-level study designs for interventional and observational research (e.g., N-of-1, Parallel Group, Case Crossover). Rater classification agreement was moderately high for the 35 protocols (Fleiss' kappa = 0.442) and higher still for the 23 quantitative studies (Fleiss' kappa = 0.463). We conclude that our typology shows initial promise for reliably distinguishing study design types for quantitative human research.

  11. Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.

    PubMed

    Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill

    2016-01-01

    Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of data analysis techniques and statistical power of these studies is currently lacking and is imperative for enabling sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60s (from 1h predose to 24h post-dose) and reduced to 15min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated measure analysis of variance was used for statistical analysis of crossover studies and a repeated measure analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100mg/kg are reported as a representative example of data analysis methods. Phentolamine produced a CV profile characteristic of alpha adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7mmHg. Detectable heart rate changes for both study designs were 20-22bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early stage CV studies. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. A novel visual hardware behavioral language

    NASA Technical Reports Server (NTRS)

    Li, Xueqin; Cheng, H. D.

    1992-01-01

    Most hardware behavioral languages just use texts to describe the behavior of the desired hardware design. This is inconvenient for VLSI designers who enjoy using the schematic approach. The proposed visual hardware behavioral language has the ability to graphically express design information using visual parallel models (blocks), visual sequential models (processes) and visual data flow graphs (which consist of primitive operational icons, control icons, and Data and Synchro links). Thus, the proposed visual hardware behavioral language can not only specify hardware concurrent and sequential functionality, but can also visually expose parallelism, sequentiality, and disjointness (mutually exclusive operations) for the hardware designers. That would make the hardware designers capture the design ideas easily and explicitly using this visual hardware behavioral language.

  13. Accelerating large-scale protein structure alignments with graphics processing units

    PubMed Central

    2012-01-01

    Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132

  14. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.

  15. Design and analysis of all-dielectric broadband nonpolarizing parallel-plate beam splitters.

    PubMed

    Wang, Wenliang; Xiong, Shengming; Zhang, Yundong

    2007-06-01

    Past research on the all-dielectric nonpolarizing beam splitter is reviewed. With the aid of the needle thin-film synthesis method and the conjugate graduate refine method, three different split ratio nonpolarizing parallel-plate beam splitters over a 200 nm spectral range centered at 550 nm with incidence angles of 45 degrees are designed. The chosen materials component and the initial stack are based on the Costich and Thelen theories. The results of design and analysis show that the designs maintain a very low polarization ratio in the working range of the spectrum and has a reasonable angular field.

  16. Evaluation of fault-tolerant parallel-processor architectures over long space missions

    NASA Technical Reports Server (NTRS)

    Johnson, Sally C.

    1989-01-01

    The impact of a five year space mission environment on fault-tolerant parallel processor architectures is examined. The target application is a Strategic Defense Initiative (SDI) satellite requiring 256 parallel processors to provide the computation throughput. The reliability requirements are that the system still be operational after five years with .99 probability and that the probability of system failure during one-half hour of full operation be less than 10(-7). The fault tolerance features an architecture must possess to meet these reliability requirements are presented, many potential architectures are briefly evaluated, and one candidate architecture, the Charles Stark Draper Laboratory's Fault-Tolerant Parallel Processor (FTPP) is evaluated in detail. A methodology for designing a preliminary system configuration to meet the reliability and performance requirements of the mission is then presented and demonstrated by designing an FTPP configuration.

  17. Compact hohlraum configuration with parallel planar-wire-array x-ray sources at the 1.7-MA Zebra generator.

    PubMed

    Kantsyrev, V L; Chuvatin, A S; Rudakov, L I; Velikovich, A L; Shrestha, I K; Esaulov, A A; Safronova, A S; Shlyaptseva, V V; Osborne, G C; Astanovitsky, A L; Weller, M E; Stafford, A; Schultz, K A; Cooper, M C; Cuneo, M E; Jones, B; Vesey, R A

    2014-12-01

    A compact Z-pinch x-ray hohlraum design with parallel-driven x-ray sources is experimentally demonstrated in a configuration with a central target and tailored shine shields at a 1.7-MA Zebra generator. Driving in parallel two magnetically decoupled compact double-planar-wire Z pinches has demonstrated the generation of synchronized x-ray bursts that correlated well in time with x-ray emission from a central reemission target. Good agreement between simulated and measured hohlraum radiation temperature of the central target is shown. The advantages of compact hohlraum design applications for multi-MA facilities are discussed.

  18. Parallel/distributed direct method for solving linear systems

    NASA Technical Reports Server (NTRS)

    Lin, Avi

    1990-01-01

    A new family of parallel schemes for directly solving linear systems is presented and analyzed. It is shown that these schemes exhibit a near optimal performance and enjoy several important features: (1) For large enough linear systems, the design of the appropriate paralleled algorithm is insensitive to the number of processors as its performance grows monotonically with them; (2) It is especially good for large matrices, with dimensions large relative to the number of processors in the system; (3) It can be used in both distributed parallel computing environments and tightly coupled parallel computing systems; and (4) This set of algorithms can be mapped onto any parallel architecture without any major programming difficulties or algorithmical changes.

  19. [Not Available].

    PubMed

    Paturzo, Marco; Colaceci, Sofia; Clari, Marco; Mottola, Antonella; Alvaro, Rosaria; Vellone, Ercole

    2016-01-01

    . Mixed methods designs: an innovative methodological approach for nursing research. The mixed method research designs (MM) combine qualitative and quantitative approaches in the research process, in a single study or series of studies. Their use can provide a wider understanding of multifaceted phenomena. This article presents a general overview of the structure and design of MM to spread this approach in the Italian nursing research community. The MM designs most commonly used in the nursing field are the convergent parallel design, the sequential explanatory design, the exploratory sequential design and the embedded design. For each method a research example is presented. The use of MM can be an added value to improve clinical practices as, through the integration of qualitative and quantitative methods, researchers can better assess complex phenomena typical of nursing.

  20. Design and implementation of highly parallel pipelined VLSI systems

    NASA Astrophysics Data System (ADS)

    Delange, Alphonsus Anthonius Jozef

    A methodology and its realization as a prototype CAD (Computer Aided Design) system for the design and analysis of complex multiprocessor systems is presented. The design is an iterative process in which the behavioral specifications of the system components are refined into structural descriptions consisting of interconnections and lower level components etc. A model for the representation and analysis of multiprocessor systems at several levels of abstraction and an implementation of a CAD system based on this model are described. A high level design language, an object oriented development kit for tool design, a design data management system, and design and analysis tools such as a high level simulator and graphics design interface which are integrated into the prototype system and graphics interface are described. Procedures for the synthesis of semiregular processor arrays, and to compute the switching of input/output signals, memory management and control of processor array, and sequencing and segmentation of input/output data streams due to partitioning and clustering of the processor array during the subsequent synthesis steps, are described. The architecture and control of a parallel system is designed and each component mapped to a module or module generator in a symbolic layout library, compacted for design rules of VLSI (Very Large Scale Integration) technology. An example of the design of a processor that is a useful building block for highly parallel pipelined systems in the signal/image processing domains is given.

  1. Space Station Human Factors Research Review. Volume 1: EVA Research and Development

    NASA Technical Reports Server (NTRS)

    Cohen, Marc M. (Editor); Vykukal, H. C. (Editor)

    1988-01-01

    An overview is presented of extravehicular activity (EVA) research and development activities at Ames. The majority of the program was devoted to presentations by the three contractors working in parallel on the EVA System Phase A Study, focusing on Implications for Man-Systems Design. Overhead visuals are included for a mission results summary, space station EVA requirements and interface accommodations summary, human productivity study cross-task coordination, and advanced EVAS Phase A study implications for man-systems design. Articles are also included on subsea approach to work systems development and advanced EVA system design requirements.

  2. EOS: A project to investigate the design and construction of real-time distributed Embedded Operating Systems

    NASA Technical Reports Server (NTRS)

    Campbell, R. H.; Essick, Ray B.; Johnston, Gary; Kenny, Kevin; Russo, Vince

    1987-01-01

    Project EOS is studying the problems of building adaptable real-time embedded operating systems for the scientific missions of NASA. Choices (A Class Hierarchical Open Interface for Custom Embedded Systems) is an operating system designed and built by Project EOS to address the following specific issues: the software architecture for adaptable embedded parallel operating systems, the achievement of high-performance and real-time operation, the simplification of interprocess communications, the isolation of operating system mechanisms from one another, and the separation of mechanisms from policy decisions. Choices is written in C++ and runs on a ten processor Encore Multimax. The system is intended for use in constructing specialized computer applications and research on advanced operating system features including fault tolerance and parallelism.

  3. Automated Solar Module Assembly Line

    NASA Technical Reports Server (NTRS)

    Bycer, M.

    1979-01-01

    The gathering of information that led to the design approach of the machine, and a summary of the findings in the areas of study along with a description of each station of the machine are discussed. The machine is a cell stringing and string applique machine which is flexible in design, capable of handling a variety of cells and assembling strings of cells which can then be placed in a matrix up to 4 ft x 2 ft. in series or parallel arrangement. The target machine cycle is to be 5 seconds per cell. This machine is primarily adapted to 100 MM round cells with one or two tabs between cells. It places finished strings of up to twelve cells in a matrix of up to six such strings arranged in series or in parallel.

  4. Unique Study Designs in Nephrology: N-of-1 Trials and Other Designs.

    PubMed

    Samuel, Joyce P; Bell, Cynthia S

    2016-11-01

    Alternatives to the traditional parallel-group trial design may be required to answer clinical questions in special populations, rare conditions, or with limited resources. N-of-1 trials are a unique trial design which can inform personalized evidence-based decisions for the patient when data from traditional clinical trials are lacking or not generalizable. A concise overview of factorial design, cluster randomization, adaptive designs, crossover studies, and n-of-1 trials will be provided along with pertinent examples in nephrology. The indication for analysis strategies such as equivalence and noninferiority trials will be discussed, as well as analytic pitfalls. Copyright © 2016 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.

  5. NAS Requirements Checklist for Job Queuing/Scheduling Software

    NASA Technical Reports Server (NTRS)

    Jones, James Patton

    1996-01-01

    The increasing reliability of parallel systems and clusters of computers has resulted in these systems becoming more attractive for true production workloads. Today, the primary obstacle to production use of clusters of computers is the lack of a functional and robust Job Management System for parallel applications. This document provides a checklist of NAS requirements for job queuing and scheduling in order to make most efficient use of parallel systems and clusters for parallel applications. Future requirements are also identified to assist software vendors with design planning.

  6. Parallel Calculation of Sensitivity Derivatives for Aircraft Design using Automatic Differentiation

    NASA Technical Reports Server (NTRS)

    Bischof, c. H.; Green, L. L.; Haigler, K. J.; Knauff, T. L., Jr.

    1994-01-01

    Sensitivity derivative (SD) calculation via automatic differentiation (AD) typical of that required for the aerodynamic design of a transport-type aircraft is considered. Two ways of computing SD via code generated by the ADIFOR automatic differentiation tool are compared for efficiency and applicability to problems involving large numbers of design variables. A vector implementation on a Cray Y-MP computer is compared with a coarse-grained parallel implementation on an IBM SP1 computer, employing a Fortran M wrapper. The SD are computed for a swept transport wing in turbulent, transonic flow; the number of geometric design variables varies from 1 to 60 with coupling between a wing grid generation program and a state-of-the-art, 3-D computational fluid dynamics program, both augmented for derivative computation via AD. For a small number of design variables, the Cray Y-MP implementation is much faster. As the number of design variables grows, however, the IBM SP1 becomes an attractive alternative in terms of compute speed, job turnaround time, and total memory available for solutions with large numbers of design variables. The coarse-grained parallel implementation also can be moved easily to a network of workstations.

  7. WFIRST: Science from the Guest Investigator and Parallel Observation Programs

    NASA Astrophysics Data System (ADS)

    Postman, Marc; Nataf, David; Furlanetto, Steve; Milam, Stephanie; Robertson, Brant; Williams, Ben; Teplitz, Harry; Moustakas, Leonidas; Geha, Marla; Gilbert, Karoline; Dickinson, Mark; Scolnic, Daniel; Ravindranath, Swara; Strolger, Louis; Peek, Joshua; Marc Postman

    2018-01-01

    The Wide Field InfraRed Survey Telescope (WFIRST) mission will provide an extremely rich archival dataset that will enable a broad range of scientific investigations beyond the initial objectives of the proposed key survey programs. The scientific impact of WFIRST will thus be significantly expanded by a robust Guest Investigator (GI) archival research program. We will present examples of GI research opportunities ranging from studies of the properties of a variety of Solar System objects, surveys of the outer Milky Way halo, comprehensive studies of cluster galaxies, to unique and new constraints on the epoch of cosmic re-ionization and the assembly of galaxies in the early universe.WFIRST will also support the acquisition of deep wide-field imaging and slitless spectroscopic data obtained in parallel during campaigns with the coronagraphic instrument (CGI). These parallel wide-field imager (WFI) datasets can provide deep imaging data covering several square degrees at no impact to the scheduling of the CGI program. A competitively selected program of well-designed parallel WFI observation programs will, like the GI science above, maximize the overall scientific impact of WFIRST. We will give two examples of parallel observations that could be conducted during a proposed CGI program centered on a dozen nearby stars.

  8. Propulsion technology for an advanced subsonic transport

    NASA Technical Reports Server (NTRS)

    Beheim, M. A.; Antl, R. J.; Povolny, J. H.

    1972-01-01

    Engine design studies for future subsonic commercial transport aircraft were conducted in parallel with airframe studies. These studies surveyed a broad distribution of design variables, including aircraft configuration, payload, range, and speed, with particular emphasis on reducing noise and exhaust emissions without severe economic and performance penalties. The results indicated that an engine for an advanced transport would be similar to the currently emerging turbofan engines. Application of current technology in the areas of noise suppression and combustors imposed severe performance and economic penalties.

  9. State-Based Curriculum Work and Curriculum-Making: Norway's "Laereplanverket 1997"

    ERIC Educational Resources Information Center

    Sivesind, Kirsten; Westbury, Ian

    2016-01-01

    This case study of the development of the Norwegian compulsory school curriculum of 1997, "Laereplanverket 1997," parallels a study of the development of the "Illinois Learning Standards" of 1997. The pair of case studies is designed to explore the administration of state-based curriculum-making and, in particular, the use in…

  10. Electromagnetic Design of a Magnetically-Coupled Spatial Power Combiner

    NASA Technical Reports Server (NTRS)

    Bulcha, B.; Cataldo, G.; Stevenson, T. R.; U-Yen, K.; Moseley, S. H.; Wollack, E. J.

    2017-01-01

    The design of a two-dimensional beam-combining network employing a parallel-plate superconducting waveguide with a mono-crystalline silicon dielectric is presented. This novel beam-combining network structure employs an array of magnetically coupled antenna elements to achieve high coupling efficiency and full sampling of the intensity distribution while avoiding diffractive losses in the multi-mode region defined by the parallel-plate waveguide. These attributes enable the structures use in realizing compact far-infrared spectrometers for astrophysical and instrumentation applications. When configured with a suitable corporate-feed power-combiner, this fully sampled array can be used to realize a low-sidelobe apodized response without incurring a reduction in coupling efficiency. To control undesired reflections over a wide range of angles in the finite-sized parallel-plate waveguide region, a wideband meta-material electromagnetic absorber structure is implemented. This adiabatic structure absorbs greater than 99 of the power over the 1.7:1 operational band at angles ranging from normal (0 degree) to near parallel (180 degree) incidence. Design, simulations, and application of the device will be presented.

  11. Predicting the stability of a compressible periodic parallel jet flow

    NASA Technical Reports Server (NTRS)

    Miles, Jeffrey H.

    1996-01-01

    It is known that mixing enhancement in compressible free shear layer flows with high convective Mach numbers is difficult. One design strategy to get around this is to use multiple nozzles. Extrapolating this design concept in a one dimensional manner, one arrives at an array of parallel rectangular nozzles where the smaller dimension is omega and the longer dimension, b, is taken to be infinite. In this paper, the feasibility of predicting the stability of this type of compressible periodic parallel jet flow is discussed. The problem is treated using Floquet-Bloch theory. Numerical solutions to this eigenvalue problem are presented. For the case presented, the interjet spacing, s, was selected so that s/omega =2.23. Typical plots of the eigenvalue and stability curves are presented. Results obtained for a range of convective Mach numbers from 3 to 5 show growth rates omega(sub i)=kc(sub i)/2 range from 0.25 to 0.29. These results indicate that coherent two-dimensional structures can occur without difficulty in multiple parallel periodic jet nozzles and that shear layer mixing should occur with this type of nozzle design.

  12. Combining qualitative and quantitative research within mixed method research designs: a methodological review.

    PubMed

    Östlund, Ulrika; Kidd, Lisa; Wengström, Yvonne; Rowa-Dewar, Neneh

    2011-03-01

    It has been argued that mixed methods research can be useful in nursing and health science because of the complexity of the phenomena studied. However, the integration of qualitative and quantitative approaches continues to be one of much debate and there is a need for a rigorous framework for designing and interpreting mixed methods research. This paper explores the analytical approaches (i.e. parallel, concurrent or sequential) used in mixed methods studies within healthcare and exemplifies the use of triangulation as a methodological metaphor for drawing inferences from qualitative and quantitative findings originating from such analyses. This review of the literature used systematic principles in searching CINAHL, Medline and PsycINFO for healthcare research studies which employed a mixed methods approach and were published in the English language between January 1999 and September 2009. In total, 168 studies were included in the results. Most studies originated in the United States of America (USA), the United Kingdom (UK) and Canada. The analytic approach most widely used was parallel data analysis. A number of studies used sequential data analysis; far fewer studies employed concurrent data analysis. Very few of these studies clearly articulated the purpose for using a mixed methods design. The use of the methodological metaphor of triangulation on convergent, complementary, and divergent results from mixed methods studies is exemplified and an example of developing theory from such data is provided. A trend for conducting parallel data analysis on quantitative and qualitative data in mixed methods healthcare research has been identified in the studies included in this review. Using triangulation as a methodological metaphor can facilitate the integration of qualitative and quantitative findings, help researchers to clarify their theoretical propositions and the basis of their results. This can offer a better understanding of the links between theory and empirical findings, challenge theoretical assumptions and develop new theory. Copyright © 2010 Elsevier Ltd. All rights reserved.

  13. A Technological Acceptance of Remote Laboratory in Chemistry Education

    ERIC Educational Resources Information Center

    Ling, Wendy Sing Yii; Lee, Tien Tien; Tho, Siew Wei

    2017-01-01

    The purpose of this study is to evaluate the technological acceptance of Chemistry students, and the opinions of Chemistry lecturers and laboratory assistants towards the use of remote laboratory in Chemistry education. The convergent parallel design mixed method was carried out in this study. The instruments involved were questionnaire and…

  14. Being Outside Learning about Science Is Amazing: A Mixed Methods Study

    ERIC Educational Resources Information Center

    Weibel, Michelle L.

    2011-01-01

    This study used a convergent parallel mixed methods design to examine teachers' environmental attitudes and concerns about an outdoor educational field trip. Converging both quantitative data (Environmental Attitudes Scale and teacher demographics) and qualitative data (Open-Ended Statements of Concern and interviews) facilitated interpretation.…

  15. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    NASA Technical Reports Server (NTRS)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  16. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems.

    PubMed

    Stone, John E; Gohara, David; Shi, Guochun

    2010-05-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.

  17. The design of multi-core DSP parallel model based on message passing and multi-level pipeline

    NASA Astrophysics Data System (ADS)

    Niu, Jingyu; Hu, Jian; He, Wenjing; Meng, Fanrong; Li, Chuanrong

    2017-10-01

    Currently, the design of embedded signal processing system is often based on a specific application, but this idea is not conducive to the rapid development of signal processing technology. In this paper, a parallel processing model architecture based on multi-core DSP platform is designed, and it is mainly suitable for the complex algorithms which are composed of different modules. This model combines the ideas of multi-level pipeline parallelism and message passing, and summarizes the advantages of the mainstream model of multi-core DSP (the Master-Slave model and the Data Flow model), so that it has better performance. This paper uses three-dimensional image generation algorithm to validate the efficiency of the proposed model by comparing with the effectiveness of the Master-Slave and the Data Flow model.

  18. Trade-off results and preliminary designs of Near-Term Hybrid Vehicles

    NASA Technical Reports Server (NTRS)

    Sandberg, J. J.

    1980-01-01

    Phase I of the Near-Term Hybrid Vehicle Program involved the development of preliminary designs of electric/heat engine hybrid passenger vehicles. The preliminary designs were developed on the basis of mission analysis, performance specification, and design trade-off studies conducted independently by four contractors. THe resulting designs involve parallel hybrid (heat engine/electric) propulsion systems with significant variation in component selection, power train layout, and control strategy. Each of the four designs is projected by its developer as having the potential to substitute electrical energy for 40% to 70% of the petroleum fuel consumed annually by its conventional counterpart.

  19. Ti/Al Design/Cost Trade-Off Analysis

    DTIC Science & Technology

    1978-10-01

    evaluate the applV!ati’an of selected titanium aluuinide alloys to both dynamic and static components of aircraft gas turbine engines . Mr. D. 0. Nash...the development of advanced aircraft gas turbine engines , a continuing objective has been to develop lightweight, high-performance designs. A parallel... engines for the design/cost trade-off study are as follows: Dynamic Components "* F1O1 Fourth-Stage Compressor Blade "* JlO1 Low Pressure Turbine Blade

  20. Heavy Duty Diesel Truck and Bus Hybrid Powertrain Study

    DTIC Science & Technology

    2012-03-01

    electric 22 ft. bus that offers greater range than battery-electric buses can provide. Designed to seat 22 passengers plus standees, this Ebus model...system that has both parallel and series operating modes. The relatively low volume of many truck and bus designs has inhibited the development of...that battery packs need to be designed for 50,000 lifetime energy storage cycles in a hybrid transit bus vs. just 3,600 cycles in the typical

  1. Integration experiences and performance studies of A COTS parallel archive systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Hsing-bung; Scott, Cody; Grider, Bary

    2010-01-01

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf(COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and lessmore » robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of future archival storage systems.« less

  2. Integration experiments and performance studies of a COTS parallel archive system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Hsing-bung; Scott, Cody; Grider, Gary

    2010-06-16

    Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching andmore » less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, Is, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petafiop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address requirements of future archival storage systems.« less

  3. Efficacy and safety of sacubitril/valsartan (LCZ696) in Japanese patients with chronic heart failure and reduced ejection fraction: Rationale for and design of the randomized, double-blind PARALLEL-HF study.

    PubMed

    Tsutsui, Hiroyuki; Momomura, Shinichi; Saito, Yoshihiko; Ito, Hiroshi; Yamamoto, Kazuhiro; Ohishi, Tomomi; Okino, Naoko; Guo, Weinong

    2017-09-01

    The prognosis of heart failure patients with reduced ejection fraction (HFrEF) in Japan remains poor, although there is growing evidence for increasing use of evidence-based pharmacotherapies in Japanese real-world HF registries. Sacubitril/valsartan (LCZ696) is a first-in-class angiotensin receptor neprilysin inhibitor shown to reduce mortality and morbidity in the recently completed largest outcome trial in patients with HFrEF (PARADIGM-HF trial). The prospectively designed phase III PARALLEL-HF (Prospective comparison of ARNI with ACE inhibitor to determine the noveL beneficiaL trEatment vaLue in Japanese Heart Failure patients) study aims to assess the clinical efficacy and safety of LCZ696 in Japanese HFrEF patients, and show similar improvements in clinical outcomes as the PARADIGM-HF study enabling the registration of LCZ696 in Japan. This is a multicenter, randomized, double-blind, parallel-group, active controlled study of 220 Japanese HFrEF patients. Eligibility criteria include a diagnosis of chronic HF (New York Heart Association Class II-IV) and reduced ejection fraction (left ventricular ejection fraction ≤35%) and increased plasma concentrations of natriuretic peptides [N-terminal pro B-type natriuretic peptide (NT-proBNP) ≥600pg/mL, or NT-proBNP ≥400pg/mL for those who had a hospitalization for HF within the last 12 months] at the screening visit. The study consists of three phases: (i) screening, (ii) single-blind active LCZ696 run-in, and (iii) double-blind randomized treatment. Patients tolerating LCZ696 50mg bid during the treatment run-in are randomized (1:1) to receive LCZ696 100mg bid or enalapril 5mg bid for 4 weeks followed by up-titration to target doses of LCZ696 200mg bid or enalapril 10mg bid in a double-blind manner. The primary outcome is the composite of cardiovascular death or HF hospitalization and the study is an event-driven trial. The design of the PARALLEL-HF study is aligned with the PARADIGM-HF study and aims to assess the efficacy and safety of LCZ696 in Japanese HFrEF patients. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  4. Post retention and post/core shear bond strength of four post systems.

    PubMed

    Stockton, L W; Williams, P T; Clarke, C T

    2000-01-01

    As clinicians we continue to search for a post system which will give us maximum retention while maximizing resistance to root fracture. The introduction of several new post systems, with claims of high retentive and resistance to root fracture values, require that independent studies be performed to evaluate these claims. This study tested the tensile and shear dislodgment forces of four post designs that were luted into roots 10 mm apical of the CEJ. The Para Post Plus (P1) is a parallel-sided, passive design; the Para Post XT (P2) is a combination active/passive design; the Flexi-Post (F1) and the Flexi-Flange (F2) are active post designs. All systems tested were stainless steel. This study compared the test results of the four post designs for tensile and shear dislodgment. All mounted samples were loaded in tension until failure occurred. The tensile load was applied parallel to the long axis of the root, while the shear load was applied at 450 to the long axis of the root. The Flexi-Post (F1) was significantly different from the other three in the tensile test, however, the Para Post XT (P2) was significantly different to the other three in the shear test and had a better probability for survival in the Kaplan-Meier survival function test. Based on the results of this study, our recommendation is for the Para Post XT (P2).

  5. SCORPIO: A Scalable Two-Phase Parallel I/O Library With Application To A Large Scale Subsurface Simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; Sripathi, Vamsi; Mills, Richard T

    2013-01-01

    Inefficient parallel I/O is known to be a major bottleneck among scientific applications employed on supercomputers as the number of processor cores grows into the thousands. Our prior experience indicated that parallel I/O libraries such as HDF5 that rely on MPI-IO do not scale well beyond 10K processor cores, especially on parallel file systems (like Lustre) with single point of resource contention. Our previous optimization efforts for a massively parallel multi-phase and multi-component subsurface simulator (PFLOTRAN) led to a two-phase I/O approach at the application level where a set of designated processes participate in the I/O process by splitting themore » I/O operation into a communication phase and a disk I/O phase. The designated I/O processes are created by splitting the MPI global communicator into multiple sub-communicators. The root process in each sub-communicator is responsible for performing the I/O operations for the entire group and then distributing the data to rest of the group. This approach resulted in over 25X speedup in HDF I/O read performance and 3X speedup in write performance for PFLOTRAN at over 100K processor cores on the ORNL Jaguar supercomputer. This research describes the design and development of a general purpose parallel I/O library, SCORPIO (SCalable block-ORiented Parallel I/O) that incorporates our optimized two-phase I/O approach. The library provides a simplified higher level abstraction to the user, sitting atop existing parallel I/O libraries (such as HDF5) and implements optimized I/O access patterns that can scale on larger number of processors. Performance results with standard benchmark problems and PFLOTRAN indicate that our library is able to maintain the same speedups as before with the added flexibility of being applicable to a wider range of I/O intensive applications.« less

  6. EDIN design study alternate space shuttle booster replacement concepts. Volume 1: Engineering analysis

    NASA Technical Reports Server (NTRS)

    Demakes, P. T.; Hirsch, G. N.; Stewart, W. A.; Glatt, C. R.

    1976-01-01

    The use of a recoverable liquid rocket booster (LRB) system to replace the existing solid rocket booster (SRB) system for the shuttle was studied. Historical weight estimating relationships were developed for the LRB using Saturn technology and modified as required. Mission performance was computed using February 1975 shuttle configuration groundrules to allow reasonable comparison of the existing shuttle with the study designs. The launch trajectory was constrained to pass through both the RTLS/AOA and main engine cut off points of the shuttle reference mission 1. Performance analysis is based on a point design trajectory model which optimizes initial tilt rate and exoatmospheric pitch profile. A gravity turn was employed during the boost phase in place of the shuttle angle of attack profile. Engine throttling add/or shutdown was used to constrain dynamic pressure and/or longitudinal acceleration where necessary. Four basic configurations were investigated: a parallel burn vehicle with an F-1 engine powered LRB; a parallel burn vehicle with a high pressure engine powered LRB; a series burn vehicle with a high pressure engine powered LRB. The relative sizes of the LRB and the ET are optimized to minimize GLOW in most cases.

  7. Simulated parallel annealing within a neighborhood for optimization of biomechanical systems.

    PubMed

    Higginson, J S; Neptune, R R; Anderson, F C

    2005-09-01

    Optimization problems for biomechanical systems have become extremely complex. Simulated annealing (SA) algorithms have performed well in a variety of test problems and biomechanical applications; however, despite advances in computer speed, convergence to optimal solutions for systems of even moderate complexity has remained prohibitive. The objective of this study was to develop a portable parallel version of a SA algorithm for solving optimization problems in biomechanics. The algorithm for simulated parallel annealing within a neighborhood (SPAN) was designed to minimize interprocessor communication time and closely retain the heuristics of the serial SA algorithm. The computational speed of the SPAN algorithm scaled linearly with the number of processors on different computer platforms for a simple quadratic test problem and for a more complex forward dynamic simulation of human pedaling.

  8. Exploiting parallel computing with limited program changes using a network of microcomputers

    NASA Technical Reports Server (NTRS)

    Rogers, J. L., Jr.; Sobieszczanski-Sobieski, J.

    1985-01-01

    Network computing and multiprocessor computers are two discernible trends in parallel processing. The computational behavior of an iterative distributed process in which some subtasks are completed later than others because of an imbalance in computational requirements is of significant interest. The effects of asynchronus processing was studied. A small existing program was converted to perform finite element analysis by distributing substructure analysis over a network of four Apple IIe microcomputers connected to a shared disk, simulating a parallel computer. The substructure analysis uses an iterative, fully stressed, structural resizing procedure. A framework of beams divided into three substructures is used as the finite element model. The effects of asynchronous processing on the convergence of the design variables are determined by not resizing particular substructures on various iterations.

  9. Symposium on Parallel Computational Methods for Large-scale Structural Analysis and Design, 2nd, Norfolk, VA, US

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O. (Editor); Housner, Jerrold M. (Editor)

    1993-01-01

    Computing speed is leaping forward by several orders of magnitude each decade. Engineers and scientists gathered at a NASA Langley symposium to discuss these exciting trends as they apply to parallel computational methods for large-scale structural analysis and design. Among the topics discussed were: large-scale static analysis; dynamic, transient, and thermal analysis; domain decomposition (substructuring); and nonlinear and numerical methods.

  10. Program For Parallel Discrete-Event Simulation

    NASA Technical Reports Server (NTRS)

    Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.

    1991-01-01

    User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.

  11. Joint Experiment on Scalable Parallel Processors (JESPP) Parallel Data Management

    DTIC Science & Technology

    2006-05-01

    management and analysis tool, called Simulation Data Grid ( SDG ). The design principles driving the design of SDG are: 1) minimize network communication...or SDG . In this report, an initial prototype implementation of this system is described. This project follows on earlier research, primarily...distributed logging system had some 2 limitations. These limitations will be described in this report, and how the SDG addresses these limitations. 3.0

  12. Design study of wind turbines 50 kW to 3000 kW for electric utility applications. Volume 3: Supplementary design and analysis tasks

    NASA Technical Reports Server (NTRS)

    1976-01-01

    Additional design and analysis data are provided to supplement the results of the two parallel design study efforts. The key results of the three supplemental tasks investigated are: (1) The velocity duration profile has a significant effect in determining the optimum wind turbine design parameters and the energy generation cost. (2) Modest increases in capacity factor can be achieved with small increases in energy generation costs and capital costs. (3) Reinforced concrete towers that are esthetically attractive can be designed and built at a cost comparable to those for steel truss towers. The approach used, method of analysis, assumptions made, design requirements, and the results for each task are discussed in detail.

  13. Grid-Enabled Quantitative Analysis of Breast Cancer

    DTIC Science & Technology

    2010-10-01

    large-scale, multi-modality computerized image analysis . The central hypothesis of this research is that large-scale image analysis for breast cancer...research, we designed a pilot study utilizing large scale parallel Grid computing harnessing nationwide infrastructure for medical image analysis . Also

  14. Does Reimportation Reduce Price Differences for Prescription Drugs? Lessons from the European Union

    PubMed Central

    Kyle, Margaret K; Allsbrook, Jennifer S; Schulman, Kevin A

    2008-01-01

    Objective To examine the effect of parallel trade on patterns of price dispersion for prescription drugs in the European Union. Data Sources Longitudinal data from an IMS Midas database of prices and units sold for drugs in 36 categories in 30 countries from 1993 through 2004. Study Design The main outcome measures were mean price differentials and other measures of price dispersion within European Union countries compared with within non-European Union countries. Data Collection/Extraction Methods We identified drugs subject to parallel trade using information provided by IMS and by checking membership lists of parallel import trade associations and lists of approved parallel imports. Principal Findings Parallel trade was not associated with substantial reductions in price dispersion in European Union countries. In descriptive and regression analyses, about half of the price differentials exceeded 50 percent in both European Union and non-European Union countries over time, and price distributions among European Union countries did not show a dramatic change concurrent with the adoption of parallel trade. In regression analysis, we found that although price differentials decreased after 1995 in most countries, they decreased less in the European Union than elsewhere. Conclusions Parallel trade for prescription drugs does not automatically reduce international price differences. Future research should explore how other regulatory schemes might lead to different results elsewhere. PMID:18355258

  15. Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform.

    PubMed

    Cao, Jianfang; Chen, Lichao; Wang, Min; Tian, Yun

    2018-01-01

    The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach system speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the obvious superiority of our method. The proposed algorithm in this study demonstrates both better edge detection performance and improved time performance.

  16. Modeling and Dynamic Analysis of Paralleled dc/dc Converters With Master-Slave Current Sharing Control

    NASA Technical Reports Server (NTRS)

    Rajagopalan, J.; Xing, K.; Guo, Y.; Lee, F. C.; Manners, Bruce

    1996-01-01

    A simple, application-oriented, transfer function model of paralleled converters employing Master-Slave Current-sharing (MSC) control is developed. Dynamically, the Master converter retains its original design characteristics; all the Slave converters are forced to depart significantly from their original design characteristics into current-controlled current sources. Five distinct loop gains to assess system stability and performance are identified and their physical significance is described. A design methodology for the current share compensator is presented. The effect of this current sharing scheme on 'system output impedance' is analyzed.

  17. Dose finding with the sequential parallel comparison design.

    PubMed

    Wang, Jessie J; Ivanova, Anastasia

    2014-01-01

    The sequential parallel comparison design (SPCD) is a two-stage design recommended for trials with possibly high placebo response. A drug-placebo comparison in the first stage is followed in the second stage by placebo nonresponders being re-randomized between drug and placebo. We describe how SPCD can be used in trials where multiple doses of a drug or multiple treatments are compared with placebo and present two adaptive approaches. We detail how to analyze data in such trials and give recommendations about the allocation proportion to placebo in the two stages of SPCD.

  18. An Evaluation of the Effectiveness of the Remediation Plus Program on Improving Reading Achievement of Students in the Marinette (WI) School District

    ERIC Educational Resources Information Center

    Corcoran, Roisin P.; Ross, Steven M.

    2015-01-01

    The study was implemented in the Title I Marinette School District using a randomized experimental design and parallel quasi experimental design spanning three grades 1-3 in 3 district elementary schools. The Remediation Plus Intervention is a multi-sensory, systematic synthetic phonics curriculum for all ages of students who struggle with…

  19. Design of a 6-DOF upper limb rehabilitation exoskeleton with parallel actuated joints.

    PubMed

    Chen, Yanyan; Li, Ge; Zhu, Yanhe; Zhao, Jie; Cai, Hegao

    2014-01-01

    In this paper, a 6-DOF wearable upper limb exoskeleton with parallel actuated joints which perfectly mimics human motions is proposed. The upper limb exoskeleton assists the movement of physically weak people. Compared with the existing upper limb exoskeletons which are mostly designed using a serial structure with large movement space but small stiffness and poor wearable ability, a prototype for motion assistance based on human anatomy structure has been developed in our design. Moreover, the design adopts balls instead of bearings to save space, which simplifies the structure and reduces the cost of the mechanism. The proposed design also employs deceleration processes to ensure that the transmission ratio of each joint is coincident.

  20. Analysis and Design of ITER 1 MV Core Snubber

    NASA Astrophysics Data System (ADS)

    Wang, Haitian; Li, Ge

    2012-11-01

    The core snubber, as a passive protection device, can suppress arc current and absorb stored energy in stray capacitance during the electrical breakdown in accelerating electrodes of ITER NBI. In order to design the core snubber of ITER, the control parameters of the arc peak current have been firstly analyzed by the Fink-Baker-Owren (FBO) method, which are used for designing the DIIID 100 kV snubber. The B-H curve can be derived from the measured voltage and current waveforms, and the hysteresis loss of the core snubber can be derived using the revised parallelogram method. The core snubber can be a simplified representation as an equivalent parallel resistance and inductance, which has been neglected by the FBO method. A simulation code including the parallel equivalent resistance and inductance has been set up. The simulation and experiments result in dramatically large arc shorting currents due to the parallel inductance effect. The case shows that the core snubber utilizing the FBO method gives more compact design.

  1. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems

    PubMed Central

    Stone, John E.; Gohara, David; Shi, Guochun

    2010-01-01

    We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures. PMID:21037981

  2. Matching pursuit parallel decomposition of seismic data

    NASA Astrophysics Data System (ADS)

    Li, Chuanhui; Zhang, Fanchang

    2017-07-01

    In order to improve the computation speed of matching pursuit decomposition of seismic data, a matching pursuit parallel algorithm is designed in this paper. We pick a fixed number of envelope peaks from the current signal in every iteration according to the number of compute nodes and assign them to the compute nodes on average to search the optimal Morlet wavelets in parallel. With the help of parallel computer systems and Message Passing Interface, the parallel algorithm gives full play to the advantages of parallel computing to significantly improve the computation speed of the matching pursuit decomposition and also has good expandability. Besides, searching only one optimal Morlet wavelet by every compute node in every iteration is the most efficient implementation.

  3. The Parallel System for Integrating Impact Models and Sectors (pSIMS)

    NASA Technical Reports Server (NTRS)

    Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian

    2014-01-01

    We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.

  4. First experiments probing the collision of parallel magnetic fields using laser-produced plasmas

    DOE PAGES

    Rosenberg, M. J.; Li, C. K.; Fox, W.; ...

    2015-04-08

    Novel experiments to study the strongly-driven collision of parallel magnetic fields in β~10, laser-produced plasmas have been conducted using monoenergetic proton radiography. These experiments were designed to probe the process of magnetic flux pileup, which has been identified in prior laser-plasma experiments as a key physical mechanism in the reconnection of anti-parallel magnetic fields when the reconnection inflow is dominated by strong plasma flows. In the present experiments using colliding plasmas carrying parallel magnetic fields, the magnetic flux is found to be conserved and slightly compressed in the collision region. Two-dimensional (2D) particle-in-cell (PIC) simulations predict a stronger flux compressionmore » and amplification of the magnetic field strength, and this discrepancy is attributed to the three-dimensional (3D) collision geometry. Future experiments may drive a stronger collision and further explore flux pileup in the context of the strongly-driven interaction of magnetic fields.« less

  5. Sequential Feedback Scheme Outperforms the Parallel Scheme for Hamiltonian Parameter Estimation.

    PubMed

    Yuan, Haidong

    2016-10-14

    Measurement and estimation of parameters are essential for science and engineering, where the main quest is to find the highest achievable precision with the given resources and design schemes to attain it. Two schemes, the sequential feedback scheme and the parallel scheme, are usually studied in the quantum parameter estimation. While the sequential feedback scheme represents the most general scheme, it remains unknown whether it can outperform the parallel scheme for any quantum estimation tasks. In this Letter, we show that the sequential feedback scheme has a threefold improvement over the parallel scheme for Hamiltonian parameter estimations on two-dimensional systems, and an order of O(d+1) improvement for Hamiltonian parameter estimation on d-dimensional systems. We also show that, contrary to the conventional belief, it is possible to simultaneously achieve the highest precision for estimating all three components of a magnetic field, which sets a benchmark on the local precision limit for the estimation of a magnetic field.

  6. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Earth Sciences Division; Zhang, Keni; Zhang, Keni

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator ismore » to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code, The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.« less

  7. A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nomura, K; Seymour, R; Wang, W

    2009-02-17

    A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based onmore » hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e. petaflops {center_dot} day of computing) is estimated as NT = 2.14 (e.g. N = 2.14 million atoms for T = 1 microseconds).« less

  8. ALS turbomachinery technology

    NASA Technical Reports Server (NTRS)

    Csomor, A.; Faulkner, C.; Ferlita, F.

    1990-01-01

    Advanced Development Programs are being pursued by Rocketdyne, Aerojet, and Pratt and Whitney to define and validate design approaches toward producing low-cost, reliable liquid-hydrogen and liquid-oxygen turbopumps for a 2580 kN (580 klb) thrust Advanced Launch System. The generic approach, which is evolving after 18 months of trade studies and conceptual and preliminary design efforts, is explained. In addition, the preliminary liquid-hydrogen turbopump designs produced in parallel tasks by Rocketdyne and Aerojet and the liquid-oxygen turbopump design produced by Pratt and Whitney are described, and technology features and issues are discussed.

  9. Experimental characterization of a binary actuated parallel manipulator

    NASA Astrophysics Data System (ADS)

    Giuseppe, Carbone

    2016-05-01

    This paper describes the BAPAMAN (Binary Actuated Parallel MANipulator) series of parallel manipulators that has been conceived at Laboratory of Robotics and Mechatronics (LARM). Basic common characteristics of BAPAMAN series are described. In particular, it is outlined the use of a reduced number of active degrees of freedom, the use of design solutions with flexural joints and Shape Memory Alloy (SMA) actuators for achieving miniaturization, cost reduction and easy operation features. Given the peculiarities of BAPAMAN architecture, specific experimental tests have been proposed and carried out with the aim to validate the proposed design and to evaluate the practical operation performance and the characteristics of a built prototype, in particular, in terms of operation and workspace characteristics.

  10. A Novel Design of 4-Class BCI Using Two Binary Classifiers and Parallel Mental Tasks

    PubMed Central

    Geng, Tao; Gan, John Q.; Dyson, Matthew; Tsui, Chun SL; Sepulveda, Francisco

    2008-01-01

    A novel 4-class single-trial brain computer interface (BCI) based on two (rather than four or more) binary linear discriminant analysis (LDA) classifiers is proposed, which is called a “parallel BCI.” Unlike other BCIs where mental tasks are executed and classified in a serial way one after another, the parallel BCI uses properly designed parallel mental tasks that are executed on both sides of the subject body simultaneously, which is the main novelty of the BCI paradigm used in our experiments. Each of the two binary classifiers only classifies the mental tasks executed on one side of the subject body, and the results of the two binary classifiers are combined to give the result of the 4-class BCI. Data was recorded in experiments with both real movement and motor imagery in 3 able-bodied subjects. Artifacts were not detected or removed. Offline analysis has shown that, in some subjects, the parallel BCI can generate a higher accuracy than a conventional 4-class BCI, although both of them have used the same feature selection and classification algorithms. PMID:18584040

  11. The BLAZE language: A parallel language for scientific programming

    NASA Technical Reports Server (NTRS)

    Mehrotra, P.; Vanrosendale, J.

    1985-01-01

    A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with onceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described and shows how this language would be used in typical scientific programming.

  12. A scalable parallel black oil simulator on distributed memory parallel computers

    NASA Astrophysics Data System (ADS)

    Wang, Kun; Liu, Hui; Chen, Zhangxin

    2015-11-01

    This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.

  13. Innovative design of composite structures: Further studies in the use of a curvilinear fiber format to improve structural efficiency

    NASA Technical Reports Server (NTRS)

    Hyer, Michael W.; Charette, Robert F.

    1988-01-01

    Further studies to determine the potential for using a curvilinear fiber format in the design of composite laminates are reported. The curvilinear format is in contrast to the current practice of having the fibers aligned parallel to each other and in a straight line. The problem of a plate with a central circular hole is used as a candidate problem for this study. The study concludes that for inplane tensile loading the curvilinear format is superior. The limited results to date on compression buckling loads indicate that the curvilinear designs are poorer in resistant buckling. However, for the curvilinear design of interest, the reduction in buckling load is minimal and so overall there is a gain in considering the curvilinear design.

  14. Optimization of sparse matrix-vector multiplication on emerging multicore platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Samuel; Oliker, Leonid; Vuduc, Richard

    2007-01-01

    We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less

  15. On the Use of CAD and Cartesian Methods for Aerodynamic Optimization

    NASA Technical Reports Server (NTRS)

    Nemec, M.; Aftosmis, M. J.; Pulliam, T. H.

    2004-01-01

    The objective for this paper is to present the development of an optimization capability for Curt3D, a Cartesian inviscid-flow analysis package. We present the construction of a new optimization framework and we focus on the following issues: 1) Component-based geometry parameterization approach using parametric-CAD models and CAPRI. A novel geometry server is introduced that addresses the issue of parallel efficiency while only sparingly consuming CAD resources; 2) The use of genetic and gradient-based algorithms for three-dimensional aerodynamic design problems. The influence of noise on the optimization methods is studied. Our goal is to create a responsive and automated framework that efficiently identifies design modifications that result in substantial performance improvements. In addition, we examine the architectural issues associated with the deployment of a CAD-based approach in a heterogeneous parallel computing environment that contains both CAD workstations and dedicated compute engines. We demonstrate the effectiveness of the framework for a design problem that features topology changes and complex geometry.

  16. Photonic content-addressable memory system that uses a parallel-readout optical disk

    NASA Astrophysics Data System (ADS)

    Krishnamoorthy, Ashok V.; Marchand, Philippe J.; Yayla, Gökçe; Esener, Sadik C.

    1995-11-01

    We describe a high-performance associative-memory system that can be implemented by means of an optical disk modified for parallel readout and a custom-designed silicon integrated circuit with parallel optical input. The system can achieve associative recall on 128 \\times 128 bit images and also on variable-size subimages. The system's behavior and performance are evaluated on the basis of experimental results on a motionless-head parallel-readout optical-disk system, logic simulations of the very-large-scale integrated chip, and a software emulation of the overall system.

  17. RAMA: A file system for massively parallel computers

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.; Katz, Randy H.

    1993-01-01

    This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated in lo the file system; in fact, RAMA runs most efficiently when tertiary storage is used.

  18. The Relationship between Attention Levels and Class Participation of First-Year Students in Classroom Teaching Departments

    ERIC Educational Resources Information Center

    Sezer, Adem; Inel, Yusuf; Seçkin, Ahmet Çagdas; Uluçinar, Ufuk

    2017-01-01

    This study aimed to detect any relationship that may exist between classroom teacher candidates' class participation and their attention levels. The research method was a convergent parallel design, mixing quantitative and qualitative research techniques, and the study group was composed of 21 freshmen studying in the Classroom Teaching Department…

  19. Supervised Home Training of Dialogue Skills in Chronic Aphasia: A Randomized Parallel Group Study

    ERIC Educational Resources Information Center

    Nobis-Bosch, Ruth; Springer, Luise; Radermacher, Irmgard; Huber, Walter

    2011-01-01

    Purpose: The aim of this study was to prove the efficacy of supervised self-training for individuals with aphasia. Linguistic and communicative performance in structured dialogues represented the main study parameters. Method: In a cross-over design for randomized matched pairs, 18 individuals with chronic aphasia were examined during 12 weeks of…

  20. A Parallel Implementation of Multilevel Recursive Spectral Bisection for Application to Adaptive Unstructured Meshes. Chapter 1

    NASA Technical Reports Server (NTRS)

    Barnard, Stephen T.; Simon, Horst; Lasinski, T. A. (Technical Monitor)

    1994-01-01

    The design of a parallel implementation of multilevel recursive spectral bisection is described. The goal is to implement a code that is fast enough to enable dynamic repartitioning of adaptive meshes.

  1. 100 Gbps Wireless System and Circuit Design Using Parallel Spread-Spectrum Sequencing

    NASA Astrophysics Data System (ADS)

    Scheytt, J. Christoph; Javed, Abdul Rehman; Bammidi, Eswara Rao; KrishneGowda, Karthik; Kallfass, Ingmar; Kraemer, Rolf

    2017-09-01

    In this article mixed analog/digital signal processing techniques based on parallel spread-spectrum sequencing (PSSS) and radio frequency (RF) carrier synchronization for ultra-broadband wireless communication are investigated on system and circuit level.

  2. RTNN: The New Parallel Machine in Zaragoza

    NASA Astrophysics Data System (ADS)

    Sijs, A. J. V. D.

    I report on the development of RTNN, a parallel computer designed as a 4^4 hypercube of 256 T9000 transputer nodes, each with 8 MB memory. The peak performance of the machine is expected to be 2.5 Gflops.

  3. Rapid-Scanning Fourier Transform Spectrometer for Studies of Propagation of Near-Millimeter-Wave Radiation through Clear Air and Fog.

    DTIC Science & Technology

    1988-03-01

    parallel in the output beam . ’ , However, as will be seen, this function can be performed by auxiliary, non -moving mirrors. Our . design for a rapid... splitter used in our design is shown in Fig. 2. The mirror drive is somewhat novel for this type of interferometer in that one mirror in each beam . M3...features: * High interferometric efficiency, due to the Martin-Puplett type design 0 Ruggedness in photolithographically produced beam splitters

  4. Multivariable speed synchronisation for a parallel hybrid electric vehicle drivetrain

    NASA Astrophysics Data System (ADS)

    Alt, B.; Antritter, F.; Svaricek, F.; Schultalbers, M.

    2013-03-01

    In this article, a new drivetrain configuration of a parallel hybrid electric vehicle is considered and a novel model-based control design strategy is given. In particular, the control design covers the speed synchronisation task during a restart of the internal combustion engine. The proposed multivariable synchronisation strategy is based on feedforward and decoupled feedback controllers. The performance and the robustness properties of the closed-loop system are illustrated by nonlinear simulation results.

  5. Automatic differentiation for design sensitivity analysis of structural systems using multiple processors

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.; Storaasli, Olaf O.; Qin, Jiangning; Qamar, Ramzi

    1994-01-01

    An automatic differentiation tool (ADIFOR) is incorporated into a finite element based structural analysis program for shape and non-shape design sensitivity analysis of structural systems. The entire analysis and sensitivity procedures are parallelized and vectorized for high performance computation. Small scale examples to verify the accuracy of the proposed program and a medium scale example to demonstrate the parallel vector performance on multiple CRAY C90 processors are included.

  6. A method for real-time implementation of HOG feature extraction

    NASA Astrophysics Data System (ADS)

    Luo, Hai-bo; Yu, Xin-rong; Liu, Hong-mei; Ding, Qing-hai

    2011-08-01

    Histogram of oriented gradient (HOG) is an efficient feature extraction scheme, and HOG descriptors are feature descriptors which is widely used in computer vision and image processing for the purpose of biometrics, target tracking, automatic target detection(ATD) and automatic target recognition(ATR) etc. However, computation of HOG feature extraction is unsuitable for hardware implementation since it includes complicated operations. In this paper, the optimal design method and theory frame for real-time HOG feature extraction based on FPGA were proposed. The main principle is as follows: firstly, the parallel gradient computing unit circuit based on parallel pipeline structure was designed. Secondly, the calculation of arctangent and square root operation was simplified. Finally, a histogram generator based on parallel pipeline structure was designed to calculate the histogram of each sub-region. Experimental results showed that the HOG extraction can be implemented in a pixel period by these computing units.

  7. The Effects of Argumentation Implementation on Environmental Education Self Efficacy Beliefs and Perspectives According to Environmental Problems

    ERIC Educational Resources Information Center

    Fettahlioglu, Pinar

    2018-01-01

    The purpose of this study is to investigate the effect of argumentation implementation applied in the environmental science course on science teacher candidates' environmental education self-efficacy beliefs and perspectives according to environmental problems. In this mixed method research study, convergent parallel design was utilized.…

  8. Evaluation of Turkish and Mathematics Curricula According to Value-Based Evaluation Model

    ERIC Educational Resources Information Center

    Duman, Serap Nur; Akbas, Oktay

    2017-01-01

    This study evaluated secondary school seventh-grade Turkish and mathematics programs using the Context-Input-Process-Product Evaluation Model based on student, teacher, and inspector views. The convergent parallel mixed method design was used in the study. Student values were identified using the scales for socio-level identification, traditional…

  9. The Effects of Using Learning Objects in Two Different Settings

    ERIC Educational Resources Information Center

    Cakiroglu, Unal; Baki, Adnan; Akkan, Yasar

    2012-01-01

    The study compared the effects of Learning Objects (LOs) within different applications; in classroom and in extracurricular activities. So in this study, firstly a Learning Object Repository (LOR) has been designed in parallel with 9th grade school mathematics curriculum. One of the two treatment groups was named as "classroom group" (n…

  10. Prospective Elementary School Teachers' Views about Socioscientific Issues: A Concurrent Parallel Design Study

    ERIC Educational Resources Information Center

    Özden, Muhammet

    2015-01-01

    The purpose of this research is to examine the prospective elementary school teachers' perceptions on socioscientific issues. The research was conducted on prospective elementary school teachers studying at a university located in western Turkey. The researcher first taught the subjects of global warming and nuclear power plants from a perspective…

  11. Humor Climate of the Primary Schools

    ERIC Educational Resources Information Center

    Sahin, Ahmet

    2018-01-01

    The aim of this study is to determine the opinions primary school administrators and teachers on humor climates in primary schools. The study was modeled as a convergent parallel design, one of the mixed methods. The data gathered from 253 administrator questionnaires, and 651 teacher questionnaires was evaluated for the quantitative part of the…

  12. Instructional Coaching in a Small District: A Mixed Methods Study of Teachers' Concerns

    ERIC Educational Resources Information Center

    Mayfield, Melissa J.

    2016-01-01

    This study utilized a convergent parallel mixed methods design to study teachers' concerns during implementation of instructional coaching for math in a rural PK-12 district in north Texas over a three-year time period. Five campuses were included in the study: one high school (grades 9-12), one middle school (grades 6-8), and three elementary…

  13. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.

  14. 14 CFR 25.393 - Loads parallel to hinge line.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... designed for inertia loads acting parallel to the hinge line. (b) In the absence of more rational data, the inertia loads may be assumed to be equal to KW, where— (1) K=24 for vertical surfaces; (2) K=12 for...

  15. 14 CFR 25.393 - Loads parallel to hinge line.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... designed for inertia loads acting parallel to the hinge line. (b) In the absence of more rational data, the inertia loads may be assumed to be equal to KW, where— (1) K=24 for vertical surfaces; (2) K=12 for...

  16. 14 CFR 25.393 - Loads parallel to hinge line.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... designed for inertia loads acting parallel to the hinge line. (b) In the absence of more rational data, the inertia loads may be assumed to be equal to KW, where— (1) K=24 for vertical surfaces; (2) K=12 for...

  17. 14 CFR 25.393 - Loads parallel to hinge line.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... designed for inertia loads acting parallel to the hinge line. (b) In the absence of more rational data, the inertia loads may be assumed to be equal to KW, where— (1) K=24 for vertical surfaces; (2) K=12 for...

  18. 14 CFR 25.393 - Loads parallel to hinge line.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... designed for inertia loads acting parallel to the hinge line. (b) In the absence of more rational data, the inertia loads may be assumed to be equal to KW, where— (1) K=24 for vertical surfaces; (2) K=12 for...

  19. A METHOD FOR IN-SITU CHARACTERIZATION OF RF HEATING IN PARALLEL TRANSMIT MRI

    PubMed Central

    Alon, Leeor; Deniz, Cem Murat; Brown, Ryan; Sodickson, Daniel K.; Zhu, Yudong

    2012-01-01

    In ultra high field magnetic resonance imaging, parallel radio-frequency (RF) transmission presents both opportunities and challenges for specific absorption rate (SAR) management. On one hand, parallel transmission provides flexibility in tailoring electric fields in the body while facilitating magnetization profile control. On the other hand, it increases the complexity of energy deposition as well as possibly exacerbating local SAR by improper design or delivery of RF pulses. This study shows that the information needed to characterize RF heating in parallel transmission is contained within a local power correlation matrix. Building upon a calibration scheme involving a finite number of magnetic resonance thermometry measurements, the present work establishes a way of estimating the local power correlation matrix. Determination of this matrix allows prediction of temperature change for an arbitrary parallel transmit RF pulse. In the case of a three transmit coil MR experiment in a phantom, determination and validation of the power correlation matrix was conducted in less than 200 minutes with induced temperature changes of <4 degrees C. Further optimization and adaptation are possible, and simulations evaluating potential feasibility for in vivo use are presented. The method allows general characteristics indicative of RF coil/pulse safety determined in situ. PMID:22714806

  20. Heterodyne frequency-domain multispectral diffuse optical tomography of breast cancer in the parallel-plane transmission geometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ban, H. Y.; Kavuri, V. C., E-mail: venk@physics.up

    Purpose: The authors introduce a state-of-the-art all-optical clinical diffuse optical tomography (DOT) imaging instrument which collects spatially dense, multispectral, frequency-domain breast data in the parallel-plate geometry. Methods: The instrument utilizes a CCD-based heterodyne detection scheme that permits massively parallel detection of diffuse photon density wave amplitude and phase for a large number of source–detector pairs (10{sup 6}). The stand-alone clinical DOT instrument thus offers high spatial resolution with reduced crosstalk between absorption and scattering. Other novel features include a fringe profilometry system for breast boundary segmentation, real-time data normalization, and a patient bed design which permits both axial and sagittalmore » breast measurements. Results: The authors validated the instrument using tissue simulating phantoms with two different chromophore-containing targets and one scattering target. The authors also demonstrated the instrument in a case study breast cancer patient; the reconstructed 3D image of endogenous chromophores and scattering gave tumor localization in agreement with MRI. Conclusions: Imaging with a novel parallel-plate DOT breast imager that employs highly parallel, high-resolution CCD detection in the frequency-domain was demonstrated.« less

  1. AdiosStMan: Parallelizing Casacore Table Data System using Adaptive IO System

    NASA Astrophysics Data System (ADS)

    Wang, R.; Harris, C.; Wicenec, A.

    2016-07-01

    In this paper, we investigate the Casacore Table Data System (CTDS) used in the casacore and CASA libraries, and methods to parallelize it. CTDS provides a storage manager plugin mechanism for third-party developers to design and implement their own CTDS storage managers. Having this in mind, we looked into various storage backend techniques that can possibly enable parallel I/O for CTDS by implementing new storage managers. After carrying on benchmarks showing the excellent parallel I/O throughput of the Adaptive IO System (ADIOS), we implemented an ADIOS based parallel CTDS storage manager. We then applied the CASA MSTransform frequency split task to verify the ADIOS Storage Manager. We also ran a series of performance tests to examine the I/O throughput in a massively parallel scenario.

  2. Broadcasting collective operation contributions throughout a parallel computer

    DOEpatents

    Faraj, Ahmad [Rochester, MN

    2012-02-21

    Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.

  3. Parallel design patterns for a low-power, software-defined compressed video encoder

    NASA Astrophysics Data System (ADS)

    Bruns, Michael W.; Hunt, Martin A.; Prasad, Durga; Gunupudi, Nageswara R.; Sonachalam, Sekar

    2011-06-01

    Video compression algorithms such as H.264 offer much potential for parallel processing that is not always exploited by the technology of a particular implementation. Consumer mobile encoding devices often achieve real-time performance and low power consumption through parallel processing in Application Specific Integrated Circuit (ASIC) technology, but many other applications require a software-defined encoder. High quality compression features needed for some applications such as 10-bit sample depth or 4:2:2 chroma format often go beyond the capability of a typical consumer electronics device. An application may also need to efficiently combine compression with other functions such as noise reduction, image stabilization, real time clocks, GPS data, mission/ESD/user data or software-defined radio in a low power, field upgradable implementation. Low power, software-defined encoders may be implemented using a massively parallel memory-network processor array with 100 or more cores and distributed memory. The large number of processor elements allow the silicon device to operate more efficiently than conventional DSP or CPU technology. A dataflow programming methodology may be used to express all of the encoding processes including motion compensation, transform and quantization, and entropy coding. This is a declarative programming model in which the parallelism of the compression algorithm is expressed as a hierarchical graph of tasks with message communication. Data parallel and task parallel design patterns are supported without the need for explicit global synchronization control. An example is described of an H.264 encoder developed for a commercially available, massively parallel memorynetwork processor device.

  4. An architecture for real-time vision processing

    NASA Technical Reports Server (NTRS)

    Chien, Chiun-Hong

    1994-01-01

    To study the feasibility of developing an architecture for real time vision processing, a task queue server and parallel algorithms for two vision operations were designed and implemented on an i860-based Mercury Computing System 860VS array processor. The proposed architecture treats each vision function as a task or set of tasks which may be recursively divided into subtasks and processed by multiple processors coordinated by a task queue server accessible by all processors. Each idle processor subsequently fetches a task and associated data from the task queue server for processing and posts the result to shared memory for later use. Load balancing can be carried out within the processing system without the requirement for a centralized controller. The author concludes that real time vision processing cannot be achieved without both sequential and parallel vision algorithms and a good parallel vision architecture.

  5. Breast ultrasound tomography with two parallel transducer arrays: preliminary clinical results

    NASA Astrophysics Data System (ADS)

    Huang, Lianjie; Shin, Junseob; Chen, Ting; Lin, Youzuo; Intrator, Miranda; Hanson, Kenneth; Epstein, Katherine; Sandoval, Daniel; Williamson, Michael

    2015-03-01

    Ultrasound tomography has great potential to provide quantitative estimations of physical properties of breast tumors for accurate characterization of breast cancer. We design and manufacture a new synthetic-aperture breast ultrasound tomography system with two parallel transducer arrays. The distance of these two transducer arrays is adjustable for scanning breasts with different sizes. The ultrasound transducer arrays are translated vertically to scan the entire breast slice by slice and acquires ultrasound transmission and reflection data for whole-breast ultrasound imaging and tomographic reconstructions. We use the system to acquire patient data at the University of New Mexico Hospital for clinical studies. We present some preliminary imaging results of in vivo patient ultrasound data. Our preliminary clinical imaging results show promising of our breast ultrasound tomography system with two parallel transducer arrays for breast cancer imaging and characterization.

  6. Hazards Due to Overdischarge in Lithium-ion Cylindrical Cells in Multi-cell Configurations

    NASA Technical Reports Server (NTRS)

    Jeevarajan, Judith; Strangways, Brad; Nelson, Tim

    2010-01-01

    Lithium-ion cells in the cylindrical Commercial-off-the-shelf 18650 design format were used to study the hazards associated with overdischarge. The cells in series or in parallel configurations were subjected to different conditions of overdischarge. The cells in parallel configurations were all overdischarged to 2.0 V for 75 cycles with one cell removed at 25 cycles to study the health of the cell. The cells in series were designed to be in an unbalanced configuration by discharging one cell in each series configuration before the start of test. The discharge consisted of removing a pre-determined capacity from the cell. This ranged from 50 to 150 mAh removal. The cells were discharged down to a predetermined end-of-discharge voltage cutoff which allowed the cell with lower capacity to go into an overdischarge mode. The cell modules that survived the 75 cycles were subjected to one overvoltage test to 4.4 V/cell.

  7. Parallel language constructs for tensor product computations on loosely coupled architectures

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Vanrosendale, John

    1989-01-01

    Distributed memory architectures offer high levels of performance and flexibility, but have proven awkard to program. Current languages for nonshared memory architectures provide a relatively low level programming environment, and are poorly suited to modular programming, and to the construction of libraries. A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. Tensor product array computations are focused on along with a simple but important class of numerical algorithms. The problem of programming 1-D kernal routines is focused on first, such as parallel tridiagonal solvers, and then how such parallel kernels can be combined to form parallel tensor product algorithms is examined.

  8. An object-oriented approach to nested data parallelism

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.; Chatterjee, Siddhartha

    1994-01-01

    This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are also collections, then there is the possibility for 'nested data parallelism.' Few current programming languages support nested data parallelism however. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.

  9. A fast pulse design for parallel excitation with gridding conjugate gradient.

    PubMed

    Feng, Shuo; Ji, Jim

    2013-01-01

    Parallel excitation (pTx) is recognized as a crucial technique in high field MRI to address the transmit field inhomogeneity problem. However, it can be time consuming to design pTx pulses which is not desirable. In this work, we propose a pulse design with gridding conjugate gradient (CG) based on the small-tip-angle approximation. The two major time consuming matrix-vector multiplications are substituted by two operators which involves with FFT and gridding only. Simulation results have shown that the proposed method is 3 times faster than conventional method and the memory cost is reduced by 1000 times.

  10. Modeling and Dynamic Analysis of Paralleled of dc/dc Converters with Master-Slave Current Sharing Control

    NASA Technical Reports Server (NTRS)

    Rajagopalan, J.; Xing, K.; Guo, Y.; Lee, F. C.; Manners, Bruce

    1996-01-01

    A simple, application-oriented, transfer function model of paralleled converters employing Master-Slave Current-sharing (MSC) control is developed. Dynamically, the Master converter retains its original design characteristics; all the Slave converters are forced to depart significantly from their original design characteristics into current-controlled current sources. Five distinct loop gains to assess system stability and performance are identified and their physical significance is described. A design methodology for the current share compensator is presented. The effect of this current sharing scheme on 'system output impedance' is analyzed.

  11. An overview of confounding. Part 1: the concept and how to address it.

    PubMed

    Howards, Penelope P

    2018-04-01

    Confounding is an important source of bias, but it is often misunderstood. We consider how confounding occurs and how to address confounding using examples. Study results are confounded when the effect of the exposure on the outcome, mixes with the effects of other risk and protective factors for the outcome. This problem arises when these factors are present to different degrees among the exposed and unexposed study participants, but not all differences between the groups result in confounding. Thinking about an ideal study where all of the population of interest is exposed in one universe and is unexposed in a parallel universe helps to distinguish confounders from other differences. In an actual study, an observed unexposed population is chosen to stand in for the unobserved parallel universe. Differences between this substitute population and the parallel universe result in confounding. Confounding by identified factors can be addressed analytically and through study design, but only randomization has the potential to address confounding by unmeasured factors. Nevertheless, a given randomized study may still be confounded. Confounded study results can lead to incorrect conclusions about the effect of the exposure of interest on the outcome. © 2018 Nordic Federation of Societies of Obstetrics and Gynecology.

  12. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  13. The BLAZE language - A parallel language for scientific programming

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Van Rosendale, John

    1987-01-01

    A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.

  14. A model for optimizing file access patterns using spatio-temporal parallelism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boonthanome, Nouanesengsy; Patchett, John; Geveci, Berk

    2013-01-01

    For many years now, I/O read time has been recognized as the primary bottleneck for parallel visualization and analysis of large-scale data. In this paper, we introduce a model that can estimate the read time for a file stored in a parallel filesystem when given the file access pattern. Read times ultimately depend on how the file is stored and the access pattern used to read the file. The file access pattern will be dictated by the type of parallel decomposition used. We employ spatio-temporal parallelism, which combines both spatial and temporal parallelism, to provide greater flexibility to possible filemore » access patterns. Using our model, we were able to configure the spatio-temporal parallelism to design optimized read access patterns that resulted in a speedup factor of approximately 400 over traditional file access patterns.« less

  15. Joint design of large-tip-angle parallel RF pulses and blipped gradient trajectories.

    PubMed

    Cao, Zhipeng; Donahue, Manus J; Ma, Jun; Grissom, William A

    2016-03-01

    To design multichannel large-tip-angle kT-points and spokes radiofrequency (RF) pulses and gradient waveforms for transmit field inhomogeneity compensation in high field magnetic resonance imaging. An algorithm to design RF subpulse weights and gradient blip areas is proposed to minimize a magnitude least-squares cost function that measures the difference between realized and desired state parameters in the spin domain, and penalizes integrated RF power. The minimization problem is solved iteratively with interleaved target phase updates, RF subpulse weights updates using the conjugate gradient method with optimal control-based derivatives, and gradient blip area updates using the conjugate gradient method. Two-channel parallel transmit simulations and experiments were conducted in phantoms and human subjects at 7 T to demonstrate the method and compare it to small-tip-angle-designed pulses and circularly polarized excitations. The proposed algorithm designed more homogeneous and accurate 180° inversion and refocusing pulses than other methods. It also designed large-tip-angle pulses on multiple frequency bands with independent and joint phase relaxation. Pulses designed by the method improved specificity and contrast-to-noise ratio in a finger-tapping spin echo blood oxygen level dependent functional magnetic resonance imaging study, compared with circularly polarized mode refocusing. A joint RF and gradient waveform design algorithm was proposed and validated to improve large-tip-angle inversion and refocusing at ultrahigh field. © 2015 Wiley Periodicals, Inc.

  16. On the relationship between parallel computation and graph embedding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gupta, A.K.

    1989-01-01

    The problem of efficiently simulating an algorithm designed for an n-processor parallel machine G on an m-processor parallel machine H with n > m arises when parallel algorithms designed for an ideal size machine are simulated on existing machines which are of a fixed size. The author studies this problem when every processor of H takes over the function of a number of processors in G, and he phrases the simulation problem as a graph embedding problem. New embeddings presented address relevant issues arising from the parallel computation environment. The main focus centers around embedding complete binary trees into smaller-sizedmore » binary trees, butterflies, and hypercubes. He also considers simultaneous embeddings of r source machines into a single hypercube. Constant factors play a crucial role in his embeddings since they are not only important in practice but also lead to interesting theoretical problems. All of his embeddings minimize dilation and load, which are the conventional cost measures in graph embeddings and determine the maximum amount of time required to simulate one step of G on H. His embeddings also optimize a new cost measure called ({alpha},{beta})-utilization which characterizes how evenly the processors of H are used by the processors of G. Ideally, the utilization should be balanced (i.e., every processor of H simulates at most (n/m) processors of G) and the ({alpha},{beta})-utilization measures how far off from a balanced utilization the embedding is. He presents embeddings for the situation when some processors of G have different capabilities (e.g. memory or I/O) than others and the processors with different capabilities are to be distributed uniformly among the processors of H. Placing such conditions on an embedding results in an increase in some of the cost measures.« less

  17. The Implementation Of Solid State Switches In A Parallel Configuration To Gain Output Current Capacity In A High Current Capacitive Discharge Unit (CDU).

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chaves, Mario Paul

    2017-07-01

    For my project I have selected to research and design a high current pulse system, which will be externally triggered from a 5V pulse. The research will be conducted in the region of paralleling the solid state switches for a higher current output, as well as to see if there will be any other advantages in doing so. The end use of the paralleled solid state switches will be used on a Capacitive Discharge Unit (CDU). For the first part of my project, I have set my focus on the design of the circuit, selection of components, and simulation ofmore » the circuit.« less

  18. An Old Story in the Parallel Synthesis World: An Approach to Hydantoin Libraries.

    PubMed

    Bogolubsky, Andrey V; Moroz, Yurii S; Savych, Olena; Pipko, Sergey; Konovets, Angelika; Platonov, Maxim O; Vasylchenko, Oleksandr V; Hurmach, Vasyl V; Grygorenko, Oleksandr O

    2018-01-08

    An approach to the parallel synthesis of hydantoin libraries by reaction of in situ generated 2,2,2-trifluoroethylcarbamates and α-amino esters was developed. To demonstrate utility of the method, a library of 1158 hydantoins designed according to the lead-likeness criteria (MW 200-350, cLogP 1-3) was prepared. The success rate of the method was analyzed as a function of physicochemical parameters of the products, and it was found that the method can be considered as a tool for lead-oriented synthesis. A hydantoin-bearing submicromolar primary hit acting as an Aurora kinase A inhibitor was discovered with a combination of rational design, parallel synthesis using the procedures developed, in silico and in vitro screenings.

  19. Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient

    PubMed Central

    Feng, Shuo

    2014-01-01

    Parallel excitation (pTx) techniques with multiple transmit channels have been widely used in high field MRI imaging to shorten the RF pulse duration and/or reduce the specific absorption rate (SAR). However, the efficiency of pulse design still needs substantial improvement for practical real-time applications. In this paper, we present a detailed description of a fast pulse design method with Fourier domain gridding and a conjugate gradient method. Simulation results of the proposed method show that the proposed method can design pTx pulses at an efficiency 10 times higher than that of the conventional conjugate-gradient based method, without reducing the accuracy of the desirable excitation patterns. PMID:24834420

  20. Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient.

    PubMed

    Feng, Shuo; Ji, Jim

    2014-04-01

    Parallel excitation (pTx) techniques with multiple transmit channels have been widely used in high field MRI imaging to shorten the RF pulse duration and/or reduce the specific absorption rate (SAR). However, the efficiency of pulse design still needs substantial improvement for practical real-time applications. In this paper, we present a detailed description of a fast pulse design method with Fourier domain gridding and a conjugate gradient method. Simulation results of the proposed method show that the proposed method can design pTx pulses at an efficiency 10 times higher than that of the conventional conjugate-gradient based method, without reducing the accuracy of the desirable excitation patterns.

  1. A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

    PubMed Central

    Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay

    2012-01-01

    A design of systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for Basic Local Alignment Search Tool (BLAST) Algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm which has been used by bioinformatics experts. In contrast to other designs that detect at most one hit in one-clock-cycle, our design applies a Multiple Hits Detection Module which is a pipelining systolic array to search multiple hits in a single-clock-cycle. Further, we designed a Hits Combination Block which combines overlapping hits from systolic array into one hit. These implementations completed the first and second step of BLAST architecture and achieved significant speedup comparing with previously published architectures. PMID:25969747

  2. Can sequential parallel comparison design and two-way enriched design be useful in medical device clinical trials?

    PubMed

    Ivanova, Anastasia; Zhang, Zhiwei; Thompson, Laura; Yang, Ying; Kotz, Richard M; Fang, Xin

    2016-01-01

    Sequential parallel comparison design (SPCD) was proposed for trials with high placebo response. In the first stage of SPCD subjects are randomized between placebo and active treatment. In the second stage placebo nonresponders are re-randomized between placebo and active treatment. Data from the population of "all comers" and the subpopulations of placebo nonresponders then combined to yield a single p-value for treatment comparison. Two-way enriched design (TED) is an extension of SPCD where active treatment responders are also re-randomized between placebo and active treatment in Stage 2. This article investigates the potential uses of SPCD and TED in medical device trials.

  3. Performance of the Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance 1/O to applications the applications that rely on them. In Section 3 we describe that access data in patterns that have been observed to be common.

  4. Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform

    PubMed Central

    Wang, Min; Tian, Yun

    2018-01-01

    The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach system speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the obvious superiority of our method. The proposed algorithm in this study demonstrates both better edge detection performance and improved time performance. PMID:29861711

  5. Accuracy analysis and design of A3 parallel spindle head

    NASA Astrophysics Data System (ADS)

    Ni, Yanbing; Zhang, Biao; Sun, Yupeng; Zhang, Yuan

    2016-03-01

    As functional components of machine tools, parallel mechanisms are widely used in high efficiency machining of aviation components, and accuracy is one of the critical technical indexes. Lots of researchers have focused on the accuracy problem of parallel mechanisms, but in terms of controlling the errors and improving the accuracy in the stage of design and manufacturing, further efforts are required. Aiming at the accuracy design of a 3-DOF parallel spindle head(A3 head), its error model, sensitivity analysis and tolerance allocation are investigated. Based on the inverse kinematic analysis, the error model of A3 head is established by using the first-order perturbation theory and vector chain method. According to the mapping property of motion and constraint Jacobian matrix, the compensatable and uncompensatable error sources which affect the accuracy in the end-effector are separated. Furthermore, sensitivity analysis is performed on the uncompensatable error sources. The sensitivity probabilistic model is established and the global sensitivity index is proposed to analyze the influence of the uncompensatable error sources on the accuracy in the end-effector of the mechanism. The results show that orientation error sources have bigger effect on the accuracy in the end-effector. Based upon the sensitivity analysis results, the tolerance design is converted into the issue of nonlinearly constrained optimization with the manufacturing cost minimum being the optimization objective. By utilizing the genetic algorithm, the allocation of the tolerances on each component is finally determined. According to the tolerance allocation results, the tolerance ranges of ten kinds of geometric error sources are obtained. These research achievements can provide fundamental guidelines for component manufacturing and assembly of this kind of parallel mechanisms.

  6. Quality Regulation in Expansion of Educational Systems: A Case of Privately Sponsored Students' Programme in Kenya's Public Universities

    ERIC Educational Resources Information Center

    Yego, Helen J. C.

    2016-01-01

    This paper examines the expansion and management of quality of parallel programmes in Kenya's public universities. The study is based on Privately Sponsored Students Programmes (PSSP) at Moi University and its satellite campuses in Kenya. The study was descriptive in nature and adopted an ex-post facto research design. The study sample consisted…

  7. Computational mechanics analysis tools for parallel-vector supercomputers

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

    1993-01-01

    Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.

  8. A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Heber, Gerd; Biswas, Rupak; Gao, Guang R.

    1999-01-01

    Classical mesh partitioning algorithms were designed for rather static situations, and their straightforward application in a dynamical framework may lead to unsatisfactory results, e.g., excessive data migration among processors. Furthermore, special attention should be paid to their amenability to parallelization. In this paper, a novel parallel method for the dynamic partitioning of adaptive unstructured meshes is described. It is based on a linear representation of the mesh using self-avoiding walks.

  9. Coding for parallel execution of hardware-in-the-loop millimeter-wave scene generation models on multicore SIMD processor architectures

    NASA Astrophysics Data System (ADS)

    Olson, Richard F.

    2013-05-01

    Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, highlevel parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.

  10. Optimization of a new flow design for solid oxide cells using computational fluid dynamics modelling

    NASA Astrophysics Data System (ADS)

    Duhn, Jakob Dragsbæk; Jensen, Anker Degn; Wedel, Stig; Wix, Christian

    2016-12-01

    Design of a gas distributor to distribute gas flow into parallel channels for Solid Oxide Cells (SOC) is optimized, with respect to flow distribution, using Computational Fluid Dynamics (CFD) modelling. The CFD model is based on a 3d geometric model and the optimized structural parameters include the width of the channels in the gas distributor and the area in front of the parallel channels. The flow of the optimized design is found to have a flow uniformity index value of 0.978. The effects of deviations from the assumptions used in the modelling (isothermal and non-reacting flow) are evaluated and it is found that a temperature gradient along the parallel channels does not affect the flow uniformity, whereas a temperature difference between the channels does. The impact of the flow distribution on the maximum obtainable conversion during operation is also investigated and the obtainable overall conversion is found to be directly proportional to the flow uniformity. Finally the effect of manufacturing errors is investigated. The design is shown to be robust towards deviations from design dimensions of at least ±0.1 mm which is well within obtainable tolerances.

  11. Massively parallel information processing systems for space applications

    NASA Technical Reports Server (NTRS)

    Schaefer, D. H.

    1979-01-01

    NASA is developing massively parallel systems for ultra high speed processing of digital image data collected by satellite borne instrumentation. Such systems contain thousands of processing elements. Work is underway on the design and fabrication of the 'Massively Parallel Processor', a ground computer containing 16,384 processing elements arranged in a 128 x 128 array. This computer uses existing technology. Advanced work includes the development of semiconductor chips containing thousands of feedthrough paths. Massively parallel image analog to digital conversion technology is also being developed. The goal is to provide compact computers suitable for real-time onboard processing of images.

  12. Formalization, equivalence and generalization of basic resonance electrical circuits

    NASA Astrophysics Data System (ADS)

    Penev, Dimitar; Arnaudov, Dimitar; Hinov, Nikolay

    2017-12-01

    In the work are presented basic resonance circuits, which are used in resonance energy converters. The following resonant circuits are considered: serial, serial with parallel load parallel capacitor, parallel and parallel with serial loaded inductance. For the circuits under consideration, expressions are generated for the frequencies of own oscillations and for the equivalence of the active power emitted in the load. Mathematical expressions are graphically constructed and verified using computer simulations. The results obtained are used in the model based design of resonant energy converters with DC or AC output. This guaranteed the output indicators of power electronic devices.

  13. Programming parallel architectures: The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  14. Rubus: A compiler for seamless and extensible parallelism.

    PubMed

    Adnan, Muhammad; Aslam, Faisal; Nawaz, Zubair; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84 times has been achieved by Rubus on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program.

  15. Rubus: A compiler for seamless and extensible parallelism

    PubMed Central

    Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

    2017-01-01

    Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer’s expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84 times has been achieved by Rubus on the same GPU. Moreover, Rubus achieves this performance without drastically increasing the memory footprint of a program. PMID:29211758

  16. LDRD final report on massively-parallel linear programming : the parPCx system.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runsmore » on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Interger and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.« less

  17. What Multilevel Parallel Programs do when you are not Watching: A Performance Analysis Case Study Comparing MPI/OpenMP, MLP, and Nested OpenMP

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Labarta, Jesus; Gimenez, Judit

    2004-01-01

    With the current trend in parallel computer architectures towards clusters of shared memory symmetric multi-processors, parallel programming techniques have evolved that support parallelism beyond a single level. When comparing the performance of applications based on different programming paradigms, it is important to differentiate between the influence of the programming model itself and other factors, such as implementation specific behavior of the operating system (OS) or architectural issues. Rewriting-a large scientific application in order to employ a new programming paradigms is usually a time consuming and error prone task. Before embarking on such an endeavor it is important to determine that there is really a gain that would not be possible with the current implementation. A detailed performance analysis is crucial to clarify these issues. The multilevel programming paradigms considered in this study are hybrid MPI/OpenMP, MLP, and nested OpenMP. The hybrid MPI/OpenMP approach is based on using MPI [7] for the coarse grained parallelization and OpenMP [9] for fine grained loop level parallelism. The MPI programming paradigm assumes a private address space for each process. Data is transferred by explicitly exchanging messages via calls to the MPI library. This model was originally designed for distributed memory architectures but is also suitable for shared memory systems. The second paradigm under consideration is MLP which was developed by Taft. The approach is similar to MPi/OpenMP, using a mix of coarse grain process level parallelization and loop level OpenMP parallelization. As it is the case with MPI, a private address space is assumed for each process. The MLP approach was developed for ccNUMA architectures and explicitly takes advantage of the availability of shared memory. A shared memory arena which is accessible by all processes is required. Communication is done by reading from and writing to the shared memory.

  18. Stability of Large Parallel Tunnels Excavated in Weak Rocks: A Case Study

    NASA Astrophysics Data System (ADS)

    Ding, Xiuli; Weng, Yonghong; Zhang, Yuting; Xu, Tangjin; Wang, Tuanle; Rao, Zhiwen; Qi, Zufang

    2017-09-01

    Diversion tunnels are important structures for hydropower projects but are always placed in locations with less favorable geological conditions than those in which other structures are placed. Because diversion tunnels are usually large and closely spaced, the rock pillar between adjacent tunnels in weak rocks is affected on both sides, and conventional support measures may not be adequate to achieve the required stability. Thus, appropriate reinforcement support measures are needed, and the design philosophy regarding large parallel tunnels in weak rocks should be updated. This paper reports a recent case in which two large parallel diversion tunnels are excavated. The rock masses are thin- to ultra-thin-layered strata coated with phyllitic films, which significantly decrease the soundness and strength of the strata and weaken the rocks. The behaviors of the surrounding rock masses under original (and conventional) support measures are detailed in terms of rock mass deformation, anchor bolt stress, and the extent of the excavation disturbed zone (EDZ), as obtained from safety monitoring and field testing. In situ observed phenomena and their interpretation are also included. The sidewall deformations exhibit significant time-dependent characteristics, and large magnitudes are recorded. The stresses in the anchor bolts are small, but the extents of the EDZs are large. The stability condition under the original support measures is evaluated as poor. To enhance rock mass stability, attempts are made to reinforce support design and improve safety monitoring programs. The main feature of these attempts is the use of prestressed cables that run through the rock pillar between the parallel tunnels. The efficacy of reinforcement support measures is verified by further safety monitoring data and field test results. Numerical analysis is constantly performed during the construction process to provide a useful reference for decision making. The calculated deformations are in good agreement with the measured data, and the calculated forces of newly added cables show that the designed reinforcement is necessary and ensures sufficient stability. Finally, the role of safety monitoring in the evaluation of rock mass stability and the consideration of tunnel group effect are discussed. The work described in this paper aims to deepen the understanding of rock mass behaviors of large parallel tunnels in weak rocks and to improve the design philosophy.

  19. Performance of the SERI parallel-passage dehumidifer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schlepp, D.; Barlow, R.

    1984-09-01

    The key component in improving the performance of solar desiccant cooling systems is the dehumidifier. A parallel-passage geometry for the desiccant dehumidifier has been identified as meeting key criteria of low pressure drop, high mass transfer efficiency, and compact size. An experimental program to build and test a small-scale prototype of this design was undertaken in FY 1982, and the results are presented in this report. Computer models to predict the adsorption/desorption behavior of desiccant dehumidifiers were updated to take into account the geometry of the bed and predict potential system performance using the new component design. The parallel-passage designmore » proved to have high mass transfer effectiveness and low pressure drop over a wide range of test conditions typical of desiccant cooling system operation. The prototype dehumidifier averaged 93% effectiveness at pressure drops of less than 50 Pa at design point conditions. Predictions of system performance using models validated with the experimental data indicate that system thermal coefficients of performance (COPs) of 1.0 to 1.2 and electrical COPs above 8.5 are possible using this design.« less

  20. A high-speed linear algebra library with automatic parallelism

    NASA Technical Reports Server (NTRS)

    Boucher, Michael L.

    1994-01-01

    Parallel or distributed processing is key to getting highest performance workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely small even though there are numerous computationally demanding programs that would significantly benefit from application of parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.

  1. Classroom management of situated group learning: A research study of two teaching strategies

    NASA Astrophysics Data System (ADS)

    Smeh, Kathy; Fawns, Rod

    2000-06-01

    Although peer-based work is encouraged by theories in developmental psychology and although classroom interventions suggest it is effective, there are grounds for recognising that young pupils find collaborative learning hard to sustain. Discontinuities in collaborative skill during development have been suggested as one interpretation. Theory and research have neglected situational continuities that the teacher may provide in management of formal and informal collaborations. This experimental study, with the collaboration of the science faculty in one urban secondary college, investigated the effect of two role attribution strategies on communication in peer groups of different gender composition in three parallel Year 8 science classes. The group were set a problem that required them to design an experiment to compare the thermal insulating properties of two different materials. This presents the data collected and key findings, and reviews the findings from previous parallel studies that have employed the same research design in different school settings. The results confirm the effectiveness of social role attribution strategies in teacher management of communication in peer-based work.

  2. Long-Term Chinese Students' Transitional Experiences in UK Higher Education: A Particular Focus on Their Academic Adjustment

    ERIC Educational Resources Information Center

    Wang, Isobel Kai-Hui

    2018-01-01

    The global population of students pursuing studies abroad continues to grow, and consequently their intercultural experiences are receiving greater research attention. However, research into long-term student sojourners' academic development and personal growth is still in its infancy. A parallel mixed method study was designed to investigate the…

  3. Interventions to Reduce Distress in Adult Victims of Rape and Sexual Violence: A Systematic Review

    ERIC Educational Resources Information Center

    Regehr, Cheryl; Alaggia, Ramona; Dennis, Jane; Pitts, Annabel; Saini, Michael

    2013-01-01

    Objectives: This article presents a systematic evaluation of the effectiveness of interventions aimed at reducing distress in adult victims of rape and sexual violence. Method: Studies were eligible for the review if the assignment of study participants to experimental or control groups was by random allocation or parallel cohort design. Results:…

  4. Metascalable molecular dynamics simulation of nano-mechano-chemistry

    NASA Astrophysics Data System (ADS)

    Shimojo, F.; Kalia, R. K.; Nakano, A.; Nomura, K.; Vashishta, P.

    2008-07-01

    We have developed a metascalable (or 'design once, scale on new architectures') parallel application-development framework for first-principles based simulations of nano-mechano-chemical processes on emerging petaflops architectures based on spatiotemporal data locality principles. The framework consists of (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms, (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these scalable algorithms onto hardware. The EDC-STEP-HCD framework exposes and expresses maximal concurrency and data locality, thereby achieving parallel efficiency as high as 0.99 for 1.59-billion-atom reactive force field molecular dynamics (MD) and 17.7-million-atom (1.56 trillion electronic degrees of freedom) quantum mechanical (QM) MD in the framework of the density functional theory (DFT) on adaptive multigrids, in addition to 201-billion-atom nonreactive MD, on 196 608 IBM BlueGene/L processors. We have also used the framework for automated execution of adaptive hybrid DFT/MD simulation on a grid of six supercomputers in the US and Japan, in which the number of processors changed dynamically on demand and tasks were migrated according to unexpected faults. The paper presents the application of the framework to the study of nanoenergetic materials: (1) combustion of an Al/Fe2O3 thermite and (2) shock initiation and reactive nanojets at a void in an energetic crystal.

  5. Simulation framework for intelligent transportation systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ewing, T.; Doss, E.; Hanebutte, U.

    1996-10-01

    A simulation framework has been developed for a large-scale, comprehensive, scaleable simulation of an Intelligent Transportation System (ITS). The simulator is designed for running on parallel computers and distributed (networked) computer systems, but can run on standalone workstations for smaller simulations. The simulator currently models instrumented smart vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide two-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphicalmore » user interfaces to support human-factors studies. Realistic modeling of variations of the posted driving speed are based on human factors studies that take into consideration weather, road conditions, driver personality and behavior, and vehicle type. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on parallel computers, such as ANL`s IBM SP-2, for large-scale problems. A novel feature of the approach is that vehicles are represented by autonomous computer processes which exchange messages with other processes. The vehicles have a behavior model which governs route selection and driving behavior, and can react to external traffic events much like real vehicles. With this approach, the simulation is scaleable to take advantage of emerging massively parallel processor (MPP) systems.« less

  6. Design of high-performance parallelized gene predictors in MATLAB.

    PubMed

    Rivard, Sylvain Robert; Mailloux, Jean-Gabriel; Beguenane, Rachid; Bui, Hung Tien

    2012-04-10

    This paper proposes a method of implementing parallel gene prediction algorithms in MATLAB. The proposed designs are based on either Goertzel's algorithm or on FFTs and have been implemented using varying amounts of parallelism on a central processing unit (CPU) and on a graphics processing unit (GPU). Results show that an implementation using a straightforward approach can require over 4.5 h to process 15 million base pairs (bps) whereas a properly designed one could perform the same task in less than five minutes. In the best case, a GPU implementation can yield these results in 57 s. The present work shows how parallelism can be used in MATLAB for gene prediction in very large DNA sequences to produce results that are over 270 times faster than a conventional approach. This is significant as MATLAB is typically overlooked due to its apparent slow processing time even though it offers a convenient environment for bioinformatics. From a practical standpoint, this work proposes two strategies for accelerating genome data processing which rely on different parallelization mechanisms. Using a CPU, the work shows that direct access to the MEX function increases execution speed and that the PARFOR construct should be used in order to take full advantage of the parallelizable Goertzel implementation. When the target is a GPU, the work shows that data needs to be segmented into manageable sizes within the GFOR construct before processing in order to minimize execution time.

  7. Increasing airport capacity with modified IFR approach procedures for close-spaced parallel runways

    DOT National Transportation Integrated Search

    2001-01-01

    Because of wake turbulence considerations, current instrument approach : procedures treat close-spaced (i.e., less than 2,500 feet apart) parallel run : ways as a single runway. This restriction is designed to assure safety for all : aircraft types u...

  8. A Sample Application for Use of Biography in Social Studies; Science, Technology and Social Change Course

    ERIC Educational Resources Information Center

    Er, Harun

    2017-01-01

    The aim of this study is to evaluate the opinions of social studies teacher candidates on use of biography in science, technology and social change course given in the undergraduate program of social studies education. In this regard, convergent parallel design as a mixed research pattern was used to make use of both qualitative and quantitative…

  9. Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN

    PubMed Central

    Hammond, G E; Lichtner, P C; Mills, R T

    2014-01-01

    [1] To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted. PMID:25506097

  10. Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN.

    PubMed

    Hammond, G E; Lichtner, P C; Mills, R T

    2014-01-01

    [1] To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted.

  11. The AIS-5000 parallel processor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schmitt, L.A.; Wilson, S.S.

    1988-05-01

    The AIS-5000 is a commercially available massively parallel processor which has been designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. This architecture has superior cost/performance characteristics than two-dimensional mesh-connected systems. The design of the processing elements and their interconnections as well as the software used to program the system allow a wide variety of algorithms and applications to be implemented. In thismore » paper, the overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O pathways and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is presented. This model is supported by the AIS-5000 hardware and software and allows the system to be treated as a full-image-size, two-dimensional, mesh-connected parallel processor. Performance bench marks are given for certain simple and complex functions.« less

  12. Evaluation of concurrent priority queue algorithms. Technical report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Q.

    1991-02-01

    The priority queue is a fundamental data structure that is used in a large variety of parallel algorithms, such as multiprocessor scheduling and parallel best-first search of state-space graphs. This thesis addresses the design and experimental evaluation of two novel concurrent priority queues: a parallel Fibonacci heap and a concurrent priority pool, and compares them with the concurrent binary heap. The parallel Fibonacci heap is based on the sequential Fibonacci heap, which is theoretically the most efficient data structure for sequential priority queues. This scheme not only preserves the efficient operation time bounds of its sequential counterpart, but also hasmore » very low contention by distributing locks over the entire data structure. The experimental results show its linearly scalable throughput and speedup up to as many processors as tested (currently 18). A concurrent access scheme for a doubly linked list is described as part of the implementation of the parallel Fibonacci heap. The concurrent priority pool is based on the concurrent B-tree and the concurrent pool. The concurrent priority pool has the highest throughput among the priority queues studied. Like the parallel Fibonacci heap, the concurrent priority pool scales linearly up to as many processors as tested. The priority queues are evaluated in terms of throughput and speedup. Some applications of concurrent priority queues such as the vertex cover problem and the single source shortest path problem are tested.« less

  13. Parallel Geographic Variation in Drosophila melanogaster

    PubMed Central

    Reinhardt, Josie A.; Kolaczkowski, Bryan; Jones, Corbin D.; Begun, David J.; Kern, Andrew D.

    2014-01-01

    Drosophila melanogaster, an ancestrally African species, has recently spread throughout the world, associated with human activity. The species has served as the focus of many studies investigating local adaptation relating to latitudinal variation in non-African populations, especially those from the United States and Australia. These studies have documented the existence of shared, genetically determined phenotypic clines for several life history and morphological traits. However, there are no studies designed to formally address the degree of shared latitudinal differentiation at the genomic level. Here we present our comparative analysis of such differentiation. Not surprisingly, we find evidence of substantial, shared selection responses on the two continents, probably resulting from selection on standing ancestral variation. The polymorphic inversion In(3R)P has an important effect on this pattern, but considerable parallelism is also observed across the genome in regions not associated with inversion polymorphism. Interestingly, parallel latitudinal differentiation is observed even for variants that are not particularly strongly differentiated, which suggests that very large numbers of polymorphisms are targets of spatially varying selection in this species. PMID:24610860

  14. A Parallel Framework with Block Matrices of a Discrete Fourier Transform for Vector-Valued Discrete-Time Signals.

    PubMed

    Soto-Quiros, Pablo

    2015-01-01

    This paper presents a parallel implementation of a kind of discrete Fourier transform (DFT): the vector-valued DFT. The vector-valued DFT is a novel tool to analyze the spectra of vector-valued discrete-time signals. This parallel implementation is developed in terms of a mathematical framework with a set of block matrix operations. These block matrix operations contribute to analysis, design, and implementation of parallel algorithms in multicore processors. In this work, an implementation and experimental investigation of the mathematical framework are performed using MATLAB with the Parallel Computing Toolbox. We found that there is advantage to use multicore processors and a parallel computing environment to minimize the high execution time. Additionally, speedup increases when the number of logical processors and length of the signal increase.

  15. Roller-gear drives for robotic manipulators design, fabrication and test

    NASA Technical Reports Server (NTRS)

    Anderson, William J.; Shipitalo, William

    1991-01-01

    Two single axis planetary roller-gear drives and a two axis roller-gear drive with dual inputs were designed for use as robotic transmissions. Each of the single axis drives is a two planet row, four planet arrangement with spur gears and compressively loaded cylindrical rollers acting in parallel. The two axis drive employs bevel gears and cone rollers acting in parallel. The rollers serve a dual function: they remove backlash from the system, and they transmit torque when the gears are not fully engaged.

  16. The Area-Time Complexity of Sorting.

    DTIC Science & Technology

    1984-12-01

    suggests a classification of keys into short (k < logn), long (k > 2 logn), and of medium length. Optimal or near-optimal designs of VLSI sorters are...suggests a classification of keys into short (k 4 logn ), long (k > 21ogn ), and of medium length. Optimal or near-optimal designs of VLSI sorters are...ARCHITECTURES 79 5.1 Introduction 79 5.2 Parallel Algorithms for Sorting 80 . 5.3 Parallel Architectures 88 6 OPTIMAL VLSI SORTERS FOR KEYS OF LENGTH k - logn

  17. Experimental entangled photon pair generation using crystals with parallel optical axes.

    PubMed

    Villar, Aitor; Lohrmann, Alexander; Ling, Alexander

    2018-05-14

    We present an optical design where polarization-entangled photon pairs are generated within two β-Barium Borate crystals whose optical axes are parallel. This design increases the spatial mode overlap of the emitted photon pairs enhancing single mode collection without the need for additional spatial walk-off compensators. The observed photon pair rate is at least 65 000 pairs/s/mW with a quantum state fidelity of 99.53 ± 0.22% when pumped with an elliptical spatial profile.

  18. Experimental entangled photon pair generation using crystals with parallel optical axes

    NASA Astrophysics Data System (ADS)

    Villar, Aitor; Lohrmann, Alexander; Ling, Alexander

    2018-05-01

    We present an optical design where polarization-entangled photon pairs are generated within two $\\beta$-Barium Borate crystals whose optical axes are parallel. This design increases the spatial mode overlap of the emitted photon pairs enhancing single mode collection without the need for additional spatial walk-off compensators. The observed photon pair rate is at least 65000 pairs/s/mW with a quantum state fidelity of 99.53$\\pm$0.22% when pumped with an elliptical spatial profile.

  19. NIRcam-NIRSpec GTO Observations of Galaxy Evolution

    NASA Astrophysics Data System (ADS)

    Rieke, Marcia J.; Ferruit, Pierre; Alberts, Stacey; Bunker, Andrew; Charlot, Stephane; Chevallard, Jacopo; Dressler, Alan; Egami, Eiichi; Eisenstein, Daniel; Endsley, Ryan; Franx, Marijn; Frye, Brenda L.; Hainline, Kevin; Jakobsen, Peter; Lake, Emma Curtis; Maiolino, Roberto; Rix, Hans-Walter; Robertson, Brant; Stark, Daniel; Williams, Christina; Willmer, Christopher; Willott, Chris J.

    2017-06-01

    The NIRSpec and and NIRCam GTO Teams are planning a joint imaging and spectroscopic study of the high redshift universe. By virtue of planning a joint program which includes medium and deep near- and mid-infrared imaging surveys and multi-object spectroscopy (MOS) of sources in the same fields, we have learned much about planning observing programs for each of the instruments and using them in parallel mode to maximize photon collection time. The design and rationale for our joint program will be explored in this talk with an emphasis on why we have chosen particular suites of filters and spectroscopic resolutions, why we have chosen particular exposure patterns, and how we have designed the parallel observations. The actual observations that we intend on executing will serve as examples of how to layout mosaics and MOS observations to maximize observing efficiency for surveys with JWST.

  20. Adapted RF pulse design for SAR reduction in parallel excitation with experimental verification at 9.4 T.

    PubMed

    Wu, Xiaoping; Akgün, Can; Vaughan, J Thomas; Andersen, Peter; Strupp, John; Uğurbil, Kâmil; Van de Moortele, Pierre-François

    2010-07-01

    Parallel excitation holds strong promises to mitigate the impact of large transmit B1 (B+1) distortion at very high magnetic field. Accelerated RF pulses, however, inherently tend to require larger values in RF peak power which may result in substantial increase in Specific Absorption Rate (SAR) in tissues, which is a constant concern for patient safety at very high field. In this study, we demonstrate adapted rate RF pulse design allowing for SAR reduction while preserving excitation target accuracy. Compared with other proposed implementations of adapted rate RF pulses, our approach is compatible with any k-space trajectories, does not require an analytical expression of the gradient waveform and can be used for large flip angle excitation. We demonstrate our method with numerical simulations based on electromagnetic modeling and we include an experimental verification of transmit pattern accuracy on an 8 transmit channel 9.4 T system.

  1. Parallel Aircraft Trajectory Optimization with Analytic Derivatives

    NASA Technical Reports Server (NTRS)

    Falck, Robert D.; Gray, Justin S.; Naylor, Bret

    2016-01-01

    Trajectory optimization is an integral component for the design of aerospace vehicles, but emerging aircraft technologies have introduced new demands on trajectory analysis that current tools are not well suited to address. Designing aircraft with technologies such as hybrid electric propulsion and morphing wings requires consideration of the operational behavior as well as the physical design characteristics of the aircraft. The addition of operational variables can dramatically increase the number of design variables which motivates the use of gradient based optimization with analytic derivatives to solve the larger optimization problems. In this work we develop an aircraft trajectory analysis tool using a Legendre-Gauss-Lobatto based collocation scheme, providing analytic derivatives via the OpenMDAO multidisciplinary optimization framework. This collocation method uses an implicit time integration scheme that provides a high degree of sparsity and thus several potential options for parallelization. The performance of the new implementation was investigated via a series of single and multi-trajectory optimizations using a combination of parallel computing and constraint aggregation. The computational performance results show that in order to take full advantage of the sparsity in the problem it is vital to parallelize both the non-linear analysis evaluations and the derivative computations themselves. The constraint aggregation results showed a significant numerical challenge due to difficulty in achieving tight convergence tolerances. Overall, the results demonstrate the value of applying analytic derivatives to trajectory optimization problems and lay the foundation for future application of this collocation based method to the design of aircraft with where operational scheduling of technologies is key to achieving good performance.

  2. Optimality, sample size, and power calculations for the sequential parallel comparison design.

    PubMed

    Ivanova, Anastasia; Qaqish, Bahjat; Schoenfeld, David A

    2011-10-15

    The sequential parallel comparison design (SPCD) has been proposed to increase the likelihood of success of clinical trials in therapeutic areas where high-placebo response is a concern. The trial is run in two stages, and subjects are randomized into three groups: (i) placebo in both stages; (ii) placebo in the first stage and drug in the second stage; and (iii) drug in both stages. We consider the case of binary response data (response/no response). In the SPCD, all first-stage and second-stage data from placebo subjects who failed to respond in the first stage of the trial are utilized in the efficacy analysis. We develop 1 and 2 degree of freedom score tests for treatment effect in the SPCD. We give formulae for asymptotic power and for sample size computations and evaluate their accuracy via simulation studies. We compute the optimal allocation ratio between drug and placebo in stage 1 for the SPCD to determine from a theoretical viewpoint whether a single-stage design, a two-stage design with placebo only in the first stage, or a two-stage design is the best design for a given set of response rates. As response rates are not known before the trial, a two-stage approach with allocation to active drug in both stages is a robust design choice. Copyright © 2011 John Wiley & Sons, Ltd.

  3. Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

    NASA Astrophysics Data System (ADS)

    Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

    2017-01-01

    Multi-scale high-resolution modeling of rock failure process is a powerful means in modern rock mechanics studies to reveal the complex failure mechanism and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation, damage to failure, has raised high requirements on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing the parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator that is capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties, deal with and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows - Linux interactive platform. A numerical model is built to test the parallel performance of FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In laboratory-scale simulation, the well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In field-scale simulation, the formation process of net fracture spacing from initiation, propagation to saturation can be revealed completely. In engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.

  4. Review of Recent Methodological Developments in Group-Randomized Trials: Part 1—Design

    PubMed Central

    Li, Fan; Gallis, John A.; Prague, Melanie; Murray, David M.

    2017-01-01

    In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis. PMID:28426295

  5. Review of Recent Methodological Developments in Group-Randomized Trials: Part 1-Design.

    PubMed

    Turner, Elizabeth L; Li, Fan; Gallis, John A; Prague, Melanie; Murray, David M

    2017-06-01

    In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis.

  6. A general purpose subroutine for fast fourier transform on a distributed memory parallel machine

    NASA Technical Reports Server (NTRS)

    Dubey, A.; Zubair, M.; Grosch, C. E.

    1992-01-01

    One issue which is central in developing a general purpose Fast Fourier Transform (FFT) subroutine on a distributed memory parallel machine is the data distribution. It is possible that different users would like to use the FFT routine with different data distributions. Thus, there is a need to design FFT schemes on distributed memory parallel machines which can support a variety of data distributions. An FFT implementation on a distributed memory parallel machine which works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation on a distributed memory parallel machine Intel iPSC/860 is evaluated.

  7. High convergence efficiency design of flat Fresnel lens with large aperture

    NASA Astrophysics Data System (ADS)

    Ke, Jieyao; Zhao, Changming; Guan, Zhe

    2018-01-01

    This paper designed a circle-shaped Fresnel lens with large aperture as part of the solar pumped laser design project. The Fresnel lens designed in this paper simulate in size 1000mm×1000mm, focus length 1200mm and polymethyl methacrylate (PMMA) material in order to conduct high convergence efficiency. In the light of design requirement of concentric ring with same width of 0.3mm, this paper proposed an optimized Fresnel lens design based on previous sphere design and conduct light tracing simulation in Matlab. This paper also analyzed the effect of light spot size, light intensity distribution, optical efficiency under four conditions, monochromatic parallel light, parallel spectrum light, divergent monochromatic light and sunlight. Design by 550nm wavelength and under the condition of Fresnel reflection, the results indicated that the designed lens could convergent sunlight in diffraction limit of 11.8mm with a 78.7% optical efficiency, better than the sphere cutting design results of 30.4%.

  8. Mechanical design of deformation compensated flexural pivots structured for linear nanopositioning stages

    DOEpatents

    Shu, Deming; Kearney, Steven P.; Preissner, Curt A.

    2015-02-17

    A method and deformation compensated flexural pivots structured for precision linear nanopositioning stages are provided. A deformation-compensated flexural linear guiding mechanism includes a basic parallel mechanism including a U-shaped member and a pair of parallel bars linked to respective pairs of I-link bars and each of the I-bars coupled by a respective pair of flexural pivots. The basic parallel mechanism includes substantially evenly distributed flexural pivots minimizing center shift dynamic errors.

  9. Software Tools for Design and Performance Evaluation of Intelligent Systems

    DTIC Science & Technology

    2004-08-01

    Self-calibration of Three-Legged Modular Reconfigurable Parallel Robots Based on Leg-End Distance Errors,” Robotica , Vol. 19, pp. 187-198. [4...9] Lintott, A. B., and Dunlop, G. R., “Parallel Topology Robot Calibration,” Robotica . [10] Vischer, P., and Clavel, R., “Kinematic Calibration...of the Parallel Delta Robot,” Robotica , Vol. 16, pp.207- 218, 1998. [11] Joshi, S.A., and Surianarayan, A., “Calibration of a 6-DOF Cable Robot Using

  10. Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures

    NASA Technical Reports Server (NTRS)

    Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.

    1998-01-01

    In this paper a new algorithm, designated as Fast Invariant Imbedding algorithm, for solution of Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other Fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.

  11. Design of a solar concentrator considering arbitrary surfaces

    NASA Astrophysics Data System (ADS)

    Jiménez-Rodríguez, Martín.; Avendaño-Alejo, Maximino; Verduzco-Grajeda, Lidia Elizabeth; Martínez-Enríquez, Arturo I.; García-Díaz, Reyes; Díaz-Uribe, Rufino

    2017-10-01

    We study the propagation of light in order to efficiently redirect the reflected light on photocatalytic samples placed inside a commercial solar simulator, and we have designed a small-scale prototype of Cycloidal Collectors (CCs), resembling a compound parabolic collector. The prototype consists of either cycloidal trough or cycloidal collector having symmetry of rotation, which has been designed considering an exact ray tracing assuming a bundle of rays propagating parallel to the optical axis and impinging on a curate cycloidal surface, obtaining its caustic surface produced by reflection.

  12. Development of a 3D parallel mechanism robot arm with three vertical-axial pneumatic actuators combined with a stereo vision system.

    PubMed

    Chiang, Mao-Hsiung; Lin, Hao-Ting

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism are designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot's end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H(∞) tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measuring position of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot consider the manufacturing and assembly tolerance of the joints and the parallel mechanism so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCD serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end-effector. Furthermore, to verify the feasibility of the proposed parallel mechanism robot driven by three vertical pneumatic servo actuators, a full-scale test rig of the proposed parallel mechanism pneumatic robot is set up. Thus, simulations and experiments for different complex 3D motion profiles of the robot end-effector can be successfully achieved. The desired, the actual and the calculated 3D position of the end-effector can be compared in the complex 3D motion control.

  13. Development of a 3D Parallel Mechanism Robot Arm with Three Vertical-Axial Pneumatic Actuators Combined with a Stereo Vision System

    PubMed Central

    Chiang, Mao-Hsiung; Lin, Hao-Ting

    2011-01-01

    This study aimed to develop a novel 3D parallel mechanism robot driven by three vertical-axial pneumatic actuators with a stereo vision system for path tracking control. The mechanical system and the control system are the primary novel parts for developing a 3D parallel mechanism robot. In the mechanical system, a 3D parallel mechanism robot contains three serial chains, a fixed base, a movable platform and a pneumatic servo system. The parallel mechanism are designed and analyzed first for realizing a 3D motion in the X-Y-Z coordinate system of the robot’s end-effector. The inverse kinematics and the forward kinematics of the parallel mechanism robot are investigated by using the Denavit-Hartenberg notation (D-H notation) coordinate system. The pneumatic actuators in the three vertical motion axes are modeled. In the control system, the Fourier series-based adaptive sliding-mode controller with H∞ tracking performance is used to design the path tracking controllers of the three vertical servo pneumatic actuators for realizing 3D path tracking control of the end-effector. Three optical linear scales are used to measure the position of the three pneumatic actuators. The 3D position of the end-effector is then calculated from the measuring position of the three pneumatic actuators by means of the kinematics. However, the calculated 3D position of the end-effector cannot consider the manufacturing and assembly tolerance of the joints and the parallel mechanism so that errors between the actual position and the calculated 3D position of the end-effector exist. In order to improve this situation, sensor collaboration is developed in this paper. A stereo vision system is used to collaborate with the three position sensors of the pneumatic actuators. The stereo vision system combining two CCD serves to measure the actual 3D position of the end-effector and calibrate the error between the actual and the calculated 3D position of the end-effector. Furthermore, to verify the feasibility of the proposed parallel mechanism robot driven by three vertical pneumatic servo actuators, a full-scale test rig of the proposed parallel mechanism pneumatic robot is set up. Thus, simulations and experiments for different complex 3D motion profiles of the robot end-effector can be successfully achieved. The desired, the actual and the calculated 3D position of the end-effector can be compared in the complex 3D motion control. PMID:22247676

  14. Two schemes for rapid generation of digital video holograms using PC cluster

    NASA Astrophysics Data System (ADS)

    Park, Hanhoon; Song, Joongseok; Kim, Changseob; Park, Jong-Il

    2017-12-01

    Computer-generated holography (CGH), which is a process of generating digital holograms, is computationally expensive. Recently, several methods/systems of parallelizing the process using graphic processing units (GPUs) have been proposed. Indeed, use of multiple GPUs or a personal computer (PC) cluster (each PC with GPUs) enabled great improvements in the process speed. However, extant literature has less often explored systems involving rapid generation of multiple digital holograms and specialized systems for rapid generation of a digital video hologram. This study proposes a system that uses a PC cluster and is able to more efficiently generate a video hologram. The proposed system is designed to simultaneously generate multiple frames and accelerate the generation by parallelizing the CGH computations across a number of frames, as opposed to separately generating each individual frame while parallelizing the CGH computations within each frame. The proposed system also enables the subprocesses for generating each frame to execute in parallel through multithreading. With these two schemes, the proposed system significantly reduced the data communication time for generating a digital hologram when compared with that of the state-of-the-art system.

  15. Analysis of series resonant converter with series-parallel connection

    NASA Astrophysics Data System (ADS)

    Lin, Bor-Ren; Huang, Chien-Lan

    2011-02-01

    In this study, a parallel inductor-inductor-capacitor (LLC) resonant converter series-connected on the primary side and parallel-connected on the secondary side is presented for server power supply systems. Based on series resonant behaviour, the power metal-oxide-semiconductor field-effect transistors are turned on at zero voltage switching and the rectifier diodes are turned off at zero current switching. Thus, the switching losses on the power semiconductors are reduced. In the proposed converter, the primary windings of the two LLC converters are connected in series. Thus, the two converters have the same primary currents to ensure that they can supply the balance load current. On the output side, two LLC converters are connected in parallel to share the load current and to reduce the current stress on the secondary windings and the rectifier diodes. In this article, the principle of operation, steady-state analysis and design considerations of the proposed converter are provided and discussed. Experiments with a laboratory prototype with a 24 V/21 A output for server power supply were performed to verify the effectiveness of the proposed converter.

  16. Study of solid rocket motors for a space shuttle booster. Volume 4: Mass properties report

    NASA Technical Reports Server (NTRS)

    Vonderesch, A. H.

    1972-01-01

    Mass properties data for the 156 inch diameter, parallel burn, solid propellant rocket engine for the space shuttle booster are presented. Design ground rules and assumptions applicable to generation of the mass properties data are described, together with pertinent data sources.

  17. Working Memory Training: Improving Intelligence--Changing Brain Activity

    ERIC Educational Resources Information Center

    Jausovec, Norbert; Jausovec, Ksenija

    2012-01-01

    The main objectives of the study were: to investigate whether training on working memory (WM) could improve fluid intelligence, and to investigate the effects WM training had on neuroelectric (electroencephalography--EEG) and hemodynamic (near-infrared spectroscopy--NIRS) patterns of brain activity. In a parallel group experimental design,…

  18. Preliminary results of the large experimental wind turbine phase of the national wind energy program

    NASA Technical Reports Server (NTRS)

    Thomas, R. L.; Sholes, T.; Sholes, J. E.

    1975-01-01

    The preliminary results of two projects in the development phase of reliable wind turbines designed to supply cost-competitive electrical energy were discussed. An experimental 100 kW wind turbine design and its status are first reviewed. The results of two parallel design studies for determining the configurations and power levels for wind turbines with minimum energy costs are also discussed. These studies predict wind energy costs of 1.5 to 7 cents per kW-h for wind turbines produced in quantities of 100 to 1000 per year and located at sites having average winds of 12 to 18 mph.

  19. A parallel input composite transimpedance amplifier.

    PubMed

    Kim, D J; Kim, C

    2018-01-01

    A new approach to high performance current to voltage preamplifier design is presented. The design using multiple operational amplifiers (op-amps) has a parasitic capacitance compensation network and a composite amplifier topology for fast, precision, and low noise performance. The input stage consisting of a parallel linked JFET op-amps and a high-speed bipolar junction transistor (BJT) gain stage driving the output in the composite amplifier topology, cooperating with the capacitance compensation feedback network, ensures wide bandwidth stability in the presence of input capacitance above 40 nF. The design is ideal for any two-probe measurement, including high impedance transport and scanning tunneling microscopy measurements.

  20. A parallel input composite transimpedance amplifier

    NASA Astrophysics Data System (ADS)

    Kim, D. J.; Kim, C.

    2018-01-01

    A new approach to high performance current to voltage preamplifier design is presented. The design using multiple operational amplifiers (op-amps) has a parasitic capacitance compensation network and a composite amplifier topology for fast, precision, and low noise performance. The input stage consisting of a parallel linked JFET op-amps and a high-speed bipolar junction transistor (BJT) gain stage driving the output in the composite amplifier topology, cooperating with the capacitance compensation feedback network, ensures wide bandwidth stability in the presence of input capacitance above 40 nF. The design is ideal for any two-probe measurement, including high impedance transport and scanning tunneling microscopy measurements.

  1. Sample size re-estimation and other midcourse adjustments with sequential parallel comparison design.

    PubMed

    Silverman, Rachel K; Ivanova, Anastasia

    2017-01-01

    Sequential parallel comparison design (SPCD) was proposed to reduce placebo response in a randomized trial with placebo comparator. Subjects are randomized between placebo and drug in stage 1 of the trial, and then, placebo non-responders are re-randomized in stage 2. Efficacy analysis includes all data from stage 1 and all placebo non-responding subjects from stage 2. This article investigates the possibility to re-estimate the sample size and adjust the design parameters, allocation proportion to placebo in stage 1 of SPCD, and weight of stage 1 data in the overall efficacy test statistic during an interim analysis.

  2. An efficient dynamic load balancing algorithm

    NASA Astrophysics Data System (ADS)

    Lagaros, Nikos D.

    2014-01-01

    In engineering problems, randomness and uncertainties are inherent. Robust design procedures, formulated in the framework of multi-objective optimization, have been proposed in order to take into account sources of randomness and uncertainty. These design procedures require orders of magnitude more computational effort than conventional analysis or optimum design processes since a very large number of finite element analyses is required to be dealt. It is therefore an imperative need to exploit the capabilities of computing resources in order to deal with this kind of problems. In particular, parallel computing can be implemented at the level of metaheuristic optimization, by exploiting the physical parallelization feature of the nondominated sorting evolution strategies method, as well as at the level of repeated structural analyses required for assessing the behavioural constraints and for calculating the objective functions. In this study an efficient dynamic load balancing algorithm for optimum exploitation of available computing resources is proposed and, without loss of generality, is applied for computing the desired Pareto front. In such problems the computation of the complete Pareto front with feasible designs only, constitutes a very challenging task. The proposed algorithm achieves linear speedup factors and almost 100% speedup factor values with reference to the sequential procedure.

  3. Comprehensive Design Reliability Activities for Aerospace Propulsion Systems

    NASA Technical Reports Server (NTRS)

    Christenson, R. L.; Whitley, M. R.; Knight, K. C.

    2000-01-01

    This technical publication describes the methodology, model, software tool, input data, and analysis result that support aerospace design reliability studies. The focus of these activities is on propulsion systems mechanical design reliability. The goal of these activities is to support design from a reliability perspective. Paralleling performance analyses in schedule and method, this requires the proper use of metrics in a validated reliability model useful for design, sensitivity, and trade studies. Design reliability analysis in this view is one of several critical design functions. A design reliability method is detailed and two example analyses are provided-one qualitative and the other quantitative. The use of aerospace and commercial data sources for quantification is discussed and sources listed. A tool that was developed to support both types of analyses is presented. Finally, special topics discussed include the development of design criteria, issues of reliability quantification, quality control, and reliability verification.

  4. Fault detection for hydraulic pump based on chaotic parallel RBF network

    NASA Astrophysics Data System (ADS)

    Lu, Chen; Ma, Ning; Wang, Zhipeng

    2011-12-01

    In this article, a parallel radial basis function network in conjunction with chaos theory (CPRBF network) is presented, and applied to practical fault detection for hydraulic pump, which is a critical component in aircraft. The CPRBF network consists of a number of radial basis function (RBF) subnets connected in parallel. The number of input nodes for each RBF subnet is determined by different embedding dimension based on chaotic phase-space reconstruction. The output of CPRBF is a weighted sum of all RBF subnets. It was first trained using the dataset from normal state without fault, and then a residual error generator was designed to detect failures based on the trained CPRBF network. Then, failure detection can be achieved by the analysis of the residual error. Finally, two case studies are introduced to compare the proposed CPRBF network with traditional RBF networks, in terms of prediction and detection accuracy.

  5. Parallel Evolutionary Optimization for Neuromorphic Network Training

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schuman, Catherine D; Disney, Adam; Singh, Susheela

    One of the key impediments to the success of current neuromorphic computing architectures is the issue of how best to program them. Evolutionary optimization (EO) is one promising programming technique; in particular, its wide applicability makes it especially attractive for neuromorphic architectures, which can have many different characteristics. In this paper, we explore different facets of EO on a spiking neuromorphic computing model called DANNA. We focus on the performance of EO in the design of our DANNA simulator, and on how to structure EO on both multicore and massively parallel computing systems. We evaluate how our parallel methods impactmore » the performance of EO on Titan, the U.S.'s largest open science supercomputer, and BOB, a Beowulf-style cluster of Raspberry Pi's. We also focus on how to improve the EO by evaluating commonality in higher performing neural networks, and present the result of a study that evaluates the EO performed by Titan.« less

  6. Lower Limb Rehabilitation Using Patient Data

    PubMed Central

    Saadat, Mozafar

    2016-01-01

    The aim of this study is to investigate the performance of a 6-DoF parallel robot in tracking the movement of the foot trajectory of a paretic leg during a single stride. The foot trajectories of nine patients with a paretic leg including both males and females have been measured and analysed by a Vicon system in a gait laboratory. Based on kinematic and dynamic analysis of a 6-DoF UPS parallel robot, an algorithm was developed in MATLAB to calculate the length of the actuators and their required forces during all trajectories. The workspace and singularity points of the robot were then investigated in nine different cases. A 6-DoF UPS parallel robot prototype with high repeatability was designed and built in order to simulate a single stride. Results showed that the robot was capable of tracking all of the trajectories with the maximum position error of 1.2 mm. PMID:27721648

  7. Oxytocin: parallel processing in the social brain?

    PubMed

    Dölen, Gül

    2015-06-01

    Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain. © 2015 The Authors. Journal of Neuroendocrinology published by John Wiley & Sons Ltd on behalf of British Society for Neuroendocrinology.

  8. Comparing an FPGA to a Cell for an Image Processing Application

    NASA Astrophysics Data System (ADS)

    Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.

    2010-12-01

    Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to discover the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, Iris Recognition Systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.

  9. Quadrature transmit array design using single-feed circularly polarized patch antenna for parallel transmission in MR imaging.

    PubMed

    Pang, Yong; Yu, Baiying; Vigneron, Daniel B; Zhang, Xiaoliang

    2014-02-01

    Quadrature coils are often desired in MR applications because they can improve MR sensitivity and also reduce excitation power. In this work, we propose, for the first time, a quadrature array design strategy for parallel transmission at 298 MHz using single-feed circularly polarized (CP) patch antenna technique. Each array element is a nearly square ring microstrip antenna and is fed at a point on the diagonal of the antenna to generate quadrature magnetic fields. Compared with conventional quadrature coils, the single-feed structure is much simple and compact, making the quadrature coil array design practical. Numerical simulations demonstrate that the decoupling between elements is better than -35 dB for all the elements and the RF fields are homogeneous with deep penetration and quadrature behavior in the area of interest. Bloch equation simulation is also performed to simulate the excitation procedure by using an 8-element quadrature planar patch array to demonstrate its feasibility in parallel transmission at the ultrahigh field of 7 Tesla.

  10. Method and structure for skewed block-cyclic distribution of lower-dimensional data arrays in higher-dimensional processor grids

    DOEpatents

    Chatterjee, Siddhartha [Yorktown Heights, NY; Gunnels, John A [Brewster, NY

    2011-11-08

    A method and structure of distributing elements of an array of data in a computer memory to a specific processor of a multi-dimensional mesh of parallel processors includes designating a distribution of elements of at least a portion of the array to be executed by specific processors in the multi-dimensional mesh of parallel processors. The pattern of the designating includes a cyclical repetitive pattern of the parallel processor mesh, as modified to have a skew in at least one dimension so that both a row of data in the array and a column of data in the array map to respective contiguous groupings of the processors such that a dimension of the contiguous groupings is greater than one.

  11. Parallel Grand Canonical Monte Carlo (ParaGrandMC) Simulation Code

    NASA Technical Reports Server (NTRS)

    Yamakov, Vesselin I.

    2016-01-01

    This report provides an overview of the Parallel Grand Canonical Monte Carlo (ParaGrandMC) simulation code. This is a highly scalable parallel FORTRAN code for simulating the thermodynamic evolution of metal alloy systems at the atomic level, and predicting the thermodynamic state, phase diagram, chemical composition and mechanical properties. The code is designed to simulate multi-component alloy systems, predict solid-state phase transformations such as austenite-martensite transformations, precipitate formation, recrystallization, capillary effects at interfaces, surface absorption, etc., which can aid the design of novel metallic alloys. While the software is mainly tailored for modeling metal alloys, it can also be used for other types of solid-state systems, and to some degree for liquid or gaseous systems, including multiphase systems forming solid-liquid-gas interfaces.

  12. Parallel Optical Random Access Memory (PORAM)

    NASA Technical Reports Server (NTRS)

    Alphonse, G. A.

    1989-01-01

    It is shown that the need to minimize component count, power and size, and to maximize packing density require a parallel optical random access memory to be designed in a two-level hierarchy: a modular level and an interconnect level. Three module designs are proposed, in the order of research and development requirements. The first uses state-of-the-art components, including individually addressed laser diode arrays, acousto-optic (AO) deflectors and magneto-optic (MO) storage medium, aimed at moderate size, moderate power, and high packing density. The next design level uses an electron-trapping (ET) medium to reduce optical power requirements. The third design uses a beam-steering grating surface emitter (GSE) array to reduce size further and minimize the number of components.

  13. A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

    NASA Technical Reports Server (NTRS)

    Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

    1997-01-01

    The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages are identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft simulations achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.

  14. Focusing, refraction, and asymmetric transmission of elastic waves in solid metamaterials with aligned parallel gaps.

    PubMed

    Su, Xiaoshi; Norris, Andrew N

    2016-06-01

    Gradient index (GRIN), refractive, and asymmetric transmission devices for elastic waves are designed using a solid with aligned parallel gaps. The gaps are assumed to be thin so that they can be considered as parallel cracks separating elastic plate waveguides. The plates do not interact with one another directly, only at their ends where they connect to the exterior solid. To formulate the transmission and reflection coefficients for SV- and P-waves, an analytical model is established using thin plate theory that couples the waveguide modes with the waves in the exterior body. The GRIN lens is designed by varying the thickness of the plates to achieve different flexural wave speeds. The refractive effect of SV-waves is achieved by designing the slope of the edge of the plate array, and keeping the ratio between plate length and flexural wavelength fixed. The asymmetric transmission of P-waves is achieved by sending an incident P-wave at a critical angle, at which total conversion to SV-wave occurs. An array of parallel gaps perpendicular to the propagation direction of the reflected waves stop the SV-wave but let P-waves travel through. Examples of focusing, steering, and asymmetric transmission devices are discussed.

  15. Electrode pattern design for GaAs betavoltaic batteries

    NASA Astrophysics Data System (ADS)

    Haiyang, Chen; Jianhua, Yin; Darang, Li

    2011-08-01

    The sensitivities of betavoltaic batteries and photovoltaic batteries to series and parallel resistance are studied. Based on the study, an electrode pattern design principle of GaAs betavoltaic batteries is proposed. GaAs PIN junctions with and without the proposed electrode pattern are fabricated and measured under the illumination of 63Ni. Results show that the proposed electrode can reduce the backscattering and shadowing for the beta particles from 63Ni to increase the GaAs betavoltaic battery short circuit currents effectively but has little impact on the fill factors and ideal factors.

  16. The Pasm Parallel Processing System: Design, Simulation, and Image Processing Applications. Volume 1

    DTIC Science & Technology

    1989-12-31

    was not funded and the author and other members of the PASM group turned their attention to related studies such as the further definition of the... Four such quadrants comprise the set of MCs and PEs. The logical PE i within each MC group is serviced by MSU i. Figure 1.9.3 shows that the MSUs are...instead of four Data Input Queues and with a 4-by-4 instead of a 2-by-2 crossbar switch could be designed. Studies indicate that the perfor- mance of a

  17. Scalable and reusable emulator for evaluating the performance of SS7 networks

    NASA Astrophysics Data System (ADS)

    Lazar, Aurel A.; Tseng, Kent H.; Lim, Koon Seng; Choe, Winston

    1994-04-01

    A scalable and reusable emulator was designed and implemented for studying the behavior of SS7 networks. The emulator design was largely based on public domain software. It was developed on top of an environment supported by PVM, the Parallel Virtual Machine, and managed by OSIMIS-the OSI Management Information Service platform. The emulator runs on top of a commercially available ATM LAN interconnecting engineering workstations. As a case study for evaluating the emulator, the behavior of the Singapore National SS7 Network under fault and unbalanced loading conditions was investigated.

  18. Design of a biologically inspired lower limb exoskeleton for human gait rehabilitation.

    PubMed

    Lyu, Mingxing; Chen, Weihai; Ding, Xilun; Wang, Jianhua; Bai, Shaoping; Ren, Huichao

    2016-10-01

    This paper proposes a novel bionic model of the human leg according to the theory of physiology. Based on this model, we present a biologically inspired 3-degree of freedom (DOF) lower limb exoskeleton for human gait rehabilitation, showing that the lower limb exoskeleton is fully compatible with the human knee joint. The exoskeleton has a hybrid serial-parallel kinematic structure consisting of a 1-DOF hip joint module and a 2-DOF knee joint module in the sagittal plane. A planar 2-DOF parallel mechanism is introduced in the design to fully accommodate the motion of the human knee joint, which features not only rotation but also relative sliding. Therefore, the design is consistent with the requirements of bionics. The forward and inverse kinematic analysis is studied and the workspace of the exoskeleton is analyzed. The structural parameters are optimized to obtain a larger workspace. The results using MATLAB-ADAMS co-simulation are shown in this paper to demonstrate the feasibility of our design. A prototype of the exoskeleton is also developed and an experiment performed to verify the kinematic analysis. Compared with existing lower limb exoskeletons, the designed mechanism has a large workspace, while allowing knee joint rotation and small amount of sliding.

  19. Design of a biologically inspired lower limb exoskeleton for human gait rehabilitation

    NASA Astrophysics Data System (ADS)

    Lyu, Mingxing; Chen, Weihai; Ding, Xilun; Wang, Jianhua; Bai, Shaoping; Ren, Huichao

    2016-10-01

    This paper proposes a novel bionic model of the human leg according to the theory of physiology. Based on this model, we present a biologically inspired 3-degree of freedom (DOF) lower limb exoskeleton for human gait rehabilitation, showing that the lower limb exoskeleton is fully compatible with the human knee joint. The exoskeleton has a hybrid serial-parallel kinematic structure consisting of a 1-DOF hip joint module and a 2-DOF knee joint module in the sagittal plane. A planar 2-DOF parallel mechanism is introduced in the design to fully accommodate the motion of the human knee joint, which features not only rotation but also relative sliding. Therefore, the design is consistent with the requirements of bionics. The forward and inverse kinematic analysis is studied and the workspace of the exoskeleton is analyzed. The structural parameters are optimized to obtain a larger workspace. The results using MATLAB-ADAMS co-simulation are shown in this paper to demonstrate the feasibility of our design. A prototype of the exoskeleton is also developed and an experiment performed to verify the kinematic analysis. Compared with existing lower limb exoskeletons, the designed mechanism has a large workspace, while allowing knee joint rotation and small amount of sliding.

  20. How Depressive Levels Are Related to the Adults' Experiences of Lower-Limb Amputation: A Mixed Methods Pilot Study

    ERIC Educational Resources Information Center

    Senra, Hugo

    2013-01-01

    The current pilot study aims to explore whether different adults' experiences of lower-limb amputation could be associated with different levels of depression. To achieve these study objectives, a convergent parallel mixed methods design was used in a convenience sample of 42 adult amputees (mean age of 61 years; SD = 13.5). All of them had…

  1. An Australian and New Zealand Scoping Study on the Use of 3D Immersive Virtual Worlds in Higher Education

    ERIC Educational Resources Information Center

    Dalgarno, Barney; Lee, Mark J. W.; Carlson, Lauren; Gregory, Sue; Tynan, Belinda

    2011-01-01

    This article describes the research design of, and reports selected findings from, a scoping study aimed at examining current and planned applications of 3D immersive virtual worlds at higher education institutions across Australia and New Zealand. The scoping study is the first of its kind in the region, intended to parallel and complement a…

  2. Beyond the treatment effect: Evaluating the effects of patient preferences in randomised trials.

    PubMed

    Walter, S D; Turner, R; Macaskill, P; McCaffery, K J; Irwig, L

    2017-02-01

    The treatments under comparison in a randomised trial should ideally have equal value and acceptability - a position of equipoise - to study participants. However, it is unlikely that true equipoise exists in practice, because at least some participants may have preferences for one treatment or the other, for a variety of reasons. These preferences may be related to study outcomes, and hence affect the estimation of the treatment effect. Furthermore, the effects of preferences can sometimes be substantial, and may even be larger than the direct effect of treatment. Preference effects are of interest in their own right, but they cannot be assessed in the standard parallel group design for a randomised trial. In this paper, we describe a model to represent the impact of preferences on trial outcomes, in addition to the usual treatment effect. In particular, we describe how outcomes might differ between participants who would choose one treatment or the other, if they were free to do so. Additionally, we investigate the difference in outcomes depending on whether or not a participant receives his or her preferred treatment, which we characterise through a so-called preference effect. We then discuss several study designs that have been proposed to measure and exploit data on preferences, and which constitute alternatives to the conventional parallel group design. Based on the model framework, we determine which of the various preference effects can or cannot be estimated with each design. We also illustrate these ideas with some examples of preference designs from the literature.

  3. Reading through Films

    ERIC Educational Resources Information Center

    Raman, Madhavi Gayathri; Vijaya

    2016-01-01

    This paper captures the design of a comprehensive curriculum incorporating the four skills based exclusively on the use of parallel audio-visual and written texts. We discuss the use of authentic materials to teach English to Indian undergraduates aged 18 to 20 years. Specifically, we talk about the use of parallel reading (screen-play) and…

  4. Introduction to Computers: Parallel Alternative Strategies for Students. Course No. 0200000.

    ERIC Educational Resources Information Center

    Chauvenne, Sherry; And Others

    Parallel Alternative Strategies for Students (PASS) is a content-centered package of alternative methods and materials designed to assist secondary teachers to meet the needs of mainstreamed learning-disabled and emotionally-handicapped students of various achievement levels in the basic education content courses. This supplementary text and…

  5. Issues of planning trajectory of parallel robots taking into account zones of singularity

    NASA Astrophysics Data System (ADS)

    Rybak, L. A.; Khalapyan, S. Y.; Gaponenko, E. V.

    2018-03-01

    A method for determining the design characteristics of a parallel robot necessary to provide specified parameters of its working space that satisfy the controllability requirement is developed. The experimental verification of the proposed method was carried out using an approximate planar 3-RPR mechanism.

  6. Diderot: a Domain-Specific Language for Portable Parallel Scientific Visualization and Image Analysis.

    PubMed

    Kindlmann, Gordon; Chiw, Charisee; Seltzer, Nicholas; Samuels, Lamont; Reppy, John

    2016-01-01

    Many algorithms for scientific visualization and image analysis are rooted in the world of continuous scalar, vector, and tensor fields, but are programmed in low-level languages and libraries that obscure their mathematical foundations. Diderot is a parallel domain-specific language that is designed to bridge this semantic gap by providing the programmer with a high-level, mathematical programming notation that allows direct expression of mathematical concepts in code. Furthermore, Diderot provides parallel performance that takes advantage of modern multicore processors and GPUs. The high-level notation allows a concise and natural expression of the algorithms and the parallelism allows efficient execution on real-world datasets.

  7. A mirror for lab-based quasi-monochromatic parallel x-rays

    NASA Astrophysics Data System (ADS)

    Nguyen, Thanhhai; Lu, Xun; Lee, Chang Jun; Jung, Jin-Ho; Jin, Gye-Hwan; Kim, Sung Youb; Jeon, Insu

    2014-09-01

    A multilayered parabolic mirror with six W/Al bilayers was designed and fabricated to generate monochromatic parallel x-rays using a lab-based x-ray source. Using this mirror, curved bright bands were obtained in x-ray images as reflected x-rays. The parallelism of the reflected x-rays was investigated using the shape of the bands. The intensity and monochromatic characteristics of the reflected x-rays were evaluated through measurements of the x-ray spectra in the band. High intensity, nearly monochromatic, and parallel x-rays, which can be used for high resolution x-ray microscopes and local radiation therapy systems, were obtained.

  8. From experiment to design -- Fault characterization and detection in parallel computer systems using computational accelerators

    NASA Astrophysics Data System (ADS)

    Yim, Keun Soo

    This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure characteristics of CPU-GPU hybrid computer systems under various types of hardware faults. The main characterization targets were faults that are difficult to detect and/or recover from, e.g., faults that cause long latency failures (Ch. 3), faults in dynamically allocated resources (Ch. 4), faults in GPUs (Ch. 5), faults in MPI programs (Ch. 6), and microarchitecture-level faults with specific timing features (Ch. 7). The co-design studies were based on the characterization results. One of the co-designed systems has a set of source-to-source translators that customize and strategically place error detectors in the source code of target GPU programs (Ch. 5). Another co-designed system uses an extension card to learn the normal behavioral and semantic execution patterns of message-passing processes executing on CPUs, and to detect abnormal behaviors of those parallel processes (Ch. 6). The third co-designed system is a co-processor that has a set of new instructions in order to support software-implemented fault detection techniques (Ch. 7). The work described in this dissertation gains more importance because heterogeneous processors have become an essential component of state-of-the-art supercomputers. GPUs were used in three of the five fastest supercomputers that were operating in 2011. Our work included comprehensive fault characterization studies in CPU-GPU hybrid computers. In CPUs, we monitored the target systems for a long period of time after injecting faults (a temporally comprehensive experiment), and injected faults into various types of program states that included dynamically allocated memory (to be spatially comprehensive). In GPUs, we used fault injection studies to demonstrate the importance of detecting silent data corruption (SDC) errors that are mainly due to the lack of fine-grained protections and the massive use of fault-insensitive data. This dissertation also presents transparent fault tolerance frameworks and techniques that are directly applicable to hybrid computers built using only commercial off-the-shelf hardware components. This dissertation shows that by developing understanding of the failure characteristics and error propagation paths of target programs, we were able to create fault tolerance frameworks and techniques that can quickly detect and recover from hardware faults with low performance and hardware overheads.

  9. Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

    NASA Technical Reports Server (NTRS)

    Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

    1991-01-01

    Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.

  10. A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

    PubMed

    Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

    2014-01-01

    It is very time consuming to solve fractional differential equations. The computational complexity of two-dimensional fractional differential equation (2D-TFDE) with iterative implicit finite difference method is O(M(x)M(y)N(2)). In this paper, we present a parallel algorithm for 2D-TFDE and give an in-depth discussion about this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We do think that the parallel computing technology will become a very basic method for the computational intensive fractional applications in the near future.

  11. Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

    NASA Astrophysics Data System (ADS)

    Hohl, A.; Delmelle, E. M.; Tang, W.

    2015-07-01

    Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octtree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data-points that are within the spatial and temporal kernel bandwidths. Then, we quantify computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable of other space-time analytical tests.

  12. Computational time analysis of the numerical solution of 3D electrostatic Poisson's equation

    NASA Astrophysics Data System (ADS)

    Kamboh, Shakeel Ahmed; Labadin, Jane; Rigit, Andrew Ragai Henri; Ling, Tech Chaw; Amur, Khuda Bux; Chaudhary, Muhammad Tayyab

    2015-05-01

    3D Poisson's equation is solved numerically to simulate the electric potential in a prototype design of electrohydrodynamic (EHD) ion-drag micropump. Finite difference method (FDM) is employed to discretize the governing equation. The system of linear equations resulting from FDM is solved iteratively by using the sequential Jacobi (SJ) and sequential Gauss-Seidel (SGS) methods, simulation results are also compared to examine the difference between the results. The main objective was to analyze the computational time required by both the methods with respect to different grid sizes and parallelize the Jacobi method to reduce the computational time. In common, the SGS method is faster than the SJ method but the data parallelism of Jacobi method may produce good speedup over SGS method. In this study, the feasibility of using parallel Jacobi (PJ) method is attempted in relation to SGS method. MATLAB Parallel/Distributed computing environment is used and a parallel code for SJ method is implemented. It was found that for small grid size the SGS method remains dominant over SJ method and PJ method while for large grid size both the sequential methods may take nearly too much processing time to converge. Yet, the PJ method reduces computational time to some extent for large grid sizes.

  13. A parallel variable metric optimization algorithm

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.

    1973-01-01

    An algorithm, designed to exploit the parallel computing or vector streaming (pipeline) capabilities of computers is presented. When p is the degree of parallelism, then one cycle of the parallel variable metric algorithm is defined as follows: first, the function and its gradient are computed in parallel at p different values of the independent variable; then the metric is modified by p rank-one corrections; and finally, a single univariant minimization is carried out in the Newton-like direction. Several properties of this algorithm are established. The convergence of the iterates to the solution is proved for a quadratic functional on a real separable Hilbert space. For a finite-dimensional space the convergence is in one cycle when p equals the dimension of the space. Results of numerical experiments indicate that the new algorithm will exploit parallel or pipeline computing capabilities to effect faster convergence than serial techniques.

  14. A new parallel-vector finite element analysis software on distributed-memory computers

    NASA Technical Reports Server (NTRS)

    Qin, Jiangning; Nguyen, Duc T.

    1993-01-01

    A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed-memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. Block-skyline storage scheme along with vector-unrolling techniques are used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.

  15. Design and realization of test system for testing parallelism and jumpiness of optical axis of photoelectric equipment

    NASA Astrophysics Data System (ADS)

    Shi, Sheng-bing; Chen, Zhen-xing; Qin, Shao-gang; Song, Chun-yan; Jiang, Yun-hong

    2014-09-01

    With the development of science and technology, photoelectric equipment comprises visible system, infrared system, laser system and so on, integration, information and complication are higher than past. Parallelism and jumpiness of optical axis are important performance of photoelectric equipment,directly affect aim, ranging, orientation and so on. Jumpiness of optical axis directly affect hit precision of accurate point damage weapon, but we lack the facility which is used for testing this performance. In this paper, test system which is used fo testing parallelism and jumpiness of optical axis is devised, accurate aim isn't necessary and data processing are digital in the course of testing parallelism, it can finish directly testing parallelism of multi-axes, aim axis and laser emission axis, parallelism of laser emission axis and laser receiving axis and first acuualizes jumpiness of optical axis of optical sighting device, it's a universal test system.

  16. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/0 requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.

  17. Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Samuel; Oliker, Leonid; Vuduc, Richard

    2008-10-16

    We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less

  18. Physics based modeling of a series parallel battery pack for asymmetry analysis, predictive control and life extension

    NASA Astrophysics Data System (ADS)

    Ganesan, Nandhini; Basu, Suman; Hariharan, Krishnan S.; Kolake, Subramanya Mayya; Song, Taewon; Yeo, Taejung; Sohn, Dong Kee; Doo, Seokgwang

    2016-08-01

    Lithium-Ion batteries used for electric vehicle applications are subject to large currents and various operation conditions, making battery pack design and life extension a challenging problem. With increase in complexity, modeling and simulation can lead to insights that ensure optimal performance and life extension. In this manuscript, an electrochemical-thermal (ECT) coupled model for a 6 series × 5 parallel pack is developed for Li ion cells with NCA/C electrodes and validated against experimental data. Contribution of the cathode to overall degradation at various operating conditions is assessed. Pack asymmetry is analyzed from a design and an operational perspective. Design based asymmetry leads to a new approach of obtaining the individual cell responses of the pack from an average ECT output. Operational asymmetry is demonstrated in terms of effects of thermal gradients on cycle life, and an efficient model predictive control technique is developed. Concept of reconfigurable battery pack is studied using detailed simulations that can be used for effective monitoring and extension of battery pack life.

  19. Systems-on-chip approach for real-time simulation of wheel-rail contact laws

    NASA Astrophysics Data System (ADS)

    Mei, T. X.; Zhou, Y. J.

    2013-04-01

    This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.

  20. Parallel Monte Carlo transport modeling in the context of a time-dependent, three-dimensional multi-physics code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Procassini, R.J.

    1997-12-31

    The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution ofmore » particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.« less

  1. Modelling and simulation of parallel triangular triple quantum dots (TTQD) by using SIMON 2.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fathany, Maulana Yusuf, E-mail: myfathany@gmail.com; Fuada, Syifaul, E-mail: fsyifaul@gmail.com; Lawu, Braham Lawas, E-mail: bram-labs@rocketmail.com

    2016-04-19

    This research presents analysis of modeling on Parallel Triple Quantum Dots (TQD) by using SIMON (SIMulation Of Nano-structures). Single Electron Transistor (SET) is used as the basic concept of modeling. We design the structure of Parallel TQD by metal material with triangular geometry model, it is called by Triangular Triple Quantum Dots (TTQD). We simulate it with several scenarios using different parameters; such as different value of capacitance, various gate voltage, and different thermal condition.

  2. Hypercluster Parallel Processor

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela

    1992-01-01

    Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.

  3. Multidisciplinary systems optimization by linear decomposition

    NASA Technical Reports Server (NTRS)

    Sobieski, J.

    1984-01-01

    In a typical design process major decisions are made sequentially. An illustrated example is given for an aircraft design in which the aerodynamic shape is usually decided first, then the airframe is sized for strength and so forth. An analogous sequence could be laid out for any other major industrial product, for instance, a ship. The loops in the discipline boxes symbolize iterative design improvements carried out within the confines of a single engineering discipline, or subsystem. The loops spanning several boxes depict multidisciplinary design improvement iterations. Omitted for graphical simplicity is parallelism of the disciplinary subtasks. The parallelism is important in order to develop a broad workfront necessary to shorten the design time. If all the intradisciplinary and interdisciplinary iterations were carried out to convergence, the process could yield a numerically optimal design. However, it usually stops short of that because of time and money limitations. This is especially true for the interdisciplinary iterations.

  4. Athena Mission Status

    NASA Astrophysics Data System (ADS)

    Lumb, D.

    2016-07-01

    Athena has been selected by ESA for its second large mission opportunity of the Cosmic Visions programme, to address the theme of the Hot and Energetic Universe. Following the submission of a proposal from the community, the technical and programmatic aspects of the mission design were reviewed in ESA's Concurrent Design Facility. The proposed concept was deemed to betechnically feasible, but with potential constraints from cost and schedule. Two parallel industry study contracts have been conducted to explore these conclusions more thoroughly, with the key aim of providing consolidated inputs to a Mission Consolidation Review that was conducted in April-May 2016. This MCR has recommended a baseline design, which allows the agency to solicit proposals for a community provided payload. Key design aspects arising from the studies are described, and the new reference design is summarised.

  5. Data decomposition method for parallel polygon rasterization considering load balancing

    NASA Astrophysics Data System (ADS)

    Zhou, Chen; Chen, Zhenjie; Liu, Yongxue; Li, Feixue; Cheng, Liang; Zhu, A.-xing; Li, Manchun

    2015-12-01

    It is essential to adopt parallel computing technology to rapidly rasterize massive polygon data. In parallel rasterization, it is difficult to design an effective data decomposition method. Conventional methods ignore load balancing of polygon complexity in parallel rasterization and thus fail to achieve high parallel efficiency. In this paper, a novel data decomposition method based on polygon complexity (DMPC) is proposed. First, four factors that possibly affect the rasterization efficiency were investigated. Then, a metric represented by the boundary number and raster pixel number in the minimum bounding rectangle was developed to calculate the complexity of each polygon. Using this metric, polygons were rationally allocated according to the polygon complexity, and each process could achieve balanced loads of polygon complexity. To validate the efficiency of DMPC, it was used to parallelize different polygon rasterization algorithms and tested on different datasets. Experimental results showed that DMPC could effectively parallelize polygon rasterization algorithms. Furthermore, the implemented parallel algorithms with DMPC could achieve good speedup ratios of at least 15.69 and generally outperformed conventional decomposition methods in terms of parallel efficiency and load balancing. In addition, the results showed that DMPC exhibited consistently better performance for different spatial distributions of polygons.

  6. Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

    NASA Astrophysics Data System (ADS)

    Kegel, Philipp; Schellmann, Maraike; Gorlatch, Sergei

    We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.

  7. Topology search of 3-DOF translational parallel manipulators with three identical limbs for leg mechanisms

    NASA Astrophysics Data System (ADS)

    Wang, Mingfeng; Ceccarelli, Marco

    2015-07-01

    Three-degree of freedom(3-DOF) translational parallel manipulators(TPMs) have been widely studied both in industry and academia in the past decades. However, most architectures of 3-DOF TPMs are created mainly on designers' intuition, empirical knowledge, or associative reasoning and the topology synthesis researches of 3-DOF TPMs are still limited. In order to find out the atlas of designs for 3-DOF TPMs, a topology search is presented for enumeration of 3-DOF TPMs whose limbs can be modeled as 5-DOF serial chains. The proposed topology search of 3-DOF TPMs is aimed to overcome the sensitivities of the design solution of a 3-DOF TPM for a LARM leg mechanism in a biped robot. The topology search, which is based on the concept of generation and specialization in graph theory, is reported as a step-by-step procedure with desired specifications, principle and rules of generalization, design requirements and constraints, and algorithm of number synthesis. In order to obtain new feasible designs for a chosen example and to limit the search domain under general considerations, one topological generalized kinematic chain is chosen to be specialized. An atlas of new feasible designs is obtained and analyzed for a specific solution as leg mechanisms. The proposed methodology provides a topology search for 3-DOF TPMs for leg mechanisms, but it can be also expanded for other applications and tasks.

  8. Comparison of capacitive and radio frequency resonator sensors for monitoring parallelized droplet microfluidic production.

    PubMed

    Conchouso, David; McKerricher, Garret; Arevalo, Arpys; Castro, David; Shamim, Atif; Foulds, Ian G

    2016-08-16

    Scaled-up production of microfluidic droplets, through the parallelization of hundreds of droplet generators, has received a lot of attention to bring novel multiphase microfluidics research to industrial applications. However, apart from droplet generation, other significant challenges relevant to this goal have never been discussed. Examples include monitoring systems, high-throughput processing of droplets and quality control procedures among others. In this paper, we present and compare capacitive and radio frequency (RF) resonator sensors as two candidates that can measure the dielectric properties of emulsions in microfluidic channels. By placing several of these sensors in a parallelization device, the stability of the droplet generation at different locations can be compared, and potential malfunctions can be detected. This strategy enables for the first time the monitoring of scaled-up microfluidic droplet production. Both sensors were prototyped and characterized using emulsions with droplets of 100-150 μm in diameter, which were generated in parallelization devices at water-in-oil volume fractions (φ) between 11.1% and 33.3%.Using these sensors, we were able to measure accurately increments as small as 2.4% in the water volume fraction of the emulsions. Although both methods rely on the dielectric properties of the emulsions, the main advantage of the RF resonator sensors is the fact that they can be designed to resonate at multiple frequencies of the broadband transmission line. Consequently with careful design, two or more sensors can be parallelized and read out by a single signal. Finally, a comparison between these sensors based on their sensitivity, readout cost and simplicity, and design flexibility is also discussed.

  9. Study and Development of an Air Conditioning System Operating on a Magnetic Heat Pump Cycle

    NASA Technical Reports Server (NTRS)

    Wang, Pao-Lien

    1991-01-01

    This report describes the design of a laboratory scale demonstration prototype of an air conditioning system operating on a magnetic heat pump cycle. Design parameters were selected through studies performed by a Kennedy Space Center (KSC) System Simulation Computer Model. The heat pump consists of a rotor turning through four magnetic fields that are created by permanent magnets. Gadolinium was selected as the working material for this demonstration prototype. The rotor was designed to be constructed of flat parallel disks of gadolinium with very little space in between. The rotor rotates in an aluminum housing. The laboratory scale demonstration prototype is designed to provide a theoretical Carnot Cycle efficiency of 62 percent and a Coefficient of Performance of 16.55.

  10. Massively parallel de novo protein design for targeted therapeutics.

    PubMed

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J; Hicks, Derrick R; Vergara, Renan; Murapa, Patience; Bernard, Steffen M; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T; Koday, Merika T; Jenkins, Cody M; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M; Fernández-Velasco, D Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A; Fuller, Deborah H; Baker, David

    2017-10-05

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  11. Massively parallel de novo protein design for targeted therapeutics

    NASA Astrophysics Data System (ADS)

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Fernández-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2017-10-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  12. Massively parallel de novo protein design for targeted therapeutics

    PubMed Central

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Fernández-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2018-01-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing. PMID:28953867

  13. A supportive architecture for CFD-based design optimisation

    NASA Astrophysics Data System (ADS)

    Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong

    2014-03-01

    Multi-disciplinary design optimisation (MDO) is one of critical methodologies to the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics raises a special challenge due to its extremely intensive computation. The rapid development of computational fluid dynamic (CFD) technique has caused a rise of its applications in various fields. Especially for the exterior designs of vehicles, CFD has become one of the three main design tools comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating with CFD analysis in an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which is usually with high dimensions, and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review on existing works has found that very few researchers have studied on the assistive tools to facilitate CFD-based design optimisation. In the paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing technique and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, the case studies on aerodynamic shape design of a hypersonic cruising vehicle are provided, and the result has shown that the proposed architecture and developed algorithms have performed successfully and efficiently in dealing with the design optimisation with over 200 design variables.

  14. Developing a Hadoop-based Middleware for Handling Multi-dimensional NetCDF

    NASA Astrophysics Data System (ADS)

    Li, Z.; Yang, C. P.; Schnase, J. L.; Duffy, D.; Lee, T. J.

    2014-12-01

    Climate observations and model simulations are collecting and generating vast amounts of climate data, and these data are ever-increasing and being accumulated in a rapid speed. Effectively managing and analyzing these data are essential for climate change studies. Hadoop, a distributed storage and processing framework for large data sets, has attracted increasing attentions in dealing with the Big Data challenge. The maturity of Infrastructure as a Service (IaaS) of cloud computing further accelerates the adoption of Hadoop in solving Big Data problems. However, Hadoop is designed to process unstructured data such as texts, documents and web pages, and cannot effectively handle the scientific data format such as array-based NetCDF files and other binary data format. In this paper, we propose to build a Hadoop-based middleware for transparently handling big NetCDF data by 1) designing a distributed climate data storage mechanism based on POSIX-enabled parallel file system to enable parallel big data processing with MapReduce, as well as support data access by other systems; 2) modifying the Hadoop framework to transparently processing NetCDF data in parallel without sequencing or converting the data into other file formats, or loading them to HDFS; and 3) seamlessly integrating Hadoop, cloud computing and climate data in a highly scalable and fault-tolerance framework.

  15. JETSPIN: A specific-purpose open-source software for simulations of nanofiber electrospinning

    NASA Astrophysics Data System (ADS)

    Lauricella, Marco; Pontrelli, Giuseppe; Coluzza, Ivan; Pisignano, Dario; Succi, Sauro

    2015-12-01

    We present the open-source computer program JETSPIN, specifically designed to simulate the electrospinning process of nanofibers. Its capabilities are shown with proper reference to the underlying model, as well as a description of the relevant input variables and associated test-case simulations. The various interactions included in the electrospinning model implemented in JETSPIN are discussed in detail. The code is designed to exploit different computational architectures, from single to parallel processor workstations. This paper provides an overview of JETSPIN, focusing primarily on its structure, parallel implementations, functionality, performance, and availability.

  16. A Visual Database System for Image Analysis on Parallel Computers and its Application to the EOS Amazon Project

    NASA Technical Reports Server (NTRS)

    Shapiro, Linda G.; Tanimoto, Steven L.; Ahrens, James P.

    1996-01-01

    The goal of this task was to create a design and prototype implementation of a database environment that is particular suited for handling the image, vision and scientific data associated with the NASA's EOC Amazon project. The focus was on a data model and query facilities that are designed to execute efficiently on parallel computers. A key feature of the environment is an interface which allows a scientist to specify high-level directives about how query execution should occur.

  17. System software for the finite element machine

    NASA Technical Reports Server (NTRS)

    Crockett, T. W.; Knott, J. D.

    1985-01-01

    The Finite Element Machine is an experimental parallel computer developed at Langley Research Center to investigate the application of concurrent processing to structural engineering analysis. This report describes system-level software which has been developed to facilitate use of the machine by applications researchers. The overall software design is outlined, and several important parallel processing issues are discussed in detail, including processor management, communication, synchronization, and input/output. Based on experience using the system, the hardware architecture and software design are critiqued, and areas for further work are suggested.

  18. Multidisciplinary Design Optimization (MDO) Methods: Their Synergy with Computer Technology in Design Process

    NASA Technical Reports Server (NTRS)

    Sobieszczanski-Sobieski, Jaroslaw

    1998-01-01

    The paper identifies speed, agility, human interface, generation of sensitivity information, task decomposition, and data transmission (including storage) as important attributes for a computer environment to have in order to support engineering design effectively. It is argued that when examined in terms of these attributes the presently available environment can be shown to be inadequate a radical improvement is needed, and it may be achieved by combining new methods that have recently emerged from multidisciplinary design optimization (MDO) with massively parallel processing computer technology. The caveat is that, for successful use of that technology in engineering computing, new paradigms for computing will have to be developed - specifically, innovative algorithms that are intrinsically parallel so that their performance scales up linearly with the number of processors. It may be speculated that the idea of simulating a complex behavior by interaction of a large number of very simple models may be an inspiration for the above algorithms, the cellular automata are an example. Because of the long lead time needed to develop and mature new paradigms, development should be now, even though the widespread availability of massively parallel processing is still a few years away.

  19. Multidisciplinary Design Optimisation (MDO) Methods: Their Synergy with Computer Technology in the Design Process

    NASA Technical Reports Server (NTRS)

    Sobieszczanski-Sobieski, Jaroslaw

    1999-01-01

    The paper identifies speed, agility, human interface, generation of sensitivity information, task decomposition, and data transmission (including storage) as important attributes for a computer environment to have in order to support engineering design effectively. It is argued that when examined in terms of these attributes the presently available environment can be shown to be inadequate. A radical improvement is needed, and it may be achieved by combining new methods that have recently emerged from multidisciplinary design optimisation (MDO) with massively parallel processing computer technology. The caveat is that, for successful use of that technology in engineering computing, new paradigms for computing will have to be developed - specifically, innovative algorithms that are intrinsically parallel so that their performance scales up linearly with the number of processors. It may be speculated that the idea of simulating a complex behaviour by interaction of a large number of very simple models may be an inspiration for the above algorithms; the cellular automata are an example. Because of the long lead time needed to develop and mature new paradigms, development should begin now, even though the widespread availability of massively parallel processing is still a few years away.

  20. A Data Parallel Multizone Navier-Stokes Code

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon; Kwak, Dochan (Technical Monitor)

    1995-01-01

    We have developed a data parallel multizone compressible Navier-Stokes code on the Connection Machine CM-5. The code is set up for implicit time-stepping on single or multiple structured grids. For multiple grids and geometrically complex problems, we follow the "chimera" approach, where flow data on one zone is interpolated onto another in the region of overlap. We will describe our design philosophy and give some timing results for the current code. The design choices can be summarized as: 1. finite differences on structured grids; 2. implicit time-stepping with either distributed solves or data motion and local solves; 3. sequential stepping through multiple zones with interzone data transfer via a distributed data structure. We have implemented these ideas on the CM-5 using CMF (Connection Machine Fortran), a data parallel language which combines elements of Fortran 90 and certain extensions, and which bears a strong similarity to High Performance Fortran (HPF). One interesting feature is the issue of turbulence modeling, where the architecture of a parallel machine makes the use of an algebraic turbulence model awkward, whereas models based on transport equations are more natural. We will present some performance figures for the code on the CM-5, and consider the issues involved in transitioning the code to HPF for portability to other parallel platforms.

  1. GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

    NASA Astrophysics Data System (ADS)

    Xie, Lang; Luo, Yi-han; Bao, Qi-liang

    2013-08-01

    GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is also used to restore the image hardly with any recursion or iteration. Combining the algorithm with data intensiveness, data parallel computing and GPU execution model of single instruction and multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing of GPU. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in frequency domain after the optimization of data access and the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.

  2. Analysis of multigrid methods on massively parallel computers: Architectural implications

    NASA Technical Reports Server (NTRS)

    Matheson, Lesley R.; Tarjan, Robert E.

    1993-01-01

    We study the potential performance of multigrid algorithms running on massively parallel computers with the intent of discovering whether presently envisioned machines will provide an efficient platform for such algorithms. We consider the domain parallel version of the standard V cycle algorithm on model problems, discretized using finite difference techniques in two and three dimensions on block structured grids of size 10(exp 6) and 10(exp 9), respectively. Our models of parallel computation were developed to reflect the computing characteristics of the current generation of massively parallel multicomputers. These models are based on an interconnection network of 256 to 16,384 message passing, 'workstation size' processors executing in an SPMD mode. The first model accomplishes interprocessor communications through a multistage permutation network. The communication cost is a logarithmic function which is similar to the costs in a variety of different topologies. The second model allows single stage communication costs only. Both models were designed with information provided by machine developers and utilize implementation derived parameters. With the medium grain parallelism of the current generation and the high fixed cost of an interprocessor communication, our analysis suggests an efficient implementation requires the machine to support the efficient transmission of long messages, (up to 1000 words) or the high initiation cost of a communication must be significantly reduced through an alternative optimization technique. Furthermore, with variable length message capability, our analysis suggests the low diameter multistage networks provide little or no advantage over a simple single stage communications network.

  3. Stability of tapered and parallel-walled dental implants: A systematic review and meta-analysis.

    PubMed

    Atieh, Momen A; Alsabeeha, Nabeel; Duncan, Warwick J

    2018-05-15

    Clinical trials have suggested that dental implants with a tapered configuration have improved stability at placement, allowing immediate placement and/or loading. The aim of this systematic review and meta-analysis was to evaluate the implant stability of tapered dental implants compared to standard parallel-walled dental implants. Applying the guidelines of Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement, randomized controlled trials (RCTs) were searched for in electronic databases and complemented by hand searching. The risk of bias was assessed using the Cochrane Collaboration's Risk of Bias tool and data were analyzed using statistical software. A total of 1199 studies were identified, of which, five trials were included with 336 dental implants in 303 participants. Overall meta-analysis showed that tapered dental implants had higher implant stability values than parallel-walled dental implants at insertion and 8 weeks but the difference was not statistically significant. Tapered dental implants had significantly less marginal bone loss compared to parallel-walled dental implants. No significant differences in implant failure rate were found between tapered and parallel-walled dental implants. There is limited evidence to demonstrate the effectiveness of tapered dental implants in achieving greater implant stability compared to parallel-walled dental implants. Superior short-term results in maintaining peri-implant marginal bone with tapered dental implants are possible. Further properly designed RCTs are required to endorse the supposed advantages of tapered dental implants in immediate loading protocol and other complex clinical scenarios. © 2018 Wiley Periodicals, Inc.

  4. The characteristics and limitations of the MPS/MMS battery charging system

    NASA Technical Reports Server (NTRS)

    Ford, F. E.; Palandati, C. F.; Davis, J. F.; Tasevoli, C. M.

    1980-01-01

    A series of tests was conducted on two 12 ampere hour nickel cadmium batteries under a simulated cycle regime using the multiple voltage versus temperature levels designed into the modular power system (MPS). These tests included: battery recharge as a function of voltage control level; temperature imbalance between two parallel batteries; a shorted or partially shorted cell in one of the two parallel batteries; impedance imbalance of one of the parallel battery circuits; and disabling and enabling one of the batteries from the bus at various charge and discharge states. The results demonstrate that the eight commandable voltage versus temperature levels designed into the MPS provide a very flexible system that not only can accommodate a wide range of normal power system operation, but also provides a high degree of flexibility in responding to abnormal operating conditions.

  5. Update on Development of Mesh Generation Algorithms in MeshKit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jain, Rajeev; Vanderzee, Evan; Mahadevan, Vijay

    2015-09-30

    MeshKit uses a graph-based design for coding all its meshing algorithms, which includes the Reactor Geometry (and mesh) Generation (RGG) algorithms. This report highlights the developmental updates of all the algorithms, results and future work. Parallel versions of algorithms, documentation and performance results are reported. RGG GUI design was updated to incorporate new features requested by the users; boundary layer generation and parallel RGG support were added to the GUI. Key contributions to the release, upgrade and maintenance of other SIGMA1 libraries (CGM and MOAB) were made. Several fundamental meshing algorithms for creating a robust parallel meshing pipeline in MeshKitmore » are under development. Results and current status of automated, open-source and high quality nuclear reactor assembly mesh generation algorithms such as trimesher, quadmesher, interval matching and multi-sweeper are reported.« less

  6. A Multiscale Parallel Computing Architecture for Automated Segmentation of the Brain Connectome

    PubMed Central

    Knobe, Kathleen; Newton, Ryan R.; Schlimbach, Frank; Blower, Melanie; Reid, R. Clay

    2015-01-01

    Several groups in neurobiology have embarked into deciphering the brain circuitry using large-scale imaging of a mouse brain and manual tracing of the connections between neurons. Creating a graph of the brain circuitry, also called a connectome, could have a huge impact on the understanding of neurodegenerative diseases such as Alzheimer’s disease. Although considerably smaller than a human brain, a mouse brain already exhibits one billion connections and manually tracing the connectome of a mouse brain can only be achieved partially. This paper proposes to scale up the tracing by using automated image segmentation and a parallel computing approach designed for domain experts. We explain the design decisions behind our parallel approach and we present our results for the segmentation of the vasculature and the cell nuclei, which have been obtained without any manual intervention. PMID:21926011

  7. Binary zone-plate array for a parallel joint transform correlator applied to face recognition.

    PubMed

    Kodate, K; Hashimoto, A; Thapliya, R

    1999-05-10

    Taking advantage of small aberrations, high efficiency, and compactness, we developed a new, to our knowledge, design procedure for a binary zone-plate array (BZPA) and applied it to a parallel joint transform correlator for the recognition of the human face. Pairs of reference and unknown images of faces are displayed on a liquid-crystal spatial light modulator (SLM), Fourier transformed by the BZPA, intensity recorded on an optically addressable SLM, and inversely Fourier transformed to obtain correlation signals. Consideration of the bandwidth allows the relations among the channel number, the numerical aperture of the zone plates, and the pattern size to be determined. Experimentally a five-channel parallel correlator was implemented and tested successfully with a 100-person database. The design and the fabrication of a 20-channel BZPA for phonetic character recognition are also included.

  8. School Principals' Opinions on In-Class Inspections

    ERIC Educational Resources Information Center

    Kayikci, Kemal; Sahin, Ahmet; Canturk, Gokhan

    2016-01-01

    The aim of this research is to determine school principals' opinions on the in-class inspections carried out by inspectors of the Ministry of National Education of Turkey (MEB). The study was modeled as a convergent parallel design, one of the mixed methods which combined qualitative and quantitative methods. For data collection, the researchers…

  9. Randomized Controlled Trial of Video Self-Modeling Following Speech Restructuring Treatment for Stuttering

    ERIC Educational Resources Information Center

    Cream, Angela; O'Brian, Sue; Jones, Mark; Block, Susan; Harrison, Elisabeth; Lincoln, Michelle; Hewat, Sally; Packman, Ann; Menzies, Ross; Onslow, Mark

    2010-01-01

    Purpose: In this study, the authors investigated the efficacy of video self-modeling (VSM) following speech restructuring treatment to improve the maintenance of treatment effects. Method: The design was an open-plan, parallel-group, randomized controlled trial. Participants were 89 adults and adolescents who undertook intensive speech…

  10. Fear Appeals and College Students' Attitudes and Behavioral Intentions toward Global Warming

    ERIC Educational Resources Information Center

    Li, Shu-Chu Sarrina

    2014-01-01

    This study used Witte's extended parallel process model to examine the relationships between the use of fear appeals and college students' attitudes and behavioral intentions toward global warming. A pretest-posttest quasi-experimental design was adopted. Three hundred forty-one college students from six communication courses at two universities…

  11. Beyond "Parallel Play": Creating a Realistic Model of Integrative Learning with Community College Freshmen

    ERIC Educational Resources Information Center

    Burg, Evelyn; Klages, Marisa; Sokolski, Patricia

    2013-01-01

    What does interdisciplinary integration actually look like for students beginning their college studies? This article describes what a LaGuardia Community College teaching team, who typically share a theme and consult periodically but keep their classes distinct--discovered when they designed an integrative assignment for a paired developmental…

  12. Pattern Separation and Goal-Directed Behavior in the Aged Canine

    ERIC Educational Resources Information Center

    Snigdha, Shikha; Yassa, Michael A.; deRivera, Christina; Milgram, Norton W.; Cotman, Carl W.

    2017-01-01

    The pattern separation task has recently emerged as a behavioral model of hippocampus function and has been used in several pharmaceutical trials. The canine is a useful model to evaluate a multitude of hippocampal-dependent cognitive tasks that parallel those in humans. Thus, this study was designed to evaluate the suitability of pattern…

  13. Comparing the Teaching Interaction Procedure to Social Stories for People with Autism

    ERIC Educational Resources Information Center

    Leaf, Justin B.; Oppenheim-Leaf, Misty L.; Call, Nikki A.; Sheldon, Jan B.; Sherman, James A.; Taubman, Mitchell; McEachin, John; Dayharsh, Jamison; Leaf, Ronald

    2012-01-01

    This study compared social stories and the teaching interaction procedure to teach social skills to 6 children and adolescents with an autism spectrum disorder. Researchers taught 18 social skills with social stories and 18 social skills with the teaching interaction procedure within a parallel treatment design. The teaching interaction procedure…

  14. Using Game Theory and the Bible to Build Critical Thinking Skills

    ERIC Educational Resources Information Center

    McCannon, Bryan C.

    2007-01-01

    The author describes a course designed to build the critical thinking skills of undergraduate economics students. The course introduces and uses game theory to study the Bible. Students gain experience using game theory to formalize events and, by drawing parallels between the Bible and common economic concepts, illustrate the pervasiveness of…

  15. The Straightforwardness of Advice: Advice-Giving in Interactions Between Swedish District Nurses and Patients.

    ERIC Educational Resources Information Center

    Leppanen, Vesa

    1998-01-01

    A study examined advice-giving interactions between Swedish district nurses and patients, comparing these sequences with parallel interactions between British health visitors and first-time mothers in previous research. Analysis focused on how advice-giving is organized in the settings, including how advice is initiated and designed, its…

  16. Pre-Service Teachers' TPACK Development and Conceptions through a TPACK-Based Course

    ERIC Educational Resources Information Center

    Durdu, Levent; Dag, Funda

    2017-01-01

    This study examines pre-service teachers' Technological Pedagogical Content Knowledge (TPACK) development and analyses their conceptions of learning and teaching with technology. With this aim in mind, researchers designed and implemented a computer-based mathematics course based on a TPACK framework. As a research methodology, a parallel mixed…

  17. Computations on Wings With Full-Span Oscillating Control Surfaces Using Navier-Stokes Equations

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.

    2013-01-01

    A dual-level parallel procedure is presented for computing large databases to support aerospace vehicle design. This procedure has been developed as a single Unix script within the Parallel Batch Submission environment utilizing MPIexec and runs MPI based analysis software. It has been developed to provide a process for aerospace designers to generate data for large numbers of cases with the highest possible fidelity and reasonable wall clock time. A single job submission environment has been created to avoid keeping track of multiple jobs and the associated system administration overhead. The process has been demonstrated for computing large databases for the design of typical aerospace configurations, a launch vehicle and a rotorcraft.

  18. Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.

    PubMed

    Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B

    2017-04-01

    Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.

  19. Parallel Processing of Broad-Band PPM Signals

    NASA Technical Reports Server (NTRS)

    Gray, Andrew; Kang, Edward; Lay, Norman; Vilnrotter, Victor; Srinivasan, Meera; Lee, Clement

    2010-01-01

    A parallel-processing algorithm and a hardware architecture to implement the algorithm have been devised for timeslot synchronization in the reception of pulse-position-modulated (PPM) optical or radio signals. As in the cases of some prior algorithms and architectures for parallel, discrete-time, digital processing of signals other than PPM, an incoming broadband signal is divided into multiple parallel narrower-band signals by means of sub-sampling and filtering. The number of parallel streams is chosen so that the frequency content of the narrower-band signals is low enough to enable processing by relatively-low speed complementary metal oxide semiconductor (CMOS) electronic circuitry. The algorithm and architecture are intended to satisfy requirements for time-varying time-slot synchronization and post-detection filtering, with correction of timing errors independent of estimation of timing errors. They are also intended to afford flexibility for dynamic reconfiguration and upgrading. The architecture is implemented in a reconfigurable CMOS processor in the form of a field-programmable gate array. The algorithm and its hardware implementation incorporate three separate time-varying filter banks for three distinct functions: correction of sub-sample timing errors, post-detection filtering, and post-detection estimation of timing errors. The design of the filter bank for correction of timing errors, the method of estimating timing errors, and the design of a feedback-loop filter are governed by a host of parameters, the most critical one, with regard to processing very broadband signals with CMOS hardware, being the number of parallel streams (equivalently, the rate-reduction parameter).

  20. Parallel computer vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uhr, L.

    1987-01-01

    This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.

  1. Life Management Skills. Teacher's Guide [and Student Workbook]. Parallel Alternative Strategies for Students (PASS).

    ERIC Educational Resources Information Center

    Goldstein, Jeren; Walford, Sylvia

    This teacher's guide and student workbook are part of a series of supplementary curriculum packages presenting alternative methods and activities designed to meet the needs of Florida secondary students with mild disabilities or other special learning needs. The Life Management Skills PASS (Parallel Alternative Strategies for Students) teacher's…

  2. Parallel Performance of a Combustion Chemistry Simulation

    DOE PAGES

    Skinner, Gregg; Eigenmann, Rudolf

    1995-01-01

    We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.

  3. DICE/ColDICE: 6D collisionless phase space hydrodynamics using a lagrangian tesselation

    NASA Astrophysics Data System (ADS)

    Sousbie, Thierry

    2018-01-01

    DICE is a C++ template library designed to solve collisionless fluid dynamics in 6D phase space using massively parallel supercomputers via an hybrid OpenMP/MPI parallelization. ColDICE, based on DICE, implements a cosmological and physical VLASOV-POISSON solver for cold systems such as dark matter (CDM) dynamics.

  4. 5 mW parallel-connected resonant-tunnelling diode oscillator

    NASA Technical Reports Server (NTRS)

    Stephan, K. D.; Wong, S.-C.; Brown, E. R.; Molvar, K. M.; Calawa, A. R.; Manfra, M. J.

    1992-01-01

    A new type of resonant-tunneling diode (RTD) oscillator that generates 5 mW at 1.18 GHz is reported. This result was obtained by connecting in parallel 25 individual diodes designed for such a connection. This experiment demonstrates that RTDs can successfully be used in a chip-level power-combining circuit.

  5. Language Workbook for Food Service.

    ERIC Educational Resources Information Center

    Mankoski, Linda C.

    This workbook parallels the manual, "Food Service" (see related note), and is designed to assist the language arts or foods service teacher in helping deaf students cope with problems of reading the parallel text. The language system used in this text is based upon the Roberts English Series, which uses a linguistic approach to teaching language…

  6. A parallel expert system for the control of a robotic air vehicle

    NASA Technical Reports Server (NTRS)

    Shakley, Donald; Lamont, Gary B.

    1988-01-01

    Expert systems can be used to govern the intelligent control of vehicles, for example the Robotic Air Vehicle (RAV). Due to the nature of the RAV system the associated expert system needs to perform in a demanding real-time environment. The use of a parallel processing capability to support the associated expert system's computational requirement is critical in this application. Thus, algorithms for parallel real-time expert systems must be designed, analyzed, and synthesized. The design process incorporates a consideration of the rule-set/face-set size along with representation issues. These issues are looked at in reference to information movement and various inference mechanisms. Also examined is the process involved with transporting the RAV expert system functions from the TI Explorer, where they are implemented in the Automated Reasoning Tool (ART), to the iPSC Hypercube, where the system is synthesized using Concurrent Common LISP (CCLISP). The transformation process for the ART to CCLISP conversion is described. The performance characteristics of the parallel implementation of these expert systems on the iPSC Hypercube are compared to the TI Explorer implementation.

  7. High-performance Chinese multiclass traffic sign detection via coarse-to-fine cascade and parallel support vector machine detectors

    NASA Astrophysics Data System (ADS)

    Chang, Faliang; Liu, Chunsheng

    2017-09-01

    The high variability of sign colors and shapes in uncontrolled environments has made the detection of traffic signs a challenging problem in computer vision. We propose a traffic sign detection (TSD) method based on coarse-to-fine cascade and parallel support vector machine (SVM) detectors to detect Chinese warning and danger traffic signs. First, a region of interest (ROI) extraction method is proposed to extract ROIs using color contrast features in local regions. The ROI extraction can reduce scanning regions and save detection time. For multiclass TSD, we propose a structure that combines a coarse-to-fine cascaded tree with a parallel structure of histogram of oriented gradients (HOG) + SVM detectors. The cascaded tree is designed to detect different types of traffic signs in a coarse-to-fine process. The parallel HOG + SVM detectors are designed to do fine detection of different types of traffic signs. The experiments demonstrate the proposed TSD method can rapidly detect multiclass traffic signs with different colors and shapes in high accuracy.

  8. The Distributed Diagonal Force Decomposition Method for Parallelizing Molecular Dynamics Simulations

    PubMed Central

    Boršnik, Urban; Miller, Benjamin T.; Brooks, Bernard R.; Janežič, Dušanka

    2011-01-01

    Parallelization is an effective way to reduce the computational time needed for molecular dynamics simulations. We describe a new parallelization method, the distributed-diagonal force decomposition method, with which we extend and improve the existing force decomposition methods. Our new method requires less data communication during molecular dynamics simulations than replicated data and current force decomposition methods, increasing the parallel efficiency. It also dynamically load-balances the processors' computational load throughout the simulation. The method is readily implemented in existing molecular dynamics codes and it has been incorporated into the CHARMM program, allowing its immediate use in conjunction with the many molecular dynamics simulation techniques that are already present in the program. We also present the design of the Force Decomposition Machine, a cluster of personal computers and networks that is tailored to running molecular dynamics simulations using the distributed diagonal force decomposition method. The design is expandable and provides various degrees of fault resilience. This approach is easily adaptable to computers with Graphics Processing Units because it is independent of the processor type being used. PMID:21793007

  9. Multibus-based parallel processor for simulation

    NASA Technical Reports Server (NTRS)

    Ogrady, E. P.; Wang, C.-H.

    1983-01-01

    A Multibus-based parallel processor simulation system is described. The system is intended to serve as a vehicle for gaining hands-on experience, testing system and application software, and evaluating parallel processor performance during development of a larger system based on the horizontal/vertical-bus interprocessor communication mechanism. The prototype system consists of up to seven Intel iSBC 86/12A single-board computers which serve as processing elements, a multiple transmission controller (MTC) designed to support system operation, and an Intel Model 225 Microcomputer Development System which serves as the user interface and input/output processor. All components are interconnected by a Multibus/IEEE 796 bus. An important characteristic of the system is that it provides a mechanism for a processing element to broadcast data to other selected processing elements. This parallel transfer capability is provided through the design of the MTC and a minor modification to the iSBC 86/12A board. The operation of the MTC, the basic hardware-level operation of the system, and pertinent details about the iSBC 86/12A and the Multibus are described.

  10. A Parallel Genetic Algorithm for Automated Electronic Circuit Design

    NASA Technical Reports Server (NTRS)

    Long, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris

    2000-01-01

    Parallelized versions of genetic algorithms (GAs) are popular primarily for three reasons: the GA is an inherently parallel algorithm, typical GA applications are very compute intensive, and powerful computing platforms, especially Beowulf-style computing clusters, are becoming more affordable and easier to implement. In addition, the low communication bandwidth required allows the use of inexpensive networking hardware such as standard office ethernet. In this paper we describe a parallel GA and its use in automated high-level circuit design. Genetic algorithms are a type of trial-and-error search technique that are guided by principles of Darwinian evolution. Just as the genetic material of two living organisms can intermix to produce offspring that are better adapted to their environment, GAs expose genetic material, frequently strings of 1s and Os, to the forces of artificial evolution: selection, mutation, recombination, etc. GAs start with a pool of randomly-generated candidate solutions which are then tested and scored with respect to their utility. Solutions are then bred by probabilistically selecting high quality parents and recombining their genetic representations to produce offspring solutions. Offspring are typically subjected to a small amount of random mutation. After a pool of offspring is produced, this process iterates until a satisfactory solution is found or an iteration limit is reached. Genetic algorithms have been applied to a wide variety of problems in many fields, including chemistry, biology, and many engineering disciplines. There are many styles of parallelism used in implementing parallel GAs. One such method is called the master-slave or processor farm approach. In this technique, slave nodes are used solely to compute fitness evaluations (the most time consuming part). The master processor collects fitness scores from the nodes and performs the genetic operators (selection, reproduction, variation, etc.). Because of dependency issues in the GA, it is possible to have idle processors. However, as long as the load at each processing node is similar, the processors are kept busy nearly all of the time. In applying GAs to circuit design, a suitable genetic representation 'is that of a circuit-construction program. We discuss one such circuit-construction programming language and show how evolution can generate useful analog circuit designs. This language has the desirable property that virtually all sets of combinations of primitives result in valid circuit graphs. Our system allows circuit size (number of devices), circuit topology, and device values to be evolved. Using a parallel genetic algorithm and circuit simulation software, we present experimental results as applied to three analog filter and two amplifier design tasks. For example, a figure shows an 85 dB amplifier design evolved by our system, and another figure shows the performance of that circuit (gain and frequency response). In all tasks, our system is able to generate circuits that achieve the target specifications.

  11. Minimally invasive strabismus surgery versus paralimbal approach: A randomized, parallel design study is minimally invasive strabismus surgery worth the effort?

    PubMed Central

    Sharma, Richa; Amitava, Abadan K; Bani, Sadat AO

    2014-01-01

    Introduction: Minimal access surgery is common in all fields of medicine. We compared a new minimally invasive strabismus surgery (MISS) approach with a standard paralimbal strabismus surgery (SPSS) approach in terms of post-operative course. Materials and Methods: This parallel design study was done on 28 eyes of 14 patients, in which one eye was randomized to MISS and the other to SPSS. MISS was performed by giving two conjunctival incisions parallel to the horizontal rectus muscles; performing recession or resection below the conjunctival strip so obtained. We compared post-operative redness, congestion, chemosis, foreign body sensation (FBS), and drop intolerance (DI) on a graded scale of 0 to 3 on post-operative day 1, at 2-3 weeks, and 6 weeks. In addition, all scores were added to obtain a total inflammatory score (TIS). Statistical Analysis: Inflammatory scores were analyzed using Wilcoxon's signed rank test. Results: On the first post-operative day, only FBS (P =0.01) and TIS (P =0.04) showed significant difference favoring MISS. At 2-3 weeks, redness (P =0.04), congestion (P =0.04), FBS (P =0.02), and TIS (P =0.04) were significantly less in MISS eye. At 6 weeks, only redness (P =0.04) and TIS (P =0.05) were significantly less. Conclusion: MISS is more comfortable in the immediate post-operative period and provides better cosmesis in the intermediate period. PMID:24088635

  12. Design, Desire, and Difference

    ERIC Educational Resources Information Center

    Leander, Kevin M.; Boldt, Gail

    2018-01-01

    In response to the rise in popularity of concepts of "design" in education research, pedagogy, and curriculum design, in this article we consider how the New London Group conceived of the role of student design practices as an outcome of pedagogy, as well as the parallel role of design in teaching practices. In this descriptive analysis,…

  13. Design and Performance of a 1 ms High-Speed Vision Chip with 3D-Stacked 140 GOPS Column-Parallel PEs †.

    PubMed

    Nose, Atsushi; Yamazaki, Tomohiro; Katayama, Hironobu; Uehara, Shuji; Kobayashi, Masatsugu; Shida, Sayaka; Odahara, Masaki; Takamiya, Kenichi; Matsumoto, Shizunori; Miyashita, Leo; Watanabe, Yoshihiro; Izawa, Takashi; Muramatsu, Yoshinori; Nitta, Yoshikazu; Ishikawa, Masatoshi

    2018-04-24

    We have developed a high-speed vision chip using 3D stacking technology to address the increasing demand for high-speed vision chips in diverse applications. The chip comprises a 1/3.2-inch, 1.27 Mpixel, 500 fps (0.31 Mpixel, 1000 fps, 2 × 2 binning) vision chip with 3D-stacked column-parallel Analog-to-Digital Converters (ADCs) and 140 Giga Operation per Second (GOPS) programmable Single Instruction Multiple Data (SIMD) column-parallel PEs for new sensing applications. The 3D-stacked structure and column parallel processing architecture achieve high sensitivity, high resolution, and high-accuracy object positioning.

  14. 76 FR 2944 - Notice of Passenger Facility Charge (PFC) Approvals and Disapprovals

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-18

    ... equipment. Rehabilitate airfield guidance signs. Rehabilitate runway 16/34 (design only). Rehabilitate parallel and connecting taxiways (design only). Rehabilitate terminal building. Conduct wildlife hazard assessment. Terminal building expansion (design only). PFC administrative costs. Reconstruct west aircraft...

  15. Improved packing of protein side chains with parallel ant colonies.

    PubMed

    Quan, Lijun; Lü, Qiang; Li, Haiou; Xia, Xiaoyan; Wu, Hongjie

    2014-01-01

    The accurate packing of protein side chains is important for many computational biology problems, such as ab initio protein structure prediction, homology modelling, and protein design and ligand docking applications. Many of existing solutions are modelled as a computational optimisation problem. As well as the design of search algorithms, most solutions suffer from an inaccurate energy function for judging whether a prediction is good or bad. Even if the search has found the lowest energy, there is no certainty of obtaining the protein structures with correct side chains. We present a side-chain modelling method, pacoPacker, which uses a parallel ant colony optimisation strategy based on sharing a single pheromone matrix. This parallel approach combines different sources of energy functions and generates protein side-chain conformations with the lowest energies jointly determined by the various energy functions. We further optimised the selected rotamers to construct subrotamer by rotamer minimisation, which reasonably improved the discreteness of the rotamer library. We focused on improving the accuracy of side-chain conformation prediction. For a testing set of 442 proteins, 87.19% of X1 and 77.11% of X12 angles were predicted correctly within 40° of the X-ray positions. We compared the accuracy of pacoPacker with state-of-the-art methods, such as CIS-RR and SCWRL4. We analysed the results from different perspectives, in terms of protein chain and individual residues. In this comprehensive benchmark testing, 51.5% of proteins within a length of 400 amino acids predicted by pacoPacker were superior to the results of CIS-RR and SCWRL4 simultaneously. Finally, we also showed the advantage of using the subrotamers strategy. All results confirmed that our parallel approach is competitive to state-of-the-art solutions for packing side chains. This parallel approach combines various sources of searching intelligence and energy functions to pack protein side chains. It provides a frame-work for combining different inaccuracy/usefulness objective functions by designing parallel heuristic search algorithms.

  16. Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing

    NASA Astrophysics Data System (ADS)

    Meng, Xiang

    The domain of nanoscale optical science and technology is a combination of the classical world of electromagnetics and the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allows the optical structures to be scaled down to nanoscale size or even to the atomic level, which are far smaller than the wavelength they are designed for. These nanostructures can have unique, controllable, and tunable optical properties and their interactions with quantum materials can have important near-field and far-field optical response. Undoubtedly, these optical properties can have many important applications, ranging from the efficient and tunable light sources, detectors, filters, modulators, high-speed all-optical switches; to the next-generation classical and quantum computation, and biophotonic medical sensors. This emerging research of nanoscience, known as nanophotonics, is a highly interdisciplinary field requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales where the nature of the nanostructured matter controls the interactions. In addition, the fast advancements in the computing capabilities, such as parallel computing, also become as a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, and the design for these devices require extensive memory and extremely long core hours. Thus distributed computing platforms associated with parallel computing are required for faster designs processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses the computing machines to analyze and solve otherwise intractable scientific challenges. In particular, parallel computing are forms of computation operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently. In this dissertation, we report a series of new nanophotonic developments using the advanced parallel computing techniques. The applications include the structure optimizations at the nanoscale to control both the electromagnetic response of materials, and to manipulate nanoscale structures for enhanced field concentration, which enable breakthroughs in imaging, sensing systems (chapter 3 and 4) and improve the spatial-temporal resolutions of spectroscopies (chapter 5). We also report the investigations on the confinement study of optical-matter interactions at the quantum mechanical regime, where the size-dependent novel properties enhanced a wide range of technologies from the tunable and efficient light sources, detectors, to other nanophotonic elements with enhanced functionality (chapter 6 and 7).

  17. Airborne Precision Spacing for Dependent Parallel Operations Interface Study

    NASA Technical Reports Server (NTRS)

    Volk, Paul M.; Takallu, M. A.; Hoffler, Keith D.; Weiser, Jarold; Turner, Dexter

    2012-01-01

    This paper describes a usability study of proposed cockpit interfaces to support Airborne Precision Spacing (APS) operations for aircraft performing dependent parallel approaches (DPA). NASA has proposed an airborne system called Pair Dependent Speed (PDS) which uses their Airborne Spacing for Terminal Arrival Routes (ASTAR) algorithm to manage spacing intervals. Interface elements were designed to facilitate the input of APS-DPA spacing parameters to ASTAR, and to convey PDS system information to the crew deemed necessary and/or helpful to conduct the operation, including: target speed, guidance mode, target aircraft depiction, and spacing trend indication. In the study, subject pilots observed recorded simulations using the proposed interface elements in which the ownship managed assigned spacing intervals from two other arriving aircraft. Simulations were recorded using the Aircraft Simulation for Traffic Operations Research (ASTOR) platform, a medium-fidelity simulator based on a modern Boeing commercial glass cockpit. Various combinations of the interface elements were presented to subject pilots, and feedback was collected via structured questionnaires. The results of subject pilot evaluations show that the proposed design elements were acceptable, and that preferable combinations exist within this set of elements. The results also point to potential improvements to be considered for implementation in future experiments.

  18. Parallel programming of industrial applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heroux, M; Koniges, A; Simon, H

    1998-07-21

    In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real applications codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from thesemore » applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN l-55860-54).« less

  19. F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).

  20. Hierarchical Fuzzy Control Applied to Parallel Connected UPS Inverters Using Average Current Sharing Scheme

    NASA Astrophysics Data System (ADS)

    Singh, Santosh Kumar; Ghatak Choudhuri, Sumit

    2018-05-01

    Parallel connection of UPS inverters to enhance power rating is a widely accepted practice. Inter-modular circulating currents appear when multiple inverter modules are connected in parallel to supply variable critical load. Interfacing of modules henceforth requires an intensive design, using proper control strategy. The potentiality of human intuitive Fuzzy Logic (FL) control with imprecise system model is well known and thus can be utilised in parallel-connected UPS systems. Conventional FL controller is computational intensive, especially with higher number of input variables. This paper proposes application of Hierarchical-Fuzzy Logic control for parallel connected Multi-modular inverters system for reduced computational burden on the processor for a given switching frequency. Simulated results in MATLAB environment and experimental verification using Texas TMS320F2812 DSP are included to demonstrate feasibility of the proposed control scheme.

Top