Science.gov

Sample records for implementation compilation optimization

  1. SOL - SIZING AND OPTIMIZATION LANGUAGE COMPILER

    NASA Technical Reports Server (NTRS)

    Scotti, S. J.

    1994-01-01

    each variable was used. The listings summarize all optimizations, listing the objective functions, design variables, and constraints. The compiler offers error-checking specific to optimization problems, so that simple mistakes will not cost hours of debugging time. The optimization engine used by and included with the SOL compiler is a version of Vanderplaats' ADS system (Version 1.1) modified specifically to work with the SOL compiler. SOL allows the use of over 100 ADS optimization choices such as Sequential Quadratic Programming, Modified Feasible Directions, interior and exterior penalty function, and variable metric methods. Default choices of the many control parameters of ADS are made for the user; however, the user can override any of the ADS control parameters for each individual optimization. The SOL language and compiler were developed with an advanced compiler-generation system to ensure correctness and simplify program maintenance. Thus, SOL's syntax was defined precisely by an LALR(1) grammar, and the SOL compiler's parser was generated automatically from that grammar with a parser-generator. Hence, unlike ad hoc, manually coded interfaces, the SOL compiler's lexical analysis and parsing ensure that the compiler recognizes all legal SOL programs, can recover from and correct for many errors, and reports the location of errors to the user. This version of the SOL compiler has been implemented on VAX/VMS computer systems and requires 204 KB of virtual memory to execute. Since the SOL compiler produces FORTRAN code, it requires the VAX FORTRAN compiler to produce an executable program. The SOL compiler consists of 13,000 lines of Pascal code. It was developed in 1986 and last updated in 1988. The ADS and other utility subroutines amount to 14,000 lines of FORTRAN code and were also updated in 1988.
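
    The workflow this record describes -- stating an objective, design variables, and constraints, then handing them to an optimizer whose control parameters have sensible defaults that the user can override -- can be sketched with a modern general-purpose optimizer. The Python sketch below uses SciPy's SLSQP in place of ADS; the objective, bounds, and constraint are made-up placeholders, not SOL syntax.

        # Illustrative sketch only: a SOL-style sizing problem expressed with SciPy.
        # The objective, bounds, and constraint are made-up placeholders, and SciPy's
        # SLSQP stands in for the ADS strategies mentioned in the record above.
        from scipy.optimize import minimize

        def mass(x):                      # objective: minimize structural mass
            thickness, width = x
            return thickness * width * 2.7

        def stress_margin(x):             # inequality constraint: margin >= 0
            thickness, width = x
            return thickness * width - 0.5

        result = minimize(
            mass,
            x0=[1.0, 1.0],                           # initial design variables
            method="SLSQP",                          # a default choice; the user may override
            bounds=[(0.1, 5.0), (0.1, 5.0)],
            constraints=[{"type": "ineq", "fun": stress_margin}],
            options={"maxiter": 200},                # analogous to overriding control parameters
        )
        print(result.x, result.fun)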

  2. Asymptotically optimal topological quantum compiling.

    PubMed

    Kliuchnikov, Vadym; Bocharov, Alex; Svore, Krysta M

    2014-04-11

    We address the problem of compiling quantum operations into braid representations for non-Abelian quasiparticles described by the Fibonacci anyon model. We classify the single-qubit unitaries that can be represented exactly by Fibonacci anyon braids and use the classification to develop a probabilistically polynomial algorithm that approximates any given single-qubit unitary to a desired precision by an asymptotically depth-optimal braid pattern. We extend our algorithm in two directions: to produce braids that allow only single-strand movement, called weaves, and to produce depth-optimal approximations of two-qubit gates. Our compiled braid patterns have depths that are 20 to 1000 times shorter than those output by prior state-of-the-art methods, for precisions ranging between 10^-10 and 10^-30. PMID:24765934
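
    The search problem the abstract describes -- finding the shortest generator word that approximates a target single-qubit unitary -- can be illustrated by brute-force enumeration over words of increasing depth. In the sketch below the two generator matrices are arbitrary placeholders, not the Fibonacci braid matrices, and real compilers use far more efficient number-theoretic searches.

        # Toy breadth-first search for the shortest generator word approximating a
        # target unitary. Generators below are arbitrary placeholders, NOT the
        # Fibonacci anyon braid matrices from the paper.
        import numpy as np
        from itertools import product

        def distance(u, v):
            # global-phase-invariant distance between 2x2 unitaries
            return np.sqrt(abs(1 - abs(np.trace(u.conj().T @ v)) / 2))

        g1 = np.array([[np.exp(-1j * 0.3), 0], [0, np.exp(1j * 0.7)]])
        g2 = np.array([[np.cos(0.4), -np.sin(0.4)], [np.sin(0.4), np.cos(0.4)]]) + 0j
        generators = {"a": g1, "b": g2}

        target = np.array([[0, 1], [1, 0]], dtype=complex)   # e.g. a NOT gate

        best_word, best_dist = "", np.inf
        for depth in range(1, 9):                             # increasing word depth
            for word in product(generators, repeat=depth):
                u = np.eye(2, dtype=complex)
                for letter in word:
                    u = generators[letter] @ u
                d = distance(u, target)
                if d < best_dist:
                    best_word, best_dist = "".join(word), d
        print(best_word, best_dist)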

  3. Optimizing Sisal Compiler; Sisal Compiler and Running System

    SciTech Connect

    1992-11-12

    OSC is a compiler and runtime system for the functional language Sisal. Functional languages are based on mathematical principles, and may reduce the cost of parallel program development without sacrificing performance. OSC compiles Sisal source code to binary form, automatically inserting calls to the Sisal runtime system to manage parallel execution of independent tasks. Features include support for dynamic arrays, automatic vectorization, and automatic parallelization. At runtime, the user may specify the number of workers, the granularity of tasks, and other execution parameters.

  4. Optimizing Sisal Compiler; Sisal Compiler and Running System

    Energy Science and Technology Software Center (ESTSC)

    1992-11-12

    OSC is a compiler and runtime system for the functional language Sisal. Functional languages are based on mathematical principles, and may reduce the cost of parallel program development without sacrificing performance. OSC compiles Sisal source code to binary form, automatically inserting calls to the Sisal runtime system to manage parallel execution of independent tasks. Features include support for dynamic arrays, automatic vectorization, and automatic parallelization. At runtime, the user may specify the number of workers, the granularity of tasks, and other execution parameters.

  5. Compiler-Directed File Layout Optimization for Hierarchical Storage Systems

    DOE PAGES Beta

    Ding, Wei; Zhang, Yuanrui; Kandemir, Mahmut; Son, Seung Woo

    2013-01-01

    File layout of array data is a critical factor that affects the behavior of storage caches, and has so far received little attention in the context of hierarchical storage systems. The main contribution of this paper is a compiler-driven file layout optimization scheme for hierarchical storage caches. This approach, fully automated within an optimizing compiler, analyzes a multi-threaded application code and determines a file layout for each disk-resident array referenced by the code, such that the performance of the target storage cache hierarchy is maximized. We tested our approach using 16 I/O intensive application programs and compared its performance against two previously proposed approaches under different cache space management schemes. Our experimental results show that the proposed approach improves the execution time of these parallel applications by 23.7% on average.

  6. Final Project Report: A Polyhedral Transformation Framework for Compiler Optimization

    SciTech Connect

    Sadayappan, Ponnuswamy; Rountev, Atanas

    2015-06-15

    The project developed the polyhedral compiler transformation module PolyOpt/Fortran in the ROSE compiler framework. PolyOpt/Fortran performs automated transformation of affine loop nests within FORTRAN programs for enhanced data locality and parallel execution. A FORTRAN version of the Polybench library was also developed by the project. A third development was a dynamic analysis approach to gauge vectorization potential within loops of programs; software (DDVec) for automated instrumentation and dynamic analysis of programs was developed.
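
    Loop tiling is one of the locality transformations such a polyhedral tool applies automatically to affine loop nests. The sketch below shows the transformation by hand on a plain Python transpose kernel; the tile size and kernel are arbitrary examples, not output of PolyOpt itself.

        # Hand-written illustration of loop tiling for locality; polyhedral tools such
        # as PolyOpt derive transformations like this automatically for affine nests.
        N, B = 512, 64                       # problem size and an arbitrary tile size
        A = [[float(i + j) for j in range(N)] for i in range(N)]
        T = [[0.0] * N for _ in range(N)]

        # Original nest:  for i: for j: T[j][i] = A[i][j]
        # Tiled nest: iterate over B x B blocks so each block stays cache-resident.
        for ii in range(0, N, B):
            for jj in range(0, N, B):
                for i in range(ii, min(ii + B, N)):
                    for j in range(jj, min(jj + B, N)):
                        T[j][i] = A[i][j]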

  7. Compiler Optimization Pass Visualization: The Procedural Abstraction Case

    ERIC Educational Resources Information Center

    Schaeckeler, Stefan; Shang, Weijia; Davis, Ruth

    2009-01-01

    There is an active research community concentrating on visualizations of algorithms taught in CS1 and CS2 courses. These visualizations can help students to create concrete visual images of the algorithms and their underlying concepts. Not only "fundamental algorithms" can be visualized, but also algorithms used in compilers. Visualizations that…

  8. An Optimizing Compiler for Petascale I/O on Leadership-Class Architectures

    SciTech Connect

    Kandemir, Mahmut Taylan; Choudary, Alok; Thakur, Rajeev

    2014-03-01

    In high-performance computing (HPC), parallel I/O architectures usually have very complex hierarchies with multiple layers that collectively constitute an I/O stack, including high-level I/O libraries such as PnetCDF and HDF5, I/O middleware such as MPI-IO, and parallel file systems such as PVFS and Lustre. Our DOE project explored automated instrumentation and compiler support for I/O-intensive applications. Our project made significant progress towards understanding the complex I/O hierarchies of high-performance storage systems (including storage caches, HDDs, and SSDs), and designing and implementing state-of-the-art compiler/runtime system technology that targets I/O-intensive HPC applications on leadership-class machines. This final report summarizes the major achievements of the project and also points out promising future directions. Two new sections in this report, compared to the previous report, are IOGenie and SSD/NVM-specific optimizations.
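
    As a concrete illustration of the I/O stack layering described above (a high-level library over MPI-IO over a parallel file system), the sketch below performs a collective HDF5 write through h5py's MPI driver. It assumes an MPI-enabled h5py build and a made-up file name; it is independent of the project's IOGenie tooling.

        # Collective parallel write through the I/O stack (HDF5 over MPI-IO over a
        # parallel file system). Requires an h5py build with MPI support; run with
        # e.g. `mpiexec -n 4 python this_script.py`. The file name is an arbitrary example.
        from mpi4py import MPI
        import h5py
        import numpy as np

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        with h5py.File("output.h5", "w", driver="mpio", comm=comm) as f:
            dset = f.create_dataset("field", shape=(size, 1000), dtype="f8")
            dset[rank, :] = np.full(1000, rank, dtype="f8")   # each rank writes its own row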

  9. Optimization guide for programs compiled under IBM FORTRAN H (OPT=2)

    NASA Technical Reports Server (NTRS)

    Smith, D. M.; Dobyns, A. H.; Marsh, H. M.

    1977-01-01

    Guidelines are given to provide the programmer with various techniques for optimizing programs when the FORTRAN IV H compiler is used with OPT=2. Subroutines and programs are described in the appendices along with a timing summary of all the examples given in the manual.

  10. Optimizing python-based ROOT I/O with PyPy's tracing just-in-time compiler

    NASA Astrophysics Data System (ADS)

    Lavrijsen, Wim T. L. P.

    2012-12-01

    The Python programming language allows objects and classes to respond dynamically to the execution environment. Most of this, however, is made possible through language hooks which by definition cannot be optimized and thus tend to be slow. The PyPy implementation of Python includes a tracing just-in-time compiler (JIT), which allows similar dynamic responses but at the interpreter level rather than the application level. Therefore, it is possible to fully remove the hooks, leaving only the dynamic response, in the optimization stage for hot loops, if the types of interest are opened up to the JIT. A general opening up of types to the JIT, based on reflection information, has already been developed (cppyy). The work described in this paper takes it one step further by customizing access to ROOT I/O to the JIT, allowing for fully automatic optimizations.
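
    The cppyy usage pattern the abstract refers to -- exposing C++ types to Python through reflection so a tracing JIT can see through the binding layer -- looks roughly as follows. The toy C++ class is an arbitrary example, not the ROOT I/O layer discussed in the paper.

        # Minimal cppyy usage pattern: define a toy C++ class and drive it from a hot
        # Python loop. Under PyPy, the tracing JIT can inline these calls.
        import cppyy

        cppyy.cppdef("""
        class Accumulator {
            double total = 0.0;
        public:
            void add(double x) { total += x; }
            double sum() const { return total; }
        };
        """)

        acc = cppyy.gbl.Accumulator()
        for i in range(1000000):        # hot loop the JIT can optimize
            acc.add(0.5)
        print(acc.sum())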

  11. Multiprocessors and runtime compilation

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Berryman, Harry; Wu, Janet

    1990-01-01

    Runtime preprocessing plays a major role in many efficient algorithms in computer science, as well as in exploiting multiprocessor architectures. Examples are given that elucidate the importance of runtime preprocessing and show how these optimizations can be integrated into compilers. To support the arguments, transformations implemented in prototype multiprocessor compilers are described and benchmarks from the iPSC2/860, the CM-2, and the Encore Multimax/320 are presented.
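
    The kind of runtime preprocessing described here is often called the inspector-executor pattern: analyze data-dependent indices once at runtime, then reuse the precomputed schedule on every subsequent iteration. A minimal NumPy sketch of that split, with a made-up irregular access pattern:

        # Inspector-executor sketch: the "inspector" analyzes an irregular index set
        # once at runtime; the "executor" reuses that schedule every timestep.
        import numpy as np

        rng = np.random.default_rng(0)
        edges = rng.integers(0, 10000, size=(50000, 2))   # made-up irregular structure
        x = rng.random(10000)

        # Inspector: done once -- here it simply reorders edges by source for locality.
        order = np.argsort(edges[:, 0], kind="stable")
        src, dst = edges[order, 0], edges[order, 1]
        counts = np.maximum(np.bincount(dst, minlength=x.size), 1)

        # Executor: run many times with the precomputed schedule.
        for _ in range(100):
            contrib = x[src]                    # gather using the reordered indices
            y = np.zeros_like(x)
            np.add.at(y, dst, contrib)          # scatter-add along the same schedule
            x = 0.5 * x + 0.5 * y / counts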

  12. An Optimizing Compiler for Petascale I/O on Leadership Class Architectures

    SciTech Connect

    Choudhary, Alok; Kandemir, Mahmut

    2015-03-18

    In high-performance computing systems, parallel I/O architectures usually have very complex hierarchies with multiple layers that collectively constitute an I/O stack, including high-level I/O libraries such as PnetCDF and HDF5, I/O middleware such as MPI-IO, and parallel file systems such as PVFS and Lustre. Our project explored automated instrumentation and compiler support for I/O-intensive applications. Our project made significant progress towards understanding the complex I/O hierarchies of high-performance storage systems (including storage caches, HDDs, and SSDs), and designing and implementing state-of-the-art compiler/runtime system technology that targets I/O-intensive HPC applications on leadership-class machines. This final report summarizes the major achievements of the project and also points out promising future directions.

  13. Compiler optimizations as a countermeasure against side-channel analysis in MSP430-based devices.

    PubMed

    Malagón, Pedro; de Goyeneche, Juan-Mariano; Zapater, Marina; Moya, José M; Banković, Zorana

    2012-01-01

    Ambient Intelligence (AmI) requires devices everywhere: dynamic and massively distributed networks of low-cost nodes that, among other data, manage private information or control restricted operations. The MSP430, a 16-bit microcontroller, is used in WSN platforms such as the TelosB. Physical access to devices cannot be restricted, so attackers consider them a target of their malicious attacks in order to obtain access to the network. Side-channel analysis (SCA) easily exploits leakages from the execution of encryption algorithms that are dependent on critical data to guess the key value. In this paper we present an evaluation framework that facilitates the analysis of the effects of compiler and backend optimizations on the resistance against statistical SCA. We propose an optimization-based software countermeasure that can be used in current low-cost devices to radically increase resistance against statistical SCA, analyzed with the new framework. PMID:22969383

  14. Compiler Optimizations as a Countermeasure against Side-Channel Analysis in MSP430-Based Devices

    PubMed Central

    Malagón, Pedro; de Goyeneche, Juan-Mariano; Zapater, Marina; Moya, José M.; Banković, Zorana

    2012-01-01

    Ambient Intelligence (AmI) requires devices everywhere: dynamic and massively distributed networks of low-cost nodes that, among other data, manage private information or control restricted operations. The MSP430, a 16-bit microcontroller, is used in WSN platforms such as the TelosB. Physical access to devices cannot be restricted, so attackers consider them a target of their malicious attacks in order to obtain access to the network. Side-channel analysis (SCA) easily exploits leakages from the execution of encryption algorithms that are dependent on critical data to guess the key value. In this paper we present an evaluation framework that facilitates the analysis of the effects of compiler and backend optimizations on the resistance against statistical SCA. We propose an optimization-based software countermeasure that can be used in current low-cost devices to radically increase resistance against statistical SCA, analyzed with the new framework. PMID:22969383

  15. Compiler optimization technique for data cache prefetching using a small CAM array

    SciTech Connect

    Chi, C.H.

    1994-12-31

    With advances in compiler optimization and program flow analysis, software-assisted cache prefetching schemes using PREFETCH instructions are now possible. Although data can be prefetched accurately into the cache, the runtime overhead associated with these schemes often limits their practical use. In this paper, we propose a new scheme, called Strike-CAM Data Prefetching (SCP), to prefetch array references with constant strides accurately. Compared to current software-assisted data prefetching schemes, the SCP scheme has much lower runtime overhead without sacrificing prefetching accuracy. Our results show that the SCP scheme is particularly suitable for compute-intensive scientific applications where cache misses are mainly due to array references with constant strides, which the SCP scheme can prefetch very accurately.

  16. Recent advances in PC-Linux systems for electronic structure computations by optimized compilers and numerical libraries.

    PubMed

    Yu, Jen-Shiang K; Yu, Chin-Hui

    2002-01-01

    One of the most frequently used packages for electronic structure research, GAUSSIAN 98, is compiled on Linux systems with various hardware configurations, including AMD Athlon (with the "Thunderbird" core), AthlonMP, and AthlonXP (with the "Palomino" core) systems as well as Intel Pentium 4 (with the "Willamette" core) machines. The default PGI FORTRAN compiler (pgf77) and the Intel FORTRAN compiler (ifc) are respectively employed with different architectural optimization options to compile GAUSSIAN 98 and test the performance improvement. In addition to the BLAS library included in revision A.11 of this package, the Automatically Tuned Linear Algebra Software (ATLAS) library is linked against the binary executables to improve the performance. Various Hartree-Fock, density-functional theory, and MP2 calculations are done for benchmarking purposes. It is found that the combination of ifc with the ATLAS library gives the best performance for GAUSSIAN 98 on all of these PC-Linux computers, including AMD and Intel CPUs. Even on AMD systems, the Intel FORTRAN compiler invariably produces binaries with better performance than pgf77. The enhancement provided by the ATLAS library is more significant for post-Hartree-Fock calculations. The performance on one single CPU is potentially as good as that on an Alpha 21264A workstation or an SGI supercomputer. The floating-point marks by SpecFP2000 show trends similar to the GAUSSIAN 98 results. PMID:12086529

  17. Berkeley Unified Parallel C (UPC) Compiler

    Energy Science and Technology Software Center (ESTSC)

    2003-04-06

    This program is a portable, open-source compiler for the UPC language, which is based on the Open64 framework and has extensive support for optimizations. The compiler operates by translating UPC into ANSI/ISO C for compilation by a native compiler and linking with a UPC Runtime Library. This design eases portability to both shared and distributed memory parallel architectures. For proper operation the "Berkeley Unified Parallel C (UPC) Runtime Library" and its dependencies are required. Compatible replacements which implement "The Berkeley UPC Runtime Specification" are possible.

  18. Read buffer optimizations to support compiler-assisted multiple instruction retry

    NASA Technical Reports Server (NTRS)

    Alewine, N. J.; Fuchs, W. K.; Hwu, W. M.

    1993-01-01

    Multiple instruction retry is a recovery mechanism for transient processor faults. We previously developed a compiler-assisted approach to multiple instruction retry in which a read buffer of size 2N (where N represents the maximum instruction rollback distance) was used to resolve some data hazards while the compiler resolved the remaining hazards. The compiler-assisted scheme was shown to reduce the performance overhead and/or hardware complexity normally associated with hardware-only retry schemes. This paper examines the size and design of the read buffer. We establish a practical lower bound and average size requirement for the read buffer by modifying the scheme to save only the data required for rollback. The study measures the effect on the performance of a DECstation 3100 running ten application programs using six read buffer configurations with varying read buffer sizes. Two alternative configurations are shown to be the most efficient, and the choice between them depends on whether split-cycle saves are assumed. Up to a 55 percent read buffer size reduction is achievable, with an average reduction of 39 percent, given the most efficient read buffer configuration and a variety of applications.
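
    A toy software model of the read buffer idea -- keeping only the most recent values needed to roll back N instructions -- is sketched below. It is an illustration of the concept only, not the hardware design evaluated in the paper; the register names and rollback distance are made up.

        # Toy model of a rollback buffer: keep the last N saved values so the last N
        # instructions can be undone. Sizes and the "instruction" format are made up.
        from collections import deque

        class RollbackBuffer:
            def __init__(self, rollback_distance):
                self.saved = deque(maxlen=rollback_distance)   # only data needed for rollback

            def record(self, register, old_value):
                self.saved.append((register, old_value))

            def rollback(self, registers):
                while self.saved:
                    reg, old = self.saved.pop()                # undo in reverse order
                    registers[reg] = old

        registers = {"r1": 0, "r2": 0}
        buf = RollbackBuffer(rollback_distance=2)
        for value in [10, 20, 30]:                 # three "instructions" writing r1
            buf.record("r1", registers["r1"])
            registers["r1"] = value
        buf.rollback(registers)                    # restores the state of two instructions ago
        print(registers)                           # {'r1': 10, 'r2': 0}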

  19. Implementing the optimal provision of ecosystem services.

    PubMed

    Polasky, Stephen; Lewis, David J; Plantinga, Andrew J; Nelson, Erik

    2014-04-29

    Many ecosystem services are public goods whose provision depends on the spatial pattern of land use. The pattern of land use is often determined by the decisions of multiple private landowners. Increasing the provision of ecosystem services, though beneficial for society as a whole, may be costly to private landowners. A regulator interested in providing incentives to landowners for increased provision of ecosystem services often lacks complete information on landowners' costs. The combination of spatially dependent benefits and asymmetric cost information means that the optimal provision of ecosystem services cannot be achieved using standard regulatory or payment for ecosystem services approaches. Here we show that an auction that sets payments between landowners and the regulator for the increased value of ecosystem services with conservation provides incentives for landowners to truthfully reveal cost information, and allows the regulator to implement the optimal provision of ecosystem services, even in the case with spatially dependent benefits and asymmetric information. PMID:24722635

  20. Implementing the optimal provision of ecosystem services

    PubMed Central

    Polasky, Stephen; Lewis, David J.; Plantinga, Andrew J.; Nelson, Erik

    2014-01-01

    Many ecosystem services are public goods whose provision depends on the spatial pattern of land use. The pattern of land use is often determined by the decisions of multiple private landowners. Increasing the provision of ecosystem services, though beneficial for society as a whole, may be costly to private landowners. A regulator interested in providing incentives to landowners for increased provision of ecosystem services often lacks complete information on landowners’ costs. The combination of spatially dependent benefits and asymmetric cost information means that the optimal provision of ecosystem services cannot be achieved using standard regulatory or payment for ecosystem services approaches. Here we show that an auction that sets payments between landowners and the regulator for the increased value of ecosystem services with conservation provides incentives for landowners to truthfully reveal cost information, and allows the regulator to implement the optimal provision of ecosystem services, even in the case with spatially dependent benefits and asymmetric information. PMID:24722635

  1. The reference human nuclear mitochondrial sequences compilation validated and implemented on the UCSC genome browser

    PubMed Central

    2011-01-01

    Background Eukaryotic nuclear genomes contain fragments of mitochondrial DNA called NumtS (Nuclear mitochondrial Sequences), whose mode and time of insertion, as well as their functional/structural role within the genome are debated issues. Insertion sites match with chromosomal breaks, revealing that micro-deletions usually occurring at non-homologous end joining loci become reduced in presence of NumtS. Some NumtS are involved in recombination events leading to fragment duplication. Moreover, NumtS are polymorphic, a feature that renders them candidates as population markers. Finally, they are a cause of contamination during human mtDNA sequencing, leading to the generation of false heteroplasmies. Results Here we present RHNumtS.2, the most exhaustive human NumtSome catalogue annotating 585 NumtS, 97% of which were here validated in a European individual and in HapMap samples. The NumtS complete dataset and related features have been made available at the UCSC Genome Browser. The produced sequences have been submitted to INSDC databases. The implementation of the RHNumtS.2 tracks within the UCSC Genome Browser has been carried out with the aim to facilitate browsing of the NumtS tracks to be exploited in a wide range of research applications. Conclusions We aimed at providing the scientific community with the most exhaustive overview on the human NumtSome, a resource whose aim is to support several research applications, such as studies concerning human structural variation, diversity, and disease, as well as the detection of false heteroplasmic mtDNA variants. Upon implementation of the NumtS tracks, the application of the BLAT program on the UCSC Genome Browser has now become an additional tool to check for heteroplasmic artefacts, supported by data available through the NumtS tracks. PMID:22013967

  2. HOPE: Just-in-time Python compiler for astrophysical computations

    NASA Astrophysics Data System (ADS)

    Akeret, Joel; Gamper, Lukas; Amara, Adam; Refregier, Alexandre

    2014-11-01

    HOPE is a specialized Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C++ while performing numerical optimization on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. By using HOPE, the user benefits from being able to write common numerical code in Python while getting the performance of a compiled implementation.
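
    The decorator-based usage the abstract describes looks roughly like the sketch below. The decorator name hope.jit is assumed from the package documentation and the kernel is an arbitrary example, so treat both as assumptions rather than a verified recipe.

        # Usage pattern for a JIT-decorated numerical kernel. The decorator name
        # `hope.jit` is assumed from the package's documentation; the kernel itself
        # is an arbitrary example kept within the supported Python subset.
        import numpy as np
        import hope

        @hope.jit
        def smooth(x, out):
            # simple three-point moving average written as an explicit loop
            for i in range(1, x.shape[0] - 1):
                out[i] = (x[i - 1] + x[i] + x[i + 1]) / 3.0

        x = np.random.rand(10000)
        out = np.zeros(10000)
        smooth(x, out)        # first call triggers translation to C++ and compilation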

  3. Implementation and Performance Issues in Collaborative Optimization

    NASA Technical Reports Server (NTRS)

    Braun, Robert; Gage, Peter; Kroo, Ilan; Sobieski, Ian

    1996-01-01

    Collaborative optimization is a multidisciplinary design architecture that is well-suited to large-scale multidisciplinary optimization problems. This paper compares this approach with other architectures, examines the details of the formulation, and some aspects of its performance. A particular version of the architecture is proposed to better accommodate the occurrence of multiple feasible regions. The use of system level inequality constraints is shown to increase the convergence rate. A series of simple test problems, demonstrated to challenge related optimization architectures, is successfully solved with collaborative optimization.

  4. Feedback Implementation of Zermelo's Optimal Control by Sugeno Approximation

    NASA Technical Reports Server (NTRS)

    Clifton, C.; Homaifar, A.; Bikdash, M.

    1997-01-01

    This paper proposes an approach to implement optimal control laws of nonlinear systems in real time. Our methodology does not require solving two-point boundary value problems online and may not require it off-line either. The optimal control law is learned using the original Sugeno controller (OSC) from a family of optimal trajectories. We compare the trajectories generated by the OSC and the trajectories yielded by the optimal feedback control law when applied to Zermelo's ship steering problem.

  5. Praxis compiler internals

    SciTech Connect

    Evans, A. Jr.

    1981-01-01

    Praxis is a high level machine-oriented algebraic computer language, designed by Bolt Beranek and Newman, Inc. (BBN) and intended for such applications as process control, communications, and system programming in general. Under contract to Lawrence Livermore National Laboratory (LLNL), BBN has implemented the following three compilers for Praxis: a VAX compiler, running on VAX and producing VAX code; a PDP-11 compiler, running on the PDP-11 and producing code for that machine; and a cross compiler, running on VAX and producing code for the PDP-11. The compilers are written in Praxis and so compile themselves. Further, most of the code is common to the three compilers. This document describes the internal operation of the compilers. The emphasis is on the major data bases and interfaces, with little discussion of the details of algorithms, since the latter can readily be deduced from study of the listings provided that the data being manipulated are understood. The purpose of this document is to provide enough information to a maintenance staff that does not include the initial implementors so that they can maintain the compiler and make modifications as requirements change.

  6. Optimizing Cancer Care Delivery through Implementation Science.

    PubMed

    Adesoye, Taiwo; Greenberg, Caprice C; Neuman, Heather B

    2016-01-01

    The 2013 Institute of Medicine report investigating cancer care concluded that the cancer care delivery system is in crisis due to an increased demand for care, increasing complexity of treatment, decreasing work force, and rising costs. Engaging patients and incorporating evidence-based care into routine clinical practice are essential components of a high-quality cancer delivery system. However, a gap currently exists between the identification of beneficial research findings and the application in clinical practice. Implementation research strives to address this gap. In this review, we discuss key components of high-quality implementation research. We then apply these concepts to a current cancer care delivery challenge in women's health, specifically the implementation of a surgery decision aid for women newly diagnosed with breast cancer. PMID:26858933

  7. Optimizing Cancer Care Delivery through Implementation Science

    PubMed Central

    Adesoye, Taiwo; Greenberg, Caprice C.; Neuman, Heather B.

    2016-01-01

    The 2013 Institute of Medicine report investigating cancer care concluded that the cancer care delivery system is in crisis due to an increased demand for care, increasing complexity of treatment, decreasing work force, and rising costs. Engaging patients and incorporating evidence-based care into routine clinical practice are essential components of a high-quality cancer delivery system. However, a gap currently exists between the identification of beneficial research findings and the application in clinical practice. Implementation research strives to address this gap. In this review, we discuss key components of high-quality implementation research. We then apply these concepts to a current cancer care delivery challenge in women’s health, specifically the implementation of a surgery decision aid for women newly diagnosed with breast cancer. PMID:26858933

  8. Optimal Implementations for Reliable Circadian Clocks

    NASA Astrophysics Data System (ADS)

    Hasegawa, Yoshihiko; Arita, Masanori

    2014-09-01

    Circadian rhythms are acquired through evolution to increase the chances for survival through synchronizing with the daylight cycle. Reliable synchronization is realized through two trade-off properties: regularity to keep time precisely, and entrainability to synchronize the internal time with daylight. We find by using a phase model with multiple inputs that achieving the maximal limit of regularity and entrainability entails many inherent features of the circadian mechanism. At the molecular level, we demonstrate the role sharing of two light inputs, phase advance and delay, as is well observed in mammals. At the behavioral level, the optimal phase-response curve inevitably contains a dead zone, a time during which light pulses neither advance nor delay the clock. We reproduce the results of phase-controlling experiments entrained by two types of periodic light pulses. Our results indicate that circadian clocks are designed optimally for reliable clockwork through evolution.
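
    The kind of phase-oscillator model referred to here can be written as dphi/dt = omega + Z(phi) * I(t), where Z is the phase-response curve (PRC) and I the light input. The sketch below simulates such a model with an arbitrary PRC containing a dead zone, a phase-advance region, and a phase-delay region; it is an illustration only, not the optimized PRC derived in the paper.

        # Minimal phase-oscillator sketch: dphi/dt = omega + Z(phi) * light(t).
        # The PRC Z(phi) below is an arbitrary illustration with a "dead zone"
        # (no light response) during subjective day.
        import numpy as np

        omega = 2 * np.pi / 24.0                 # free-running period of 24 h

        def prc(phi):
            h = (phi / (2 * np.pi)) * 24.0 % 24.0
            if 4.0 <= h < 16.0:
                return 0.0                       # dead zone: light has no effect
            elif h < 4.0:
                return 0.05                      # phase-advance region
            else:
                return -0.05                     # phase-delay region

        def light(t):
            return 1.0 if (t % 24.0) < 12.0 else 0.0   # 12 h light / 12 h dark

        phi, dt = 0.0, 0.01
        for step in range(int(10 * 24 / dt)):    # simulate ten days
            t = step * dt
            phi += (omega + prc(phi) * light(t)) * dt
        print((phi / (2 * np.pi) * 24.0) % 24.0)  # internal time after entrainment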

  9. An Advanced Compiler Designed for a VLIW DSP for Sensors-Based Systems

    PubMed Central

    Yang, Xu; He, Hu

    2012-01-01

    The VLIW architecture can be exploited to greatly enhance instruction-level parallelism, providing advantages in computation power and energy efficiency that satisfy the requirements of future sensor-based systems. However, as VLIW code is mainly compiled statically, the performance of a VLIW processor is dominated by the behavior of its compiler. In this paper, we present an advanced compiler designed for a VLIW DSP named Magnolia, which will be used in sensor-based systems. This compiler is based on the Open64 compiler. We have implemented several advanced optimization techniques in the compiler and support the O3 optimization level. Benchmarks from the DSPstone test suite are used to verify the compiler. Results show that the code generated by our compiler can make the performance of Magnolia match that of the current state-of-the-art DSP processors. PMID:22666040

  10. Parallel optimization algorithms and their implementation in VLSI design

    NASA Technical Reports Server (NTRS)

    Lee, G.; Feeley, J. J.

    1991-01-01

    Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.

  11. A Training Package for Implementing the IEP Process in Wyoming. Volume IV. Compilation of Successful Training Strategies.

    ERIC Educational Resources Information Center

    Foxworth-Mott, Anita; Moore, Caroline

    Volume IV of a four volume series offers strategies for implementing effective inservice workshops to train administrators, assessment personnel, and others involved in the development and implementation of individualized education programs (IEPs) for handicapped children in Wyoming. Part 1 addresses points often overlooked in delivering training,…

  12. Optimization of an optically implemented on-board FDMA demultiplexer

    NASA Technical Reports Server (NTRS)

    Fargnoli, J.; Riddle, L.

    1991-01-01

    Performance of a 30 GHz frequency division multiple access (FDMA) uplink to a processing satellite is modelled for the case where the onboard demultiplexer is implemented optically. Included in the performance model are the effects of adjacent channel interference, intersymbol interference, and spurious signals associated with the optical implementation. Demultiplexer parameters are optimized to provide the minimum bit error probability at a given bandwidth efficiency when filtered QPSK modulation is employed.

  13. A systolic array parallelizing compiler

    SciTech Connect

    Tseng, P.S.

    1990-01-01

    This book presents a completely new approach to the problem of building a systolic array parallelizing compiler. It describes the AL parallelizing compiler for the Warp systolic array, the first working systolic array parallelizing compiler that can generate efficient parallel code for complete LINPACK routines. The book begins by analyzing the architectural strength of the Warp systolic array. It proposes a model for mapping programs onto the machine and introduces the notion of data relations for optimizing the program mapping. Also presented are successful applications of the AL compiler in matrix computation and image processing. A complete listing of the source program and compiler-generated parallel code is given to clarify the overall picture of the compiler. The book concludes that a systolic array parallelizing compiler can produce efficient parallel code, almost identical to what the user would have written by hand.

  14. The Specification of Source-to-source Transformations for the Compile-time Optimization of Parallel Object-oriented Scientific Applications

    SciTech Connect

    Quinlan, D; Kowarschik, M

    2001-06-05

    The performance of object-oriented applications in scientific computing often suffers from the inefficient use of high-level abstractions provided by underlying libraries. Since these library abstractions are not part of the programming language itself, there is no compiler mechanism to respect their semantics and thus to perform appropriate optimizations, e.g., array semantics within object-oriented array class libraries which permit parallel optimizations inconceivable to the serial compiler. We have presented the ROSE infrastructure as a tool for automatically generating library-specific preprocessors. These preprocessors can perform semantics-based source-to-source transformations of the application in order to introduce high-level code optimizations. In this paper we outline the design of ROSE and focus on the discussion of various approaches for specifying and processing complex source code transformations. These techniques are intended to be as easy and intuitive as possible for the ROSE users, i.e., for the designers of the library-specific preprocessors.
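
    On a much smaller scale, a library-specific source-to-source transformation of the kind ROSE generates can be mimicked with Python's ast module. In the sketch below, calls to a hypothetical slow_sum helper are rewritten into calls to a hypothetical fast_sum; both names are placeholders, and ROSE itself operates on C++ rather than Python.

        # Tiny source-to-source transformation in the spirit of a library-specific
        # preprocessor: rewrite calls to a hypothetical slow_sum() into fast_sum().
        import ast

        source = """
        def total(values):
            return slow_sum(values)
        """

        class ReplaceCall(ast.NodeTransformer):
            def visit_Call(self, node):
                self.generic_visit(node)
                if isinstance(node.func, ast.Name) and node.func.id == "slow_sum":
                    node.func = ast.copy_location(ast.Name("fast_sum", ast.Load()), node.func)
                return node

        tree = ReplaceCall().visit(ast.parse(source))
        ast.fix_missing_locations(tree)
        print(ast.unparse(tree))       # transformed source, ready to hand to the compiler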

  15. All-Optical Implementation of the Ant Colony Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I.; Soci, Cesare

    2016-05-01

    We report all-optical implementation of the optimization algorithm for the famous “ant colony” problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems.
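
    For reference, the conventional (software) form of the ant colony algorithm that the optical network emulates is sketched below on a small made-up graph: ants choose edges with probability proportional to pheromone, pheromone evaporates, and shorter tours deposit more pheromone.

        # Conventional ant colony optimization on a toy graph. Graph weights and
        # parameters are arbitrary examples, independent of the optical implementation.
        import random

        edges = {("A", "B"): 1.0, ("B", "D"): 1.0, ("A", "C"): 1.5, ("C", "D"): 2.5}
        pher = {e: 1.0 for e in edges}
        neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}

        def walk(start="A", goal="D"):
            node, path, length = start, [], 0.0
            while node != goal:
                choices = [(node, n) for n in neighbors[node]]
                weights = [pher[c] for c in choices]
                edge = random.choices(choices, weights=weights)[0]   # pheromone-biased choice
                path.append(edge)
                length += edges[edge]
                node = edge[1]
            return path, length

        for _ in range(200):                       # iterations of the colony
            path, length = walk()
            for e in pher:
                pher[e] *= 0.95                    # pheromone evaporation
            for e in path:
                pher[e] += 1.0 / length            # reinforcement inversely proportional to length

        print(max(pher, key=pher.get))             # edges on the short path dominate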

  16. All-Optical Implementation of the Ant Colony Optimization Algorithm.

    PubMed

    Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I; Soci, Cesare

    2016-01-01

    We report all-optical implementation of the optimization algorithm for the famous "ant colony" problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems. PMID:27222098

  17. All-Optical Implementation of the Ant Colony Optimization Algorithm

    PubMed Central

    Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I.; Soci, Cesare

    2016-01-01

    We report all-optical implementation of the optimization algorithm for the famous “ant colony” problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems. PMID:27222098

  18. Tour through the Praxis compiler

    SciTech Connect

    Morgan, C.R.

    1981-07-01

    Praxis is a high level computer language, designed by Bolt Beranek and Newman Inc. for Lawrence Livermore National Laboratory. It is intended for such applications as process control, communications, and systems programming. Praxis provides structured programming features, such as strong typing and data encapsulation, while providing expressibility and efficient code. Praxis is a modification of a language designed for the Defense Communications Agency. During that study we proposed a possible compiler implementation which has become the basic design for the Praxis compiler. This report updates that compiler design to reflect the current Praxis compiler. The structure of this report was inspired by a C language compiler description. The Praxis compiler is constructed to be easily portable to other source and object machines. It has been designed to provide a straightforward implementation of code generators and compilers for a broad class of machines, including register and stack machines. The design isolates most machine-dependent portions in the code generator phases of the compiler. In fact, most code generator strategy is contained within a set of tables that are generated by a partially machine-independent code generator tool. Currently, Praxis compilers exist for the DEC PDP-11 and VAX computers. A compiler for the DECSystem-20 is being developed.

  19. Algorithmic synthesis using Python compiler

    NASA Astrophysics Data System (ADS)

    Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej

    2015-09-01

    This paper presents a Python-to-VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translates it to VHDL. FPGAs combine many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. By using a higher level of abstraction and a high-level synthesis compiler, implementation time can be reduced. The compiler has been implemented using the Python language. This article describes the design, implementation, and results of the created tools.

  20. Implementing size-optimal discrete neural networks require analog circuitry

    SciTech Connect

    Beiu, V.

    1998-12-01

    This paper starts by overviewing results dealing with the approximation capabilities of neural networks, as well as bounds on the size of threshold gate circuits. Based on a constructive solution for Kolmogorov's superpositions, the authors show that implementing Boolean functions can be done using neurons having an identity transfer function. Because in this case the size of the network is minimized, it follows that size-optimal solutions for implementing Boolean functions can be obtained using analog circuitry. The paper ends with conclusions and several comments on the required precision.

  1. Implementation and Optimization of Image Processing Algorithms on Embedded GPU

    NASA Astrophysics Data System (ADS)

    Singhal, Nitin; Yoo, Jin Woo; Choi, Ho Yeol; Park, In Kyu

    In this paper, we analyze the key factors underlying the implementation, evaluation, and optimization of image processing and computer vision algorithms on embedded GPU using OpenGL ES 2.0 shader model. First, we present the characteristics of the embedded GPU and its inherent advantage when compared to embedded CPU. Additionally, we propose techniques to achieve increased performance with optimized shader design. To show the effectiveness of the proposed techniques, we employ cartoon-style non-photorealistic rendering (NPR), speeded-up robust feature (SURF) detection, and stereo matching as our example algorithms. Performance is evaluated in terms of the execution time and speed-up achieved in comparison with the implementation on embedded CPU.

  2. Analog and digital FPGA implementation of BRIN for optimization problems.

    PubMed

    Ng, H S; Lam, K P

    2003-01-01

    The binary relation inference network (BRIN) shows promise in obtaining the global optimal solution for optimization problems in a time that is independent of the problem size. However, the realization of this method depends on the implementation platform. We studied analog and digital FPGA implementation platforms. Analog implementation of BRIN for two different directed graph problems is studied. As transitive closure problems can transform to a special case of shortest path problems or a special case of maximum spanning tree problems, two different forms of BRIN are discussed. Their circuits using common analog integrated circuits are investigated. The BRIN solution for critical path problems is expressed and is implemented using the separated building block circuit and the combined building block circuit. As these circuits are different, the response times of these networks will be different. The advancement of field programmable gate arrays (FPGAs) in recent years, allowing millions of gates on a single chip and accompanied by high-level design tools, has allowed the implementation of very complex networks. With this exemption from manual circuit construction and the availability of an efficient design platform, the BRIN architecture could be built in a much more efficient way. Problems with bandwidth are removed by taking all previous external connections to the inside of the chip. By transforming BRIN to FPGA (Xilinx XC4010XL and XCV800 Virtex), we implement a synchronous network with computations in a finite number of steps. Two case studies are presented, with correct results verified from the simulation implementation. Resource consumption on FPGAs is studied, showing that Virtex devices are more suitable for the expansion of the network in future developments. PMID:18244587

  3. Implementation of generalized optimality criteria in a multidisciplinary environment

    NASA Technical Reports Server (NTRS)

    Canfield, R. A.; Venkayya, V. B.

    1989-01-01

    A generalized optimality criterion method consisting of a dual problem solver combined with a compound scaling algorithm was implemented in the multidisciplinary design tool, ASTROS. This method enables, for the first time in a production design tool, the determination of a minimum weight design using thousands of independent structural design variables while simultaneously considering constraints on response quantities in several disciplines. Even for moderately large examples, the computational efficiency is improved significantly relative to the conventional approach.

  4. HAL/S-FC compiler system specifications

    NASA Technical Reports Server (NTRS)

    1976-01-01

    This document specifies the informational interfaces within the HAL/S-FC compiler, and between the compiler and the external environment. This Compiler System Specification is for the HAL/S-FC compiler and its associated run time facilities which implement the full HAL/S language. The HAL/S-FC compiler is designed to operate stand-alone on any compatible IBM 360/370 computer and within the Software Development Laboratory (SDL) at NASA/JSC, Houston, Texas.

  5. Implementation of optimal phase-covariant cloning machines

    SciTech Connect

    Sciarrino, Fabio; De Martini, Francesco

    2007-07-15

    The optimal phase-covariant quantum cloning machine (PQCM) broadcasts the information associated to an input qubit into a multiqubit system, exploiting a partial a priori knowledge of the input state. This additional a priori information leads to a higher fidelity than for the universal cloning. The present article first analyzes different innovative schemes to implement the 1 → 3 PQCM. The method is then generalized to any 1 → M machine for an odd value of M by a theoretical approach based on the general angular momentum formalism. Finally different experimental schemes based either on linear or nonlinear methods and valid for single photon polarization encoded qubits are discussed.

  6. Barriers to Implementation of Optimal Laboratory Biosafety Practices in Pakistan.

    PubMed

    Shakoor, Sadia; Shafaq, Humaira; Hasan, Rumina; Qureshi, Shahida M; Dojki, Maqboola; Hughes, Molly A; Zaidi, Anita K M; Khan, Erum

    2016-01-01

    The primary goal of biosafety education is to ensure safe practices among workers in biomedical laboratories. Despite several educational workshops by the Pakistan Biological Safety Association (PBSA), compliance with safe practices among laboratory workers remains low. To determine barriers to implementation of recommended biosafety practices among biomedical laboratory workers in Pakistan, we conducted a questionnaire-based survey of participants attending 2 workshops focusing on biosafety practices in Karachi and Lahore in February 2015. Questionnaires were developed by modifying the BARRIERS scale in which respondents are required to rate barriers on a 1-4 scale. Nineteen of the original 29 barriers were included and subcategorized into 4 groups: awareness, material quality, presentation, and workplace barriers. Workshops were attended by 64 participants. Among barriers that were rated as moderate to great barriers by at least 50% of respondents were: lack of time to read biosafety guidelines (workplace subscale), lack of staff authorization to change/improve practice (workplace subscale), no career or self-improvement advantages to the staff for implementing optimal practices (workplace subscale), and unclear practice implications (presentation subscale). A lack of recognition for employees' rights and benefits in the workplace was found to be a predominant reason for a lack of compliance. Based on perceived barriers, substantial improvement in work environment, worker facilitation, and enabling are needed for achieving improved or optimal biosafety practices in Pakistan. PMID:27400192

  7. Optimal clinical implementation of the Siemens virtual wedge

    SciTech Connect

    Walker, C.P.; Richmond, N.D.; Lambert, G.D.

    2003-09-30

    Installation of a modern high-energy Siemens Primus linear accelerator at the Northern Centre for Cancer Treatment (NCCT) provided the opportunity to investigate the optimal clinical implementation of the Siemens virtual wedge filter. Previously published work has concentrated on the production of virtual wedge angles at 15°, 30°, 45°, and 60° as replacements for the Siemens hard wedges of the same nominal angles. However, treatment plan optimization of the dose distribution can be achieved with the Primus, as its control software permits the selection of any virtual wedge angle from 15° to 60° in increments of 1°. The same result can also be produced from a combination of open and 60° wedged fields. Helax-TMS models both of these modes of virtual wedge delivery by the wedge angle and the wedge fraction methods, respectively. This paper describes results of timing studies in the planning of optimized patient dose distributions by both methods and in the subsequent treatment delivery procedures. Employment of the wedge fraction method results in the delivery of small numbers of monitor units to the beam's central axis; therefore, wedge profile stability and delivered dose with low numbers of monitor units were also investigated. The wedge fraction was proven to be the most efficient method when the time taken for both planning and treatment delivery was taken into consideration, and is now used exclusively for virtual wedge treatment delivery in Newcastle. It has also been shown that there are no unfavorable dosimetric consequences from its practical implementation.
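
    For the wedge fraction method, an intermediate effective wedge angle is obtained by splitting the delivered dose between an open field and the fixed 60° wedged field. A commonly quoted approximation (assumed here, not taken from this paper) relates the wedge fraction f to the desired effective angle through tan(theta_eff) = f * tan(60°); the snippet below simply evaluates that relation.

        # Wedge-fraction estimate under the commonly quoted tangent approximation
        # tan(theta_eff) = f * tan(60 deg). This relation is an assumption for
        # illustration; clinical commissioning data take precedence.
        import math

        def wedge_fraction(effective_angle_deg, max_angle_deg=60.0):
            return math.tan(math.radians(effective_angle_deg)) / math.tan(math.radians(max_angle_deg))

        for angle in (15, 30, 45):
            print(angle, round(wedge_fraction(angle), 3))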

  8. Implementing stationary-phase optimized selectivity in supercritical fluid chromatography.

    PubMed

    Delahaye, Sander; Lynen, Frédéric

    2014-12-16

    The performance of stationary-phase optimized selectivity liquid chromatography (SOS-LC) for improved separation of complex mixtures has been demonstrated before. A dedicated kit containing column segments of different lengths and packed with different stationary phases is commercially available together with algorithms capable of predicting and ranking isocratic and gradient separations over vast amounts of possible column combinations. Implementation in chromatographic separations involving compressible fluids, as is the case in supercritical fluid chromatography, had thus far not been attempted. The challenge of this approach is the dependency of solute retention with the mobile-phase density, complicating linear extrapolation of retention over longer or shorter columns segments, as is the case in conventional SOS-LC. In this study, the possibilities of performing stationary-phase optimized selectivity supercritical fluid chromatography (SOS-SFC) are demonstrated with typical low density mobile phases (94% CO2). The procedure is optimized with the commercially available column kit and with the classical isocratic SOS-LC algorithm. SOS-SFC appears possible without any density correction, although optimal correspondence between prediction and experiment is obtained when isopycnic conditions are maintained. As also the influence of the segment order appears significantly less relevant than expected, the use of the approach in SFC appears as promising as is the case in HPLC. Next to the classical use of SOS for faster baseline separation of all solutes in a mixture, the benefits of the approach for predicting as wide as possible separation windows around to-be-purified solutes in semipreparative SFC are illustrated, leading to significant production rate improvements in (semi)preparative SFC. PMID:25393519
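
    The prediction step in stationary-phase optimized selectivity can be illustrated with the usual isocratic model (assumed here), in which the retention factor on a combined column is the length-weighted average of the retention factors measured on the individual segments. The sketch below ranks made-up segment combinations by the separation of their two closest peaks; all retention data and segment lengths are invented for illustration.

        # SOS-style prediction sketch: the retention factor on a combined column is
        # taken as the length-weighted mean of segment retention factors (the usual
        # isocratic model, assumed here). Retention data and lengths are made up.
        from itertools import combinations_with_replacement

        # measured retention factors k for three solutes on three stationary phases
        k = {"C18": [2.0, 2.2, 5.0], "CN": [1.1, 3.0, 3.2], "phenyl": [2.8, 2.9, 4.1]}
        segment_lengths = [1, 2, 4]            # available segment lengths (arbitrary units)

        def predicted_k(combo):
            total = sum(length for _, length in combo)
            return [sum(length * k[phase][i] for phase, length in combo) / total
                    for i in range(3)]

        best = None
        for phases in combinations_with_replacement(k, 3):
            combo = list(zip(phases, segment_lengths))
            ks = sorted(predicted_k(combo))
            worst_gap = min(b - a for a, b in zip(ks, ks[1:]))   # closest pair of peaks
            if best is None or worst_gap > best[0]:
                best = (worst_gap, combo)
        print(best)                            # combination with the best worst-case separation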

  9. Optimized evaporation technique for leachate treatment: Small scale implementation.

    PubMed

    Benyoucef, Fatima; Makan, Abdelhadi; El Ghmari, Abderrahman; Ouatmane, Aziz

    2016-04-01

    This paper introduces an optimized evaporation technique for leachate treatment. For this purpose, and in order to study the feasibility and measure the effectiveness of forced evaporation, three cuboidal steel tubs were designed and implemented. The first control tub was installed at ground level to monitor natural evaporation. Similarly, the second and third tubs, the models under investigation, were installed at ground level (equipped-tub 1) and above ground level (equipped-tub 2), respectively, and provided with special equipment to accelerate the evaporation process. The obtained results showed that the evaporation rate in the equipped tubs was much higher than in the control tub. It was accelerated five times in the winter period, when the evaporation rate increased from 0.37 mm/day to 1.50 mm/day. In the summer period, the evaporation rate was accelerated more than three times, increasing from 3.06 mm/day to 10.25 mm/day. Overall, the optimized evaporation technique can be applied effectively under either electric or solar energy supply, and will accelerate the evaporation rate three to five times regardless of the season temperature. PMID:26826455

  10. Designing a stencil compiler for the Connection Machine model CM-5

    SciTech Connect

    Brickner, R.G.; Holian, K.; Thiagarajan, B.; Johnsson, S.L.

    1994-12-31

    In this paper the authors present the design of a stencil compiler for the Connection Machine system CM-5. The stencil compiler will optimize the data motion between processing nodes, minimize the data motion within a node, and minimize the data motion between registers and local memory in a node. The compiler will natively support two-dimensional stencils, but stencils in three dimensions will be automatically decomposed. Lower dimensional stencils are treated as degenerate stencils. The compiler will be integrated as part of the CM Fortran programming system. Much of the compiler code will be adapted from the CM-2/200 stencil compiler, which is part of CMSSL (the Connection Machine Scientific Software Library) Release 3.1 for the CM-2/200, and the compiler will be available as part of the Connection Machine Scientific Software Library (CMSSL) for the CM-5. In addition to setting down design considerations, they report on the implementation status of the stencil compiler. In particular, they discuss optimization strategies and status of code conversion from CM-2/200 to CM-5 architecture, and report on the measured performance of prototype target code which the compiler will generate.
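
    For context, the reference (unoptimized) form of the kind of kernel a stencil compiler targets is shown below: a 5-point two-dimensional stencil whose weights and grid size are arbitrary. Restructuring exactly this data motion across nodes, within a node, and between registers and memory is what the compiler automates.

        # Reference (unoptimized) form of a 5-point 2D stencil -- the kind of kernel a
        # stencil compiler reorganizes to minimize data motion. Sizes and weights are
        # arbitrary examples.
        import numpy as np

        u = np.random.rand(256, 256)
        out = np.empty_like(u)

        for _ in range(10):                               # time steps
            out[1:-1, 1:-1] = (0.5   * u[1:-1, 1:-1] +
                               0.125 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                        u[1:-1, :-2] + u[1:-1, 2:]))
            # keep boundary values fixed
            out[0, :], out[-1, :], out[:, 0], out[:, -1] = u[0, :], u[-1, :], u[:, 0], u[:, -1]
            u, out = out, u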

  11. Optimal control of ICU patient discharge: from theory to implementation.

    PubMed

    Mallor, Fermín; Azcárate, Cristina; Barado, Julio

    2015-09-01

    This paper deals with the management of scarce health care resources. We consider a control problem in which the objective is to minimize the rate of patient rejection due to service saturation. The scope of decisions is limited, in terms both of the amount of resources to be used, which are supposed to be fixed, and of the patient arrival pattern, which is assumed to be uncontrollable. This means that the only potential areas of control are speed or completeness of service. By means of queuing theory and optimization techniques, we provide a theoretical solution expressed in terms of service rates. In order to make this theoretical analysis useful for the effective control of the healthcare system, however, further steps in the analysis of the solution are required: physicians need flexible and medically-meaningful operative rules for shortening patient length of service to the degree needed to give the service rates dictated by the theoretical analysis. The main contribution of this paper is to discuss how the theoretical solutions can be transformed into effective management rules to guide doctors' decisions. The study examines three types of rules based on intuitive interpretations of the theoretical solution. Rules are evaluated through implementation in a simulation model. We compare the service rates provided by the different policies with those dictated by the theoretical solution. Probabilistic analysis is also included to support rule validity. An Intensive Care Unit is used to illustrate this control problem. The study focuses on the Markovian case before moving on to consider more realistic LoS distributions (Weibull, Lognormal and Phase-type distribution). PMID:25763761
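
    In the Markovian case, the rejection rate being minimized is the Erlang loss probability of an M/M/c/c system. The sketch below evaluates it for an assumed arrival rate, bed count, and a range of mean stays, illustrating how a shorter length of stay (a higher service rate) lowers rejections; the numbers are illustrative assumptions, not data from the study.

        # Erlang-B (loss) probability for an ICU modeled as an M/M/c/c queue: the
        # fraction of arrivals rejected when all c beds are busy.
        def erlang_b(offered_load, servers):
            b = 1.0
            for k in range(1, servers + 1):
                b = offered_load * b / (k + offered_load * b)   # stable recurrence
            return b

        arrivals_per_day = 3.0
        beds = 12
        for mean_stay_days in (5.0, 4.5, 4.0):        # shorter stay = higher service rate
            load = arrivals_per_day * mean_stay_days  # offered load a = lambda / mu
            print(mean_stay_days, round(erlang_b(load, beds), 4))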

  12. NONMEM version III implementation on a VAX 9000: a DCL procedure for single-step execution and the unrealized advantage of a vectorizing FORTRAN compiler.

    PubMed

    Vielhaber, J P; Kuhlman, J V; Barrett, J S

    1993-06-01

    There is great interest within the FDA, academia, and the pharmaceutical industry to provide more detailed information about the time course of drug concentration and effect in subjects receiving a drug as part of their overall therapy. Advocates of this effort expect the eventual goal of these endeavors to provide labeling which reflects the experience of drug administration to the entire population of potential recipients. The set of techniques which have been thus far applied to this task has been defined as population approach methodologies. While a consensus view on the usefulness of these techniques is not likely to be formed in the near future, most pharmaceutical companies or individuals who provide kinetic/dynamic support for drug development programs are investigating population approach methods. A major setback in this investigation has been the shortage of computational tools to analyze population data. One such algorithm, NONMEM, supplied by the NONMEM Project Group of the University of California, San Francisco has been widely used and remains the most accessible computational tool to date. The program is distributed to users as FORTRAN 77 source code with instructions for platform customization. Given the memory and compiler requirements of this algorithm and the intensive matrix manipulation required for run convergence and parameter estimation, this program's performance is largely determined by the platform and the FORTRAN compiler used to create the NONMEM executable. Benchmark testing on a VAX 9000 with Digital's FORTRAN (v. 1.2) compiler suggests that this is an acceptable platform. Due to excessive branching within the loops of the NONMEM source code, the vector processing capabilities of the KV900-AA vector processor actually decrease performance. A DCL procedure is given to provide single step execution of this algorithm. PMID:8370277

  13. Compiler-assisted static checkpoint insertion

    NASA Technical Reports Server (NTRS)

    Long, Junsheng; Fuchs, W. K.; Abraham, Jacob A.

    1992-01-01

    This paper describes a compiler-assisted approach for static checkpoint insertion. Instead of fixing the checkpoint locations before program execution, a compiler-enhanced polling mechanism is utilized to maintain both the desired checkpoint intervals and reproducible checkpoint locations. The technique has been implemented in the GNU CC compiler for Sun 3 and Sun 4 (Sparc) processors. Experiments demonstrate that the approach provides stable checkpoint intervals and reproducible checkpoint placements with performance overhead comparable to a previously presented compiler-assisted dynamic scheme (CATCH) that utilizes the system clock.
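
    The polling idea can be sketched in a few lines; the fragment below is a hypothetical Python analogue, not the GNU CC instrumentation, and the interval, file name, and state layout are assumptions. The compiler would insert the poll call at fixed program points such as loop back-edges, which is what keeps checkpoint locations reproducible even though the interval check consults the clock.

      import pickle, time

      CHECKPOINT_INTERVAL = 5.0          # seconds (hypothetical)
      _last_checkpoint = time.monotonic()

      def poll_checkpoint(state):
          # Poll inserted at a fixed program point: checkpoint only when the
          # desired interval has elapsed since the previous checkpoint.
          global _last_checkpoint
          if time.monotonic() - _last_checkpoint >= CHECKPOINT_INTERVAL:
              with open("checkpoint.pkl", "wb") as f:
                  pickle.dump(state, f)
              _last_checkpoint = time.monotonic()

      state = {"i": 0, "acc": 0.0}
      while state["i"] < 10_000_000:     # main loop with a poll at the top
          poll_checkpoint(state)
          state["acc"] += state["i"] * 1e-9
          state["i"] += 1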

  14. Testing-Based Compiler Validation for Synchronous Languages

    NASA Technical Reports Server (NTRS)

    Garoche, Pierre-Loic; Howar, Falk; Kahsai, Temesghen; Thirioux, Xavier

    2014-01-01

    In this paper we present a novel lightweight approach to validate compilers for synchronous languages. Instead of verifying a compiler for all input programs or providing a fixed suite of regression tests, we extend the compiler to generate a test-suite with high behavioral coverage and geared towards discovery of faults for every compiled artifact. We have implemented and evaluated our approach using a compiler from Lustre to C.

  15. Implementation and optimization of portable standard LISP for the Cray

    SciTech Connect

    Anderson, J.W.; Kessler, R.R.; Galway, W.F.

    1987-01-01

    Portable Standard LISP (PSL), a dialect of LISP developed at the University of Utah, has been implemented on the CRAY-1s and CRAY X-MPs at the Los Alamos National Laboratory and at the National Magnetic Fusion Energy Computer Center at Lawrence Livermore National Laboratory. This implementation was developed using a highly portable model and then tuned for the Cray architecture. The speed of the resulting system is quite impressive, and the environment is very good for symbolic processing. 5 refs., 6 tabs.

  16. Spacelab user implementation assessment study. Volume 2: Concept optimization

    NASA Technical Reports Server (NTRS)

    1975-01-01

    The integration and checkout activities of Spacelab payloads consist of two major sets of tasks: support functions, and test and operations. The support functions are defined, and the optimized approach for accomplishing them is delineated. Comparable data are presented for the test and operations activities.

  17. Optimal q-Markov COVER for finite precision implementation

    NASA Technical Reports Server (NTRS)

    Williamson, Darrell; Skelton, Robert E.

    1989-01-01

    The existing q-Markov COVER realization theory does not take into account the problems of arithmetic errors due to both the quantization of states and coefficients of the reduced order model. All q-Markov COVERs allow some freedom in the choice of parameters. Here, researchers exploit this freedom in the existing theory to optimize the models with respect to these finite wordlength effects.

  18. Implementing quantum gates by optimal control with doubly exponential convergence.

    PubMed

    de Fouquieres, Pierre

    2012-03-16

    We introduce a novel algorithm for the task of coherently controlling a quantum mechanical system to implement any chosen unitary dynamics. It performs faster than existing state of the art methods by 1 to 3 orders of magnitude (depending on which one we compare to), particularly for quantum information processing purposes. This substantially enhances the ability to both study the control capabilities of physical systems within their coherence times, and constrain solutions for control tasks to lie within experimentally feasible regions. Natural extensions of the algorithm are also discussed. PMID:22540447

  19. Implementing size-optimal discrete neural networks requires analog circuitry

    SciTech Connect

    Beiu, V.

    1998-03-01

    Neural networks (NNs) have been experimentally shown to be quite effective in many applications. This success has led researchers to undertake a rigorous analysis of the mathematical properties that enable them to perform so well. It has generated two directions of research: (i) to find existence/constructive proofs for what is now known as the universal approximation problem; (ii) to find tight bounds on the size needed by the approximation problem (or some particular cases). The paper will focus on both aspects, for the particular case when the functions to be implemented are Boolean.

  20. A controller based on Optimal Type-2 Fuzzy Logic: systematic design, optimization and real-time implementation.

    PubMed

    Fayek, H M; Elamvazuthi, I; Perumal, N; Venkatesh, B

    2014-09-01

    A computationally efficient systematic procedure to design an Optimal Type-2 Fuzzy Logic Controller (OT2FLC) is proposed. The main scheme is to optimize the gains of the controller using Particle Swarm Optimization (PSO), then optimize only two parameters per type-2 membership function using a Genetic Algorithm (GA). The proposed OT2FLC was implemented in real time to control the position of a DC servomotor, which is part of a robotic arm. Performance was judged on the Integral Absolute Error (IAE) as well as the computational cost. Various type-2 defuzzification methods were investigated in real time. A comparative analysis with an Optimal Type-1 Fuzzy Logic Controller (OT1FLC) and a PI controller demonstrated OT2FLC's superiority, which is evident in handling the uncertainty and imprecision induced in the system by noise and disturbances. PMID:24962934

  1. Economic Implementation and Optimization of Secondary Oil Recovery

    SciTech Connect

    Cary D. Brock

    2006-01-09

    The St Mary West Barker Sand Unit (SMWBSU or Unit), located in Lafayette County, Arkansas, was unitized for secondary recovery operations in 2002, followed by installation of a pilot injection system in the fall of 2003. A second downdip water injection well was added to the pilot project in 2005, and 450,000 barrels of saltwater have been injected into the reservoir sand to date. Daily injection rates have been improved over initial volumes by hydraulic fracture stimulation of the reservoir sand in the injection wells. Modifications to the injection facilities are currently being designed to increase water injection rates for the pilot flood. A fracture treatment on one of the production wells resulted in a seven-fold increase in oil production. Recent water production and increased oil production in the producer closest to the pilot project indicate a possible response to the water injection. The reservoir and wellbore injection performance data obtained during the pilot project will be important to the secondary recovery optimization study for which the DOE grant was awarded. The reservoir characterization portion of the modeling and simulation study is being carried out by Strand Energy project staff under the guidance of University of Houston Department of Geosciences professor Dr. Janok Bhattacharya and University of Texas at Austin Department of Petroleum and Geosystems Engineering professor Dr. Larry W. Lake. A geologic and petrophysical model of the reservoir is being constructed from geophysical data acquired from core, well log and production performance histories. Possible use of an outcrop analog to aid in three-dimensional, geostatistical distribution of the flow unit model developed from the wellbore data will be investigated. The reservoir model will be used for full-field history matching and subsequent fluid flow simulation based on various injection schemes including patterned water flooding, addition of alkaline surfactant-polymer (ASP) to the injected water

  2. Livermore Compiler Analysis Loop Suite

    SciTech Connect

    Hornung, R. D.

    2013-03-01

    LCALS is designed to evaluate compiler optimizations and performance of a variety of loop kernels and loop traversal software constructs. Some of the loop kernels are pulled directly from "Livermore Loops Coded in C", developed at LLNL (see item 11 below for details of earlier code versions). The older suites were used to evaluate the floating-point performance of hardware platforms prior to porting larger application codes. The LCALS suite is geared toward assessing C++ compiler optimizations and platform performance related to SIMD vectorization, OpenMP threading, and advanced C++ language features. LCALS contains 20 of 24 loop kernels from the older Livermore Loop suites, plus various others representative of loops found in current production application codes at LLNL. The latter loops emphasize more diverse loop constructs and data access patterns than the others, such as multi-dimensional difference stencils. The loops are included in a configurable framework, which allows control of compilation, loop sampling for execution timing, and which loops are run and at what lengths. It generates timing statistics for analyzing and comparing variants of individual loops. Also, it is easy to add loops to the suite as desired.

  3. Development and implementation of a rail current optimization program

    SciTech Connect

    King, T.L.; Dharamshi, R.; Kim, K.; Zhang, J.; Tompkins, M.W.; Anderson, M.A.; Feng, Q.

    1997-01-01

    Efforts are underway to automate the operation of a railgun hydrogen pellet injector for fusion reactor refueling. A plasma armature is employed to avoid the friction produced by a sliding metal armature and, in particular, to prevent high-Z impurities from entering the tokamak. High currents are used to achieve high accelerations, resulting in high plasma temperatures. Consequently, the plasma armature ablates and accumulates material from the pellet and gun barrel. This increases inertial and viscous drag, lowering acceleration. A railgun model has been developed to compute the acceleration in the presence of these losses. In order to quantify these losses, the ablation coefficient, {alpha}, and drag coefficient, C{sub d}, must be determined. These coefficients are estimated based on the pellet acceleration. The sensitivity of acceleration to {alpha} and C{sub d} has been calculated using the model. Once {alpha} and C{sub d} have been determined, their values are applied to the model to compute the appropriate current pulse width. An optimization program was written in LabVIEW software to carry out this procedure. This program was then integrated into the existing code used to operate the railgun system. Preliminary results obtained after test firing the gun indicate that the program computes reasonable values for {alpha} and C{sub d} and calculates realistic pulse widths.

  4. Implementation of Emission Trading in Carbon Dioxide Sequestration Optimization Management

    NASA Astrophysics Data System (ADS)

    Zhang, X.; Duncan, I.

    2013-12-01

    As an effective mid- and long-term solution for large-scale mitigation of industrial CO2 emissions, CO2 capture and sequestration (CCS) has received increasing attention in recent decades. A general CCS management system has the complex characteristics of multiple emission sources, multiple mitigation technologies, multiple sequestration sites, and multiple project periods. Trade-offs exist among numerous environmental, economic, political, and technical factors, leading to varied system features. Sound decision alternatives are thus desired to provide decision support for decision makers and managers in managing such a CCS system from capture through the final geologic storage phase. Carbon emission trading has been developed as a cost-effective tool for reducing global greenhouse gas emissions. In this study, a carbon capture and sequestration optimization management model is proposed to address these issues. Carbon emission trading is integrated into the model, and its impact on the resulting management decisions is analyzed. A multi-source, multi-period case study is provided to demonstrate the applicability of the modeling approach, in which uncertainties in the modeling parameters are also addressed.

  5. Evaluation of a Multicore-Optimized Implementation for Tomographic Reconstruction

    PubMed Central

    Agulleiro, Jose-Ignacio; Fernández, José Jesús

    2012-01-01

    Tomography allows elucidation of the three-dimensional structure of an object from a set of projection images. In life sciences, electron microscope tomography is providing invaluable information about the cell structure at a resolution of a few nanometres. Here, large images are required to combine wide fields of view with high resolution requirements. The computational complexity of the algorithms along with the large image size then turns tomographic reconstruction into a computationally demanding problem. Traditionally, high-performance computing techniques have been applied to cope with such demands on supercomputers, distributed systems and computer clusters. In the last few years, the trend has turned towards graphics processing units (GPUs). Here we present a detailed description and a thorough evaluation of an alternative approach that relies on exploitation of the power available in modern multicore computers. The combination of single-core code optimization, vector processing, multithreading and efficient disk I/O operations succeeds in providing fast tomographic reconstructions on standard computers. The approach turns out to be competitive with the fastest GPU-based solutions thus far. PMID:23139768

  6. Array-Pattern-Match Compiler for Opportunistic Data Analysis

    NASA Technical Reports Server (NTRS)

    James, Mark

    2006-01-01

    A computer program has been written to facilitate real-time sifting of scientific data as they are acquired to find data patterns deemed to warrant further analysis. The patterns in question are of a type denoted array patterns, which are specified by nested parenthetical expressions. [One example of an array pattern is ((>3) 0 (not=1)): this pattern matches a vector of at least three elements, the first of which exceeds 3, the second of which is 0, and the third of which does not equal 1.] The program accepts a high-level description of a static array pattern and compiles it into a compact, highly optimized program that determines whether any given data array matches that pattern. The compiler implemented by this program is independent of the target language, so that as new languages are used to write code that processes scientific data, they can easily be supported. The program runs on a variety of different computing platforms. It must be run in conjunction with any one of a number of Lisp compilers that are available commercially or as shareware.
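
    A toy version of the idea, with a made-up Python pattern syntax standing in for the nested parenthetical notation, is sketched below: the pattern is compiled once into a list of element tests, and the resulting matcher is then applied to incoming arrays.

      def compile_pattern(pattern):
          # Compile a pattern such as [('>', 3), 0, ('not=', 1)] into a
          # predicate over sequences (hypothetical syntax, Python target).
          tests = []
          for elem in pattern:
              if isinstance(elem, tuple):
                  op, ref = elem
                  if op == '>':
                      tests.append(lambda x, r=ref: x > r)
                  elif op == '<':
                      tests.append(lambda x, r=ref: x < r)
                  elif op == 'not=':
                      tests.append(lambda x, r=ref: x != r)
                  else:
                      raise ValueError(f"unknown operator {op!r}")
              else:
                  tests.append(lambda x, r=elem: x == r)

          def matcher(array):
              # the array must have at least as many elements as the pattern
              return len(array) >= len(tests) and all(
                  t(x) for t, x in zip(tests, array))

          return matcher

      match = compile_pattern([('>', 3), 0, ('not=', 1)])
      print(match([5, 0, 2]), match([5, 0, 1]))   # True False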

  7. An Extensible Open-Source Compiler Infrastructure for Testing

    SciTech Connect

    Quinlan, D; Ur, S; Vuduc, R

    2005-12-09

    Testing forms a critical part of the development process for large-scale software, and there is growing need for automated tools that can read, represent, analyze, and transform the application's source code to help carry out testing tasks. However, the support required to compile applications written in common general purpose languages is generally inaccessible to the testing research community. In this paper, we report on an extensible, open-source compiler infrastructure called ROSE, which is currently in development at Lawrence Livermore National Laboratory. ROSE specifically targets developers who wish to build source-based tools that implement customized analyses and optimizations for large-scale C, C++, and Fortran90 scientific computing applications (on the order of a million lines of code or more). However, much of this infrastructure can also be used to address problems in testing, and ROSE is by design broadly accessible to those without a formal compiler background. This paper details the interactions between testing of applications and the ways in which compiler technology can aid in the understanding of those applications. We emphasize the particular aspects of ROSE, such as support for the general analysis of whole programs, that are particularly well-suited to the testing research community and the scale of the problems that community solves.

  8. Analytical techniques: A compilation

    NASA Technical Reports Server (NTRS)

    1975-01-01

    A compilation, containing articles on a number of analytical techniques for quality control engineers and laboratory workers, is presented. Data cover techniques for testing electronic, mechanical, and optical systems, nondestructive testing techniques, and gas analysis techniques.

  9. An Approach for Dynamic Optimization of Prevention Program Implementation in Stochastic Environments

    NASA Astrophysics Data System (ADS)

    Kang, Yuncheol; Prabhu, Vittal

    The science of preventing youth problems has advanced significantly in developing evidence-based prevention programs (EBPs) through randomized clinical trials. Effective EBPs can reduce delinquency, aggression, violence, bullying and substance abuse among youth. Unfortunately, the outcomes of EBPs implemented in natural settings usually tend to be poorer than in clinical trials, which has motivated the need to study EBP implementations. In this paper we propose to model EBP implementations in natural settings as stochastic dynamic processes. Specifically, we propose a Markov Decision Process (MDP) for modeling and dynamic optimization of such EBP implementations. We illustrate these concepts using simple numerical examples and discuss potential challenges in using such approaches in practice.
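
    To make the MDP idea concrete, the toy sketch below runs value iteration on an invented three-state implementation-quality model with two actions; every number in it is illustrative and not drawn from the paper.

      import numpy as np

      # P[a, s, s'] transition probabilities; R[a, s] expected rewards.
      # States: low/medium/high implementation quality; actions: monitor, coach.
      P = np.array([
          [[0.7, 0.3, 0.0], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]],   # monitor
          [[0.5, 0.5, 0.0], [0.1, 0.5, 0.4], [0.0, 0.2, 0.8]],   # coach
      ])
      R = np.array([[0.0, 1.0, 2.0],
                    [-0.5, 0.5, 1.5]])    # coaching carries a cost
      gamma = 0.95

      V = np.zeros(3)
      for _ in range(1000):               # value iteration to a fixed point
          Q = R + gamma * (P @ V)         # Q[a, s]
          V_new = Q.max(axis=0)
          if np.max(np.abs(V_new - V)) < 1e-9:
              V = V_new
              break
          V = V_new
      policy = Q.argmax(axis=0)           # best action in each state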

  10. HAL/S-FC and HAL/S-360 compiler system program description

    NASA Technical Reports Server (NTRS)

    1976-01-01

    The compiler is a large multi-phase design and can be broken into four phases: Phase 1 inputs the source language and does a syntactic and semantic analysis generating the source listing, a file of instructions in an internal format (HALMAT) and a collection of tables to be used in subsequent phases. Phase 1.5 massages the code produced by Phase 1, performing machine independent optimization. Phase 2 inputs the HALMAT produced by Phase 1 and outputs machine language object modules in a form suitable for the OS-360 or FCOS linkage editor. Phase 3 produces the SDF tables. The four phases described are written in XPL, a language specifically designed for compiler implementation. In addition to the compiler, there is a large library containing all the routines that can be explicitly called by the source language programmer plus a large collection of routines for implementing various facilities of the language.

  11. Livermore Compiler Analysis Loop Suite

    Energy Science and Technology Software Center (ESTSC)

    2013-03-01

    LCALS is designed to evaluate compiler optimizations and performance of a variety of loop kernels and loop traversal software constructs. Some of the loop kernels are pulled directly from "Livermore Loops Coded in C", developed at LLNL (see item 11 below for details of earlier code versions). The older suites were used to evaluate the floating-point performance of hardware platforms prior to porting larger application codes. The LCALS suite is geared toward assessing C++ compiler optimizations and platform performance related to SIMD vectorization, OpenMP threading, and advanced C++ language features. LCALS contains 20 of 24 loop kernels from the older Livermore Loop suites, plus various others representative of loops found in current production application codes at LLNL. The latter loops emphasize more diverse loop constructs and data access patterns than the others, such as multi-dimensional difference stencils. The loops are included in a configurable framework, which allows control of compilation, loop sampling for execution timing, and which loops are run and at what lengths. It generates timing statistics for analyzing and comparing variants of individual loops. Also, it is easy to add loops to the suite as desired.

  12. Further optimization of SeDDaRA blind image deconvolution algorithm and its DSP implementation

    NASA Astrophysics Data System (ADS)

    Wen, Bo; Zhang, Qiheng; Zhang, Jianlin

    2011-11-01

    An efficient algorithm for blind image deconvolution, together with a high-speed implementation, is of great practical value. A further optimization of SeDDaRA is developed, spanning algorithm structure and numerical calculation methods. The main optimizations are modularizing the structure for implementation feasibility, reducing the computation and data dependency of the 2D FFT/IFFT, and accelerating the power operation with a segmented look-up table. The resulting Fast SeDDaRA is proposed and specialized for low complexity. As the final implementation, a hardware image-restoration system is built using multi-DSP parallel processing. Experimental results show that the processing time and memory demand of Fast SeDDaRA decrease by at least 50%, and the data throughput of the image restoration system exceeds 7.8 Msps. The optimization is shown to be efficient and feasible, and Fast SeDDaRA is able to support real-time applications.

  13. Selected photographic techniques, a compilation

    NASA Technical Reports Server (NTRS)

    1971-01-01

    A selection has been made of methods, devices, and techniques developed in the field of photography during implementation of space and nuclear research projects. These items include many adaptations, variations, and modifications to standard hardware and practice, and should prove interesting to both amateur and professional photographers and photographic technicians. This compilation is divided into two sections. The first section presents techniques and devices that have been found useful in making photolab work simpler, more productive, and higher in quality. Section two deals with modifications to and special applications for existing photographic equipment.

  14. Optical implementations of the optimal phase-covariant quantum cloning machine

    SciTech Connect

    Fiurasek, Jaromir

    2003-05-01

    We propose two simple implementations of the optimal symmetric 1{yields}2 phase-covariant cloning machine for qubits. The first scheme is designed for qubits encoded into polarization states of photons and it involves a mixing of two photons on an unbalanced beam splitter. This scheme is probabilistic and the cloning succeeds with the probability 1/3. In the second setup, the qubits are represented by the states of Rydberg atoms and the cloning is accomplished by the resonant interaction of the atoms with a microwave field confined in a high-Q cavity. This latter approach allows for deterministic implementation of the optimal cloning transformation.

  15. An implementation of particle swarm optimization to evaluate optimal under-voltage load shedding in competitive electricity markets

    NASA Astrophysics Data System (ADS)

    Hosseini-Bioki, M. M.; Rashidinejad, M.; Abdollahi, A.

    2013-11-01

    Load shedding is a crucial issue in power systems, especially in a restructured electricity environment. Market-driven load shedding in reregulated power systems, considering both security and reliability, is investigated in this paper. A technoeconomic multi-objective function is introduced to derive an optimal load shedding scheme that maximizes social welfare. The proposed optimization problem includes maximizing GENCOs' and loads' profits as well as the loadability limit under normal and contingency conditions. Particle swarm optimization (PSO), a heuristic optimization technique, is utilized to find an optimal load shedding scheme. In a market-driven structure, generators offer their bidding blocks while dispatchable loads bid their price-responsive demands. An independent system operator (ISO) derives a market clearing price (MCP) while rescheduling the amount of generating power in both pre-contingency and post-contingency conditions. The proposed methodology is developed on a 3-bus system and then applied to a modified IEEE 30-bus test system. The obtained results show the effectiveness of the proposed methodology in implementing optimal load shedding that satisfies social welfare while maintaining the voltage stability margin (VSM), as demonstrated through technoeconomic analyses.
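
    The sketch below shows a minimal global-best PSO of the kind the paper employs, run on a stand-in objective (a simple sphere function); the real objective is the techno-economic welfare function with network and stability constraints described above, so everything here is illustrative only.

      import numpy as np

      def pso(objective, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0), seed=0):
          # Minimal global-best particle swarm optimizer (minimization).
          rng = np.random.default_rng(seed)
          lo, hi = bounds
          x = rng.uniform(lo, hi, (n_particles, dim))      # positions
          v = np.zeros_like(x)                             # velocities
          pbest = x.copy()
          pbest_val = np.array([objective(p) for p in x])
          g = pbest[pbest_val.argmin()].copy()             # global best
          w, c1, c2 = 0.7, 1.5, 1.5                        # inertia, cognitive, social
          for _ in range(iters):
              r1, r2 = rng.random(x.shape), rng.random(x.shape)
              v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
              x = np.clip(x + v, lo, hi)
              val = np.array([objective(p) for p in x])
              better = val < pbest_val
              pbest[better], pbest_val[better] = x[better], val[better]
              g = pbest[pbest_val.argmin()].copy()
          return g, pbest_val.min()

      best, best_val = pso(lambda z: float(np.sum(z ** 2)), dim=4)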

  16. Real-time implementation of optimized maximum noise fraction transform for feature extraction of hyperspectral images

    NASA Astrophysics Data System (ADS)

    Wu, Yuanfeng; Gao, Lianru; Zhang, Bing; Zhao, Haina; Li, Jun

    2014-01-01

    We present a parallel implementation of the optimized maximum noise fraction (G-OMNF) transform algorithm for feature extraction of hyperspectral images on commodity graphics processing units (GPUs). The proposed approach exploits the algorithm's data-level concurrency and optimizes the computing flow. We first defined a three-dimensional grid in which each thread calculates a sub-block of data, to easily facilitate the spatial and spectral neighborhood data searches in noise estimation, which is one of the most important steps involved in OMNF. Then, we optimized the processing flow and computed the noise covariance matrix before computing the image covariance matrix to reduce the original hyperspectral image data transmission. These optimization strategies can greatly improve the computing efficiency and can be applied to other feature extraction algorithms. The proposed parallel feature extraction algorithm was implemented on an Nvidia Tesla GPU using the compute unified device architecture and the basic linear algebra subroutines library. Through experiments on several real hyperspectral images, our GPU parallel implementation provides a significant speedup of the algorithm compared with the CPU implementation, especially for highly data-parallelizable and arithmetically intensive algorithm parts, such as noise estimation. To further evaluate the effectiveness of G-OMNF, we used two different applications, spectral unmixing and classification. Considering the sensor scanning rate and the data acquisition time, the proposed parallel implementation met the on-board real-time feature extraction requirement.

  17. An optimized implementation of a fault-tolerant clock synchronization circuit

    NASA Technical Reports Server (NTRS)

    Torres-Pomales, Wilfredo

    1995-01-01

    A fault-tolerant clock synchronization circuit was designed and tested. A comparison to a previous design and the procedure followed to achieve the current optimization are included. The report also includes a description of the system and the results of tests performed to study the synchronization and fault-tolerant characteristics of the implementation.

  18. Metallurgical processing: A compilation

    NASA Technical Reports Server (NTRS)

    1973-01-01

    The items in this compilation, all relating to metallurgical processing, are presented in two sections. The first section includes processes which are general in scope and applicable to a variety of metals or alloys. The second describes the processes that concern specific metals and their alloys.

  19. Compilation of non-contemporaneous constraints

    SciTech Connect

    Wray, R.E. III; Laird, J.E.; Jones, R.M.

    1996-12-31

    Hierarchical execution of domain knowledge is a useful approach for intelligent, real-time systems in complex domains. In addition, well-known techniques for knowledge compilation allow the reorganization of knowledge hierarchies into more efficient forms. However, these techniques have been developed in the context of systems that work in static domains. Our investigations indicate that it is not straightforward to apply knowledge compilation methods for hierarchical knowledge to systems that generate behavior in dynamic environments. One particular problem involves the compilation of non-contemporaneous constraints. This problem arises when a training instance dynamically changes during execution. After defining the problem, we analyze several theoretical approaches that address non-contemporaneous constraints. We have implemented the most promising of these alternatives within Soar, a software architecture for performance and learning. Our results demonstrate that the proposed solutions eliminate the problem in some situations and suggest that knowledge compilation methods are appropriate for interactive environments.

  20. The fault-tree compiler

    NASA Technical Reports Server (NTRS)

    Martensen, Anna L.; Butler, Ricky W.

    1987-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top-event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer precise to five digits (within the limits of double-precision floating-point arithmetic). The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation VAX with the VMS operating system.
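
    For statistically independent basic events, the gate probability rules behind such a tool are simple to state; the sketch below combines the AND, OR, EXCLUSIVE OR, INVERT, and M-of-N gate types for a small made-up tree, without the hierarchy, sensitivity sweeps, or precision control of the actual program.

      from functools import reduce
      from itertools import combinations

      def p_and(ps):              # all inputs fail
          return reduce(lambda a, b: a * b, ps, 1.0)

      def p_or(ps):               # at least one input fails
          return 1.0 - reduce(lambda a, b: a * (1.0 - b), ps, 1.0)

      def p_xor(p1, p2):          # exactly one of two inputs fails
          return p1 * (1.0 - p2) + p2 * (1.0 - p1)

      def p_not(p):               # INVERT gate
          return 1.0 - p

      def p_m_of_n(ps, m):        # at least m of n independent inputs fail
          n, total = len(ps), 0.0
          for k in range(m, n + 1):
              for idx in combinations(range(n), k):
                  term = 1.0
                  for i in range(n):
                      term *= ps[i] if i in idx else (1.0 - ps[i])
                  total += term
          return total

      # top event: OR of an AND gate and a 2-of-3 gate (illustrative numbers)
      top = p_or([p_and([1e-3, 2e-3]), p_m_of_n([1e-2, 5e-3, 2e-2], 2)])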

  1. An optimized ultrasound digital beamformer with dynamic focusing implemented on FPGA.

    PubMed

    Almekkawy, Mohamed; Xu, Jingwei; Chirala, Mohan

    2014-01-01

    We present a resource-optimized dynamic digital beamformer for an ultrasound system based on a field-programmable gate array (FPGA). A comprehensive 64-channel receive beamformer with full dynamic focusing is embedded in the Altera Arria V FPGA chip. To improve spatial and contrast resolution, full dynamic beamforming is implemented by a novel method with resource optimization. This was achieved by implementing the delay summation as a bulk (coarse) delay combined with a fractional (fine) delay. The sampling frequency is 40 MHz and the beamformer includes a 240 MHz polyphase filter that enhances the temporal resolution of the system while relaxing the Analog-to-Digital converter (ADC) bandwidth requirement. The results indicate that our 64-channel dynamic beamformer architecture is amenable to a low-power FPGA-based implementation in a portable ultrasound system. PMID:25570695
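
    The bulk-plus-fractional delay idea can be illustrated in floating point (the FPGA design itself operates on fixed-point ADC samples with a polyphase interpolation filter rather than the first-order interpolation used here); the channel data and delays below are invented.

      import numpy as np

      def delay_and_sum(channels, delays_samples):
          # channels: (n_elements, n_samples); delays in (fractional) samples.
          n_elem, n_samp = channels.shape
          out = np.zeros(n_samp)
          for ch, d in zip(channels, delays_samples):
              bulk = int(np.floor(d))               # coarse, whole-sample delay
              frac = d - bulk                       # fine, sub-sample delay
              shifted = np.zeros(n_samp)
              if bulk < n_samp:
                  shifted[bulk:] = ch[:n_samp - bulk]
              # first-order interpolation for the fractional part
              delayed = (1.0 - frac) * shifted + frac * np.roll(shifted, 1)
              delayed[0] = 0.0
              out += delayed
          return out

      rng = np.random.default_rng(1)
      rf = rng.standard_normal((64, 2048))          # 64 hypothetical channels
      beam = delay_and_sum(rf, np.linspace(0.0, 12.7, 64))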

  2. A novel implementation of method of optimality criterion in synthesizing spacecraft structures with natural frequency constraints

    NASA Technical Reports Server (NTRS)

    Wang, Bo Ping; Chu, F. H.

    1989-01-01

    In the design of spacecraft structures, fine tuning the structure to achieve minimum weight with natural frequency constraints is a time consuming process. Here, a novel implementation of the method of optimality criterion (OC) is developed. In this new implementation of OC, the free vibration analysis results are used to compute the eigenvalue sensitivity data required for the formulation. Specifically, the modal elemental strain and kinetic energies are used. Additionally, normalized design parameters are introduced as a second level linking that allows design variables of different values to be linked together. With the use of this novel formulation, synthesis of structures with natural frequency constraint can be carried out manually using modal analysis results. Design examples are presented to illustrate this novel implementation of the optimality criterion method.

  3. The Study of Cross-layer Optimization for Wireless Rechargeable Sensor Networks Implemented in Coal Mines.

    PubMed

    Ding, Xu; Shi, Lei; Han, Jianghong; Lu, Jingting

    2016-01-01

    Wireless sensor networks deployed in coal mines could help companies provide workers working in coal mines with more qualified working conditions. With the underground information collected by sensor nodes at hand, the underground working conditions could be evaluated more precisely. However, sensor nodes may tend to malfunction due to their limited energy supply. In this paper, we study the cross-layer optimization problem for wireless rechargeable sensor networks implemented in coal mines, of which the energy could be replenished through the newly-brewed wireless energy transfer technique. The main results of this article are two-fold: firstly, we obtain the optimal relay nodes' placement according to the minimum overall energy consumption criterion through the Lagrange dual problem and KKT conditions; secondly, the optimal strategies for recharging locomotives and wireless sensor networks are acquired by solving a cross-layer optimization problem. The cyclic nature of these strategies is also manifested through simulations in this paper. PMID:26828500

  4. The Study of Cross-layer Optimization for Wireless Rechargeable Sensor Networks Implemented in Coal Mines

    PubMed Central

    Ding, Xu; Shi, Lei; Han, Jianghong; Lu, Jingting

    2016-01-01

    Wireless sensor networks deployed in coal mines could help companies provide workers working in coal mines with more qualified working conditions. With the underground information collected by sensor nodes at hand, the underground working conditions could be evaluated more precisely. However, sensor nodes may tend to malfunction due to their limited energy supply. In this paper, we study the cross-layer optimization problem for wireless rechargeable sensor networks implemented in coal mines, of which the energy could be replenished through the newly-brewed wireless energy transfer technique. The main results of this article are two-fold: firstly, we obtain the optimal relay nodes’ placement according to the minimum overall energy consumption criterion through the Lagrange dual problem and KKT conditions; secondly, the optimal strategies for recharging locomotives and wireless sensor networks are acquired by solving a cross-layer optimization problem. The cyclic nature of these strategies is also manifested through simulations in this paper. PMID:26828500

  5. Analysis and selection of optimal function implementations in massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Ratterman, Joseph D.

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.

  6. Valve technology: A compilation

    NASA Technical Reports Server (NTRS)

    1971-01-01

    A technical compilation on the types, applications and modifications to certain valves is presented. Data cover the following: (1) valves that feature automatic response to stimuli (thermal, electrical, fluid pressure, etc.), (2) modified valves changed by redesign of components to increase initial design effectiveness or give the item versatility beyond its basic design capability, and (3) special purpose valves with limited application as presented, but lending themselves to other uses with minor changes.

  7. Metallurgy: A compilation

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A compilation on the technical uses of various metallurgical processes is presented. Descriptions are given of the mechanical properties of various alloys, ranging from TAZ-813 at 2200 F to investment cast alloy 718 at -320 F. Methods are also described for analyzing some of the constituents of various alloys from optical properties of carbide precipitates in Rene 41 to X-ray spectrographic analysis of the manganese content of high chromium steels.

  8. Optimizing local protocols for implementing bipartite nonlocal unitary gates using prior entanglement and classical communication

    SciTech Connect

    Cohen, Scott M.

    2010-06-15

    We present a method of optimizing recently designed protocols for implementing an arbitrary nonlocal unitary gate acting on a bipartite system. These protocols use only local operations and classical communication with the assistance of entanglement, and they are deterministic while also being 'one-shot', in that they use only one copy of an entangled resource state. The optimization minimizes the amount of entanglement needed, and also the amount of classical communication, and it is often the case that less of each of these resources is needed than with an alternative protocol using two-way teleportation.

  9. Implementation of a multiblock sensitivity analysis method in numerical aerodynamic shape optimization

    NASA Technical Reports Server (NTRS)

    Lacasse, James M.

    1995-01-01

    A multiblock sensitivity analysis method is applied in a numerical aerodynamic shape optimization technique. The Sensitivity Analysis Domain Decomposition (SADD) scheme, which is implemented in this study, was developed to reduce the computer memory requirements resulting from the aerodynamic sensitivity analysis equations. Discrete sensitivity analysis offers the ability to compute quasi-analytical derivatives in a more efficient manner than traditional finite-difference methods, which tend to be computationally expensive and prone to inaccuracies. The direct optimization procedure couples CFD analysis based on the two-dimensional thin-layer Navier-Stokes equations with a gradient-based numerical optimization technique. The linking mechanism is the sensitivity equation derived from the CFD discretized flow equations, recast in adjoint form, and solved using direct matrix inversion techniques. This investigation is performed to demonstrate an aerodynamic shape optimization technique on a multiblock domain and its applicability to complex geometries. The objectives are accomplished by shape optimizing two aerodynamic configurations. First, the shape optimization of a transonic airfoil is performed to investigate the behavior of the method in highly nonlinear flows and the effect of different grid blocking strategies on the procedure. Secondly, shape optimization of a two-element configuration in subsonic flow is completed. Cases are presented for this configuration to demonstrate the effect of simultaneously reshaping interfering elements. The aerodynamic shape optimization is shown to produce supercritical-type airfoils in the transonic flow from an initially symmetric airfoil. Multiblocking affects the path of optimization while providing similar results at the conclusion. Simultaneous reshaping of elements is shown to be more effective than individual element reshaping due to the inclusion of mutual interference effects.

  10. FPGA Implementation of Optimal 3D-Integer DCT Structure for Video Compression

    PubMed Central

    Jacob, J. Augustin; Kumar, N. Senthil

    2015-01-01

    A novel optimal structure for implementing 3D-integer discrete cosine transform (DCT) is presented by analyzing various integer approximation methods. The integer set with reduced mean squared error (MSE) and high coding efficiency are considered for implementation in FPGA. The proposed method proves that the least resources are utilized for the integer set that has shorter bit values. Optimal 3D-integer DCT structure is determined by analyzing the MSE, power dissipation, coding efficiency, and hardware complexity of different integer sets. The experimental results reveal that direct method of computing the 3D-integer DCT using the integer set [10, 9, 6, 2, 3, 1, 1] performs better when compared to other integer sets in terms of resource utilization and power dissipation. PMID:26601120

  11. FPGA Implementation of Optimal 3D-Integer DCT Structure for Video Compression.

    PubMed

    Jacob, J Augustin; Kumar, N Senthil

    2015-01-01

    A novel optimal structure for implementing 3D-integer discrete cosine transform (DCT) is presented by analyzing various integer approximation methods. The integer set with reduced mean squared error (MSE) and high coding efficiency are considered for implementation in FPGA. The proposed method proves that the least resources are utilized for the integer set that has shorter bit values. Optimal 3D-integer DCT structure is determined by analyzing the MSE, power dissipation, coding efficiency, and hardware complexity of different integer sets. The experimental results reveal that direct method of computing the 3D-integer DCT using the integer set [10, 9, 6, 2, 3, 1, 1] performs better when compared to other integer sets in terms of resource utilization and power dissipation. PMID:26601120

  12. Implementation of a Point Algorithm for Real-Time Convex Optimization

    NASA Technical Reports Server (NTRS)

    Acikmese, Behcet; Motaghedi, Shui; Carson, John

    2007-01-01

    The primal-dual interior-point algorithm implemented in G-OPT is a relatively new and efficient way of solving convex optimization problems. Given a prescribed level of accuracy, the convergence to the optimal solution is guaranteed in a predetermined, finite number of iterations. G-OPT Version 1.0 is a flight software implementation written in C. Onboard application of the software enables autonomous, real-time guidance and control that explicitly incorporates mission constraints such as control authority (e.g. maximum thrust limits), hazard avoidance, and fuel limitations. This software can be used in planetary landing missions (Mars pinpoint landing and lunar landing), as well as in proximity operations around small celestial bodies (moons, asteroids, and comets). It also can be used in any spacecraft mission for thrust allocation in six-degrees-of-freedom control.

  13. Implementation of utility-based resource optimization protocols on ITA Sensor Fabric

    NASA Astrophysics Data System (ADS)

    Eswaran, Sharanya; Misra, Archan; Bergamaschi, Flavio; La Porta, Thomas

    2010-04-01

    Utility-based cross-layer optimization is a valuable tool for resource management in mission-oriented wireless sensor networks (WSNs). The benefits of this technique include the ability to take application- or mission-level utilities into account and to dynamically adapt to the highly variable environment of tactical WSNs. Recently, we developed a family of distributed protocols which adapts the bandwidth and energy usage in mission-oriented WSNs in order to optimally allocate resources among multiple missions, which may have specific demands depending on their priority as well as variable schedules, entering and leaving the network at different times [9-12]. In this paper, we illustrate the practical applicability of this family of protocols in tactical networks by implementing one of the protocols, which ensures optimal rate adaptation for congestion control in mission-oriented networks [9], on a real-time 802.11b network using the ITA Sensor Fabric [13]. The ITA Sensor Fabric is a middleware infrastructure, developed as part of the International Technology Alliance (ITA) in Network and Information Science [14], to address the challenges in the areas of sensor identification, classification, interoperability and sensor data sharing, dissemination and consumability commonly present in tactical WSNs [15]. Through this implementation, we (i) study the practical challenges arising from the implementation and (ii) provide a proof of concept regarding the applicability of this family of protocols for efficient resource management in tactical WSNs amidst heterogeneous and dynamic sets of sensors, missions and middleware.

  14. Optimization of FIR Digital Filters Using a Real Parameter Parallel Genetic Algorithm and Implementations.

    NASA Astrophysics Data System (ADS)

    Xu, Dexiang

    This dissertation presents a novel method of designing finite word length Finite Impulse Response (FIR) digital filters using a Real Parameter Parallel Genetic Algorithm (RPPGA). This algorithm is derived from basic Genetic Algorithms which are inspired by natural genetics principles. Both experimental results and theoretical studies in this work reveal that the RPPGA is a suitable method for determining the optimal or near optimal discrete coefficients of finite word length FIR digital filters. Performance of RPPGA is evaluated by comparing specifications of filters designed by other methods with filters designed by RPPGA. The parallel and spatial structures of the algorithm result in faster and more robust optimization than basic genetic algorithms. A filter designed by RPPGA is implemented in hardware to attenuate high frequency noise in a data acquisition system for collecting seismic signals. These studies may lead to more applications of the Real Parameter Parallel Genetic Algorithms in Electrical Engineering.
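
    A toy genetic search over finite-word-length FIR coefficients gives the flavor of the approach (this is not the dissertation's parallel RPPGA): integer coefficient vectors are evolved to minimize the squared deviation of the magnitude response from an assumed ideal lowpass target.

      import numpy as np

      def response_error(coeffs, n_grid=256):
          # Mean squared deviation from an ideal lowpass response with
          # cutoff at one quarter of Nyquist (an assumed design target).
          w = np.linspace(0.0, np.pi, n_grid)
          H = np.abs(np.exp(-1j * np.outer(w, np.arange(len(coeffs)))) @ coeffs)
          ideal = (w <= np.pi / 4).astype(float)
          return float(np.mean((H - ideal) ** 2))

      def ga_quantized_fir(n_taps=15, word_bits=8, pop=60, gens=300, seed=0):
          rng = np.random.default_rng(seed)
          scale = 2 ** (word_bits - 1)
          population = rng.integers(-scale, scale, (pop, n_taps))
          for _ in range(gens):
              fitness = np.array([response_error(ind / scale) for ind in population])
              parents = population[np.argsort(fitness)[: pop // 2]]       # selection
              idx = rng.integers(0, len(parents), (pop, 2))
              children = (parents[idx[:, 0]] + parents[idx[:, 1]]) // 2   # crossover
              mutate = rng.random(children.shape) < 0.05                  # mutation
              children = np.clip(children + mutate * rng.integers(-1, 2, children.shape),
                                 -scale, scale - 1)
              children[0] = parents[0]                                    # elitism
              population = children
          fitness = np.array([response_error(ind / scale) for ind in population])
          return population[fitness.argmin()] / scale, float(fitness.min())

      coeffs, err = ga_quantized_fir()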

  15. Implementation of a near-optimal global set point control method in a DDC controller

    SciTech Connect

    Cascia, M.A.

    2000-07-01

    A near-optimal global set point control method that can be implemented in an energy management system's (EMS) DDC controller is described in this paper. Mathematical models are presented for the power consumption of electric chillers, hot water boilers, chilled and hot water pumps, and air handler fans, which allow the calculation of near-optimal chilled water, hot water, and coil discharge air set points to minimize power consumption, based on data collected by the EMS. Also optimized are the differential and static pressure set points for the variable speed pumps and fans. A pilot test of this control methodology was implemented for a cooling plant at a pharmaceutical manufacturing facility near Dallas, Texas. Data collected at this site showed good agreement between the actual power consumed by the chillers, chilled water pumps, and air handlers and that predicted by the models. An approximate model was developed to calculate real-time power savings in the DDC controller. A third-party energy accounting program was used to track savings due to the near-optimal control, and results show a monthly kWh reduction ranging from 3% to 14%.

  16. Optimization and implementation of the integer wavelet transform for image coding.

    PubMed

    Grangetto, Marco; Magli, Enrico; Martina, Maurizio; Olmo, Gabriella

    2002-01-01

    This paper deals with the design and implementation of an image transform coding algorithm based on the integer wavelet transform (IWT). First of all, criteria are proposed for the selection of optimal factorizations of the wavelet filter polyphase matrix to be employed within the lifting scheme. The obtained results lead to the IWT implementations with very satisfactory lossless and lossy compression performance. Then, the effects of finite precision representation of the lifting coefficients on the compression performance are analyzed, showing that, in most cases, a very small number of bits can be employed for the mantissa keeping the performance degradation very limited. Stemming from these results, a VLSI architecture is proposed for the IWT implementation, capable of achieving very high frame rates with moderate gate complexity. PMID:18244658
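
    As a concrete instance of the lifting scheme with integer arithmetic, the sketch below implements one level of the reversible LeGall 5/3 transform on a 1-D signal of even length (the paper itself considers general factorizations of the polyphase matrix and the precision of the lifting coefficients); the boundary handling here is a simple assumption chosen so that the round trip is exact.

      def lwt53_forward(x):
          # One level of the reversible LeGall 5/3 integer lifting transform.
          s, d = x[0::2], x[1::2]            # even -> lowpass, odd -> highpass
          n = len(d)
          for i in range(n):                 # predict: d -= floor((left+right)/2)
              right = s[i + 1] if i + 1 < len(s) else s[i]
              d[i] -= (s[i] + right) // 2
          for i in range(len(s)):            # update: s += floor((dl+dr+2)/4)
              left = d[i - 1] if i > 0 else d[0]
              cur = d[i] if i < n else d[n - 1]
              s[i] += (left + cur + 2) // 4
          return s, d

      def lwt53_inverse(s, d):
          s, d, n = s[:], d[:], len(d)
          for i in range(len(s)):            # undo update
              left = d[i - 1] if i > 0 else d[0]
              cur = d[i] if i < n else d[n - 1]
              s[i] -= (left + cur + 2) // 4
          for i in range(n):                 # undo predict
              right = s[i + 1] if i + 1 < len(s) else s[i]
              d[i] += (s[i] + right) // 2
          x = [0] * (len(s) + n)
          x[0::2], x[1::2] = s, d
          return x

      x = [5, 7, 3, 2, 9, 8, 4, 1]
      assert lwt53_inverse(*lwt53_forward(x)) == x    # lossless round trip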

  17. Multi-GPU implementation of a VMAT treatment plan optimization algorithm

    SciTech Connect

    Tian, Zhen E-mail: Xun.Jia@UTSouthwestern.edu Folkerts, Michael; Tan, Jun; Jia, Xun E-mail: Xun.Jia@UTSouthwestern.edu Jiang, Steve B. E-mail: Xun.Jia@UTSouthwestern.edu; Peng, Fei

    2015-06-15

    Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, a GPU’s relatively small memory cannot handle cases with a large dose-deposition coefficient (DDC) matrix, e.g., those with a large target size, multiple targets, multiple arcs, and/or a small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors’ group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present the detailed techniques employed for the GPU implementation. The authors also would like to utilize this particular problem as an example to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors’ method, the sparse DDC matrix is first stored on a CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of the beamlet price, the first step in the PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of the PP and MP are implemented on the CPU or a single GPU due to their modest problem scale and computational load. The Barzilai-Borwein algorithm with a subspace step scheme is adopted here to solve the MP. A head and neck (H and N) cancer case is

  18. A silicon compiler for dedicated mathematical systems based on CORDIC arithmetic processors

    SciTech Connect

    Hu, X.

    1989-01-01

    There exists a large number of computationally complex systems that are not well structured and that contain many non-linear function evaluations. Examples can be found in the areas of robotics, engineering graphics, and signal processing. These systems can be implemented in software. However, if high calculation speed is required or if the same computational process must be repeated numerous times, a hardware implementation may be desirable. The silicon compiler approach proposed in this thesis for designing systems based on bit-serial CORDIC arithmetic units enables an efficient hardware implementation of such a complex computational system. The basic building block employed by the compiler is a CORDIC processor capable of evaluating a number of arithmetic and mathematical operations including multiplication and division, and trigonometric and hyperbolic functions. The silicon compiler consists of a series of software tools that automatically realize a user's high-level description as a fully interconnected CORDIC processor network, perform bit-level logic simulation, optimize the design parameters, synchronize the data propagation, and generate the final mask artwork. Primary emphases in this thesis are the expansion of the current CORDIC theory and the development of the software tools in the silicon compiler. Applications of this work are also presented.
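
    The basic rotation-mode iteration behind such CORDIC processors is easy to state in software; the sketch below computes sine and cosine with shift-and-add style updates (here in floating point for clarity, whereas the thesis's processors are bit-serial fixed-point hardware covering the hyperbolic and linear modes as well).

      import math

      def cordic_sin_cos(theta, n_iters=32):
          # Rotation-mode CORDIC: rotate the vector (1, 0) by theta using
          # only shift-and-add style updates, then undo the accumulated gain.
          angles = [math.atan(2.0 ** -i) for i in range(n_iters)]
          gain = 1.0
          for a in angles:
              gain *= math.cos(a)            # equals 1/K, K = prod sqrt(1 + 2^-2i)
          x, y, z = 1.0, 0.0, theta
          for i in range(n_iters):
              sigma = 1.0 if z >= 0.0 else -1.0
              x, y = x - sigma * y * 2.0 ** -i, y + sigma * x * 2.0 ** -i
              z -= sigma * angles[i]
          return y * gain, x * gain          # (sin(theta), cos(theta))

      s, c = cordic_sin_cos(math.pi / 5)
      assert abs(s - math.sin(math.pi / 5)) < 1e-6
      assert abs(c - math.cos(math.pi / 5)) < 1e-6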

  19. HAL/S-360 compiler test activity report

    NASA Technical Reports Server (NTRS)

    Helmers, C. T.

    1974-01-01

    The levels of testing employed in verifying the HAL/S-360 compiler were as follows: (1) typical applications program case testing; (2) functional testing of the compiler system and its generated code; and (3) machine oriented testing of compiler implementation on operational computers. Details of the initial test plan and subsequent adaptation are reported, along with complete test results for each phase which examined the production of object codes for every possible source statement.

  20. Atomic mass compilation 2012

    SciTech Connect

    Pfeiffer, B.; Venkataramaniah, K.; Czok, U.; Scheidenberger, C.

    2014-03-15

    Atomic mass reflects the total binding energy of all nucleons in an atomic nucleus. Compilations and evaluations of atomic masses and derived quantities, such as neutron or proton separation energies, are indispensable tools for research and applications. In the last decade, the field has evolved rapidly after the advent of new production and measuring techniques for stable and unstable nuclei, resulting in substantial improvements in the body of data and its precision. Here, we present a compilation of atomic masses comprising the data from the 2003 evaluation as well as the results of new measurements. The relevant literature in refereed journals and, as far as available, reports was scanned for the period from the beginning of 2003 up to and including April 2012. Overall, 5750 new data points have been collected. Recommended values for the relative atomic masses have been derived, and a comparison with the 2003 Atomic Mass Evaluation has been performed. This work has been carried out in collaboration with and as a contribution to the European Nuclear Structure and Decay Data Network of Evaluations.

  1. Proof-Carrying Code with Correct Compilers

    NASA Technical Reports Server (NTRS)

    Appel, Andrew W.

    2009-01-01

    In the late 1990s, proof-carrying code was able to produce machine-checkable safety proofs for machine-language programs even though (1) it was impractical to prove correctness properties of source programs and (2) it was impractical to prove correctness of compilers. But now it is practical to prove some correctness properties of source programs, and it is practical to prove correctness of optimizing compilers. We can produce more expressive proof-carrying code, that can guarantee correctness properties for machine code and not just safety. We will construct program logics for source languages, prove them sound w.r.t. the operational semantics of the input language for a proved-correct compiler, and then use these logics as a basis for proving the soundness of static analyses.

  2. Parallel incremental compilation. Doctoral thesis

    SciTech Connect

    Gafter, N.M.

    1990-06-01

    The time it takes to compile a large program has been a bottleneck in the software development process. When an interactive programming environment with an incremental compiler is used, compilation speed becomes even more important, but existing incremental compilers are very slow for some types of program changes. We describe a set of techniques that enable incremental compilation to exploit fine-grained concurrency in a shared-memory multi-processor and achieve asymptotic improvement over sequential algorithms. Because parallel non-incremental compilation is a special case of parallel incremental compilation, the design of a parallel compiler is a corollary of our result. Instead of running the individual phases concurrently, our design specifies compiler phases that are mutually sequential. However, each phase is designed to exploit fine-grained parallelism. By allowing each phase to present its output as a complete structure rather than as a stream of data, we can apply techniques such as parallel prefix and parallel divide-and-conquer, and we can construct applicative data structures to achieve sublinear execution time. Parallel algorithms for each phase of a compiler are presented to demonstrate that a complete incremental compiler can achieve execution time that is asymptotically less than that of sequential algorithms.
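
    One of the primitives mentioned above, parallel prefix, can be illustrated with a generic work-efficient (Blelloch-style) scan. The sketch below is written sequentially in Python to expose the dependence structure: within each level of the up-sweep and down-sweep the updates are independent, which is what a parallel implementation exploits. It is a generic illustration, not the compiler's own data structures.

    ```python
    def exclusive_scan(a, op=lambda x, y: x + y, identity=0):
        """Work-efficient (Blelloch) exclusive prefix scan.

        Each inner loop level touches disjoint index pairs, so a parallel
        implementation can run all iterations of that level concurrently,
        giving O(log n) depth.
        """
        n = len(a)
        assert n and (n & (n - 1)) == 0, "power-of-two length assumed in this sketch"
        t = list(a)
        # Up-sweep (reduction) phase.
        d = 1
        while d < n:
            for i in range(0, n, 2 * d):              # independent across i
                t[i + 2 * d - 1] = op(t[i + d - 1], t[i + 2 * d - 1])
            d *= 2
        # Down-sweep phase.
        t[n - 1] = identity
        d = n // 2
        while d >= 1:
            for i in range(0, n, 2 * d):              # independent across i
                left = t[i + d - 1]
                t[i + d - 1] = t[i + 2 * d - 1]
                t[i + 2 * d - 1] = op(left, t[i + 2 * d - 1])
            d //= 2
        return t

    print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))   # [0, 3, 4, 11, 11, 15, 16, 22]
    ```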

  3. Implementation and optimization of an improved morphological filtering algorithm for speckle removal based on DSPs

    NASA Astrophysics Data System (ADS)

    Liu, Qitao; Li, Yingchun; Sun, Huayan; Zhao, Yanzhong

    2008-03-01

    Laser active imaging systems, which offer high resolution, anti-jamming capability, and three-dimensional (3-D) imaging, have been used widely. Their imagery, however, is usually affected by speckle noise, which makes the grayscale of pixels change violently, hides subtle details, and greatly degrades the imaging resolution. Removing speckle noise is one of the most difficult problems encountered in such systems because of the poor statistical properties of speckle. Based on an analysis of the statistical characteristics of speckle and of morphological filtering, this paper studies an improved multistage morphological filtering algorithm and implements it on the TMS320C6416 DSP. The algorithm applies morphological open-close and close-open transformations using two different linear structuring elements, and then takes a weighted average over the transformed results; the weighting coefficients are determined by the statistical characteristics of the speckle. The algorithm was implemented on the TMS320C6416 DSP after simulation on a computer, and the software design procedure is fully presented. The methods used to realize and optimize the algorithm are illustrated in light of the structural characteristics of the TMS320C6416 DSP and the features of the algorithm. In order to fully benefit from such devices and increase the performance of the whole system, a series of optimization steps is necessary for the DSP programs. This paper introduces several effective methods for TMS320C6x C-language optimization, including refining the code structure, eliminating memory dependences, and optimizing assembly code via linear assembly, and then reports the results of applying them in a real-time implementation. Processing results for images blurred by speckle noise show that the algorithm not only effectively suppresses speckle noise but also preserves the geometrical features of the images. The results of the optimized code running on the DSP platform
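
    As a software reference for the filter described above, the open-close / close-open weighted average can be sketched with SciPy's grayscale morphology. The structuring-element shapes and the equal weighting below are illustrative assumptions; in the paper the elements and weights are chosen from the speckle statistics.

    ```python
    import numpy as np
    from scipy import ndimage

    def multistage_speckle_filter(img, w=0.5):
        """Weighted average of an open-close and a close-open transform.

        Two different linear structuring elements are used (here a horizontal and
        a vertical line, as an illustrative choice); w would be derived from the
        speckle statistics in the actual algorithm.
        """
        se_a = np.ones((1, 5), dtype=bool)                        # linear element A
        se_b = np.ones((5, 1), dtype=bool)                        # linear element B
        oc = ndimage.grey_closing(ndimage.grey_opening(img, footprint=se_a),
                                  footprint=se_a)                 # open-close with A
        co = ndimage.grey_opening(ndimage.grey_closing(img, footprint=se_b),
                                  footprint=se_b)                 # close-open with B
        return w * oc + (1.0 - w) * co

    # Toy usage on a speckled ramp image.
    rng = np.random.default_rng(0)
    image = np.tile(np.linspace(0.0, 1.0, 64), (64, 1)) * rng.lognormal(0.0, 0.3, (64, 64))
    filtered = multistage_speckle_filter(image)
    ```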

  4. The optimized gradient method for full waveform inversion and its spectral implementation

    NASA Astrophysics Data System (ADS)

    Wu, Zedong; Alkhalifah, Tariq

    2016-06-01

    At the heart of the full waveform inversion (FWI) implementation is wavefield extrapolation, and specifically its accuracy and cost. To obtain accurate, dispersion free wavefields, the extrapolation for modelling is often expensive. Combining an efficient extrapolation with a novel gradient preconditioning can render an FWI implementation that efficiently converges to an accurate model. We, specifically, recast the extrapolation part of the inversion in terms of its spectral components for both data and gradient calculation. This admits dispersion free wavefields even at large extrapolation time steps, which improves the efficiency of the inversion. An alternative spectral representation of the depth axis in terms of sine functions allows us to impose a free surface boundary condition, which reflects our medium boundaries more accurately. Using a newly derived perfectly matched layer formulation for this spectral implementation, we can define a finite model with absorbing boundaries. In order to reduce the nonlinearity in FWI, we propose a multiscale conditioning of the objective function through combining the different directional components of the gradient to optimally update the velocity. Through solving a simple optimization problem, it specifically admits the smoothest approximate update while guaranteeing its ascending direction. An application to the Marmousi model demonstrates the capability of the proposed approach and justifies our assertions with respect to cost and convergence.

  5. The Optimized Gradient Method for Full Waveform Inversion and its Spectral Implementation

    NASA Astrophysics Data System (ADS)

    Wu, Zedong; Alkhalifah, Tariq

    2016-03-01

    At the heart of the full waveform inversion (FWI) implementation is wavefield extrapolation, and specifically its accuracy and cost. To obtain accurate, dispersion free wavefields, the extrapolation for modeling is often expensive. Combining an efficient extrapolation with a novel gradient preconditioning can render an FWI implementation that efficiently converges to an accurate model. We, specifically, recast the extrapolation part of the inversion in terms of its spectral components for both data and gradient calculation. This admits dispersion free wavefields even at large extrapolation time steps, which improves the efficiency of the inversion. An alternative spectral representation of the depth axis in terms of sine functions allows us to impose a free surface boundary condition, which reflects our medium boundaries more accurately. Using a newly derived perfectly matched layer formulation for this spectral implementation, we can define a finite model with absorbing boundaries. In order to reduce the nonlinearity in FWI, we propose a multi-scale conditioning of the objective function through combining the different directional components of the gradient to optimally update the velocity. Through solving a simple optimization problem, it specifically admits the smoothest approximate update while guaranteeing its ascending direction. An application to the Marmousi model demonstrates the capability of the proposed approach, and justifies our assertions with respect to cost and convergence.

  6. Optimal Pain Assessment in Pediatric Rehabilitation: Implementation of a Nursing Guideline.

    PubMed

    Kingsnorth, Shauna; Joachimides, Nick; Krog, Kim; Davies, Barbara; Higuchi, Kathryn Smith

    2015-12-01

    In Ontario, Canada, the Registered Nurses' Association promotes a Best Practice Spotlight Organization initiative to enhance evidence-based practice. Qualifying organizations are required to implement strategies, evaluate outcomes, and sustain practices aligned with nursing clinical practice guidelines. This study reports on the development and evaluation of a multifaceted implementation strategy to support adoption of a nursing clinical practice guideline on the assessment and management of acute pain in a pediatric rehabilitation and complex continuing care hospital. Multiple approaches were employed to influence behavior, attitudes, and awareness around optimal pain practice (e.g., instructional resources, electronic reminders, audits, and feedback). Four measures were introduced to assess pain in communicating and noncommunicating children as part of a campaign to treat pain as the fifth vital sign. A prospective repeated measures design examined survey and audit data to assess practice aligned with the guideline. The Knowledge and Attitudes Survey (KNAS) was adapted to ensure relevance to the local practice setting and was assessed before and after nurses' participation in three education modules. Audit data included client demographics and pain scores assessed annually over a 3-year window. A final sample of 69 nurses (78% response rate) provided pre-/post-survey data. A total of 108 pediatric surgical clients (younger than 19 years) contributed audit data across the three collection cycles. Significant improvements in nurses' knowledge, attitudes, and behaviors related to optimal pain care for children with disabilities were noted following adoption of the pain clinical practice guideline. Targeted guideline implementation strategies are central to supporting optimal pain practice. PMID:26395294

  7. Parameterized CAD techniques implementation for the fatigue behaviour optimization of a service chamber

    NASA Astrophysics Data System (ADS)

    Sánchez, H. T.; Estrems, M.; Franco, P.; Faura, F.

    2009-11-01

    In recent years, the market of heat exchangers is increasingly demanding new products in short cycle time, which means that both the design and manufacturing stages must be extremely reduced. The design stage can be reduced by means of CAD-based parametric design techniques. The methodology presented in this proceeding is based on the optimized control of geometric parameters of a service chamber of a heat exchanger by means of the Application Programming Interface (API) provided by the Solidworks CAD package. Using this implementation, a set of different design configurations of the service chamber made of stainless steel AISI 316 are studied by means of the FE method. As a result of this study, a set of knowledge rules based on the fatigue behaviour are constructed and integrated into the design optimization process.

  8. Steering quantum dynamics via bang-bang control: Implementing optimal fixed-point quantum search algorithm

    NASA Astrophysics Data System (ADS)

    Bhole, Gaurav; Anjusha, V. S.; Mahesh, T. S.

    2016-04-01

    A robust control over quantum dynamics is of paramount importance for quantum technologies. Many of the existing control techniques are based on smooth Hamiltonian modulations involving repeated calculations of basic unitaries resulting in time complexities scaling rapidly with the length of the control sequence. Here we show that bang-bang controls need one-time calculation of basic unitaries and hence scale much more efficiently. By employing a global optimization routine such as the genetic algorithm, it is possible to synthesize not only highly intricate unitaries, but also certain nonunitary operations. We demonstrate the unitary control through the implementation of the optimal fixed-point quantum search algorithm in a three-qubit nuclear magnetic resonance (NMR) system. Moreover, by combining the bang-bang pulses with the crusher gradients, we also demonstrate nonunitary transformations of thermal equilibrium states into effective pure states in three- as well as five-qubit NMR systems.
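
    The optimization loop can be illustrated with a toy genetic algorithm searching over bang-bang sequences whose basic unitaries are computed only once. This is a schematic single-qubit sketch: the two rotation unitaries, the target state, and the GA parameters are assumptions made for illustration, not the paper's three-qubit NMR pulse optimization.

    ```python
    import numpy as np

    def rot(axis, angle):
        """Single-qubit rotation about x or z (the two assumed 'bang' unitaries)."""
        X = np.array([[0, 1], [1, 0]], dtype=complex)
        Z = np.array([[1, 0], [0, -1]], dtype=complex)
        H = X if axis == "x" else Z
        return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * H

    U = [rot("x", 0.3), rot("z", 0.3)]                 # basic unitaries, computed once
    psi0 = np.array([1.0, 0.0], dtype=complex)         # start in |0>
    target = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)   # aim for |+>

    def fitness(seq):
        psi = psi0
        for bit in seq:
            psi = U[bit] @ psi
        return abs(np.vdot(target, psi)) ** 2          # overlap with the target state

    def genetic_search(length=20, pop=40, gens=200, mut=0.05, seed=1):
        rng = np.random.default_rng(seed)
        population = rng.integers(0, 2, size=(pop, length))
        for _ in range(gens):
            scores = np.array([fitness(ind) for ind in population])
            parents = population[np.argsort(scores)[::-1][: pop // 2]]  # truncation selection
            children = []
            while len(children) < pop - len(parents):
                a, b = parents[rng.integers(len(parents), size=2)]
                cut = rng.integers(1, length)                    # one-point crossover
                child = np.concatenate([a[:cut], b[cut:]])
                flip = rng.random(length) < mut                  # bit-flip mutation
                children.append(np.where(flip, 1 - child, child))
            population = np.vstack([parents, children])
        best = max(population, key=fitness)
        return best, fitness(best)

    seq, f = genetic_search()
    print("best fidelity:", round(f, 4))
    ```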

  9. Implementation of natural frequency analysis and optimality criterion design. [computer technique for structural analysis

    NASA Technical Reports Server (NTRS)

    Levy, R.; Chai, K.

    1978-01-01

    A description is presented of an effective optimality criterion computer design approach for member size selection to improve frequency characteristics for moderately large structure models. It is shown that the implementation of the simultaneous iteration method within a natural frequency structural design optimization provides a method which is more efficient in isolating the lowest natural frequency modes than the frequently applied Stodola method. Additional computational advantages are derived by using previously converged eigenvectors at the start of the iterations during the second and the following design cycles. Vectors with random components can be used at the first design cycle, which, in relation to the entire computer time for the design program, results in only a moderate computational penalty.
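
    The simultaneous (subspace) iteration referred to above can be written in a generic textbook form for the generalized eigenproblem K v = lambda M v. The 6-DOF spring-mass chain is an assumed toy model, and passing the previous design cycle's converged vectors as V0 mimics the warm start described in the abstract.

    ```python
    import numpy as np
    from scipy.linalg import eigh, lu_factor, lu_solve

    def simultaneous_iteration(K, M, n_modes, V0=None, iters=50, seed=0):
        """Lowest modes of K v = lam M v by block inverse iteration + Rayleigh-Ritz.

        Random starting vectors are used unless V0 (e.g. the previous design
        cycle's converged eigenvectors) is supplied.
        """
        n = K.shape[0]
        V = np.random.default_rng(seed).standard_normal((n, n_modes)) if V0 is None else V0.copy()
        lu = lu_factor(K)                              # factor K once
        for _ in range(iters):
            V = lu_solve(lu, M @ V)                    # inverse iteration on the block
            Kr, Mr = V.T @ K @ V, V.T @ M @ V          # Rayleigh-Ritz reduction
            lam, Q = eigh(Kr, Mr)
            V = V @ Q                                  # improved, M-orthonormal subspace
        return lam, V

    # Assumed example: a 6-DOF spring-mass chain with unit masses and stiffnesses.
    n = 6
    K = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    M = np.eye(n)
    lam, V = simultaneous_iteration(K, M, n_modes=2)
    print("lowest eigenvalues:", lam)                  # compare with scipy.linalg.eigh(K, M)
    ```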

  10. Sequential Principal Component Analysis -An Optimal and Hardware-Implementable Transform for Image Compression

    NASA Technical Reports Server (NTRS)

    Duong, Tuan A.; Duong, Vu A.

    2009-01-01

    This paper presents the JPL-developed Sequential Principal Component Analysis (SPCA) algorithm for feature extraction / image compression, based on a "dominant-term selection" unsupervised learning technique that requires an order of magnitude less computation and has a simpler architecture than state-of-the-art gradient-descent techniques. This algorithm is inherently amenable to a compact, low-power, and high-speed VLSI hardware embodiment. The paper compares the lossless image compression performance of JPL's SPCA algorithm with the state-of-the-art JPEG2000, widely used due to its simplified hardware implementability. JPEG2000 is not an optimal data compression technique because its transform characteristics are fixed regardless of the data structure. On the other hand, the conventional Principal Component Analysis transform (PCA-transform) is a data-dependent transform. However, it is not easy to implement the PCA in compact VLSI hardware due to its high computational and architectural complexity. In contrast, JPL's "dominant-term selection" SPCA algorithm allows, for the first time, a compact, low-power hardware implementation of the powerful PCA algorithm. This paper presents a direct comparison of JPL's SPCA versus JPEG2000, incorporating Huffman and arithmetic coding for completeness of the data compression operation. The simulation results show that JPL's SPCA algorithm is superior, as an optimal data-dependent transform, to the state-of-the-art JPEG2000. When implemented in hardware, this technique is projected to be ideally suited to future NASA missions for autonomous on-board image data processing to improve communication bandwidth.

  11. Development and implementation of rotorcraft preliminary design methodology using multidisciplinary design optimization

    NASA Astrophysics Data System (ADS)

    Khalid, Adeel Syed

    Rotorcraft's evolution has lagged behind that of fixed-wing aircraft. One of the reasons for this gap is the absence of a formal methodology to accomplish a complete conceptual and preliminary design. Traditional rotorcraft methodologies are not only time consuming and expensive but also yield sub-optimal designs. Rotorcraft design is an excellent example of a multidisciplinary complex environment where several interdependent disciplines are involved. A formal framework is developed and implemented in this research for preliminary rotorcraft design using IPPD methodology. The design methodology consists of the product and process development cycles. In the product development loop, all the technical aspects of design are considered including the vehicle engineering, dynamic analysis, stability and control, aerodynamic performance, propulsion, transmission design, weight and balance, noise analysis and economic analysis. The design loop starts with a detailed analysis of requirements. A baseline is selected and upgrade targets are identified depending on the mission requirements. An Overall Evaluation Criterion (OEC) is developed that is used to measure the goodness of the design or to compare the design with competitors. The requirements analysis and baseline upgrade targets lead to the initial sizing and performance estimation of the new design. The digital information is then passed to disciplinary experts. This is where the detailed disciplinary analyses are performed. Information is transferred from one discipline to another as the design loop is iterated. To coordinate all the disciplines in the product development cycle, Multidisciplinary Design Optimization (MDO) techniques e.g. All At Once (AAO) and Collaborative Optimization (CO) are suggested. The methodology is implemented on a Light Turbine Training Helicopter (LTTH) design. Detailed disciplinary analyses are integrated through a common platform for efficient and centralized transfer of design

  12. Pre-Hardware Optimization of Spacecraft Image Processing Software Algorithms and Hardware Implementation

    NASA Technical Reports Server (NTRS)

    Kizhner, Semion; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Petrick, David J.; Day, John H. (Technical Monitor)

    2001-01-01

    Spacecraft telemetry rates have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image processing application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms and re-configurable computing hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processing (DSP). It has been shown in [1] and [2] that this configuration can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft. However, since this technology is still maturing, intensive pre-hardware steps are necessary to achieve the benefits of hardware implementation. This paper describes these steps for the GOES-8 application, a software project developed using Interactive Data Language (IDL) (Trademark of Research Systems, Inc.) on a Workstation/UNIX platform. The solution involves converting the application to a PC/Windows/RC platform, selected mainly by the availability of low cost, adaptable high-speed RC hardware. In order for the hybrid system to run, the IDL software was modified to account for platform differences. It was interesting to examine the gains and losses in performance on the new platform, as well as unexpected observations before implementing hardware. After substantial pre-hardware optimization steps, the necessity of hardware implementation for bottleneck code in the PC environment became evident and solvable beginning with the methodology described in [1], [2], and implementing a novel methodology for this specific application [6]. The PC-RC interface bandwidth problem for the

  13. Development and implementation of a fuzzy optimal expert controller for intelligent buildings

    SciTech Connect

    Rahmani, K.

    1992-01-01

    The purpose of this research is to develop and implement a fuzzy optimal expert controller at the coordinator's level in an integrated building control system for improvements in comfort and better energy efficiency. The integrated approach is used to design, control, and operate an intelligent building system. The integrated system incorporates system modeling and decision-making steps for scheduling the operations of different units, based on peak-load energy rates, activity periods, and weather, at the supervisor's level. This is followed by compensation for unexpected events such as weather conditions, variations in occupancy, changes in the type of activities, and machinery at the coordinator's level for a short period of time. The integrated approach is completed by local controllers which are used to control HVAC equipment. Development of the fuzzy optimal expert controller is done by integration of fuzzy logic and optimal control techniques. The performance of this fuzzy controller is checked against two cases: solving the system deterministically with full information about the disturbances (the actual system), and a control design with no knowledge about the disturbances. The comparison is made based on application of the fuzzy controller to the prototype building system. The rules in the fuzzy expert controller are developed based on application of the proposed fuzzy optimal controller to the calculated dynamics in different zones (rooms, offices) for the prototype building system. The objective of the fuzzy optimal expert controller is to create a quick response in the system for a short period of time while the supervisor is calculating new set points for the local controllers. Further, the fuzzy expert controller's rise time is compared with that of a traditional PI controller and indicates a faster response.

  14. Voyager Outreach Compilation

    NASA Technical Reports Server (NTRS)

    1998-01-01

    This NASA JPL (Jet Propulsion Laboratory) video presents a collection of the best videos that have been published of the Voyager mission. Computer animation/simulations comprise the largest portion of the video and include outer planetary magnetic fields, outer planetary lunar surfaces, and the Voyager spacecraft trajectory. Voyager visited the four outer planets: Jupiter, Saturn, Uranus, and Neptune. The video contains some live shots of Jupiter (actual), the Earth's moon (from orbit), Saturn (actual), Neptune (actual) and Uranus (actual), but is mainly comprised of computer animations of these planets and their moons. Some of the individual short videos that are compiled are entitled: The Solar System; Voyage to the Outer Planets; A Tour of the Solar System; and the Neptune Encounter. Computerized simulations of Viewing Neptune from Triton, Diving over Neptune to Meet Triton, and Catching Triton in its Retrograde Orbit are included. Several animations of Neptune's atmosphere, rotation and weather features as well as significant discussion of the planet's natural satellites are also presented.

  15. Optimization models and techniques for implementation and pricing of electricity markets

    NASA Astrophysics Data System (ADS)

    Madrigal Martinez, Marcelino

    Vertically integrated electric power systems extensively use optimization models and solution techniques to guide their optimal operation and planning. The advent of electric power system restructuring has created needs for new optimization tools and for revising those inherited from the vertical-integration era for the market environment. This thesis presents further developments on the use of optimization models and techniques for the implementation and pricing of primary electricity markets. New models, solution approaches, and price-setting alternatives are proposed. Three different modeling groups are studied. The first modeling group considers simplified continuous and discrete models for power pool auctions driven by central-cost minimization. The direct solution of the dual problems, and the use of a Branch-and-Bound algorithm to solve the primal, allows the effects of disequilibrium, and of different price-setting alternatives, on the existence of multiple solutions to be identified. It is shown that particular pricing rules worsen the conflict of interest that arises when multiple solutions exist under disequilibrium. A price-setting alternative based on dual variables is shown to diminish such conflict. The second modeling group considers the unit commitment problem. An interior-point/cutting-plane method is proposed for the solution of the dual problem. The new method has better convergence characteristics and does not suffer from the parameter-tuning drawback of previous methods. The robustness characteristics of the interior-point/cutting-plane method, combined with a non-uniform price-setting alternative, show that the conflict of interest is diminished when multiple near-optimal solutions exist. The non-uniform price-setting alternative is compared to a classic average pricing rule. The last modeling group concerns a new type of linear network-constrained clearing model for daily markets for power and spinning reserve. A new model and
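
    The first modeling group's central-cost-minimizing pool auction can be illustrated with a toy merit-order clearing. The offers below and the rule that the marginal dispatched offer sets a uniform price are illustrative assumptions, standing in for one of the several price-setting alternatives the thesis compares.

    ```python
    def clear_pool(offers, demand):
        """Least-cost dispatch of supply offers and a uniform clearing price.

        offers: list of (name, quantity_MW, price_per_MWh); demand in MW.
        The uniform price is set by the marginal (last dispatched) offer.
        """
        dispatched, remaining, price = [], demand, 0.0
        for name, qty, p in sorted(offers, key=lambda o: o[2]):   # merit order
            if remaining <= 0:
                break
            take = min(qty, remaining)
            dispatched.append((name, take))
            remaining -= take
            price = p                                  # marginal offer sets the price
        if remaining > 0:
            raise ValueError("offers cannot cover demand")
        return dispatched, price

    offers = [("A", 50, 12.0), ("B", 40, 20.0), ("C", 60, 35.0)]
    schedule, clearing_price = clear_pool(offers, demand=100)
    print(schedule, clearing_price)    # [('A', 50), ('B', 40), ('C', 10)] 35.0
    ```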

  16. Formulation for a practical implementation of electromagnetic induction coils optimized using stream functions

    NASA Astrophysics Data System (ADS)

    Reed, Mark A.; Scott, Waymond R.

    2016-05-01

    Continuous-wave (CW) electromagnetic induction (EMI) systems used for subsurface sensing typically employ separate transmit and receive coils placed in close proximity. The closeness of the coils is desirable for both packaging and object pinpointing; however, the coils must have as little mutual coupling as possible. Otherwise, the signal from the transmit coil will couple into the receive coil, making target detection difficult or impossible. Additionally, mineralized soil can be a significant problem when attempting to detect small amounts of metal because the soil effectively couples the transmit and receive coils. Optimization of wire coils to improve their performance is difficult but can be made possible through a stream-function representation and the use of partially convex forms. Examples of such methods have been presented previously, but these methods did not account for certain practical issues with coil implementation. In this paper, the power constraint introduced into the optimization routine is modified so that it does not penalize areas of high current. It does this by representing the coils as plates carrying surface currents and adjusting the sheet resistance to be inversely proportional to the current, which is a good approximation for a wire-wound coil. Example coils are then optimized for minimum mutual coupling, maximum sensitivity, and minimum soil response at a given height with both the earlier, constant sheet resistance and the new representation. The two sets of coils are compared both to each other and other common coil types to show the method's viability.

  17. Pixel-by-pixel deconvolution of bolus-tracking data: optimization and implementation

    NASA Astrophysics Data System (ADS)

    Sourbron, S.; Dujardin, M.; Makkat, S.; Luypaert, R.

    2007-01-01

    Quantification of haemodynamic parameters with a deconvolution analysis of bolus-tracking data is an ill-posed problem which requires regularization. In a previous study, simulated data without structural errors were used to validate two methods for a pixel-by-pixel analysis: standard-form Tikhonov regularization with either the L-curve criterion (LCC) or generalized cross validation (GCV) for selecting the regularization parameter. However, problems of image artefacts were reported when the methods were applied to patient data. The aim of this study was to investigate the nature of these problems in more detail and evaluate optimization strategies for routine application in the clinic. In addition, we investigated to what extent the calculation time of the algorithm can be minimized. In order to ensure that the conclusions are relevant for a larger range of clinical applications, we relied on patient data for evaluation of the algorithms. Simulated data were used to validate the conclusions in a more quantitative manner. We conclude that the reported problems with image quality can be removed by appropriate optimization of either LCC or GCV. In all examples this could be achieved with LCC without significant perturbation of the values in pixels where the regularization parameter was originally selected accurately. GCV could not be optimized for the renal data, and in the CT data only at the cost of image resolution. Using the implementations given, calculation times were sufficiently short for routine application in the clinic.
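
    Standard-form Tikhonov regularization of the pixel-wise deconvolution can be written compactly with the SVD. The sketch below uses a fixed regularization parameter and an assumed exponential arterial input function purely for illustration, rather than the L-curve or GCV parameter selection evaluated in the study.

    ```python
    import numpy as np

    def tikhonov_deconvolve(A, c, lam):
        """Solve min ||A r - c||^2 + lam^2 ||r||^2 via SVD filter factors.

        A maps the tissue residue function r to the measured concentration
        curve c; in bolus tracking, A is built from the arterial input function.
        """
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        f = s**2 / (s**2 + lam**2)                     # Tikhonov filter factors
        return Vt.T @ (f / s * (U.T @ c))

    # Toy problem with an assumed gamma-variate-like arterial input function.
    n, dt = 60, 1.0
    t = np.arange(n) * dt
    aif = t * np.exp(-t / 4.0)
    A = dt * np.array([[aif[i - j] if i >= j else 0.0 for j in range(n)]
                       for i in range(n)])             # lower-triangular convolution matrix
    r_true = np.exp(-t / 8.0)                          # true residue function
    c = A @ r_true + 0.01 * np.random.default_rng(0).standard_normal(n)
    r_est = tikhonov_deconvolve(A, c, lam=0.05)
    ```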

  18. Optimizing the implementation of the target motion sampling temperature treatment technique - How fast can it get?

    SciTech Connect

    Tuomas, V.; Jaakko, L.

    2013-07-01

    This article discusses the optimization of the target motion sampling (TMS) temperature treatment method, previously implemented in the Monte Carlo reactor physics code Serpent 2. The TMS method was introduced in [1] and first practical results were presented at the PHYSOR 2012 conference [2]. The method is a stochastic method for taking the effect of thermal motion into account on-the-fly in a Monte Carlo neutron transport calculation. It is based on sampling the target velocities at collision sites and then utilizing the 0 K cross sections at target-at-rest frame for reaction sampling. The fact that the total cross section becomes a distributed quantity is handled using rejection sampling techniques. The original implementation of the TMS requires 2.0 times more CPU time in a PWR pin-cell case than a conventional Monte Carlo calculation relying on pre-broadened effective cross sections. In a HTGR case examined in this paper the overhead factor is as high as 3.6. By first changing from a multi-group to a continuous-energy implementation and then fine-tuning a parameter affecting the conservativity of the majorant cross section, it is possible to decrease the overhead factors to 1.4 and 2.3, respectively. Preliminary calculations are also made using a new and yet incomplete optimization method in which the temperature of the basis cross section is increased above 0 K. It seems that with the new approach it may be possible to decrease the factors even as low as 1.06 and 1.33, respectively, but its functionality has not yet been proven. Therefore, these performance measures should be considered preliminary. (authors)
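
    The rejection-sampling step at the core of the TMS method can be sketched generically: sample a target velocity, evaluate a 0 K cross section at the resulting relative energy, and accept against a majorant. The one-dimensional kinematics, units, and cross-section model below are toy assumptions for illustration only, not the Serpent 2 implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sigma_0K(E_rel):
        """Toy 0 K total cross section with a 1/v-like low-energy shape (arbitrary units)."""
        return 5.0 + 30.0 / np.sqrt(E_rel)

    def sample_target_velocity(E_n, kT, majorant, max_tries=100_000):
        """Sample a target velocity component (1-D toy) seen by a neutron of energy E_n.

        Accept with probability sigma_0K(E_rel) * v_rel / majorant, which is how
        TMS handles the effective cross section becoming a distributed quantity.
        """
        v_n = np.sqrt(E_n)                             # toy units: E = v**2
        for _ in range(max_tries):
            v_t = rng.normal(0.0, np.sqrt(kT))         # Maxwellian velocity component
            v_rel = abs(v_n - v_t)
            if v_rel == 0.0:
                continue
            if rng.random() < sigma_0K(v_rel**2) * v_rel / majorant:
                return v_t
        raise RuntimeError("sampling failed; check the majorant")

    # The majorant must bound sigma_0K(E_rel) * v_rel over plausible relative speeds;
    # a more conservative (larger) majorant is safer but raises the rejection rate,
    # which is the trade-off tuned in the optimization described above.
    samples = [sample_target_velocity(E_n=1.0, kT=0.05, majorant=200.0) for _ in range(1000)]
    ```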

  19. Planning, Implementation and Optimization of Future Space Missions using an Immersive Visualization Environment (IVE) Machine

    NASA Astrophysics Data System (ADS)

    Harris, E.

    Planning, Implementation and Optimization of Future Space Missions using an Immersive Visualization Environment (IVE) Machine E. N. Harris, Lockheed Martin Space Systems, Denver, CO and George.W. Morgenthaler, U. of Colorado at Boulder History: A team of 3-D engineering visualization experts at the Lockheed Martin Space Systems Company have developed innovative virtual prototyping simulation solutions for ground processing and real-time visualization of design and planning of aerospace missions over the past 6 years. At the University of Colorado, a team of 3-D visualization experts are developing the science of 3-D visualization and immersive visualization at the newly founded BP Center for Visualization, which began operations in October, 2001. (See IAF/IAA-01-13.2.09, "The Use of 3-D Immersive Visualization Environments (IVEs) to Plan Space Missions," G. A. Dorn and G. W. Morgenthaler.) Progressing from Today's 3-D Engineering Simulations to Tomorrow's 3-D IVE Mission Planning, Simulation and Optimization Techniques: 3-D (IVEs) and visualization simulation tools can be combined for efficient planning and design engineering of future aerospace exploration and commercial missions. This technology is currently being developed and will be demonstrated by Lockheed Martin in the (IVE) at the BP Center using virtual simulation for clearance checks, collision detection, ergonomics and reach-ability analyses to develop fabrication and processing flows for spacecraft and launch vehicle ground support operations and to optimize mission architecture and vehicle design subject to realistic constraints. Demonstrations: Immediate aerospace applications to be demonstrated include developing streamlined processing flows for Reusable Space Transportation Systems and Atlas Launch Vehicle operations and Mars Polar Lander visual work instructions. Long-range goals include future international human and robotic space exploration missions such as the development of a Mars

  20. Optimized FPGA Implementation of Multi-Rate FIR Filters Through Thread Decomposition

    NASA Technical Reports Server (NTRS)

    Zheng, Jason Xin; Nguyen, Kayla; He, Yutao

    2010-01-01

    Multirate (decimation/interpolation) filters are among the essential signal processing components in spaceborne instruments where Finite Impulse Response (FIR) filters are often used to minimize nonlinear group delay and finite-precision effects. Cascaded (multi-stage) designs of Multi-Rate FIR (MRFIR) filters are further used for large rate change ratio, in order to lower the required throughput while simultaneously achieving comparable or better performance than single-stage designs. Traditional representation and implementation of MRFIR employ polyphase decomposition of the original filter structure, whose main purpose is to compute only the needed output at the lowest possible sampling rate. In this paper, an alternative representation and implementation technique, called TD-MRFIR (Thread Decomposition MRFIR), is presented. The basic idea is to decompose MRFIR into output computational threads, in contrast to a structural decomposition of the original filter as done in the polyphase decomposition. Each thread represents an instance of the finite convolution required to produce a single output of the MRFIR. The filter is thus viewed as a finite collection of concurrent threads. The technical details of TD-MRFIR will be explained, first showing its applicability to the implementation of downsampling, upsampling, and resampling FIR filters, and then describing a general strategy to optimally allocate the number of filter taps. A particular FPGA design of multi-stage TD-MRFIR for the L-band radar of NASA's SMAP (Soil Moisture Active Passive) instrument is demonstrated; and its implementation results in several targeted FPGA devices are summarized in terms of the functional (bit width, fixed-point error) and performance (time closure, resource usage, and power estimation) parameters.
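
    The thread-decomposition view can be shown in a few lines: every decimated output is an independent finite convolution over its own input window, in contrast to a polyphase restructuring of the filter. The tap values and decimation factor below are illustrative, not those of the SMAP design.

    ```python
    import numpy as np

    def td_mrfir_decimate(x, taps, M):
        """Decimate-by-M FIR in which every output is an independent 'thread'.

        Output y[k] is the finite convolution of the taps with the input window
        ending at sample k*M + len(taps) - 1; the loop over k is embarrassingly
        parallel, which is the essence of the thread decomposition.
        """
        N = len(taps)
        n_out = (len(x) - N) // M + 1
        y = np.empty(n_out)
        for k in range(n_out):                    # one 'thread' per output sample
            window = x[k * M : k * M + N][::-1]   # newest input sample first
            y[k] = np.dot(taps, window)
        return y

    # Illustrative 8-tap averaging filter with a decimation factor of 4.
    taps = np.ones(8) / 8.0
    x = np.sin(2 * np.pi * 0.01 * np.arange(256))
    y = td_mrfir_decimate(x, taps, M=4)
    print(len(x), "->", len(y))                   # 256 -> 63
    ```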

  1. Onboard optimized hardware implementation of JPEG-LS encoder based on FPGA

    NASA Astrophysics Data System (ADS)

    Wei, Wen; Lei, Jie; Li, Yunsong

    2012-10-01

    A novel FPGA-based hardware implementation of a JPEG-LS encoder is introduced in this paper. Using a look-ahead technique, the critical delay paths of the LOCO-I algorithm, such as the feedback loop for parameter updating, are improved, and an optimized JPEG-LS encoder architecture is proposed. In particular, the run-mode encoding process of JPEG-LS is also covered by the architecture. Experimental results show that the circuit complexity and memory consumption of the proposed structure are much lower, and the data processing speed much higher, than those of other available structures, making it well suited for onboard high-speed lossless compression of satellite remote sensing images.
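
    For reference, the fixed median edge detector (MED) prediction at the heart of LOCO-I/JPEG-LS is simple to state in software form. The hardware design above pipelines this logic together with the parameter-update feedback loops; the sketch below shows only the prediction rule itself.

    ```python
    def med_predict(a, b, c):
        """JPEG-LS / LOCO-I median edge detector (MED) prediction.

        a = left neighbor, b = above neighbor, c = upper-left neighbor of the
        current pixel; the rule switches between edge and planar prediction.
        """
        if c >= max(a, b):
            return min(a, b)
        if c <= min(a, b):
            return max(a, b)
        return a + b - c

    # With a dark left neighborhood and a bright pixel above, the brighter
    # neighbor is predicted, consistent with an edge passing through c and a.
    print(med_predict(a=10, b=200, c=10))   # 200
    ```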

  2. Electrocoagulation of commercial naphthalene sulfonates: process optimization and assessment of implementation potential.

    PubMed

    Olmez-Hanci, Tugba; Kartal, Zeynep; Arslan-Alaton, Idil

    2012-05-30

    The commercially important naphthalene sulfonate K-acid (C(10)H(9)NO(9)S(3); 2-naphthylamine-3,6,8-trisulfonic acid) was subjected to electrocoagulation employing stainless steel electrodes. An experimental design tool was used to mathematically describe and optimize the single and combined influences of major process variables on K-acid and its organic carbon (COD and TOC) removal efficiencies as well as electrical energy consumption. Current density, followed by treatment time, were found to be the parameters affecting process responses most significantly, whereas initial K-acid concentration had the least influence on the electrocoagulation performance. Process economics, including sludge generation, electrode consumption, and electrochemical efficiency, as well as organically bound adsorbable halogen formation and toxicity evolution, were primarily considered to assess the feasibility of K-acid electrocoagulation. Considering process economics and ecotoxicological parameters, process implementation appeared to be encouraging. PMID:22318240

  3. A Concept and Implementation of Optimized Operations of Airport Surface Traffic

    NASA Technical Reports Server (NTRS)

    Jung, Yoon C.; Hoang, Ty; Montoya, Justin; Gupta, Gautam; Malik, Waqar; Tobias, Leonard

    2010-01-01

    This paper presents a new concept of optimized surface operations at busy airports to improve the efficiency of taxi operations, as well as reduce environmental impacts. The suggested system architecture consists of the integration of two decoupled optimization algorithms. The Spot Release Planner provides sequence and timing advisories to tower controllers for releasing departure aircraft into the movement area to reduce taxi delay while achieving maximum throughput. The Runway Scheduler provides take-off sequence and arrival runway crossing sequence to the controllers to maximize the runway usage. The description of a prototype implementation of this integrated decision support tool for the airport control tower controllers is also provided. The prototype decision support tool was evaluated through a human-in-the-loop experiment, where both the Spot Release Planner and Runway Scheduler provided advisories to the Ground and Local Controllers. Initial results indicate the average number of stops made by each departure aircraft in the departure runway queue was reduced by more than half when the controllers were using the advisories, which resulted in reduced taxi times in the departure queue.

  4. Design and implementation of an automated compound management system in support of lead optimization.

    PubMed

    Quintero, Catherine; Kariv, Ilona

    2009-06-01

    To meet the needs of the increasingly rapid and parallelized lead optimization process, a fully integrated local compound storage and liquid handling system was designed and implemented to automate the generation of assay-ready plates directly from newly submitted and cherry-picked compounds. A key feature of the system is the ability to create project- or assay-specific compound-handling methods, which provide flexibility for any combination of plate types, layouts, and plate bar-codes. Project-specific workflows can be created by linking methods for processing new and cherry-picked compounds and control additions to produce a complete compound set for both biological testing and local storage in one uninterrupted workflow. A flexible cherry-pick approach allows for multiple, user-defined strategies to select the most appropriate replicate of a compound for retesting. Examples of custom selection parameters include available volume, compound batch, and number of freeze/thaw cycles. This adaptable and integrated combination of software and hardware provides a basis for reducing cycle time, fully automating compound processing, and ultimately increasing the rate at which accurate, biologically relevant results can be produced for compounds of interest in the lead optimization process. PMID:19487770

  5. Applying Reflective Middleware Techniques to Optimize a QoS-enabled CORBA Component Model Implementation

    NASA Technical Reports Server (NTRS)

    Wang, Nanbor; Parameswaran, Kirthika; Kircher, Michael; Schmidt, Douglas

    2003-01-01

    Although existing CORBA specifications, such as Real-time CORBA and CORBA Messaging, address many end-to-end quality-of-service (QoS) properties, they do not define strategies for configuring these properties into applications flexibly, transparently, and adaptively. Therefore, application developers must make these configuration decisions manually and explicitly, which is tedious, error-prone, and often sub-optimal. Although the recently adopted CORBA Component Model (CCM) does define a standard configuration framework for packaging and deploying software components, conventional CCM implementations focus on functionality rather than adaptive quality-of-service, which makes them unsuitable for next-generation applications with demanding QoS requirements. This paper presents three contributions to the study of middleware for QoS-enabled component-based applications. It outlines reflective middleware techniques designed to adaptively (1) select optimal communication mechanisms, (2) manage QoS properties of CORBA components in their containers, and (3) (re)configure selected component executors dynamically. Based on our ongoing research on CORBA and the CCM, we believe the application of reflective techniques to component middleware will provide a dynamically adaptive and (re)configurable framework for COTS software that is well-suited for the QoS demands of next-generation applications.

  6. Applying Reflective Middleware Techniques to Optimize a QoS-enabled CORBA Component Model Implementation

    NASA Technical Reports Server (NTRS)

    Wang, Nanbor; Kircher, Michael; Schmidt, Douglas C.

    2000-01-01

    Although existing CORBA specifications, such as Real-time CORBA and CORBA Messaging, address many end-to-end quality-of-service (QoS) properties, they do not define strategies for configuring these properties into applications flexibly, transparently, and adaptively. Therefore, application developers must make these configuration decisions manually and explicitly, which is tedious, error-prone, and often sub-optimal. Although the recently adopted CORBA Component Model (CCM) does define a standard configuration framework for packaging and deploying software components, conventional CCM implementations focus on functionality rather than adaptive quality-of-service, which makes them unsuitable for next-generation applications with demanding QoS requirements. This paper presents three contributions to the study of middleware for QoS-enabled component-based applications. It outlines reflective middleware techniques designed to adaptively: (1) select optimal communication mechanisms, (2) manage QoS properties of CORBA components in their containers, and (3) (re)configure selected component executors dynamically. Based on our ongoing research on CORBA and the CCM, we believe the application of reflective techniques to component middleware will provide a dynamically adaptive and (re)configurable framework for COTS software that is well-suited for the QoS demands of next-generation applications.

  7. Implementation and optimization of ultrasound signal processing algorithms on mobile GPU

    NASA Astrophysics Data System (ADS)

    Kong, Woo Kyu; Lee, Wooyoul; Kim, Kyu Cheol; Yoo, Yangmo; Song, Tai-Kyong

    2014-03-01

    A general-purpose graphics processing unit (GPGPU) has been used to improve computing power in medical ultrasound imaging systems. Recently, mobile GPUs have become powerful enough to handle 3D games and videos at high frame rates on Full HD or HD resolution displays. This paper proposes a method to implement ultrasound signal processing on the mobile GPU available in a high-end smartphone (Galaxy S4, Samsung Electronics, Seoul, Korea) with programmable shaders on the OpenGL ES 2.0 platform. To maximize the performance of the mobile GPU, the shader design was optimized and the load was shared between the vertex and fragment shaders. The beamformed data were captured from a tissue-mimicking phantom (Model 539 Multipurpose Phantom, ATS Laboratories, Inc., Bridgeport, CT, USA) using a commercial ultrasound imaging system equipped with a research package (Ultrasonix Touch, Ultrasonix, Richmond, BC, Canada). The real-time performance is evaluated by frame rates while varying the range of signal processing blocks. The implementation of ultrasound signal processing on OpenGL ES 2.0 was verified by analyzing the PSNR against a MATLAB gold standard with the same signal path; the CNR was also analyzed to verify the method. From the evaluations, the proposed mobile GPU-based processing method shows no significant difference from the MATLAB processing (i.e., PSNR<52.51 dB), and comparable CNR results were obtained from both methods (i.e., 11.31). With the mobile GPU implementation, frame rates of 57.6 Hz were achieved, and the total execution time was 17.4 ms, which is faster than the acquisition time (i.e., 34.4 ms). These results indicate that the mobile GPU-based processing method can support real-time ultrasound B-mode processing on the smartphone.

  8. Welding and joining: A compilation

    NASA Technical Reports Server (NTRS)

    1975-01-01

    A compilation is presented of NASA-developed technology in welding and joining. Topics discussed include welding equipment, techniques in welding, general bonding, joining techniques, and clamps and holding fixtures.

  9. Final report: Compiled MPI. Cost-Effective Exascale Application Development

    SciTech Connect

    Gropp, William Douglas

    2015-12-21

    This is the final report on Compiled MPI: Cost-Effective Exascale Application Development, and summarizes the results of this project. The project investigated runtime environments that improve the performance of MPI (Message-Passing Interface) programs; work at Illinois in the last period of this project looked at optimizing data accesses expressed with MPI datatypes.
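
    The kind of data-access pattern expressed with MPI datatypes can be shown with a small mpi4py sketch that sends one column of a row-major matrix using a vector datatype instead of packing it by hand. The matrix dimensions and the file name are illustrative assumptions; the sketch is not part of the project's deliverables.

    ```python
    # Run with: mpiexec -n 2 python send_column.py   (illustrative file name)
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    rows, cols = 4, 6
    # Strided vector datatype describing one column of a C-order rows x cols
    # array of doubles: 'rows' blocks of 1 element with stride 'cols'.
    column_t = MPI.DOUBLE.Create_vector(rows, 1, cols)
    column_t.Commit()

    if rank == 0:
        a = np.arange(rows * cols, dtype=np.float64).reshape(rows, cols)
        # Send column 0 straight out of the matrix buffer; no manual packing.
        comm.Send([a, 1, column_t], dest=1, tag=0)
    elif rank == 1:
        col = np.empty(rows, dtype=np.float64)
        comm.Recv([col, rows, MPI.DOUBLE], source=0, tag=0)
        print("received column:", col)        # [ 0.  6. 12. 18.]

    column_t.Free()
    ```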

  10. Optimization of the Implementation of Renewable Resources in a Municipal Electric Utility in Arizona

    NASA Astrophysics Data System (ADS)

    Cadorin, Anthony

    A municipal electric utility in Mesa, Arizona with a peak load of approximately 85 megawatts (MW) was analyzed to determine how the implementation of renewable resources (both wind and solar) would affect the overall cost of energy purchased by the utility. The utility currently purchases all of its energy through long term energy supply contracts and does not own any generation assets and so optimization was achieved by minimizing the overall cost of energy while adhering to specific constraints on how much energy the utility could purchase from the short term energy market. Scenarios were analyzed for a five percent and a ten percent penetration of renewable energy in the years 2015 and 2025. Demand Side Management measures (through thermal storage in the City's district cooling system, electric vehicles, and customers' air conditioning improvements) were evaluated to determine if they would mitigate some of the cost increases that resulted from the addition of renewable resources. In the 2015 simulation, wind energy was less expensive than solar to integrate to the supply mix. When five percent of the utility's energy requirements in 2015 are met by wind, this caused a 3.59% increase in the overall cost of energy. When that five percent is met by solar in 2015, it is estimated to cause a 3.62% increase in the overall cost of energy. A mix of wind and solar in 2015 caused a lower increase in the overall cost of energy of 3.57%. At the ten percent implementation level in 2015, solar, wind, and a mix of solar and wind caused increases of 7.28%, 7.51% and 7.27% respectively in the overall cost of energy. In 2025, at the five percent implementation level, wind and solar caused increases in the overall cost of energy of 3.07% and 2.22% respectively. In 2025, at the ten percent implementation level, wind and solar caused increases in the overall cost of energy of 6.23% and 4.67% respectively. Demand Side Management reduced the overall cost of energy by approximately 0

  11. Implementation and on-sky results of an optimal wavefront controller for the MMT NGS adaptive optics system

    NASA Astrophysics Data System (ADS)

    Powell, Keith B.; Vaitheeswaran, Vidhya

    2010-07-01

    The MMT observatory has recently implemented and tested an optimal wavefront controller for the NGS adaptive optics system. Open-loop atmospheric data collected at the telescope are used as the input to a MATLAB-based analytical model. The model uses nonlinear constrained minimization to determine controller gains and optimize the system performance. The real-time controller performing the adaptive optics closed-loop operation is implemented on a dedicated high-performance PC-based quad-core server. The controller algorithm is written in C and uses the GNU Scientific Library for linear algebra. Tests at the MMT confirmed that the optimal controller significantly reduced the residual RMS wavefront compared with the previous controller. Significant reductions in image FWHM and increased peak intensities were obtained in the J, H, and K bands. The optimal PID controller is now operating as the baseline wavefront controller for the MMT NGS-AO system.

  12. Model compilation: An approach to automated model derivation

    NASA Technical Reports Server (NTRS)

    Keller, Richard M.; Baudin, Catherine; Iwasaki, Yumi; Nayak, Pandurang; Tanaka, Kazuo

    1990-01-01

    An approach is introduced to automated model derivation for knowledge based systems. The approach, model compilation, involves procedurally generating the set of domain models used by a knowledge based system. With an implemented example, how this approach can be used to derive models of different precision and abstraction is illustrated, and models are tailored to different tasks, from a given set of base domain models. In particular, two implemented model compilers are described, each of which takes as input a base model that describes the structure and behavior of a simple electromechanical device, the Reaction Wheel Assembly of NASA's Hubble Space Telescope. The compilers transform this relatively general base model into simple task specific models for troubleshooting and redesign, respectively, by applying a sequence of model transformations. Each transformation in this sequence produces an increasingly more specialized model. The compilation approach lessens the burden of updating and maintaining consistency among models by enabling their automatic regeneration.

  13. TUNE: Compiler-Directed Automatic Performance Tuning

    SciTech Connect

    Hall, Mary

    2014-09-18

    This project has developed compiler-directed performance tuning technology targeting the Cray XT4 Jaguar system at Oak Ridge, which has multi-core Opteron nodes with SSE-3 SIMD extensions, and the Cray XE6 Hopper system at NERSC. To achieve this goal, we combined compiler technology for model-guided empirical optimization for memory hierarchies with SIMD code generation, which have been developed by the PIs over the past several years. We examined DOE Office of Science applications to identify performance bottlenecks and apply our system to computational kernels that operate on dense arrays. Our goal for this performance-tuning technology has been to yield hand-tuned levels of performance on DOE Office of Science computational kernels, while allowing application programmers to specify their computations at a high level without requiring manual optimization. Overall, we aim to make our technology for SIMD code generation and memory hierarchy optimization a crucial component of high-productivity Petaflops computing through a close collaboration with the scientists in national laboratories.

  14. Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms

    NASA Technical Reports Server (NTRS)

    Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader

    2004-01-01

    One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high-performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other representative image-processing algorithms, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a flight project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing (RC) technologies such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low-cost, compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithms, such as FFT2, image folding, and phase diversity, for the NASA Solar Viewing Interferometer Prototype (SVIP), using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with pre-hardware optimization of the control algorithms, for the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI), intended to study the greenhouse gases CO2, C2H, H2O, O3, O2, and N2O from the Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing
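
    A minimal software reference for the FFT2 kernel mentioned above can be written with NumPy. The low-pass mask is an illustrative stand-in for the instrument's actual image-folding and fringe-extraction steps.

    ```python
    import numpy as np

    def fft2_lowpass(image, keep_fraction=0.1):
        """Forward 2-D FFT, zero the high spatial frequencies, inverse transform.

        This FFT2 / per-pixel mask / inverse-FFT2 pattern is the kind of kernel
        a fast optics closed control loop must evaluate every frame.
        """
        F = np.fft.fftshift(np.fft.fft2(image))
        ny, nx = image.shape
        ky, kx = int(ny * keep_fraction), int(nx * keep_fraction)
        mask = np.zeros_like(F, dtype=bool)
        mask[ny // 2 - ky : ny // 2 + ky, nx // 2 - kx : nx // 2 + kx] = True
        return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

    frame = np.random.default_rng(0).standard_normal((256, 256))
    smoothed = fft2_lowpass(frame)
    ```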

  15. Overcoming obstacles in the implementation of factorial design for assay optimization.

    PubMed

    Shaw, Robert; Fitzek, Martina; Mouchet, Elizabeth; Walker, Graeme; Jarvis, Philip

    2015-03-01

    Factorial experimental design (FED) is a powerful approach for efficient optimization of robust in vitro assays: it enables cost and time savings while also improving assay quality. Although it is a well-known technique, there can be considerable barriers to overcome to fully exploit it within an industrial or academic organization. The article describes a tactical roll-out of FED to a group of scientists through: training that demystifies the technical components and concentrates on principles and examples; a user-friendly Excel-based tool for deconvoluting plate data; and output that focuses on graphical display of data over complex statistics. Historically, FED has generally been used in conjunction with automated technology; however, we have demonstrated a much broader impact of FED on the assay development process. The standardized approaches we have rolled out have helped to integrate FED as a fundamental part of assay development best practice because it can be used independently of the automation and vendor-supplied software. The techniques are applicable to different types of assay, both enzyme and cell, and can be used flexibly in manual and automated processes. This article describes the application of FED for a cellular assay. The challenges of selling FED concepts and rolling them out to a wide bioscience community, together with recommendations for good working practices and effective implementation, are discussed. The accessible nature of these approaches means FED can be used by industrial as well as academic users. PMID:25710279
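
    The core of a two-level full factorial design is only a few lines. The sketch below enumerates the runs for an assumed three-factor assay and estimates main effects, the kind of calculation the Excel-based tool described above wraps for bench scientists; the factor names and responses are made up for illustration.

    ```python
    from itertools import product

    factors = ["enzyme_conc", "incubation_time", "substrate_conc"]   # assumed factors

    # Full 2^3 factorial: every combination of low (-1) and high (+1) levels.
    runs = list(product([-1, 1], repeat=len(factors)))

    def main_effects(runs, responses):
        """Main effect of a factor = mean(response at high level) - mean(at low level)."""
        effects = {}
        for j, name in enumerate(factors):
            hi = [r for run, r in zip(runs, responses) if run[j] == 1]
            lo = [r for run, r in zip(runs, responses) if run[j] == -1]
            effects[name] = sum(hi) / len(hi) - sum(lo) / len(lo)
        return effects

    # Illustrative assay signals for the eight runs, in run order.
    responses = [52, 61, 55, 70, 80, 95, 84, 102]
    print(main_effects(runs, responses))
    ```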

  16. Optimizing Societal Benefit using a Systems Engineering Approach for Implementation of the GEOSS Space Segment

    NASA Technical Reports Server (NTRS)

    Killough, Brian D., Jr.; Sandford, Stephen P.; Cecil, L DeWayne; Stover, Shelley; Keith, Kim

    2008-01-01

    The Group on Earth Observations (GEO) is driving a paradigm shift in the Earth Observation community, refocusing Earth observing systems on GEO Societal Benefit Areas (SBA). Over the short history of space-based Earth observing systems, most decisions have been made based on improving our scientific understanding of the Earth, with the implicit assumption that this would serve society well in the long run. The space agencies responsible for developing the satellites used for global Earth observations are typically science driven. The innovation of GEO is the call for investments by space agencies to be driven by global societal needs. This paper presents the preliminary findings of an analysis focused on the observational requirements of the GEO Energy SBA. The analysis was performed by the Committee on Earth Observation Satellites (CEOS) Systems Engineering Office (SEO), which is responsible for facilitating the development of implementation plans that have the maximum potential for success while optimizing the benefit to society. The analysis utilizes a new taxonomy for organizing requirements, assesses the current gaps in space-based measurements and missions, assesses the impact of current and planned space-based missions, and presents a set of recommendations.

  17. Learning from colleagues about healthcare IT implementation and optimization: lessons from a medical informatics listserv.

    PubMed

    Adams, Martha B; Kaplan, Bonnie; Sobko, Heather J; Kuziemsky, Craig; Ravvaz, Kourosh; Koppel, Ross

    2015-01-01

    Communication among medical informatics communities can suffer from fragmentation across multiple forums, disciplines, and subdisciplines; variation among journals, vocabularies and ontologies; cost and distance. Online communities help overcome these obstacles, but may become onerous when listservs are flooded with cross-postings. Rich and relevant content may be ignored. The American Medical Informatics Association successfully addressed these problems when it created a virtual meeting place by merging the membership of four working groups into a single listserv known as the "Implementation and Optimization Forum." A communication explosion ensued, with thousands of interchanges, hundreds of topics, commentaries from "notables," neophytes, and students--many from different disciplines, countries, traditions. We discuss the listserv's creation, illustrate its benefits, and examine its lessons for others. We use examples from the lively, creative, deep, and occasionally conflicting discussions of user experiences--interchanges about medication reconciliation, open source strategies, nursing, ethics, system integration, and patient photos in the EMR--all enhancing knowledge, collegiality, and collaboration. PMID:25486893

  18. Low-Level Space Optimization of an AES Implementation for a Bit-Serial Fully Pipelined Architecture

    NASA Astrophysics Data System (ADS)

    Weber, Raphael; Rettberg, Achim

    A previously developed AES (Advanced Encryption Standard) implementation is optimized and described in this paper. The special architecture for which this implementation is targeted comprises synchronous and systematic bit-serial processing without a central controlling instance. In order to shrink the design in terms of logic utilization, we deeply analyzed the architecture and the AES implementation to identify the most costly logic elements. We propose to merge certain parts of the logic to achieve better area efficiency. The approach was integrated into an existing synthesis tool which we used to produce synthesizable VHDL code. For testing purposes, we simulated the generated VHDL code and ran tests on an FPGA board.

  19. Optimizing revenue cycle performance before, during, and after an EHR implementation.

    PubMed

    Schuler, Margaret; Berkebile, Jane; Vallozzi, Amanda

    2016-06-01

    An electronic health record implementation brings risks of adverse revenue cycle activity. Hospitals and health systems can mitigate that risk by taking a proactive, three-phase approach: identify potential issues prior to implementation; create teams to oversee operations during implementation; and hold regular meetings after implementation to ensure the system is running smoothly. PMID:27451570

  20. Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers

    NASA Technical Reports Server (NTRS)

    Su, Ernesto; Lain, Antonio; Ramaswamy, Shankar; Palermo, Daniel J.; Hodges, Eugene W., IV; Banerjee, Prithviraj

    1995-01-01

    The PARADIGM compiler project provides an automated means to parallelize programs, written in a serial programming model, for efficient execution on distributed-memory multicomputers. A previous implementation of the compiler based on the PTD representation allowed symbolic array sizes, affine loop bounds and array subscripts, and a variable number of processors, provided that arrays were single- or multi-dimensionally block distributed. The techniques presented here extend the compiler to also accept multidimensional cyclic and block-cyclic distributions within a uniform symbolic framework. These extensions demand more sophisticated symbolic manipulation capabilities. A novel aspect of our approach is to meet this demand by interfacing PARADIGM with a powerful off-the-shelf symbolic package, Mathematica. This paper describes some of the Mathematica routines that perform various transformations, shows how they are invoked and used by the compiler to overcome the new challenges, and presents experimental results for code involving cyclic and block-cyclic arrays as evidence of the feasibility of the approach.
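
    To illustrate the kind of index arithmetic a distribution-aware compiler must generate for block-cyclic data, the following sketch maps global array indices to (processor, local index) pairs for a one-dimensional block-cyclic distribution and back. It is a generic formulation, not PARADIGM's actual generated code or its Mathematica routines.

```python
# Generic 1-D block-cyclic index mapping (not PARADIGM's generated code).
def owner(g, b, p):
    """Processor that owns global index g under block size b on p processors."""
    return (g // b) % p

def to_local(g, b, p):
    """Local index of global element g on its owning processor."""
    block = g // (b * p)          # which local block on the owner
    return block * b + (g % b)    # offset within that block

def to_global(rank, l, b, p):
    """Inverse mapping: local index l on processor `rank` back to global index."""
    block, off = divmod(l, b)
    return (block * p + rank) * b + off

# Round-trip check for a small example: 2 processors, block size 3, 24 elements.
b, p = 3, 2
for g in range(24):
    r, l = owner(g, b, p), to_local(g, b, p)
    assert to_global(r, l, b, p) == g
print("block-cyclic round trip OK")
```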

  1. ZettaBricks: A Language Compiler and Runtime System for Anyscale Computing

    SciTech Connect

    Amarasinghe, Saman

    2015-03-27

    This grant supported the ZettaBricks and OpenTuner projects. ZettaBricks is a new implicitly parallel language and compiler where defining multiple implementations of multiple algorithms to solve a problem is the natural way of programming. ZettaBricks makes algorithmic choice a first-class construct of the language. Choices are provided in a way that also allows our compiler to tune at a finer granularity. The ZettaBricks compiler autotunes programs by making both fine-grained as well as algorithmic choices. Choices also include different automatic parallelization techniques, data distributions, algorithmic parameters, transformations, and blocking. Additionally, ZettaBricks introduces novel techniques to autotune algorithms for different convergence criteria. When choosing between various direct and iterative methods, the ZettaBricks compiler is able to tune a program in such a way that it delivers near-optimal efficiency for any desired level of accuracy. The compiler has the flexibility of utilizing different convergence criteria for the various components within a single algorithm, providing the user with accuracy choice alongside algorithmic choice. OpenTuner is a generalization of the experience gained in building an autotuner for ZettaBricks. OpenTuner is a new open source framework for building domain-specific multi-objective program autotuners. OpenTuner supports fully-customizable configuration representations, an extensible technique representation to allow for domain-specific techniques, and an easy-to-use interface for communicating with the program to be autotuned. A key capability inside OpenTuner is the use of ensembles of disparate search techniques simultaneously; techniques that perform well will dynamically be allocated a larger proportion of tests.
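
    The following is a minimal, generic autotuning loop in the spirit of what OpenTuner automates; it uses plain random search over a small configuration space and a placeholder measurement function, and does not use OpenTuner's actual API. The configuration parameters and the cost model are illustrative assumptions.

```python
# Minimal autotuning loop in the spirit of OpenTuner (not its real API).
# The configuration space and the measure() cost model are illustrative assumptions.
import random

space = {
    "block_size": [16, 32, 64, 128],
    "unroll": [1, 2, 4, 8],
    "parallelism": [1, 2, 4, 8, 16],
}

def measure(cfg):
    """Placeholder for compiling/running the target program and timing it."""
    return (abs(cfg["block_size"] - 64) / 64
            + abs(cfg["unroll"] - 4) / 4
            + abs(cfg["parallelism"] - 8) / 8
            + random.uniform(0, 0.05))

def random_config():
    return {k: random.choice(v) for k, v in space.items()}

best_cfg, best_cost = None, float("inf")
for _ in range(200):                      # fixed tuning budget
    cfg = random_config()
    cost = measure(cfg)
    if cost < best_cost:
        best_cfg, best_cost = cfg, cost

print("best configuration:", best_cfg, "cost:", round(best_cost, 3))
```

    A real autotuner would replace pure random search with an ensemble of search techniques and would persist results across runs, as the record describes.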

  2. Pragmatic Randomized Optimal Platelet and Plasma Ratios (PROPPR) Trial: Design, rationale and implementation

    PubMed Central

    Baraniuk, Sarah; Tilley, Barbara C.; del Junco, Deborah J.; Fox, Erin E.; van Belle, Gerald; Wade, Charles E.; Podbielski, Jeanette M.; Beeler, Angela M.; Hess, John R.; Bulger, Eileen M.; Schreiber, Martin A.; Inaba, Kenji; Fabian, Timothy C.; Kerby, Jeffrey D.; Cohen, Mitchell J.; Miller, Christopher N.; Rizoli, Sandro; Scalea, Thomas M.; O’Keeffe, Terence; Brasel, Karen J.; Cotton, Bryan A.; Muskat, Peter; Holcomb, John B.

    2014-01-01

    Background Forty percent of in-hospital deaths among injured patients involve massive truncal hemorrhage. These deaths may be prevented with rapid hemorrhage control and improved resuscitation techniques. The Pragmatic Randomized Optimal Platelet and Plasma Ratios (PROPPR) Trial was designed to determine if there is a difference in mortality between subjects who received different ratios of FDA approved blood products. This report describes the design and implementation of PROPPR. Study Design PROPPR was designed as a randomized, two-group, Phase III trial conducted in subjects with the highest level of trauma activation and predicted to have a massive transfusion. Subjects at 12 North American level 1 trauma centers were randomized into one of two standard transfusion ratio interventions: 1:1:1 or 1:1:2, (plasma, platelets, and red blood cells). Clinical data and serial blood samples were collected under Exception from Informed Consent (EFIC) regulations. Co-primary mortality endpoints of 24 hours and 30 days were evaluated. Results Between August 2012 and December 2013, 680 patients were randomized. The overall median time from admission to randomization was 26 minutes. PROPPR enrolled at higher than expected rates with fewer than expected protocol deviations. Conclusion PROPPR is the largest randomized study to enroll severely bleeding patients. This study showed that rapidly enrolling and successfully providing randomized blood products to severely injured patients in an EFIC study is feasible. PROPPR was able to achieve these goals by utilizing a collaborative structure and developing successful procedures and design elements that can be part of future trauma studies. PMID:24996573

  3. Optimizing the Physical Implementation of an Eddy-covariance System to Minimize Flow Distortion

    NASA Astrophysics Data System (ADS)

    Durden, D.; Zulueta, R. C.; Durden, N. P.; Metzger, S.; Luo, H.; Duvall, B.

    2015-12-01

    The eddy-covariance technique is widely applied to observe the exchange of energy and scalars between the earth's surface and its atmosphere. In practice, fast (≥10 Hz) sonic anemometry and enclosed infrared gas spectroscopy are used to determine fluctuations in the 3-D wind vector and trace gas concentrations, respectively. Here, two contradicting requirements need to be fulfilled: (i) the sonic anemometer and trace gas analyzer should sample the same air volume, while (ii) the presence of the gas analyzer should not affect the wind field measured by the 3-D sonic anemometer. To determine the optimal positioning of these instruments with respect to each other, a trade-off study was performed. Theoretical formulations were used to determine a range of positions between the sonic anemometer and the gas analyzer that minimize the sum of (i) decorrelation error and (ii) wind blocking error. Subsequently, the blocking error induced by the presence of the gas sampling system was experimentally tested for a range of wind directions to verify the model-predicted placement: In a controlled environment the sonic anemometer was placed in the directed flow from a fan outfitted with a large shroud, with and without the presence of the enclosed gas analyzer and its sampling system. Blocking errors were enhanced by up to 10% for wind directions deviating ≥130° from frontal, when the flow was coming from the side where the enclosed gas analyzer was mounted. Consequently, we suggest a lateral position of the enclosed gas analyzer towards the aerodynamic wake of the tower, as data from this direction is likely affected by tower-induced flow distortion already. Ultimately, this physical implementation of the sonic anemometer and enclosed gas analyzer resulted in decorrelation and blocking errors ≤5% for ≥70% of all wind directions. These findings informed the design of the National Ecological Observatory Network's (NEON) eddy-covariance system, which is currently being

  4. Cables and connectors: A compilation

    NASA Technical Reports Server (NTRS)

    1974-01-01

    A technological compilation on devices and techniques for various types of electrical cables and connections is presented. Data are reported under three sections: flat conductor cable technology, newly developed electrical connectors, and miscellaneous articles and information on cables and connector techniques.

  5. 1988 Bulletin compilation and index

    SciTech Connect

    1989-02-01

    This document is published to provide current information about the national program for managing spent fuel and high-level radioactive waste. This document is a compilation of issues from the 1988 calendar year. A table of contents and one index have been provided to assist in finding information.

  6. Yes! An object-oriented compiler compiler (YOOCC)

    SciTech Connect

    Avotins, J.; Mingins, C.; Schmidt, H.

    1995-12-31

    Grammar-based processor generation is one of the most widely studied areas in language processor construction. However, there have been very few approaches to date that reconcile object-oriented principles, processor generation, and an object-oriented language. Pertinent here also is that, currently, developing a processor using the Eiffel Parse libraries requires far too much time to be expended on tasks that can be automated. For these reasons, we have developed YOOCC (Yes! an Object-Oriented Compiler Compiler), which produces a processor framework from a grammar using an enhanced version of the Eiffel Parse libraries, incorporating the ideas hypothesized by Meyer, and Grape and Walden, as well as many others. Various essential changes have been made to the Eiffel Parse libraries. Examples are presented to illustrate the development of a processor using YOOCC, and it is concluded that the Eiffel Parse libraries are now not only an intelligent, but also a productive option for processor construction.

  7. Utilizing object-oriented design to build advanced optimization strategies with generic implementation

    SciTech Connect

    Eldred, M.S.; Hart, W.E.; Bohnhoff, W.J.; Romero, V.J.; Hutchinson, S.A.; Salinger, A.G.

    1996-08-01

    The benefits of applying optimization to computational models are well known, but the range of their widespread application to date has been limited. This effort attempts to extend the disciplinary areas to which optimization algorithms may be readily applied through the development and application of advanced optimization strategies capable of handling the computational difficulties associated with complex simulation codes. Towards this goal, a flexible software framework is under continued development for the application of optimization techniques to broad classes of engineering applications, including those with high computational expense and nonsmooth, nonconvex design space features. Object-oriented software design with C++ has been employed as a tool in providing a flexible, extensible, and robust multidisciplinary toolkit for use with computationally intensive simulations. In this paper, demonstrations of advanced optimization strategies using the software are presented in the hybridization and parallel processing research areas. Performance of the advanced strategies is compared with a benchmark nonlinear programming optimization.

  8. Compiler-assisted multiple instruction rollback recovery using a read buffer

    NASA Technical Reports Server (NTRS)

    Alewine, N. J.; Chen, S.-K.; Fuchs, W. K.; Hwu, W.-M.

    1993-01-01

    Multiple instruction rollback (MIR) is a technique that has been implemented in mainframe computers to provide rapid recovery from transient processor failures. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs have also been developed which remove rollback data hazards directly with data-flow transformations. This paper focuses on compiler-assisted techniques to achieve multiple instruction rollback recovery. We observe that some data hazards resulting from instruction rollback can be resolved efficiently by providing an operand read buffer while others are resolved more efficiently with compiler transformations. A compiler-assisted multiple instruction rollback scheme is developed which combines hardware-implemented data redundancy with compiler-driven hazard removal transformations. Experimental performance evaluations indicate improved efficiency over previous hardware-based and compiler-based schemes.

  9. Developing an Onboard Traffic-Aware Flight Optimization Capability for Near-Term Low-Cost Implementation

    NASA Technical Reports Server (NTRS)

    Wing, David J.; Ballin, Mark G.; Koczo, Stefan, Jr.; Vivona, Robert A.; Henderson, Jeffrey M.

    2013-01-01

    The concept of Traffic Aware Strategic Aircrew Requests (TASAR) combines Automatic Dependent Surveillance Broadcast (ADS-B) IN and airborne automation to enable user-optimal in-flight trajectory replanning and to increase the likelihood of Air Traffic Control (ATC) approval for the resulting trajectory change request. TASAR is designed as a near-term application to improve flight efficiency or other user-desired attributes of the flight while not impacting and potentially benefiting ATC. Previous work has indicated the potential for significant benefits for each TASAR-equipped aircraft. This paper will discuss the approach to minimizing TASAR's cost for implementation and accelerating readiness for near-term implementation.

  10. Implementation of reactive and predictive real-time control strategies to optimize dry stormwater detention ponds

    NASA Astrophysics Data System (ADS)

    Gaborit, Étienne; Anctil, François; Vanrolleghem, Peter A.; Pelletier, Geneviève

    2013-04-01

    Dry detention ponds have been widely implemented in the U.S.A. (National Research Council, 1993) and Canada (Shammaa et al. 2002) to mitigate the impacts of urban runoff on receiving water bodies. The aim of such structures is to allow a temporary retention of the water during rainfall events, decreasing runoff velocities and volumes (by infiltration in the pond) as well as providing some water quality improvement from sedimentation. The management of dry detention ponds currently relies on static control through a fixed pre-designed limitation of their maximum outflow (Middleton and Barrett 2008), for example via a proper choice of their outlet pipe diameter. Because these ponds are designed for large storms, typically 1- or 2-hour duration rainfall events with return periods between 5 and 100 years, one of their main drawbacks is that they generally offer almost no retention for smaller rainfall events (Middleton and Barrett 2008), which are by definition much more common. Real-Time Control (RTC) has a high potential for optimizing retention time (Marsalek 2005) because it allows adopting operating strategies that are flexible and hence more suitable to the prevailing fluctuating conditions than static control. For dry ponds, this would basically imply adapting the outlet opening percentage to maximize water retention time, while being able to open it completely for severe storms. This study developed several enhanced RTC scenarios of a dry detention pond located at the outlet of a small urban catchment near Québec City, Canada, following the previous work of Muschalla et al. (2009). The catchment's runoff quantity and TSS concentration were simulated by a SWMM5 model with an improved wash-off formulation. The control procedures rely on rainfall detection and measurements of the pond's water height for the reactive schemes, and on rainfall forecasts in addition to these variables for the predictive schemes. The automatic reactive control schemes implemented
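
    As an illustration of what a reactive control rule of this kind can look like, the sketch below adjusts a pond outlet opening from the water level and a rain-detection flag; all thresholds, rates, and the toy simulation are hypothetical assumptions, not the authors' control procedures or their SWMM5 model.

```python
# Toy reactive control rule for a dry detention pond outlet (not the authors'
# actual procedure; all thresholds and the simulated inflow are hypothetical).
def reactive_opening(water_level_m, rain_detected, max_level_m=2.0):
    """Return the outlet opening fraction in [0, 1] from current measurements."""
    if water_level_m >= 0.9 * max_level_m:
        return 1.0          # near capacity: open fully to avoid overflow
    if rain_detected:
        return 0.1          # storm ongoing: throttle outflow to build retention
    if water_level_m > 0.05:
        return 0.3          # dry weather with stored water: slow, controlled release
    return 0.0              # pond essentially empty: keep the outlet closed

# Toy simulation: one hour of rain followed by dry weather, 5-minute time steps.
level, dt = 0.0, 300.0                          # metres, seconds
for step in range(48):
    raining = step < 12
    inflow_rate = 4e-4 if raining else 0.0      # level rise per second (hypothetical)
    opening = reactive_opening(level, raining)
    outflow_rate = 2e-4 * opening               # level drop per second at full opening
    level = max(0.0, min(2.0, level + (inflow_rate - outflow_rate) * dt))
    print(f"t = {step * dt / 60:5.1f} min   level = {level:4.2f} m   opening = {opening:.1f}")
```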

  11. Lower bound of optimization in radiological protection system taking account of practical implementation of clearance

    SciTech Connect

    Hattori, Takatoshi

    2007-07-01

    The dose criterion used to derive clearance and exemption levels is of the order of 0.01 mSv/y based on the Basic Safety Standard (BSS) of the International Atomic Energy Agency (IAEA), the use of which has been agreed upon by many countries. It is important for human beings, who are facing the fact that global resources for risk reduction are limited, to carefully consider the practical implementation of radiological protection systems, particularly for low-radiation-dose regions. For example, in direct gamma ray monitoring, to achieve clearance level compliance, difficult issues on how the uncertainty (error) of gamma measurement should be handled and also how the uncertainty (scattering) of the estimation of non-gamma emitters should be treated in clearance must be resolved. To resolve these issues, a new probabilistic approach has been proposed to establish an appropriate safety factor for compliance with the clearance level in Japan. This approach is based on the fundamental concept that 0.1 mSv/y should be complied with at the 97.5th percentile of the probability distribution for the uncertainties of both the measurement and the estimation of non-gamma emitters. The International Commission on Radiological Protection (ICRP) published a new concept of the representative person in Publication 101 Part I. The representative person is a hypothetical person exposed to a dose that is representative of those of highly exposed persons in a population. In a probabilistic dose assessment, the ICRP recommends that the representative person should be defined such that the probability is lower than about 5% that a person randomly selected from the population receives a higher dose. From the new concept of the ICRP, it is reasonable to consider that the 95th percentile of the dose distribution for the representative person is theoretically always lower than the dose constraint. Using this established relationship, it can be concluded that the minimum dose

  12. Electronic control circuits: A compilation

    NASA Technical Reports Server (NTRS)

    1973-01-01

    A compilation of technical R and D information on circuits and modular subassemblies is presented as a part of a technology utilization program. Fundamental design principles and applications are given. Electronic control circuits discussed include: anti-noise circuit; ground protection device for bioinstrumentation; temperature compensation for operational amplifiers; hybrid gatling capacitor; automatic signal range control; integrated clock-switching control; and precision voltage tolerance detector.

  13. Optimizing State Policy Implementation: The Case of the Scientific Based Research Components of the NCLB Act

    ERIC Educational Resources Information Center

    Mohammed, Shereeza; Pisapia, John; Walker, David A.

    2009-01-01

    A hypothesized model of state implementation of federal policy was extracted from empirical studies to discover the strategies states can use to gain compliance more cost effectively. Sixteen factors were identified and applied to the implementation of the Scientific Based Research provisions of the No Child Left Behind Act. Data collected from…

  14. Optimal Full Information Synthesis for Flexible Structures Implemented on Cray Supercomputers

    NASA Technical Reports Server (NTRS)

    Lind, Rick; Balas, Gary J.

    1995-01-01

    This paper considers an algorithm for synthesis of optimal controllers for full information feedback. The synthesis procedure reduces to a single linear matrix inequality which may be solved via established convex optimization algorithms. The computational cost of the optimization is investigated. It is demonstrated that the problem dimension and corresponding matrices can become large for practical engineering problems. This algorithm represents a process that is impractical for standard workstations for large order systems. A flexible structure is presented as a design example. Control synthesis requires several days on a workstation but may be solved in a reasonable amount of time using a Cray supercomputer.
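
    The abstract reduces the synthesis to a single linear matrix inequality solved by convex optimization. As a minimal illustration of solving an LMI with an off-the-shelf convex solver (not the paper's full-information synthesis LMI), the sketch below finds a Lyapunov certificate P with P positive definite and A'P + PA negative definite using the cvxpy package; the example system matrix is an assumption.

```python
# Minimal LMI feasibility sketch with cvxpy: find P > 0 with A^T P + P A < 0.
# This Lyapunov-stability LMI is used purely as an illustration; it is not the
# paper's full-information controller synthesis LMI.
import cvxpy as cp
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # a stable example system matrix (assumed)
n = A.shape[0]
eps = 1e-6

P = cp.Variable((n, n), symmetric=True)
constraints = [P >> eps * np.eye(n),
               A.T @ P + P @ A << -eps * np.eye(n)]

prob = cp.Problem(cp.Minimize(0), constraints)   # pure feasibility problem
prob.solve()
print("status:", prob.status)
print("P =\n", P.value)
```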

  15. Rooted-tree network for optimal non-local gate implementation

    NASA Astrophysics Data System (ADS)

    Vyas, Nilesh; Saha, Debashis; Panigrahi, Prasanta K.

    2016-06-01

    A general quantum network for implementing non-local control-unitary gates between remote parties at minimal entanglement cost is shown to be a rooted-tree structure. Starting from a five-party scenario, we demonstrate the local implementation of a simultaneous class of control-unitary (Hermitian) and multiparty control-unitary gates in an arbitrary n-party network. Previously established networks turn out to be special cases of this general construct.

  16. A new module for constrained multi-fragment geometry optimization in internal coordinates implemented in the MOLCAS package.

    PubMed

    Vysotskiy, Victor P; Boström, Jonas; Veryazov, Valera

    2013-11-15

    A parallel procedure for effective optimization of the relative position and orientation between two or more fragments has been implemented in the MOLCAS program package. By design, the procedure does not perturb the electronic structure of the system under study. The original composite system is divided into frozen fragments, and the internal coordinates linking those fragments are the only optimized parameters. The procedure is capable of handling fully independent fragments (no border atoms) as well as fragments connected by covalent bonds. In the framework of the procedure, the optimization of the relative position and orientation of the fragments is carried out in internal "Z-matrix" coordinates using numerical derivatives. The total number of required single-point energy evaluations scales with the number of fragments rather than with the total number of atoms in the system. The accuracy and performance of the procedure have been studied by test calculations for a representative set of two- and three-fragment molecules with artificially distorted structures. The developed approach exhibits robust and smooth convergence to the reference optimal structures. As only a few internal coordinates are varied during the procedure, the proposed constrained fragment geometry optimization can be afforded even for high-level ab initio methods like CCSD(T) and CASPT2. This capability has been demonstrated by applying the method to two larger cases: CCSD(T) and CASPT2 calculations on a positively charged benzene-lithium complex and on an oxygen molecule interacting with an iron porphyrin molecule, respectively. PMID:24006272
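
    The core idea, optimizing only a handful of inter-fragment coordinates against an expensive energy function, can be sketched in a few lines; the example below keeps fragment A frozen, places fragment B through six rigid-body parameters, and minimizes a placeholder Lennard-Jones energy with numerical derivatives. This is illustrative only and is not the MOLCAS module; a real run would call an ab initio energy instead of the toy potential.

```python
# Sketch of constrained inter-fragment optimization: only the rigid placement
# (translation + rotation) of fragment B is optimized, against a placeholder
# pairwise energy. Not the MOLCAS implementation.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

frag_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])     # frozen fragment A
frag_b0 = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])    # fragment B, own frame

def place_b(params):
    """params = [tx, ty, tz, rx, ry, rz]: rigid placement of fragment B."""
    t, rotvec = params[:3], params[3:]
    return Rotation.from_rotvec(rotvec).apply(frag_b0) + t

def energy(params):
    """Placeholder pairwise Lennard-Jones energy between the two fragments."""
    b = place_b(params)
    d = np.linalg.norm(frag_a[:, None, :] - b[None, :, :], axis=-1)
    return np.sum(4.0 * (d**-12 - d**-6))

x0 = np.array([3.0, 0.5, 0.0, 0.0, 0.0, 0.2])   # distorted starting placement
# Only 6 inter-fragment degrees of freedom; gradients come from finite differences.
res = minimize(energy, x0, method="BFGS")
print("optimal inter-fragment parameters:", np.round(res.x, 3))
print("final energy:", round(res.fun, 4))
```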

  17. Optimizing Blocking and Nonblocking Reduction Operations for Multicore Systems: Hierarchical Design and Implementation

    SciTech Connect

    Gorentla Venkata, Manjunath; Shamis, Pavel; Graham, Richard L; Ladd, Joshua S; Sampath, Rahul S

    2013-01-01

    Many scientific simulations, using the Message Passing Interface (MPI) programming model, are sensitive to the performance and scalability of reduction collective operations such as MPI Allreduce and MPI Reduce. These operations are the most widely used abstractions to perform mathematical operations over all processes that are part of the simulation. In this work, we propose a hierarchical design to implement the reduction operations on multicore systems. This design aims to improve the efficiency of reductions by 1) tailoring the algorithms and customizing the implementations for various communication mechanisms in the system, 2) providing the ability to configure the depth of hierarchy to match the system architecture, and 3) providing the ability to progress each level of this hierarchy independently. Using this design, we implement MPI Allreduce and MPI Reduce operations (and their nonblocking variants MPI Iallreduce and MPI Ireduce) for all message sizes, and evaluate them on multiple architectures including InfiniBand and Cray XT5. We leverage and enhance our existing infrastructure, Cheetah, a framework for implementing hierarchical collective operations, to implement these reductions. The experimental results show that the Cheetah reduction operations outperform production-grade MPI implementations such as the Open MPI default, Cray MPI, and MVAPICH2, demonstrating their efficiency, flexibility and portability. On InfiniBand systems, with a microbenchmark, a 512-process Cheetah nonblocking Allreduce and Reduce achieve a speedup of 23x and 10x, respectively, compared to the default Open MPI reductions. The blocking variants of the reduction operations also show similar performance benefits. A 512-process nonblocking Cheetah Allreduce achieves a speedup of 3x, compared to the default MVAPICH2 Allreduce implementation. On a Cray XT5 system, a 6144-process Cheetah Allreduce outperforms the Cray MPI by 145%. The evaluation with an application kernel, Conjugate
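
    A minimal sketch of the hierarchical idea, reduce within each shared-memory node, combine across node leaders, then broadcast back inside the node, is shown below using mpi4py communicator splitting. This is not the Cheetah framework; the two-level hierarchy and the test payload are assumptions for the example.

```python
# Minimal hierarchical Allreduce sketch with mpi4py (not the Cheetah framework):
# reduce within each node first, allreduce across node leaders, then broadcast
# back inside the node. Run with: mpirun -n <N> python hier_allreduce.py
from mpi4py import MPI
import numpy as np

world = MPI.COMM_WORLD

# Level 1: a communicator per shared-memory node.
node = world.Split_type(MPI.COMM_TYPE_SHARED)

# Level 2: a communicator containing one leader (node rank 0) per node.
is_leader = node.Get_rank() == 0
leaders = world.Split(color=0 if is_leader else MPI.UNDEFINED, key=world.Get_rank())

local = np.full(4, world.Get_rank(), dtype="d")   # each rank contributes its rank id
partial = np.zeros_like(local)

node.Reduce(local, partial, op=MPI.SUM, root=0)   # intra-node reduction
if is_leader:
    leaders.Allreduce(MPI.IN_PLACE, partial, op=MPI.SUM)   # inter-node reduction
node.Bcast(partial, root=0)                        # redistribute inside the node

expected = sum(range(world.Get_size()))
assert np.allclose(partial, expected)
if world.Get_rank() == 0:
    print("hierarchical allreduce result:", partial)
```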

  18. Efficient implementation and application of the artificial bee colony algorithm to low-dimensional optimization problems

    NASA Astrophysics Data System (ADS)

    von Rudorff, Guido Falk; Wehmeyer, Christoph; Sebastiani, Daniel

    2014-06-01

    We adapt a swarm-intelligence-based optimization method (the artificial bee colony algorithm, ABC) to enhance its parallel scaling properties and to improve the escaping behavior from deep local minima. Specifically, we apply the approach to the geometry optimization of Lennard-Jones clusters. We illustrate the performance and the scaling properties of the parallelization scheme for several system sizes (5-20 particles). Our main findings are specific recommendations for ranges of the parameters of the ABC algorithm which yield maximal performance for Lennard-Jones clusters and Morse clusters. The suggested parameter ranges for these different interaction potentials turn out to be very similar; thus, we believe that our reported values are fairly general for the ABC algorithm applied to chemical optimization problems.
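
    A compact, single-threaded sketch of the employed-bee and scout moves on a small Lennard-Jones cluster is shown below; the colony size, trial limit, and search box are illustrative assumptions, not the tuned parameter ranges reported in the paper, and the paper's parallelization scheme is omitted.

```python
# Compact, single-threaded sketch of an artificial-bee-colony-style search on a
# small Lennard-Jones cluster. Parameter values are illustrative, not the tuned
# ranges from the paper, and the parallelization is omitted.
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_food, limit, cycles = 5, 10, 20, 300

def lj_energy(x):
    pos = x.reshape(-1, 3)
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    r = d[np.triu_indices(len(pos), k=1)]
    return np.sum(4.0 * (r**-12 - r**-6))

# Food sources = candidate cluster geometries.
foods = rng.uniform(-1.5, 1.5, size=(n_food, 3 * n_atoms))
energies = np.array([lj_energy(f) for f in foods])
trials = np.zeros(n_food, dtype=int)

for _ in range(cycles):
    for i in range(n_food):
        # Employed-bee move: perturb one coordinate toward/away from a random peer.
        k = rng.integers(n_food)
        j = rng.integers(3 * n_atoms)
        cand = foods[i].copy()
        cand[j] += rng.uniform(-1, 1) * (foods[i][j] - foods[k][j])
        e = lj_energy(cand)
        if e < energies[i]:
            foods[i], energies[i], trials[i] = cand, e, 0
        else:
            trials[i] += 1
        # Scout move: abandon a source stuck in a local minimum for too long.
        if trials[i] > limit:
            foods[i] = rng.uniform(-1.5, 1.5, size=3 * n_atoms)
            energies[i], trials[i] = lj_energy(foods[i]), 0

print("best LJ-5 energy found:", round(energies.min(), 4))
```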

  19. Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Alewine, Neal Jon

    1993-01-01

    Multiple instruction rollback (MIR) is a technique to provide rapid recovery from transient processor failures and has been implemented in hardware in mainframe computers. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs were also developed which remove rollback data hazards directly with data flow manipulations, thus eliminating the need for most data redundancy hardware. Compiler-assisted techniques to achieve multiple instruction rollback recovery are addressed. It is observed that some data hazards resulting from instruction rollback can be resolved more efficiently by providing hardware redundancy while others are resolved more efficiently with compiler transformations. A compiler-assisted multiple instruction rollback scheme is developed which combines hardware-implemented data redundancy with compiler-driven hazard removal transformations. Experimental performance evaluations were conducted which indicate improved efficiency over previous hardware-based and compiler-based schemes. Various enhancements to the compiler transformations and to the data redundancy hardware developed for the compiler-assisted MIR scheme are described and evaluated. The final topic deals with the application of compiler-assisted MIR techniques to aid in exception repair and branch repair in a speculative execution architecture.

  20. Exploration of Optimization Options for Increasing Performance of a GPU Implementation of a Three-dimensional Bilateral Filter

    SciTech Connect

    Bethel, E. Wes; Bethel, E. Wes

    2012-01-06

    This report explores using GPUs as a platform for performing high performance medical image data processing, specifically smoothing using a 3D bilateral filter, which performs anisotropic, edge-preserving smoothing. The algorithm consists of running a specialized 3D convolution kernel over a source volume to produce an output volume. Overall, our objective is to understand what algorithmic design choices and configuration options lead to optimal performance of this algorithm on the GPU. We explore the performance impact of using different memory access patterns, of using different types of device/on-chip memories, of using strictly aligned and unaligned memory, and of varying the size/shape of thread blocks. Our results reveal optimal configuration parameters for our algorithm when executed on a sample 3D medical data set, and show performance gains ranging from 30x to over 200x as compared to a single-threaded CPU implementation.

  1. SVD-based optimal filtering for noise reduction in dual microphone hearing aids: a real time implementation and perceptual evaluation.

    PubMed

    Maj, Jean-Baptiste; Royackers, Liesbeth; Moonen, Marc; Wouters, Jan

    2005-09-01

    In this paper, the first real-time implementation and perceptual evaluation of a singular value decomposition (SVD)-based optimal filtering technique for noise reduction in a dual-microphone behind-the-ear (BTE) hearing aid is presented. This evaluation was carried out for a speech-weighted noise and multitalker babble, for single and multiple jammer sound source scenarios. Two basic microphone configurations in the hearing aid were used. The SVD-based optimal filtering technique was compared against an adaptive beamformer, which is known to give significant improvements in speech intelligibility in noisy environments. The optimal filtering technique works without assumptions about a speaker position, unlike the two-stage adaptive beamformer. However, this strategy needs a robust voice activity detector (VAD). A method to improve the performance of the VAD was presented and evaluated physically. By connecting the VAD to the output of the noise reduction algorithms, a good discrimination between the speech-and-noise periods and the noise-only periods of the signals was obtained. The perceptual experiments demonstrated that the SVD-based optimal filtering technique could perform as well as the adaptive beamformer in a single noise source scenario, i.e., the ideal scenario for the latter technique, and could outperform the adaptive beamformer in multiple noise source scenarios. PMID:16189969

  2. Compiling global name-space parallel loops for distributed execution

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush

    1991-01-01

    Distributed memory machines do not provide hardware support for a global address space. Thus programmers are forced to partition the data across the memories of the architecture and use explicit message passing to communicate data between processors. The compiler support required to allow programmers to express their algorithms using a global name-space is examined. A general method is presented for analysis of a high level source program and its translation into a set of independently executing tasks communicating via messages. If the compiler has enough information, this translation can be carried out at compile time. Otherwise, run-time code is generated to implement the required data movement. The analysis required in both situations is described and the performance of the generated code on the Intel iPSC/2 is presented.

  3. Compiling global name-space programs for distributed execution

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush

    1990-01-01

    Distributed memory machines do not provide hardware support for a global address space. Thus programmers are forced to partition the data across the memories of the architecture and use explicit message passing to communicate data between processors. The compiler support required to allow programmers to express their algorithms using a global name-space is examined. A general method is presented for analysis of a high level source program and its translation into a set of independently executing tasks communicating via messages. If the compiler has enough information, this translation can be carried out at compile-time. Otherwise, run-time code is generated to implement the required data movement. The analysis required in both situations is described and the performance of the generated code on the Intel iPSC/2 is presented.

  4. Distributed memory compiler design for sparse problems

    NASA Technical Reports Server (NTRS)

    Wu, Janet; Saltz, Joel; Berryman, Harry; Hiranandani, Seema

    1991-01-01

    A compiler and runtime support mechanism is described and demonstrated. The methods presented are capable of solving a wide range of sparse and unstructured problems in scientific computing. The compiler takes as input a FORTRAN 77 program enhanced with specifications for distributing data, and the compiler outputs a message passing program that runs on a distributed memory computer. The runtime support for this compiler is a library of primitives designed to efficiently support irregular patterns of distributed array accesses and irregular distributed array partitions. A variety of Intel iPSC/860 performance results obtained through the use of this compiler are presented.
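
    The runtime primitives this record describes follow the inspector/executor pattern: analyze the irregular index list once, build a communication schedule, then reuse it on every iteration. The sketch below simulates two "processors" in a single Python process to show the two phases; it is illustrative only and is not the compiler's actual runtime library.

```python
# Minimal inspector/executor sketch for irregular distributed-array access
# (illustrative only; not the compiler's runtime primitives). Two "processors"
# are simulated in one process; in the real system each would hold its piece
# and exchange messages.
import numpy as np

# Global array of 10 elements block-distributed across 2 owners (5 each).
pieces = {0: np.arange(0, 5) * 10.0, 1: np.arange(5, 10) * 10.0}
block = 5

# Irregular index list used locally, e.g. x[col[i]] inside a sparse kernel.
col = np.array([7, 2, 9, 2, 5, 0])

# --- Inspector phase: analyze the indices once, build a communication schedule.
schedule = {}                      # owner -> local indices needed from that owner
for g in np.unique(col):
    schedule.setdefault(int(g) // block, []).append(int(g) % block)

# --- Executor phase: "communicate" the needed off-processor values, then compute.
ghost = {}                         # global index -> fetched value
for owner, locals_ in schedule.items():
    for l in locals_:
        ghost[owner * block + l] = pieces[owner][l]   # stands in for a message

gathered = np.array([ghost[int(g)] for g in col])
print("gathered values:", gathered)    # the schedule is reused every iteration
```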

  5. Approximate knowledge compilation: The first order case

    SciTech Connect

    Val, A. del

    1996-12-31

    Knowledge compilation procedures make a knowledge base more explicit so as make inference with respect to the compiled knowledge base tractable or at least more efficient. Most work to date in this area has been restricted to the propositional case, despite the importance of first order theories for expressing knowledge concisely. Focusing on (LUB) approximate compilation, our contribution is twofold: (1) We present a new ground algorithm for approximate compilation which can produce exponential savings with respect to the previously known algorithm. (2) We show that both ground algorithms can be lifted to the first order case preserving their correctness for approximate compilation.

  6. Optimized ECC Implementation for Secure Communication between Heterogeneous IoT Devices

    PubMed Central

    Marin, Leandro; Piotr Pawlowski, Marcin; Jara, Antonio

    2015-01-01

    The Internet of Things is integrating information systems, places, users and billions of constrained devices into one global network. This network requires secure and private means of communications. The building blocks of the Internet of Things are devices manufactured by various producers and are designed to fulfil different needs. There would be no common hardware platform that could be applied in every scenario. In such a heterogeneous environment, there is a strong need for the optimization of interoperable security. We present optimized Elliptic Curve Cryptography algorithms that address the security issues in heterogeneous IoT networks. We have combined cryptographic algorithms for the NXP/Jennic 5148- and MSP430-based IoT devices and used them to create a novel key negotiation protocol. PMID:26343677

  7. Optimized ECC Implementation for Secure Communication between Heterogeneous IoT Devices.

    PubMed

    Marin, Leandro; Pawlowski, Marcin Piotr; Jara, Antonio

    2015-01-01

    The Internet of Things is integrating information systems, places, users and billions of constrained devices into one global network. This network requires secure and private means of communications. The building blocks of the Internet of Things are devices manufactured by various producers and are designed to fulfil different needs. There would be no common hardware platform that could be applied in every scenario. In such a heterogeneous environment, there is a strong need for the optimization of interoperable security. We present optimized Elliptic Curve Cryptography algorithms that address the security issues in heterogeneous IoT networks. We have combined cryptographic algorithms for the NXP/Jennic 5148- and MSP430-based IoT devices and used them to create a novel key negotiation protocol. PMID:26343677
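
    To show the ECC key-agreement building block these records rely on (not the authors' optimized implementation or their negotiation protocol), the sketch below performs an ECDH exchange with the Python `cryptography` package; the curve choice and derived-key parameters are assumptions for the example.

```python
# Illustration of the ECDH key-agreement building block using the `cryptography`
# package. This is not the authors' optimized implementation or protocol; curve
# choice and derived-key parameters are assumptions for the example.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each device generates an ephemeral key pair on the same curve.
device_a_private = ec.generate_private_key(ec.SECP256R1())
device_b_private = ec.generate_private_key(ec.SECP256R1())

# Public keys are exchanged over the (insecure) network.
shared_a = device_a_private.exchange(ec.ECDH(), device_b_private.public_key())
shared_b = device_b_private.exchange(ec.ECDH(), device_a_private.public_key())
assert shared_a == shared_b

# Derive a symmetric session key from the shared secret.
session_key = HKDF(algorithm=hashes.SHA256(), length=16, salt=None,
                   info=b"iot-session").derive(shared_a)
print("negotiated 128-bit session key:", session_key.hex())
```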

  8. Optimal parameters for clinical implementation of breast cancer patient setup using Varian DTS software.

    PubMed

    Ng, Sook Kien; Zygmanski, Piotr; Jeung, Andrew; Mostafavi, Hassan; Hesser, Juergen; Bellon, Jennifer R; Wong, Julia S; Lyatskaya, Yulia

    2012-01-01

    Digital tomosynthesis (DTS) was evaluated as an alternative to cone-beam computed tomography (CBCT) for patient setup. DTS is preferable when there are constraints with setup time, gantry-couch clearance, and imaging dose using CBCT. This study characterizes DTS data acquisition and registration parameters for the setup of breast cancer patients using nonclinical Varian DTS software. DTS images were reconstructed from CBCT projections acquired on phantoms and patients with surgical clips in the target volume. A shift-and-add algorithm was used for DTS volume reconstructions, while automated cross-correlation matches were performed within Varian DTS software. Triangulation on two short DTS arcs separated by various angular spread was done to improve 3D registration accuracy. Software performance was evaluated on two phantoms and ten breast cancer patients using the registration result as an accuracy measure; investigated parameters included arc lengths, arc orientations, angular separation between two arcs, reconstruction slice spacing, and number of arcs. The shifts determined from DTS-to-CT registration were compared to the shifts based on CBCT-to-CT registration. The difference between these shifts was used to evaluate the software accuracy. After findings were quantified, optimal parameters for the clinical use of DTS technique were determined. It was determined that at least two arcs were necessary for accurate 3D registration for patient setup. Registration accuracy of 2 mm was achieved when the reconstruction arc length was > 5° for clips with HU ≥ 1000; larger arc length (≥ 8°) was required for very low HU clips. An optimal arc separation was found to be ≥ 20° and optimal arc length was 10°. Registration accuracy did not depend on DTS slice spacing. DTS image reconstruction took 10-30 seconds and registration took less than 20 seconds. The performance of Varian DTS software was found suitable for the accurate setup of breast cancer patients

  9. Implementation and Optimization of miniGMG - a Compact Geometric Multigrid Benchmark

    SciTech Connect

    Williams, Samuel; Kalamkar, Dhiraj; Singh, Amik; Deshpande, Anand M.; Straalen, Brian Van; Smelyanskiy, Mikhail; Almgren, Ann; Dubey, Pradeep; Shalf, John; Oliker, Leonid

    2012-12-01

    Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems used in a number of different application areas. In this report, we describe miniGMG, our compact geometric multigrid benchmark designed to proxy the multigrid solves found in AMR applications. We explore optimization techniques for geometric multigrid on existing and emerging multicore systems including the Opteron-based Cray XE6, Intel Sandy Bridge and Nehalem-based Infiniband clusters, as well as manycore-based architectures including NVIDIA's Fermi and Kepler GPUs and Intel's Knights Corner (KNC) co-processor. This report examines a variety of novel techniques including communication-aggregation, threaded wavefront-based DRAM communication-avoiding, dynamic threading decisions, SIMDization, and fusion of operators. We quantify performance through each phase of the V-cycle for both single-node and distributed-memory experiments and provide detailed analysis for each class of optimization. Results show our optimizations yield significant speedups across a variety of subdomain sizes while simultaneously demonstrating the potential of multi- and manycore processors to dramatically accelerate single-node performance. However, our analysis also indicates that improvements in networks and communication will be essential to reap the potential of manycore processors in large-scale multigrid calculations.
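
    A compact 1-D Poisson V-cycle in NumPy is sketched below to illustrate the smooth/restrict/prolongate/correct structure that miniGMG benchmarks in 3-D; it is not miniGMG itself and omits all of the report's communication and threading optimizations. The grid size, smoother, and transfer operators are standard textbook choices assumed for the example.

```python
# Compact 1-D Poisson V-cycle (weighted Jacobi, full-weighting restriction,
# linear-interpolation prolongation). Illustrative only; not miniGMG.
import numpy as np

def smooth(u, f, h, sweeps=2):
    """Weighted-Jacobi smoother (omega = 2/3) for -u'' = f with zero boundaries."""
    for _ in range(sweeps):
        u[1:-1] += (2.0 / 3.0) * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1] - 2.0 * u[1:-1])
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def v_cycle(u, f, h):
    if u.size <= 3:                                   # coarsest grid: just smooth
        return smooth(u, f, h, sweeps=50)
    u = smooth(u, f, h)                               # pre-smooth
    r = residual(u, f, h)
    rc = r[::2].copy()                                # restrict (full weighting)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    ec = v_cycle(np.zeros_like(rc), rc, 2.0 * h)      # coarse-grid correction
    e = np.zeros_like(u)                              # prolongate (linear interp.)
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return smooth(u + e, f, h)                        # post-smooth

n = 129
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
f = np.pi ** 2 * np.sin(np.pi * x)                    # exact solution: sin(pi x)
u = np.zeros(n)
for it in range(8):
    u = v_cycle(u, f, h)
    print(f"V-cycle {it + 1}: max residual {np.abs(residual(u, f, h)).max():.2e}")
```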

  10. Decoding and optimized implementation of SECDED codes over GF(q)

    DOEpatents

    Ward, H. Lee; Ganti, Anand; Resnick, David R

    2013-10-22

    A plurality of columns for a check matrix that implements a distance d linear error correcting code are populated by providing a set of vectors from which to populate the columns, and applying to the set of vectors a filter operation that reduces the set by eliminating therefrom all vectors that would, if used to populate the columns, prevent the check matrix from satisfying a column-wise linear independence requirement associated with check matrices of distance d linear codes. One of the vectors from the reduced set may then be selected to populate one of the columns. The filtering and selecting repeats iteratively until either all of the columns are populated or the number of currently unpopulated columns exceeds the number of vectors in the reduced set. Columns for the check matrix may be processed to reduce the amount of logic needed to implement the check matrix in circuit logic.

  11. Decoding and optimized implementation of SECDED codes over GF(q)

    DOEpatents

    Ward, H Lee; Ganti, Anand; Resnick, David R

    2014-11-18

    A plurality of columns for a check matrix that implements a distance d linear error correcting code are populated by providing a set of vectors from which to populate the columns, and applying to the set of vectors a filter operation that reduces the set by eliminating therefrom all vectors that would, if used to populate the columns, prevent the check matrix from satisfying a column-wise linear independence requirement associated with check matrices of distance d linear codes. One of the vectors from the reduced set may then be selected to populate one of the columns. The filtering and selecting repeats iteratively until either all of the columns are populated or the number of currently unpopulated columns exceeds the number of vectors in the reduced set. Columns for the check matrix may be processed to reduce the amount of logic needed to implement the check matrix in circuit logic.

  12. Design, decoding and optimized implementation of SECDED codes over GF(q)

    DOEpatents

    Ward, H Lee; Ganti, Anand; Resnick, David R

    2014-06-17

    A plurality of columns for a check matrix that implements a distance d linear error correcting code are populated by providing a set of vectors from which to populate the columns, and applying to the set of vectors a filter operation that reduces the set by eliminating therefrom all vectors that would, if used to populate the columns, prevent the check matrix from satisfying a column-wise linear independence requirement associated with check matrices of distance d linear codes. One of the vectors from the reduced set may then be selected to populate one of the columns. The filtering and selecting repeats iteratively until either all of the columns are populated or the number of currently unpopulated columns exceeds the number of vectors in the reduced set. Columns for the check matrix may be processed to reduce the amount of logic needed to implement the check matrix in circuit logic.
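
    The column-population procedure described in these three patent records can be illustrated in the binary case: when adding a column, filter out candidates that are zero, duplicate an existing column, or equal the GF(2) sum (XOR) of two existing columns, so that every three columns remain linearly independent and the code has distance 4 (SECDED). The sketch below is that GF(2) specialization only; the patents treat general GF(q), and this is not the patented construction itself.

```python
# GF(2) illustration of greedy check-matrix column population with a linear-
# independence filter (distance-4 / SECDED). The patents cover general GF(q);
# this binary sketch is illustrative only.
from itertools import combinations

def populate_columns(r, n_columns):
    """Pick n_columns r-bit columns (as integers) for a distance-4 check matrix."""
    chosen = []
    for c in range(1, 2 ** r):                 # candidate nonzero r-bit vectors
        if len(chosen) == n_columns:
            break
        # Filter: drop c if it duplicates a chosen column or equals the XOR of
        # two chosen columns; either would make some 3 columns dependent.
        if c in chosen or any(c == x ^ y for x, y in combinations(chosen, 2)):
            continue
        chosen.append(c)
    return chosen

cols = populate_columns(r=6, n_columns=22)     # e.g. a (22, 16) SECDED layout
print(f"selected {len(cols)} columns:", [format(c, '06b') for c in cols])
```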

  13. Implementation of a Surgical Safety Checklist: Interventions to Optimize the Process and Hints to Increase Compliance

    PubMed Central

    Sendlhofer, Gerald; Mosbacher, Nina; Karina, Leitgeb; Kober, Brigitte; Jantscher, Lydia; Berghold, Andrea; Pregartner, Gudrun; Brunner, Gernot; Kamolz, Lars Peter

    2015-01-01

    Background A surgical safety checklist (SSC) was implemented and routinely evaluated within our hospital. The purpose of this study was to analyze compliance, knowledge of and satisfaction with the SSC to determine further improvements. Methods The implementation of the SSC was observed in a pilot unit. After roll-out into each operating theater, compliance with the SSC was routinely measured. To assess subjective and objective knowledge, as well as satisfaction with the SSC implementation, an online survey (N = 891) was performed. Results During two test runs in a piloting unit, 305 operations were observed, 175 in test run 1 and 130 in test run 2. The SSC was used in 77.1% of all operations in test run 1 and in 99.2% in test run 2. Within used SSCs, completion rates were 36.3% in test run 1 and 1.6% in test run 2. After roll-out, three unannounced audits took place and showed that the SSC was used in 95.3%, 91.9% and 89.9%. Within used SSCs, completion rates decreased from 81.7% to 60.6% and 53.2%. In 2014, 164 (18.4%) operating team members responded to the online survey, 160 of which were included in the analysis. 146 (91.3%) consultants and nursing staff reported to use the SSC regularly in daily routine. Conclusion These data show that the implementation of new tools such as the adapted WHO SSC needs constant supervision and instruction until it becomes self-evident and accepted. Further efforts, consisting mainly of hands-on leadership and training are necessary. PMID:25658317

  14. Medical image denoising via optimal implementation of non-local means on hybrid parallel architecture.

    PubMed

    Nguyen, Tuan-Anh; Nakib, Amir; Nguyen, Huy-Nam

    2016-06-01

    The non-local means denoising filter has been established as a gold standard for the image denoising problem in general, and particularly in medical imaging, due to its efficiency. However, its computation time has limited its use in real-world applications, especially in medical imaging. In this paper, a distributed version on a parallel hybrid architecture is proposed to solve the computation time problem, and a new method to compute the filter's coefficients is also proposed, where we focused on the implementation and the enhancement of the filter's parameters by taking the neighborhood of the current voxel more accurately into account. In terms of implementation, our key contribution consists in reducing the number of shared memory accesses. The different tests of the proposed method were performed on the BrainWeb database for different levels of noise. Performance and sensitivity were quantified in terms of speedup, peak signal-to-noise ratio, execution time, and the number of floating point operations. The obtained results demonstrate the efficiency of the proposed method. Moreover, the implementation is compared to that of other techniques recently published in the literature. PMID:27084318

  15. GALEX 1st Light Compilation

    NASA Technical Reports Server (NTRS)

    2003-01-01

    This compilation shows the constellation Hercules, as imaged on May 21 and 22 by NASA's Galaxy Evolution Explorer. The images were captured by the two channels of the spacecraft camera during the mission's 'first light' milestone.

    The Galaxy Evolution Explorer first light images are dedicated to the crew of the Space Shuttle Columbia. The Hercules region was directly above Columbia when it made its last contact with NASA Mission Control on February 1, over the skies of Texas.

    The Galaxy Evolution Explorer launched on April 28 on a mission to map the celestial sky in the ultraviolet and determine the history of star formation in the universe over the last 10 billion years.

  16. Retargeting of existing FORTRAN program and development of parallel compilers

    NASA Technical Reports Server (NTRS)

    Agrawal, Dharma P.

    1988-01-01

    The software models used in implementing the parallelizing compiler for the B-HIVE multiprocessor system are described. The various models and strategies used in the compiler development are: the flexible granularity model, which allows a compromise between two extreme granularity models; the communication model, which is capable of precisely describing interprocessor communication timings and patterns; the loop type detection strategy, which identifies different types of loops; the critical path with coloring scheme, which is a versatile scheduling strategy for any multicomputer with some associated communication costs; and the loop allocation strategy, which realizes optimum overlapped operations between computation and communication of the system. Using these models, several sample routines of the AIR3D package are examined and tested. It may be noted that the automatically generated codes are highly parallelized to provide the maximized degree of parallelism, obtaining speedup on systems of up to 28 to 32 processors. A comparison of parallel codes for both the existing and proposed communication models is performed and the corresponding expected speedup factors are obtained. The experimentation shows that the B-HIVE compiler produces more efficient codes than existing techniques. Work is progressing well in completing the final phase of the compiler. Numerous enhancements are needed to improve the capabilities of the parallelizing compiler.

  17. Expected treatment dose construction and adaptive inverse planning optimization: Implementation for offline head and neck cancer adaptive radiotherapy

    SciTech Connect

    Yan Di; Liang Jian

    2013-02-15

    Adaptive treatment modification can be implemented by including the expected treatment dose in the adaptive inverse planning optimization. The retrospective evaluation results demonstrate that, utilizing the weekly adaptive inverse planning optimization, the dose distribution of head and neck cancer treatment can be largely improved.

  18. Optimized FPGA Implementation of Multi-Rate FIR Filters Through Thread Decomposition

    NASA Technical Reports Server (NTRS)

    Kobayashi, Kayla N.; He, Yutao; Zheng, Jason X.

    2011-01-01

    Multi-rate finite impulse response (MRFIR) filters are among the essential signal-processing components in spaceborne instruments where finite impulse response filters are often used to minimize nonlinear group delay and finite precision effects. Cascaded (multistage) designs of MRFIR filters are further used for large rate change ratios in order to lower the required throughput, while simultaneously achieving comparable or better performance than single-stage designs. Traditional representation and implementation of MRFIR employ polyphase decomposition of the original filter structure, whose main purpose is to compute only the needed output at the lowest possible sampling rate. In this innovation, an alternative representation and implementation technique called TD-MRFIR (Thread Decomposition MRFIR) is presented. The basic idea is to decompose MRFIR into output computational threads, in contrast to a structural decomposition of the original filter as done in the polyphase decomposition. A naive implementation of a decimation filter consisting of a full FIR followed by a downsampling stage is very inefficient, as most of the computations performed by the FIR stage are discarded through downsampling. In fact, only 1/M of the total computations are useful (M being the decimation factor). Polyphase decomposition provides an alternative view of decimation filters, where the downsampling occurs before the FIR stage, and the outputs are viewed as the sum of M sub-filters with lengths of N/M taps. Although this approach leads to more efficient filter designs, in general the implementation is not straightforward if the number of multipliers needs to be minimized. In TD-MRFIR, each thread represents an instance of the finite convolution required to produce a single output of the MRFIR. The filter is thus viewed as a finite collection of concurrent threads. Each of the threads completes when a convolution result (filter output value) is computed, and activated when the first
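
    The per-output view that TD-MRFIR organizes into hardware threads can be contrasted with the naive "filter everything, then discard" decimator in a few lines of NumPy; the sketch below is illustrative only (the filter length, decimation factor, and signal are assumptions) and is not the FPGA implementation.

```python
# NumPy sketch contrasting a naive decimation filter (filter at full rate, then
# downsample) with computing only the needed outputs, i.e. the per-output view
# that TD-MRFIR maps to hardware threads. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
M = 4                                    # decimation factor
h = rng.normal(size=32)                  # FIR taps
x = rng.normal(size=256)                 # input signal

# Naive: full-rate FIR followed by downsampling; (M-1)/M of the work is discarded.
y_naive = np.convolve(x, h)[::M]

# Output-centric: each decimated output y[n] is one independent finite convolution,
# y[n] = sum_k h[k] * x[n*M - k]; each such computation maps to one "thread".
def output_thread(n):
    acc = 0.0
    for k in range(len(h)):
        i = n * M - k
        if 0 <= i < len(x):
            acc += h[k] * x[i]
    return acc

y_threads = np.array([output_thread(n) for n in range(len(y_naive))])
assert np.allclose(y_naive, y_threads)
print("per-output computation matches filter-then-decimate:", len(y_threads), "outputs")
```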

  19. Economic Implementation and Optimization of Secondary Oil Recovery Process: St. Mary West Field, Lafayette County, Arkansas

    SciTech Connect

    Brock P.E., Cary D.

    2003-03-10

    The purpose of this study was to investigate the economic appropriateness of several enhanced oil recovery processes that are available to a small mature oil field located in southwest Arkansas and to implement the most economic efficient process evaluated. The State of Arkansas natural resource laws require that an oilfield is to be unitized before conducting a secondary recovery project. This requires all properties that can reasonably be determined to include the oil productive reservoir must be bound together as one common lease by a legal contract that must be approved to be fair and equitable to all property owners within the proposed unit area.

  20. Parallel Implementations Of The Nelder-Mead Simplex Algorithm For Unconstrained Optimization

    NASA Astrophysics Data System (ADS)

    Dennis, J. E.; Torczon, Virginia

    1988-04-01

    We are interested in implementing direct search methods on parallel computers to solve the unconstrained minimization problem: given a function f : ℝⁿ → ℝ, find an x ∈ ℝⁿ that minimizes f(x). Our preliminary work has focused on the Nelder-Mead simplex algorithm. The origin of the algorithm can be found in a 1962 paper by Spendley, Hext and Himsworth [1]; Nelder and Mead [2] proposed an adaptive version which proved to be much more robust in practice. Dennis and Woods [3] give a clear presentation of the standard Nelder-Mead simplex algorithm; Woods [4] includes a more complete discussion of implementation details as well as some preliminary convergence results. Since descriptions of the standard Nelder-Mead simplex algorithm appear in Nelder and Mead [2], Dennis and Woods [3], and Woods [4], we will limit our introductory discussion to the advantages and disadvantages of the algorithm, as well as some of the features which make it so popular. We then outline the approaches we have taken and discuss our preliminary results. We conclude with a discussion of future research and some observations about our findings.
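
    For reference, the serial algorithm the paper parallelizes is available off the shelf; the short example below runs SciPy's Nelder-Mead implementation on the Rosenbrock function. The test function, starting point, and tolerances are assumptions for the example, and the parallel variants discussed in the paper are not part of SciPy.

```python
# Short usage example of the (serial) Nelder-Mead simplex method via SciPy on the
# Rosenbrock function; the paper's parallel variants are not shown here.
import numpy as np
from scipy.optimize import minimize, rosen

x0 = np.array([-1.2, 1.0, -0.5, 2.0])
result = minimize(rosen, x0, method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-8,
                           "maxiter": 20000, "maxfev": 20000})

print("converged:", result.success)
print("minimizer:", np.round(result.x, 6))       # expected near [1, 1, 1, 1]
print("function evaluations:", result.nfev)
```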

  1. A highly scalable parallel computation strategy and optimized implementation for Fresnel Seismic Tomography

    NASA Astrophysics Data System (ADS)

    Gao, Yongan; Zhao, Changhai; Li, Chuang; Yan, Haihua; Zhao, Liang

    2013-03-01

    Fresnel seismic tomography, which uses a huge amount of seismic data, is an efficient methodology for researching the three-dimensional structure of the earth. However, in practical application it confronts two key challenges: enormous data volume and huge computation. It is difficult to accomplish the computation tasks under a normal operating environment with conventional computation strategies. In this paper, a Job-By-Application parallel computation strategy, which uses MPI (Message Passing Interface) and Pthread hybrid programming models on a cluster, is designed to implement Fresnel seismic tomography; this method solves the problem of allocating tasks dynamically and effectively improves the load balancing and scalability of the system. We also adopted a cached I/O strategy to accommodate the limited memory resources. Experimental results demonstrated that the program implemented on these strategies could complete the actual job within the ideal time; the running of the program was stable, achieved load balancing, showed a good speedup and could adapt to the hardware environment of insufficient memory.
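
    A minimal mpi4py master/worker sketch of dynamic task allocation, in the spirit of the Job-By-Application strategy, is shown below; the task list and the placeholder kernel are assumptions, and the authors' hybrid MPI+Pthread code and cached I/O strategy are not reproduced.

```python
# Minimal mpi4py master/worker sketch of dynamic task allocation (illustrative;
# not the authors' hybrid MPI+Pthread implementation).
# Run with: mpirun -n <N> python jobs.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
TASK, STOP = 1, 2

if rank == 0:                                   # master: hand out tasks on demand
    tasks = list(range(100))                    # stand-ins for tomography jobs
    status = MPI.Status()
    active = size - 1
    results = []
    while active > 0:
        msg = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if msg is not None:
            results.append(msg)
        worker = status.Get_source()
        if tasks:
            comm.send(tasks.pop(0), dest=worker, tag=TASK)
        else:
            comm.send(None, dest=worker, tag=STOP)
            active -= 1
    print("master collected", len(results), "results")
else:                                           # workers: request, compute, repeat
    comm.send(None, dest=0)                     # initial "ready" message
    status = MPI.Status()
    while True:
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == STOP:
            break
        comm.send(task * task, dest=0)          # placeholder for a Fresnel kernel
```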

  2. Direct Methods for Predicting Movement Biomechanics Based Upon Optimal Control Theory with Implementation in OpenSim.

    PubMed

    Porsa, Sina; Lin, Yi-Chung; Pandy, Marcus G

    2016-08-01

    The aim of this study was to compare the computational performances of two direct methods for solving large-scale, nonlinear, optimal control problems in human movement. Direct shooting and direct collocation were implemented on an 8-segment, 48-muscle model of the body (24 muscles on each side) to compute the optimal control solution for maximum-height jumping. Both algorithms were executed on a freely-available musculoskeletal modeling platform called OpenSim. Direct collocation converged to essentially the same optimal solution up to 249 times faster than direct shooting when the same initial guess was assumed (3.4 h of CPU time for direct collocation vs. 35.3 days for direct shooting). The model predictions were in good agreement with the time histories of joint angles, ground reaction forces and muscle activation patterns measured for subjects jumping to their maximum achievable heights. Both methods converged to essentially the same solution when started from the same initial guess, but computation time was sensitive to the initial guess assumed. Direct collocation demonstrates exceptional computational performance and is well suited to performing predictive simulations of movement using large-scale musculoskeletal models. PMID:26715209
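
    The transcription idea behind direct collocation can be illustrated on a toy problem: a double integrator with trapezoidal defect constraints, solved as a nonlinear program. The sketch below uses SciPy's SLSQP solver and has nothing to do with OpenSim or the 48-muscle model; it only shows how states, controls, and defect constraints become decision variables and equality constraints.

        # Toy direct-collocation transcription (trapezoidal defects) for a double
        # integrator: minimize the integral of u^2 over [0,1] subject to
        # x' = v, v' = u, x(0)=v(0)=0, x(1)=1, v(1)=0.  Sketch of the idea only.
        import numpy as np
        from scipy.optimize import minimize

        N = 20                 # number of intervals
        h = 1.0 / N            # uniform step
        nx = N + 1             # number of grid nodes

        def unpack(z):
            return z[:nx], z[nx:2*nx], z[2*nx:]   # x, v, u

        def objective(z):
            _, _, u = unpack(z)
            return h * np.sum(0.5 * (u[:-1]**2 + u[1:]**2))   # trapezoidal quadrature

        def defects(z):
            x, v, u = unpack(z)
            dx = x[1:] - x[:-1] - 0.5 * h * (v[:-1] + v[1:])  # x' = v
            dv = v[1:] - v[:-1] - 0.5 * h * (u[:-1] + u[1:])  # v' = u
            bc = [x[0], v[0], x[-1] - 1.0, v[-1]]             # boundary conditions
            return np.concatenate([dx, dv, bc])

        # Initial guess: linear ramp in x, zeros elsewhere.
        z0 = np.concatenate([np.linspace(0.0, 1.0, nx), np.zeros(nx), np.zeros(nx)])
        sol = minimize(objective, z0, method="SLSQP",
                       constraints={"type": "eq", "fun": defects},
                       options={"maxiter": 500})
        x, v, u = unpack(sol.x)
        print("converged:", sol.success, " u(0) ~", u[0])   # analytic optimum has u(0) = 6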

  3. JAVA implemented MSE optimal bit-rate allocation applied to 3-D hyperspectral imagery using JPEG2000 compression

    NASA Astrophysics Data System (ADS)

    Melchor, J. L., Jr.; Cabrera, S. D.; Aguirre, A.; Kosheleva, O. M.; Vidal, E., Jr.

    2005-08-01

    This paper describes an efficient algorithm and its Java implementation for a recently developed mean-squared error (MSE) rate-distortion optimal (RDO) inter-slice bit-rate allocation (BRA) scheme applicable to the JPEG2000 Part 2 (J2KP2) framework. Its performance is illustrated on hyperspectral imagery data using the J2KP2 with the Karhunen-Loeve transform (KLT) for decorrelation. The results are contrasted with those obtained using the traditional log-variance-based BRA method and with the original RDO algorithm. The implementation has been developed as a Java plug-in to be incorporated into our evolving multi-dimensional data compression software tool denoted CompressMD. The RDO approach to BRA uses discrete rate distortion curves (RDCs) for each slice of transform coefficients. The generation of each point on an RDC requires a full decompression of that slice; therefore, the efficient version minimizes the number of RDC points needed from each slice by using a localized coarse-to-fine approach denoted RDOEfficient. The scheme is illustrated in detail using a subset of 10 bands of hyperspectral imagery data and is contrasted to the original RDO implementation and the traditional (log-variance) method of BRA, showing that better results are obtained with the RDO methods. The three schemes are also tested on two hyperspectral imagery data sets with all bands present: the Cuprite radiance data from AVIRIS and a set derived from the Hyperion satellite. The results from the RDO and RDOEfficient are very close to each other in the MSE sense, indicating that the adaptive approach can find almost the same BRA solution. Surprisingly, the traditional method also performs very close to the RDO methods, indicating that it is very close to being optimal for these types of data sets.
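
    The core of rate-distortion-optimal allocation over discrete per-slice RD curves can be sketched with a textbook Lagrangian (bisection-on-lambda) search, shown below. This is a generic illustration, not the RDOEfficient algorithm or its Java implementation; the synthetic RD curves are placeholders.

        # Generic RD-optimal bit allocation over discrete per-slice RD curves via a
        # Lagrangian search with bisection on lambda.  Illustrative only.
        def allocate(rd_curves, rate_budget, iters=60):
            """rd_curves: list (one per slice) of (rate, distortion) point lists."""
            def pick(lmbda):
                # For each slice choose the RD point minimizing D + lambda * R.
                choice = [min(c, key=lambda p: p[1] + lmbda * p[0]) for c in rd_curves]
                return choice, sum(p[0] for p in choice), sum(p[1] for p in choice)

            lo, hi = 0.0, 1e9            # bracket for the Lagrange multiplier
            best = pick(hi)              # very expensive rate -> minimal-rate solution
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                choice, rate, dist = pick(mid)
                if rate > rate_budget:
                    lo = mid             # over budget: penalize rate more
                else:
                    hi = mid             # feasible: record it, try spending more bits
                    best = (choice, rate, dist)
            return best

        # Tiny example: three synthetic convex RD curves (rate in bits, distortion in MSE).
        curves = [[(r, 100.0 / (1 + r)) for r in range(0, 33)] for _ in range(3)]
        choice, rate, dist = allocate(curves, rate_budget=24)
        print("per-slice rates:", [p[0] for p in choice], "total rate:", rate,
              "total MSE: %.2f" % dist)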

  4. Optimization of the Coupled Cluster Implementation in NWChem on Petascale Parallel Architectures

    SciTech Connect

    Anisimov, Victor; Bauer, Gregory H.; Chadalavada, Kalyana; Olson, Ryan M.; Glenski, Joseph W.; Kramer, William T.; Apra, Edoardo; Kowalski, Karol

    2014-09-04

    The coupled cluster singles and doubles (CCSD) algorithm has been optimized in the NWChem software package. This modification alleviated the communication bottleneck and provided a 2- to 5-fold speedup in the CCSD iteration time, depending on the problem size and available memory. Sustained 0.60 petaflop/sec performance on a CCSD(T) calculation has been obtained on NCSA Blue Waters. This figure includes all stages of the calculation from initialization to termination, the iterative computation of single and double excitations, and the perturbative accounting for triple excitations. In the perturbative-triples section alone, the computation maintained a 1.18 petaflop/sec performance level. CCSD computations have been performed on Guanine-Cytosine deoxydinucleotide monophosphate (GC-dDMP) to probe the conformational energy difference of a DNA single strand in the A- and B-conformations. The computation revealed a significant discrepancy between CCSD and classical force fields in the prediction of the relative energy of the A- and B-conformations of GC-dDMP.

  5. Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation

    NASA Technical Reports Server (NTRS)

    Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)

    2002-01-01

    Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade, presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large supercomputer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and a synergy of optimized software algorithms and reconfigurable computing (RC) hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior, inexpensive performance for a chosen application on the ground station or on board a spacecraft.

  6. Optimization and implementation of scaling-free CORDIC-based direct digital frequency synthesizer for body care area network systems.

    PubMed

    Juang, Ying-Shen; Ko, Lu-Ting; Chen, Jwu-E; Sung, Tze-Yun; Hsin, Hsi-Chin

    2012-01-01

    Coordinate rotation digital computer (CORDIC) is an efficient algorithm for computations of trigonometric functions. Scaling-free CORDIC is one of the best-known CORDIC implementations, with advantages in speed and area. In this paper, a novel direct digital frequency synthesizer (DDFS) based on scaling-free CORDIC is presented. The proposed multiplier-less architecture with small ROM and pipelined data path has the advantages of high data rate, high precision, high performance, and low hardware cost. The design procedure, with performance and hardware analysis for optimization, is also given. It is verified by Matlab simulations and then implemented on a field programmable gate array (FPGA) in Verilog. The spurious-free dynamic range (SFDR) is over 86.85 dBc, and the signal-to-noise ratio (SNR) is more than 81.12 dB. The scaling-free CORDIC-based architecture is suitable for VLSI implementations of DDFS applications in terms of hardware cost, power consumption, SNR, and SFDR. The proposed DDFS is very suitable for medical instruments and body care area network systems. PMID:23251230
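
    As background, a conventional rotation-mode CORDIC iteration for sin/cos is sketched below with an explicit gain correction. The scaling-free variant used in the paper removes that correction step; this sketch only shows the basic shift-and-add iteration structure.

        # Conventional rotation-mode CORDIC computing sin/cos with shift-and-add
        # iterations and a final gain correction (illustrative sketch only).
        import math

        def cordic_sin_cos(theta, n_iter=24):
            """Return (sin(theta), cos(theta)) for |theta| <= pi/2 (radians)."""
            angles = [math.atan(2.0**-i) for i in range(n_iter)]       # rotation table
            gain = 1.0
            for i in range(n_iter):
                gain *= 1.0 / math.sqrt(1.0 + 2.0**(-2 * i))            # CORDIC gain K
            x, y, z = 1.0, 0.0, theta
            for i in range(n_iter):
                d = 1.0 if z >= 0 else -1.0            # rotate toward the residual angle
                x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
                z -= d * angles[i]
            return y * gain, x * gain                  # (sin, cos) after scaling

        s, c = cordic_sin_cos(math.pi / 6)
        print(s, c)   # ~0.5, ~0.866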

  7. Optimization and Implementation of Scaling-Free CORDIC-Based Direct Digital Frequency Synthesizer for Body Care Area Network Systems

    PubMed Central

    Juang, Ying-Shen; Ko, Lu-Ting; Chen, Jwu-E.; Sung, Tze-Yun; Hsin, Hsi-Chin

    2012-01-01

    Coordinate rotation digital computer (CORDIC) is an efficient algorithm for computations of trigonometric functions. Scaling-free CORDIC is one of the best-known CORDIC implementations, with advantages in speed and area. In this paper, a novel direct digital frequency synthesizer (DDFS) based on scaling-free CORDIC is presented. The proposed multiplier-less architecture with small ROM and pipelined data path has the advantages of high data rate, high precision, high performance, and low hardware cost. The design procedure, with performance and hardware analysis for optimization, is also given. It is verified by Matlab simulations and then implemented on a field programmable gate array (FPGA) in Verilog. The spurious-free dynamic range (SFDR) is over 86.85 dBc, and the signal-to-noise ratio (SNR) is more than 81.12 dB. The scaling-free CORDIC-based architecture is suitable for VLSI implementations of DDFS applications in terms of hardware cost, power consumption, SNR, and SFDR. The proposed DDFS is very suitable for medical instruments and body care area network systems. PMID:23251230

  8. A survey of compiler development aids. [concerning lexical, syntax, and semantic analysis

    NASA Technical Reports Server (NTRS)

    Buckles, B. P.; Hodges, B. C.; Hsia, P.

    1977-01-01

    A theoretical background was established for the compilation process by dividing it into five phases and explaining the concepts and algorithms that underpin each. The five selected phases were lexical analysis, syntax analysis, semantic analysis, optimization, and code generation. Graph theoretical optimization techniques were presented, and approaches to code generation were described for both one-pass and multipass compilation environments. Following the initial tutorial sections, more than 20 tools that were developed to aid in the process of writing compilers were surveyed. Eight of the more recent compiler development aids were selected for special attention - SIMCMP/STAGE2, LANG-PAK, COGENT, XPL, AED, CWIC, LIS, and JOCIT. The impact of compiler development aids was assessed, some of their shortcomings were noted, and some areas of research currently in progress were examined.
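
    To make the lexical-analysis phase concrete, a minimal regex-driven tokenizer is sketched below. It is a generic example, not one of the surveyed compiler-writing tools; the token set is invented for illustration.

        # Minimal regex-driven lexer illustrating the lexical-analysis phase.
        import re

        TOKEN_SPEC = [
            ("NUMBER",   r"\d+(\.\d+)?"),    # integer or decimal literal
            ("IDENT",    r"[A-Za-z_]\w*"),   # identifier
            ("OP",       r"[+\-*/=()]"),     # single-character operators
            ("SKIP",     r"[ \t]+"),         # whitespace (ignored)
            ("MISMATCH", r"."),              # anything else is an error
        ]
        MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))

        def tokenize(source):
            for m in MASTER.finditer(source):
                kind, text = m.lastgroup, m.group()
                if kind == "SKIP":
                    continue
                if kind == "MISMATCH":
                    raise SyntaxError("unexpected character %r" % text)
                yield (kind, text)

        print(list(tokenize("area = width * 2.5 + 1")))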

  9. Toward a fundamental theory of optimal feature selection: Part II - Implementation and computational complexity

    SciTech Connect

    Morgera, S.D.

    1987-01-01

    Certain algorithms and their computational complexity are examined for use in a VLSI implementation of the real-time pattern classifier described in Part I of this work. The most computationally intensive processing is found in the classifier training mode wherein subsets of the largest and smallest eigenvalues and associated eigenvectors of the input data covariance pair must be computed. It is shown that if the matrix of interest is centrosymmetric and the method for eigensystem decomposition is operator-based, the problem architecture assumes a parallel form. Such a matrix structure is found in a wide variety of pattern recognition and speech and signal processing applications. Each of the parallel channels requires only two specialized matrix-arithmetic modules.
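
    The reason a centrosymmetric matrix leads to a naturally parallel eigenproblem can be sketched with the standard splitting for symmetric centrosymmetric matrices of even order, shown below: the matrix is orthogonally similar to a block-diagonal matrix with two half-size blocks that can be diagonalized independently. This is a generic numerical illustration, not the VLSI architecture described in the paper.

        # Splitting a symmetric centrosymmetric matrix of even order 2m into two
        # independent m x m symmetric eigenproblems (illustrative sketch).
        import numpy as np

        def centrosymmetric_eigvals(A):
            n = A.shape[0]
            m = n // 2
            J = np.fliplr(np.eye(m))
            B = A[:m, :m]            # upper-left block
            C = A[:m, m:]            # upper-right block
            # A is orthogonally similar to blockdiag(B + C J, B - C J), so the two
            # half-size problems can be solved on separate processing channels.
            ev_plus = np.linalg.eigvalsh(B + C @ J)
            ev_minus = np.linalg.eigvalsh(B - C @ J)
            return np.sort(np.concatenate([ev_plus, ev_minus]))

        # Build a random symmetric centrosymmetric test matrix: A = (S + J S J) / 2.
        rng = np.random.default_rng(0)
        n = 8
        S = rng.standard_normal((n, n)); S = (S + S.T) / 2
        Jn = np.fliplr(np.eye(n))
        A = (S + Jn @ S @ Jn) / 2
        print(np.allclose(centrosymmetric_eigvals(A),
                          np.sort(np.linalg.eigvalsh(A))))   # True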

  10. Compiler-assisted multiple instruction rollback recovery using a read buffer

    NASA Technical Reports Server (NTRS)

    Alewine, Neal J.; Chen, Shyh-Kwei; Fuchs, W. Kent; Hwu, Wen-Mei W.

    1995-01-01

    Multiple instruction rollback (MIR) is a technique that has been implemented in mainframe computers to provide rapid recovery from transient processor failures. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs have also been developed which remove rollback data hazards directly with data-flow transformations. This paper describes compiler-assisted techniques to achieve multiple instruction rollback recovery. We observe that some data hazards resulting from instruction rollback can be resolved efficiently by providing an operand read buffer while others are resolved more efficiently with compiler transformations. The compiler-assisted scheme presented consists of hardware that is less complex than shadow files, history files, history buffers, or delayed write buffers, while experimental evaluation indicates performance improvement over compiler-based schemes.

  11. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility for rapid prototyping of parallelizing compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  12. Automated optimization of look-up table implementation for function evaluation on FPGAs

    NASA Astrophysics Data System (ADS)

    Deng, L.; Chakrabarti, C.; Pitsianis, N.; Sun, X.

    2009-08-01

    This paper presents a systematic approach for the automatic generation of look-up tables (LUTs) for function evaluation and the minimization of hardware resources on field programmable gate arrays (FPGAs). The class of functions supported by this approach includes sine, cosine, exponentials, Gaussians, the central B-splines, and certain cylinder functions that are frequently used in applications for signal and image processing and data processing. In order to meet customer requirements in accuracy and speed as well as constraints on the use of area and on-chip memory, the function evaluation is based on numerical approximation with Taylor polynomials. Customized data precisions are supported in both fixed point and floating point representations. The optimization procedure involves a search in the three-dimensional design space of data precision, sampling density, and approximation degree. It utilizes both model-based estimates and gradient-based information gathered during the search. The approach was tested with actual synthesis results on the Xilinx Virtex-2Pro FPGA platform.
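
    A generic software model of the table-plus-Taylor-polynomial scheme is sketched below for sin(x) on [0, pi/2]; table size (sampling density) and polynomial degree are exactly the knobs the automated search described above explores. The specific segment count and degree here are arbitrary choices for illustration, not values from the paper.

        # Table of degree-2 Taylor coefficients for sin(x) on [0, pi/2], evaluated
        # per segment; accuracy trades against table size and polynomial degree.
        import math

        SEGMENTS = 32                     # table size (sampling density)
        STEP = (math.pi / 2) / SEGMENTS
        TABLE = []
        for k in range(SEGMENTS):
            c = (k + 0.5) * STEP          # segment center
            TABLE.append((math.sin(c), math.cos(c), -math.sin(c) / 2.0, c))

        def sin_lut(x):
            k = min(int(x / STEP), SEGMENTS - 1)      # which segment x falls in
            f0, f1, f2, c = TABLE[k]
            d = x - c
            return f0 + f1 * d + f2 * d * d           # degree-2 Taylor evaluation

        worst = max(abs(sin_lut(i * 1e-4) - math.sin(i * 1e-4))
                    for i in range(int((math.pi / 2) / 1e-4)))
        print("max error over [0, pi/2]: %.2e" % worst)   # roughly 2e-6 with these settings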

  13. Different Scalable Implementations of Collision and Streaming for Optimal Computational Performance of Lattice Boltzmann Simulations

    NASA Astrophysics Data System (ADS)

    Geneva, Nicholas; Wang, Lian-Ping

    2015-11-01

    In the past 25 years, the mesoscopic lattice Boltzmann method (LBM) has become an increasingly popular approach to simulate incompressible flows, including turbulent flows. While LBM solves more solution variables compared to the conventional CFD approach based on the macroscopic Navier-Stokes equation, it also offers opportunities for more efficient parallelization. In this talk we will describe several different algorithms that have been developed over the past 10-plus years, which can be used to represent the two core steps of LBM, collision and streaming, more effectively than standard approaches. The application of these algorithms spans LBM simulations ranging from basic channel flows to particle-laden flows. We will cover everything from the essential details of implementing each algorithm for simple 2D flows to the challenges one faces when using a given algorithm for more complex simulations. The key is to explore the best use of data structures and cache memory. Two basic data structures will be discussed and the importance of effective data storage to maximize a CPU's cache will be addressed. The performance of a 3D turbulent channel flow simulation using these different algorithms and data structures will be compared, along with important hardware-related issues.
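
    A minimal collide-then-stream D2Q9 step is sketched below using two full arrays and periodic numpy.roll streaming. The algorithmic variants discussed in the talk change the data layout and how streaming is fused with collision, not the update itself; this is only a baseline illustration.

        # Minimal D2Q9 BGK collide-and-stream step (periodic lattice, sketch only).
        import numpy as np

        # D2Q9 velocity set and weights.
        cx = np.array([0, 1, 0, -1, 0, 1, -1, -1, 1])
        cy = np.array([0, 0, 1, 0, -1, 1, 1, -1, -1])
        w = np.array([4/9] + [1/9]*4 + [1/36]*4)
        tau = 0.8                                   # BGK relaxation time

        def equilibrium(rho, ux, uy):
            cu = cx[:, None, None] * ux + cy[:, None, None] * uy
            usq = ux*ux + uy*uy
            return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu*cu - 1.5*usq)

        def step(f):
            # Macroscopic moments.
            rho = f.sum(axis=0)
            ux = (cx[:, None, None] * f).sum(axis=0) / rho
            uy = (cy[:, None, None] * f).sum(axis=0) / rho
            # Collision: relax toward the local equilibrium.
            f = f - (f - equilibrium(rho, ux, uy)) / tau
            # Streaming: shift each population along its lattice velocity.
            for q in range(9):
                f[q] = np.roll(np.roll(f[q], cx[q], axis=0), cy[q], axis=1)
            return f

        # Periodic 64x64 lattice initialized at rest with a small density bump.
        nx, ny = 64, 64
        rho0 = np.ones((nx, ny)); rho0[32, 32] = 1.1
        f = equilibrium(rho0, np.zeros((nx, ny)), np.zeros((nx, ny)))
        for _ in range(100):
            f = step(f)
        print("mass conserved:", np.isclose(f.sum(), rho0.sum()))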

  14. Optimization of semi-global stereo matching for hardware module implementation

    NASA Astrophysics Data System (ADS)

    Roszkowski, Mikołaj

    2014-11-01

    Stereo vision is one of the most intensively studied areas in the field of computer vision. It allows the creation of a 3D model of a scene given two images of the scene taken with optical cameras. Although the number of stereo algorithms keeps increasing, not many are suitable candidates for hardware implementations that could guarantee real-time processing in embedded systems. One such algorithm is semi-global matching, which seems to balance well the quality of the disparity map and computational complexity. However, it still has quite high memory requirements, which can be a problem if low-cost FPGAs are to be used, because they often suffer from low external DRAM throughput. In this article, a few methods to reduce both the semi-global matching algorithm's complexity and memory usage, and thus the required bandwidth, are proposed. First of all, it is shown that a simple pyramid matching scheme can be used to efficiently reduce the number of disparities checked per pixel. Secondly, a method of dividing the image into independent blocks is proposed, which reduces the amount of memory required by the algorithm. Finally, the exact requirements for the bandwidth and the size of the on-chip memories are given.
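
    The memory-hungry core of semi-global matching is the per-path cost aggregation; one left-to-right path over a single scanline is sketched below (the full algorithm sums several such paths over the image). This is a generic illustration, independent of the hardware-oriented reductions proposed in the article; the penalties P1 and P2 and the toy cost volume are arbitrary.

        # One left-to-right SGM aggregation path over a matching-cost array C[x, d].
        import numpy as np

        def aggregate_left_to_right(C, P1=10, P2=120):
            """C: (width, ndisp) matching costs for one scanline; returns path costs L."""
            width, ndisp = C.shape
            L = np.empty_like(C, dtype=np.float64)
            L[0] = C[0]
            for x in range(1, width):
                prev = L[x - 1]
                prev_min = prev.min()
                # Best predecessor: same disparity, d +/- 1 (penalty P1),
                # or any other disparity (penalty P2).
                same = prev
                up = np.roll(prev, 1);    up[0] = np.inf     # predecessor at d-1
                down = np.roll(prev, -1); down[-1] = np.inf  # predecessor at d+1
                best = np.minimum(np.minimum(same, up + P1),
                                  np.minimum(down + P1, prev_min + P2))
                # Subtract prev_min so L stays bounded (standard SGM normalization).
                L[x] = C[x] + best - prev_min
            return L

        # Toy example: random costs for a 12-pixel scanline with 8 disparities.
        rng = np.random.default_rng(1)
        C = rng.integers(0, 50, size=(12, 8)).astype(float)
        print(aggregate_left_to_right(C).round(1))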

  15. Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets

    NASA Technical Reports Server (NTRS)

    Memarsadeghi, Nargess; Mount, David M.

    2007-01-01

    Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging. This is because it is a best unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n. This is infeasible for large n. In practice, kriging is solved approximately by local approaches that are based on considering only a relatively small number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on the local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continuous surfaces, it has zero-order continuity along its borders. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. The problems arise from the fact that the covariance functions that are used in kriging have global support. Our implementations combine, utilize, and enhance a number of different
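
    The dense solve whose cubic cost motivates this work is sketched below for a small synthetic data set, using the ordinary-kriging system augmented with the unbiasedness constraint. The Gaussian covariance model and its parameters are placeholder choices for illustration, not the authors' implementation.

        # Direct dense solve of the ordinary-kriging system for a small data set --
        # the O(n^3) step that becomes prohibitive for large n.
        import numpy as np

        def gaussian_cov(h, sill=1.0, corr_range=10.0):
            return sill * np.exp(-(h / corr_range) ** 2)

        def ordinary_kriging(points, values, query):
            n = len(points)
            d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
            # Augmented system enforcing that the weights sum to one (unbiasedness).
            A = np.ones((n + 1, n + 1)); A[:n, :n] = gaussian_cov(d); A[n, n] = 0.0
            b = np.ones(n + 1)
            b[:n] = gaussian_cov(np.linalg.norm(points - query, axis=-1))
            sol = np.linalg.solve(A, b)        # dense solve
            weights = sol[:n]                  # sol[n] is the Lagrange multiplier
            return weights @ values

        rng = np.random.default_rng(2)
        pts = rng.uniform(0, 100, size=(50, 2))
        vals = np.sin(pts[:, 0] / 20) + 0.1 * rng.standard_normal(50)
        print(ordinary_kriging(pts, vals, np.array([50.0, 50.0])))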

  16. Optimization of the Implementation of Managed Aquifer Recharge - Effects of Aquifer Heterogeneity

    NASA Astrophysics Data System (ADS)

    Maliva, Robert; Missimer, Thomas; Kneppers, Angeline

    2010-05-01

    more successful MAR implementation as a tool for improved water resources management.

  17. Microprocessor-based integration of microfluidic control for the implementation of automated sensor monitoring and multithreaded optimization algorithms.

    PubMed

    Ezra, Elishai; Maor, Idan; Bavli, Danny; Shalom, Itai; Levy, Gahl; Prill, Sebastian; Jaeger, Magnus S; Nahmias, Yaakov

    2015-08-01

    Microfluidic applications range from combinatorial synthesis to high throughput screening, with platforms integrating analog perfusion components, digitally controlled micro-valves and a range of sensors that demand a variety of communication protocols. Currently, discrete control units are used to regulate and monitor each component, resulting in scattered control interfaces that limit data integration and synchronization. Here, we present a microprocessor-based control unit, utilizing the MS Gadgeteer open framework that integrates all aspects of microfluidics through a high-current electronic circuit that supports and synchronizes digital and analog signals for perfusion components, pressure elements, and arbitrary sensor communication protocols using a plug-and-play interface. The control unit supports an integrated touch screen and TCP/IP interface that provides local and remote control of flow and data acquisition. To establish the ability of our control unit to integrate and synchronize complex microfluidic circuits we developed an equi-pressure combinatorial mixer. We demonstrate the generation of complex perfusion sequences, allowing the automated sampling, washing, and calibrating of an electrochemical lactate sensor continuously monitoring hepatocyte viability following exposure to the pesticide rotenone. Importantly, integration of an optical sensor allowed us to implement automated optimization protocols that require different computational challenges including: prioritized data structures in a genetic algorithm, distributed computational efforts in multiple-hill climbing searches and real-time realization of probabilistic models in simulated annealing. Our system offers a comprehensive solution for establishing optimization protocols and perfusion sequences in complex microfluidic circuits. PMID:26227212
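
    As an illustration of one of the optimization strategies mentioned above, a generic simulated-annealing loop is sketched below. The objective function and proposal move are placeholders; the real protocols optimize against live sensor readings on the microfluidic platform rather than a closed-form function.

        # Generic simulated-annealing loop (illustrative placeholder objective).
        import math
        import random

        def simulated_annealing(objective, x0, step=0.5, t0=1.0, cooling=0.995,
                                iterations=5000, seed=0):
            rnd = random.Random(seed)
            x, fx = x0, objective(x0)
            best_x, best_f = x, fx
            t = t0
            for _ in range(iterations):
                candidate = x + rnd.uniform(-step, step)       # propose a nearby point
                fc = objective(candidate)
                # Accept improvements always; accept uphill moves with Boltzmann prob.
                if fc < fx or rnd.random() < math.exp(-(fc - fx) / t):
                    x, fx = candidate, fc
                    if fx < best_f:
                        best_x, best_f = x, fx
                t *= cooling                                    # geometric cooling
            return best_x, best_f

        # Multi-modal 1-D test function with its global minimum near x = -0.5.
        f = lambda x: 0.1 * x * x + math.sin(3 * x)
        print(simulated_annealing(f, x0=4.0))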

  18. Analysis, optimization, and implementation of a hybrid DS/FFH spread-spectrum technique for smart grid communications

    SciTech Connect

    Olama, Mohammed M.; Ma, Xiao; Killough, Stephen M.; Kuruganti, Teja; Smith, Stephen F.; Djouadi, Seddik M.

    2015-03-12

    In recent years, there has been great interest in using hybrid spread-spectrum (HSS) techniques for commercial applications, particularly in the Smart Grid, in addition to their inherent uses in military communications. This is because HSS can accommodate high data rates with high link integrity, even in the presence of significant multipath effects and interfering signals. A highly useful form of this transmission technique for many types of command, control, and sensing applications is the specific code-related combination of standard direct sequence modulation with fast frequency hopping, denoted hybrid DS/FFH, wherein multiple frequency hops occur within a single data-bit time. In this paper, error-probability analyses are performed for a hybrid DS/FFH system over standard Gaussian and fading-type channels, progressively including the effects from wide- and partial-band jamming, multi-user interference, and varying degrees of Rayleigh and Rician fading. In addition, an optimization approach is formulated that minimizes the bit-error performance of a hybrid DS/FFH communication system and solves for the resulting system design parameters. The optimization objective function is non-convex and can be solved by applying the Karush-Kuhn-Tucker conditions. We also present our efforts toward exploring the design, implementation, and evaluation of a hybrid DS/FFH radio transceiver using a single FPGA. Numerical and experimental results are presented under widely varying design parameters to demonstrate the adaptability of the waveform for varied harsh smart grid RF signal environments.

  19. Integrated topology for an aircraft electric power distribution system using MATLAB and ILP optimization technique and its implementation

    NASA Astrophysics Data System (ADS)

    Madhikar, Pratik Ravindra

    The most important and crucial design feature of an Aircraft Electric Power Distribution System (EPDS) is reliability. In an EPDS, power is distributed from top-level generators to bottom-level loads through various sensors, actuators, and rectifiers with the help of AC and DC buses and control switches. As consumer demands keep growing and safety is of utmost importance, loads increase and, as a result, so does the power-management burden. Therefore, the design of an EPDS should be optimized for maximum efficiency. This thesis discusses an integrated tool, based on a Need Based Design method and Fault Tree Analysis (FTA), for achieving an optimum EPDS design that provides maximum reliability in terms of continuous connectivity and power management at minimum cost. If an EPDS is formulated as an optimization problem, it can be solved under connectivity, cost, and power constraints using a linear solver to obtain the desired output of maximum reliability at minimum cost. Furthermore, the thesis discusses the viability and implementation of the resulting topology for typical large-aircraft specifications.

  20. 36 CFR 705.6 - Compilation.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 3 2013-07-01 2012-07-01 true Compilation. 705.6 Section 705.6 Parks, Forests, and Public Property LIBRARY OF CONGRESS REPRODUCTION, COMPILATION, AND DISTRIBUTION OF NEWS TRANSMISSIONS UNDER THE PROVISIONS OF THE AMERICAN TELEVISION AND RADIO ARCHIVES ACT §...

  1. 36 CFR 705.6 - Compilation.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 3 2012-07-01 2012-07-01 false Compilation. 705.6 Section 705.6 Parks, Forests, and Public Property LIBRARY OF CONGRESS REPRODUCTION, COMPILATION, AND DISTRIBUTION OF NEWS TRANSMISSIONS UNDER THE PROVISIONS OF THE AMERICAN TELEVISION AND RADIO ARCHIVES ACT §...

  2. 36 CFR 705.6 - Compilation.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 3 2014-07-01 2014-07-01 false Compilation. 705.6 Section 705.6 Parks, Forests, and Public Property LIBRARY OF CONGRESS REPRODUCTION, COMPILATION, AND DISTRIBUTION OF NEWS TRANSMISSIONS UNDER THE PROVISIONS OF THE AMERICAN TELEVISION AND RADIO ARCHIVES ACT §...

  3. 36 CFR 705.6 - Compilation.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false Compilation. 705.6 Section 705.6 Parks, Forests, and Public Property LIBRARY OF CONGRESS REPRODUCTION, COMPILATION, AND DISTRIBUTION OF NEWS TRANSMISSIONS UNDER THE PROVISIONS OF THE AMERICAN TELEVISION AND RADIO ARCHIVES ACT §...

  4. 36 CFR 705.6 - Compilation.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 3 2011-07-01 2011-07-01 false Compilation. 705.6 Section 705.6 Parks, Forests, and Public Property LIBRARY OF CONGRESS REPRODUCTION, COMPILATION, AND DISTRIBUTION OF NEWS TRANSMISSIONS UNDER THE PROVISIONS OF THE AMERICAN TELEVISION AND RADIO ARCHIVES ACT §...

  5. Reformulating Constraints for Compilability and Efficiency

    NASA Technical Reports Server (NTRS)

    Tong, Chris; Braudaway, Wesley; Mohan, Sunil; Voigt, Kerstin

    1992-01-01

    KBSDE is a knowledge compiler that uses a classification-based approach to map solution constraints in a task specification onto particular search algorithm components that will be responsible for satisfying those constraints (e.g., local constraints are incorporated in generators; global constraints are incorporated in either testers or hillclimbing patchers). Associated with each type of search algorithm component is a subcompiler that specializes in mapping constraints into components of that type. Each of these subcompilers in turn uses a classification-based approach, matching a constraint passed to it against one of several schemas, and applying a compilation technique associated with that schema. While much progress has occurred in our research since we first laid out our classification-based approach [Ton91], we focus in this paper on our reformulation research. Two important reformulation issues that arise out of the choice of a schema-based approach are: (1) compilability-- Can a constraint that does not directly match any of a particular subcompiler's schemas be reformulated into one that does? and (2) Efficiency-- If the efficiency of the compiled search algorithm depends on the compiler's performance, and the compiler's performance depends on the form in which the constraint was expressed, can we find forms for constraints which compile better, or reformulate constraints whose forms can be recognized as ones that compile poorly? In this paper, we describe a set of techniques we are developing for partially addressing these issues.

  6. Compilation for critically constrained knowledge bases

    SciTech Connect

    Schrag, R.

    1996-12-31

    We show that many "critically constrained" Random 3SAT knowledge bases (KBs) can be compiled into disjunctive normal form easily by using a variant of the "Davis-Putnam" proof procedure. From these compiled KBs we can answer all queries about entailment of conjunctive normal formulas, also easily - compared to a "brute-force" approach to approximate knowledge compilation into unit clauses for the same KBs. We exploit this fact to develop an aggressive hybrid approach which attempts to compile a KB exactly until a given resource limit is reached, then falls back to approximate compilation into unit clauses. The resulting approach handles all of the critically constrained Random 3SAT KBs with average savings of an order of magnitude over the brute-force approach.
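
    The underlying idea - Davis-Putnam-style splitting with unit propagation, where every branch that satisfies all clauses contributes its partial assignment as one term of a DNF - can be sketched generically as below. This is an illustrative toy, not the hybrid compiler described in the abstract.

        # Toy Davis-Putnam-style compilation of a CNF (DIMACS-style integer literals)
        # into DNF: each consistent branch yields one DNF term.
        def simplify(clauses, lit):
            """Assign literal `lit` true: drop satisfied clauses, shrink the rest."""
            out = []
            for c in clauses:
                if lit in c:
                    continue                      # clause satisfied
                reduced = [l for l in c if l != -lit]
                if not reduced:
                    return None                   # empty clause: conflict
                out.append(reduced)
            return out

        def compile_to_dnf(clauses, assignment=()):
            if clauses is None:
                return []                         # conflicting branch contributes nothing
            # Unit propagation.
            while clauses and any(len(c) == 1 for c in clauses):
                unit = next(c[0] for c in clauses if len(c) == 1)
                clauses = simplify(clauses, unit)
                if clauses is None:
                    return []
                assignment = assignment + (unit,)
            if not clauses:
                return [assignment]               # all clauses satisfied: one DNF term
            var = abs(clauses[0][0])              # split on a variable from an open clause
            return (compile_to_dnf(simplify(clauses, var), assignment + (var,)) +
                    compile_to_dnf(simplify(clauses, -var), assignment + (-var,)))

        # (x1 or x2) and (not x1 or x3):
        cnf = [[1, 2], [-1, 3]]
        for term in compile_to_dnf(cnf):
            print(term)          # each tuple of literals is one term of the DNF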

  7. The Columbia-Presbyterian Medical Center decision-support system as a model for implementing the Arden Syntax.

    PubMed Central

    Hripcsak, G.; Cimino, J. J.; Johnson, S. B.; Clayton, P. D.

    1991-01-01

    Columbia-Presbyterian Medical Center is implementing a decision-support system based on the Arden Syntax for Medical Logic Modules (MLM's). The system uses a compiler-interpreter pair. MLM's are first compiled into pseudo-codes, which are instructions for a virtual machine. The MLM's are then executed using an interpreter that emulates the virtual machine. This design has resulted in increased portability, easier debugging and verification, and more compact compiled MLM's. The time spent interpreting the MLM pseudo-codes has been found to be insignificant compared to database accesses. The compiler, which is written using the tools "lex" and "yacc," optimizes MLM's by minimizing the number of database accesses. The interpreter emulates a stack-oriented machine. A phased implementation of the syntax was used to speed the development of the system. PMID:1807598
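
    The compile-to-pseudo-code / interpret-on-a-virtual-machine split can be illustrated with a toy stack-oriented machine, sketched below. The opcodes and the clinical-looking expression are invented for illustration and are not the MLM pseudo-code instruction set.

        # Toy stack-oriented virtual machine illustrating the compiler-interpreter split.
        def run(program, variables):
            """Execute a list of (opcode, operand) pairs against a variable table."""
            stack = []
            for op, arg in program:
                if op == "PUSH":            # push a literal constant
                    stack.append(arg)
                elif op == "LOAD":          # push the value of a named variable
                    stack.append(variables[arg])
                elif op == "ADD":
                    b, a = stack.pop(), stack.pop()
                    stack.append(a + b)
                elif op == "GT":            # a > b
                    b, a = stack.pop(), stack.pop()
                    stack.append(a > b)
                else:
                    raise ValueError("unknown opcode %r" % op)
            return stack.pop()

        # Pseudo-code a front end might emit for: potassium + 0.5 > 5.0
        program = [("LOAD", "potassium"), ("PUSH", 0.5), ("ADD", None),
                   ("PUSH", 5.0), ("GT", None)]
        print(run(program, {"potassium": 5.1}))   # True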

  8. Analysis, optimization, and implementation of a hybrid DS/FFH spread-spectrum technique for smart grid communications

    DOE PAGESBeta

    Olama, Mohammed M.; Ma, Xiao; Killough, Stephen M.; Kuruganti, Teja; Smith, Stephen F.; Djouadi, Seddik M.

    2015-03-12

    In recent years, there has been great interest in using hybrid spread-spectrum (HSS) techniques for commercial applications, particularly in the Smart Grid, in addition to their inherent uses in military communications. This is because HSS can accommodate high data rates with high link integrity, even in the presence of significant multipath effects and interfering signals. A highly useful form of this transmission technique for many types of command, control, and sensing applications is the specific code-related combination of standard direct sequence modulation with fast frequency hopping, denoted hybrid DS/FFH, wherein multiple frequency hops occur within a single data-bit time. In this paper, error-probability analyses are performed for a hybrid DS/FFH system over standard Gaussian and fading-type channels, progressively including the effects from wide- and partial-band jamming, multi-user interference, and varying degrees of Rayleigh and Rician fading. In addition, an optimization approach is formulated that minimizes the bit-error performance of a hybrid DS/FFH communication system and solves for the resulting system design parameters. The optimization objective function is non-convex and can be solved by applying the Karush-Kuhn-Tucker conditions. We also present our efforts toward exploring the design, implementation, and evaluation of a hybrid DS/FFH radio transceiver using a single FPGA. Numerical and experimental results are presented under widely varying design parameters to demonstrate the adaptability of the waveform for varied harsh smart grid RF signal environments.

  9. Orbital-optimized linearized coupled-cluster doubles with density-fitting and Cholesky decomposition approximations: an efficient implementation.

    PubMed

    Bozkaya, Uğur

    2016-04-20

    An efficient implementation of the orbital-optimized linearized coupled-cluster doubles method with the density-fitting (DF-OLCCD) and Cholesky decomposition (CD-OLCCD) approximations is presented. The DF-OLCCD and CD-OLCCD methods are applied to a set of alkanes to compare the computational cost with the conventional orbital-optimized linearized coupled-cluster doubles (OLCCD) [U. Bozkaya and C. D. Sherrill, J. Chem. Phys., 2013, 139, 054104]. Our results demonstrate that the DF-OLCCD method provides substantially lower computational costs than OLCCD, with a more than 9-fold reduction in computational time for the largest member of the alkane set (C8H18). For barrier heights of hydrogen transfer reaction energies, the DF-OLCCD method again exhibits substantially better performance than DF-LCCD, providing a mean absolute error of 0.9 kcal mol(-1), which is 7 times lower than that of DF-LCCD (6.2 kcal mol(-1)); compared to MP2 (9.6 kcal mol(-1)) there is a more than 10-fold reduction in errors. Furthermore, the MAE value of DF-OLCCD is also lower than that of CCSD (1.2 kcal mol(-1)). For open-shell noncovalent interactions, the performance of DF-OLCCD is significantly better than that of MP2, DF-LCCD, and CCSD. Overall, the present application results indicate that the DF-OLCCD and CD-OLCCD methods are very promising for challenging open-shell systems as well as closed-shell molecular systems. PMID:27056800

  10. Implementation of CFD modeling in the performance assessment and optimization of secondary clarifiers: the PVSC case study.

    PubMed

    Xanthos, S; Ramalingam, K; Lipke, S; McKenna, B; Fillos, J

    2013-01-01

    The water industry, and especially the wastewater treatment sector, has come under steadily increasing pressure to optimize existing and new facilities to meet discharge limits and reduce overall cost. Gravity separation of solids, producing clarified overflow and thickened solids underflow, has long been one of the principal separation processes used in treating secondary effluent. Final settling tanks (FSTs) are a central link in the treatment process and often act as the limiting step on the maximum solids handling capacity when high throughput requirements need to be met. The Passaic Valley Sewerage Commission (PVSC) is interested in using a computational fluid dynamics (CFD) modeling approach to explore further FST retrofit alternatives to sustain significantly higher plant influent flows, especially under wet weather conditions. Specifically, there is interest in modifying and/or upgrading/optimizing the existing FSTs to handle flows in the range of 280-720 million gallons per day (MGD) (12.25-31.55 m(3)/s) in compliance with the plant's effluent discharge limits for total suspended solids (TSS). The CFD model development for this specific plant will be discussed, 2D and 3D simulation results will be presented, and initial results of a sensitivity study between two FST effluent weir structure designs will be reviewed at a flow of 550 MGD (∼24 m(3)/s) and 1,800 mg/L MLSS (mixed liquor suspended solids). The latter will provide useful information in determining whether the existing retrofit of one of the FSTs would enable compliance under wet weather conditions and warrants further consideration for implementation in the remaining FSTs. PMID:24225088

  11. The paradigm compiler: Mapping a functional language for the connection machine

    NASA Technical Reports Server (NTRS)

    Dennis, Jack B.

    1989-01-01

    The Paradigm Compiler implements a new approach to compiling programs written in high level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.

  12. The RC compiler for the DTN dataflow computer

    SciTech Connect

    Veen, A.H.; Van Den Born, R. )

    1990-12-01

    The DTN Dataflow Computer is a graphics workstation containing 32 dataflow processing elements. It may possibly be the first commercially available dataflow machine. In this paper, the main focus is on its RC compiler. Although dataflow machines are usually programmed in a declarative language, RC is imperative: it is a somewhat restricted form of C. The main problems encountered during the implementation of the compiler were due to the low level of the instruction set of the dataflow processors: these were not designed as a target for a high-level language. Consequently, generating efficient code for general features such as conditionals and loops required a substantial effort. Surprisingly, most imperative features were easy to deal with. The most serious problem still awaiting an efficient solution is conditional aliasing.

  13. Automating Visualization Service Generation with the WATT Compiler

    NASA Astrophysics Data System (ADS)

    Bollig, E. F.; Lyness, M. D.; Erlebacher, G.; Yuen, D. A.

    2007-12-01

    As tasks and workflows become increasingly complex, software developers are devoting increasing attention to automation tools. Among many examples, the Automator tool from Apple collects components of a workflow into a single script, with very little effort on the part of the user. Tasks are most often described as a series of instructions. The granularity of the tasks dictates the tools to use. Compilers translate fine-grained instructions to assembler code, while scripting languages (ruby, perl) are used to describe a series of tasks at a higher level. Compilers can also be viewed as transformational tools: a cross-compiler can translate executable code written on one computer to assembler code understood on another, while transformational tools can translate from one high-level language to another. We are interested in creating visualization web services automatically, starting from stand-alone VTK (Visualization Toolkit) code written in Tcl. To this end, using the OCaml programming language, we have developed a compiler that translates Tcl into C++, including all the stubs, classes and methods to interface with gSOAP, a C++ implementation of the Soap 1.1/1.2 protocols. This compiler, referred to as the Web Automation and Translation Toolkit (WATT), is the first step towards automated creation of specialized visualization web services without input from the user. The WATT compiler seeks to automate all aspects of web service generation, including the transport layer, the division of labor and the details related to interface generation. The WATT compiler is part of ongoing efforts within the NSF funded VLab consortium [1] to facilitate and automate time-consuming tasks for the science related to understanding planetary materials. Through examples of services produced by WATT for the VLab portal, we will illustrate features, limitations and the improvements necessary to achieve the ultimate goal of complete and transparent automation in the generation of web

  14. Extension of Alvis compiler front-end

    NASA Astrophysics Data System (ADS)

    Wypych, Michał; Szpyrka, Marcin; Matyasik, Piotr

    2015-12-01

    Alvis is a formal modelling language that enables verification of distributed concurrent systems. An Alvis model's semantics is expressed in an LTS graph (labelled transition system). Execution of any language statement is expressed as a transition between formally defined states of such a model. An LTS graph is generated using a middle-stage Haskell representation of an Alvis model. Moreover, Haskell is used as a part of the Alvis language and is used to define parameters' types and operations on them. Thanks to the compiler's modular construction, many aspects of compilation of an Alvis model may be modified. Providing new plugins for the Alvis Compiler that support languages like Java or C makes it possible to use these languages as part of Alvis instead of Haskell. The paper presents the compiler's internal model and describes how the default specification language can be altered by new plugins.

  15. Testing methods and techniques: A compilation

    NASA Technical Reports Server (NTRS)

    1974-01-01

    Mechanical testing techniques, electrical and electronics testing techniques, thermal testing techniques, and optical testing techniques are the subject of the compilation which provides technical information and illustrations of advanced testing devices. Patent information is included where applicable.

  16. A Compilation of Internship Reports - 2012

    SciTech Connect

    Stegman M.; Morris, M.; Blackburn, N.

    2012-08-08

    This compilation documents all research projects undertaken by the 2012 summer Department of Energy - Workforce Development for Teachers and Scientists interns during their internship program at Brookhaven National Laboratory.

  17. Analytical and test equipment: A compilation

    NASA Technical Reports Server (NTRS)

    1975-01-01

    A compilation is presented of innovations in testing and measuring technology for both the laboratory and industry. Topics discussed include spectrometers, radiometers, and descriptions of analytical and test equipment in several areas including thermodynamics, fluid flow, electronics, and materials testing.

  18. A prototype functional language implementation for hierarchical-memory architectures

    SciTech Connect

    Wolski, R.; Feo, J.; Cann, D.

    1991-06-05

    The first implementation of Sisal was designed for general shared-memory architectures. Since then, we have optimized the system for vector and coherent-cache multiprocessors. Coherent-cache systems can be thought of as simple, two-level hierarchical memory systems, where the memory hierarchy is managed by the hardware. The compiler and run-time system for such an architecture need to maintain data locality so that the processor caches are used as much as possible. In this paper, we extend the coherent-cache implementation to include explicit compiler and run-time control for medium-grain and coarse-grain hierarchical-memory architectures. We implemented the extended system on the BBN Butterfly using interleaved shared memory exclusively for the purposes of data sharing and exploiting the per-processor local memories. We give preliminary performance results for this extended system. 10 refs., 7 figs.

  19. System-on-chip architecture and validation for real-time transceiver optimization: APC implementation on FPGA

    NASA Astrophysics Data System (ADS)

    Suarez, Hernan; Zhang, Yan R.

    2015-05-01

    New radar applications need to perform complex algorithms and process large quantities of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low-power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementations of adaptive pulse compression for real-time transceiver optimization are presented; they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such as matrix multiplication and matrix inversion, which are essential units for solving the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through high-performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.

  20. Systems test facilities existing capabilities compilation

    NASA Technical Reports Server (NTRS)

    Weaver, R.

    1981-01-01

    Systems test facilities (STFs) to test total photovoltaic systems and their interfaces are described. The systems development (SD) plan is a compilation of existing and planned STFs, as well as subsystem and key component testing facilities. It is recommended that the existing capabilities compilation be updated annually to provide an assessment of STF activity and to disseminate STF capabilities, status, and availability to the photovoltaics program.

  1. 20 CFR 438.600 - Semi-annual compilation.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Semi-annual compilation. 438.600 Section 438....600 Semi-annual compilation. (a) The Commissioner of Social Security will collect and compile the... semi-annual compilation was submitted on May 31, 1990, and contains a compilation of the...

  2. 20 CFR 438.600 - Semi-annual compilation.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 20 Employees' Benefits 2 2012-04-01 2012-04-01 false Semi-annual compilation. 438.600 Section 438....600 Semi-annual compilation. (a) The Commissioner of Social Security will collect and compile the... semi-annual compilation was submitted on May 31, 1990, and contains a compilation of the...

  3. 20 CFR 438.600 - Semi-annual compilation.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 20 Employees' Benefits 2 2011-04-01 2011-04-01 false Semi-annual compilation. 438.600 Section 438....600 Semi-annual compilation. (a) The Commissioner of Social Security will collect and compile the... semi-annual compilation was submitted on May 31, 1990, and contains a compilation of the...

  4. 20 CFR 438.600 - Semi-annual compilation.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 20 Employees' Benefits 2 2014-04-01 2014-04-01 false Semi-annual compilation. 438.600 Section 438....600 Semi-annual compilation. (a) The Commissioner of Social Security will collect and compile the... semi-annual compilation was submitted on May 31, 1990, and contains a compilation of the...

  5. Adoption and Implementation of a Computer-delivered HIV/STD Risk-Reduction Intervention for African American Adolescent Females Seeking Services at County Health Departments: Implementation Optimization is Urgently Needed

    PubMed Central

    DiClemente, Ralph J.; Bradley, Erin; Davis, Teaniese L.; Brown, Jennifer L.; Ukuku, Mary; Sales, Jessica M.; Rose, Eve S.; Wingood, Gina M.

    2013-01-01

    Although group-delivered HIV/STD risk-reduction interventions for African American adolescent females have proven efficacious, they require significant financial and staffing resources to implement and may not be feasible in personnel- and resource-constrained public health clinics. We conducted a study assessing adoption and implementation of an evidence-based HIV/STD risk-reduction intervention that was translated from a group-delivered modality to a computer-delivered modality to facilitate use in county public health departments. Usage of the computer-delivered intervention was low across eight participating public health clinics. Further investigation is needed to optimize implementation by identifying, understanding and surmounting barriers that hamper timely and efficient implementation of technology-delivered HIV/STD risk-reduction interventions in county public health clinics. PMID:23673891

  6. Optimizing Suicide Prevention Programs and Their Implementation in Europe (OSPI Europe): an evidence-based multi-level approach

    PubMed Central

    2009-01-01

    Background Suicide and non-fatal suicidal behaviour are significant public health issues in Europe requiring effective preventive interventions. However, the evidence for effective preventive strategies is scarce. The protocol of a European research project to develop an optimized evidence based program for suicide prevention is presented. Method The groundwork for this research has been established by a regional community based intervention for suicide prevention that focuses on improving awareness and care for depression performed within the European Alliance Against Depression (EAAD). The EAAD intervention consists of (1) training sessions and practice support for primary care physicians,(2) public relations activities and mass media campaigns, (3) training sessions for community facilitators who serve as gatekeepers for depressed and suicidal persons in the community and treatment and (4) outreach and support for high risk and self-help groups (e.g. helplines). The intervention has been shown to be effective in reducing suicidal behaviour in an earlier study, the Nuremberg Alliance Against Depression. In the context of the current research project described in this paper (OSPI-Europe) the EAAD model is enhanced by other evidence based interventions and implemented simultaneously and in standardised way in four regions in Ireland, Portugal, Hungary and Germany. The enhanced intervention will be evaluated using a prospective controlled design with the primary outcomes being composite suicidal acts (fatal and non-fatal), and with intermediate outcomes being the effect of training programs, changes in public attitudes, guideline-consistent media reporting. In addition an analysis of the economic costs and consequences will be undertaken, while a process evaluation will monitor implementation of the interventions within the different regions with varying organisational and healthcare contexts. Discussion This multi-centre research seeks to overcome major challenges

  7. Compiling high-level languages for configurable computers: applying lessons from heterogeneous processing

    NASA Astrophysics Data System (ADS)

    Weaver, Glen E.; Weems, Charles C.; McKinley, Kathryn S.

    1996-10-01

    Configurable systems offer increased performance by providing hardware that matches the computational structure of a problem. This hardware is currently programmed with CAD tools and explicit library calls. To attain widespread acceptance, configurable computing must become transparently accessible from high-level programming languages, but the changeable nature of the target hardware presents a major challenge to traditional compiler technology. A compiler for a configurable computer should optimize the use of functions embedded in hardware and schedule hardware reconfigurations. The hurdles to be overcome in achieving this capability are similar in some ways to those facing compilation for heterogeneous systems. For example, current traditional compilers have neither an interface to accept new primitive operators nor a mechanism for applying optimizations to new operators. We are building a compiler for heterogeneous computing, called Scale, which replaces the traditional monolithic compiler architecture with a flexible framework. Scale has three main parts: a translation director, a compilation library, and a persistent store which holds our intermediate representation as well as other data structures. The translation director exploits the framework's flexibility by using architectural information to build a plan to direct each compilation. The translation library serves as a toolkit for use by the translation director. Our compiler intermediate representation, Score, facilitates the addition of new IR nodes by distinguishing features used in defining nodes from properties on which transformations depend. In this paper, we present an overview of the Scale architecture and its capabilities for dealing with heterogeneity, followed by a discussion of how those capabilities apply to problems in configurable computing. We then address aspects of configurable computing that are likely to require extensions to our approach and propose some extensions.

  8. Asteroid Spin Vector Compilation V5.0

    NASA Astrophysics Data System (ADS)

    Kryszczynska, A.; Magnussen, P.

    2008-07-01

    This is a comprehensive tabulation of asteroid spin vector determinations, compiled by Agnieszka Kryszczynska and based on the earlier compilation by Per Magnusson. This is the Oct. 21, 2007 version of the compilation, containing 863 spin vector determinations.

  9. Compiling software for a hierarchical distributed processing system

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-12-31

    Compiling software for a hierarchical distributed processing system including providing to one or more compiling nodes software to be compiled, wherein at least a portion of the software to be compiled is to be executed by one or more nodes; compiling, by the compiling node, the software; maintaining, by the compiling node, any compiled software to be executed on the compiling node; selecting, by the compiling node, one or more nodes in a next tier of the hierarchy of the distributed processing system in dependence upon whether any compiled software is for the selected node or the selected node's descendents; sending to the selected node only the compiled software to be executed by the selected node or selected node's descendent.

  10. Compiled MPI: Cost-Effective Exascale Applications Development

    SciTech Connect

    Bronevetsky, G; Quinlan, D; Lumsdaine, A; Hoefler, T

    2012-04-10

    's lifetime. It includes: (1) New set of source code annotations, inserted either manually or automatically, that will clarify the application's use of MPI to the compiler infrastructure, enabling greater accuracy where needed; (2) A compiler transformation framework that leverages these annotations to transform the original MPI source code to improve its performance and scalability; (3) Novel MPI runtime implementation techniques that will provide a rich set of functionality extensions to be used by applications that have been transformed by our compiler; and (4) A novel compiler analysis that leverages simple user annotations to automatically extract the application's communication structure and synthesize most complex code annotations.

  11. Performing aggressive code optimization with an ability to rollback changes made by the aggressive optimizations

    DOEpatents

    Gschwind, Michael K

    2013-07-23

    Mechanisms for aggressively optimizing computer code are provided. With these mechanisms, a compiler determines an optimization to apply to a portion of source code and determines if the optimization as applied to the portion of source code will result in unsafe optimized code that introduces a new source of exceptions being generated by the optimized code. In response to a determination that the optimization is an unsafe optimization, the compiler generates an aggressively compiled code version, in which the unsafe optimization is applied, and a conservatively compiled code version in which the unsafe optimization is not applied. The compiler stores both versions and provides them for execution. Mechanisms are provided for switching between these versions during execution in the event of a failure of the aggressively compiled code version. Moreover, predictive mechanisms are provided for predicting whether such a failure is likely.

  12. A small evaluation suite for Ada compilers

    NASA Technical Reports Server (NTRS)

    Wilke, Randy; Roy, Daniel M.

    1986-01-01

    After completing a small Ada pilot project (an OCC simulator) for the Multi Satellite Operations Control Center (MSOCC) at Goddard last year, the use of Ada to develop OCCs was recommended. To help MSOCC transition toward Ada, a suite of about 100 evaluation programs was developed that can be used to assess Ada compilers. These programs assess the overall quality of the compilation system, the relative efficiencies of the compilers and the environments in which they work, and the size and execution speed of the generated machine code. Another goal of the benchmark software was to provide MSOCC system developers with rough timing estimates for predicting the performance of future systems written in Ada.

  13. Compilation of data on elementary particles

    SciTech Connect

    Trippe, T.G.

    1984-09-01

    The most widely used data compilation in the field of elementary particle physics is the Review of Particle Properties. The origin, development and current state of this compilation are described with emphasis on the features which have contributed to its success: active involvement of particle physicists; critical evaluation and review of the data; completeness of coverage; regular distribution of reliable summaries including a pocket edition; heavy involvement of expert consultants; and international collaboration. The current state of the Review and new developments such as providing interactive access to the Review's database are described. Problems and solutions related to maintaining a strong and supportive relationship between compilation groups and the researchers who produce and use the data are discussed.

  14. Extension of Alvis compiler front-end

    SciTech Connect

    Wypych, Michał; Szpyrka, Marcin; Matyasik, Piotr

    2015-12-31

    Alvis is a formal modelling language that enables verification of distributed concurrent systems. The semantics of an Alvis model is expressed as an LTS graph (labelled transition system): execution of any language statement is expressed as a transition between formally defined states of the model. The LTS graph is generated from a middle-stage Haskell representation of the Alvis model. Haskell is also used as a part of the Alvis language to define parameter types and operations on them. Thanks to the compiler's modular construction, many aspects of the compilation of an Alvis model may be modified. Providing new plugins for the Alvis compiler that support languages like Java or C makes it possible to use these languages as part of Alvis instead of Haskell. The paper presents the compiler's internal model and describes how the default specification language can be altered by new plugins.
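
    Since the semantics of a model is an LTS graph, the generated artifact is conceptually just a set of states and labelled edges, one edge per executed statement. A minimal, hypothetical representation is sketched below; the statement labels are illustrative, not actual Alvis syntax.

      /* Hypothetical, minimal representation of an LTS graph: model states are
       * vertices and each executed statement becomes a labelled transition. */
      #include <stdio.h>

      struct transition {
          int         from;    /* source state id             */
          int         to;      /* target state id             */
          const char *label;   /* statement that was executed */
      };

      int main(void)
      {
          struct transition lts[] = {
              { 0, 1, "A: send value on channel" },
              { 1, 2, "B: receive value"         },
              { 2, 0, "B: update local state"    },
          };
          for (unsigned i = 0; i < sizeof lts / sizeof lts[0]; i++)
              printf("s%d --[%s]--> s%d\n", lts[i].from, lts[i].label, lts[i].to);
          return 0;
      }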

  15. COMPILATION OF CURRENT HIGH ENERGY PHYSICS EXPERIMENTS

    SciTech Connect

    Wohl, C.G.; Kelly, R.L.; Armstrong, F.E.; Horne, C.P.; Hutchinson, M.S.; Rittenberg, A.; Trippe, T.G.; Yost, G.P.; Addis, L.; Ward, C.E.W.; Baggett, N.; Goldschmidt-Clermont, Y.; Joos, P.; Gelfand, N.; Oyanagi, Y.; Grudtsin, S.N.; Ryabov, Yu.G.

    1981-05-01

    This is the fourth edition of our compilation of current high energy physics experiments. It is a collaborative effort of the Berkeley Particle Data Group, the SLAC library, and nine participating laboratories: Argonne (ANL), Brookhaven (BNL), CERN, DESY, Fermilab (FNAL), the Institute for Nuclear Study, Tokyo (INS), KEK, Serpukhov (SERP), and SLAC. The compilation includes summaries of all high energy physics experiments at the above laboratories that (1) were approved (and not subsequently withdrawn) before about April 1981, and (2) had not completed taking data by 1 January 1977. We emphasize that only approved experiments are included.

  16. Machine tools and fixtures: A compilation

    NASA Technical Reports Server (NTRS)

    1971-01-01

    As part of NASA's Technology Utilization Program, a compilation was made of technological developments regarding machine tools, jigs, and fixtures that have been produced, modified, or adapted to meet requirements of the aerospace program. The compilation is divided into three sections that include: (1) a variety of machine tool applications that offer easier and more efficient production techniques; (2) methods, techniques, and hardware that aid in the setup, alignment, and control of machines and machine tools to further quality assurance in finished products; and (3) jigs, fixtures, and adapters that are ancillary to basic machine tools and aid in realizing their greatest potential.

  17. A ROSE-based OpenMP 3.0 Research Compiler Supporting Multiple Runtime Libraries

    SciTech Connect

    Liao, C; Quinlan, D; Panas, T

    2010-01-25

    OpenMP is a popular and evolving programming model for shared-memory platforms. It relies on compilers to deliver optimal performance and to target modern hardware architectures. A variety of extensible and robust research compilers are key to OpenMP's sustainable success in the future. In this paper, we present our efforts to build an OpenMP 3.0 research compiler for C, C++, and Fortran using the ROSE source-to-source compiler framework. Our goal is to support OpenMP research for ourselves and others. We have extended ROSE's internal representation to handle all of the OpenMP 3.0 constructs and facilitate their manipulation. Since OpenMP research is often complicated by the tight coupling of the compiler translations and the runtime system, we present a set of rules to define a common OpenMP runtime library (XOMP) on top of multiple runtime libraries. These rules additionally define how to build a set of translations targeting XOMP. Our work demonstrates how to reuse OpenMP translations across different runtime libraries. This work simplifies OpenMP research by decoupling the problematic dependence between the compiler translations and the runtime libraries. We present an evaluation of our work by demonstrating an analysis tool for OpenMP correctness. We also show how XOMP can be defined using both GOMP and Omni and present comparative performance results against other OpenMP compilers.
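
    The decoupling can be illustrated with a hypothetical outlined parallel region: translated code calls only a thin xomp_* layer, and that layer is the single place that knows whether GOMP, the Omni runtime, or something else sits underneath. The entry-point names below are invented for illustration, and the stub runs the region serially so the sketch is self-contained.

      /* Hypothetical sketch of a thin common runtime layer: translated OpenMP
       * code targets only xomp_* entry points, and each supported runtime
       * library supplies its own implementation of this small interface. */
      #include <stdio.h>

      static void xomp_parallel_start(void (*body)(void *), void *arg, int nthreads)
      {
          /* One backend might forward to the GOMP entry points, another to the
           * Omni runtime; this stub just runs the body once. */
          (void)nthreads;
          body(arg);
      }

      static void xomp_parallel_end(void) { }

      /* What a source-to-source translation of "#pragma omp parallel" might
       * produce: the region body is outlined into its own function. */
      static void outlined_region(void *arg)
      {
          int *counter = arg;
          (*counter)++;
          printf("outlined region ran, counter = %d\n", *counter);
      }

      int main(void)
      {
          int counter = 0;
          xomp_parallel_start(outlined_region, &counter, 4);
          xomp_parallel_end();
          return 0;
      }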

  18. Mystic: Implementation of the Static Dynamic Optimal Control Algorithm for High-Fidelity, Low-Thrust Trajectory Design

    NASA Technical Reports Server (NTRS)

    Whiffen, Gregory J.

    2006-01-01

    Mystic software is designed to compute, analyze, and visualize optimal high-fidelity, low-thrust trajectories. The software can be used to analyze interplanetary, planetocentric, and combination trajectories. Mystic also provides utilities to assist in the operation and navigation of low-thrust spacecraft. Mystic will be used to design and navigate NASA's Dawn Discovery mission to orbit the two largest asteroids. The underlying optimization algorithm used in the Mystic software is called Static/Dynamic Optimal Control (SDC). SDC is a nonlinear optimal control method designed to optimize both 'static variables' (parameters) and dynamic variables (functions of time) simultaneously. SDC is a general nonlinear optimal control algorithm based on Bellman's principle.
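
    In general terms (this is the standard statement of such problems, not necessarily Mystic's exact formulation), optimizing static parameters and dynamic controls simultaneously means solving

      \min_{p,\; u(t)} \; J = \phi\bigl(x(t_f), p\bigr) + \int_{t_0}^{t_f} L\bigl(x(t), u(t), p, t\bigr)\, dt
      \quad \text{subject to} \quad \dot{x}(t) = f\bigl(x(t), u(t), p, t\bigr),

    where p collects the static variables (parameters), u(t) the dynamic variables (controls), and x(t) the resulting trajectory state.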

  19. System, apparatus and methods to implement high-speed network analyzers

    DOEpatents

    Ezick, James; Lethin, Richard; Ros-Giralt, Jordi; Szilagyi, Peter; Wohlford, David E

    2015-11-10

    Systems, apparatus and methods for the implementation of high-speed network analyzers are provided. A set of high-level specifications is used to define the behavior of the network analyzer emitted by a compiler. An optimized inline workflow to process regular expressions is presented without sacrificing the semantic capabilities of the processing engine. An optimized packet dispatcher implements a subset of the functions implemented by the network analyzer, providing a fast- and slow-path workflow used to accelerate specific processing units. Such a dispatcher can also be used as a cache of policies: if a policy is found, the packet manipulations associated with it can be performed quickly. An optimized method of generating DFA specifications for network signatures is also presented. The method accepts several optimization criteria, such as min-max allocations or optimal allocations based on the probability of occurrence of each signature input bit.
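
    A very rough sketch of the fast-path/slow-path split follows: the dispatcher first runs the payload through a precompiled DFA transition table and applies a cached policy on a hit, escalating to the full analysis engine only on a miss. The table, the signature, and the function names are all hypothetical.

      /* Hypothetical fast/slow-path dispatcher: a tiny precompiled DFA that
       * accepts payloads containing "ab" stands in for a generated signature
       * matcher; real tables would be generated from the specifications. */
      #include <stdio.h>

      #define NUM_STATES 3

      static int sym_class(char c) { return c == 'a' ? 0 : c == 'b' ? 1 : 2; }

      static const int next_state[NUM_STATES][3] = {
          /* state 0 */ { 1, 0, 0 },
          /* state 1 */ { 1, 2, 0 },
          /* state 2 */ { 2, 2, 2 },   /* accepting: payload contains "ab" */
      };

      static int dfa_match(const char *payload)
      {
          int s = 0;
          for (; *payload; payload++)
              s = next_state[s][sym_class(*payload)];
          return s == 2;
      }

      static void dispatch(const char *payload)
      {
          if (dfa_match(payload))
              printf("fast path: signature hit on \"%s\", apply cached policy\n", payload);
          else
              printf("slow path: full analysis of \"%s\"\n", payload);
      }

      int main(void)
      {
          dispatch("xxabyy");
          dispatch("bbbb");
          return 0;
      }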

  20. Implementation of an optimal stomatal conductance scheme in the Australian Community Climate Earth Systems Simulator (ACCESS1.3b)

    NASA Astrophysics Data System (ADS)

    Kala, J.; De Kauwe, M. G.; Pitman, A. J.; Lorenz, R.; Medlyn, B. E.; Wang, Y.-P.; Lin, Y.-S.; Abramowitz, G.

    2015-12-01

    We implement a new stomatal conductance scheme, based on the optimality approach, within the Community Atmosphere Biosphere Land Exchange (CABLEv2.0.1) land surface model. Coupled land-atmosphere simulations are then performed using CABLEv2.0.1 within the Australian Community Climate and Earth Systems Simulator (ACCESSv1.3b) with prescribed sea surface temperatures. As in most land surface models, the default stomatal conductance scheme only accounts for differences in model parameters in relation to the photosynthetic pathway but not in relation to plant functional types. The new scheme allows model parameters to vary by plant functional type, based on a global synthesis of observations of stomatal conductance under different climate regimes over a wide range of species. We show that the new scheme reduces the latent heat flux from the land surface over the boreal forests during the Northern Hemisphere summer by 0.5-1.0 mm day-1. This leads to warmer daily maximum and minimum temperatures by up to 1.0 °C and warmer extreme maximum temperatures by up to 1.5 °C. These changes generally improve the climate model's climatology of warm extremes and improve existing biases by 10-20 %. The bias in minimum temperatures is however degraded but, overall, this is outweighed by the improvement in maximum temperatures as there is a net improvement in the diurnal temperature range in this region. In other regions such as parts of South and North America where ACCESSv1.3b has known large positive biases in both maximum and minimum temperatures (~ 5 to 10 °C), the new scheme degrades this bias by up to 1 °C. We conclude that, although several large biases remain in ACCESSv1.3b for temperature extremes, the improvements in the global climate model over large parts of the boreal forests during the Northern Hemisphere summer which result from the new stomatal scheme, constrained by a global synthesis of experimental data, provide a valuable advance in the long-term development
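
    For reference, optimality-based stomatal schemes of this kind (following the approach of Medlyn and co-workers, one of whom is an author here) tie stomatal conductance g_s to net assimilation A, leaf-surface vapour pressure deficit D, and ambient CO2 concentration C_a through a single slope parameter g_1 that can be fitted per plant functional type. A common statement of the model, whose notation may differ from the paper's, is

      g_s \approx g_0 + 1.6 \left( 1 + \frac{g_1}{\sqrt{D}} \right) \frac{A}{C_a},

    where g_0 is the residual conductance and g_1 is the parameter that the new scheme allows to vary by plant functional type.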

  1. Electronic switches and control circuits: A compilation

    NASA Technical Reports Server (NTRS)

    1971-01-01

    This updated compilation in the series on electronic technology presents a carefully selected collection of items on electronic switches and control circuits. Most of the items are based on well-known circuit design concepts that have been simplified or refined to meet NASA's demanding requirements for reliability, simplicity, fail-safe characteristics, and the capability of withstanding environmental extremes.

  2. Heat Transfer and Thermodynamics: a Compilation

    NASA Technical Reports Server (NTRS)

    1974-01-01

    A compilation is presented for the dissemination of information on technological developments which have potential utility outside the aerospace and nuclear communities. Studies include theories and mechanical considerations in the transfer of heat and the thermodynamic properties of matter and the causes and effects of certain interactions.

  3. STS-45 Atlas-1 Compiled Processing Footage

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Compiled footage shows shots of the Atmospheric Laboratory for Applications and Science's (Atlas-1's) move to the test stand at the Operations and Checkout (O&C) Building, the sharp edge inspection, and the Atlas-1 press showing. The STS-45 Atlantis rollover to the Vehicle Assembly Building (VAB) and subsequent rollout to Pad A are seen.

  4. Safety and maintenance engineering: A compilation

    NASA Technical Reports Server (NTRS)

    1974-01-01

    A compilation is presented for the dissemination of information on technological developments which have potential utility outside the aerospace and nuclear communities. Safety of personnel engaged in the handling of hazardous materials and equipment, protection of equipment from fire, high wind, or careless handling by personnel, and techniques for the maintenance of operating equipment are reported.

  5. Multiple Literacies. A Compilation for Adult Educators.

    ERIC Educational Resources Information Center

    Hull, Glynda A.; Mikulecky, Larry; St. Clair, Ralf; Kerka, Sandra

    Recent developments have broadened the definition of literacy to multiple literacies--bodies of knowledge, skills, and social practices with which we understand, interpret, and use the symbol systems of our culture. This compilation looks at the various literacies as the application of critical abilities to several domains of importance to adult…

  6. The dc power circuits: A compilation

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A compilation of reports concerning power circuits is presented for the dissemination of aerospace information to the general public as part of the NASA Technology Utilization Program. The descriptions for the electronic circuits are grouped as follows: dc power supplies, power converters, current-voltage power supply regulators, overload protection circuits, and dc constant current power supplies.

  7. Compilation of information on melter modeling

    SciTech Connect

    Eyler, L.L.

    1996-03-01

    The objective of the task described in this report is to compile information on modeling capabilities for the High-Temperature Melter and the Cold Crucible Melter and issue a modeling capabilities letter report summarizing existing modeling capabilities. The report is to include strategy recommendations for future modeling efforts to support the High Level Waste (HLW) melter development.

  8. COMPILATION OF LEVEL 1 ENVIRONMENTAL ASSESSMENT DATA

    EPA Science Inventory

    The report gives currently available chemical data from 19 Level 1 environmental assessment studies, compiled in standard format. The data are organized within each study by the analytical technique used to generate them. Inorganic data generated by spark source mass spectroscopy...

  9. Electronic test and calibration circuits, a compilation

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A wide variety of simple test and calibration circuits are compiled for the engineer and laboratory technician. The majority of the circuits were found to be inexpensive to assemble. Testing of electronic devices and components, instrument and system tests, calibration and reference circuits, and simple test procedures are presented.

  10. A quantum logic network for implementing optimal symmetric universal and phase-covariant telecloning of a bipartite entangled state

    NASA Astrophysics Data System (ADS)

    Meng, Fanyu; Zhu, Aidong

    2008-10-01

    A quantum logic network to implement quantum telecloning is presented in this paper. The network includes two parts: the first part is used to create the telecloning channel and the second part to teleport the state. It can be used not only to implement universal telecloning for a bipartite entangled state which is completely unknown, but also to implement the phase-covariant telecloning for one that is partially known. Furthermore, the network can also be used to construct a tele-triplicator. It can easily be implemented in experiment because only single- and two-qubit operations are used in the network.
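
    The remark that only single- and two-qubit operations are needed can be made concrete with a generic sketch of how such gates act on a state vector; this is standard gate algebra, not the specific telecloning network of the paper.

      /* Generic sketch: apply a single-qubit gate (Hadamard on qubit 0) and a
       * two-qubit gate (CNOT) to a 2-qubit state vector.  Any network built
       * only from such operations can be simulated this way. */
      #include <stdio.h>
      #include <math.h>
      #include <complex.h>

      int main(void)
      {
          double complex psi[4] = { 1.0, 0.0, 0.0, 0.0 };   /* start in |00> */
          double s = 1.0 / sqrt(2.0);

          /* Hadamard on qubit 0 (most significant): mix |0q> and |1q> for each
           * value q of the other qubit. */
          for (int q = 0; q < 2; q++) {
              double complex a = psi[q], b = psi[2 + q];
              psi[q]     = s * (a + b);
              psi[2 + q] = s * (a - b);
          }

          /* CNOT with qubit 0 as control and qubit 1 as target: swap the
           * amplitudes of |10> and |11>. */
          double complex t = psi[2];
          psi[2] = psi[3];
          psi[3] = t;

          /* The result is the Bell state (|00> + |11>)/sqrt(2). */
          for (int i = 0; i < 4; i++)
              printf("amplitude of |%d%d>: %.3f\n", (i >> 1) & 1, i & 1, creal(psi[i]));
          return 0;
      }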