A survey of compiler optimization techniques
NASA Technical Reports Server (NTRS)
Schneck, P. B.
1972-01-01
Major optimization techniques of compilers are described and grouped into three categories: machine dependent, architecture dependent, and architecture independent. Machine-dependent optimizations tend to be local and are performed upon short spans of generated code by using particular properties of an instruction set to reduce the time or space required by a program. Architecture-dependent optimizations are global and are performed while generating code. These optimizations consider the structure of a computer, but not its detailed instruction set. Architecture independent optimizations are also global but are based on analysis of the program flow graph and the dependencies among statements of source program. A conceptual review of a universal optimizer that performs architecture-independent optimizations at source-code level is also presented.
Trellis coding techniques for mobile communications
NASA Technical Reports Server (NTRS)
Divsalar, D.; Simon, M. K.; Jedrey, T.
1988-01-01
A criterion for designing optimum trellis codes to be used over fading channels is given. A technique is shown for reducing certain multiple trellis codes, optimally designed for the fading channel, to conventional (i.e., multiplicity one) trellis codes. The computational cutoff rate R0 is evaluated for MPSK transmitted over fading channels. Examples of trellis codes optimally designed for the Rayleigh fading channel are given and compared with respect to R0. Two types of modulation/demodulation techniques are considered, namely coherent (using pilot tone-aided carrier recovery) and differentially coherent with Doppler frequency correction. Simulation results are given for end-to-end performance of two trellis-coded systems.
Product code optimization for determinate state LDPC decoding in robust image transmission.
Thomos, Nikolaos; Boulgouris, Nikolaos V; Strintzis, Michael G
2006-08-01
We propose a novel scheme for error-resilient image transmission. The proposed scheme employs a product coder consisting of low-density parity check (LDPC) codes and Reed-Solomon codes in order to deal effectively with bit errors. The efficiency of the proposed scheme is based on the exploitation of determinate symbols in Tanner graph decoding of LDPC codes and a novel product code optimization technique based on error estimation. Experimental evaluation demonstrates the superiority of the proposed system in comparison to recent state-of-the-art techniques for image transmission.
NASA Astrophysics Data System (ADS)
Imtiaz, Waqas A.; Ilyas, M.; Khan, Yousaf
2016-11-01
This paper propose a new code to optimize the performance of spectral amplitude coding-optical code division multiple access (SAC-OCDMA) system. The unique two-matrix structure of the proposed enhanced multi diagonal (EMD) code and effective correlation properties, between intended and interfering subscribers, significantly elevates the performance of SAC-OCDMA system by negating multiple access interference (MAI) and associated phase induce intensity noise (PIIN). Performance of SAC-OCDMA system based on the proposed code is thoroughly analyzed for two detection techniques through analytic and simulation analysis by referring to bit error rate (BER), signal to noise ratio (SNR) and eye patterns at the receiving end. It is shown that EMD code while using SDD technique provides high transmission capacity, reduces the receiver complexity, and provides better performance as compared to complementary subtraction detection (CSD) technique. Furthermore, analysis shows that, for a minimum acceptable BER of 10-9 , the proposed system supports 64 subscribers at data rates of up to 2 Gbps for both up-down link transmission.
Techniques for the analysis of data from coded-mask X-ray telescopes
NASA Technical Reports Server (NTRS)
Skinner, G. K.; Ponman, T. J.; Hammersley, A. P.; Eyles, C. J.
1987-01-01
Several techniques useful in the analysis of data from coded-mask telescopes are presented. Methods of handling changes in the instrument pointing direction are reviewed and ways of using FFT techniques to do the deconvolution considered. Emphasis is on techniques for optimally-coded systems, but it is shown that the range of systems included in this class can be extended through the new concept of 'partial cycle averaging'.
Multidisciplinary Aerospace Systems Optimization: Computational AeroSciences (CAS) Project
NASA Technical Reports Server (NTRS)
Kodiyalam, S.; Sobieski, Jaroslaw S. (Technical Monitor)
2001-01-01
The report describes a method for performing optimization of a system whose analysis is so expensive that it is impractical to let the optimization code invoke it directly because excessive computational cost and elapsed time might result. In such situation it is imperative to have user control the number of times the analysis is invoked. The reported method achieves that by two techniques in the Design of Experiment category: a uniform dispersal of the trial design points over a n-dimensional hypersphere and a response surface fitting, and the technique of krigging. Analyses of all the trial designs whose number may be set by the user are performed before activation of the optimization code and the results are stored as a data base. That code is then executed and referred to the above data base. Two applications, one of the airborne laser system, and one of an aircraft optimization illustrate the method application.
DOE Office of Scientific and Technical Information (OSTI.GOV)
MAGEE,GLEN I.
Computers transfer data in a number of different ways. Whether through a serial port, a parallel port, over a modem, over an ethernet cable, or internally from a hard disk to memory, some data will be lost. To compensate for that loss, numerous error detection and correction algorithms have been developed. One of the most common error correction codes is the Reed-Solomon code, which is a special subset of BCH (Bose-Chaudhuri-Hocquenghem) linear cyclic block codes. In the AURA project, an unmanned aircraft sends the data it collects back to earth so it can be analyzed during flight and possible flightmore » modifications made. To counter possible data corruption during transmission, the data is encoded using a multi-block Reed-Solomon implementation with a possibly shortened final block. In order to maximize the amount of data transmitted, it was necessary to reduce the computation time of a Reed-Solomon encoding to three percent of the processor's time. To achieve such a reduction, many code optimization techniques were employed. This paper outlines the steps taken to reduce the processing time of a Reed-Solomon encoding and the insight into modern optimization techniques gained from the experience.« less
Channel modeling, signal processing and coding for perpendicular magnetic recording
NASA Astrophysics Data System (ADS)
Wu, Zheng
With the increasing areal density in magnetic recording systems, perpendicular recording has replaced longitudinal recording to overcome the superparamagnetic limit. Studies on perpendicular recording channels including aspects of channel modeling, signal processing and coding techniques are presented in this dissertation. To optimize a high density perpendicular magnetic recording system, one needs to know the tradeoffs between various components of the system including the read/write transducers, the magnetic medium, and the read channel. We extend the work by Chaichanavong on the parameter optimization for systems via design curves. Different signal processing and coding techniques are studied. Information-theoretic tools are utilized to determine the acceptable region for the channel parameters when optimal detection and linear coding techniques are used. Our results show that a considerable gain can be achieved by the optimal detection and coding techniques. The read-write process in perpendicular magnetic recording channels includes a number of nonlinear effects. Nonlinear transition shift (NLTS) is one of them. The signal distortion induced by NLTS can be reduced by write precompensation during data recording. We numerically evaluate the effect of NLTS on the read-back signal and examine the effectiveness of several write precompensation schemes in combating NLTS in a channel characterized by both transition jitter noise and additive white Gaussian electronics noise. We also present an analytical method to estimate the bit-error-rate and use it to help determine the optimal write precompensation values in multi-level precompensation schemes. We propose a mean-adjusted pattern-dependent noise predictive (PDNP) detection algorithm for use on the channel with NLTS. We show that this detector can offer significant improvements in bit-error-rate (BER) compared to conventional Viterbi and PDNP detectors. Moreover, the system performance can be further improved by combining the new detector with a simple write precompensation scheme. Soft-decision decoding for algebraic codes can improve performance for magnetic recording systems. In this dissertation, we propose two soft-decision decoding methods for tensor-product parity codes. We also present a list decoding algorithm for generalized error locating codes.
The Use of a Code-generating System for the Derivation of the Equations for Wind Turbine Dynamics
NASA Astrophysics Data System (ADS)
Ganander, Hans
2003-10-01
For many reasons the size of wind turbines on the rapidly growing wind energy market is increasing. Relations between aeroelastic properties of these new large turbines change. Modifications of turbine designs and control concepts are also influenced by growing size. All these trends require development of computer codes for design and certification. Moreover, there is a strong desire for design optimization procedures, which require fast codes. General codes, e.g. finite element codes, normally allow such modifications and improvements of existing wind turbine models. This is done relatively easy. However, the calculation times of such codes are unfavourably long, certainly for optimization use. The use of an automatic code generating system is an alternative for relevance of the two key issues, the code and the design optimization. This technique can be used for rapid generation of codes of particular wind turbine simulation models. These ideas have been followed in the development of new versions of the wind turbine simulation code VIDYN. The equations of the simulation model were derived according to the Lagrange equation and using Mathematica®, which was directed to output the results in Fortran code format. In this way the simulation code is automatically adapted to an actual turbine model, in terms of subroutines containing the equations of motion, definitions of parameters and degrees of freedom. Since the start in 1997, these methods, constituting a systematic way of working, have been used to develop specific efficient calculation codes. The experience with this technique has been very encouraging, inspiring the continued development of new versions of the simulation code as the need has arisen, and the interest for design optimization is growing.
NASA Technical Reports Server (NTRS)
Reichel, R. H.; Hague, D. S.; Jones, R. T.; Glatt, C. R.
1973-01-01
This computer program manual describes in two parts the automated combustor design optimization code AUTOCOM. The program code is written in the FORTRAN 4 language. The input data setup and the program outputs are described, and a sample engine case is discussed. The program structure and programming techniques are also described, along with AUTOCOM program analysis.
Optimization of algorithm of coding of genetic information of Chlamydia
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Ulyanov, Sergey S.; Zaytsev, Sergey S.; Saltykov, Yury V.; Ulianova, Onega V.
2018-04-01
New method of coding of genetic information using coherent optical fields is developed. Universal technique of transformation of nucleotide sequences of bacterial gene into laser speckle pattern is suggested. Reference speckle patterns of the nucleotide sequences of omp1 gene of typical wild strains of Chlamydia trachomatis of genovars D, E, F, G, J and K and Chlamydia psittaci serovar I as well are generated. Algorithm of coding of gene information into speckle pattern is optimized. Fully developed speckles with Gaussian statistics for gene-based speckles have been used as criterion of optimization.
NASA Astrophysics Data System (ADS)
Vesselinov, V. V.; Harp, D.
2010-12-01
The process of decision making to protect groundwater resources requires a detailed estimation of uncertainties in model predictions. Various uncertainties associated with modeling a natural system, such as: (1) measurement and computational errors; (2) uncertainties in the conceptual model and model-parameter estimates; (3) simplifications in model setup and numerical representation of governing processes, contribute to the uncertainties in the model predictions. Due to this combination of factors, the sources of predictive uncertainties are generally difficult to quantify individually. Decision support related to optimal design of monitoring networks requires (1) detailed analyses of existing uncertainties related to model predictions of groundwater flow and contaminant transport, (2) optimization of the proposed monitoring network locations in terms of their efficiency to detect contaminants and provide early warning. We apply existing and newly-proposed methods to quantify predictive uncertainties and to optimize well locations. An important aspect of the analysis is the application of newly-developed optimization technique based on coupling of Particle Swarm and Levenberg-Marquardt optimization methods which proved to be robust and computationally efficient. These techniques and algorithms are bundled in a software package called MADS. MADS (Model Analyses for Decision Support) is an object-oriented code that is capable of performing various types of model analyses and supporting model-based decision making. The code can be executed under different computational modes, which include (1) sensitivity analyses (global and local), (2) Monte Carlo analysis, (3) model calibration, (4) parameter estimation, (5) uncertainty quantification, and (6) model selection. The code can be externally coupled with any existing model simulator through integrated modules that read/write input and output files using a set of template and instruction files (consistent with the PEST I/O protocol). MADS can also be internally coupled with a series of built-in analytical simulators. MADS provides functionality to work directly with existing control files developed for the code PEST (Doherty 2009). To perform the computational modes mentioned above, the code utilizes (1) advanced Latin-Hypercube sampling techniques (including Improved Distributed Sampling), (2) various gradient-based Levenberg-Marquardt optimization methods, (3) advanced global optimization methods (including Particle Swarm Optimization), and (4) a selection of alternative objective functions. The code has been successfully applied to perform various model analyses related to environmental management of real contamination sites. Examples include source identification problems, quantification of uncertainty, model calibration, and optimization of monitoring networks. The methodology and software codes are demonstrated using synthetic and real case studies where monitoring networks are optimized taking into account the uncertainty in model predictions of contaminant transport.
Next-generation acceleration and code optimization for light transport in turbid media using GPUs
Alerstam, Erik; Lo, William Chun Yip; Han, Tianyi David; Rose, Jonathan; Andersson-Engels, Stefan; Lilge, Lothar
2010-01-01
A highly optimized Monte Carlo (MC) code package for simulating light transport is developed on the latest graphics processing unit (GPU) built for general-purpose computing from NVIDIA - the Fermi GPU. In biomedical optics, the MC method is the gold standard approach for simulating light transport in biological tissue, both due to its accuracy and its flexibility in modelling realistic, heterogeneous tissue geometry in 3-D. However, the widespread use of MC simulations in inverse problems, such as treatment planning for PDT, is limited by their long computation time. Despite its parallel nature, optimizing MC code on the GPU has been shown to be a challenge, particularly when the sharing of simulation result matrices among many parallel threads demands the frequent use of atomic instructions to access the slow GPU global memory. This paper proposes an optimization scheme that utilizes the fast shared memory to resolve the performance bottleneck caused by atomic access, and discusses numerous other optimization techniques needed to harness the full potential of the GPU. Using these techniques, a widely accepted MC code package in biophotonics, called MCML, was successfully accelerated on a Fermi GPU by approximately 600x compared to a state-of-the-art Intel Core i7 CPU. A skin model consisting of 7 layers was used as the standard simulation geometry. To demonstrate the possibility of GPU cluster computing, the same GPU code was executed on four GPUs, showing a linear improvement in performance with an increasing number of GPUs. The GPU-based MCML code package, named GPU-MCML, is compatible with a wide range of graphics cards and is released as an open-source software in two versions: an optimized version tuned for high performance and a simplified version for beginners (http://code.google.com/p/gpumcml). PMID:21258498
Code Optimization and Parallelization on the Origins: Looking from Users' Perspective
NASA Technical Reports Server (NTRS)
Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)
2002-01-01
Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.
Palkowski, Marek; Bielecki, Wlodzimierz
2017-06-02
RNA secondary structure prediction is a compute intensive task that lies at the core of several search algorithms in bioinformatics. Fortunately, the RNA folding approaches, such as the Nussinov base pair maximization, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. Polyhedral compilation techniques have proven to be a powerful tool for optimization of dense array codes. However, classical affine loop nest transformations used with these techniques do not optimize effectively codes of dynamic programming of RNA structure predictions. The purpose of this paper is to present a novel approach allowing for generation of a parallel tiled Nussinov RNA loop nest exposing significantly higher performance than that of known related code. This effect is achieved due to improving code locality and calculation parallelization. In order to improve code locality, we apply our previously published technique of automatic loop nest tiling to all the three loops of the Nussinov loop nest. This approach first forms original rectangular 3D tiles and then corrects them to establish their validity by means of applying the transitive closure of a dependence graph. To produce parallel code, we apply the loop skewing technique to a tiled Nussinov loop nest. The technique is implemented as a part of the publicly available polyhedral source-to-source TRACO compiler. Generated code was run on modern Intel multi-core processors and coprocessors. We present the speed-up factor of generated Nussinov RNA parallel code and demonstrate that it is considerably faster than related codes in which only the two outer loops of the Nussinov loop nest are tiled.
Santos, José; Monteagudo, Ángel
2017-03-27
The canonical code, although prevailing in complex genomes, is not universal. It was shown the canonical genetic code superior robustness compared to random codes, but it is not clearly determined how it evolved towards its current form. The error minimization theory considers the minimization of point mutation adverse effect as the main selection factor in the evolution of the code. We have used simulated evolution in a computer to search for optimized codes, which helps to obtain information about the optimization level of the canonical code in its evolution. A genetic algorithm searches for efficient codes in a fitness landscape that corresponds with the adaptability of possible hypothetical genetic codes. The lower the effects of errors or mutations in the codon bases of a hypothetical code, the more efficient or optimal is that code. The inclusion of the fitness sharing technique in the evolutionary algorithm allows the extent to which the canonical genetic code is in an area corresponding to a deep local minimum to be easily determined, even in the high dimensional spaces considered. The analyses show that the canonical code is not in a deep local minimum and that the fitness landscape is not a multimodal fitness landscape with deep and separated peaks. Moreover, the canonical code is clearly far away from the areas of higher fitness in the landscape. Given the non-presence of deep local minima in the landscape, although the code could evolve and different forces could shape its structure, the fitness landscape nature considered in the error minimization theory does not explain why the canonical code ended its evolution in a location which is not an area of a localized deep minimum of the huge fitness landscape.
Study of information transfer optimization for communication satellites
NASA Technical Reports Server (NTRS)
Odenwalder, J. P.; Viterbi, A. J.; Jacobs, I. M.; Heller, J. A.
1973-01-01
The results are presented of a study of source coding, modulation/channel coding, and systems techniques for application to teleconferencing over high data rate digital communication satellite links. Simultaneous transmission of video, voice, data, and/or graphics is possible in various teleconferencing modes and one-way, two-way, and broadcast modes are considered. A satellite channel model including filters, limiter, a TWT, detectors, and an optimized equalizer is treated in detail. A complete analysis is presented for one set of system assumptions which exclude nonlinear gain and phase distortion in the TWT. Modulation, demodulation, and channel coding are considered, based on an additive white Gaussian noise channel model which is an idealization of an equalized channel. Source coding with emphasis on video data compression is reviewed, and the experimental facility utilized to test promising techniques is fully described.
NASA Astrophysics Data System (ADS)
Jiménez-Varona, J.; Ponsin Roca, J.
2015-06-01
Under a contract with AIRBUS MILITARY (AI-M), an exercise to analyze the potential of optimization techniques to improve the wing performances at cruise conditions has been carried out by using an in-house design code. The original wing was provided by AI-M and several constraints were posed for the redesign. To maximize the aerodynamic efficiency at cruise, optimizations were performed using the design techniques developed internally at INTA under a research program (Programa de Termofluidodinámica). The code is a gradient-based optimizaa tion code, which uses classical finite differences approach for gradient computations. Several techniques for search direction computation are implemented for unconstrained and constrained problems. Techniques for geometry modifications are based on different approaches which include perturbation functions for the thickness and/or mean line distributions and others by Bézier curves fitting of certain degree. It is very e important to afford a real design which involves several constraints that reduce significantly the feasible design space. And the assessment of the code is needed in order to check the capabilities and the possible drawbacks. Lessons learnt will help in the development of future enhancements. In addition, the validation of the results was done using also the well-known TAU flow solver and a far-field drag method in order to determine accurately the improvement in terms of drag counts.
Enhanced Multiobjective Optimization Technique for Comprehensive Aerospace Design. Part A
NASA Technical Reports Server (NTRS)
Chattopadhyay, Aditi; Rajadas, John N.
1997-01-01
A multidisciplinary design optimization procedure which couples formal multiobjectives based techniques and complex analysis procedures (such as computational fluid dynamics (CFD) codes) developed. The procedure has been demonstrated on a specific high speed flow application involving aerodynamics and acoustics (sonic boom minimization). In order to account for multiple design objectives arising from complex performance requirements, multiobjective formulation techniques are used to formulate the optimization problem. Techniques to enhance the existing Kreisselmeier-Steinhauser (K-S) function multiobjective formulation approach have been developed. The K-S function procedure used in the proposed work transforms a constrained multiple objective functions problem into an unconstrained problem which then is solved using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. Weight factors are introduced during the transformation process to each objective function. This enhanced procedure will provide the designer the capability to emphasize specific design objectives during the optimization process. The demonstration of the procedure utilizes a computational Fluid dynamics (CFD) code which solves the three-dimensional parabolized Navier-Stokes (PNS) equations for the flow field along with an appropriate sonic boom evaluation procedure thus introducing both aerodynamic performance as well as sonic boom as the design objectives to be optimized simultaneously. Sensitivity analysis is performed using a discrete differentiation approach. An approximation technique has been used within the optimizer to improve the overall computational efficiency of the procedure in order to make it suitable for design applications in an industrial setting.
Advanced technology development for image gathering, coding, and processing
NASA Technical Reports Server (NTRS)
Huck, Friedrich O.
1990-01-01
Three overlapping areas of research activities are presented: (1) Information theory and optimal filtering are extended to visual information acquisition and processing. The goal is to provide a comprehensive methodology for quantitatively assessing the end-to-end performance of image gathering, coding, and processing. (2) Focal-plane processing techniques and technology are developed to combine effectively image gathering with coding. The emphasis is on low-level vision processing akin to the retinal processing in human vision. (3) A breadboard adaptive image-coding system is being assembled. This system will be used to develop and evaluate a number of advanced image-coding technologies and techniques as well as research the concept of adaptive image coding.
NASA Technical Reports Server (NTRS)
Sreekanta Murthy, T.
1992-01-01
Results of the investigation of formal nonlinear programming-based numerical optimization techniques of helicopter airframe vibration reduction are summarized. The objective and constraint function and the sensitivity expressions used in the formulation of airframe vibration optimization problems are presented and discussed. Implementation of a new computational procedure based on MSC/NASTRAN and CONMIN in a computer program system called DYNOPT for optimizing airframes subject to strength, frequency, dynamic response, and dynamic stress constraints is described. An optimization methodology is proposed which is thought to provide a new way of applying formal optimization techniques during the various phases of the airframe design process. Numerical results obtained from the application of the DYNOPT optimization code to a helicopter airframe are discussed.
Optimal Near-Hitless Network Failure Recovery Using Diversity Coding
ERIC Educational Resources Information Center
Avci, Serhat Nazim
2013-01-01
Link failures in wide area networks are common and cause significant data losses. Mesh-based protection schemes offer high capacity efficiency but they are slow, require complex signaling, and instable. Diversity coding is a proactive coding-based recovery technique which offers near-hitless (sub-ms) restoration with a competitive spare capacity…
Optimization technique of wavefront coding system based on ZEMAX externally compiled programs
NASA Astrophysics Data System (ADS)
Han, Libo; Dong, Liquan; Liu, Ming; Zhao, Yuejin; Liu, Xiaohua
2016-10-01
Wavefront coding technique as a means of athermalization applied to infrared imaging system, the design of phase plate is the key to system performance. This paper apply the externally compiled programs of ZEMAX to the optimization of phase mask in the normal optical design process, namely defining the evaluation function of wavefront coding system based on the consistency of modulation transfer function (MTF) and improving the speed of optimization by means of the introduction of the mathematical software. User write an external program which computes the evaluation function on account of the powerful computing feature of the mathematical software in order to find the optimal parameters of phase mask, and accelerate convergence through generic algorithm (GA), then use dynamic data exchange (DDE) interface between ZEMAX and mathematical software to realize high-speed data exchanging. The optimization of the rotational symmetric phase mask and the cubic phase mask have been completed by this method, the depth of focus increases nearly 3 times by inserting the rotational symmetric phase mask, while the other system with cubic phase mask can be increased to 10 times, the consistency of MTF decrease obviously, the maximum operating temperature of optimized system range between -40°-60°. Results show that this optimization method can be more convenient to define some unconventional optimization goals and fleetly to optimize optical system with special properties due to its externally compiled function and DDE, there will be greater significance for the optimization of unconventional optical system.
CometBoards Users Manual Release 1.0
NASA Technical Reports Server (NTRS)
Guptill, James D.; Coroneos, Rula M.; Patnaik, Surya N.; Hopkins, Dale A.; Berke, Lazlo
1996-01-01
Several nonlinear mathematical programming algorithms for structural design applications are available at present. These include the sequence of unconstrained minimizations technique, the method of feasible directions, and the sequential quadratic programming technique. The optimality criteria technique and the fully utilized design concept are two other structural design methods. A project was undertaken to bring all these design methods under a common computer environment so that a designer can select any one of these tools that may be suitable for his/her application. To facilitate selection of a design algorithm, to validate and check out the computer code, and to ascertain the relative merits of the design tools, modest finite element structural analysis programs based on the concept of stiffness and integrated force methods have been coupled to each design method. The code that contains both these design and analysis tools, by reading input information from analysis and design data files, can cast the design of a structure as a minimum-weight optimization problem. The code can then solve it with a user-specified optimization technique and a user-specified analysis method. This design code is called CometBoards, which is an acronym for Comparative Evaluation Test Bed of Optimization and Analysis Routines for the Design of Structures. This manual describes for the user a step-by-step procedure for setting up the input data files and executing CometBoards to solve a structural design problem. The manual includes the organization of CometBoards; instructions for preparing input data files; the procedure for submitting a problem; illustrative examples; and several demonstration problems. A set of 29 structural design problems have been solved by using all the optimization methods available in CometBoards. A summary of the optimum results obtained for these problems is appended to this users manual. CometBoards, at present, is available for Posix-based Cray and Convex computers, Iris and Sun workstations, and the VM/CMS system.
Evaluating and minimizing noise impact due to aircraft flyover
NASA Technical Reports Server (NTRS)
Jacobson, I. D.; Cook, G.
1979-01-01
Existing techniques were used to assess the noise impact on a community due to aircraft operation and to optimize the flight paths of an approaching aircraft with respect to the annoyance produced. Major achievements are: (1) the development of a population model suitable for determining the noise impact, (2) generation of a numerical computer code which uses this population model along with the steepest descent algorithm to optimize approach/landing trajectories, (3) implementation of this optimization code in several fictitious cases as well as for the community surrounding Patrick Henry International Airport, Virginia.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.-L.
2015-05-01
Intel Many Integrated Core (MIC) ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our results of optimizing the updated Goddard shortwave radiation Weather Research and Forecasting (WRF) scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The co-processor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of Xeon Phi will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 1.3x.
NASA Technical Reports Server (NTRS)
Fishbach, L. H.
1979-01-01
The computational techniques utilized to determine the optimum propulsion systems for future aircraft applications and to identify system tradeoffs and technology requirements are described. The characteristics and use of the following computer codes are discussed: (1) NNEP - a very general cycle analysis code that can assemble an arbitrary matrix fans, turbines, ducts, shafts, etc., into a complete gas turbine engine and compute on- and off-design thermodynamic performance; (2) WATE - a preliminary design procedure for calculating engine weight using the component characteristics determined by NNEP; (3) POD DRG - a table look-up program to calculate wave and friction drag of nacelles; (4) LIFCYC - a computer code developed to calculate life cycle costs of engines based on the output from WATE; and (5) INSTAL - a computer code developed to calculate installation effects, inlet performance and inlet weight. Examples are given to illustrate how these computer techniques can be applied to analyze and optimize propulsion system fuel consumption, weight, and cost for representative types of aircraft and missions.
Merits and limitations of optimality criteria method for structural optimization
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Guptill, James D.; Berke, Laszlo
1993-01-01
The merits and limitations of the optimality criteria (OC) method for the minimum weight design of structures subjected to multiple load conditions under stress, displacement, and frequency constraints were investigated by examining several numerical examples. The examples were solved utilizing the Optimality Criteria Design Code that was developed for this purpose at NASA Lewis Research Center. This OC code incorporates OC methods available in the literature with generalizations for stress constraints, fully utilized design concepts, and hybrid methods that combine both techniques. Salient features of the code include multiple choices for Lagrange multiplier and design variable update methods, design strategies for several constraint types, variable linking, displacement and integrated force method analyzers, and analytical and numerical sensitivities. The performance of the OC method, on the basis of the examples solved, was found to be satisfactory for problems with few active constraints or with small numbers of design variables. For problems with large numbers of behavior constraints and design variables, the OC method appears to follow a subset of active constraints that can result in a heavier design. The computational efficiency of OC methods appears to be similar to some mathematical programming techniques.
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.
1991-01-01
Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
Throughput of Coded Optical CDMA Systems with AND Detectors
NASA Astrophysics Data System (ADS)
Memon, Kehkashan A.; Umrani, Fahim A.; Umrani, A. W.; Umrani, Naveed A.
2012-09-01
Conventional detection techniques used in optical code-division multiple access (OCDMA) systems are not optimal and result in poor bit error rate performance. This paper analyzes the coded performance of optical CDMA systems with AND detectors for enhanced throughput efficiencies and improved error rate performance. The results show that the use of AND detectors significantly improve the performance of an optical channel.
NASA Technical Reports Server (NTRS)
Johnson, Theodore F.; Waters, W. Allen; Singer, Thomas N.; Haftka, Raphael T.
2004-01-01
A next generation reusable launch vehicle (RLV) will require thermally efficient and light-weight cryogenic propellant tank structures. Since these tanks will be weight-critical, analytical tools must be developed to aid in sizing the thickness of insulation layers and structural geometry for optimal performance. Finite element method (FEM) models of the tank and insulation layers were created to analyze the thermal performance of the cryogenic insulation layer and thermal protection system (TPS) of the tanks. The thermal conditions of ground-hold and re-entry/soak-through for a typical RLV mission were used in the thermal sizing study. A general-purpose nonlinear FEM analysis code, capable of using temperature and pressure dependent material properties, was used as the thermal analysis code. Mechanical loads from ground handling and proof-pressure testing were used to size the structural geometry of an aluminum cryogenic tank wall. Nonlinear deterministic optimization and reliability optimization techniques were the analytical tools used to size the geometry of the isogrid stiffeners and thickness of the skin. The results from the sizing study indicate that a commercial FEM code can be used for thermal analyses to size the insulation thicknesses where the temperature and pressure were varied. The results from the structural sizing study show that using combined deterministic and reliability optimization techniques can obtain alternate and lighter designs than the designs obtained from deterministic optimization methods alone.
On the optimality of a universal noiseless coder
NASA Technical Reports Server (NTRS)
Yeh, Pen-Shu; Rice, Robert F.; Miller, Warner H.
1993-01-01
Rice developed a universal noiseless coding structure that provides efficient performance over an extremely broad range of source entropy. This is accomplished by adaptively selecting the best of several easily implemented variable length coding algorithms. Variations of such noiseless coders have been used in many NASA applications. Custom VLSI coder and decoder modules capable of processing over 50 million samples per second have been fabricated and tested. In this study, the first of the code options used in this module development is shown to be equivalent to a class of Huffman code under the Humblet condition, for source symbol sets having a Laplacian distribution. Except for the default option, other options are shown to be equivalent to the Huffman codes of a modified Laplacian symbol set, at specified symbol entropy values. Simulation results are obtained on actual aerial imagery over a wide entropy range, and they confirm the optimality of the scheme. Comparison with other known techniques are performed on several widely used images and the results further validate the coder's optimality.
Necessary conditions for the optimality of variable rate residual vector quantizers
NASA Technical Reports Server (NTRS)
Kossentini, Faouzi; Smith, Mark J. T.; Barnes, Christopher F.
1993-01-01
Residual vector quantization (RVQ), or multistage VQ, as it is also called, has recently been shown to be a competitive technique for data compression. The competitive performance of RVQ reported in results from the joint optimization of variable rate encoding and RVQ direct-sum code books. In this paper, necessary conditions for the optimality of variable rate RVQ's are derived, and an iterative descent algorithm based on a Lagrangian formulation is introduced for designing RVQ's having minimum average distortion subject to an entropy constraint. Simulation results for these entropy-constrained RVQ's (EC-RVQ's) are presented for memory less Gaussian, Laplacian, and uniform sources. A Gauss-Markov source is also considered. The performance is superior to that of entropy-constrained scalar quantizers (EC-SQ's) and practical entropy-constrained vector quantizers (EC-VQ's), and is competitive with that of some of the best source coding techniques that have appeared in the literature.
NASA Astrophysics Data System (ADS)
Hsueh, Yu-Li; Rogge, Matthew S.; Shaw, Wei-Tao; Kim, Jaedon; Yamamoto, Shu; Kazovsky, Leonid G.
2005-09-01
A simple and cost-effective upgrade of existing passive optical networks (PONs) is proposed, which realizes service overlay by novel spectral-shaping line codes. A hierarchical coding procedure allows processing simplicity and achieves desired long-term spectral properties. Different code rates are supported, and the spectral shape can be properly tailored to adapt to different systems. The computation can be simplified by quantization of trigonometric functions. DC balance is achieved by passing the dc residual between processing windows. The proposed line codes tend to introduce bit transitions to avoid long consecutive identical bits and facilitate receiver clock recovery. Experiments demonstrate and compare several different optimized line codes. For a specific tolerable interference level, the optimal line code can easily be determined, which maximizes the data throughput. The service overlay using the line-coding technique leaves existing services and field-deployed fibers untouched but fully functional, providing a very flexible and economic way to upgrade existing PONs.
Metamodels for Computer-Based Engineering Design: Survey and Recommendations
NASA Technical Reports Server (NTRS)
Simpson, Timothy W.; Peplinski, Jesse; Koch, Patrick N.; Allen, Janet K.
1997-01-01
The use of statistical techniques to build approximations of expensive computer analysis codes pervades much of todays engineering design. These statistical approximations, or metamodels, are used to replace the actual expensive computer analyses, facilitating multidisciplinary, multiobjective optimization and concept exploration. In this paper we review several of these techniques including design of experiments, response surface methodology, Taguchi methods, neural networks, inductive learning, and kriging. We survey their existing application in engineering design and then address the dangers of applying traditional statistical techniques to approximate deterministic computer analysis codes. We conclude with recommendations for the appropriate use of statistical approximation techniques in given situations and how common pitfalls can be avoided.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakhai, B.
A new method for solving radiation transport problems is presented. The heart of the technique is a new cross section processing procedure for the calculation of group-to-point and point-to-group cross sections sets. The method is ideally suited for problems which involve media with highly fluctuating cross sections, where the results of the traditional multigroup calculations are beclouded by the group averaging procedures employed. Extensive computational efforts, which would be required to evaluate double integrals in the multigroup treatment numerically, prohibit iteration to optimize the energy boundaries. On the other hand, use of point-to-point techniques (as in the stochastic technique) ismore » often prohibitively expensive due to the large computer storage requirement. The pseudo-point code is a hybrid of the two aforementioned methods (group-to-group and point-to-point) - hence the name pseudo-point - that reduces the computational efforts of the former and the large core requirements of the latter. The pseudo-point code generates the group-to-point or the point-to-group transfer matrices, and can be coupled with the existing transport codes to calculate pointwise energy-dependent fluxes. This approach yields much more detail than is available from the conventional energy-group treatments. Due to the speed of this code, several iterations could be performed (in affordable computing efforts) to optimize the energy boundaries and the weighting functions. The pseudo-point technique is demonstrated by solving six problems, each depicting a certain aspect of the technique. The results are presented as flux vs energy at various spatial intervals. The sensitivity of the technique to the energy grid and the savings in computational effort are clearly demonstrated.« less
High-density digital recording
NASA Technical Reports Server (NTRS)
Kalil, F. (Editor); Buschman, A. (Editor)
1985-01-01
The problems associated with high-density digital recording (HDDR) are discussed. Five independent users of HDDR systems and their problems, solutions, and insights are provided as guidance for other users of HDDR systems. Various pulse code modulation coding techniques are reviewed. An introduction to error detection and correction head optimization theory and perpendicular recording are provided. Competitive tape recorder manufacturers apply all of the above theories and techniques and present their offerings. The methodology used by the HDDR Users Subcommittee of THIC to evaluate parallel HDDR systems is presented.
Optimizing Aspect-Oriented Mechanisms for Embedded Applications
NASA Astrophysics Data System (ADS)
Hundt, Christine; Stöhr, Daniel; Glesner, Sabine
As applications for small embedded mobile devices are getting larger and more complex, it becomes inevitable to adopt more advanced software engineering methods from the field of desktop application development. Aspect-oriented programming (AOP) is a promising approach due to its advanced modularization capabilities. However, existing AOP languages tend to add a substantial overhead in both execution time and code size which restricts their practicality for small devices with limited resources. In this paper, we present optimizations for aspect-oriented mechanisms at the level of the virtual machine. Our experiments show that these optimizations yield a considerable performance gain along with a reduction of the code size. Thus, our optimizations establish the base for using advanced aspect-oriented modularization techniques for developing Java applications on small embedded devices.
Adaptive distributed source coding.
Varodayan, David; Lin, Yao-Chung; Girod, Bernd
2012-05-01
We consider distributed source coding in the presence of hidden variables that parameterize the statistical dependence among sources. We derive the Slepian-Wolf bound and devise coding algorithms for a block-candidate model of this problem. The encoder sends, in addition to syndrome bits, a portion of the source to the decoder uncoded as doping bits. The decoder uses the sum-product algorithm to simultaneously recover the source symbols and the hidden statistical dependence variables. We also develop novel techniques based on density evolution (DE) to analyze the coding algorithms. We experimentally confirm that our DE analysis closely approximates practical performance. This result allows us to efficiently optimize parameters of the algorithms. In particular, we show that the system performs close to the Slepian-Wolf bound when an appropriate doping rate is selected. We then apply our coding and analysis techniques to a reduced-reference video quality monitoring system and show a bit rate saving of about 75% compared with fixed-length coding.
Gain-adaptive vector quantization for medium-rate speech coding
NASA Technical Reports Server (NTRS)
Chen, J.-H.; Gersho, A.
1985-01-01
A class of adaptive vector quantizers (VQs) that can dynamically adjust the 'gain' of codevectors according to the input signal level is introduced. The encoder uses a gain estimator to determine a suitable normalization of each input vector prior to VQ coding. The normalized vectors have reduced dynamic range and can then be more efficiently coded. At the receiver, the VQ decoder output is multiplied by the estimated gain. Both forward and backward adaptation are considered and several different gain estimators are compared and evaluated. An approach to optimizing the design of gain estimators is introduced. Some of the more obvious techniques for achieving gain adaptation are substantially less effective than the use of optimized gain estimators. A novel design technique that is needed to generate the appropriate gain-normalized codebook for the vector quantizer is introduced. Experimental results show that a significant gain in segmental SNR can be obtained over nonadaptive VQ with a negligible increase in complexity.
QR images: optimized image embedding in QR codes.
Garateguy, Gonzalo J; Arce, Gonzalo R; Lau, Daniel L; Villarreal, Ofelia P
2014-07-01
This paper introduces the concept of QR images, an automatic method to embed QR codes into color images with bounded probability of detection error. These embeddings are compatible with standard decoding applications and can be applied to any color image with full area coverage. The QR information bits are encoded into the luminance values of the image, taking advantage of the immunity of QR readers against local luminance disturbances. To mitigate the visual distortion of the QR image, the algorithm utilizes halftoning masks for the selection of modified pixels and nonlinear programming techniques to locally optimize luminance levels. A tractable model for the probability of error is developed and models of the human visual system are considered in the quality metric used to optimize the luminance levels of the QR image. To minimize the processing time, the optimization techniques proposed to consider the mechanics of a common binarization method and are designed to be amenable for parallel implementations. Experimental results show the graceful degradation of the decoding rate and the perceptual quality as a function the embedding parameters. A visual comparison between the proposed and existing methods is presented.
A grid generation system for multi-disciplinary design optimization
NASA Technical Reports Server (NTRS)
Jones, William T.; Samareh-Abolhassani, Jamshid
1995-01-01
A general multi-block three-dimensional volume grid generator is presented which is suitable for Multi-Disciplinary Design Optimization. The code is timely, robust, highly automated, and written in ANSI 'C' for platform independence. Algebraic techniques are used to generate and/or modify block face and volume grids to reflect geometric changes resulting from design optimization. Volume grids are generated/modified in a batch environment and controlled via an ASCII user input deck. This allows the code to be incorporated directly into the design loop. Generated volume grids are presented for a High Speed Civil Transport (HSCT) Wing/Body geometry as well a complex HSCT configuration including horizontal and vertical tails, engine nacelles and pylons, and canard surfaces.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.
2014-10-01
The Goddard cloud microphysics scheme is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The WRF is a widely used weather prediction system in the world. It development is a done in collaborative around the globe. The Goddard microphysics scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Goddard scheme incorporates a large number of improvements. Thus, we have optimized the code of this important part of WRF. In this paper, we present our results of optimizing the Goddard microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The Intel MIC is capable of executing a full operating system and entire programs rather than just kernels as the GPU do. The MIC coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 4.7x. Furthermore, the same optimizations improved performance on a dual socket Intel Xeon E5-2670 system by a factor of 2.8x compared to the original code.
Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Centre, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with Sequential Unconstrained Minimizations Technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
Performance Trend of Different Algorithms for Structural Design Optimization
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Nonlinear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess performance of different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the sequential unconstrained minimizations technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
Semilinear programming: applications and implementation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mohan, S.
Semilinear programming is a method of solving optimization problems with linear constraints where the non-negativity restrictions on the variables are dropped and the objective function coefficients can take on different values depending on whether the variable is positive or negative. The simplex method for linear programming is modified in this thesis to solve general semilinear and piecewise linear programs efficiently without having to transform them into equivalent standard linear programs. Several models in widely different areas of optimization such as production smoothing, facility locations, goal programming and L/sub 1/ estimation are presented first to demonstrate the compact formulation that arisesmore » when such problems are formulated as semilinear programs. A code SLP is constructed using the semilinear programming techniques. Problems in aggregate planning and L/sub 1/ estimation are solved using SLP and equivalent linear programs using a linear programming simplex code. Comparisons of CPU times and number iterations indicate SLP to be far superior. The semilinear programming techniques are extended to piecewise linear programming in the implementation of the code PLP. Piecewise linear models in aggregate planning are solved using PLP and equivalent standard linear programs using a simple upper bounded linear programming code SUBLP.« less
NASA Astrophysics Data System (ADS)
Lee, Eun Seok
2000-10-01
An improved aerodynamics performance of a turbine cascade shape can be achieved by an understanding of the flow-field associated with the stator-rotor interaction. In this research, an axial gas turbine airfoil cascade shape is optimized for improved aerodynamic performance by using an unsteady Navier-Stokes solver and a parallel genetic algorithm. The objective of the research is twofold: (1) to develop a computational fluid dynamics code having faster convergence rate and unsteady flow simulation capabilities, and (2) to optimize a turbine airfoil cascade shape with unsteady passing wakes for improved aerodynamic performance. The computer code solves the Reynolds averaged Navier-Stokes equations. It is based on the explicit, finite difference, Runge-Kutta time marching scheme and the Diagonalized Alternating Direction Implicit (DADI) scheme, with the Baldwin-Lomax algebraic and k-epsilon turbulence modeling. Improvements in the code focused on the cascade shape design capability, convergence acceleration and unsteady formulation. First, the inverse shape design method was implemented in the code to provide the design capability, where a surface transpiration concept was employed as an inverse technique to modify the geometry satisfying the user specified pressure distribution on the airfoil surface. Second, an approximation storage multigrid method was implemented as an acceleration technique. Third, the preconditioning method was adopted to speed up the convergence rate in solving the low Mach number flows. Finally, the implicit dual time stepping method was incorporated in order to simulate the unsteady flow-fields. For the unsteady code validation, the Stokes's 2nd problem and the Poiseuille flow were chosen and compared with the computed results and analytic solutions. To test the code's ability to capture the natural unsteady flow phenomena, vortex shedding past a cylinder and the shock oscillation over a bicircular airfoil were simulated and compared with experiments and other research results. The rotor cascade shape optimization with unsteady passing wakes was performed to obtain an improved aerodynamic performance using the unsteady Navier-Stokes solver. Two objective functions were defined as minimization of total pressure loss and maximization of lift, while the mass flow rate was fixed. A parallel genetic algorithm was used as an optimizer and the penalty method was introduced. Each individual's objective function was computed simultaneously by using a 32 processor distributed memory computer. One optimization took about four days.
NASA Astrophysics Data System (ADS)
Panda, Satyasen
2018-05-01
This paper proposes a modified artificial bee colony optimization (ABC) algorithm based on levy flight swarm intelligence referred as artificial bee colony levy flight stochastic walk (ABC-LFSW) optimization for optical code division multiple access (OCDMA) network. The ABC-LFSW algorithm is used to solve asset assignment problem based on signal to noise ratio (SNR) optimization in OCDM networks with quality of service constraints. The proposed optimization using ABC-LFSW algorithm provides methods for minimizing various noises and interferences, regulating the transmitted power and optimizing the network design for improving the power efficiency of the optical code path (OCP) from source node to destination node. In this regard, an optical system model is proposed for improving the network performance with optimized input parameters. The detailed discussion and simulation results based on transmitted power allocation and power efficiency of OCPs are included. The experimental results prove the superiority of the proposed network in terms of power efficiency and spectral efficiency in comparison to networks without any power allocation approach.
Error control techniques for satellite and space communications
NASA Technical Reports Server (NTRS)
Costello, Daniel J., Jr.
1992-01-01
Worked performed during the reporting period is summarized. Construction of robustly good trellis codes for use with sequential decoding was developed. The robustly good trellis codes provide a much better trade off between free distance and distance profile. The unequal error protection capabilities of convolutional codes was studied. The problem of finding good large constraint length, low rate convolutional codes for deep space applications is investigated. A formula for computing the free distance of 1/n convolutional codes was discovered. Double memory (DM) codes, codes with two memory units per unit bit position, were studied; a search for optimal DM codes is being conducted. An algorithm for constructing convolutional codes from a given quasi-cyclic code was developed. Papers based on the above work are included in the appendix.
NASA Technical Reports Server (NTRS)
Hague, D. S.; Merz, A. W.
1975-01-01
Multivariable search techniques are applied to a particular class of airfoil optimization problems. These are the maximization of lift and the minimization of disturbance pressure magnitude in an inviscid nonlinear flow field. A variety of multivariable search techniques contained in an existing nonlinear optimization code, AESOP, are applied to this design problem. These techniques include elementary single parameter perturbation methods, organized search such as steepest-descent, quadratic, and Davidon methods, randomized procedures, and a generalized search acceleration technique. Airfoil design variables are seven in number and define perturbations to the profile of an existing NACA airfoil. The relative efficiency of the techniques are compared. It is shown that elementary one parameter at a time and random techniques compare favorably with organized searches in the class of problems considered. It is also shown that significant reductions in disturbance pressure magnitude can be made while retaining reasonable lift coefficient values at low free stream Mach numbers.
Optimal design of composite hip implants using NASA technology
NASA Technical Reports Server (NTRS)
Blake, T. A.; Saravanos, D. A.; Davy, D. T.; Waters, S. A.; Hopkins, D. A.
1993-01-01
Using an adaptation of NASA software, we have investigated the use of numerical optimization techniques for the shape and material optimization of fiber composite hip implants. The original NASA inhouse codes, were originally developed for the optimization of aerospace structures. The adapted code, which was called OPORIM, couples numerical optimization algorithms with finite element analysis and composite laminate theory to perform design optimization using both shape and material design variables. The external and internal geometry of the implant and the surrounding bone is described with quintic spline curves. This geometric representation is then used to create an equivalent 2-D finite element model of the structure. Using laminate theory and the 3-D geometric information, equivalent stiffnesses are generated for each element of the 2-D finite element model, so that the 3-D stiffness of the structure can be approximated. The geometric information to construct the model of the femur was obtained from a CT scan. A variety of test cases were examined, incorporating several implant constructions and design variable sets. Typically the code was able to produce optimized shape and/or material parameters which substantially reduced stress concentrations in the bone adjacent of the implant. The results indicate that this technology can provide meaningful insight into the design of fiber composite hip implants.
NASA Astrophysics Data System (ADS)
Trejos, Sorayda; Fredy Barrera, John; Torroba, Roberto
2015-08-01
We present for the first time an optical encrypting-decrypting protocol for recovering messages without speckle noise. This is a digital holographic technique using a 2f scheme to process QR codes entries. In the procedure, letters used to compose eventual messages are individually converted into a QR code, and then each QR code is divided into portions. Through a holographic technique, we store each processed portion. After filtering and repositioning, we add all processed data to create a single pack, thus simplifying the handling and recovery of multiple QR code images, representing the first multiplexing procedure applied to processed QR codes. All QR codes are recovered in a single step and in the same plane, showing neither cross-talk nor noise problems as in other methods. Experiments have been conducted using an interferometric configuration and comparisons between unprocessed and recovered QR codes have been performed, showing differences between them due to the involved processing. Recovered QR codes can be successfully scanned, thanks to their noise tolerance. Finally, the appropriate sequence in the scanning of the recovered QR codes brings a noiseless retrieved message. Additionally, to procure maximum security, the multiplexed pack could be multiplied by a digital diffuser as to encrypt it. The encrypted pack is easily decoded by multiplying the multiplexing with the complex conjugate of the diffuser. As it is a digital operation, no noise is added. Therefore, this technique is threefold robust, involving multiplexing, encryption, and the need of a sequence to retrieve the outcome.
NASA Astrophysics Data System (ADS)
Darazi, R.; Gouze, A.; Macq, B.
2009-01-01
Reproducing a natural and real scene as we see in the real world everyday is becoming more and more popular. Stereoscopic and multi-view techniques are used for this end. However due to the fact that more information are displayed requires supporting technologies such as digital compression to ensure the storage and transmission of the sequences. In this paper, a new scheme for stereo image coding is proposed. The original left and right images are jointly coded. The main idea is to optimally exploit the existing correlation between the two images. This is done by the design of an efficient transform that reduces the existing redundancy in the stereo image pair. This approach was inspired by Lifting Scheme (LS). The novelty in our work is that the prediction step is been replaced by an hybrid step that consists in disparity compensation followed by luminance correction and an optimized prediction step. The proposed scheme can be used for lossless and for lossy coding. Experimental results show improvement in terms of performance and complexity compared to recently proposed methods.
NASA Technical Reports Server (NTRS)
Sandlin, Doral R.; Bauer, Brent Alan
1993-01-01
This paper discusses the development of a FORTRAN computer code to perform agility analysis on aircraft configurations. This code is to be part of the NASA-Ames ACSYNT (AirCraft SYNThesis) design code. This paper begins with a discussion of contemporary agility research in the aircraft industry and a survey of a few agility metrics. The methodology, techniques and models developed for the code are then presented. Finally, example trade studies using the agility module along with ACSYNT are illustrated. These trade studies were conducted using a Northrop F-20 Tigershark aircraft model. The studies show that the agility module is effective in analyzing the influence of common parameters such as thrust-to-weight ratio and wing loading on agility criteria. The module can compare the agility potential between different configurations. In addition one study illustrates the module's ability to optimize a configuration's agility performance.
NASA Astrophysics Data System (ADS)
Papior, Nick; Lorente, Nicolás; Frederiksen, Thomas; García, Alberto; Brandbyge, Mads
2017-03-01
We present novel methods implemented within the non-equilibrium Green function code (NEGF) TRANSIESTA based on density functional theory (DFT). Our flexible, next-generation DFT-NEGF code handles devices with one or multiple electrodes (Ne ≥ 1) with individual chemical potentials and electronic temperatures. We describe its novel methods for electrostatic gating, contour optimizations, and assertion of charge conservation, as well as the newly implemented algorithms for optimized and scalable matrix inversion, performance-critical pivoting, and hybrid parallelization. Additionally, a generic NEGF "post-processing" code (TBTRANS/PHTRANS) for electron and phonon transport is presented with several novelties such as Hamiltonian interpolations, Ne ≥ 1 electrode capability, bond-currents, generalized interface for user-defined tight-binding transport, transmission projection using eigenstates of a projected Hamiltonian, and fast inversion algorithms for large-scale simulations easily exceeding 106 atoms on workstation computers. The new features of both codes are demonstrated and bench-marked for relevant test systems.
Optimal boundary conditions for ORCA-2 model
NASA Astrophysics Data System (ADS)
Kazantsev, Eugene
2013-08-01
A 4D-Var data assimilation technique is applied to ORCA-2 configuration of the NEMO in order to identify the optimal parametrization of boundary conditions on the lateral boundaries as well as on the bottom and on the surface of the ocean. The influence of boundary conditions on the solution is analyzed both within and beyond the assimilation window. It is shown that the optimal bottom and surface boundary conditions allow us to better represent the jet streams, such as Gulf Stream and Kuroshio. Analyzing the reasons of the jets reinforcement, we notice that data assimilation has a major impact on parametrization of the bottom boundary conditions for u and v. Automatic generation of the tangent and adjoint codes is also discussed. Tapenade software is shown to be able to produce the adjoint code that can be used after a memory usage optimization.
High-speed architecture for the decoding of trellis-coded modulation
NASA Technical Reports Server (NTRS)
Osborne, William P.
1992-01-01
Since 1971, when the Viterbi Algorithm was introduced as the optimal method of decoding convolutional codes, improvements in circuit technology, especially VLSI, have steadily increased its speed and practicality. Trellis-Coded Modulation (TCM) combines convolutional coding with higher level modulation (non-binary source alphabet) to provide forward error correction and spectral efficiency. For binary codes, the current stare-of-the-art is a 64-state Viterbi decoder on a single CMOS chip, operating at a data rate of 25 Mbps. Recently, there has been an interest in increasing the speed of the Viterbi Algorithm by improving the decoder architecture, or by reducing the algorithm itself. Designs employing new architectural techniques are now in existence, however these techniques are currently applied to simpler binary codes, not to TCM. The purpose of this report is to discuss TCM architectural considerations in general, and to present the design, at the logic gate level, or a specific TCM decoder which applies these considerations to achieve high-speed decoding.
NASA Technical Reports Server (NTRS)
Noble, Viveca K.
1993-01-01
There are various elements such as radio frequency interference (RFI) which may induce errors in data being transmitted via a satellite communication link. When a transmission is affected by interference or other error-causing elements, the transmitted data becomes indecipherable. It becomes necessary to implement techniques to recover from these disturbances. The objective of this research is to develop software which simulates error control circuits and evaluate the performance of these modules in various bit error rate environments. The results of the evaluation provide the engineer with information which helps determine the optimal error control scheme. The Consultative Committee for Space Data Systems (CCSDS) recommends the use of Reed-Solomon (RS) and convolutional encoders and Viterbi and RS decoders for error correction. The use of forward error correction techniques greatly reduces the received signal to noise needed for a certain desired bit error rate. The use of concatenated coding, e.g. inner convolutional code and outer RS code, provides even greater coding gain. The 16-bit cyclic redundancy check (CRC) code is recommended by CCSDS for error detection.
Statistical mechanics of broadcast channels using low-density parity-check codes.
Nakamura, Kazutaka; Kabashima, Yoshiyuki; Morelos-Zaragoza, Robert; Saad, David
2003-03-01
We investigate the use of Gallager's low-density parity-check (LDPC) codes in a degraded broadcast channel, one of the fundamental models in network information theory. Combining linear codes is a standard technique in practical network communication schemes and is known to provide better performance than simple time sharing methods when algebraic codes are used. The statistical physics based analysis shows that the practical performance of the suggested method, achieved by employing the belief propagation algorithm, is superior to that of LDPC based time sharing codes while the best performance, when received transmissions are optimally decoded, is bounded by the time sharing limit.
Exact and heuristic algorithms for Space Information Flow.
Uwitonze, Alfred; Huang, Jiaqing; Ye, Yuanqing; Cheng, Wenqing; Li, Zongpeng
2018-01-01
Space Information Flow (SIF) is a new promising research area that studies network coding in geometric space, such as Euclidean space. The design of algorithms that compute the optimal SIF solutions remains one of the key open problems in SIF. This work proposes the first exact SIF algorithm and a heuristic SIF algorithm that compute min-cost multicast network coding for N (N ≥ 3) given terminal nodes in 2-D Euclidean space. Furthermore, we find that the Butterfly network in Euclidean space is the second example besides the Pentagram network where SIF is strictly better than Euclidean Steiner minimal tree. The exact algorithm design is based on two key techniques: Delaunay triangulation and linear programming. Delaunay triangulation technique helps to find practically good candidate relay nodes, after which a min-cost multicast linear programming model is solved over the terminal nodes and the candidate relay nodes, to compute the optimal multicast network topology, including the optimal relay nodes selected by linear programming from all the candidate relay nodes and the flow rates on the connection links. The heuristic algorithm design is also based on Delaunay triangulation and linear programming techniques. The exact algorithm can achieve the optimal SIF solution with an exponential computational complexity, while the heuristic algorithm can achieve the sub-optimal SIF solution with a polynomial computational complexity. We prove the correctness of the exact SIF algorithm. The simulation results show the effectiveness of the heuristic SIF algorithm.
NASA Astrophysics Data System (ADS)
Jung, Sang-Young
Design procedures for aircraft wing structures with control surfaces are presented using multidisciplinary design optimization. Several disciplines such as stress analysis, structural vibration, aerodynamics, and controls are considered simultaneously and combined for design optimization. Vibration data and aerodynamic data including those in the transonic regime are calculated by existing codes. Flutter analyses are performed using those data. A flutter suppression method is studied using control laws in the closed-loop flutter equation. For the design optimization, optimization techniques such as approximation, design variable linking, temporary constraint deletion, and optimality criteria are used. Sensitivity derivatives of stresses and displacements for static loads, natural frequency, flutter characteristics, and control characteristics with respect to design variables are calculated for an approximate optimization. The objective function is the structural weight. The design variables are the section properties of the structural elements and the control gain factors. Existing multidisciplinary optimization codes (ASTROS* and MSC/NASTRAN) are used to perform single and multiple constraint optimizations of fully built up finite element wing structures. Three benchmark wing models are developed and/or modified for this purpose. The models are tested extensively.
The effect of total noise on two-dimension OCDMA codes
NASA Astrophysics Data System (ADS)
Dulaimi, Layth A. Khalil Al; Badlishah Ahmed, R.; Yaakob, Naimah; Aljunid, Syed A.; Matem, Rima
2017-11-01
In this research, we evaluate the performance of total noise effect on two dimension (2-D) optical code-division multiple access (OCDMA) performance systems using 2-D Modified Double Weight MDW under various link parameters. The impact of the multi-access interference (MAI) and other noise effect on the system performance. The 2-D MDW is compared mathematically with other codes which use similar techniques. We analyzed and optimized the data rate and effective receive power. The performance and optimization of MDW code in OCDMA system are reported, the bit error rate (BER) can be significantly improved when the 2-D MDW code desired parameters are selected especially the cross correlation properties. It reduces the MAI in the system compensate BER and phase-induced intensity noise (PIIN) in incoherent OCDMA The analysis permits a thorough understanding of PIIN, shot and thermal noises impact on 2-D MDW OCDMA system performance. PIIN is the main noise factor in the OCDMA network.
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
Coded excitation with spectrum inversion (CEXSI) for ultrasound array imaging.
Wang, Yao; Metzger, Kurt; Stephens, Douglas N; Williams, Gregory; Brownlie, Scott; O'Donnell, Matthew
2003-07-01
In this paper, a scheme called coded excitation with spectrum inversion (CEXSI) is presented. An established optimal binary code whose spectrum has no nulls and possesses the least variation is encoded as a burst for transmission. Using this optimal code, the decoding filter can be derived directly from its inverse spectrum. Various transmission techniques can be used to improve energy coupling within the system pass-band. We demonstrate its potential to achieve excellent decoding with very low (< 80 dB) side-lobes. For a 2.6 micros code, an array element with a center frequency of 10 MHz and fractional bandwidth of 38%, range side-lobes of about 40 dB have been achieved experimentally with little compromise in range resolution. The signal-to-noise ratio (SNR) improvement also has been characterized at about 14 dB. Along with simulations and experimental data, we present a formulation of the scheme, according to which CEXSI can be extended to improve SNR in sparse array imaging in general.
Applied Computational Electromagnetics Society Journal and Newletter, Volume 14 No. 1
1999-03-01
code validation, performance analysis, and input/output standardization; code or technique optimization and error minimization; innovations in...SOUTH AFRICA Alamo, CA, 94507-0516 USA Washington, DC 20330 USA MANAGING EDITOR Kueichien C. Hill Krishna Naishadham Richard W. Adler Wright Laboratory...INSTITUTIONAL MEMBERS ALLGON DERA Nasvagen 17 Common Road, Funtington INNOVATIVE DYNAMICS Akersberga, SWEDEN S-18425 Chichester, P018 9PD UK 2560 N. Triphammer
Targeting multiple heterogeneous hardware platforms with OpenCL
NASA Astrophysics Data System (ADS)
Fox, Paul A.; Kozacik, Stephen T.; Humphrey, John R.; Paolini, Aaron; Kuller, Aryeh; Kelmelis, Eric J.
2014-06-01
The OpenCL API allows for the abstract expression of parallel, heterogeneous computing, but hardware implementations have substantial implementation differences. The abstractions provided by the OpenCL API are often insufficiently high-level to conceal differences in hardware architecture. Additionally, implementations often do not take advantage of potential performance gains from certain features due to hardware limitations and other factors. These factors make it challenging to produce code that is portable in practice, resulting in much OpenCL code being duplicated for each hardware platform being targeted. This duplication of effort offsets the principal advantage of OpenCL: portability. The use of certain coding practices can mitigate this problem, allowing a common code base to be adapted to perform well across a wide range of hardware platforms. To this end, we explore some general practices for producing performant code that are effective across platforms. Additionally, we explore some ways of modularizing code to enable optional optimizations that take advantage of hardware-specific characteristics. The minimum requirement for portability implies avoiding the use of OpenCL features that are optional, not widely implemented, poorly implemented, or missing in major implementations. Exposing multiple levels of parallelism allows hardware to take advantage of the types of parallelism it supports, from the task level down to explicit vector operations. Static optimizations and branch elimination in device code help the platform compiler to effectively optimize programs. Modularization of some code is important to allow operations to be chosen for performance on target hardware. Optional subroutines exploiting explicit memory locality allow for different memory hierarchies to be exploited for maximum performance. The C preprocessor and JIT compilation using the OpenCL runtime can be used to enable some of these techniques, as well as to factor in hardware-specific optimizations as necessary.
Overview: Applications of numerical optimization methods to helicopter design problems
NASA Technical Reports Server (NTRS)
Miura, H.
1984-01-01
There are a number of helicopter design problems that are well suited to applications of numerical design optimization techniques. Adequate implementation of this technology will provide high pay-offs. There are a number of numerical optimization programs available, and there are many excellent response/performance analysis programs developed or being developed. But integration of these programs in a form that is usable in the design phase should be recognized as important. It is also necessary to attract the attention of engineers engaged in the development of analysis capabilities and to make them aware that analysis capabilities are much more powerful if integrated into design oriented codes. Frequently, the shortcoming of analysis capabilities are revealed by coupling them with an optimization code. Most of the published work has addressed problems in preliminary system design, rotor system/blade design or airframe design. Very few published results were found in acoustics, aerodynamics and control system design. Currently major efforts are focused on vibration reduction, and aerodynamics/acoustics applications appear to be growing fast. The development of a computer program system to integrate the multiple disciplines required in helicopter design with numerical optimization technique is needed. Activities in Britain, Germany and Poland are identified, but no published results from France, Italy, the USSR or Japan were found.
MINIVER: Miniature version of real/ideal gas aero-heating and ablation computer program
NASA Technical Reports Server (NTRS)
Hendler, D. R.
1976-01-01
Computer code is used to determine heat transfer multiplication factors, special flow field simulation techniques, different heat transfer methods, different transition criteria, crossflow simulation, and more efficient thin skin thickness optimization procedure.
NASA Astrophysics Data System (ADS)
Chang, Ching-Chun; Liu, Yanjun; Nguyen, Son T.
2015-03-01
Data hiding is a technique that embeds information into digital cover data. This technique has been concentrated on the spatial uncompressed domain, and it is considered more challenging to perform in the compressed domain, i.e., vector quantization, JPEG, and block truncation coding (BTC). In this paper, we propose a new data hiding scheme for BTC-compressed images. In the proposed scheme, a dynamic programming strategy was used to search for the optimal solution of the bijective mapping function for LSB substitution. Then, according to the optimal solution, each mean value embeds three secret bits to obtain high hiding capacity with low distortion. The experimental results indicated that the proposed scheme obtained both higher hiding capacity and hiding efficiency than the other four existing schemes, while ensuring good visual quality of the stego-image. In addition, the proposed scheme achieved a low bit rate as original BTC algorithm.
NASA Technical Reports Server (NTRS)
Simon, M. K.; Udalov, S.; Huth, G. K.
1976-01-01
The forward link of the overall Ku-band communication system consists of the ground- TDRS-orbiter communication path. Because the last segment of the link is directed towards a relatively low orbiting shuttle, a PN code is used to reduce the spectral density. A method is presented for incorporating code acquisition and tracking functions into the orbiter's Ku-band receiver. Optimization of a three channel multiplexing technique is described. The importance of Costas loop parameters to provide false lock immunity for the receiver, and the advantage of using a sinusoidal subcarrier waveform, rather than square wave, are discussed.
Analysis of the faster-than-Nyquist optimal linear multicarrier system
NASA Astrophysics Data System (ADS)
Marquet, Alexandre; Siclet, Cyrille; Roque, Damien
2017-02-01
Faster-than-Nyquist signalization enables a better spectral efficiency at the expense of an increased computational complexity. Regarding multicarrier communications, previous work mainly relied on the study of non-linear systems exploiting coding and/or equalization techniques, with no particular optimization of the linear part of the system. In this article, we analyze the performance of the optimal linear multicarrier system when used together with non-linear receiving structures (iterative decoding and direct feedback equalization), or in a standalone fashion. We also investigate the limits of the normality assumption of the interference, used for implementing such non-linear systems. The use of this optimal linear system leads to a closed-form expression of the bit-error probability that can be used to predict the performance and help the design of coded systems. Our work also highlights the great performance/complexity trade-off offered by decision feedback equalization in a faster-than-Nyquist context. xml:lang="fr"
Optimal placement of tuning masses on truss structures by genetic algorithms
NASA Technical Reports Server (NTRS)
Ponslet, Eric; Haftka, Raphael T.; Cudney, Harley H.
1993-01-01
Optimal placement of tuning masses, actuators and other peripherals on large space structures is a combinatorial optimization problem. This paper surveys several techniques for solving this problem. The genetic algorithm approach to the solution of the placement problem is described in detail. An example of minimizing the difference between the two lowest frequencies of a laboratory truss by adding tuning masses is used for demonstrating some of the advantages of genetic algorithms. The relative efficiencies of different codings are compared using the results of a large number of optimization runs.
Surveying multidisciplinary aspects in real-time distributed coding for Wireless Sensor Networks.
Braccini, Carlo; Davoli, Franco; Marchese, Mario; Mongelli, Maurizio
2015-01-27
Wireless Sensor Networks (WSNs), where a multiplicity of sensors observe a physical phenomenon and transmit their measurements to one or more sinks, pertain to the class of multi-terminal source and channel coding problems of Information Theory. In this category, "real-time" coding is often encountered for WSNs, referring to the problem of finding the minimum distortion (according to a given measure), under transmission power constraints, attainable by encoding and decoding functions, with stringent limits on delay and complexity. On the other hand, the Decision Theory approach seeks to determine the optimal coding/decoding strategies or some of their structural properties. Since encoder(s) and decoder(s) possess different information, though sharing a common goal, the setting here is that of Team Decision Theory. A more pragmatic vision rooted in Signal Processing consists of fixing the form of the coding strategies (e.g., to linear functions) and, consequently, finding the corresponding optimal decoding strategies and the achievable distortion, generally by applying parametric optimization techniques. All approaches have a long history of past investigations and recent results. The goal of the present paper is to provide the taxonomy of the various formulations, a survey of the vast related literature, examples from the authors' own research, and some highlights on the inter-play of the different theories.
Optimization techniques using MODFLOW-GWM
Grava, Anna; Feinstein, Daniel T.; Barlow, Paul M.; Bonomi, Tullia; Buarne, Fabiola; Dunning, Charles; Hunt, Randall J.
2015-01-01
An important application of optimization codes such as MODFLOW-GWM is to maximize water supply from unconfined aquifers subject to constraints involving surface-water depletion and drawdown. In optimizing pumping for a fish hatchery in a bedrock aquifer system overlain by glacial deposits in eastern Wisconsin, various features of the GWM-2000 code were used to overcome difficulties associated with: 1) Non-linear response matrices caused by unconfined conditions and head-dependent boundaries; 2) Efficient selection of candidate well and drawdown constraint locations; and 3) Optimizing against water-level constraints inside pumping wells. Features of GWM-2000 were harnessed to test the effects of systematically varying the decision variables and constraints on the optimized solution for managing withdrawals. An important lesson of the procedure, similar to lessons learned in model calibration, is that the optimized outcome is non-unique, and depends on a range of choices open to the user. The modeler must balance the complexity of the numerical flow model used to represent the groundwater-flow system against the range of options (decision variables, objective functions, constraints) available for optimizing the model.
Integrated source and channel encoded digital communication system design study
NASA Technical Reports Server (NTRS)
Huth, G. K.; Trumpis, B. D.; Udalov, S.
1975-01-01
Various aspects of space shuttle communication systems were studied. The following major areas were investigated: burst error correction for shuttle command channels; performance optimization and design considerations for Costas receivers with and without bandpass limiting; experimental techniques for measuring low level spectral components of microwave signals; and potential modulation and coding techniques for the Ku-band return link. Results are presented.
On the Use of Statistics in Design and the Implications for Deterministic Computer Experiments
NASA Technical Reports Server (NTRS)
Simpson, Timothy W.; Peplinski, Jesse; Koch, Patrick N.; Allen, Janet K.
1997-01-01
Perhaps the most prevalent use of statistics in engineering design is through Taguchi's parameter and robust design -- using orthogonal arrays to compute signal-to-noise ratios in a process of design improvement. In our view, however, there is an equally exciting use of statistics in design that could become just as prevalent: it is the concept of metamodeling whereby statistical models are built to approximate detailed computer analysis codes. Although computers continue to get faster, analysis codes always seem to keep pace so that their computational time remains non-trivial. Through metamodeling, approximations of these codes are built that are orders of magnitude cheaper to run. These metamodels can then be linked to optimization routines for fast analysis, or they can serve as a bridge for integrating analysis codes across different domains. In this paper we first review metamodeling techniques that encompass design of experiments, response surface methodology, Taguchi methods, neural networks, inductive learning, and kriging. We discuss their existing applications in engineering design and then address the dangers of applying traditional statistical techniques to approximate deterministic computer analysis codes. We conclude with recommendations for the appropriate use of metamodeling techniques in given situations and how common pitfalls can be avoided.
Fast Algorithms for Designing Unimodular Waveform(s) With Good Correlation Properties
NASA Astrophysics Data System (ADS)
Li, Yongzhe; Vorobyov, Sergiy A.
2018-03-01
In this paper, we develop new fast and efficient algorithms for designing single/multiple unimodular waveforms/codes with good auto- and cross-correlation or weighted correlation properties, which are highly desired in radar and communication systems. The waveform design is based on the minimization of the integrated sidelobe level (ISL) and weighted ISL (WISL) of waveforms. As the corresponding optimization problems can quickly grow to large scale with increasing the code length and number of waveforms, the main issue turns to be the development of fast large-scale optimization techniques. The difficulty is also that the corresponding optimization problems are non-convex, but the required accuracy is high. Therefore, we formulate the ISL and WISL minimization problems as non-convex quartic optimization problems in frequency domain, and then simplify them into quadratic problems by utilizing the majorization-minimization technique, which is one of the basic techniques for addressing large-scale and/or non-convex optimization problems. While designing our fast algorithms, we find out and use inherent algebraic structures in the objective functions to rewrite them into quartic forms, and in the case of WISL minimization, to derive additionally an alternative quartic form which allows to apply the quartic-quadratic transformation. Our algorithms are applicable to large-scale unimodular waveform design problems as they are proved to have lower or comparable computational burden (analyzed theoretically) and faster convergence speed (confirmed by comprehensive simulations) than the state-of-the-art algorithms. In addition, the waveforms designed by our algorithms demonstrate better correlation properties compared to their counterparts.
Spherical hashing: binary code embedding with hyperspheres.
Heo, Jae-Pil; Lee, Youngwoon; He, Junfeng; Chang, Shih-Fu; Yoon, Sung-Eui
2015-11-01
Many binary code embedding schemes have been actively studied recently, since they can provide efficient similarity search, and compact data representations suitable for handling large scale image databases. Existing binary code embedding techniques encode high-dimensional data by using hyperplane-based hashing functions. In this paper we propose a novel hypersphere-based hashing function, spherical hashing, to map more spatially coherent data points into a binary code compared to hyperplane-based hashing functions. We also propose a new binary code distance function, spherical Hamming distance, tailored for our hypersphere-based binary coding scheme, and design an efficient iterative optimization process to achieve both balanced partitioning for each hash function and independence between hashing functions. Furthermore, we generalize spherical hashing to support various similarity measures defined by kernel functions. Our extensive experiments show that our spherical hashing technique significantly outperforms state-of-the-art techniques based on hyperplanes across various benchmarks with sizes ranging from one to 75 million of GIST, BoW and VLAD descriptors. The performance gains are consistent and large, up to 100 percent improvements over the second best method among tested methods. These results confirm the unique merits of using hyperspheres to encode proximity regions in high-dimensional spaces. Finally, our method is intuitive and easy to implement.
Advanced techniques and technology for efficient data storage, access, and transfer
NASA Technical Reports Server (NTRS)
Rice, Robert F.; Miller, Warner
1991-01-01
Advanced techniques for efficiently representing most forms of data are being implemented in practical hardware and software form through the joint efforts of three NASA centers. These techniques adapt to local statistical variations to continually provide near optimum code efficiency when representing data without error. Demonstrated in several earlier space applications, these techniques are the basis of initial NASA data compression standards specifications. Since the techniques clearly apply to most NASA science data, NASA invested in the development of both hardware and software implementations for general use. This investment includes high-speed single-chip very large scale integration (VLSI) coding and decoding modules as well as machine-transferrable software routines. The hardware chips were tested in the laboratory at data rates as high as 700 Mbits/s. A coding module's definition includes a predictive preprocessing stage and a powerful adaptive coding stage. The function of the preprocessor is to optimally process incoming data into a standard form data source that the second stage can handle.The built-in preprocessor of the VLSI coder chips is ideal for high-speed sampled data applications such as imaging and high-quality audio, but additionally, the second stage adaptive coder can be used separately with any source that can be externally preprocessed into the 'standard form'. This generic functionality assures that the applicability of these techniques and their recent high-speed implementations should be equally broad outside of NASA.
NASA Technical Reports Server (NTRS)
Huck, Friedrich O.; Fales, Carl L.
1990-01-01
Researchers are concerned with the end-to-end performance of image gathering, coding, and processing. The applications range from high-resolution television to vision-based robotics, wherever the resolution, efficiency and robustness of visual information acquisition and processing are critical. For the presentation at this workshop, it is convenient to divide research activities into the following two overlapping areas: The first is the development of focal-plane processing techniques and technology to effectively combine image gathering with coding, with an emphasis on low-level vision processing akin to the retinal processing in human vision. The approach includes the familiar Laplacian pyramid, the new intensity-dependent spatial summation, and parallel sensing/processing networks. Three-dimensional image gathering is attained by combining laser ranging with sensor-array imaging. The second is the rigorous extension of information theory and optimal filtering to visual information acquisition and processing. The goal is to provide a comprehensive methodology for quantitatively assessing the end-to-end performance of image gathering, coding, and processing.
NASA Technical Reports Server (NTRS)
Kuhlman, J. M.
1979-01-01
The aerodynamic design of a wind-tunnel model of a wing representative of that of a subsonic jet transport aircraft, fitted with winglets, was performed using two recently developed optimal wing-design computer programs. Both potential flow codes use a vortex lattice representation of the near-field of the aerodynamic surfaces for determination of the required mean camber surfaces for minimum induced drag, and both codes use far-field induced drag minimization procedures to obtain the required spanloads. One code uses a discrete vortex wake model for this far-field drag computation, while the second uses a 2-D advanced panel wake model. Wing camber shapes for the two codes are very similar, but the resulting winglet camber shapes differ widely. Design techniques and considerations for these two wind-tunnel models are detailed, including a description of the necessary modifications of the design geometry to format it for use by a numerically controlled machine for the actual model construction.
NASA Technical Reports Server (NTRS)
Martini, William R.
1989-01-01
A FORTRAN computer code is described that could be used to design and optimize a free-displacer, free-piston Stirling engine similar to the RE-1000 engine made by Sunpower. The code contains options for specifying displacer and power piston motion or for allowing these motions to be calculated by a force balance. The engine load may be a dashpot, inertial compressor, hydraulic pump or linear alternator. Cycle analysis may be done by isothermal analysis or adiabatic analysis. Adiabatic analysis may be done using the Martini moving gas node analysis or the Rios second-order Runge-Kutta analysis. Flow loss and heat loss equations are included. Graphical display of engine motions and pressures and temperatures are included. Programming for optimizing up to 15 independent dimensions is included. Sample performance results are shown for both specified and unconstrained piston motions; these results are shown as generated by each of the two Martini analyses. Two sample optimization searches are shown using specified piston motion isothermal analysis. One is for three adjustable input and one is for four. Also, two optimization searches for calculated piston motion are presented for three and for four adjustable inputs. The effect of leakage is evaluated. Suggestions for further work are given.
Gschwind, Michael K
2013-07-23
Mechanisms for aggressively optimizing computer code are provided. With these mechanisms, a compiler determines an optimization to apply to a portion of source code and determines if the optimization as applied to the portion of source code will result in unsafe optimized code that introduces a new source of exceptions being generated by the optimized code. In response to a determination that the optimization is an unsafe optimization, the compiler generates an aggressively compiled code version, in which the unsafe optimization is applied, and a conservatively compiled code version in which the unsafe optimization is not applied. The compiler stores both versions and provides them for execution. Mechanisms are provided for switching between these versions during execution in the event of a failure of the aggressively compiled code version. Moreover, predictive mechanisms are provided for predicting whether such a failure is likely.
Optimal sensor placement for spatial lattice structure based on genetic algorithms
NASA Astrophysics Data System (ADS)
Liu, Wei; Gao, Wei-cheng; Sun, Yi; Xu, Min-jian
2008-10-01
Optimal sensor placement technique plays a key role in structural health monitoring of spatial lattice structures. This paper considers the problem of locating sensors on a spatial lattice structure with the aim of maximizing the data information so that structural dynamic behavior can be fully characterized. Based on the criterion of optimal sensor placement for modal test, an improved genetic algorithm is introduced to find the optimal placement of sensors. The modal strain energy (MSE) and the modal assurance criterion (MAC) have been taken as the fitness function, respectively, so that three placement designs were produced. The decimal two-dimension array coding method instead of binary coding method is proposed to code the solution. Forced mutation operator is introduced when the identical genes appear via the crossover procedure. A computational simulation of a 12-bay plain truss model has been implemented to demonstrate the feasibility of the three optimal algorithms above. The obtained optimal sensor placements using the improved genetic algorithm are compared with those gained by exiting genetic algorithm using the binary coding method. Further the comparison criterion based on the mean square error between the finite element method (FEM) mode shapes and the Guyan expansion mode shapes identified by data-driven stochastic subspace identification (SSI-DATA) method are employed to demonstrate the advantage of the different fitness function. The results showed that some innovations in genetic algorithm proposed in this paper can enlarge the genes storage and improve the convergence of the algorithm. More importantly, the three optimal sensor placement methods can all provide the reliable results and identify the vibration characteristics of the 12-bay plain truss model accurately.
A Holistic Approach to Networked Information Systems Design and Analysis
2016-04-15
attain quite substantial savings. 11. Optimal algorithms for energy harvesting in wireless networks. We use a Markov- decision-process (MDP) based...approach to obtain optimal policies for transmissions . The key advantage of our approach is that it holistically considers information and energy in a...Coding technique to minimize delays and the number of transmissions in Wireless Systems. As we approach an era of ubiquitous computing with information
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sitaraman, Hariswaran; Grout, Ray W
This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero and multidimensional combustion solvers, which can then be effectively used on the emerging generation of Intel's Many Integrated Core/Xeon Phi processors. These processors offer increased computing performance via large number of lightweight cores at relatively lower clock speeds compared to traditional processors (e.g. Intel Sandybridge/Ivybridge) used in current supercomputers. This style of processor can be productively used for chemistry integrators that form a costly part of computational combustion codes, in spite of their relatively lower clock speeds. Performance commensurate with traditional processors is achieved heremore » through the combination of careful memory layout, exposing multiple levels of fine grain parallelism and through extensive use of vendor supported libraries (Cilk Plus and Math Kernel Libraries). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. These optimizations resulted in a factor of ~ 3 speed-up using Intel 2013 compiler and ~ 1.5 using Intel 2017 compiler for large chemical mechanisms compared to the unoptimized version on the Intel Xeon Phi. The strategies, especially with respect to memory usage and vectorization, should also be beneficial for general purpose computational fluid dynamics codes.« less
Optimized bit extraction using distortion modeling in the scalable extension of H.264/AVC.
Maani, Ehsan; Katsaggelos, Aggelos K
2009-09-01
The newly adopted scalable extension of H.264/AVC video coding standard (SVC) demonstrates significant improvements in coding efficiency in addition to an increased degree of supported scalability relative to the scalable profiles of prior video coding standards. Due to the complicated hierarchical prediction structure of the SVC and the concept of key pictures, content-aware rate adaptation of SVC bit streams to intermediate bit rates is a nontrivial task. The concept of quality layers has been introduced in the design of the SVC to allow for fast content-aware prioritized rate adaptation. However, existing quality layer assignment methods are suboptimal and do not consider all network abstraction layer (NAL) units from different layers for the optimization. In this paper, we first propose a technique to accurately and efficiently estimate the quality degradation resulting from discarding an arbitrary number of NAL units from multiple layers of a bitstream by properly taking drift into account. Then, we utilize this distortion estimation technique to assign quality layers to NAL units for a more efficient extraction. Experimental results show that a significant gain can be achieved by the proposed scheme.
Applied Computational Electromagnetics Society Journal, Volume 9, Number 2
1994-07-01
input/output standardization; code or technique optimization and error minimization; innovations in solution technique or in data input/output...THE APPLIED COMPUTATIONAL ELECTROMAGNETICS SOCIETY JOURNAL EDITORS 3DITOR-IN-CH•IF/ACES EDITOR-IN-CHIEP/JOURNAL MANAGING EDITOR W. Perry Wheless...Adalbert Konrad and Paul P. Biringer Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, CANADA M5S 1A4 Ailiwir
An integrated optimum design approach for high speed prop rotors
NASA Technical Reports Server (NTRS)
Chattopadhyay, Aditi; Mccarthy, Thomas R.
1995-01-01
The objective is to develop an optimization procedure for high-speed and civil tilt-rotors by coupling all of the necessary disciplines within a closed-loop optimization procedure. Both simplified and comprehensive analysis codes are used for the aerodynamic analyses. The structural properties are calculated using in-house developed algorithms for both isotropic and composite box beam sections. There are four major objectives of this study. (1) Aerodynamic optimization: The effects of blade aerodynamic characteristics on cruise and hover performance of prop-rotor aircraft are investigated using the classical blade element momentum approach with corrections for the high lift capability of rotors/propellers. (2) Coupled aerodynamic/structures optimization: A multilevel hybrid optimization technique is developed for the design of prop-rotor aircraft. The design problem is decomposed into a level for improved aerodynamics with continuous design variables and a level with discrete variables to investigate composite tailoring. The aerodynamic analysis is based on that developed in objective 1 and the structural analysis is performed using an in-house code which models a composite box beam. The results are compared to both a reference rotor and the optimum rotor found in the purely aerodynamic formulation. (3) Multipoint optimization: The multilevel optimization procedure of objective 2 is extended to a multipoint design problem. Hover, cruise, and take-off are the three flight conditions simultaneously maximized. (4) Coupled rotor/wing optimization: Using the comprehensive rotary wing code CAMRAD, an optimization procedure is developed for the coupled rotor/wing performance in high speed tilt-rotor aircraft. The developed procedure contains design variables which define the rotor and wing planforms.
Error control techniques for satellite and space communications
NASA Technical Reports Server (NTRS)
Costello, Daniel J., Jr.
1990-01-01
An expurgated upper bound on the event error probability of trellis coded modulation is presented. This bound is used to derive a lower bound on the minimum achievable free Euclidean distance d sub (free) of trellis codes. It is shown that the dominant parameters for both bounds, the expurgated error exponent and the asymptotic d sub (free) growth rate, respectively, can be obtained from the cutoff-rate R sub O of the transmission channel by a simple geometric construction, making R sub O the central parameter for finding good trellis codes. Several constellations are optimized with respect to the bounds.
Effects of pulse width and coding on radar returns from clear air
NASA Technical Reports Server (NTRS)
Cornish, C. R.
1983-01-01
In atmospheric radar studies it is desired to obtain maximum information about the atmosphere and to use efficiently the radar transmitter and processing hardware. Large pulse widths are used to increase the signal to noise ratio since clear air returns are generally weak and maximum height coverage is desired. Yet since good height resolution is equally important, pulse compression techniques such as phase coding are employed to optimize the average power of the transmitter. Considerations in implementing a coding scheme and subsequent effects of an impinging pulse on the atmosphere are investigated.
Extreme ultraviolet emission spectra of Gd and Tb ions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kilbane, D.; O'Sullivan, G.
2010-11-15
Theoretical extreme ultraviolet emission spectra of gadolinium and terbium ions calculated with the Cowan suite of codes and the flexible atomic code (FAC) relativistic code are presented. 4d-4f and 4p-4d transitions give rise to unresolved transition arrays in a range of ions. The effects of configuration interaction are investigated for transitions between singly excited configurations. Optimization of emission at 6.775 nm and 6.515 nm is achieved for Gd and Tb ions, respectively, by consideration of plasma effects. The resulting synthetic spectra are compared with experimental spectra recorded using the laser produced plasma technique.
NASA Technical Reports Server (NTRS)
Murthy, T. Sreekanta
1988-01-01
Several key issues involved in the application of formal optimization technique to helicopter airframe structures for vibration reduction are addressed. Considerations which are important in the optimization of real airframe structures are discussed. Considerations necessary to establish relevant set of design variables, constraints and objectives which are appropriate to conceptual, preliminary, detailed design, ground and flight test phases of airframe design are discussed. A methodology is suggested for optimization of airframes in various phases of design. Optimization formulations that are unique to helicopter airframes are described and expressions for vibration related functions are derived. Using a recently developed computer code, the optimization of a Bell AH-1G helicopter airframe is demonstrated.
Connectivity Restoration in Wireless Sensor Networks via Space Network Coding.
Uwitonze, Alfred; Huang, Jiaqing; Ye, Yuanqing; Cheng, Wenqing
2017-04-20
The problem of finding the number and optimal positions of relay nodes for restoring the network connectivity in partitioned Wireless Sensor Networks (WSNs) is Non-deterministic Polynomial-time hard (NP-hard) and thus heuristic methods are preferred to solve it. This paper proposes a novel polynomial time heuristic algorithm, namely, Relay Placement using Space Network Coding (RPSNC), to solve this problem, where Space Network Coding, also called Space Information Flow (SIF), is a new research paradigm that studies network coding in Euclidean space, in which extra relay nodes can be introduced to reduce the cost of communication. Unlike contemporary schemes that are often based on Minimum Spanning Tree (MST), Euclidean Steiner Minimal Tree (ESMT) or a combination of MST with ESMT, RPSNC is a new min-cost multicast space network coding approach that combines Delaunay triangulation and non-uniform partitioning techniques for generating a number of candidate relay nodes, and then linear programming is applied for choosing the optimal relay nodes and computing their connection links with terminals. Subsequently, an equilibrium method is used to refine the locations of the optimal relay nodes, by moving them to balanced positions. RPSNC can adapt to any density distribution of relay nodes and terminals, as well as any density distribution of terminals. The performance and complexity of RPSNC are analyzed and its performance is validated through simulation experiments.
The performance of trellis coded multilevel DPSK on a fading mobile satellite channel
NASA Technical Reports Server (NTRS)
Simon, Marvin K.; Divsalar, Dariush
1987-01-01
The performance of trellis coded multilevel differential phase-shift-keying (MDPSK) over Rician and Rayleigh fading channels is discussed. For operation at L-Band, this signalling technique leads to a more robust system than the coherent system with dual pilot tone calibration previously proposed for UHF. The results are obtained using a combination of analysis and simulation. The analysis shows that the design criterion for trellis codes to be operated on fading channels with interleaving/deinterleaving is no longer free Euclidean distance. The correct design criterion for optimizing bit error probability of trellis coded MDPSK over fading channels will be presented along with examples illustrating its application.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware
Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang
2009-01-01
The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
NASA Astrophysics Data System (ADS)
Kurosu, Keita; Das, Indra J.; Moskvin, Vadim P.
2016-01-01
Spot scanning, owing to its superior dose-shaping capability, provides unsurpassed dose conformity, in particular for complex targets. However, the robustness of the delivered dose distribution and prescription has to be verified. Monte Carlo (MC) simulation has the potential to generate significant advantages for high-precise particle therapy, especially for medium containing inhomogeneities. However, the inherent choice of computational parameters in MC simulation codes of GATE, PHITS and FLUKA that is observed for uniform scanning proton beam needs to be evaluated. This means that the relationship between the effect of input parameters and the calculation results should be carefully scrutinized. The objective of this study was, therefore, to determine the optimal parameters for the spot scanning proton beam for both GATE and PHITS codes by using data from FLUKA simulation as a reference. The proton beam scanning system of the Indiana University Health Proton Therapy Center was modeled in FLUKA, and the geometry was subsequently and identically transferred to GATE and PHITS. Although the beam transport is managed by spot scanning system, the spot location is always set at the center of a water phantom of 600 × 600 × 300 mm3, which is placed after the treatment nozzle. The percentage depth dose (PDD) is computed along the central axis using 0.5 × 0.5 × 0.5 mm3 voxels in the water phantom. The PDDs and the proton ranges obtained with several computational parameters are then compared to those of FLUKA, and optimal parameters are determined from the accuracy of the proton range, suppressed dose deviation, and computational time minimization. Our results indicate that the optimized parameters are different from those for uniform scanning, suggesting that the gold standard for setting computational parameters for any proton therapy application cannot be determined consistently since the impact of setting parameters depends on the proton irradiation technique. We therefore conclude that customization parameters must be set with reference to the optimized parameters of the corresponding irradiation technique in order to render them useful for achieving artifact-free MC simulation for use in computational experiments and clinical treatments.
Large-area metallic photonic lattices for military applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luk, Ting Shan
2007-11-01
In this project we developed photonic crystal modeling capability and fabrication technology that is scaleable to large area. An intelligent optimization code was developed to find the optimal structure for the desired spectral response. In terms of fabrication, an exhaustive survey of fabrication techniques that would meet the large area requirement was reduced to Deep X-ray Lithography (DXRL) and nano-imprint. Using DXRL, we fabricated a gold logpile photonic crystal in the <100> plane. For the nano-imprint technique, we fabricated a cubic array of gold squares. These two examples also represent two classes of metallic photonic crystal topologies, the connected networkmore » and cermet arrangement.« less
NASA Technical Reports Server (NTRS)
Fishbach, L. H.
1980-01-01
The computational techniques are described which are utilized at Lewis Research Center to determine the optimum propulsion systems for future aircraft applications and to identify system tradeoffs and technology requirements. Cycle performance, and engine weight can be calculated along with costs and installation effects as opposed to fuel consumption alone. Almost any conceivable turbine engine cycle can be studied. These computer codes are: NNEP, WATE, LIFCYC, INSTAL, and POD DRG. Examples are given to illustrate how these computer techniques can be applied to analyze and optimize propulsion system fuel consumption, weight and cost for representative types of aircraft and missions.
NASA Technical Reports Server (NTRS)
Skillen, Michael D.; Crossley, William A.
2008-01-01
This report documents a series of investigations to develop an approach for structural sizing of various morphing wing concepts. For the purposes of this report, a morphing wing is one whose planform can make significant shape changes in flight - increasing wing area by 50% or more from the lowest possible area, changing sweep 30 or more, and / or increasing aspect ratio by as much as 200% from the lowest possible value. These significant changes in geometry mean that the underlying load-bearing structure changes geometry. While most finite element analysis packages provide some sort of structural optimization capability, these codes are not amenable to making significant changes in the stiffness matrix to reflect the large morphing wing planform changes. The investigations presented here use a finite element code capable of aeroelastic analysis in three different optimization approaches -a "simultaneous analysis" approach, a "sequential" approach, and an "aggregate" approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinlan, D.; Yi, Q.; Buduc, R.
2005-02-17
ROSE is an object-oriented software infrastructure for source-to-source translation that provides an interface for programmers to write their own specialized translators for optimizing scientific applications. ROSE is a part of current research on telescoping languages, which provides optimizations of the use of libraries in scientific applications. ROSE defines approaches to extend the optimization techniques, common in well defined languages, to the optimization of scientific applications using well defined libraries. ROSE includes a rich set of tools for generating customized transformations to support optimization of applications codes. We currently support full C and C++ (including template instantiation etc.), with Fortran 90more » support under development as part of a collaboration and contract with Rice to use their version of the open source Open64 F90 front-end. ROSE represents an attempt to define an open compiler infrastructure to handle the full complexity of full scale DOE applications codes using the languages common to scientific computing within DOE. We expect that such an infrastructure will also be useful for the development of numerous tools that may then realistically expect to work on DOE full scale applications.« less
Optimization of lattice surgery is NP-hard
NASA Astrophysics Data System (ADS)
Herr, Daniel; Nori, Franco; Devitt, Simon J.
2017-09-01
The traditional method for computation in either the surface code or in the Raussendorf model is the creation of holes or "defects" within the encoded lattice of qubits that are manipulated via topological braiding to enact logic gates. However, this is not the only way to achieve universal, fault-tolerant computation. In this work, we focus on the lattice surgery representation, which realizes transversal logic operations without destroying the intrinsic 2D nearest-neighbor properties of the braid-based surface code and achieves universality without defects and braid-based logic. For both techniques there are open questions regarding the compilation and resource optimization of quantum circuits. Optimization in braid-based logic is proving to be difficult and the classical complexity associated with this problem has yet to be determined. In the context of lattice-surgery-based logic, we can introduce an optimality condition, which corresponds to a circuit with the lowest resource requirements in terms of physical qubits and computational time, and prove that the complexity of optimizing a quantum circuit in the lattice surgery model is NP-hard.
The Advanced Software Development and Commercialization Project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gallopoulos, E.; Canfield, T.R.; Minkoff, M.
1990-09-01
This is the first of a series of reports pertaining to progress in the Advanced Software Development and Commercialization Project, a joint collaborative effort between the Center for Supercomputing Research and Development of the University of Illinois and the Computing and Telecommunications Division of Argonne National Laboratory. The purpose of this work is to apply techniques of parallel computing that were pioneered by University of Illinois researchers to mature computational fluid dynamics (CFD) and structural dynamics (SD) computer codes developed at Argonne. The collaboration in this project will bring this unique combination of expertise to bear, for the first time,more » on industrially important problems. By so doing, it will expose the strengths and weaknesses of existing techniques for parallelizing programs and will identify those problems that need to be solved in order to enable wide spread production use of parallel computers. Secondly, the increased efficiency of the CFD and SD codes themselves will enable the simulation of larger, more accurate engineering models that involve fluid and structural dynamics. In order to realize the above two goals, we are considering two production codes that have been developed at ANL and are widely used by both industry and Universities. These are COMMIX and WHAMS-3D. The first is a computational fluid dynamics code that is used for both nuclear reactor design and safety and as a design tool for the casting industry. The second is a three-dimensional structural dynamics code used in nuclear reactor safety as well as crashworthiness studies. These codes are currently available for both sequential and vector computers only. Our main goal is to port and optimize these two codes on shared memory multiprocessors. In so doing, we shall establish a process that can be followed in optimizing other sequential or vector engineering codes for parallel processors.« less
2007-07-16
issue to find a proper acquisition strategy and to optimize the algorithm. So far a two-stage acquisition algorithm based on the optical orthogonal...vol.5, May 11-15, 2003, pp. 3530-3534. [23] M. Weisenhorn and W. Hirt, "Robust noncoherent receiver exploiting UWB channel properties," in Proc. IEEE...PRF) and data rate, are programmable. I Depending on the propagation environments, either the Barker code or the optical orthogonal codes (OOC) [53
Lattice surgery on the Raussendorf lattice
NASA Astrophysics Data System (ADS)
Herr, Daniel; Paler, Alexandru; Devitt, Simon J.; Nori, Franco
2018-07-01
Lattice surgery is a method to perform quantum computation fault-tolerantly by using operations on boundary qubits between different patches of the planar code. This technique allows for universal planar code computation without eliminating the intrinsic two-dimensional nearest-neighbor properties of the surface code that eases physical hardware implementations. Lattice surgery approaches to algorithmic compilation and optimization have been demonstrated to be more resource efficient for resource-intensive components of a fault-tolerant algorithm, and consequently may be preferable over braid-based logic. Lattice surgery can be extended to the Raussendorf lattice, providing a measurement-based approach to the surface code. In this paper we describe how lattice surgery can be performed on the Raussendorf lattice and therefore give a viable alternative to computation using braiding in measurement-based implementations of topological codes.
A Hybrid Optimization Framework with POD-based Order Reduction and Design-Space Evolution Scheme
NASA Astrophysics Data System (ADS)
Ghoman, Satyajit S.
The main objective of this research is to develop an innovative multi-fidelity multi-disciplinary design, analysis and optimization suite that integrates certain solution generation codes and newly developed innovative tools to improve the overall optimization process. The research performed herein is divided into two parts: (1) the development of an MDAO framework by integration of variable fidelity physics-based computational codes, and (2) enhancements to such a framework by incorporating innovative features extending its robustness. The first part of this dissertation describes the development of a conceptual Multi-Fidelity Multi-Strategy and Multi-Disciplinary Design Optimization Environment (M3 DOE), in context of aircraft wing optimization. M 3 DOE provides the user a capability to optimize configurations with a choice of (i) the level of fidelity desired, (ii) the use of a single-step or multi-step optimization strategy, and (iii) combination of a series of structural and aerodynamic analyses. The modularity of M3 DOE allows it to be a part of other inclusive optimization frameworks. The M 3 DOE is demonstrated within the context of shape and sizing optimization of the wing of a Generic Business Jet aircraft. Two different optimization objectives, viz. dry weight minimization, and cruise range maximization are studied by conducting one low-fidelity and two high-fidelity optimization runs to demonstrate the application scope of M3 DOE. The second part of this dissertation describes the development of an innovative hybrid optimization framework that extends the robustness of M 3 DOE by employing a proper orthogonal decomposition-based design-space order reduction scheme combined with the evolutionary algorithm technique. The POD method of extracting dominant modes from an ensemble of candidate configurations is used for the design-space order reduction. The snapshot of candidate population is updated iteratively using evolutionary algorithm technique of fitness-driven retention. This strategy capitalizes on the advantages of evolutionary algorithm as well as POD-based reduced order modeling, while overcoming the shortcomings inherent with these techniques. When linked with M3 DOE, this strategy offers a computationally efficient methodology for problems with high level of complexity and a challenging design-space. This newly developed framework is demonstrated for its robustness on a nonconventional supersonic tailless air vehicle wing shape optimization problem.
Subsonic Aircraft With Regression and Neural-Network Approximators Designed
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Hopkins, Dale A.
2004-01-01
At the NASA Glenn Research Center, NASA Langley Research Center's Flight Optimization System (FLOPS) and the design optimization testbed COMETBOARDS with regression and neural-network-analysis approximators have been coupled to obtain a preliminary aircraft design methodology. For a subsonic aircraft, the optimal design, that is the airframe-engine combination, is obtained by the simulation. The aircraft is powered by two high-bypass-ratio engines with a nominal thrust of about 35,000 lbf. It is to carry 150 passengers at a cruise speed of Mach 0.8 over a range of 3000 n mi and to operate on a 6000-ft runway. The aircraft design utilized a neural network and a regression-approximations-based analysis tool, along with a multioptimizer cascade algorithm that uses sequential linear programming, sequential quadratic programming, the method of feasible directions, and then sequential quadratic programming again. Optimal aircraft weight versus the number of design iterations is shown. The central processing unit (CPU) time to solution is given. It is shown that the regression-method-based analyzer exhibited a smoother convergence pattern than the FLOPS code. The optimum weight obtained by the approximation technique and the FLOPS code differed by 1.3 percent. Prediction by the approximation technique exhibited no error for the aircraft wing area and turbine entry temperature, whereas it was within 2 percent for most other parameters. Cascade strategy was required by FLOPS as well as the approximators. The regression method had a tendency to hug the data points, whereas the neural network exhibited a propensity to follow a mean path. The performance of the neural network and regression methods was considered adequate. It was at about the same level for small, standard, and large models with redundancy ratios (defined as the number of input-output pairs to the number of unknown coefficients) of 14, 28, and 57, respectively. In an SGI octane workstation (Silicon Graphics, Inc., Mountainview, CA), the regression training required a fraction of a CPU second, whereas neural network training was between 1 and 9 min, as given. For a single analysis cycle, the 3-sec CPU time required by the FLOPS code was reduced to milliseconds by the approximators. For design calculations, the time with the FLOPS code was 34 min. It was reduced to 2 sec with the regression method and to 4 min by the neural network technique. The performance of the regression and neural network methods was found to be satisfactory for the analysis and design optimization of the subsonic aircraft.
Fitting Nonlinear Curves by use of Optimization Techniques
NASA Technical Reports Server (NTRS)
Hill, Scott A.
2005-01-01
MULTIVAR is a FORTRAN 77 computer program that fits one of the members of a set of six multivariable mathematical models (five of which are nonlinear) to a multivariable set of data. The inputs to MULTIVAR include the data for the independent and dependent variables plus the user s choice of one of the models, one of the three optimization engines, and convergence criteria. By use of the chosen optimization engine, MULTIVAR finds values for the parameters of the chosen model so as to minimize the sum of squares of the residuals. One of the optimization engines implements a routine, developed in 1982, that utilizes the Broydon-Fletcher-Goldfarb-Shanno (BFGS) variable-metric method for unconstrained minimization in conjunction with a one-dimensional search technique that finds the minimum of an unconstrained function by polynomial interpolation and extrapolation without first finding bounds on the solution. The second optimization engine is a faster and more robust commercially available code, denoted Design Optimization Tool, that also uses the BFGS method. The third optimization engine is a robust and relatively fast routine that implements the Levenberg-Marquardt algorithm.
NASA Technical Reports Server (NTRS)
Ngan, Angelen; Biezad, Daniel
1996-01-01
A study has been conducted to develop and to analyze a FORTRAN computer code for performing agility analysis on fighter aircraft configurations. This program is one of the modules of the NASA Ames ACSYNT (AirCraft SYNThesis) design code. The background of the agility research in the aircraft industry and a survey of a few agility metrics are discussed. The methodology, techniques, and models developed for the code are presented. The validity of the existing code was evaluated by comparing with existing flight test data. A FORTRAN program was developed for a specific metric, PM (Pointing Margin), as part of the agility module. Example trade studies using the agility module along with ACSYNT were conducted using a McDonnell Douglas F/A-18 Hornet aircraft model. Tile sensitivity of thrust loading, wing loading, and thrust vectoring on agility criteria were investigated. The module can compare the agility potential between different configurations and has capability to optimize agility performance in the preliminary design process. This research provides a new and useful design tool for analyzing fighter performance during air combat engagements in the preliminary design.
A motion compensation technique using sliced blocks and its application to hybrid video coding
NASA Astrophysics Data System (ADS)
Kondo, Satoshi; Sasai, Hisao
2005-07-01
This paper proposes a new motion compensation method using "sliced blocks" in DCT-based hybrid video coding. In H.264 ? MPEG-4 Advance Video Coding, a brand-new international video coding standard, motion compensation can be performed by splitting macroblocks into multiple square or rectangular regions. In the proposed method, on the other hand, macroblocks or sub-macroblocks are divided into two regions (sliced blocks) by an arbitrary line segment. The result is that the shapes of the segmented regions are not limited to squares or rectangles, allowing the shapes of the segmented regions to better match the boundaries between moving objects. Thus, the proposed method can improve the performance of the motion compensation. In addition, adaptive prediction of the shape according to the region shape of the surrounding macroblocks can reduce overheads to describe shape information in the bitstream. The proposed method also has the advantage that conventional coding techniques such as mode decision using rate-distortion optimization can be utilized, since coding processes such as frequency transform and quantization are performed on a macroblock basis, similar to the conventional coding methods. The proposed method is implemented in an H.264-based P-picture codec and an improvement in bit rate of 5% is confirmed in comparison with H.264.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen
2015-10-01
The Thompson cloud microphysics scheme is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Thompson scheme incorporates a large number of improvements. Thus, we have optimized the speed of this important part of WRF. Intel Many Integrated Core (MIC) ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our results of optimizing the Thompson microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. New optimizations for an updated Thompson scheme are discusses in this paper. The optimizations improved the performance of the original Thompson code on Xeon Phi 7120P by a factor of 1.8x. Furthermore, the same optimizations improved the performance of the Thompson on a dual socket configuration of eight core Intel Xeon E5-2670 CPUs by a factor of 1.8x compared to the original Thompson code.
Cornut, P-L; Soldermann, Y; Robin, C; Barranco, R; Kerhoas, A; Burillon, C
2013-12-01
To report the financial impact of using modern lens and vitreoretinal surgical techniques. Bottom-up sterilization and consumables costs for new surgical techniques (microincisional coaxial phacoemulsification and transconjunctival sutureless vitrectomy) and the corresponding former techniques (phacoemulsification with 3.2-mm incision and 20G vitrectomy) were determined. These costs were compared to each other and to the target costs of the Diagnosis Related Groups for public hospitals (Groupes Homogènes de Séjours [GHS]) concerned, extracted from the analytic accounting data of the French National Cost Study (Étude Nationale des Coûts [ENC]) for 2009 (target=sum of sterilization costs posted under medical logistics, consumables, implantable medical devices, and special pharmaceuticals posted as direct expenses). For outpatient lens surgery with or without vitrectomy (GHS code: 02C05J): the ENC's target cost for 2009 was 339€ out of a total of 1432€. The cost detailed in this study was 4 % higher than the target cost when the procedure was performed using the former technique (3.2mm sutured incision) and 12 % lower when the procedure was performed using the new technique (1.8mm sutureless) after removing now unnecessary consumables and optimization of the technique. For level I retinal detachment surgeries (GHS code: 02C021): the ENC's 2009 target cost was 641€ out of a total of 3091€. The cost specified in this study was 1 % lower than the target cost when the procedure was done using the former technique (20-G vitrectomy) and 16 % less when the procedure was performed using the new technique (transconjunctival vitrectomy) after removal of now unnecessary consumables and optimization of the technique. Contrary to generally accepted ideas, implementing modern techniques in ocular surgery can result in direct cost and sterilization savings when the operator takes advantage of the possibilities these techniques offer in terms of simplification of the procedures to do away with consumables that are no longer necessary. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Random search optimization based on genetic algorithm and discriminant function
NASA Technical Reports Server (NTRS)
Kiciman, M. O.; Akgul, M.; Erarslanoglu, G.
1990-01-01
The general problem of optimization with arbitrary merit and constraint functions, which could be convex, concave, monotonic, or non-monotonic, is treated using stochastic methods. To improve the efficiency of the random search methods, a genetic algorithm for the search phase and a discriminant function for the constraint-control phase were utilized. The validity of the technique is demonstrated by comparing the results to published test problem results. Numerical experimentation indicated that for cases where a quick near optimum solution is desired, a general, user-friendly optimization code can be developed without serious penalties in both total computer time and accuracy.
Ibrahim, Khaled Z.; Madduri, Kamesh; Williams, Samuel; ...
2013-07-18
The Gyrokinetic Toroidal Code (GTC) uses the particle-in-cell method to efficiently simulate plasma microturbulence. This paper presents novel analysis and optimization techniques to enhance the performance of GTC on large-scale machines. We introduce cell access analysis to better manage locality vs. synchronization tradeoffs on CPU and GPU-based architectures. Finally, our optimized hybrid parallel implementation of GTC uses MPI, OpenMP, and NVIDIA CUDA, achieves up to a 2× speedup over the reference Fortran version on multiple parallel systems, and scales efficiently to tens of thousands of cores.
2-Step scalar deadzone quantization for bitplane image coding.
Auli-Llinas, Francesc
2013-12-01
Modern lossy image coding systems generate a quality progressive codestream that, truncated at increasing rates, produces an image with decreasing distortion. Quality progressivity is commonly provided by an embedded quantizer that employs uniform scalar deadzone quantization (USDQ) together with a bitplane coding strategy. This paper introduces a 2-step scalar deadzone quantization (2SDQ) scheme that achieves same coding performance as that of USDQ while reducing the coding passes and the emitted symbols of the bitplane coding engine. This serves to reduce the computational costs of the codec and/or to code high dynamic range images. The main insights behind 2SDQ are the use of two quantization step sizes that approximate wavelet coefficients with more or less precision depending on their density, and a rate-distortion optimization technique that adjusts the distortion decreases produced when coding 2SDQ indexes. The integration of 2SDQ in current codecs is straightforward. The applicability and efficiency of 2SDQ are demonstrated within the framework of JPEG2000.
Optimizing the Galileo space communication link
NASA Technical Reports Server (NTRS)
Statman, J. I.
1994-01-01
The Galileo mission was originally designed to investigate Jupiter and its moons utilizing a high-rate, X-band (8415 MHz) communication downlink with a maximum rate of 134.4 kb/sec. However, following the failure of the high-gain antenna (HGA) to fully deploy, a completely new communication link design was established that is based on Galileo's S-band (2295 MHz), low-gain antenna (LGA). The new link relies on data compression, local and intercontinental arraying of antennas, a (14,1/4) convolutional code, a (255,M) variable-redundancy Reed-Solomon code, decoding feedback, and techniques to reprocess recorded data to greatly reduce data losses during signal acquisition. The combination of these techniques will enable return of significant science data from the mission.
Robust Transmission of H.264/AVC Streams Using Adaptive Group Slicing and Unequal Error Protection
NASA Astrophysics Data System (ADS)
Thomos, Nikolaos; Argyropoulos, Savvas; Boulgouris, Nikolaos V.; Strintzis, Michael G.
2006-12-01
We present a novel scheme for the transmission of H.264/AVC video streams over lossy packet networks. The proposed scheme exploits the error-resilient features of H.264/AVC codec and employs Reed-Solomon codes to protect effectively the streams. A novel technique for adaptive classification of macroblocks into three slice groups is also proposed. The optimal classification of macroblocks and the optimal channel rate allocation are achieved by iterating two interdependent steps. Dynamic programming techniques are used for the channel rate allocation process in order to reduce complexity. Simulations clearly demonstrate the superiority of the proposed method over other recent algorithms for transmission of H.264/AVC streams.
Helium: lifting high-performance stencil kernels from stripped x86 binaries to halide DSL code
Mendis, Charith; Bosboom, Jeffrey; Wu, Kevin; ...
2015-06-03
Highly optimized programs are prone to bit rot, where performance quickly becomes suboptimal in the face of new hardware and compiler techniques. In this paper we show how to automatically lift performance-critical stencil kernels from a stripped x86 binary and generate the corresponding code in the high-level domain-specific language Halide. Using Halide's state-of-the-art optimizations targeting current hardware, we show that new optimized versions of these kernels can replace the originals to rejuvenate the application for newer hardware. The original optimized code for kernels in stripped binaries is nearly impossible to analyze statically. Instead, we rely on dynamic traces to regeneratemore » the kernels. We perform buffer structure reconstruction to identify input, intermediate and output buffer shapes. Here, we abstract from a forest of concrete dependency trees which contain absolute memory addresses to symbolic trees suitable for high-level code generation. This is done by canonicalizing trees, clustering them based on structure, inferring higher-dimensional buffer accesses and finally by solving a set of linear equations based on buffer accesses to lift them up to simple, high-level expressions. Helium can handle highly optimized, complex stencil kernels with input-dependent conditionals. We lift seven kernels from Adobe Photoshop giving a 75 % performance improvement, four kernels from Irfan View, leading to 4.97 x performance, and one stencil from the mini GMG multigrid benchmark netting a 4.25 x improvement in performance. We manually rejuvenated Photoshop by replacing eleven of Photoshop's filters with our lifted implementations, giving 1.12 x speedup without affecting the user experience.« less
An Advanced simulation Code for Modeling Inductive Output Tubes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thuc Bui; R. Lawrence Ives
2012-04-27
During the Phase I program, CCR completed several major building blocks for a 3D large signal, inductive output tube (IOT) code using modern computer language and programming techniques. These included a 3D, Helmholtz, time-harmonic, field solver with a fully functional graphical user interface (GUI), automeshing and adaptivity. Other building blocks included the improved electrostatic Poisson solver with temporal boundary conditions to provide temporal fields for the time-stepping particle pusher as well as the self electric field caused by time-varying space charge. The magnetostatic field solver was also updated to solve for the self magnetic field caused by time changing currentmore » density in the output cavity gap. The goal function to optimize an IOT cavity was also formulated, and the optimization methodologies were investigated.« less
A survey of compiler development aids. [concerning lexical, syntax, and semantic analysis
NASA Technical Reports Server (NTRS)
Buckles, B. P.; Hodges, B. C.; Hsia, P.
1977-01-01
A theoretical background was established for the compilation process by dividing it into five phases and explaining the concepts and algorithms that underpin each. The five selected phases were lexical analysis, syntax analysis, semantic analysis, optimization, and code generation. Graph theoretical optimization techniques were presented, and approaches to code generation were described for both one-pass and multipass compilation environments. Following the initial tutorial sections, more than 20 tools that were developed to aid in the process of writing compilers were surveyed. Eight of the more recent compiler development aids were selected for special attention - SIMCMP/STAGE2, LANG-PAK, COGENT, XPL, AED, CWIC, LIS, and JOCIT. The impact of compiler development aids were assessed some of their shortcomings and some of the areas of research currently in progress were inspected.
Fault trees for decision making in systems analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lambert, Howard E.
1975-10-09
The application of fault tree analysis (FTA) to system safety and reliability is presented within the framework of system safety analysis. The concepts and techniques involved in manual and automated fault tree construction are described and their differences noted. The theory of mathematical reliability pertinent to FTA is presented with emphasis on engineering applications. An outline of the quantitative reliability techniques of the Reactor Safety Study is given. Concepts of probabilistic importance are presented within the fault tree framework and applied to the areas of system design, diagnosis and simulation. The computer code IMPORTANCE ranks basic events and cut setsmore » according to a sensitivity analysis. A useful feature of the IMPORTANCE code is that it can accept relative failure data as input. The output of the IMPORTANCE code can assist an analyst in finding weaknesses in system design and operation, suggest the most optimal course of system upgrade, and determine the optimal location of sensors within a system. A general simulation model of system failure in terms of fault tree logic is described. The model is intended for efficient diagnosis of the causes of system failure in the event of a system breakdown. It can also be used to assist an operator in making decisions under a time constraint regarding the future course of operations. The model is well suited for computer implementation. New results incorporated in the simulation model include an algorithm to generate repair checklists on the basis of fault tree logic and a one-step-ahead optimization procedure that minimizes the expected time to diagnose system failure.« less
Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide
Tang, William; Wang, Bei; Ethier, Stephane; ...
2016-11-01
The goal of the extreme scale plasma turbulence studies described in this paper is to expedite the delivery of reliable predictions on confinement physics in large magnetic fusion systems by using world-class supercomputers to carry out simulations with unprecedented resolution and temporal duration. This has involved architecture-dependent optimizations of performance scaling and addressing code portability and energy issues, with the metrics for multi-platform comparisons being 'time-to-solution' and 'energy-to-solution'. Realistic results addressing how confinement losses caused by plasma turbulence scale from present-day devices to the much larger $25 billion international ITER fusion facility have been enabled by innovative advances in themore » GTC-P code including (i) implementation of one-sided communication from MPI 3.0 standard; (ii) creative optimization techniques on Xeon Phi processors; and (iii) development of a novel performance model for the key kernels of the PIC code. Our results show that modeling data movement is sufficient to predict performance on modern supercomputer platforms.« less
Autonomous Modelling of X-ray Spectra Using Robust Global Optimization Methods
NASA Astrophysics Data System (ADS)
Rogers, Adam; Safi-Harb, Samar; Fiege, Jason
2015-08-01
The standard approach to model fitting in X-ray astronomy is by means of local optimization methods. However, these local optimizers suffer from a number of problems, such as a tendency for the fit parameters to become trapped in local minima, and can require an involved process of detailed user intervention to guide them through the optimization process. In this work we introduce a general GUI-driven global optimization method for fitting models to X-ray data, written in MATLAB, which searches for optimal models with minimal user interaction. We directly interface with the commonly used XSPEC libraries to access the full complement of pre-existing spectral models that describe a wide range of physics appropriate for modelling astrophysical sources, including supernova remnants and compact objects. Our algorithm is powered by the Ferret genetic algorithm and Locust particle swarm optimizer from the Qubist Global Optimization Toolbox, which are robust at finding families of solutions and identifying degeneracies. This technique will be particularly instrumental for multi-parameter models and high-fidelity data. In this presentation, we provide details of the code and use our techniques to analyze X-ray data obtained from a variety of astrophysical sources.
NASA Technical Reports Server (NTRS)
Navon, I. M.
1984-01-01
A Lagrange multiplier method using techniques developed by Bertsekas (1982) was applied to solving the problem of enforcing simultaneous conservation of the nonlinear integral invariants of the shallow water equations on a limited area domain. This application of nonlinear constrained optimization is of the large dimensional type and the conjugate gradient method was found to be the only computationally viable method for the unconstrained minimization. Several conjugate-gradient codes were tested and compared for increasing accuracy requirements. Robustness and computational efficiency were the principal criteria.
Numerical study of combustion processes in afterburners
NASA Technical Reports Server (NTRS)
Zhou, Xiaoqing; Zhang, Xiaochun
1986-01-01
Mathematical models and numerical methods are presented for computer modeling of aeroengine afterburners. A computer code GEMCHIP is described briefly. The algorithms SIMPLER, for gas flow predictions, and DROPLET, for droplet flow calculations, are incorporated in this code. The block correction technique is adopted to facilitate convergence. The method of handling irregular shapes of combustors and flameholders is described. The predicted results for a low-bypass-ratio turbofan afterburner in the cases of gaseous combustion and multiphase spray combustion are provided and analyzed, and engineering guides for afterburner optimization are presented.
Vectorization, threading, and cache-blocking considerations for hydrocodes on emerging architectures
Fung, J.; Aulwes, R. T.; Bement, M. T.; ...
2015-07-14
This work reports on considerations for improving computational performance in preparation for current and expected changes to computer architecture. The algorithms studied will include increasingly complex prototypes for radiation hydrodynamics codes, such as gradient routines and diffusion matrix assembly (e.g., in [1-6]). The meshes considered for the algorithms are structured or unstructured meshes. The considerations applied for performance improvements are meant to be general in terms of architecture (not specifically graphical processing unit (GPUs) or multi-core machines, for example) and include techniques for vectorization, threading, tiling, and cache blocking. Out of a survey of optimization techniques on applications such asmore » diffusion and hydrodynamics, we make general recommendations with a view toward making these techniques conceptually accessible to the applications code developer. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less
NASA Astrophysics Data System (ADS)
He, Lirong; Cui, Guangmang; Feng, Huajun; Xu, Zhihai; Li, Qi; Chen, Yueting
2015-03-01
Coded exposure photography makes the motion de-blurring a well-posed problem. The integration pattern of light is modulated using the method of coded exposure by opening and closing the shutter within the exposure time, changing the traditional shutter frequency spectrum into a wider frequency band in order to preserve more image information in frequency domain. The searching method of optimal code is significant for coded exposure. In this paper, an improved criterion of the optimal code searching is proposed by analyzing relationship between code length and the number of ones in the code, considering the noise effect on code selection with the affine noise model. Then the optimal code is obtained utilizing the method of genetic searching algorithm based on the proposed selection criterion. Experimental results show that the time consuming of searching optimal code decreases with the presented method. The restoration image is obtained with better subjective experience and superior objective evaluation values.
Applications Performance Under MPL and MPI on NAS IBM SP2
NASA Technical Reports Server (NTRS)
Saini, Subhash; Simon, Horst D.; Lasinski, T. A. (Technical Monitor)
1994-01-01
On July 5, 1994, an IBM Scalable POWER parallel System (IBM SP2) with 64 nodes, was installed at the Numerical Aerodynamic Simulation (NAS) Facility Each node of NAS IBM SP2 is a "wide node" consisting of a RISC 6000/590 workstation module with a clock of 66.5 MHz which can perform four floating point operations per clock with a peak performance of 266 Mflop/s. By the end of 1994, 64 nodes of IBM SP2 will be upgraded to 160 nodes with a peak performance of 42.5 Gflop/s. An overview of the IBM SP2 hardware is presented. The basic understanding of architectural details of RS 6000/590 will help application scientists the porting, optimizing, and tuning of codes from other machines such as the CRAY C90 and the Paragon to the NAS SP2. Optimization techniques such as quad-word loading, effective utilization of two floating point units, and data cache optimization of RS 6000/590 is illustrated, with examples giving performance gains at each optimization step. The conversion of codes using Intel's message passing library NX to codes using native Message Passing Library (MPL) and the Message Passing Interface (NMI) library available on the IBM SP2 is illustrated. In particular, we will present the performance of Fast Fourier Transform (FFT) kernel from NAS Parallel Benchmarks (NPB) under MPL and MPI. We have also optimized some of Fortran BLAS 2 and BLAS 3 routines, e.g., the optimized Fortran DAXPY runs at 175 Mflop/s and optimized Fortran DGEMM runs at 230 Mflop/s per node. The performance of the NPB (Class B) on the IBM SP2 is compared with the CRAY C90, Intel Paragon, TMC CM-5E, and the CRAY T3D.
Link Correlation Based Transmit Sector Antenna Selection for Alamouti Coded OFDM
NASA Astrophysics Data System (ADS)
Ahn, Chang-Jun
In MIMO systems, the deployment of a multiple antenna technique can enhance the system performance. However, since the cost of RF transmitters is much higher than that of antennas, there is growing interest in techniques that use a larger number of antennas than the number of RF transmitters. These methods rely on selecting the optimal transmitter antennas and connecting them to the respective. In this case, feedback information (FBI) is required to select the optimal transmitter antenna elements. Since FBI is control overhead, the rate of the feedback is limited. This motivates the study of limited feedback techniques where only partial or quantized information from the receiver is conveyed back to the transmitter. However, in MIMO/OFDM systems, it is difficult to develop an effective FBI quantization method for choosing the space-time, space-frequency, or space-time-frequency processing due to the numerous subchannels. Moreover, MIMO/OFDM systems require antenna separation of 5 ∼ 10 wavelengths to keep the correlation coefficient below 0.7 to achieve a diversity gain. In this case, the base station requires a large space to set up multiple antennas. To reduce these problems, in this paper, we propose the link correlation based transmit sector antenna selection for Alamouti coded OFDM without FBI.
Large-scale structural optimization
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.
1983-01-01
Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single, rather than trade-off, design methodology and the incompatibility of large-scale optimization with single program, single computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and then embody the definitions of design variables, objective function and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.
User's manual for the BNW-I optimization code for dry-cooled power plants. [AMCIRC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Braun, D.J.; Daniel, D.J.; De Mier, W.V.
1977-01-01
This appendix provides a listing, called Program AMCIRC, of the BNW-1 optimization code for determining, for a particular size power plant, the optimum dry cooling tower design using ammonia flow in the heat exchanger tubes. The optimum design is determined by repeating the design of the cooling system over a range of design conditions in order to find the cooling system with the smallest incremental cost. This is accomplished by varying five parameters of the plant and cooling system over ranges of values. These parameters are varied systematically according to techniques that perform pattern and gradient searches. The dry coolingmore » system optimized by program AMCIRC is composed of a condenser/reboiler (condensation of steam and boiling of ammonia), piping system (transports ammonia vapor out and ammonia liquid from the dry cooling towers), and circular tower system (vertical one-pass heat exchangers situated in circular configurations with cocurrent ammonia flow in the tubes of the heat exchanger). (LCL)« less
Schmidt, M K; van Leeuwen, F E; Klaren, H M; Tollenaar, R A; van 't Veer, L J
2004-03-20
To answer research questions concerning the course of disease and the optimal treatment of hereditary breast cancer, genetic typing together with the clinical and tumour characteristics of breast cancer patients are an important source of information. Part of the incidence of breast cancer can be explained by BRCA1 and BRCA2 germline mutations, which with current techniques can be retrospectively analysed in stored, paraffin-embedded tissue samples. In view of the implications of BRCA1- or BRCA2-carrier status for patients and other family members and the lack of clear legal regulations regarding the procedures to be followed when analysis is performed on historical material and no individual informed consent can be asked from the patients, an appropriate procedure for coding such data or rendering it anonymous is of great importance. By using the coding procedure described in this article, it becomes possible to follow and to work out in greater detail the guidelines of the code for 'Proper secondary use of human tissue' of the Federation of Biomedical Scientific Societies and to use these valuable databases again in the future.
A programmable metasurface with dynamic polarization, scattering and focusing control
NASA Astrophysics Data System (ADS)
Yang, Huanhuan; Cao, Xiangyu; Yang, Fan; Gao, Jun; Xu, Shenheng; Li, Maokun; Chen, Xibi; Zhao, Yi; Zheng, Yuejun; Li, Sijia
2016-10-01
Diverse electromagnetic (EM) responses of a programmable metasurface with a relatively large scale have been investigated, where multiple functionalities are obtained on the same surface. The unit cell in the metasurface is integrated with one PIN diode, and thus a binary coded phase is realized for a single polarization. Exploiting this anisotropic characteristic, reconfigurable polarization conversion is presented first. Then the dynamic scattering performance for two kinds of sources, i.e. a plane wave and a point source, is carefully elaborated. To tailor the scattering properties, genetic algorithm, normally based on binary coding, is coupled with the scattering pattern analysis to optimize the coding matrix. Besides, inverse fast Fourier transform (IFFT) technique is also introduced to expedite the optimization process of a large metasurface. Since the coding control of each unit cell allows a local and direct modulation of EM wave, various EM phenomena including anomalous reflection, diffusion, beam steering and beam forming are successfully demonstrated by both simulations and experiments. It is worthwhile to point out that a real-time switch among these functionalities is also achieved by using a field-programmable gate array (FPGA). All the results suggest that the proposed programmable metasurface has great potentials for future applications.
A programmable metasurface with dynamic polarization, scattering and focusing control
Yang, Huanhuan; Cao, Xiangyu; Yang, Fan; Gao, Jun; Xu, Shenheng; Li, Maokun; Chen, Xibi; Zhao, Yi; Zheng, Yuejun; Li, Sijia
2016-01-01
Diverse electromagnetic (EM) responses of a programmable metasurface with a relatively large scale have been investigated, where multiple functionalities are obtained on the same surface. The unit cell in the metasurface is integrated with one PIN diode, and thus a binary coded phase is realized for a single polarization. Exploiting this anisotropic characteristic, reconfigurable polarization conversion is presented first. Then the dynamic scattering performance for two kinds of sources, i.e. a plane wave and a point source, is carefully elaborated. To tailor the scattering properties, genetic algorithm, normally based on binary coding, is coupled with the scattering pattern analysis to optimize the coding matrix. Besides, inverse fast Fourier transform (IFFT) technique is also introduced to expedite the optimization process of a large metasurface. Since the coding control of each unit cell allows a local and direct modulation of EM wave, various EM phenomena including anomalous reflection, diffusion, beam steering and beam forming are successfully demonstrated by both simulations and experiments. It is worthwhile to point out that a real-time switch among these functionalities is also achieved by using a field-programmable gate array (FPGA). All the results suggest that the proposed programmable metasurface has great potentials for future applications. PMID:27774997
A programmable metasurface with dynamic polarization, scattering and focusing control.
Yang, Huanhuan; Cao, Xiangyu; Yang, Fan; Gao, Jun; Xu, Shenheng; Li, Maokun; Chen, Xibi; Zhao, Yi; Zheng, Yuejun; Li, Sijia
2016-10-24
Diverse electromagnetic (EM) responses of a programmable metasurface with a relatively large scale have been investigated, where multiple functionalities are obtained on the same surface. The unit cell in the metasurface is integrated with one PIN diode, and thus a binary coded phase is realized for a single polarization. Exploiting this anisotropic characteristic, reconfigurable polarization conversion is presented first. Then the dynamic scattering performance for two kinds of sources, i.e. a plane wave and a point source, is carefully elaborated. To tailor the scattering properties, genetic algorithm, normally based on binary coding, is coupled with the scattering pattern analysis to optimize the coding matrix. Besides, inverse fast Fourier transform (IFFT) technique is also introduced to expedite the optimization process of a large metasurface. Since the coding control of each unit cell allows a local and direct modulation of EM wave, various EM phenomena including anomalous reflection, diffusion, beam steering and beam forming are successfully demonstrated by both simulations and experiments. It is worthwhile to point out that a real-time switch among these functionalities is also achieved by using a field-programmable gate array (FPGA). All the results suggest that the proposed programmable metasurface has great potentials for future applications.
Speech coding at low to medium bit rates
NASA Astrophysics Data System (ADS)
Leblanc, Wilfred Paul
1992-09-01
Improved search techniques coupled with improved codebook design methodologies are proposed to improve the performance of conventional code-excited linear predictive coders for speech. Improved methods for quantizing the short term filter are developed by employing a tree search algorithm and joint codebook design to multistage vector quantization. Joint codebook design procedures are developed to design locally optimal multistage codebooks. Weighting during centroid computation is introduced to improve the outlier performance of the multistage vector quantizer. Multistage vector quantization is shown to be both robust against input characteristics and in the presence of channel errors. Spectral distortions of about 1 dB are obtained at rates of 22-28 bits/frame. Structured codebook design procedures for excitation in code-excited linear predictive coders are compared to general codebook design procedures. Little is lost using significant structure in the excitation codebooks while greatly reducing the search complexity. Sparse multistage configurations are proposed for reducing computational complexity and memory size. Improved search procedures are applied to code-excited linear prediction which attempt joint optimization of the short term filter, the adaptive codebook, and the excitation. Improvements in signal to noise ratio of 1-2 dB are realized in practice.
Flow range enhancement by secondary flow effect in low solidity circular cascade diffusers
NASA Astrophysics Data System (ADS)
Sakaguchi, Daisaku; Tun, Min Thaw; Mizokoshi, Kanata; Kishikawa, Daiki
2014-08-01
High-pressure ratio and wide operating range are highly required for compressors and blowers. The technical issue of the design is achievement of suppression of flow separation at small flow rate without deteriorating the efficiency at design flow rate. A numerical simulation is very effective in design procedure, however, cost of the numerical simulation is generally high during the practical design process, and it is difficult to confirm the optimal design which is combined with many parameters. A multi-objective optimization technique is the idea that has been proposed for solving the problem in practical design process. In this study, a Low Solidity circular cascade Diffuser (LSD) in a centrifugal blower is successfully designed by means of multi-objective optimization technique. An optimization code with a meta-model assisted evolutionary algorithm is used with a commercial CFD code ANSYS-CFX. The optimization is aiming at improving the static pressure coefficient at design point and at low flow rate condition while constraining the slope of the lift coefficient curve. Moreover, a small tip clearance of the LSD blade was applied in order to activate and to stabilize the secondary flow effect at small flow rate condition. The optimized LSD blade has an extended operating range of 114 % towards smaller flow rate as compared to the baseline design without deteriorating the diffuser pressure recovery at design point. The diffuser pressure rise and operating flow range of the optimized LSD blade are experimentally verified by overall performance test. The detailed flow in the diffuser is also confirmed by means of a Particle Image Velocimeter. Secondary flow is clearly captured by PIV and it spreads to the whole area of LSD blade pitch. It is found that the optimized LSD blade shows good improvement of the blade loading in the whole operating range, while at small flow rate the flow separation on the LSD blade has been successfully suppressed by the secondary flow effect.
Optimization techniques applied to passive measures for in-orbit spacecraft survivability
NASA Technical Reports Server (NTRS)
Mog, Robert A.; Price, D. Marvin
1991-01-01
Spacecraft designers have always been concerned about the effects of meteoroid impacts on mission safety. The engineering solution to this problem has generally been to erect a bumper or shield placed outboard from the spacecraft wall to disrupt/deflect the incoming projectiles. Spacecraft designers have a number of tools at their disposal to aid in the design process. These include hypervelocity impact testing, analytic impact predictors, and hydrodynamic codes. Analytic impact predictors generally provide the best quick-look estimate of design tradeoffs. The most complete way to determine the characteristics of an analytic impact predictor is through optimization of the protective structures design problem formulated with the predictor of interest. Space Station Freedom protective structures design insight is provided through the coupling of design/material requirements, hypervelocity impact phenomenology, meteoroid and space debris environment sensitivities, optimization techniques and operations research strategies, and mission scenarios. Major results are presented.
Live-cell imaging of budding yeast telomerase RNA and TERRA.
Laprade, Hadrien; Lalonde, Maxime; Guérit, David; Chartrand, Pascal
2017-02-01
In most eukaryotes, the ribonucleoprotein complex telomerase is responsible for maintaining telomere length. In recent years, single-cell microscopy techniques such as fluorescent in situ hybridization and live-cell imaging have been developed to image the RNA subunit of the telomerase holoenzyme. These techniques are now becoming important tools for the study of telomerase biogenesis, its association with telomeres and its regulation. Here, we present detailed protocols for live-cell imaging of the Saccharomyces cerevisiae telomerase RNA subunit, called TLC1, and also of the non-coding telomeric repeat-containing RNA TERRA. We describe the approach used for genomic integration of MS2 stem-loops in these transcripts, and provide information for optimal live-cell imaging of these non-coding RNAs. Copyright © 2016 Elsevier Inc. All rights reserved.
Non-parametric PCM to ADM conversion. [Pulse Code to Adaptive Delta Modulation
NASA Technical Reports Server (NTRS)
Locicero, J. L.; Schilling, D. L.
1977-01-01
An all-digital technique to convert pulse code modulated (PCM) signals into adaptive delta modulation (ADM) format is presented. The converter developed is shown to be independent of the statistical parameters of the encoded signal and can be constructed with only standard digital hardware. The structure of the converter is simple enough to be fabricated on a large scale integrated circuit where the advantages of reliability and cost can be optimized. A concise evaluation of this PCM to ADM translation technique is presented and several converters are simulated on a digital computer. A family of performance curves is given which displays the signal-to-noise ratio for sinusoidal test signals subjected to the conversion process, as a function of input signal power for several ratios of ADM rate to Nyquist rate.
NASA Technical Reports Server (NTRS)
Noble, Viveca K.
1994-01-01
When data is transmitted through a noisy channel, errors are produced within the data rendering it indecipherable. Through the use of error control coding techniques, the bit error rate can be reduced to any desired level without sacrificing the transmission data rate. The Astrionics Laboratory at Marshall Space Flight Center has decided to use a modular, end-to-end telemetry data simulator to simulate the transmission of data from flight to ground and various methods of error control. The simulator includes modules for random data generation, data compression, Consultative Committee for Space Data Systems (CCSDS) transfer frame formation, error correction/detection, error generation and error statistics. The simulator utilizes a concatenated coding scheme which includes CCSDS standard (255,223) Reed-Solomon (RS) code over GF(2(exp 8)) with interleave depth of 5 as the outermost code, (7, 1/2) convolutional code as an inner code and CCSDS recommended (n, n-16) cyclic redundancy check (CRC) code as the innermost code, where n is the number of information bits plus 16 parity bits. The received signal-to-noise for a desired bit error rate is greatly reduced through the use of forward error correction techniques. Even greater coding gain is provided through the use of a concatenated coding scheme. Interleaving/deinterleaving is necessary to randomize burst errors which may appear at the input of the RS decoder. The burst correction capability length is increased in proportion to the interleave depth. The modular nature of the simulator allows for inclusion or exclusion of modules as needed. This paper describes the development and operation of the simulator, the verification of a C-language Reed-Solomon code, and the possibility of using Comdisco SPW(tm) as a tool for determining optimal error control schemes.
Development of Multiobjective Optimization Techniques for Sonic Boom Minimization
NASA Technical Reports Server (NTRS)
Chattopadhyay, Aditi; Rajadas, John Narayan; Pagaldipti, Naryanan S.
1996-01-01
A discrete, semi-analytical sensitivity analysis procedure has been developed for calculating aerodynamic design sensitivities. The sensitivities of the flow variables and the grid coordinates are numerically calculated using direct differentiation of the respective discretized governing equations. The sensitivity analysis techniques are adapted within a parabolized Navier Stokes equations solver. Aerodynamic design sensitivities for high speed wing-body configurations are calculated using the semi-analytical sensitivity analysis procedures. Representative results obtained compare well with those obtained using the finite difference approach and establish the computational efficiency and accuracy of the semi-analytical procedures. Multidisciplinary design optimization procedures have been developed for aerospace applications namely, gas turbine blades and high speed wing-body configurations. In complex applications, the coupled optimization problems are decomposed into sublevels using multilevel decomposition techniques. In cases with multiple objective functions, formal multiobjective formulation such as the Kreisselmeier-Steinhauser function approach and the modified global criteria approach have been used. Nonlinear programming techniques for continuous design variables and a hybrid optimization technique, based on a simulated annealing algorithm, for discrete design variables have been used for solving the optimization problems. The optimization procedure for gas turbine blades improves the aerodynamic and heat transfer characteristics of the blades. The two-dimensional, blade-to-blade aerodynamic analysis is performed using a panel code. The blade heat transfer analysis is performed using an in-house developed finite element procedure. The optimization procedure yields blade shapes with significantly improved velocity and temperature distributions. The multidisciplinary design optimization procedures for high speed wing-body configurations simultaneously improve the aerodynamic, the sonic boom and the structural characteristics of the aircraft. The flow solution is obtained using a comprehensive parabolized Navier Stokes solver. Sonic boom analysis is performed using an extrapolation procedure. The aircraft wing load carrying member is modeled as either an isotropic or a composite box beam. The isotropic box beam is analyzed using thin wall theory. The composite box beam is analyzed using a finite element procedure. The developed optimization procedures yield significant improvements in all the performance criteria and provide interesting design trade-offs. The semi-analytical sensitivity analysis techniques offer significant computational savings and allow the use of comprehensive analysis procedures within design optimization studies.
NASA Technical Reports Server (NTRS)
Dolinar, S.
1988-01-01
Over the past six to eight years, an extensive research effort was conducted to investigate advanced coding techniques which promised to yield more coding gain than is available with current NASA standard codes. The delay in Galileo's launch due to the temporary suspension of the shuttle program provided the Galileo project with an opportunity to evaluate the possibility of including some version of the advanced codes as a mission enhancement option. A study was initiated last summer to determine if substantial coding gain was feasible for Galileo and, is so, to recommend a suitable experimental code for use as a switchable alternative to the current NASA-standard code. The Galileo experimental code study resulted in the selection of a code with constant length 15 and rate 1/4. The code parameters were chosen to optimize performance within cost and risk constraints consistent with retrofitting the new code into the existing Galileo system design and launch schedule. The particular code was recommended after a very limited search among good codes with the chosen parameters. It will theoretically yield about 1.5 dB enhancement under idealizing assumptions relative to the current NASA-standard code at Galileo's desired bit error rates. This ideal predicted gain includes enough cushion to meet the project's target of at least 1 dB enhancement under real, non-ideal conditions.
Fast and accurate modeling of stray light in optical systems
NASA Astrophysics Data System (ADS)
Perrin, Jean-Claude
2017-11-01
The first problem to be solved in most optical designs with respect to stray light is that of internal reflections on the several surfaces of individual lenses and mirrors, and on the detector itself. The level of stray light ratio can be considerably reduced by taking into account the stray light during the optimization to determine solutions in which the irradiance due to these ghosts is kept to the minimum possible value. Unhappily, the routines available in most optical design software's, for example CODE V, do not permit all alone to make exact quantitative calculations of the stray light due to these ghosts. Therefore, the engineer in charge of the optical design is confronted to the problem of using two different software's, one for the design and optimization, for example CODE V, one for stray light analysis, for example ASAP. This makes a complete optimization very complex . Nevertheless, using special techniques and combinations of the routines available in CODE V, it is possible to have at its disposal a software macro tool to do such an analysis quickly and accurately, including Monte-Carlo ray tracing, or taking into account diffraction effects. This analysis can be done in a few minutes, to be compared to hours with other software's.
Optimization of computations for adjoint field and Jacobian needed in 3D CSEM inversion
NASA Astrophysics Data System (ADS)
Dehiya, Rahul; Singh, Arun; Gupta, Pravin K.; Israil, M.
2017-01-01
We present the features and results of a newly developed code, based on Gauss-Newton optimization technique, for solving three-dimensional Controlled-Source Electromagnetic inverse problem. In this code a special emphasis has been put on representing the operations by block matrices for conjugate gradient iteration. We show how in the computation of Jacobian, the matrix formed by differentiation of system matrix can be made independent of frequency to optimize the operations at conjugate gradient step. The coarse level parallel computing, using OpenMP framework, is used primarily due to its simplicity in implementation and accessibility of shared memory multi-core computing machine to almost anyone. We demonstrate how the coarseness of modeling grid in comparison to source (comp`utational receivers) spacing can be exploited for efficient computing, without compromising the quality of the inverted model, by reducing the number of adjoint calls. It is also demonstrated that the adjoint field can even be computed on a grid coarser than the modeling grid without affecting the inversion outcome. These observations were reconfirmed using an experiment design where the deviation of source from straight tow line is considered. Finally, a real field data inversion experiment is presented to demonstrate robustness of the code.
Tang, Haijing; Wang, Siye; Zhang, Yanjun
2013-01-01
Clustering has become a common trend in very long instruction words (VLIW) architecture to solve the problem of area, energy consumption, and design complexity. Register-file-connected clustered (RFCC) VLIW architecture uses the mechanism of global register file to accomplish the inter-cluster data communications, thus eliminating the performance and energy consumption penalty caused by explicit inter-cluster data move operations in traditional bus-connected clustered (BCC) VLIW architecture. However, the limit number of access ports to the global register file has become an issue which must be well addressed; otherwise the performance and energy consumption would be harmed. In this paper, we presented compiler optimization techniques for an RFCC VLIW architecture called Lily, which is designed for encryption systems. These techniques aim at optimizing performance and energy consumption for Lily architecture, through appropriate manipulation of the code generation process to maintain a better management of the accesses to the global register file. All the techniques have been implemented and evaluated. The result shows that our techniques can significantly reduce the penalty of performance and energy consumption due to access port limitation of global register file. PMID:23970841
Applications of Derandomization Theory in Coding
NASA Astrophysics Data System (ADS)
Cheraghchi, Mahdi
2011-07-01
Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. [This is a shortened version of the actual abstract in the thesis.
High Order Entropy-Constrained Residual VQ for Lossless Compression of Images
NASA Technical Reports Server (NTRS)
Kossentini, Faouzi; Smith, Mark J. T.; Scales, Allen
1995-01-01
High order entropy coding is a powerful technique for exploiting high order statistical dependencies. However, the exponentially high complexity associated with such a method often discourages its use. In this paper, an entropy-constrained residual vector quantization method is proposed for lossless compression of images. The method consists of first quantizing the input image using a high order entropy-constrained residual vector quantizer and then coding the residual image using a first order entropy coder. The distortion measure used in the entropy-constrained optimization is essentially the first order entropy of the residual image. Experimental results show very competitive performance.
Design Oriented Structural Modeling for Airplane Conceptual Design Optimization
NASA Technical Reports Server (NTRS)
Livne, Eli
1999-01-01
The main goal for research conducted with the support of this grant was to develop design oriented structural optimization methods for the conceptual design of airplanes. Traditionally in conceptual design airframe weight is estimated based on statistical equations developed over years of fitting airplane weight data in data bases of similar existing air- planes. Utilization of such regression equations for the design of new airplanes can be justified only if the new air-planes use structural technology similar to the technology on the airplanes in those weight data bases. If any new structural technology is to be pursued or any new unconventional configurations designed the statistical weight equations cannot be used. In such cases any structural weight estimation must be based on rigorous "physics based" structural analysis and optimization of the airframes under consideration. Work under this grant progressed to explore airframe design-oriented structural optimization techniques along two lines of research: methods based on "fast" design oriented finite element technology and methods based on equivalent plate / equivalent shell models of airframes, in which the vehicle is modelled as an assembly of plate and shell components, each simulating a lifting surface or nacelle / fuselage pieces. Since response to changes in geometry are essential in conceptual design of airplanes, as well as the capability to optimize the shape itself, research supported by this grant sought to develop efficient techniques for parametrization of airplane shape and sensitivity analysis with respect to shape design variables. Towards the end of the grant period a prototype automated structural analysis code designed to work with the NASA Aircraft Synthesis conceptual design code ACS= was delivered to NASA Ames.
egs_brachy: a versatile and fast Monte Carlo code for brachytherapy
NASA Astrophysics Data System (ADS)
Chamberland, Marc J. P.; Taylor, Randle E. P.; Rogers, D. W. O.; Thomson, Rowan M.
2016-12-01
egs_brachy is a versatile and fast Monte Carlo (MC) code for brachytherapy applications. It is based on the EGSnrc code system, enabling simulation of photons and electrons. Complex geometries are modelled using the EGSnrc C++ class library and egs_brachy includes a library of geometry models for many brachytherapy sources, in addition to eye plaques and applicators. Several simulation efficiency enhancing features are implemented in the code. egs_brachy is benchmarked by comparing TG-43 source parameters of three source models to previously published values. 3D dose distributions calculated with egs_brachy are also compared to ones obtained with the BrachyDose code. Well-defined simulations are used to characterize the effectiveness of many efficiency improving techniques, both as an indication of the usefulness of each technique and to find optimal strategies. Efficiencies and calculation times are characterized through single source simulations and simulations of idealized and typical treatments using various efficiency improving techniques. In general, egs_brachy shows agreement within uncertainties with previously published TG-43 source parameter values. 3D dose distributions from egs_brachy and BrachyDose agree at the sub-percent level. Efficiencies vary with radionuclide and source type, number of sources, phantom media, and voxel size. The combined effects of efficiency-improving techniques in egs_brachy lead to short calculation times: simulations approximating prostate and breast permanent implant (both with (2 mm)3 voxels) and eye plaque (with (1 mm)3 voxels) treatments take between 13 and 39 s, on a single 2.5 GHz Intel Xeon E5-2680 v3 processor core, to achieve 2% average statistical uncertainty on doses within the PTV. egs_brachy will be released as free and open source software to the research community.
egs_brachy: a versatile and fast Monte Carlo code for brachytherapy.
Chamberland, Marc J P; Taylor, Randle E P; Rogers, D W O; Thomson, Rowan M
2016-12-07
egs_brachy is a versatile and fast Monte Carlo (MC) code for brachytherapy applications. It is based on the EGSnrc code system, enabling simulation of photons and electrons. Complex geometries are modelled using the EGSnrc C++ class library and egs_brachy includes a library of geometry models for many brachytherapy sources, in addition to eye plaques and applicators. Several simulation efficiency enhancing features are implemented in the code. egs_brachy is benchmarked by comparing TG-43 source parameters of three source models to previously published values. 3D dose distributions calculated with egs_brachy are also compared to ones obtained with the BrachyDose code. Well-defined simulations are used to characterize the effectiveness of many efficiency improving techniques, both as an indication of the usefulness of each technique and to find optimal strategies. Efficiencies and calculation times are characterized through single source simulations and simulations of idealized and typical treatments using various efficiency improving techniques. In general, egs_brachy shows agreement within uncertainties with previously published TG-43 source parameter values. 3D dose distributions from egs_brachy and BrachyDose agree at the sub-percent level. Efficiencies vary with radionuclide and source type, number of sources, phantom media, and voxel size. The combined effects of efficiency-improving techniques in egs_brachy lead to short calculation times: simulations approximating prostate and breast permanent implant (both with (2 mm) 3 voxels) and eye plaque (with (1 mm) 3 voxels) treatments take between 13 and 39 s, on a single 2.5 GHz Intel Xeon E5-2680 v3 processor core, to achieve 2% average statistical uncertainty on doses within the PTV. egs_brachy will be released as free and open source software to the research community.
Optimal power allocation and joint source-channel coding for wireless DS-CDMA visual sensor networks
NASA Astrophysics Data System (ADS)
Pandremmenou, Katerina; Kondi, Lisimachos P.; Parsopoulos, Konstantinos E.
2011-01-01
In this paper, we propose a scheme for the optimal allocation of power, source coding rate, and channel coding rate for each of the nodes of a wireless Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network. The optimization is quality-driven, i.e. the received quality of the video that is transmitted by the nodes is optimized. The scheme takes into account the fact that the sensor nodes may be imaging scenes with varying levels of motion. Nodes that image low-motion scenes will require a lower source coding rate, so they will be able to allocate a greater portion of the total available bit rate to channel coding. Stronger channel coding will mean that such nodes will be able to transmit at lower power. This will both increase battery life and reduce interference to other nodes. Two optimization criteria are considered. One that minimizes the average video distortion of the nodes and one that minimizes the maximum distortion among the nodes. The transmission powers are allowed to take continuous values, whereas the source and channel coding rates can assume only discrete values. Thus, the resulting optimization problem lies in the field of mixed-integer optimization tasks and is solved using Particle Swarm Optimization. Our experimental results show the importance of considering the characteristics of the video sequences when determining the transmission power, source coding rate and channel coding rate for the nodes of the visual sensor network.
Optimizing ATLAS code with different profilers
NASA Astrophysics Data System (ADS)
Kama, S.; Seuster, R.; Stewart, G. A.; Vitillo, R. A.
2014-06-01
After the current maintenance period, the LHC will provide higher energy collisions with increased luminosity. In order to keep up with these higher rates, ATLAS software needs to speed up substantially. However, ATLAS code is composed of approximately 6M lines, written by many different programmers with different backgrounds, which makes code optimisation a challenge. To help with this effort different profiling tools and techniques are being used. These include well known tools, such as the Valgrind suite and Intel Amplifier; less common tools like Pin, PAPI, and GOoDA; as well as techniques such as library interposing. In this paper we will mainly focus on Pin tools and GOoDA. Pin is a dynamic binary instrumentation tool which can obtain statistics such as call counts, instruction counts and interrogate functions' arguments. It has been used to obtain CLHEP Matrix profiles, operations and vector sizes for linear algebra calculations which has provided the insight necessary to achieve significant performance improvements. Complimenting this, GOoDA, an in-house performance tool built in collaboration with Google, which is based on hardware performance monitoring unit events, is used to identify hot-spots in the code for different types of hardware limitations, such as CPU resources, caches, or memory bandwidth. GOoDA has been used in improvement of the performance of new magnetic field code and identification of potential vectorization targets in several places, such as Runge-Kutta propagation code.
Intel Xeon Phi accelerated Weather Research and Forecasting (WRF) Goddard microphysics scheme
NASA Astrophysics Data System (ADS)
Mielikainen, J.; Huang, B.; Huang, A. H.-L.
2014-12-01
The Weather Research and Forecasting (WRF) model is a numerical weather prediction system designed to serve both atmospheric research and operational forecasting needs. The WRF development is a done in collaboration around the globe. Furthermore, the WRF is used by academic atmospheric scientists, weather forecasters at the operational centers and so on. The WRF contains several physics components. The most time consuming one is the microphysics. One microphysics scheme is the Goddard cloud microphysics scheme. It is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The Goddard microphysics scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Goddard scheme incorporates a large number of improvements. Thus, we have optimized the Goddard scheme code. In this paper, we present our results of optimizing the Goddard microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The Intel MIC is capable of executing a full operating system and entire programs rather than just kernels as the GPU does. The MIC coprocessor supports all important Intel development tools. Thus, the development environment is one familiar to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discussed in this paper. The results show that the optimizations improved performance of Goddard microphysics scheme on Xeon Phi 7120P by a factor of 4.7×. In addition, the optimizations reduced the Goddard microphysics scheme's share of the total WRF processing time from 20.0 to 7.5%. Furthermore, the same optimizations improved performance on Intel Xeon E5-2670 by a factor of 2.8× compared to the original code.
jMetalCpp: optimizing molecular docking problems with a C++ metaheuristic framework.
López-Camacho, Esteban; García Godoy, María Jesús; Nebro, Antonio J; Aldana-Montes, José F
2014-02-01
Molecular docking is a method for structure-based drug design and structural molecular biology, which attempts to predict the position and orientation of a small molecule (ligand) in relation to a protein (receptor) to produce a stable complex with a minimum binding energy. One of the most widely used software packages for this purpose is AutoDock, which incorporates three metaheuristic techniques. We propose the integration of AutoDock with jMetalCpp, an optimization framework, thereby providing both single- and multi-objective algorithms that can be used to effectively solve docking problems. The resulting combination of AutoDock + jMetalCpp allows users of the former to easily use the metaheuristics provided by the latter. In this way, biologists have at their disposal a richer set of optimization techniques than those already provided in AutoDock. Moreover, designers of metaheuristic techniques can use molecular docking for case studies, which can lead to more efficient algorithms oriented to solving the target problems. jMetalCpp software adapted to AutoDock is freely available as a C++ source code at http://khaos.uma.es/AutodockjMetal/.
NASA Astrophysics Data System (ADS)
Jara, Daniel; de Dreuzy, Jean-Raynald; Cochepin, Benoit
2017-12-01
Reactive transport modeling contributes to understand geophysical and geochemical processes in subsurface environments. Operator splitting methods have been proposed as non-intrusive coupling techniques that optimize the use of existing chemistry and transport codes. In this spirit, we propose a coupler relying on external geochemical and transport codes with appropriate operator segmentation that enables possible developments of additional splitting methods. We provide an object-oriented implementation in TReacLab developed in the MATLAB environment in a free open source frame with an accessible repository. TReacLab contains classical coupling methods, template interfaces and calling functions for two classical transport and reactive software (PHREEQC and COMSOL). It is tested on four classical benchmarks with homogeneous and heterogeneous reactions at equilibrium or kinetically-controlled. We show that full decoupling to the implementation level has a cost in terms of accuracy compared to more integrated and optimized codes. Use of non-intrusive implementations like TReacLab are still justified for coupling independent transport and chemical software at a minimal development effort but should be systematically and carefully assessed.
A novel neutron energy spectrum unfolding code using particle swarm optimization
NASA Astrophysics Data System (ADS)
Shahabinejad, H.; Sohrabpour, M.
2017-07-01
A novel neutron Spectrum Deconvolution using Particle Swarm Optimization (SDPSO) code has been developed to unfold the neutron spectrum from a pulse height distribution and a response matrix. The Particle Swarm Optimization (PSO) imitates the bird flocks social behavior to solve complex optimization problems. The results of the SDPSO code have been compared with those of the standard spectra and recently published Two-steps Genetic Algorithm Spectrum Unfolding (TGASU) code. The TGASU code have been previously compared with the other codes such as MAXED, GRAVEL, FERDOR and GAMCD and shown to be more accurate than the previous codes. The results of the SDPSO code have been demonstrated to match well with those of the TGASU code for both under determined and over-determined problems. In addition the SDPSO has been shown to be nearly two times faster than the TGASU code.
A Survey on Multimedia-Based Cross-Layer Optimization in Visual Sensor Networks
Costa, Daniel G.; Guedes, Luiz Affonso
2011-01-01
Visual sensor networks (VSNs) comprised of battery-operated electronic devices endowed with low-resolution cameras have expanded the applicability of a series of monitoring applications. Those types of sensors are interconnected by ad hoc error-prone wireless links, imposing stringent restrictions on available bandwidth, end-to-end delay and packet error rates. In such context, multimedia coding is required for data compression and error-resilience, also ensuring energy preservation over the path(s) toward the sink and improving the end-to-end perceptual quality of the received media. Cross-layer optimization may enhance the expected efficiency of VSNs applications, disrupting the conventional information flow of the protocol layers. When the inner characteristics of the multimedia coding techniques are exploited by cross-layer protocols and architectures, higher efficiency may be obtained in visual sensor networks. This paper surveys recent research on multimedia-based cross-layer optimization, presenting the proposed strategies and mechanisms for transmission rate adjustment, congestion control, multipath selection, energy preservation and error recovery. We note that many multimedia-based cross-layer optimization solutions have been proposed in recent years, each one bringing a wealth of contributions to visual sensor networks. PMID:22163908
Using game theory for perceptual tuned rate control algorithm in video coding
NASA Astrophysics Data System (ADS)
Luo, Jiancong; Ahmad, Ishfaq
2005-03-01
This paper proposes a game theoretical rate control technique for video compression. Using a cooperative gaming approach, which has been utilized in several branches of natural and social sciences because of its enormous potential for solving constrained optimization problems, we propose a dual-level scheme to optimize the perceptual quality while guaranteeing "fairness" in bit allocation among macroblocks. At the frame level, the algorithm allocates target bits to frames based on their coding complexity. At the macroblock level, the algorithm distributes bits to macroblocks by defining a bargaining game. Macroblocks play cooperatively to compete for shares of resources (bits) to optimize their quantization scales while considering the Human Visual System"s perceptual property. Since the whole frame is an entity perceived by viewers, macroblocks compete cooperatively under a global objective of achieving the best quality with the given bit constraint. The major advantage of the proposed approach is that the cooperative game leads to an optimal and fair bit allocation strategy based on the Nash Bargaining Solution. Another advantage is that it allows multi-objective optimization with multiple decision makers (macroblocks). The simulation results testify the algorithm"s ability to achieve accurate bit rate with good perceptual quality, and to maintain a stable buffer level.
Observations Regarding Use of Advanced CFD Analysis, Sensitivity Analysis, and Design Codes in MDO
NASA Technical Reports Server (NTRS)
Newman, Perry A.; Hou, Gene J. W.; Taylor, Arthur C., III
1996-01-01
Observations regarding the use of advanced computational fluid dynamics (CFD) analysis, sensitivity analysis (SA), and design codes in gradient-based multidisciplinary design optimization (MDO) reflect our perception of the interactions required of CFD and our experience in recent aerodynamic design optimization studies using CFD. Sample results from these latter studies are summarized for conventional optimization (analysis - SA codes) and simultaneous analysis and design optimization (design code) using both Euler and Navier-Stokes flow approximations. The amount of computational resources required for aerodynamic design using CFD via analysis - SA codes is greater than that required for design codes. Thus, an MDO formulation that utilizes the more efficient design codes where possible is desired. However, in the aerovehicle MDO problem, the various disciplines that are involved have different design points in the flight envelope; therefore, CFD analysis - SA codes are required at the aerodynamic 'off design' points. The suggested MDO formulation is a hybrid multilevel optimization procedure that consists of both multipoint CFD analysis - SA codes and multipoint CFD design codes that perform suboptimizations.
New Millenium Inflatable Structures Technology
NASA Technical Reports Server (NTRS)
Mollerick, Ralph
1997-01-01
Specific applications where inflatable technology can enable or enhance future space missions are tabulated. The applicability of the inflatable technology to large aperture infra-red astronomy missions is discussed. Space flight validation and risk reduction are emphasized along with the importance of analytical tools in deriving structurally sound concepts and performing optimizations using compatible codes. Deployment dynamics control, fabrication techniques, and system testing are addressed.
NASA Astrophysics Data System (ADS)
Dao, Thanh Hai
2018-01-01
Network coding techniques are seen as the new dimension to improve the network performances thanks to the capability of utilizing network resources more efficiently. Indeed, the application of network coding to the realm of failure recovery in optical networks has been marking a major departure from traditional protection schemes as it could potentially achieve both rapid recovery and capacity improvement, challenging the prevailing wisdom of trading capacity efficiency for speed recovery and vice versa. In this context, the maturing of all-optical XOR technologies appears as a good match to the necessity of a more efficient protection in transparent optical networks. In addressing this opportunity, we propose to use a practical all-optical XOR network coding to leverage the conventional 1 + 1 optical path protection in transparent WDM optical networks. The network coding-assisted protection solution combines protection flows of two demands sharing the same destination node in supportive conditions, paving the way for reducing the backup capacity. A novel mathematical model taking into account the operation of new protection scheme for optimal network designs is formulated as the integer linear programming. Numerical results based on extensive simulations on realistic topologies, COST239 and NSFNET networks, are presented to highlight the benefits of our proposal compared to the conventional approach in terms of wavelength resources efficiency and network throughput.
Resource allocation for error resilient video coding over AWGN using optimization approach.
An, Cheolhong; Nguyen, Truong Q
2008-12-01
The number of slices for error resilient video coding is jointly optimized with 802.11a-like media access control and the physical layers with automatic repeat request and rate compatible punctured convolutional code over additive white gaussian noise channel as well as channel times allocation for time division multiple access. For error resilient video coding, the relation between the number of slices and coding efficiency is analyzed and formulated as a mathematical model. It is applied for the joint optimization problem, and the problem is solved by a convex optimization method such as the primal-dual decomposition method. We compare the performance of a video communication system which uses the optimal number of slices with one that codes a picture as one slice. From numerical examples, end-to-end distortion of utility functions can be significantly reduced with the optimal slices of a picture especially at low signal-to-noise ratio.
Optimization of thermal protection systems for the space shuttle vehicle. Volume 1: Final report
NASA Technical Reports Server (NTRS)
1972-01-01
A study performed to continue development of computational techniques for the Space Shuttle Thermal Protection System is reported. The resulting computer code was used to perform some additional optimization studies on several TPS configurations. The program was developed in Fortran 4 for the CDC 6400, and it was converted to Fortran 5 to be used for the Univac 1108. The computational methodology is developed in modular fashion to facilitate changes and updating of the techniques and to allow overlaying the computer code to fit into approximately 131,000 octal words of core storage. The program logic involves subroutines which handle input and output of information between computer and user, thermodynamic stress, dynamic, and weight/estimate analyses of a variety of panel configurations. These include metallic, ablative, RSI (with and without an underlying phase change material), and a thermodynamic analysis only of carbon-carbon systems applied to the leading edge and flat cover panels. Two different thermodynamic analyses are used. The first is a two-dimensional, explicit precedure with variable time steps which is used to describe the behavior of metallic and carbon-carbon leading edges. The second is a one-dimensional implicity technique used to predict temperature in the charring ablator and the noncharring RSI. The latter analysis is performed simply by suppressing the chemical reactions and pyrolysis of the TPS material.
Optimization of wave rotors for use as gas turbine engine topping cycles
NASA Technical Reports Server (NTRS)
Wilson, Jack; Paxson, Daniel E.
1995-01-01
Use of a wave rotor as a topping cycle for a gas turbine engine can improve specific power and reduce specific fuel consumption. Maximum improvement requires the wave rotor to be optimized for best performance at the mass flow of the engine. The optimization is a trade-off between losses due to friction and passage opening time, and rotational effects. An experimentally validated, one-dimensional CFD code, which includes these effects, has been used to calculate wave rotor performance, and find the optimum configuration. The technique is described, and results given for wave rotors sized for engines with sea level mass flows of 4, 26, and 400 lb/sec.
Investigation of Navier-Stokes Code Verification and Design Optimization
NASA Technical Reports Server (NTRS)
Vaidyanathan, Rajkumar
2004-01-01
With rapid progress made in employing computational techniques for various complex Navier-Stokes fluid flow problems, design optimization problems traditionally based on empirical formulations and experiments are now being addressed with the aid of computational fluid dynamics (CFD). To be able to carry out an effective CFD-based optimization study, it is essential that the uncertainty and appropriate confidence limits of the CFD solutions be quantified over the chosen design space. The present dissertation investigates the issues related to code verification, surrogate model-based optimization and sensitivity evaluation. For Navier-Stokes (NS) CFD code verification a least square extrapolation (LSE) method is assessed. This method projects numerically computed NS solutions from multiple, coarser base grids onto a freer grid and improves solution accuracy by minimizing the residual of the discretized NS equations over the projected grid. In this dissertation, the finite volume (FV) formulation is focused on. The interplay between the xi concepts and the outcome of LSE, and the effects of solution gradients and singularities, nonlinear physics, and coupling of flow variables on the effectiveness of LSE are investigated. A CFD-based design optimization of a single element liquid rocket injector is conducted with surrogate models developed using response surface methodology (RSM) based on CFD solutions. The computational model consists of the NS equations, finite rate chemistry, and the k-6 turbulence closure. With the aid of these surrogate models, sensitivity and trade-off analyses are carried out for the injector design whose geometry (hydrogen flow angle, hydrogen and oxygen flow areas and oxygen post tip thickness) is optimized to attain desirable goals in performance (combustion length) and life/survivability (the maximum temperatures on the oxidizer post tip and injector face and a combustion chamber wall temperature). A preliminary multi-objective optimization study is carried out using a geometric mean approach. Following this, sensitivity analyses with the aid of variance-based non-parametric approach and partial correlation coefficients are conducted using data available from surrogate models of the objectives and the multi-objective optima to identify the contribution of the design variables to the objective variability and to analyze the variability of the design variables and the objectives. In summary the present dissertation offers insight into an improved coarse to fine grid extrapolation technique for Navier-Stokes computations and also suggests tools for a designer to conduct design optimization study and related sensitivity analyses for a given design problem.
A study of transonic aerodynamic analysis methods for use with a hypersonic aircraft synthesis code
NASA Technical Reports Server (NTRS)
Sandlin, Doral R.; Davis, Paul Christopher
1992-01-01
A means of performing routine transonic lift, drag, and moment analyses on hypersonic all-body and wing-body configurations were studied. The analysis method is to be used in conjunction with the Hypersonic Vehicle Optimization Code (HAVOC). A review of existing techniques is presented, after which three methods, chosen to represent a spectrum of capabilities, are tested and the results are compared with experimental data. The three methods consist of a wave drag code, a full potential code, and a Navier-Stokes code. The wave drag code, representing the empirical approach, has very fast CPU times, but very limited and sporadic results. The full potential code provides results which compare favorably to the wind tunnel data, but with a dramatic increase in computational time. Even more extreme is the Navier-Stokes code, which provides the most favorable and complete results, but with a very large turnaround time. The full potential code, TRANAIR, is used for additional analyses, because of the superior results it can provide over empirical and semi-empirical methods, and because of its automated grid generation. TRANAIR analyses include an all body hypersonic cruise configuration and an oblique flying wing supersonic transport.
GENIE - Generation of computational geometry-grids for internal-external flow configurations
NASA Technical Reports Server (NTRS)
Soni, B. K.
1988-01-01
Progress realized in the development of a master geometry-grid generation code GENIE is presented. The grid refinement process is enhanced by developing strategies to utilize bezier curves/surfaces and splines along with weighted transfinite interpolation technique and by formulating new forcing function for the elliptic solver based on the minimization of a non-orthogonality functional. A two step grid adaptation procedure is developed by optimally blending adaptive weightings with weighted transfinite interpolation technique. Examples of 2D-3D grids are provided to illustrate the success of these methods.
NASA Technical Reports Server (NTRS)
Bauer, Brent
1993-01-01
This paper discusses the development of a FORTRAN computer code to perform agility analysis on aircraft configurations. This code is to be part of the NASA-Ames ACSYNT (AirCraft SYNThesis) design code. This paper begins with a discussion of contemporary agility research in the aircraft industry and a survey of a few agility metrics. The methodology, techniques and models developed for the code are then presented. Finally, example trade studies using the agility module along with ACSYNT are illustrated. These trade studies were conducted using a Northrop F-20 Tigershark aircraft model. The studies show that the agility module is effective in analyzing the influence of common parameters such as thrust-to-weight ratio and wing loading on agility criteria. The module can compare the agility potential between different configurations. In addition, one study illustrates the module's ability to optimize a configuration's agility performance.
Adaptive bit plane quadtree-based block truncation coding for image compression
NASA Astrophysics Data System (ADS)
Li, Shenda; Wang, Jin; Zhu, Qing
2018-04-01
Block truncation coding (BTC) is a fast image compression technique applied in spatial domain. Traditional BTC and its variants mainly focus on reducing computational complexity for low bit rate compression, at the cost of lower quality of decoded images, especially for images with rich texture. To solve this problem, in this paper, a quadtree-based block truncation coding algorithm combined with adaptive bit plane transmission is proposed. First, the direction of edge in each block is detected using Sobel operator. For the block with minimal size, adaptive bit plane is utilized to optimize the BTC, which depends on its MSE loss encoded by absolute moment block truncation coding (AMBTC). Extensive experimental results show that our method gains 0.85 dB PSNR on average compare to some other state-of-the-art BTC variants. So it is desirable for real time image compression applications.
NASA Astrophysics Data System (ADS)
Palma, V.; Carli, M.; Neri, A.
2011-02-01
In this paper a Multi-view Distributed Video Coding scheme for mobile applications is presented. Specifically a new fusion technique between temporal and spatial side information in Zernike Moments domain is proposed. Distributed video coding introduces a flexible architecture that enables the design of very low complex video encoders compared to its traditional counterparts. The main goal of our work is to generate at the decoder the side information that optimally blends temporal and interview data. Multi-view distributed coding performance strongly depends on the side information quality built at the decoder. At this aim for improving its quality a spatial view compensation/prediction in Zernike moments domain is applied. Spatial and temporal motion activity have been fused together to obtain the overall side-information. The proposed method has been evaluated by rate-distortion performances for different inter-view and temporal estimation quality conditions.
Particle-in-Cell laser-plasma simulation on Xeon Phi coprocessors
NASA Astrophysics Data System (ADS)
Surmin, I. A.; Bastrakov, S. I.; Efimenko, E. S.; Gonoskov, A. A.; Korzhimanov, A. V.; Meyerov, I. B.
2016-05-01
This paper concerns the development of a high-performance implementation of the Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors. We discuss the suitability of the method for Xeon Phi architecture and present our experience in the porting and optimization of the existing parallel Particle-in-Cell code PICADOR. Direct porting without code modification gives performance on Xeon Phi close to that of an 8-core CPU on a benchmark problem with 50 particles per cell. We demonstrate step-by-step optimization techniques, such as improving data locality, enhancing parallelization efficiency and vectorization leading to an overall 4.2 × speedup on CPU and 7.5 × on Xeon Phi compared to the baseline version. The optimized version achieves 16.9 ns per particle update on an Intel Xeon E5-2660 CPU and 9.3 ns per particle update on an Intel Xeon Phi 5110P. For a real problem of laser ion acceleration in targets with surface grating, where a large number of macroparticles per cell is required, the speedup of Xeon Phi compared to CPU is 1.6 ×.
Nonlinear dynamic macromodeling techniques for audio systems
NASA Astrophysics Data System (ADS)
Ogrodzki, Jan; Bieńkowski, Piotr
2015-09-01
This paper develops a modelling method and a models identification technique for the nonlinear dynamic audio systems. Identification is performed by means of a behavioral approach based on a polynomial approximation. This approach makes use of Discrete Fourier Transform and Harmonic Balance Method. A model of an audio system is first created and identified and then it is simulated in real time using an algorithm of low computational complexity. The algorithm consists in real time emulation of the system response rather than in simulation of the system itself. The proposed software is written in Python language using object oriented programming techniques. The code is optimized for a multithreads environment.
Program optimizations: The interplay between power, performance, and energy
Leon, Edgar A.; Karlin, Ian; Grant, Ryan E.; ...
2016-05-16
Practical considerations for future supercomputer designs will impose limits on both instantaneous power consumption and total energy consumption. Working within these constraints while providing the maximum possible performance, application developers will need to optimize their code for speed alongside power and energy concerns. This paper analyzes the effectiveness of several code optimizations including loop fusion, data structure transformations, and global allocations. A per component measurement and analysis of different architectures is performed, enabling the examination of code optimizations on different compute subsystems. Using an explicit hydrodynamics proxy application from the U.S. Department of Energy, LULESH, we show how code optimizationsmore » impact different computational phases of the simulation. This provides insight for simulation developers into the best optimizations to use during particular simulation compute phases when optimizing code for future supercomputing platforms. Here, we examine and contrast both x86 and Blue Gene architectures with respect to these optimizations.« less
Multifidelity Analysis and Optimization for Supersonic Design
NASA Technical Reports Server (NTRS)
Kroo, Ilan; Willcox, Karen; March, Andrew; Haas, Alex; Rajnarayan, Dev; Kays, Cory
2010-01-01
Supersonic aircraft design is a computationally expensive optimization problem and multifidelity approaches over a significant opportunity to reduce design time and computational cost. This report presents tools developed to improve supersonic aircraft design capabilities including: aerodynamic tools for supersonic aircraft configurations; a systematic way to manage model uncertainty; and multifidelity model management concepts that incorporate uncertainty. The aerodynamic analysis tools developed are appropriate for use in a multifidelity optimization framework, and include four analysis routines to estimate the lift and drag of a supersonic airfoil, a multifidelity supersonic drag code that estimates the drag of aircraft configurations with three different methods: an area rule method, a panel method, and an Euler solver. In addition, five multifidelity optimization methods are developed, which include local and global methods as well as gradient-based and gradient-free techniques.
Blade pitch optimization methods for vertical-axis wind turbines
NASA Astrophysics Data System (ADS)
Kozak, Peter
Vertical-axis wind turbines (VAWTs) offer an inherently simpler design than horizontal-axis machines, while their lower blade speed mitigates safety and noise concerns, potentially allowing for installation closer to populated and ecologically sensitive areas. While VAWTs do offer significant operational advantages, development has been hampered by the difficulty of modeling the aerodynamics involved, further complicated by their rotating geometry. This thesis presents results from a simulation of a baseline VAWT computed using Star-CCM+, a commercial finite-volume (FVM) code. VAWT aerodynamics are shown to be dominated at low tip-speed ratios by dynamic stall phenomena and at high tip-speed ratios by wake-blade interactions. Several optimization techniques have been developed for the adjustment of blade pitch based on finite-volume simulations and streamtube models. The effectiveness of the optimization procedure is evaluated and the basic architecture for a feedback control system is proposed. Implementation of variable blade pitch is shown to increase a baseline turbine's power output between 40%-100%, depending on the optimization technique, improving the turbine's competitiveness when compared with a commercially-available horizontal-axis turbine.
Phenotypic Graphs and Evolution Unfold the Standard Genetic Code as the Optimal
NASA Astrophysics Data System (ADS)
Zamudio, Gabriel S.; José, Marco V.
2018-03-01
In this work, we explicitly consider the evolution of the Standard Genetic Code (SGC) by assuming two evolutionary stages, to wit, the primeval RNY code and two intermediate codes in between. We used network theory and graph theory to measure the connectivity of each phenotypic graph. The connectivity values are compared to the values of the codes under different randomization scenarios. An error-correcting optimal code is one in which the algebraic connectivity is minimized. We show that the SGC is optimal in regard to its robustness and error-tolerance when compared to all random codes under different assumptions.
Hamdan, Sadeque; Cheaitou, Ali
2017-08-01
This data article provides detailed optimization input and output datasets and optimization code for the published research work titled "Dynamic green supplier selection and order allocation with quantity discounts and varying supplier availability" (Hamdan and Cheaitou, 2017, In press) [1]. Researchers may use these datasets as a baseline for future comparison and extensive analysis of the green supplier selection and order allocation problem with all-unit quantity discount and varying number of suppliers. More particularly, the datasets presented in this article allow researchers to generate the exact optimization outputs obtained by the authors of Hamdan and Cheaitou (2017, In press) [1] using the provided optimization code and then to use them for comparison with the outputs of other techniques or methodologies such as heuristic approaches. Moreover, this article includes the randomly generated optimization input data and the related outputs that are used as input data for the statistical analysis presented in Hamdan and Cheaitou (2017 In press) [1] in which two different approaches for ranking potential suppliers are compared. This article also provides the time analysis data used in (Hamdan and Cheaitou (2017, In press) [1] to study the effect of the problem size on the computation time as well as an additional time analysis dataset. The input data for the time study are generated randomly, in which the problem size is changed, and then are used by the optimization problem to obtain the corresponding optimal outputs as well as the corresponding computation time.
OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms
Meng, Zhaoyi; Koniges, Alice; He, Yun Helen; ...
2016-09-21
In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms in serial mode. The methods leverage the Nystrom extension to calculate eigenvalue/eigenvectors of the graph Laplacian and this is a self-contained module that can be used in conjunction with other graph-Laplacian based methods such as spectral clustering. We use performance tools to collect the hotspots and memory access of the serial codes and use OpenMP as the parallelization language to parallelizemore » the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel’s Knights Corner and Landing processors. We show both performance improvement and strong scaling behavior. Finally, a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.« less
Comparison of stochastic optimization methods for all-atom folding of the Trp-Cage protein.
Schug, Alexander; Herges, Thomas; Verma, Abhinav; Lee, Kyu Hwan; Wenzel, Wolfgang
2005-12-09
The performances of three different stochastic optimization methods for all-atom protein structure prediction are investigated and compared. We use the recently developed all-atom free-energy force field (PFF01), which was demonstrated to correctly predict the native conformation of several proteins as the global optimum of the free energy surface. The trp-cage protein (PDB-code 1L2Y) is folded with the stochastic tunneling method, a modified parallel tempering method, and the basin-hopping technique. All the methods correctly identify the native conformation, and their relative efficiency is discussed.
Optimization of a Boiling Water Reactor Loading Pattern Using an Improved Genetic Algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kobayashi, Yoko; Aiyoshi, Eitaro
2003-08-15
A search method based on genetic algorithms (GA) using deterministic operators has been developed to generate optimized boiling water reactor (BWR) loading patterns (LPs). The search method uses an Improved GA operator, that is, crossover, mutation, and selection. The handling of the encoding technique and constraint conditions is designed so that the GA reflects the peculiar characteristics of the BWR. In addition, some strategies such as elitism and self-reproduction are effectively used to improve the search speed. LP evaluations were performed with a three-dimensional diffusion code that coupled neutronic and thermal-hydraulic models. Strong axial heterogeneities and three-dimensional-dependent constraints have alwaysmore » necessitated the use of three-dimensional core simulators for BWRs, so that an optimization method is required for computational efficiency. The proposed algorithm is demonstrated by successfully generating LPs for an actual BWR plant applying the Haling technique. In test calculations, candidates that shuffled fresh and burned fuel assemblies within a reasonable computation time were obtained.« less
Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael
2015-04-08
The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on themore » performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.« less
Current Practice of Therapeutic Mammaplasty: A Survey of Oncoplastic Breast Surgeons in England
Aggarwal, Shweta; Marla, Sekhar; Nyanhongo, Donald; Kotecha, Sita; Basu, Narendra Nath
2016-01-01
Introduction. Therapeutic mammaplasty (TM) is a useful technique in the armamentarium of the oncoplastic breast surgeon (OBS). There is limited guidance on patient selection, technique, coding, and management of involved margins. The practices of OBS in England remain unknown. Methods. Questionnaires were sent to all OBS involved with the Training Interface Group. We assessed the number of TM cases performed per surgeon, criteria for patient selection, pedicle preference, contralateral symmetrisation, use of routine preoperative MRI, management of involved margins, and clinical coding. Results. We had an overall response rate of 43%. The most common skin resection technique utilised was wise pattern followed by vertical scar. Superior-medial pedicle was preferred by the majority of surgeons (62%) followed by inferior pedicle (34%). Twenty percent of surgeons would always proceed to a mastectomy following an involved margin, whereas the majority would offer reexcision based on several parameters. The main absolute contraindication to TM was tumour to breast ratio >50%. One in five surgeons would not perform TM in smokers and patients with multifocal disease. Discussion. There is a wide variation in the practice of TM amongst OBS. Further research and guidance would be useful to standardise practice, particularly management of involved margins and coding for optimal reimbursement. PMID:27110398
An optimization program based on the method of feasible directions: Theory and users guide
NASA Technical Reports Server (NTRS)
Belegundu, Ashok D.; Berke, Laszlo; Patnaik, Surya N.
1994-01-01
The theory and user instructions for an optimization code based on the method of feasible directions are presented. The code was written for wide distribution and ease of attachment to other simulation software. Although the theory of the method of feasible direction was developed in the 1960's, many considerations are involved in its actual implementation as a computer code. Included in the code are a number of features to improve robustness in optimization. The search direction is obtained by solving a quadratic program using an interior method based on Karmarkar's algorithm. The theory is discussed focusing on the important and often overlooked role played by the various parameters guiding the iterations within the program. Also discussed is a robust approach for handling infeasible starting points. The code was validated by solving a variety of structural optimization test problems that have known solutions obtained by other optimization codes. It has been observed that this code is robust: it has solved a variety of problems from different starting points. However, the code is inefficient in that it takes considerable CPU time as compared with certain other available codes. Further work is required to improve its efficiency while retaining its robustness.
Fuel management optimization using genetic algorithms and code independence
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeChaine, M.D.; Feltus, M.A.
1994-12-31
Fuel management optimization is a hard problem for traditional optimization techniques. Loading pattern optimization is a large combinatorial problem without analytical derivative information. Therefore, methods designed for continuous functions, such as linear programming, do not always work well. Genetic algorithms (GAs) address these problems and, therefore, appear ideal for fuel management optimization. They do not require derivative information and work well with combinatorial. functions. The GAs are a stochastic method based on concepts from biological genetics. They take a group of candidate solutions, called the population, and use selection, crossover, and mutation operators to create the next generation of bettermore » solutions. The selection operator is a {open_quotes}survival-of-the-fittest{close_quotes} operation and chooses the solutions for the next generation. The crossover operator is analogous to biological mating, where children inherit a mixture of traits from their parents, and the mutation operator makes small random changes to the solutions.« less
NASA Astrophysics Data System (ADS)
Reis, C.; Clain, S.; Figueiredo, J.; Baptista, M. A.; Miranda, J. M. A.
2015-12-01
Numerical tools turn to be very important for scenario evaluations of hazardous phenomena such as tsunami. Nevertheless, the predictions highly depends on the numerical tool quality and the design of efficient numerical schemes still receives important attention to provide robust and accurate solutions. In this study we propose a comparative study between the efficiency of two volume finite numerical codes with second-order discretization implemented with different method to solve the non-conservative shallow water equations, the MUSCL (Monotonic Upstream-Centered Scheme for Conservation Laws) and the MOOD methods (Multi-dimensional Optimal Order Detection) which optimize the accuracy of the approximation in function of the solution local smoothness. The MUSCL is based on a priori criteria where the limiting procedure is performed before updated the solution to the next time-step leading to non-necessary accuracy reduction. On the contrary, the new MOOD technique uses a posteriori detectors to prevent the solution from oscillating in the vicinity of the discontinuities. Indeed, a candidate solution is computed and corrections are performed only for the cells where non-physical oscillations are detected. Using a simple one-dimensional analytical benchmark, 'Single wave on a sloping beach', we show that the classical 1D shallow-water system can be accurately solved with the finite volume method equipped with the MOOD technique and provide better approximation with sharper shock and less numerical diffusion. For the code validation, we also use the Tohoku-Oki 2011 tsunami and reproduce two DART records, demonstrating that the quality of the solution may deeply interfere with the scenario one can assess. This work is funded by the Portugal-France research agreement, through the research project GEONUM FCT-ANR/MAT-NAN/0122/2012.Numerical tools turn to be very important for scenario evaluations of hazardous phenomena such as tsunami. Nevertheless, the predictions highly depends on the numerical tool quality and the design of efficient numerical schemes still receives important attention to provide robust and accurate solutions. In this study we propose a comparative study between the efficiency of two volume finite numerical codes with second-order discretization implemented with different method to solve the non-conservative shallow water equations, the MUSCL (Monotonic Upstream-Centered Scheme for Conservation Laws) and the MOOD methods (Multi-dimensional Optimal Order Detection) which optimize the accuracy of the approximation in function of the solution local smoothness. The MUSCL is based on a priori criteria where the limiting procedure is performed before updated the solution to the next time-step leading to non-necessary accuracy reduction. On the contrary, the new MOOD technique uses a posteriori detectors to prevent the solution from oscillating in the vicinity of the discontinuities. Indeed, a candidate solution is computed and corrections are performed only for the cells where non-physical oscillations are detected. Using a simple one-dimensional analytical benchmark, 'Single wave on a sloping beach', we show that the classical 1D shallow-water system can be accurately solved with the finite volume method equipped with the MOOD technique and provide better approximation with sharper shock and less numerical diffusion. For the code validation, we also use the Tohoku-Oki 2011 tsunami and reproduce two DART records, demonstrating that the quality of the solution may deeply interfere with the scenario one can assess. This work is funded by the Portugal-France research agreement, through the research project GEONUM FCT-ANR/MAT-NAN/0122/2012.
Optimal design of geodesically stiffened composite cylindrical shells
NASA Technical Reports Server (NTRS)
Gendron, G.; Guerdal, Z.
1992-01-01
An optimization system based on the finite element code Computations Structural Mechanics (CSM) Testbed and the optimization program, Automated Design Synthesis (ADS), is described. The optimization system can be used to obtain minimum-weight designs of composite stiffened structures. Ply thickness, ply orientations, and stiffener heights can be used as design variables. Buckling, displacement, and material failure constraints can be imposed on the design. The system is used to conduct a design study of geodesically stiffened shells. For comparison purposes, optimal designs of unstiffened shells and shells stiffened by rings and stingers are also obtained. Trends in the design of geodesically stiffened shells are identified. An approach to include local stress concentrations during the design optimization process is then presented. The method is based on a global/local analysis technique. It employs spline interpolation functions to determine displacements and rotations from a global model which are used as 'boundary conditions' for the local model. The organization of the strategy in the context of an optimization process is described. The method is validated with an example.
The effect of code expanding optimizations on instruction cache design
NASA Technical Reports Server (NTRS)
Chen, William Y.; Chang, Pohua P.; Conte, Thomas M.; Hwu, Wen-Mei W.
1991-01-01
It is shown that code expanding optimizations have strong and non-intuitive implications on instruction cache design. Three types of code expanding optimizations are studied: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance for small cache sizes, but degrades the performance of medium caches. Superscalar optimizations increases the cache size required for a given miss ratio. On the other hand, they also increase the sequentiality of instruction access so that a simple load-forward scheme effectively cancels the negative effects. Overall, it is shown that with load forwarding, the three types of code expanding optimizations jointly improve the performance of small caches and have little effect on large caches.
NASA Astrophysics Data System (ADS)
De Geyter, G.; Baes, M.; Fritz, J.; Camps, P.
2013-02-01
We present FitSKIRT, a method to efficiently fit radiative transfer models to UV/optical images of dusty galaxies. These images have the advantage that they have better spatial resolution compared to FIR/submm data. FitSKIRT uses the GAlib genetic algorithm library to optimize the output of the SKIRT Monte Carlo radiative transfer code. Genetic algorithms prove to be a valuable tool in handling the multi- dimensional search space as well as the noise induced by the random nature of the Monte Carlo radiative transfer code. FitSKIRT is tested on artificial images of a simulated edge-on spiral galaxy, where we gradually increase the number of fitted parameters. We find that we can recover all model parameters, even if all 11 model parameters are left unconstrained. Finally, we apply the FitSKIRT code to a V-band image of the edge-on spiral galaxy NGC 4013. This galaxy has been modeled previously by other authors using different combinations of radiative transfer codes and optimization methods. Given the different models and techniques and the complexity and degeneracies in the parameter space, we find reasonable agreement between the different models. We conclude that the FitSKIRT method allows comparison between different models and geometries in a quantitative manner and minimizes the need of human intervention and biasing. The high level of automation makes it an ideal tool to use on larger sets of observed data.
Automatic differentiation evaluated as a tool for rotorcraft design and optimization
NASA Technical Reports Server (NTRS)
Walsh, Joanne L.; Young, Katherine C.
1995-01-01
This paper investigates the use of automatic differentiation (AD) as a means for generating sensitivity analyses in rotorcraft design and optimization. This technique transforms an existing computer program into a new program that performs sensitivity analysis in addition to the original analysis. The original FORTRAN program calculates a set of dependent (output) variables from a set of independent (input) variables, the new FORTRAN program calculates the partial derivatives of the dependent variables with respect to the independent variables. The AD technique is a systematic implementation of the chain rule of differentiation, this method produces derivatives to machine accuracy at a cost that is comparable with that of finite-differencing methods. For this study, an analysis code that consists of the Langley-developed hover analysis HOVT, the comprehensive rotor analysis CAMRAD/JA, and associated preprocessors is processed through the AD preprocessor ADIFOR 2.0. The resulting derivatives are compared with derivatives obtained from finite-differencing techniques. The derivatives obtained with ADIFOR 2.0 are exact within machine accuracy and do not depend on the selection of step-size, as are the derivatives obtained with finite-differencing techniques.
Variable Complexity Structural Optimization of Shells
NASA Technical Reports Server (NTRS)
Haftka, Raphael T.; Venkataraman, Satchi
1999-01-01
Structural designers today face both opportunities and challenges in a vast array of available analysis and optimization programs. Some programs such as NASTRAN, are very general, permitting the designer to model any structure, to any degree of accuracy, but often at a higher computational cost. Additionally, such general procedures often do not allow easy implementation of all constraints of interest to the designer. Other programs, based on algebraic expressions used by designers one generation ago, have limited applicability for general structures with modem materials. However, when applicable, they provide easy understanding of design decisions trade-off. Finally, designers can also use specialized programs suitable for designing efficiently a subset of structural problems. For example, PASCO and PANDA2 are panel design codes, which calculate response and estimate failure much more efficiently than general-purpose codes, but are narrowly applicable in terms of geometry and loading. Therefore, the problem of optimizing structures based on simultaneous use of several models and computer programs is a subject of considerable interest. The problem of using several levels of models in optimization has been dubbed variable complexity modeling. Work under NASA grant NAG1-2110 has been concerned with the development of variable complexity modeling strategies with special emphasis on response surface techniques. In addition, several modeling issues for the design of shells of revolution were studied.
Variable Complexity Structural Optimization of Shells
NASA Technical Reports Server (NTRS)
Haftka, Raphael T.; Venkataraman, Satchi
1998-01-01
Structural designers today face both opportunities and challenges in a vast array of available analysis and optimization programs. Some programs such as NASTRAN, are very general, permitting the designer to model any structure, to any degree of accuracy, but often at a higher computational cost. Additionally, such general procedures often do not allow easy implementation of all constraints of interest to the designer. Other programs, based on algebraic expressions used by designers one generation ago, have limited applicability for general structures with modem materials. However, when applicable, they provide easy understanding of design decisions trade-off. Finally, designers can also use specialized programs suitable for designing efficiently a subset of structural problems. For example, PASCO and PANDA2 are panel design codes, which calculate response and estimate failure much more efficiently than general-purpose codes, but are narrowly applicable in terms of geometry and loading. Therefore, the problem of optimizing structures based on simultaneous use of several models and computer programs is a subject of considerable interest. The problem of using several levels of models in optimization has been dubbed variable complexity modeling. Work under NASA grant NAG1-1808 has been concerned with the development of variable complexity modeling strategies with special emphasis on response surface techniques. In addition several modeling issues for the design of shells of revolution were studied.
Concatenated Coding Using Trellis-Coded Modulation
NASA Technical Reports Server (NTRS)
Thompson, Michael W.
1997-01-01
In the late seventies and early eighties a technique known as Trellis Coded Modulation (TCM) was developed for providing spectrally efficient error correction coding. Instead of adding redundant information in the form of parity bits, redundancy is added at the modulation stage thereby increasing bandwidth efficiency. A digital communications system can be designed to use bandwidth-efficient multilevel/phase modulation such as Amplitude Shift Keying (ASK), Phase Shift Keying (PSK), Differential Phase Shift Keying (DPSK) or Quadrature Amplitude Modulation (QAM). Performance gain can be achieved by increasing the number of signals over the corresponding uncoded system to compensate for the redundancy introduced by the code. A considerable amount of research and development has been devoted toward developing good TCM codes for severely bandlimited applications. More recently, the use of TCM for satellite and deep space communications applications has received increased attention. This report describes the general approach of using a concatenated coding scheme that features TCM and RS coding. Results have indicated that substantial (6-10 dB) performance gains can be achieved with this approach with comparatively little bandwidth expansion. Since all of the bandwidth expansion is due to the RS code we see that TCM based concatenated coding results in roughly 10-50% bandwidth expansion compared to 70-150% expansion for similar concatenated scheme which use convolution code. We stress that combined coding and modulation optimization is important for achieving performance gains while maintaining spectral efficiency.
Formulation for Simultaneous Aerodynamic Analysis and Design Optimization
NASA Technical Reports Server (NTRS)
Hou, G. W.; Taylor, A. C., III; Mani, S. V.; Newman, P. A.
1993-01-01
An efficient approach for simultaneous aerodynamic analysis and design optimization is presented. This approach does not require the performance of many flow analyses at each design optimization step, which can be an expensive procedure. Thus, this approach brings us one step closer to meeting the challenge of incorporating computational fluid dynamic codes into gradient-based optimization techniques for aerodynamic design. An adjoint-variable method is introduced to nullify the effect of the increased number of design variables in the problem formulation. The method has been successfully tested on one-dimensional nozzle flow problems, including a sample problem with a normal shock. Implementations of the above algorithm are also presented that incorporate Newton iterations to secure a high-quality flow solution at the end of the design process. Implementations with iterative flow solvers are possible and will be required for large, multidimensional flow problems.
HERCULES: A Pattern Driven Code Transformation System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kartsaklis, Christos; Hernandez, Oscar R; Hsu, Chung-Hsing
2012-01-01
New parallel computers are emerging, but developing efficient scientific code for them remains difficult. A scientist must manage not only the science-domain complexity but also the performance-optimization complexity. HERCULES is a code transformation system designed to help the scientist to separate the two concerns, which improves code maintenance, and facilitates performance optimization. The system combines three technologies, code patterns, transformation scripts and compiler plugins, to provide the scientist with an environment to quickly implement code transformations that suit his needs. Unlike existing code optimization tools, HERCULES is unique in its focus on user-level accessibility. In this paper we discuss themore » design, implementation and an initial evaluation of HERCULES.« less
Optimization of Boiling Water Reactor Loading Pattern Using Two-Stage Genetic Algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kobayashi, Yoko; Aiyoshi, Eitaro
2002-10-15
A new two-stage optimization method based on genetic algorithms (GAs) using an if-then heuristic rule was developed to generate optimized boiling water reactor (BWR) loading patterns (LPs). In the first stage, the LP is optimized using an improved GA operator. In the second stage, an exposure-dependent control rod pattern (CRP) is sought using GA with an if-then heuristic rule. The procedure of the improved GA is based on deterministic operators that consist of crossover, mutation, and selection. The handling of the encoding technique and constraint conditions by that GA reflects the peculiar characteristics of the BWR. In addition, strategies suchmore » as elitism and self-reproduction are effectively used in order to improve the search speed. The LP evaluations were performed with a three-dimensional diffusion code that coupled neutronic and thermal-hydraulic models. Strong axial heterogeneities and constraints dependent on three dimensions have always necessitated the use of three-dimensional core simulators for BWRs, so that optimization of computational efficiency is required. The proposed algorithm is demonstrated by successfully generating LPs for an actual BWR plant in two phases. One phase is only LP optimization applying the Haling technique. The other phase is an LP optimization that considers the CRP during reactor operation. In test calculations, candidates that shuffled fresh and burned fuel assemblies within a reasonable computation time were obtained.« less
Fundamental Limits of Delay and Security in Device-to-Device Communication
2013-01-01
systematic MDS (maximum distance separable) codes and random binning strategies that achieve a Pareto optimal delayreconstruction tradeoff. The erasure MD...file, and a coding scheme based on erasure compression and Slepian-Wolf binning is presented. The coding scheme is shown to provide a Pareto optimal...ble) codes and random binning strategies that achieve a Pareto optimal delay- reconstruction tradeoff. The erasure MD setup is then used to propose a
Runtime Speculative Software-Only Fault Tolerance
2012-06-01
reliability of RSFT, a in-depth analysis on its window of vulnerability is also discussed and measured via simulated fault injection. The performance...propagation of faults through the entire program. For optimal performance, these techniques have to use herotic alias analysis to find the minimum set of...affect program output. No program source code or alias analysis is needed to analyze the fault propagation ahead of time. 2.3 Limitations of Existing
A Degree Distribution Optimization Algorithm for Image Transmission
NASA Astrophysics Data System (ADS)
Jiang, Wei; Yang, Junjie
2016-09-01
Luby Transform (LT) code is the first practical implementation of digital fountain code. The coding behavior of LT code is mainly decided by the degree distribution which determines the relationship between source data and codewords. Two degree distributions are suggested by Luby. They work well in typical situations but not optimally in case of finite encoding symbols. In this work, the degree distribution optimization algorithm is proposed to explore the potential of LT code. Firstly selection scheme of sparse degrees for LT codes is introduced. Then probability distribution is optimized according to the selected degrees. In image transmission, bit stream is sensitive to the channel noise and even a single bit error may cause the loss of synchronization between the encoder and the decoder. Therefore the proposed algorithm is designed for image transmission situation. Moreover, optimal class partition is studied for image transmission with unequal error protection. The experimental results are quite promising. Compared with LT code with robust soliton distribution, the proposed algorithm improves the final quality of recovered images obviously with the same overhead.
Conjunctive coding in an evolved spiking model of retrosplenial cortex.
Rounds, Emily L; Alexander, Andrew S; Nitz, Douglas A; Krichmar, Jeffrey L
2018-06-04
Retrosplenial cortex (RSC) is an association cortex supporting spatial navigation and memory. However, critical issues remain concerning the forms by which its ensemble spiking patterns register spatial relationships that are difficult for experimental techniques to fully address. We therefore applied an evolutionary algorithmic optimization technique to create spiking neural network models that matched electrophysiologically observed spiking dynamics in rat RSC neuronal ensembles. Virtual experiments conducted on the evolved networks revealed a mixed selectivity coding capability that was not built into the optimization method, but instead emerged as a consequence of replicating biological firing patterns. The experiments reveal several important outcomes of mixed selectivity that may subserve flexible navigation and spatial representation: (a) robustness to loss of specific inputs, (b) immediate and stable encoding of novel routes and route locations, (c) automatic resolution of input variable conflicts, and (d) dynamic coding that allows rapid adaptation to changing task demands without retraining. These findings suggest that biological retrosplenial cortex can generate unique, first-trial, conjunctive encodings of spatial positions and actions that can be used by downstream brain regions for navigation and path integration. Moreover, these results are consistent with the proposed role for the RSC in the transformation of representations between reference frames and navigation strategy deployment. Finally, the specific modeling framework used for evolving synthetic retrosplenial networks represents an important advance for computational modeling by which synthetic neural networks can encapsulate, describe, and predict the behavior of neural circuits at multiple levels of function. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Optimized Hyper Beamforming of Linear Antenna Arrays Using Collective Animal Behaviour
Ram, Gopi; Mandal, Durbadal; Kar, Rajib; Ghoshal, Sakti Prasad
2013-01-01
A novel optimization technique which is developed on mimicking the collective animal behaviour (CAB) is applied for the optimal design of hyper beamforming of linear antenna arrays. Hyper beamforming is based on sum and difference beam patterns of the array, each raised to the power of a hyperbeam exponent parameter. The optimized hyperbeam is achieved by optimization of current excitation weights and uniform interelement spacing. As compared to conventional hyper beamforming of linear antenna array, real coded genetic algorithm (RGA), particle swarm optimization (PSO), and differential evolution (DE) applied to the hyper beam of the same array can achieve reduction in sidelobe level (SLL) and same or less first null beam width (FNBW), keeping the same value of hyperbeam exponent. Again, further reductions of sidelobe level (SLL) and first null beam width (FNBW) have been achieved by the proposed collective animal behaviour (CAB) algorithm. CAB finds near global optimal solution unlike RGA, PSO, and DE in the present problem. The above comparative optimization is illustrated through 10-, 14-, and 20-element linear antenna arrays to establish the optimization efficacy of CAB. PMID:23970843
NASA Astrophysics Data System (ADS)
Korneta, Wojciech; Gomes, Iacyel
2017-11-01
Traditional bistable sensors use external bias signal to drive its response between states and their detection strategy is based on the output power spectral density or the residence time difference (RTD) in two sensor states. Recently, the noise activated nonlinear dynamic sensors driven only by noise based on RTD technique have been proposed. Here, we present experimental results of dc voltage measurements by noise-driven bistable sensor based on electronic Chua's circuit operating in a chaotic regime where two single scroll attractors coexist. The output of the sensor is quantified by the proportion of the time the sensor stays in one state to the total observation time and by the spike-count rate with spikes defined by crossings between attractors. The relationship between the stimuli and particular observable for different noise intensities is obtained, the usefulness of each coding scheme is discussed, and the optimal noise intensity for detection is indicated. It is shown that the obtained relationship is the same for any observation time when population coding is used. The optimal time window for both detection and the number of units in population coding is found. Our results may be useful for analyses and understanding of the neural activity and in designing bistable storage elements at length scales where thermal fluctuations drastically increase and the effect of noise must be taken into consideration.
Multi-Constraint Multi-Variable Optimization of Source-Driven Nuclear Systems
NASA Astrophysics Data System (ADS)
Watkins, Edward Francis
1995-01-01
A novel approach to the search for optimal designs of source-driven nuclear systems is investigated. Such systems include radiation shields, fusion reactor blankets and various neutron spectrum-shaping assemblies. The novel approach involves the replacement of the steepest-descents optimization algorithm incorporated in the code SWAN by a significantly more general and efficient sequential quadratic programming optimization algorithm provided by the code NPSOL. The resulting SWAN/NPSOL code system can be applied to more general, multi-variable, multi-constraint shield optimization problems. The constraints it accounts for may include simple bounds on variables, linear constraints, and smooth nonlinear constraints. It may also be applied to unconstrained, bound-constrained and linearly constrained optimization. The shield optimization capabilities of the SWAN/NPSOL code system is tested and verified in a variety of optimization problems: dose minimization at constant cost, cost minimization at constant dose, and multiple-nonlinear constraint optimization. The replacement of the optimization part of SWAN with NPSOL is found feasible and leads to a very substantial improvement in the complexity of optimization problems which can be efficiently handled.
On-board data management study for EOPAP
NASA Technical Reports Server (NTRS)
Davisson, L. D.
1975-01-01
The requirements, implementation techniques, and mission analysis associated with on-board data management for EOPAP were studied. SEASAT-A was used as a baseline, and the storage requirements, data rates, and information extraction requirements were investigated for each of the following proposed SEASAT sensors: a short pulse 13.9 GHz radar, a long pulse 13.9 GHz radar, a synthetic aperture radar, a multispectral passive microwave radiometer facility, and an infrared/visible very high resolution radiometer (VHRR). Rate distortion theory was applied to determine theoretical minimum data rates and compared with the rates required by practical techniques. It was concluded that practical techniques can be used which approach the theoretically optimum based upon an empirically determined source random process model. The results of the preceding investigations were used to recommend an on-board data management system for (1) data compression through information extraction, optimal noiseless coding, source coding with distortion, data buffering, and data selection under command or as a function of data activity, (2) for command handling, (3) for spacecraft operation and control, and (4) for experiment operation and monitoring.
Optimized nonorthogonal transforms for image compression.
Guleryuz, O G; Orchard, M T
1997-01-01
The transform coding of images is analyzed from a common standpoint in order to generate a framework for the design of optimal transforms. It is argued that all transform coders are alike in the way they manipulate the data structure formed by transform coefficients. A general energy compaction measure is proposed to generate optimized transforms with desirable characteristics particularly suited to the simple transform coding operation of scalar quantization and entropy coding. It is shown that the optimal linear decoder (inverse transform) must be an optimal linear estimator, independent of the structure of the transform generating the coefficients. A formulation that sequentially optimizes the transforms is presented, and design equations and algorithms for its computation provided. The properties of the resulting transform systems are investigated. In particular, it is shown that the resulting basis are nonorthogonal and complete, producing energy compaction optimized, decorrelated transform coefficients. Quantization issues related to nonorthogonal expansion coefficients are addressed with a simple, efficient algorithm. Two implementations are discussed, and image coding examples are given. It is shown that the proposed design framework results in systems with superior energy compaction properties and excellent coding results.
Synergia: an accelerator modeling tool with 3-D space charge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amundson, James F.; Spentzouris, P.; /Fermilab
2004-07-01
High precision modeling of space-charge effects, together with accurate treatment of single-particle dynamics, is essential for designing future accelerators as well as optimizing the performance of existing machines. We describe Synergia, a high-fidelity parallel beam dynamics simulation package with fully three dimensional space-charge capabilities and a higher order optics implementation. We describe the computational techniques, the advanced human interface, and the parallel performance obtained using large numbers of macroparticles. We also perform code benchmarks comparing to semi-analytic results and other codes. Finally, we present initial results on particle tune spread, beam halo creation, and emittance growth in the Fermilab boostermore » accelerator.« less
Adapted all-numerical correlator for face recognition applications
NASA Astrophysics Data System (ADS)
Elbouz, M.; Bouzidi, F.; Alfalou, A.; Brosseau, C.; Leonard, I.; Benkelfat, B.-E.
2013-03-01
In this study, we suggest and validate an all-numerical implementation of a VanderLugt correlator which is optimized for face recognition applications. The main goal of this implementation is to take advantage of the benefits (detection, localization, and identification of a target object within a scene) of correlation methods and exploit the reconfigurability of numerical approaches. This technique requires a numerical implementation of the optical Fourier transform. We pay special attention to adapt the correlation filter to this numerical implementation. One main goal of this work is to reduce the size of the filter in order to decrease the memory space required for real time applications. To fulfil this requirement, we code the reference images with 8 bits and study the effect of this coding on the performances of several composite filters (phase-only filter, binary phase-only filter). The saturation effect has for effect to decrease the performances of the correlator for making a decision when filters contain up to nine references. Further, an optimization is proposed based for an optimized segmented composite filter. Based on this approach, we present tests with different faces demonstrating that the above mentioned saturation effect is significantly reduced while minimizing the size of the learning data base.
Welter, David E.; Doherty, John E.; Hunt, Randall J.; Muffels, Christopher T.; Tonkin, Matthew J.; Schreuder, Willem A.
2012-01-01
An object-oriented parameter estimation code was developed to incorporate benefits of object-oriented programming techniques for solving large parameter estimation modeling problems. The code is written in C++ and is a formulation and expansion of the algorithms included in PEST, a widely used parameter estimation code written in Fortran. The new code is called PEST++ and is designed to lower the barriers of entry for users and developers while providing efficient algorithms that can accommodate large, highly parameterized problems. This effort has focused on (1) implementing the most popular features of PEST in a fashion that is easy for novice or experienced modelers to use and (2) creating a software design that is easy to extend; that is, this effort provides a documented object-oriented framework designed from the ground up to be modular and extensible. In addition, all PEST++ source code and its associated libraries, as well as the general run manager source code, have been integrated in the Microsoft Visual Studio® 2010 integrated development environment. The PEST++ code is designed to provide a foundation for an open-source development environment capable of producing robust and efficient parameter estimation tools for the environmental modeling community into the future.
Modeling of photon migration in the human lung using a finite volume solver
NASA Astrophysics Data System (ADS)
Sikorski, Zbigniew; Furmanczyk, Michal; Przekwas, Andrzej J.
2006-02-01
The application of the frequency domain and steady-state diffusive optical spectroscopy (DOS) and steady-state near infrared spectroscopy (NIRS) to diagnosis of the human lung injury challenges many elements of these techniques. These include the DOS/NIRS instrument performance and accurate models of light transport in heterogeneous thorax tissue. The thorax tissue not only consists of different media (e.g. chest wall with ribs, lungs) but its optical properties also vary with time due to respiration and changes in thorax geometry with contusion (e.g. pneumothorax or hemothorax). This paper presents a finite volume solver developed to model photon migration in the diffusion approximation in heterogeneous complex 3D tissues. The code applies boundary conditions that account for Fresnel reflections. We propose an effective diffusion coefficient for the void volumes (pneumothorax) based on the assumption of the Lambertian diffusion of photons entering the pleural cavity and accounting for the local pleural cavity thickness. The code has been validated using the MCML Monte Carlo code as a benchmark. The code environment enables a semi-automatic preparation of 3D computational geometry from medical images and its rapid automatic meshing. We present the application of the code to analysis/optimization of the hybrid DOS/NIRS/ultrasound technique in which ultrasound provides data on the localization of thorax tissue boundaries. The code effectiveness (3D complex case computation takes 1 second) enables its use to quantitatively relate detected light signal to absorption and reduced scattering coefficients that are indicators of the pulmonary physiologic state (hemoglobin concentration and oxygenation).
Solar dynamic power for the Space Station
NASA Technical Reports Server (NTRS)
Archer, J. S.; Diamant, E. S.
1986-01-01
This paper describes a computer code which provides a significant advance in the systems analysis capabilities of solar dynamic power modules. While the code can be used to advantage in the preliminary analysis of terrestrial solar dynamic modules its real value lies in the adaptions which make it particularly useful for the conceptualization of optimized power modules for space applications. In particular, as illustrated in the paper, the code can be used to establish optimum values of concentrator diameter, concentrator surface roughness, concentrator rim angle and receiver aperture corresponding to the main heat cycle options - Organic Rankine and Brayton - and for certain receiver design options. The code can also be used to establish system sizing margins to account for the loss of reflectivity in orbit or the seasonal variation of insolation. By the simulation of the interactions among the major components of a solar dynamic module and through simplified formulations of the major thermal-optic-thermodynamic interactions the code adds a powerful, efficient and economic analytical tool to the repertory of techniques available for the design of advanced space power systems.
Development of full wave code for modeling RF fields in hot non-uniform plasmas
NASA Astrophysics Data System (ADS)
Zhao, Liangji; Svidzinski, Vladimir; Spencer, Andrew; Kim, Jin-Soo
2016-10-01
FAR-TECH, Inc. is developing a full wave RF modeling code to model RF fields in fusion devices and in general plasma applications. As an important component of the code, an adaptive meshless technique is introduced to solve the wave equations, which allows resolving plasma resonances efficiently and adapting to the complexity of antenna geometry and device boundary. The computational points are generated using either a point elimination method or a force balancing method based on the monitor function, which is calculated by solving the cold plasma dispersion equation locally. Another part of the code is the conductivity kernel calculation, used for modeling the nonlocal hot plasma dielectric response. The conductivity kernel is calculated on a coarse grid of test points and then interpolated linearly onto the computational points. All the components of the code are parallelized using MPI and OpenMP libraries to optimize the execution speed and memory. The algorithm and the results of our numerical approach to solving 2-D wave equations in a tokamak geometry will be presented. Work is supported by the U.S. DOE SBIR program.
Data compression using adaptive transform coding. Appendix 1: Item 1. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Rost, Martin Christopher
1988-01-01
Adaptive low-rate source coders are described in this dissertation. These coders adapt by adjusting the complexity of the coder to match the local coding difficulty of the image. This is accomplished by using a threshold driven maximum distortion criterion to select the specific coder used. The different coders are built using variable blocksized transform techniques, and the threshold criterion selects small transform blocks to code the more difficult regions and larger blocks to code the less complex regions. A theoretical framework is constructed from which the study of these coders can be explored. An algorithm for selecting the optimal bit allocation for the quantization of transform coefficients is developed. The bit allocation algorithm is more fully developed, and can be used to achieve more accurate bit assignments than the algorithms currently used in the literature. Some upper and lower bounds for the bit-allocation distortion-rate function are developed. An obtainable distortion-rate function is developed for a particular scalar quantizer mixing method that can be used to code transform coefficients at any rate.
Intercomparison of three microwave/infrared high resolution line-by-line radiative transfer codes
NASA Astrophysics Data System (ADS)
Schreier, Franz; Milz, Mathias; Buehler, Stefan A.; von Clarmann, Thomas
2018-05-01
An intercomparison of three line-by-line (lbl) codes developed independently for atmospheric radiative transfer and remote sensing - ARTS, GARLIC, and KOPRA - has been performed for a thermal infrared nadir sounding application assuming a HIRS-like (High resolution Infrared Radiation Sounder) setup. Radiances for the 19 HIRS infrared channels and a set of 42 atmospheric profiles from the "Garand dataset" have been computed. The mutual differences of the equivalent brightness temperatures are presented and possible causes of disagreement are discussed. In particular, the impact of path integration schemes and atmospheric layer discretization is assessed. When the continuum absorption contribution is ignored because of the different implementations, residuals are generally in the sub-Kelvin range and smaller than 0.1 K for some window channels (and all atmospheric models and lbl codes). None of the three codes turned out to be perfect for all channels and atmospheres. Remaining discrepancies are attributed to different lbl optimization techniques. Lbl codes seem to have reached a maturity in the implementation of radiative transfer that the choice of the underlying physical models (line shape models, continua etc) becomes increasingly relevant.
User's manual for the BNW-I optimization code for dry-cooled power plants. Volume III. [PLCIRI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Braun, D.J.; Daniel, D.J.; De Mier, W.V.
1977-01-01
This appendix to User's Manual for the BNW-1 Optimization Code for Dry-Cooled Power Plants provides a listing of the BNW-I optimization code for determining, for a particular size power plant, the optimum dry cooling tower design using a plastic tube cooling surface and circular tower arrangement of the tube bundles. (LCL)
Numerical optimization of perturbative coils for tokamaks
NASA Astrophysics Data System (ADS)
Lazerson, Samuel; Park, Jong-Kyu; Logan, Nikolas; Boozer, Allen; NSTX-U Research Team
2014-10-01
Numerical optimization of coils which apply three dimensional (3D) perturbative fields to tokamaks is presented. The application of perturbative 3D magnetic fields in tokamaks is now commonplace for control of error fields, resistive wall modes, resonant field drive, and neoclassical toroidal viscosity (NTV) torques. The design of such systems has focused on control of toroidal mode number, with coil shapes based on simple window-pane designs. In this work, a numerical optimization suite based on the STELLOPT 3D equilibrium optimization code is presented. The new code, IPECOPT, replaces the VMEC equilibrium code with the IPEC perturbed equilibrium code, and targets NTV torque by coupling to the PENT code. Fixed boundary optimizations of the 3D fields for the NSTX-U experiment are underway. Initial results suggest NTV torques can be driven by normal field spectrums which are not pitch-resonant with the magnetic field lines. Work has focused on driving core torque with n = 1 and edge torques with n = 3 fields. Optimizations of the coil currents for the planned NSTX-U NCC coils highlight the code's free boundary capability. This manuscript has been authored by Princeton University under Contract Number DE-AC02-09CH11466 with the U.S. Department of Energy.
Rocketdyne/Westinghouse nuclear thermal rocket engine modeling
NASA Technical Reports Server (NTRS)
Glass, James F.
1993-01-01
The topics are presented in viewgraph form and include the following: systems approach needed for nuclear thermal rocket (NTR) design optimization; generic NTR engine power balance codes; rocketdyne nuclear thermal system code; software capabilities; steady state model; NTR engine optimizer code-logic; reactor power calculation logic; sample multi-component configuration; NTR design code output; generic NTR code at Rocketdyne; Rocketdyne NTR model; and nuclear thermal rocket modeling directions.
2003-03-01
organizations . Reducing attrition rates through optimal selection decisions can “reduce training cost, improve job performance, and enhance...capturing the weights for use in the SNR method is not straightforward. A special VBA application had to be written to capture and organize the network...before the VBA application can be used. Appendix D provides the VBA code used to import and organize the network weights and input standardization
Automating the design of scientific computing software
NASA Technical Reports Server (NTRS)
Kant, Elaine
1992-01-01
SINAPSE is a domain-specific software design system that generates code from specifications of equations and algorithm methods. This paper describes the system's design techniques (planning in a space of knowledge-based refinement and optimization rules), user interaction style (user has option to control decision making), and representation of knowledge (rules and objects). It also summarizes how the system knowledge has evolved over time and suggests some issues in building software design systems to facilitate reuse.
Fundamental differences between optimization code test problems in engineering applications
NASA Technical Reports Server (NTRS)
Eason, E. D.
1984-01-01
The purpose here is to suggest that there is at least one fundamental difference between the problems used for testing optimization codes and the problems that engineers often need to solve; in particular, the level of precision that can be practically achieved in the numerical evaluation of the objective function, derivatives, and constraints. This difference affects the performance of optimization codes, as illustrated by two examples. Two classes of optimization problem were defined. Class One functions and constraints can be evaluated to a high precision that depends primarily on the word length of the computer. Class Two functions and/or constraints can only be evaluated to a moderate or a low level of precision for economic or modeling reasons, regardless of the computer word length. Optimization codes have not been adequately tested on Class Two problems. There are very few Class Two test problems in the literature, while there are literally hundreds of Class One test problems. The relative performance of two codes may be markedly different for Class One and Class Two problems. Less sophisticated direct search type codes may be less likely to be confused or to waste many function evaluations on Class Two problems. The analysis accuracy and minimization performance are related in a complex way that probably varies from code to code. On a problem where the analysis precision was varied over a range, the simple Hooke and Jeeves code was more efficient at low precision while the Powell code was more efficient at high precision.
Efficient Multi-Stage Time Marching for Viscous Flows via Local Preconditioning
NASA Technical Reports Server (NTRS)
Kleb, William L.; Wood, William A.; vanLeer, Bram
1999-01-01
A new method has been developed to accelerate the convergence of explicit time-marching, laminar, Navier-Stokes codes through the combination of local preconditioning and multi-stage time marching optimization. Local preconditioning is a technique to modify the time-dependent equations so that all information moves or decays at nearly the same rate, thus relieving the stiffness for a system of equations. Multi-stage time marching can be optimized by modifying its coefficients to account for the presence of viscous terms, allowing larger time steps. We show it is possible to optimize the time marching scheme for a wide range of cell Reynolds numbers for the scalar advection-diffusion equation, and local preconditioning allows this optimization to be applied to the Navier-Stokes equations. Convergence acceleration of the new method is demonstrated through numerical experiments with circular advection and laminar boundary-layer flow over a flat plate.
ALARA: The next link in a chain of activation codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, P.P.H.; Henderson, D.L.
1996-12-31
The Adaptive Laplace and Analytic Radioactivity Analysis [ALARA] code has been developed as the next link in the chain of DKR radioactivity codes. Its methods address the criticisms of DKR while retaining its best features. While DKR ignored loops in the transmutation/decay scheme to preserve the exactness of the mathematical solution, ALARA incorporates new computational approaches without jeopardizing the most important features of DKR`s physical modelling and mathematical methods. The physical model uses `straightened-loop, linear chains` to achieve the same accuracy in the loop solutions as is demanded in the rest of the scheme. In cases where a chain hasmore » no loops, the exact DKR solution is used. Otherwise, ALARA adaptively chooses between a direct Laplace inversion technique and a Laplace expansion inversion technique to optimize the accuracy and speed of the solution. All of these methods result in matrix solutions which allow the fastest and most accurate solution of exact pulsing histories. Since the entire history is solved for each chain as it is created, ALARA achieves the optimum combination of high accuracy, high speed and low memory usage. 8 refs., 2 figs.« less
A Wideband Satcom Based Avionics Network with CDMA Uplink and TDM Downlink
NASA Technical Reports Server (NTRS)
Agrawal, D.; Johnson, B. S.; Madhow, U.; Ramchandran, K.; Chun, K. S.
2000-01-01
The purpose of this paper is to describe some key technical ideas behind our vision of a future satcom based digital communication network for avionics applications The key features of our design are as follows: (a) Packetized transmission to permit efficient use of system resources for multimedia traffic; (b) A time division multiplexed (TDM) satellite downlink whose physical layer is designed to operate the satellite link at maximum power efficiency. We show how powerful turbo codes (invented originally for linear modulation) can be used with nonlinear constant envelope modulation, thus permitting the satellite amplifier to operate in a power efficient nonlinear regime; (c) A code division multiple access (CDMA) satellite uplink, which permits efficient access to the satellite from multiple asynchronous users. Closed loop power control is difficult for bursty packetized traffic, especially given the large round trip delay to the satellite. We show how adaptive interference suppression techniques can be used to deal with the ensuing near-far problem; (d) Joint source-channel coding techniques are required both at the physical and the data transport layer to optimize the end-to-end performance. We describe a novel approach to multiple description image encoding at the data transport layer in this paper.
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, H.; Chen, X.; Wu, Q.; Wang, Z.
2016-12-01
The Global Nested Air Quality Prediction Modeling System for Hg (GNAQPMS-Hg) is a global chemical transport model coupled Hg transport module to investigate the mercury pollution. In this study, we present our work of transplanting the GNAQPMS model on Intel Xeon Phi processor, Knights Landing (KNL) to accelerate the model. KNL is the second-generation product adopting Many Integrated Core Architecture (MIC) architecture. Compared with the first generation Knight Corner (KNC), KNL has more new hardware features, that it can be used as unique processor as well as coprocessor with other CPU. According to the Vtune tool, the high overhead modules in GNAQPMS model have been addressed, including CBMZ gas chemistry, advection and convection module, and wet deposition module. These high overhead modules were accelerated by optimizing code and using new techniques of KNL. The following optimized measures was done: 1) Changing the pure MPI parallel mode to hybrid parallel mode with MPI and OpenMP; 2.Vectorizing the code to using the 512-bit wide vector computation unit. 3. Reducing unnecessary memory access and calculation. 4. Reducing Thread Local Storage (TLS) for common variables with each OpenMP thread in CBMZ. 5. Changing the way of global communication from files writing and reading to MPI functions. After optimization, the performance of GNAQPMS is greatly increased both on CPU and KNL platform, the single-node test showed that optimized version has 2.6x speedup on two sockets CPU platform and 3.3x speedup on one socket KNL platform compared with the baseline version code, which means the KNL has 1.29x speedup when compared with 2 sockets CPU platform.
Hybrid petacomputing meets cosmology: The Roadrunner Universe project
NASA Astrophysics Data System (ADS)
Habib, Salman; Pope, Adrian; Lukić, Zarija; Daniel, David; Fasel, Patricia; Desai, Nehal; Heitmann, Katrin; Hsu, Chung-Hsing; Ankeny, Lee; Mark, Graham; Bhattacharya, Suman; Ahrens, James
2009-07-01
The target of the Roadrunner Universe project at Los Alamos National Laboratory is a set of very large cosmological N-body simulation runs on the hybrid supercomputer Roadrunner, the world's first petaflop platform. Roadrunner's architecture presents opportunities and difficulties characteristic of next-generation supercomputing. We describe a new code designed to optimize performance and scalability by explicitly matching the underlying algorithms to the machine architecture, and by using the physics of the problem as an essential aid in this process. While applications will differ in specific exploits, we believe that such a design process will become increasingly important in the future. The Roadrunner Universe project code, MC3 (Mesh-based Cosmology Code on the Cell), uses grid and direct particle methods to balance the capabilities of Roadrunner's conventional (Opteron) and accelerator (Cell BE) layers. Mirrored particle caches and spectral techniques are used to overcome communication bandwidth limitations and possible difficulties with complicated particle-grid interaction templates.
Secure positioning technique based on the encrypted visible light map
NASA Astrophysics Data System (ADS)
Lee, Y. U.; Jung, G.
2017-01-01
For overcoming the performance degradation problems of the conventional visible light (VL) positioning system, which are due to the co-channel interference by adjacent light and the irregularity of the VL reception position in the three dimensional (3-D) VL channel, the secure positioning technique based on the two dimensional (2-D) encrypted VL map is proposed, implemented as the prototype for the specific embedded positioning system, and verified by performance tests in this paper. It is shown from the test results that the proposed technique achieves the performance enhancement over 21.7% value better than the conventional one in the real positioning environment, and the well known PN code is the optimal stream encryption key for the good VL positioning.
A General-Purpose Optimization Engine for Multi-Disciplinary Design Applications
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Hopkins, Dale A.; Berke, Laszlo
1996-01-01
A general purpose optimization tool for multidisciplinary applications, which in the literature is known as COMETBOARDS, is being developed at NASA Lewis Research Center. The modular organization of COMETBOARDS includes several analyzers and state-of-the-art optimization algorithms along with their cascading strategy. The code structure allows quick integration of new analyzers and optimizers. The COMETBOARDS code reads input information from a number of data files, formulates a design as a set of multidisciplinary nonlinear programming problems, and then solves the resulting problems. COMETBOARDS can be used to solve a large problem which can be defined through multiple disciplines, each of which can be further broken down into several subproblems. Alternatively, a small portion of a large problem can be optimized in an effort to improve an existing system. Some of the other unique features of COMETBOARDS include design variable formulation, constraint formulation, subproblem coupling strategy, global scaling technique, analysis approximation, use of either sequential or parallel computational modes, and so forth. The special features and unique strengths of COMETBOARDS assist convergence and reduce the amount of CPU time used to solve the difficult optimization problems of aerospace industries. COMETBOARDS has been successfully used to solve a number of problems, including structural design of space station components, design of nozzle components of an air-breathing engine, configuration design of subsonic and supersonic aircraft, mixed flow turbofan engines, wave rotor topped engines, and so forth. This paper introduces the COMETBOARDS design tool and its versatility, which is illustrated by citing examples from structures, aircraft design, and air-breathing propulsion engine design.
NASA Astrophysics Data System (ADS)
El-Wardany, Tahany; Lynch, Mathew; Gu, Wenjiong; Hsu, Arthur; Klecka, Michael; Nardi, Aaron; Viens, Daniel
This paper proposes an optimization framework enabling the integration of multi-scale / multi-physics simulation codes to perform structural optimization design for additively manufactured components. Cold spray was selected as the additive manufacturing (AM) process and its constraints were identified and included in the optimization scheme. The developed framework first utilizes topology optimization to maximize stiffness for conceptual design. The subsequent step applies shape optimization to refine the design for stress-life fatigue. The component weight was reduced by 20% while stresses were reduced by 75% and the rigidity was improved by 37%. The framework and analysis codes were implemented using Altair software as well as an in-house loading code. The optimized design was subsequently produced by the cold spray process.
Energy efficient rateless codes for high speed data transfer over free space optical channels
NASA Astrophysics Data System (ADS)
Prakash, Geetha; Kulkarni, Muralidhar; Acharya, U. S.
2015-03-01
Terrestrial Free Space Optical (FSO) links transmit information by using the atmosphere (free space) as a medium. In this paper, we have investigated the use of Luby Transform (LT) codes as a means to mitigate the effects of data corruption induced by imperfect channel which usually takes the form of lost or corrupted packets. LT codes, which are a class of Fountain codes, can be used independent of the channel rate and as many code words as required can be generated to recover all the message bits irrespective of the channel performance. Achieving error free high data rates with limited energy resources is possible with FSO systems if error correction codes with minimal overheads on the power can be used. We also employ a combination of Binary Phase Shift Keying (BPSK) with provision for modification of threshold and optimized LT codes with belief propagation for decoding. These techniques provide additional protection even under strong turbulence regimes. Automatic Repeat Request (ARQ) is another method of improving link reliability. Performance of ARQ is limited by the number of retransmissions and the corresponding time delay. We prove through theoretical computations and simulations that LT codes consume less energy per bit. We validate the feasibility of using energy efficient LT codes over ARQ for FSO links to be used in optical wireless sensor networks within the eye safety limits.
Optimized atom position and coefficient coding for matching pursuit-based image compression.
Shoa, Alireza; Shirani, Shahram
2009-12-01
In this paper, we propose a new encoding algorithm for matching pursuit image coding. We show that coding performance is improved when correlations between atom positions and atom coefficients are both used in encoding. We find the optimum tradeoff between efficient atom position coding and efficient atom coefficient coding and optimize the encoder parameters. Our proposed algorithm outperforms the existing coding algorithms designed for matching pursuit image coding. Additionally, we show that our algorithm results in better rate distortion performance than JPEG 2000 at low bit rates.
Optimizing fusion PIC code performance at scale on Cori Phase 2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koskela, T. S.; Deslippe, J.
In this paper we present the results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system. The code has undergone substantial development to enable the use of vector instructions in its most expensive kernels within the NERSC Exascale Science Applications Program. We study the single-node performance of the code on an absolute scale using the roofline methodology to guide optimization efforts. We have obtained 2x speedups in single node performance due to enabling vectorization and performing memory layout optimizations. On multiple nodes, the code is shown to scale wellmore » up to 4000 nodes, near half the size of the machine. We discuss some communication bottlenecks that were identified and resolved during the work.« less
Joint-layer encoder optimization for HEVC scalable extensions
NASA Astrophysics Data System (ADS)
Tsai, Chia-Ming; He, Yuwen; Dong, Jie; Ye, Yan; Xiu, Xiaoyu; He, Yong
2014-09-01
Scalable video coding provides an efficient solution to support video playback on heterogeneous devices with various channel conditions in heterogeneous networks. SHVC is the latest scalable video coding standard based on the HEVC standard. To improve enhancement layer coding efficiency, inter-layer prediction including texture and motion information generated from the base layer is used for enhancement layer coding. However, the overall performance of the SHVC reference encoder is not fully optimized because rate-distortion optimization (RDO) processes in the base and enhancement layers are independently considered. It is difficult to directly extend the existing joint-layer optimization methods to SHVC due to the complicated coding tree block splitting decisions and in-loop filtering process (e.g., deblocking and sample adaptive offset (SAO) filtering) in HEVC. To solve those problems, a joint-layer optimization method is proposed by adjusting the quantization parameter (QP) to optimally allocate the bit resource between layers. Furthermore, to make more proper resource allocation, the proposed method also considers the viewing probability of base and enhancement layers according to packet loss rate. Based on the viewing probability, a novel joint-layer RD cost function is proposed for joint-layer RDO encoding. The QP values of those coding tree units (CTUs) belonging to lower layers referenced by higher layers are decreased accordingly, and the QP values of those remaining CTUs are increased to keep total bits unchanged. Finally the QP values with minimal joint-layer RD cost are selected to match the viewing probability. The proposed method was applied to the third temporal level (TL-3) pictures in the Random Access configuration. Simulation results demonstrate that the proposed joint-layer optimization method can improve coding performance by 1.3% for these TL-3 pictures compared to the SHVC reference encoder without joint-layer optimization.
Optimization of Particle-in-Cell Codes on RISC Processors
NASA Technical Reports Server (NTRS)
Decyk, Viktor K.; Karmesin, Steve Roy; Boer, Aeint de; Liewer, Paulette C.
1996-01-01
General strategies are developed to optimize particle-cell-codes written in Fortran for RISC processors which are commonly used on massively parallel computers. These strategies include data reorganization to improve cache utilization and code reorganization to improve efficiency of arithmetic pipelines.
Conversion from Engineering Units to Telemetry Counts on Dryden Flight Simulators
NASA Technical Reports Server (NTRS)
Fantini, Jay A.
1998-01-01
Dryden real-time flight simulators encompass the simulation of pulse code modulation (PCM) telemetry signals. This paper presents a new method whereby the calibration polynomial (from first to sixth order), representing the conversion from counts to engineering units (EU), is numerically inverted in real time. The result is less than one-count error for valid EU inputs. The Newton-Raphson method is used to numerically invert the polynomial. A reverse linear interpolation between the EU limits is used to obtain an initial value for the desired telemetry count. The method presented here is not new. What is new is how classical numerical techniques are optimized to take advantage of modem computer power to perform the desired calculations in real time. This technique makes the method simple to understand and implement. There are no interpolation tables to store in memory as in traditional methods. The NASA F-15 simulation converts and transmits over 1000 parameters at 80 times/sec. This paper presents algorithm development, FORTRAN code, and performance results.
Yukinawa, Naoto; Oba, Shigeyuki; Kato, Kikuya; Ishii, Shin
2009-01-01
Multiclass classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. There have been many studies of aggregating binary classifiers to construct a multiclass classifier based on one-versus-the-rest (1R), one-versus-one (11), or other coding strategies, as well as some comparison studies between them. However, the studies found that the best coding depends on each situation. Therefore, a new problem, which we call the "optimal coding problem," has arisen: how can we determine which coding is the optimal one in each situation? To approach this optimal coding problem, we propose a novel framework for constructing a multiclass classifier, in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. Although there is no a priori answer to the optimal coding problem, our weight tuning method can be a consistent answer to the problem. We apply this method to various classification problems including a synthesized data set and some cancer diagnosis data sets from gene expression profiling. The results demonstrate that, in most situations, our method can improve classification accuracy over simple voting heuristics and is better than or comparable to state-of-the-art multiclass predictors.
Optimizing transformations of stencil operations for parallel cache-based architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bassetti, F.; Davis, K.
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like operations for cache-based architectures. This technique takes advantage of the semantic knowledge implicity in stencil-like computations. The technique is implemented as a source-to-source program transformation; because of its specificity it could not be expected of a conventional compiler. Empirical results demonstrate a uniform factor of two speedup. The experiments clearly show the benefits of this technique to be a consequence, as intended, of the reduction in cache misses. The test codes are based on a 5-point stencil obtained by the discretization of the Poisson equation andmore » applied to a two-dimensional uniform grid using the Jacobi method as an iterative solver. Results are presented for a 1-D tiling for a single processor, and in parallel using 1-D data partition. For the parallel case both blocking and non-blocking communication are tested. The same scheme of experiments has bee n performed for the 2-D tiling case. However, for the parallel case the 2-D partitioning is not discussed here, so the parallel case handled for 2-D is 2-D tiling with 1-D data partitioning.« less
Lakshmanan, Manu N.; Greenberg, Joel A.; Samei, Ehsan; Kapadia, Anuj J.
2017-01-01
Abstract. Although transmission-based x-ray imaging is the most commonly used imaging approach for breast cancer detection, it exhibits false negative rates higher than 15%. To improve cancer detection accuracy, x-ray coherent scatter computed tomography (CSCT) has been explored to potentially detect cancer with greater consistency. However, the 10-min scan duration of CSCT limits its possible clinical applications. The coded aperture coherent scatter spectral imaging (CACSSI) technique has been shown to reduce scan time through enabling single-angle imaging while providing high detection accuracy. Here, we use Monte Carlo simulations to test analytical optimization studies of the CACSSI technique, specifically for detecting cancer in ex vivo breast samples. An anthropomorphic breast tissue phantom was modeled, a CACSSI imaging system was virtually simulated to image the phantom, a diagnostic voxel classification algorithm was applied to all reconstructed voxels in the phantom, and receiver-operator characteristics analysis of the voxel classification was used to evaluate and characterize the imaging system for a range of parameters that have been optimized in a prior analytical study. The results indicate that CACSSI is able to identify the distribution of cancerous and healthy tissues (i.e., fibroglandular, adipose, or a mix of the two) in tissue samples with a cancerous voxel identification area-under-the-curve of 0.94 through a scan lasting less than 10 s per slice. These results show that coded aperture scatter imaging has the potential to provide scatter images that automatically differentiate cancerous and healthy tissue within ex vivo samples. Furthermore, the results indicate potential CACSSI imaging system configurations for implementation in subsequent imaging development studies. PMID:28331884
Optimal patch code design via device characterization
NASA Astrophysics Data System (ADS)
Wu, Wencheng; Dalal, Edul N.
2012-01-01
In many color measurement applications, such as those for color calibration and profiling, "patch code" has been used successfully for job identification and automation to reduce operator errors. A patch code is similar to a barcode, but is intended primarily for use in measurement devices that cannot read barcodes due to limited spatial resolution, such as spectrophotometers. There is an inherent tradeoff between decoding robustness and the number of code levels available for encoding. Previous methods have attempted to address this tradeoff, but those solutions have been sub-optimal. In this paper, we propose a method to design optimal patch codes via device characterization. The tradeoff between decoding robustness and the number of available code levels is optimized in terms of printing and measurement efforts, and decoding robustness against noises from the printing and measurement devices. Effort is drastically reduced relative to previous methods because print-and-measure is minimized through modeling and the use of existing printer profiles. Decoding robustness is improved by distributing the code levels in CIE Lab space rather than in CMYK space.
NASA Astrophysics Data System (ADS)
Jubran, Mohammad K.; Bansal, Manu; Kondi, Lisimachos P.
2006-01-01
In this paper, we consider the problem of optimal bit allocation for wireless video transmission over fading channels. We use a newly developed hybrid scalable/multiple-description codec that combines the functionality of both scalable and multiple-description codecs. It produces a base layer and multiple-description enhancement layers. Any of the enhancement layers can be decoded (in a non-hierarchical manner) with the base layer to improve the reconstructed video quality. Two different channel coding schemes (Rate-Compatible Punctured Convolutional (RCPC)/Cyclic Redundancy Check (CRC) coding and, product code Reed Solomon (RS)+RCPC/CRC coding) are used for unequal error protection of the layered bitstream. Optimal allocation of the bitrate between source and channel coding is performed for discrete sets of source coding rates and channel coding rates. Experimental results are presented for a wide range of channel conditions. Also, comparisons with classical scalable coding show the effectiveness of using hybrid scalable/multiple-description coding for wireless transmission.
DSP code optimization based on cache
NASA Astrophysics Data System (ADS)
Xu, Chengfa; Li, Chengcheng; Tang, Bin
2013-03-01
DSP program's running efficiency on board is often lower than which via the software simulation during the program development, which is mainly resulted from the user's improper use and incomplete understanding of the cache-based memory. This paper took the TI TMS320C6455 DSP as an example, analyzed its two-level internal cache, and summarized the methods of code optimization. Processor can achieve its best performance when using these code optimization methods. At last, a specific algorithm application in radar signal processing is proposed. Experiment result shows that these optimization are efficient.
Demonstration of Automatically-Generated Adjoint Code for Use in Aerodynamic Shape Optimization
NASA Technical Reports Server (NTRS)
Green, Lawrence; Carle, Alan; Fagan, Mike
1999-01-01
Gradient-based optimization requires accurate derivatives of the objective function and constraints. These gradients may have previously been obtained by manual differentiation of analysis codes, symbolic manipulators, finite-difference approximations, or existing automatic differentiation (AD) tools such as ADIFOR (Automatic Differentiation in FORTRAN). Each of these methods has certain deficiencies, particularly when applied to complex, coupled analyses with many design variables. Recently, a new AD tool called ADJIFOR (Automatic Adjoint Generation in FORTRAN), based upon ADIFOR, was developed and demonstrated. Whereas ADIFOR implements forward-mode (direct) differentiation throughout an analysis program to obtain exact derivatives via the chain rule of calculus, ADJIFOR implements the reverse-mode counterpart of the chain rule to obtain exact adjoint form derivatives from FORTRAN code. Automatically-generated adjoint versions of the widely-used CFL3D computational fluid dynamics (CFD) code and an algebraic wing grid generation code were obtained with just a few hours processing time using the ADJIFOR tool. The codes were verified for accuracy and were shown to compute the exact gradient of the wing lift-to-drag ratio, with respect to any number of shape parameters, in about the time required for 7 to 20 function evaluations. The codes have now been executed on various computers with typical memory and disk space for problems with up to 129 x 65 x 33 grid points, and for hundreds to thousands of independent variables. These adjoint codes are now used in a gradient-based aerodynamic shape optimization problem for a swept, tapered wing. For each design iteration, the optimization package constructs an approximate, linear optimization problem, based upon the current objective function, constraints, and gradient values. The optimizer subroutines are called within a design loop employing the approximate linear problem until an optimum shape is found, the design loop limit is reached, or no further design improvement is possible due to active design variable bounds and/or constraints. The resulting shape parameters are then used by the grid generation code to define a new wing surface and computational grid. The lift-to-drag ratio and its gradient are computed for the new design by the automatically-generated adjoint codes. Several optimization iterations may be required to find an optimum wing shape. Results from two sample cases will be discussed. The reader should note that this work primarily represents a demonstration of use of automatically- generated adjoint code within an aerodynamic shape optimization. As such, little significance is placed upon the actual optimization results, relative to the method for obtaining the results.
Maximized exoEarth candidate yields for starshades
NASA Astrophysics Data System (ADS)
Stark, Christopher C.; Shaklan, Stuart; Lisman, Doug; Cady, Eric; Savransky, Dmitry; Roberge, Aki; Mandell, Avi M.
2016-10-01
The design and scale of a future mission to directly image and characterize potentially Earth-like planets will be impacted, to some degree, by the expected yield of such planets. Recent efforts to increase the estimated yields, by creating observation plans optimized for the detection and characterization of Earth-twins, have focused solely on coronagraphic instruments; starshade-based missions could benefit from a similar analysis. Here we explore how to prioritize observations for a starshade given the limiting resources of both fuel and time, present analytic expressions to estimate fuel use, and provide efficient numerical techniques for maximizing the yield of starshades. We implemented these techniques to create an approximate design reference mission code for starshades and used this code to investigate how exoEarth candidate yield responds to changes in mission, instrument, and astrophysical parameters for missions with a single starshade. We find that a starshade mission operates most efficiently somewhere between the fuel- and exposuretime-limited regimes and, as a result, is less sensitive to photometric noise sources as well as parameters controlling the photon collection rate in comparison to a coronagraph. We produced optimistic yield curves for starshades, assuming our optimized observation plans are schedulable and future starshades are not thrust-limited. Given these yield curves, detecting and characterizing several dozen exoEarth candidates requires either multiple starshades or an η≳0.3.
Replica Exchange Molecular Dynamics in the Age of Heterogeneous Architectures
NASA Astrophysics Data System (ADS)
Roitberg, Adrian
2014-03-01
The rise of GPU-based codes has allowed MD to reach timescales only dreamed of only 5 years ago. Even within this new paradigm there is still need for advanced sampling techniques. Modern supercomputers (e.g. Blue Waters, Titan, Keeneland) have made available to users a significant number of GPUS and CPUS, which in turn translate into amazing opportunities for dream calculations. Replica-exchange based methods can optimally use tis combination of codes and architectures to explore conformational variabilities in large systems. I will show our recent work in porting the program Amber to GPUS, and the support for replica exchange methods, where the replicated dimension could be Temperature, pH, Hamiltonian, Umbrella windows and combinations of those schemes.
Error-correction coding for digital communications
NASA Astrophysics Data System (ADS)
Clark, G. C., Jr.; Cain, J. B.
This book is written for the design engineer who must build the coding and decoding equipment and for the communication system engineer who must incorporate this equipment into a system. It is also suitable as a senior-level or first-year graduate text for an introductory one-semester course in coding theory. Fundamental concepts of coding are discussed along with group codes, taking into account basic principles, practical constraints, performance computations, coding bounds, generalized parity check codes, polynomial codes, and important classes of group codes. Other topics explored are related to simple nonalgebraic decoding techniques for group codes, soft decision decoding of block codes, algebraic techniques for multiple error correction, the convolutional code structure and Viterbi decoding, syndrome decoding techniques, and sequential decoding techniques. System applications are also considered, giving attention to concatenated codes, coding for the white Gaussian noise channel, interleaver structures for coded systems, and coding for burst noise channels.
User's manual for the BNW-II optimization code for dry/wet-cooled power plants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Braun, D.J.; Bamberger, J.A.; Braun, D.J.
1978-05-01
This volume provides a listing of the BNW-II dry/wet ammonia heat rejection optimization code and is an appendix to Volume I which gives a narrative description of the code's algorithms as well as logic, input and output information.
Kim, Dong-Sun; Kwon, Jin-San
2014-01-01
Research on real-time health systems have received great attention during recent years and the needs of high-quality personal multichannel medical signal compression for personal medical product applications are increasing. The international MPEG-4 audio lossless coding (ALS) standard supports a joint channel-coding scheme for improving compression performance of multichannel signals and it is very efficient compression method for multi-channel biosignals. However, the computational complexity of such a multichannel coding scheme is significantly greater than that of other lossless audio encoders. In this paper, we present a multichannel hardware encoder based on a low-complexity joint-coding technique and shared multiplier scheme for portable devices. A joint-coding decision method and a reference channel selection scheme are modified for a low-complexity joint coder. The proposed joint coding decision method determines the optimized joint-coding operation based on the relationship between the cross correlation of residual signals and the compression ratio. The reference channel selection is designed to select a channel for the entropy coding of the joint coding. The hardware encoder operates at a 40 MHz clock frequency and supports two-channel parallel encoding for the multichannel monitoring system. Experimental results show that the compression ratio increases by 0.06%, whereas the computational complexity decreases by 20.72% compared to the MPEG-4 ALS reference software encoder. In addition, the compression ratio increases by about 11.92%, compared to the single channel based bio-signal lossless data compressor. PMID:25237900
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted
1990-01-01
Techniques are discussed for the implementation and improvement of vectorization and concurrency in nonlinear explicit structural finite element codes. In explicit integration methods, the computation of the element internal force vector consumes the bulk of the computer time. The program can be efficiently vectorized by subdividing the elements into blocks and executing all computations in vector mode. The structuring of elements into blocks also provides a convenient way to implement concurrency by creating tasks which can be assigned to available processors for evaluation. The techniques were implemented in a 3-D nonlinear program with one-point quadrature shell elements. Concurrency and vectorization were first implemented in a single time step version of the program. Techniques were developed to minimize processor idle time and to select the optimal vector length. A comparison of run times between the program executed in scalar, serial mode and the fully vectorized code executed concurrently using eight processors shows speed-ups of over 25. Conjugate gradient methods for solving nonlinear algebraic equations are also readily adapted to a parallel environment. A new technique for improving convergence properties of conjugate gradients in nonlinear problems is developed in conjunction with other techniques such as diagonal scaling. A significant reduction in the number of iterations required for convergence is shown for a statically loaded rigid bar suspended by three equally spaced springs.
Systematic Propulsion Optimization Tools (SPOT)
NASA Technical Reports Server (NTRS)
Bower, Mark; Celestian, John
1992-01-01
This paper describes a computer program written by senior-level Mechanical Engineering students at the University of Alabama in Huntsville which is capable of optimizing user-defined delivery systems for carrying payloads into orbit. The custom propulsion system is designed by the user through the input of configuration, payload, and orbital parameters. The primary advantages of the software, called Systematic Propulsion Optimization Tools (SPOT), are a user-friendly interface and a modular FORTRAN 77 code designed for ease of modification. The optimization of variables in an orbital delivery system is of critical concern in the propulsion environment. The mass of the overall system must be minimized within the maximum stress, force, and pressure constraints. SPOT utilizes the Design Optimization Tools (DOT) program for the optimization techniques. The SPOT program is divided into a main program and five modules: aerodynamic losses, orbital parameters, liquid engines, solid engines, and nozzles. The program is designed to be upgraded easily and expanded to meet specific user needs. A user's manual and a programmer's manual are currently being developed to facilitate implementation and modification.
Optimization Model for Web Based Multimodal Interactive Simulations.
Halic, Tansel; Ahn, Woojin; De, Suvranu
2015-07-15
This paper presents a technique for optimizing the performance of web based multimodal interactive simulations. For such applications where visual quality and the performance of simulations directly influence user experience, overloading of hardware resources may result in unsatisfactory reduction in the quality of the simulation and user satisfaction. However, optimization of simulation performance on individual hardware platforms is not practical. Hence, we present a mixed integer programming model to optimize the performance of graphical rendering and simulation performance while satisfying application specific constraints. Our approach includes three distinct phases: identification, optimization and update . In the identification phase, the computing and rendering capabilities of the client device are evaluated using an exploratory proxy code. This data is utilized in conjunction with user specified design requirements in the optimization phase to ensure best possible computational resource allocation. The optimum solution is used for rendering (e.g. texture size, canvas resolution) and simulation parameters (e.g. simulation domain) in the update phase. Test results are presented on multiple hardware platforms with diverse computing and graphics capabilities to demonstrate the effectiveness of our approach.
Weight optimization of plane truss using genetic algorithm
NASA Astrophysics Data System (ADS)
Neeraja, D.; Kamireddy, Thejesh; Santosh Kumar, Potnuru; Simha Reddy, Vijay
2017-11-01
Optimization of structure on basis of weight has many practical benefits in every engineering field. The efficiency is proportionally related to its weight and hence weight optimization gains prime importance. Considering the field of civil engineering, weight optimized structural elements are economical and easier to transport to the site. In this study, genetic optimization algorithm for weight optimization of steel truss considering its shape, size and topology aspects has been developed in MATLAB. Material strength and Buckling stability have been adopted from IS 800-2007 code of construction steel. The constraints considered in the present study are fabrication, basic nodes, displacements, and compatibility. Genetic programming is a natural selection search technique intended to combine good solutions to a problem from many generations to improve the results. All solutions are generated randomly and represented individually by a binary string with similarities of natural chromosomes, and hence it is termed as genetic programming. The outcome of the study is a MATLAB program, which can optimise a steel truss and display the optimised topology along with element shapes, deflections, and stress results.
Optimization Model for Web Based Multimodal Interactive Simulations
Halic, Tansel; Ahn, Woojin; De, Suvranu
2015-01-01
This paper presents a technique for optimizing the performance of web based multimodal interactive simulations. For such applications where visual quality and the performance of simulations directly influence user experience, overloading of hardware resources may result in unsatisfactory reduction in the quality of the simulation and user satisfaction. However, optimization of simulation performance on individual hardware platforms is not practical. Hence, we present a mixed integer programming model to optimize the performance of graphical rendering and simulation performance while satisfying application specific constraints. Our approach includes three distinct phases: identification, optimization and update. In the identification phase, the computing and rendering capabilities of the client device are evaluated using an exploratory proxy code. This data is utilized in conjunction with user specified design requirements in the optimization phase to ensure best possible computational resource allocation. The optimum solution is used for rendering (e.g. texture size, canvas resolution) and simulation parameters (e.g. simulation domain) in the update phase. Test results are presented on multiple hardware platforms with diverse computing and graphics capabilities to demonstrate the effectiveness of our approach. PMID:26085713
Parrish, Robert M; Burns, Lori A; Smith, Daniel G A; Simmonett, Andrew C; DePrince, A Eugene; Hohenstein, Edward G; Bozkaya, Uğur; Sokolov, Alexander Yu; Di Remigio, Roberto; Richard, Ryan M; Gonthier, Jérôme F; James, Andrew M; McAlexander, Harley R; Kumar, Ashutosh; Saitow, Masaaki; Wang, Xiao; Pritchard, Benjamin P; Verma, Prakash; Schaefer, Henry F; Patkowski, Konrad; King, Rollin A; Valeev, Edward F; Evangelista, Francesco A; Turney, Justin M; Crawford, T Daniel; Sherrill, C David
2017-07-11
Psi4 is an ab initio electronic structure program providing methods such as Hartree-Fock, density functional theory, configuration interaction, and coupled-cluster theory. The 1.1 release represents a major update meant to automate complex tasks, such as geometry optimization using complete-basis-set extrapolation or focal-point methods. Conversion of the top-level code to a Python module means that Psi4 can now be used in complex workflows alongside other Python tools. Several new features have been added with the aid of libraries providing easy access to techniques such as density fitting, Cholesky decomposition, and Laplace denominators. The build system has been completely rewritten to simplify interoperability with independent, reusable software components for quantum chemistry. Finally, a wide range of new theoretical methods and analyses have been added to the code base, including functional-group and open-shell symmetry adapted perturbation theory, density-fitted coupled cluster with frozen natural orbitals, orbital-optimized perturbation and coupled-cluster methods (e.g., OO-MP2 and OO-LCCD), density-fitted multiconfigurational self-consistent field, density cumulant functional theory, algebraic-diagrammatic construction excited states, improvements to the geometry optimizer, and the "X2C" approach to relativistic corrections, among many other improvements.
Shared prefetching to reduce execution skew in multi-threaded systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eichenberger, Alexandre E; Gunnels, John A
Mechanisms are provided for optimizing code to perform prefetching of data into a shared memory of a computing device that is shared by a plurality of threads that execute on the computing device. A memory stream of a portion of code that is shared by the plurality of threads is identified. A set of prefetch instructions is distributed across the plurality of threads. Prefetch instructions are inserted into the instruction sequences of the plurality of threads such that each instruction sequence has a separate sub-portion of the set of prefetch instructions, thereby generating optimized code. Executable code is generated basedmore » on the optimized code and stored in a storage device. The executable code, when executed, performs the prefetches associated with the distributed set of prefetch instructions in a shared manner across the plurality of threads.« less
Aeroelastic Tailoring Study of N+2 Low-Boom Supersonic Commercial Transport Aircraft
NASA Technical Reports Server (NTRS)
Pak, Chan-gi
2015-01-01
The Lockheed Martins N+2 Low-boom Supersonic Commercial Transport (LSCT) aircraft is optimized in this study through the use of a multidisciplinary design optimization tool developed at the NASA Armstrong Flight Research Center. A total of 111 design variables are used in the first optimization run. Total structural weight is the objective function in this optimization run. Design requirements for strength, buckling, and flutter are selected as constraint functions during the first optimization run. The MSC Nastran code is used to obtain the modal, strength, and buckling characteristics. Flutter and trim analyses are based on ZAERO code and landing and ground control loads are computed using an in-house code.
NASA Technical Reports Server (NTRS)
Lahti, G. P.
1972-01-01
A two- or three-constraint, two-dimensional radiation shield weight optimization procedure and a computer program, DOPEX, is described. The DOPEX code uses the steepest descent method to alter a set of initial (input) thicknesses for a shield configuration to achieve a minimum weight while simultaneously satisfying dose constaints. The code assumes an exponential dose-shield thickness relation with parameters specified by the user. The code also assumes that dose rates in each principal direction are dependent only on thicknesses in that direction. Code input instructions, FORTRAN 4 listing, and a sample problem are given. Typical computer time required to optimize a seven-layer shield is about 0.1 minute on an IBM 7094-2.
NASA Technical Reports Server (NTRS)
Vanderplaats, G. N.; Chen, Xiang; Zhang, Ning-Tian
1988-01-01
The use of formal numerical optimization methods for the design of gears is investigated. To achieve this, computer codes were developed for the analysis of spur gears and spiral bevel gears. These codes calculate the life, dynamic load, bending strength, surface durability, gear weight and size, and various geometric parameters. It is necessary to calculate all such important responses because they all represent competing requirements in the design process. The codes developed here were written in subroutine form and coupled to the COPES/ADS general purpose optimization program. This code allows the user to define the optimization problem at the time of program execution. Typical design variables include face width, number of teeth and diametral pitch. The user is free to choose any calculated response as the design objective to minimize or maximize and may impose lower and upper bounds on any calculated responses. Typical examples include life maximization with limits on dynamic load, stress, weight, etc. or minimization of weight subject to limits on life, dynamic load, etc. The research codes were written in modular form for easy expansion and so that they could be combined to create a multiple reduction optimization capability in future.
jFuzz: A Concolic Whitebox Fuzzer for Java
NASA Technical Reports Server (NTRS)
Jayaraman, Karthick; Harvison, David; Ganesh, Vijay; Kiezun, Adam
2009-01-01
We present jFuzz, a automatic testing tool for Java programs. jFuzz is a concolic whitebox fuzzer, built on the NASA Java PathFinder, an explicit-state Java model checker, and a framework for developing reliability and analysis tools for Java. Starting from a seed input, jFuzz automatically and systematically generates inputs that exercise new program paths. jFuzz uses a combination of concrete and symbolic execution, and constraint solving. Time spent on solving constraints can be significant. We implemented several well-known optimizations and name-independent caching, which aggressively normalizes the constraints to reduce the number of calls to the constraint solver. We present preliminary results due to the optimizations, and demonstrate the effectiveness of jFuzz in creating good test inputs. The source code of jFuzz is available as part of the NASA Java PathFinder. jFuzz is intended to be a research testbed for investigating new testing and analysis techniques based on concrete and symbolic execution. The source code of jFuzz is available as part of the NASA Java PathFinder.
Spatially adaptive bases in wavelet-based coding of semi-regular meshes
NASA Astrophysics Data System (ADS)
Denis, Leon; Florea, Ruxandra; Munteanu, Adrian; Schelkens, Peter
2010-05-01
In this paper we present a wavelet-based coding approach for semi-regular meshes, which spatially adapts the employed wavelet basis in the wavelet transformation of the mesh. The spatially-adaptive nature of the transform requires additional information to be stored in the bit-stream in order to allow the reconstruction of the transformed mesh at the decoder side. In order to limit this overhead, the mesh is first segmented into regions of approximately equal size. For each spatial region, a predictor is selected in a rate-distortion optimal manner by using a Lagrangian rate-distortion optimization technique. When compared against the classical wavelet transform employing the butterfly subdivision filter, experiments reveal that the proposed spatially-adaptive wavelet transform significantly decreases the energy of the wavelet coefficients for all subbands. Preliminary results show also that employing the proposed transform for the lowest-resolution subband systematically yields improved compression performance at low-to-medium bit-rates. For the Venus and Rabbit test models the compression improvements add up to 1.47 dB and 0.95 dB, respectively.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
Wang, Bei; Ethier, Stephane; Tang, William; ...
2017-06-29
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability ofmore » the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.« less
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Bei; Ethier, Stephane; Tang, William
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability ofmore » the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.« less
Dynamic state estimation based on Poisson spike trains—towards a theory of optimal encoding
NASA Astrophysics Data System (ADS)
Susemihl, Alex; Meir, Ron; Opper, Manfred
2013-03-01
Neurons in the nervous system convey information to higher brain regions by the generation of spike trains. An important question in the field of computational neuroscience is how these sensory neurons encode environmental information in a way which may be simply analyzed by subsequent systems. Many aspects of the form and function of the nervous system have been understood using the concepts of optimal population coding. Most studies, however, have neglected the aspect of temporal coding. Here we address this shortcoming through a filtering theory of inhomogeneous Poisson processes. We derive exact relations for the minimal mean squared error of the optimal Bayesian filter and, by optimizing the encoder, obtain optimal codes for populations of neurons. We also show that a class of non-Markovian, smooth stimuli are amenable to the same treatment, and provide results for the filtering and prediction error which hold for a general class of stochastic processes. This sets a sound mathematical framework for a population coding theory that takes temporal aspects into account. It also formalizes a number of studies which discussed temporal aspects of coding using time-window paradigms, by stating them in terms of correlation times and firing rates. We propose that this kind of analysis allows for a systematic study of temporal coding and will bring further insights into the nature of the neural code.
Development of an agility assessment module for preliminary fighter design
NASA Technical Reports Server (NTRS)
Ngan, Angelen; Bauer, Brent; Biezad, Daniel; Hahn, Andrew
1996-01-01
A FORTRAN computer program is presented to perform agility analysis on fighter aircraft configurations. This code is one of the modules of the NASA Ames ACSYNT (AirCraft SYNThesis) design code. The background of the agility research in the aircraft industry and a survey of a few agility metrics are discussed. The methodology, techniques, and models developed for the code are presented. FORTRAN programs were developed for two specific metrics, CCT (Combat Cycle Time) and PM (Pointing Margin), as part of the agility module. The validity of the code was evaluated by comparing with existing flight test data. Example trade studies using the agility module along with ACSYNT were conducted using Northrop F-20 Tigershark and McDonnell Douglas F/A-18 Hornet aircraft models. The sensitivity of thrust loading and wing loading on agility criteria were investigated. The module can compare the agility potential between different configurations and has the capability to optimize agility performance in the preliminary design process. This research provides a new and useful design tool for analyzing fighter performance during air combat engagements.
Distributed Adaptive Binary Quantization for Fast Nearest Neighbor Search.
Xianglong Liu; Zhujin Li; Cheng Deng; Dacheng Tao
2017-11-01
Hashing has been proved an attractive technique for fast nearest neighbor search over big data. Compared with the projection based hashing methods, prototype-based ones own stronger power to generate discriminative binary codes for the data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from the ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys the fast training linear to the number of the training data. We further devise a distributed framework for the large-scale learning, which can significantly speed up the training of ABQ in the distributed environment that has been widely deployed in many areas nowadays. The extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with up to 58.84% performance gains relatively.
Optical systems integrated modeling
NASA Technical Reports Server (NTRS)
Shannon, Robert R.; Laskin, Robert A.; Brewer, SI; Burrows, Chris; Epps, Harlan; Illingworth, Garth; Korsch, Dietrich; Levine, B. Martin; Mahajan, Vini; Rimmer, Chuck
1992-01-01
An integrated modeling capability that provides the tools by which entire optical systems and instruments can be simulated and optimized is a key technology development, applicable to all mission classes, especially astrophysics. Many of the future missions require optical systems that are physically much larger than anything flown before and yet must retain the characteristic sub-micron diffraction limited wavefront accuracy of their smaller precursors. It is no longer feasible to follow the path of 'cut and test' development; the sheer scale of these systems precludes many of the older techniques that rely upon ground evaluation of full size engineering units. The ability to accurately model (by computer) and optimize the entire flight system's integrated structural, thermal, and dynamic characteristics is essential. Two distinct integrated modeling capabilities are required. These are an initial design capability and a detailed design and optimization system. The content of an initial design package is shown. It would be a modular, workstation based code which allows preliminary integrated system analysis and trade studies to be carried out quickly by a single engineer or a small design team. A simple concept for a detailed design and optimization system is shown. This is a linkage of interface architecture that allows efficient interchange of information between existing large specialized optical, control, thermal, and structural design codes. The computing environment would be a network of large mainframe machines and its users would be project level design teams. More advanced concepts for detailed design systems would support interaction between modules and automated optimization of the entire system. Technology assessment and development plans for integrated package for initial design, interface development for detailed optimization, validation, and modeling research are presented.
Diversity-optimal power loading for intensity modulated MIMO optical wireless communications.
Zhang, Yan-Yu; Yu, Hong-Yi; Zhang, Jian-Kang; Zhu, Yi-Jun
2016-04-18
In this paper, we consider the design of space code for an intensity modulated direct detection multi-input-multi-output optical wireless communication (IM/DD MIMO-OWC) system, in which channel coefficients are independent and non-identically log-normal distributed, with variances and means known at the transmitter and channel state information available at the receiver. Utilizing the existing space code design criterion for IM/DD MIMO-OWC with a maximum likelihood (ML) detector, we design a diversity-optimal space code (DOSC) that maximizes both large-scale diversity and small-scale diversity gains and prove that the spatial repetition code (RC) with a diversity-optimized power allocation is diversity-optimal among all the high dimensional nonnegative space code schemes under a commonly used optical power constraint. In addition, we show that one of significant advantages of the DOSC is to allow low-complexity ML detection. Simulation results indicate that in high signal-to-noise ratio (SNR) regimes, our proposed DOSC significantly outperforms RC, which is the best space code currently available for such system.
A study of power cycles using supercritical carbon dioxide as the working fluid
NASA Astrophysics Data System (ADS)
Schroder, Andrew Urban
A real fluid heat engine power cycle analysis code has been developed for analyzing the zero dimensional performance of a general recuperated, recompression, precompression supercritical carbon dioxide power cycle with reheat and a unique shaft configuration. With the proposed shaft configuration, several smaller compressor-turbine pairs could be placed inside of a pressure vessel in order to avoid high speed, high pressure rotating seals. The small compressor-turbine pairs would share some resemblance with a turbocharger assembly. Variation in fluid properties within the heat exchangers is taken into account by discretizing zero dimensional heat exchangers. The cycle analysis code allows for multiple reheat stages, as well as an option for the main compressor to be powered by a dedicated turbine or an electrical motor. Variation in performance with respect to design heat exchanger pressure drops and minimum temperature differences, precompressor pressure ratio, main compressor pressure ratio, recompression mass fraction, main compressor inlet pressure, and low temperature recuperator mass fraction have been explored throughout a range of each design parameter. Turbomachinery isentropic efficiencies are implemented and the sensitivity of the cycle performance and the optimal design parameters is explored. Sensitivity of the cycle performance and optimal design parameters is studied with respect to the minimum heat rejection temperature and the maximum heat addition temperature. A hybrid stochastic and gradient based optimization technique has been used to optimize critical design parameters for maximum engine thermal efficiency. A parallel design exploration mode was also developed in order to rapidly conduct the parameter sweeps in this design space exploration. A cycle thermal efficiency of 49.6% is predicted with a 320K [47°C] minimum temperature and 923K [650°C] maximum temperature. The real fluid heat engine power cycle analysis code was expanded to study a theoretical recuperated Lenoir cycle using supercritical carbon dioxide as the working fluid. The real fluid cycle analysis code was also enhanced to study a combined cycle engine cascade. Two engine cascade configurations were studied. The first consisted of a traditional open loop gas turbine, coupled with a series of recuperated, recompression, precompression supercritical carbon dioxide power cycles, with a predicted combined cycle thermal efficiency of 65.0% using a peak temperature of 1,890K [1,617°C]. The second configuration consisted of a hybrid natural gas powered solid oxide fuel cell and gas turbine, coupled with a series of recuperated, recompression, precompression supercritical carbon dioxide power cycles, with a predicted combined cycle thermal efficiency of 73.1%. Both configurations had a minimum temperature of 306K [33°C]. The hybrid stochastic and gradient based optimization technique was used to optimize all engine design parameters for each engine in the cascade such that the entire engine cascade achieved the maximum thermal efficiency. The parallel design exploration mode was also utilized in order to understand the impact of different design parameters on the overall engine cascade thermal efficiency. Two dimensional conjugate heat transfer (CHT) numerical simulations of a straight, equal height channel heat exchanger using supercritical carbon dioxide were conducted at various Reynolds numbers and channel lengths.
Optimization Based Efficiencies in First Order Reliability Analysis
NASA Technical Reports Server (NTRS)
Peck, Jeffrey A.; Mahadevan, Sankaran
2003-01-01
This paper develops a method for updating the gradient vector of the limit state function in reliability analysis using Broyden's rank one updating technique. In problems that use commercial code as a black box, the gradient calculations are usually done using a finite difference approach, which becomes very expensive for large system models. The proposed method replaces the finite difference gradient calculations in a standard first order reliability method (FORM) with Broyden's Quasi-Newton technique. The resulting algorithm of Broyden updates within a FORM framework (BFORM) is used to run several example problems, and the results compared to standard FORM results. It is found that BFORM typically requires fewer functional evaluations that FORM to converge to the same answer.
Secure positioning technique based on encrypted visible light map for smart indoor service
NASA Astrophysics Data System (ADS)
Lee, Yong Up; Jung, Gillyoung
2018-03-01
Indoor visible light (VL) positioning systems for smart indoor services are negatively affected by both cochannel interference from adjacent light sources and VL reception position irregularity in the three-dimensional (3-D) VL channel. A secure positioning methodology based on a two-dimensional (2-D) encrypted VL map is proposed, implemented in prototypes of the specific positioning system, and analyzed based on performance tests. The proposed positioning technique enhances the positioning performance by more than 21.7% compared to the conventional method in real VL positioning tests. Further, the pseudonoise code is found to be the optimal encryption key for secure VL positioning for this smart indoor service.
Performance of the OVERFLOW-MLP and LAURA-MLP CFD Codes on the NASA Ames 512 CPU Origin System
NASA Technical Reports Server (NTRS)
Taft, James R.
2000-01-01
The shared memory Multi-Level Parallelism (MLP) technique, developed last year at NASA Ames has been very successful in dramatically improving the performance of important NASA CFD codes. This new and very simple parallel programming technique was first inserted into the OVERFLOW production CFD code in FY 1998. The OVERFLOW-MLP code's parallel performance scaled linearly to 256 CPUs on the NASA Ames 256 CPU Origin 2000 system (steger). Overall performance exceeded 20.1 GFLOP/s, or about 4.5x the performance of a dedicated 16 CPU C90 system. All of this was achieved without any major modification to the original vector based code. The OVERFLOW-MLP code is now in production on the inhouse Origin systems as well as being used offsite at commercial aerospace companies. Partially as a result of this work, NASA Ames has purchased a new 512 CPU Origin 2000 system to further test the limits of parallel performance for NASA codes of interest. This paper presents the performance obtained from the latest optimization efforts on this machine for the LAURA-MLP and OVERFLOW-MLP codes. The Langley Aerothermodynamics Upwind Relaxation Algorithm (LAURA) code is a key simulation tool in the development of the next generation shuttle, interplanetary reentry vehicles, and nearly all "X" plane development. This code sustains about 4-5 GFLOP/s on a dedicated 16 CPU C90. At this rate, expected workloads would require over 100 C90 CPU years of computing over the next few calendar years. It is not feasible to expect that this would be affordable or available to the user community. Dramatic performance gains on cheaper systems are needed. This code is expected to be perhaps the largest consumer of NASA Ames compute cycles per run in the coming year.The OVERFLOW CFD code is extensively used in the government and commercial aerospace communities to evaluate new aircraft designs. It is one of the largest consumers of NASA supercomputing cycles and large simulations of highly resolved full aircraft are routinely undertaken. Typical large problems might require 100s of Cray C90 CPU hours to complete. The dramatic performance gains with the 256 CPU steger system are exciting. Obtaining results in hours instead of months is revolutionizing the way in which aircraft manufacturers are looking at future aircraft simulation work. Figure 2 below is a current state of the art plot of OVERFLOW-MLP performance on the 512 CPU Lomax system. As can be seen, the chart indicates that OVERFLOW-MLP continues to scale linearly with CPU count up to 512 CPUs on a large 35 million point full aircraft RANS simulation. At this point performance is such that a fully converged simulation of 2500 time steps is completed in less than 2 hours of elapsed time. Further work over the next few weeks will improve the performance of this code even further.The LAURA code has been converted to the MLP format as well. This code is currently being optimized for the 512 CPU system. Performance statistics indicate that the goal of 100 GFLOP/s will be achieved by year's end. This amounts to 20x the 16 CPU C90 result and strongly demonstrates the viability of the new parallel systems rapidly solving very large simulations in a production environment.
Bian, Liheng; Suo, Jinli; Chung, Jaebum; Ou, Xiaoze; Yang, Changhuei; Chen, Feng; Dai, Qionghai
2016-06-10
Fourier ptychographic microscopy (FPM) is a novel computational coherent imaging technique for high space-bandwidth product imaging. Mathematically, Fourier ptychographic (FP) reconstruction can be implemented as a phase retrieval optimization process, in which we only obtain low resolution intensity images corresponding to the sub-bands of the sample's high resolution (HR) spatial spectrum, and aim to retrieve the complex HR spectrum. In real setups, the measurements always suffer from various degenerations such as Gaussian noise, Poisson noise, speckle noise and pupil location error, which would largely degrade the reconstruction. To efficiently address these degenerations, we propose a novel FP reconstruction method under a gradient descent optimization framework in this paper. The technique utilizes Poisson maximum likelihood for better signal modeling, and truncated Wirtinger gradient for effective error removal. Results on both simulated data and real data captured using our laser-illuminated FPM setup show that the proposed method outperforms other state-of-the-art algorithms. Also, we have released our source code for non-commercial use.
Enhancing speech recognition using improved particle swarm optimization based hidden Markov model.
Selvaraj, Lokesh; Ganesan, Balakrishnan
2014-01-01
Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining (iii), vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency Cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extorted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations are created by selecting random code vectors from the training set for the codebooks for the genetic algorithm process and IP-HMM helps in doing the recognition. At this point the creativeness will be done in terms of one of the genetic operation crossovers. The proposed speech recognition technique offers 97.14% accuracy.
Błażej, Paweł; Wnȩtrzak, Małgorzata; Mackiewicz, Paweł
2016-12-01
One of theories explaining the present structure of canonical genetic code assumes that it was optimized to minimize harmful effects of amino acid replacements resulting from nucleotide substitutions and translational errors. A way to testify this concept is to find the optimal code under given criteria and compare it with the canonical genetic code. Unfortunately, the huge number of possible alternatives makes it impossible to find the optimal code using exhaustive methods in sensible time. Therefore, heuristic methods should be applied to search the space of possible solutions. Evolutionary algorithms (EA) seem to be ones of such promising approaches. This class of methods is founded both on mutation and crossover operators, which are responsible for creating and maintaining the diversity of candidate solutions. These operators possess dissimilar characteristics and consequently play different roles in the process of finding the best solutions under given criteria. Therefore, the effective searching for the potential solutions can be improved by applying both of them, especially when these operators are devised specifically for a given problem. To study this subject, we analyze the effectiveness of algorithms for various combinations of mutation and crossover probabilities under three models of the genetic code assuming different restrictions on its structure. To achieve that, we adapt the position based crossover operator for the most restricted model and develop a new type of crossover operator for the more general models. The applied fitness function describes costs of amino acid replacement regarding their polarity. Our results indicate that the usage of crossover operators can significantly improve the quality of the solutions. Moreover, the simulations with the crossover operator optimize the fitness function in the smaller number of generations than simulations without this operator. The optimal genetic codes without restrictions on their structure minimize the costs about 2.7 times better than the canonical genetic code. Interestingly, the optimal codes are dominated by amino acids characterized by polarity close to its average value for all amino acids. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Computer optimization of reactor-thermoelectric space power systems
NASA Technical Reports Server (NTRS)
Maag, W. L.; Finnegan, P. M.; Fishbach, L. H.
1973-01-01
A computer simulation and optimization code that has been developed for nuclear space power systems is described. The results of using this code to analyze two reactor-thermoelectric systems are presented.
Improved Evolutionary Programming with Various Crossover Techniques for Optimal Power Flow Problem
NASA Astrophysics Data System (ADS)
Tangpatiphan, Kritsana; Yokoyama, Akihiko
This paper presents an Improved Evolutionary Programming (IEP) for solving the Optimal Power Flow (OPF) problem, which is considered as a non-linear, non-smooth, and multimodal optimization problem in power system operation. The total generator fuel cost is regarded as an objective function to be minimized. The proposed method is an Evolutionary Programming (EP)-based algorithm with making use of various crossover techniques, normally applied in Real Coded Genetic Algorithm (RCGA). The effectiveness of the proposed approach is investigated on the IEEE 30-bus system with three different types of fuel cost functions; namely the quadratic cost curve, the piecewise quadratic cost curve, and the quadratic cost curve superimposed by sine component. These three cost curves represent the generator fuel cost functions with a simplified model and more accurate models of a combined-cycle generating unit and a thermal unit with value-point loading effect respectively. The OPF solutions by the proposed method and Pure Evolutionary Programming (PEP) are observed and compared. The simulation results indicate that IEP requires less computing time than PEP with better solutions in some cases. Moreover, the influences of important IEP parameters on the OPF solution are described in details.
Adams, Bradley J; Aschheim, Kenneth W
2016-01-01
Comparison of antemortem and postmortem dental records is a leading method of victim identification, especially for incidents involving a large number of decedents. This process may be expedited with computer software that provides a ranked list of best possible matches. This study provides a comparison of the most commonly used conventional coding and sorting algorithms used in the United States (WinID3) with a simplified coding format that utilizes an optimized sorting algorithm. The simplified system consists of seven basic codes and utilizes an optimized algorithm based largely on the percentage of matches. To perform this research, a large reference database of approximately 50,000 antemortem and postmortem records was created. For most disaster scenarios, the proposed simplified codes, paired with the optimized algorithm, performed better than WinID3 which uses more complex codes. The detailed coding system does show better performance with extremely large numbers of records and/or significant body fragmentation. © 2015 American Academy of Forensic Sciences.
Development of a Split Bitter-type Magnet System for Dusty Plasma Experiments
NASA Astrophysics Data System (ADS)
Bates, Evan; Romero-Talamas, Carlos A.; Birmingham, William J.; Rivera, William F.
2014-10-01
A 10 Tesla Bitter-type magnetic system is under development at the Dusty Plasma Laboratory of the University of Maryland, Baltimore County (UMBC). We present here an optimization technique that uses differential evolution to minimize the omhic heating produced by the coils, while constraining the magnetic field in the experimental volume. The code gives us the optimal dimensions for the coil system including: coil length, turn thickness, disks radii, resistance, and total current required for a constant magnetic field. Finite element parametric optimization is then used to establish the optimal design for water cooling holes. Placement of the cooling holes will also take into consideration the magnetic forces acting on the copper alloy disks to ensure the material strength is not compromised during operation. The proposed power and cooling water delivery subsystems for the coils are also presented. Upon completion and testing of the magnet system, planned experiments include the propagation of magnetized waves in dusty plasma crystals under various boundary conditions, and viscosity in rotational shear flow, among others.
Overlapped Fourier coding for optical aberration removal
Horstmeyer, Roarke; Ou, Xiaoze; Chung, Jaebum; Zheng, Guoan; Yang, Changhuei
2014-01-01
We present an imaging procedure that simultaneously optimizes a camera’s resolution and retrieves a sample’s phase over a sequence of snapshots. The technique, termed overlapped Fourier coding (OFC), first digitally pans a small aperture across a camera’s pupil plane with a spatial light modulator. At each aperture location, a unique image is acquired. The OFC algorithm then fuses these low-resolution images into a full-resolution estimate of the complex optical field incident upon the detector. Simultaneously, the algorithm utilizes redundancies within the acquired dataset to computationally estimate and remove unknown optical aberrations and system misalignments via simulated annealing. The result is an imaging system that can computationally overcome its optical imperfections to offer enhanced resolution, at the expense of taking multiple snapshots over time. PMID:25321982
DOE Office of Scientific and Technical Information (OSTI.GOV)
Machnes, S.; Institute for Theoretical Physics, University of Ulm, D-89069 Ulm; Sander, U.
2011-08-15
For paving the way to novel applications in quantum simulation, computation, and technology, increasingly large quantum systems have to be steered with high precision. It is a typical task amenable to numerical optimal control to turn the time course of pulses, i.e., piecewise constant control amplitudes, iteratively into an optimized shape. Here, we present a comparative study of optimal-control algorithms for a wide range of finite-dimensional applications. We focus on the most commonly used algorithms: GRAPE methods which update all controls concurrently, and Krotov-type methods which do so sequentially. Guidelines for their use are given and open research questions aremore » pointed out. Moreover, we introduce a unifying algorithmic framework, DYNAMO (dynamic optimization platform), designed to provide the quantum-technology community with a convenient matlab-based tool set for optimal control. In addition, it gives researchers in optimal-control techniques a framework for benchmarking and comparing newly proposed algorithms with the state of the art. It allows a mix-and-match approach with various types of gradients, update and step-size methods as well as subspace choices. Open-source code including examples is made available at http://qlib.info.« less
Integrated design optimization research and development in an industrial environment
NASA Astrophysics Data System (ADS)
Kumar, V.; German, Marjorie D.; Lee, S.-J.
1989-04-01
An overview is given of a design optimization project that is in progress at the GE Research and Development Center for the past few years. The objective of this project is to develop a methodology and a software system for design automation and optimization of structural/mechanical components and systems. The effort focuses on research and development issues and also on optimization applications that can be related to real-life industrial design problems. The overall technical approach is based on integration of numerical optimization techniques, finite element methods, CAE and software engineering, and artificial intelligence/expert systems (AI/ES) concepts. The role of each of these engineering technologies in the development of a unified design methodology is illustrated. A software system DESIGN-OPT has been developed for both size and shape optimization of structural components subjected to static as well as dynamic loadings. By integrating this software with an automatic mesh generator, a geometric modeler and an attribute specification computer code, a software module SHAPE-OPT has been developed for shape optimization. Details of these software packages together with their applications to some 2- and 3-dimensional design problems are described.
Integrated design optimization research and development in an industrial environment
NASA Technical Reports Server (NTRS)
Kumar, V.; German, Marjorie D.; Lee, S.-J.
1989-01-01
An overview is given of a design optimization project that is in progress at the GE Research and Development Center for the past few years. The objective of this project is to develop a methodology and a software system for design automation and optimization of structural/mechanical components and systems. The effort focuses on research and development issues and also on optimization applications that can be related to real-life industrial design problems. The overall technical approach is based on integration of numerical optimization techniques, finite element methods, CAE and software engineering, and artificial intelligence/expert systems (AI/ES) concepts. The role of each of these engineering technologies in the development of a unified design methodology is illustrated. A software system DESIGN-OPT has been developed for both size and shape optimization of structural components subjected to static as well as dynamic loadings. By integrating this software with an automatic mesh generator, a geometric modeler and an attribute specification computer code, a software module SHAPE-OPT has been developed for shape optimization. Details of these software packages together with their applications to some 2- and 3-dimensional design problems are described.
NASA Astrophysics Data System (ADS)
Fragkoulis, Alexandros; Kondi, Lisimachos P.; Parsopoulos, Konstantinos E.
2015-03-01
We propose a method for the fair and efficient allocation of wireless resources over a cognitive radio system network to transmit multiple scalable video streams to multiple users. The method exploits the dynamic architecture of the Scalable Video Coding extension of the H.264 standard, along with the diversity that OFDMA networks provide. We use a game-theoretic Nash Bargaining Solution (NBS) framework to ensure that each user receives the minimum video quality requirements, while maintaining fairness over the cognitive radio system. An optimization problem is formulated, where the objective is the maximization of the Nash product while minimizing the waste of resources. The problem is solved by using a Swarm Intelligence optimizer, namely Particle Swarm Optimization. Due to the high dimensionality of the problem, we also introduce a dimension-reduction technique. Our experimental results demonstrate the fairness imposed by the employed NBS framework.
Coding Strategies and Implementations of Compressive Sensing
NASA Astrophysics Data System (ADS)
Tsai, Tsung-Han
This dissertation studies the coding strategies of computational imaging to overcome the limitation of conventional sensing techniques. The information capacity of conventional sensing is limited by the physical properties of optics, such as aperture size, detector pixels, quantum efficiency, and sampling rate. These parameters determine the spatial, depth, spectral, temporal, and polarization sensitivity of each imager. To increase sensitivity in any dimension can significantly compromise the others. This research implements various coding strategies subject to optical multidimensional imaging and acoustic sensing in order to extend their sensing abilities. The proposed coding strategies combine hardware modification and signal processing to exploiting bandwidth and sensitivity from conventional sensors. We discuss the hardware architecture, compression strategies, sensing process modeling, and reconstruction algorithm of each sensing system. Optical multidimensional imaging measures three or more dimensional information of the optical signal. Traditional multidimensional imagers acquire extra dimensional information at the cost of degrading temporal or spatial resolution. Compressive multidimensional imaging multiplexes the transverse spatial, spectral, temporal, and polarization information on a two-dimensional (2D) detector. The corresponding spectral, temporal and polarization coding strategies adapt optics, electronic devices, and designed modulation techniques for multiplex measurement. This computational imaging technique provides multispectral, temporal super-resolution, and polarization imaging abilities with minimal loss in spatial resolution and noise level while maintaining or gaining higher temporal resolution. The experimental results prove that the appropriate coding strategies may improve hundreds times more sensing capacity. Human auditory system has the astonishing ability in localizing, tracking, and filtering the selected sound sources or information from a noisy environment. Using engineering efforts to accomplish the same task usually requires multiple detectors, advanced computational algorithms, or artificial intelligence systems. Compressive acoustic sensing incorporates acoustic metamaterials in compressive sensing theory to emulate the abilities of sound localization and selective attention. This research investigates and optimizes the sensing capacity and the spatial sensitivity of the acoustic sensor. The well-modeled acoustic sensor allows localizing multiple speakers in both stationary and dynamic auditory scene; and distinguishing mixed conversations from independent sources with high audio recognition rate.
A joint source-channel distortion model for JPEG compressed images.
Sabir, Muhammad F; Sheikh, Hamid Rahim; Heath, Robert W; Bovik, Alan C
2006-06-01
The need for efficient joint source-channel coding (JSCC) is growing as new multimedia services are introduced in commercial wireless communication systems. An important component of practical JSCC schemes is a distortion model that can predict the quality of compressed digital multimedia such as images and videos. The usual approach in the JSCC literature for quantifying the distortion due to quantization and channel errors is to estimate it for each image using the statistics of the image for a given signal-to-noise ratio (SNR). This is not an efficient approach in the design of real-time systems because of the computational complexity. A more useful and practical approach would be to design JSCC techniques that minimize average distortion for a large set of images based on some distortion model rather than carrying out per-image optimizations. However, models for estimating average distortion due to quantization and channel bit errors in a combined fashion for a large set of images are not available for practical image or video coding standards employing entropy coding and differential coding. This paper presents a statistical model for estimating the distortion introduced in progressive JPEG compressed images due to quantization and channel bit errors in a joint manner. Statistical modeling of important compression techniques such as Huffman coding, differential pulse-coding modulation, and run-length coding are included in the model. Examples show that the distortion in terms of peak signal-to-noise ratio (PSNR) can be predicted within a 2-dB maximum error over a variety of compression ratios and bit-error rates. To illustrate the utility of the proposed model, we present an unequal power allocation scheme as a simple application of our model. Results show that it gives a PSNR gain of around 6.5 dB at low SNRs, as compared to equal power allocation.
Automatic translation of MPI source into a latency-tolerant, data-driven form
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric
Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety ofmore » applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.« less
Automatic translation of MPI source into a latency-tolerant, data-driven form
Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric; ...
2017-03-06
Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety ofmore » applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.« less
Automatic translation of MPI source into a latency-tolerant, data-driven form
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric
Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. Bamboo reformulates MPI source into the form of a task dependency graph that expresses a partial ordering among tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotationmore » for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo's performance meets or exceeds that of labor-intensive hand coding. The translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a wellknown library.« less
GPU accelerated manifold correction method for spinning compact binaries
NASA Astrophysics Data System (ADS)
Ran, Chong-xi; Liu, Song; Zhong, Shuang-ying
2018-04-01
The graphics processing unit (GPU) acceleration of the manifold correction algorithm based on the compute unified device architecture (CUDA) technology is designed to simulate the dynamic evolution of the Post-Newtonian (PN) Hamiltonian formulation of spinning compact binaries. The feasibility and the efficiency of parallel computation on GPU have been confirmed by various numerical experiments. The numerical comparisons show that the accuracy on GPU execution of manifold corrections method has a good agreement with the execution of codes on merely central processing unit (CPU-based) method. The acceleration ability when the codes are implemented on GPU can increase enormously through the use of shared memory and register optimization techniques without additional hardware costs, implying that the speedup is nearly 13 times as compared with the codes executed on CPU for phase space scan (including 314 × 314 orbits). In addition, GPU-accelerated manifold correction method is used to numerically study how dynamics are affected by the spin-induced quadrupole-monopole interaction for black hole binary system.
Hsieh, Sheng-Hsun; Li, Yung-Hui; Tien, Chung-Hao; Chang, Chin-Chen
2016-12-01
Iris recognition has gained increasing popularity over the last few decades; however, the stand-off distance in a conventional iris recognition system is too short, which limits its application. In this paper, we propose a novel hardware-software hybrid method to increase the stand-off distance in an iris recognition system. When designing the system hardware, we use an optimized wavefront coding technique to extend the depth of field. To compensate for the blurring of the image caused by wavefront coding, on the software side, the proposed system uses a local patch-based super-resolution method to restore the blurred image to its clear version. The collaborative effect of the new hardware design and software post-processing showed great potential in our experiment. The experimental results showed that such improvement cannot be achieved by using a hardware-or software-only design. The proposed system can increase the capture volume of a conventional iris recognition system by three times and maintain the system's high recognition rate.
New technologies for advanced three-dimensional optimum shape design in aeronautics
NASA Astrophysics Data System (ADS)
Dervieux, Alain; Lanteri, Stéphane; Malé, Jean-Michel; Marco, Nathalie; Rostaing-Schmidt, Nicole; Stoufflet, Bruno
1999-05-01
The analysis of complex flows around realistic aircraft geometries is becoming more and more predictive. In order to obtain this result, the complexity of flow analysis codes has been constantly increasing, involving more refined fluid models and sophisticated numerical methods. These codes can only run on top computers, exhausting their memory and CPU capabilities. It is, therefore, difficult to introduce best analysis codes in a shape optimization loop: most previous works in the optimum shape design field used only simplified analysis codes. Moreover, as the most popular optimization methods are the gradient-based ones, the more complex the flow solver, the more difficult it is to compute the sensitivity code. However, emerging technologies are contributing to make such an ambitious project, of including a state-of-the-art flow analysis code into an optimisation loop, feasible. Among those technologies, there are three important issues that this paper wishes to address: shape parametrization, automated differentiation and parallel computing. Shape parametrization allows faster optimization by reducing the number of design variable; in this work, it relies on a hierarchical multilevel approach. The sensitivity code can be obtained using automated differentiation. The automated approach is based on software manipulation tools, which allow the differentiation to be quick and the resulting differentiated code to be rather fast and reliable. In addition, the parallel algorithms implemented in this work allow the resulting optimization software to run on increasingly larger geometries. Copyright
Binary video codec for data reduction in wireless visual sensor networks
NASA Astrophysics Data System (ADS)
Khursheed, Khursheed; Ahmad, Naeem; Imran, Muhammad; O'Nils, Mattias
2013-02-01
Wireless Visual Sensor Networks (WVSN) is formed by deploying many Visual Sensor Nodes (VSNs) in the field. Typical applications of WVSN include environmental monitoring, health care, industrial process monitoring, stadium/airports monitoring for security reasons and many more. The energy budget in the outdoor applications of WVSN is limited to the batteries and the frequent replacement of batteries is usually not desirable. So the processing as well as the communication energy consumption of the VSN needs to be optimized in such a way that the network remains functional for longer duration. The images captured by VSN contain huge amount of data and require efficient computational resources for processing the images and wide communication bandwidth for the transmission of the results. Image processing algorithms must be designed and developed in such a way that they are computationally less complex and must provide high compression rate. For some applications of WVSN, the captured images can be segmented into bi-level images and hence bi-level image coding methods will efficiently reduce the information amount in these segmented images. But the compression rate of the bi-level image coding methods is limited by the underlined compression algorithm. Hence there is a need for designing other intelligent and efficient algorithms which are computationally less complex and provide better compression rate than that of bi-level image coding methods. Change coding is one such algorithm which is computationally less complex (require only exclusive OR operations) and provide better compression efficiency compared to image coding but it is effective for applications having slight changes between adjacent frames of the video. The detection and coding of the Region of Interest (ROIs) in the change frame efficiently reduce the information amount in the change frame. But, if the number of objects in the change frames is higher than a certain level then the compression efficiency of both the change coding and ROI coding becomes worse than that of image coding. This paper explores the compression efficiency of the Binary Video Codec (BVC) for the data reduction in WVSN. We proposed to implement all the three compression techniques i.e. image coding, change coding and ROI coding at the VSN and then select the smallest bit stream among the results of the three compression techniques. In this way the compression performance of the BVC will never become worse than that of image coding. We concluded that the compression efficiency of BVC is always better than that of change coding and is always better than or equal that of ROI coding and image coding.
Bai, Mingsian R; Hsieh, Ping-Ju; Hur, Kur-Nan
2009-02-01
The performance of the minimum mean-square error noise reduction (MMSE-NR) algorithm in conjunction with time-recursive averaging (TRA) for noise estimation is found to be very sensitive to the choice of two recursion parameters. To address this problem in a more systematic manner, this paper proposes an optimization method to efficiently search the optimal parameters of the MMSE-TRA-NR algorithms. The objective function is based on a regression model, whereas the optimization process is carried out with the simulated annealing algorithm that is well suited for problems with many local optima. Another NR algorithm proposed in the paper employs linear prediction coding as a preprocessor for extracting the correlated portion of human speech. Objective and subjective tests were undertaken to compare the optimized MMSE-TRA-NR algorithm with several conventional NR algorithms. The results of subjective tests were processed by using analysis of variance to justify the statistic significance. A post hoc test, Tukey's Honestly Significant Difference, was conducted to further assess the pairwise difference between the NR algorithms.
Launch Vehicle Propulsion Design with Multiple Selection Criteria
NASA Technical Reports Server (NTRS)
Shelton, Joey D.; Frederick, Robert A.; Wilhite, Alan W.
2005-01-01
The approach and techniques described herein define an optimization and evaluation approach for a liquid hydrogen/liquid oxygen single-stage-to-orbit system. The method uses Monte Carlo simulations, genetic algorithm solvers, a propulsion thermo-chemical code, power series regression curves for historical data, and statistical models in order to optimize a vehicle system. The system, including parameters for engine chamber pressure, area ratio, and oxidizer/fuel ratio, was modeled and optimized to determine the best design for seven separate design weight and cost cases by varying design and technology parameters. Significant model results show that a 53% increase in Design, Development, Test and Evaluation cost results in a 67% reduction in Gross Liftoff Weight. Other key findings show the sensitivity of propulsion parameters, technology factors, and cost factors and how these parameters differ when cost and weight are optimized separately. Each of the three key propulsion parameters; chamber pressure, area ratio, and oxidizer/fuel ratio, are optimized in the seven design cases and results are plotted to show impacts to engine mass and overall vehicle mass.
Multidisciplinary Analysis and Optimal Design: As Easy as it Sounds?
NASA Technical Reports Server (NTRS)
Moore, Greg; Chainyk, Mike; Schiermeier, John
2004-01-01
The viewgraph presentation examines optimal design for precision, large aperture structures. Discussion focuses on aspects of design optimization, code architecture and current capabilities, and planned activities and collaborative area suggestions. The discussion of design optimization examines design sensitivity analysis; practical considerations; and new analytical environments including finite element-based capability for high-fidelity multidisciplinary analysis, design sensitivity, and optimization. The discussion of code architecture and current capabilities includes basic thermal and structural elements, nonlinear heat transfer solutions and process, and optical modes generation.
Static Memory Deduplication for Performance Optimization in Cloud Computing.
Jia, Gangyong; Han, Guangjie; Wang, Hao; Yang, Xuan
2017-04-27
In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD) technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible.
Static Memory Deduplication for Performance Optimization in Cloud Computing
Jia, Gangyong; Han, Guangjie; Wang, Hao; Yang, Xuan
2017-01-01
In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD) technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible. PMID:28448434
NASA Technical Reports Server (NTRS)
Stahara, S. S.
1984-01-01
An investigation was carried out to complete the preliminary development of a combined perturbation/optimization procedure and associated computational code for designing optimized blade-to-blade profiles of turbomachinery blades. The overall purpose of the procedures developed is to provide demonstration of a rapid nonlinear perturbation method for minimizing the computational requirements associated with parametric design studies of turbomachinery flows. The method combines the multiple parameter nonlinear perturbation method, successfully developed in previous phases of this study, with the NASA TSONIC blade-to-blade turbomachinery flow solver, and the COPES-CONMIN optimization procedure into a user's code for designing optimized blade-to-blade surface profiles of turbomachinery blades. Results of several design applications and a documented version of the code together with a user's manual are provided.
Micrometeoroid and Orbital Debris Risk Assessment With Bumper 3
NASA Technical Reports Server (NTRS)
Hyde, J.; Bjorkman, M.; Christiansen, E.; Lear, D.
2017-01-01
The Bumper 3 computer code is the primary tool used by NASA for micrometeoroid and orbital debris (MMOD) risk analysis. Bumper 3 (and its predecessors) have been used to analyze a variety of manned and unmanned spacecraft. The code uses NASA's latest micrometeoroid (MEM-R2) and orbital debris (ORDEM 3.0) environment definition models and is updated frequently with ballistic limit equations that describe the hypervelocity impact performance of spacecraft materials. The Bumper 3 program uses these inputs along with a finite element representation of spacecraft geometry to provide a deterministic calculation of the expected number of failures. The Bumper 3 software is configuration controlled by the NASA/JSC Hypervelocity Impact Technology (HVIT) Group. This paper will demonstrate MMOD risk assessment techniques with Bumper 3 used by NASA's HVIT Group. The Permanent Multipurpose Module (PMM) was added to the International Space Station in 2011. A Bumper 3 MMOD risk assessment of this module will show techniques used to create the input model and assign the property IDs. The methodology used to optimize the MMOD shielding for minimum mass while still meeting structural penetration requirements will also be demonstrated.
Magnetically launched flyer plate technique for probing electrical conductivity of compressed copper
NASA Astrophysics Data System (ADS)
Cochrane, K. R.; Lemke, R. W.; Riford, Z.; Carpenter, J. H.
2016-03-01
The electrical conductivity of materials under extremes of temperature and pressure is of crucial importance for a wide variety of phenomena, including planetary modeling, inertial confinement fusion, and pulsed power based dynamic materials experiments. There is a dearth of experimental techniques and data for highly compressed materials, even at known states such as along the principal isentrope and Hugoniot, where many pulsed power experiments occur. We present a method for developing, calibrating, and validating material conductivity models as used in magnetohydrodynamic (MHD) simulations. The difficulty in calibrating a conductivity model is in knowing where the model should be modified. Our method isolates those regions that will have an impact. It also quantitatively prioritizes which regions will have the most beneficial impact. Finally, it tracks the quantitative improvements to the conductivity model during each incremental adjustment. In this paper, we use an experiment on Sandia National Laboratories Z-machine to isentropically launch multiple flyer plates and, with the MHD code ALEGRA and the optimization code DAKOTA, calibrated the conductivity such that we matched an experimental figure of merit to +/-1%.
Magnetically launched flyer plate technique for probing electrical conductivity of compressed copper
Cochrane, Kyle R.; Lemke, Raymond W.; Riford, Z.; ...
2016-03-11
The electrical conductivity of materials under extremes of temperature and pressure is of crucial importance for a wide variety of phenomena, including planetary modeling, inertial confinement fusion, and pulsed power based dynamic materialsexperiments. There is a dearth of experimental techniques and data for highly compressed materials, even at known states such as along the principal isentrope and Hugoniot, where many pulsed power experiments occur. We present a method for developing, calibrating, and validating material conductivity models as used in magnetohydrodynamic(MHD) simulations. The difficulty in calibrating a conductivity model is in knowing where the model should be modified. Our method isolatesmore » those regions that will have an impact. It also quantitatively prioritizes which regions will have the most beneficial impact. Finally, it tracks the quantitative improvements to the conductivity model during each incremental adjustment. In this study, we use an experiment on Sandia National Laboratories Z-machine to isentropically launch multiple flyer plates and, with the MHD code ALEGRA and the optimization code DAKOTA, calibrated the conductivity such that we matched an experimental figure of merit to +/–1%.« less
Davis, J P; Akella, S; Waddell, P H
2004-01-01
Having greater computational power on the desktop for processing taxa data sets has been a dream of biologists/statisticians involved in phylogenetics data analysis. Many existing algorithms have been highly optimized-one example being Felsenstein's PHYLIP code, written in C, for UPGMA and neighbor joining algorithms. However, the ability to process more than a few tens of taxa in a reasonable amount of time using conventional computers has not yielded a satisfactory speedup in data processing, making it difficult for phylogenetics practitioners to quickly explore data sets-such as might be done from a laptop computer. We discuss the application of custom computing techniques to phylogenetics. In particular, we apply this technology to speed up UPGMA algorithm execution by a factor of a hundred, against that of PHYLIP code running on the same PC. We report on these experiments and discuss how custom computing techniques can be used to not only accelerate phylogenetics algorithm performance on the desktop, but also on larger, high-performance computing engines, thus enabling the high-speed processing of data sets involving thousands of taxa.
Visser, Marco D.; McMahon, Sean M.; Merow, Cory; Dixon, Philip M.; Record, Sydne; Jongejans, Eelke
2015-01-01
Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research. PMID:25811842
Growth of zinc selenide single crystals by physical vapor transport in microgravity
NASA Technical Reports Server (NTRS)
Rosenberger, Franz
1993-01-01
The goals of this research were the optimization of growth parameters for large (20 mm diameter and length) zinc selenide single crystals with low structural defect density, and the development of a 3-D numerical model for the transport rates to be expected in physical vapor transport under a given set of thermal and geometrical boundary conditions, in order to provide guidance for an advantageous conduct of the growth experiments. In the crystal growth studies, it was decided to exclusively apply the Effusive Ampoule PVT technique (EAPVT) to the growth of ZnSe. In this technique, the accumulation of transport-limiting gaseous components at the growing crystal is suppressed by continuous effusion to vacuum of part of the vapor contents. This is achieved through calibrated leaks in one of the ground joints of the ampoule. Regarding the PVT transport rates, a 3-D spectral code was modified. After introduction of the proper boundary conditions and subroutines for the composition-dependent transport properties, the code reproduced the experimentally determined transport rates for the two cases with strongest convective flux contributions to within the experimental and numerical error.
Data compression for satellite images
NASA Technical Reports Server (NTRS)
Chen, P. H.; Wintz, P. A.
1976-01-01
An efficient data compression system is presented for satellite pictures and two grey level pictures derived from satellite pictures. The compression techniques take advantages of the correlation between adjacent picture elements. Several source coding methods are investigated. Double delta coding is presented and shown to be the most efficient. Both predictive differential quantizing technique and double delta coding can be significantly improved by applying a background skipping technique. An extension code is constructed. This code requires very little storage space and operates efficiently. Simulation results are presented for various coding schemes and source codes.
TRO-2D - A code for rational transonic aerodynamic optimization
NASA Technical Reports Server (NTRS)
Davis, W. H., Jr.
1985-01-01
Features and sample applications of the transonic rational optimization (TRO-2D) code are outlined. TRO-2D includes the airfoil analysis code FLO-36, the CONMIN optimization code and a rational approach to defining aero-function shapes for geometry modification. The program is part of an effort to develop an aerodynamically smart optimizer that will simplify and shorten the design process. The user has a selection of drag minimization and associated minimum lift, moment, and the pressure distribution, a choice among 14 resident aero-function shapes, and options on aerodynamic and geometric constraints. Design variables such as the angle of attack, leading edge radius and camber, shock strength and movement, supersonic pressure plateau control, etc., are discussed. The results of calculations of a reduced leading edge camber transonic airfoil and an airfoil with a natural laminar flow are provided, showing that only four design variables need be specified to obtain satisfactory results.
3D Ultrasonic Wave Simulations for Structural Health Monitoring
NASA Technical Reports Server (NTRS)
Campbell, Leckey Cara A/; Miler, Corey A.; Hinders, Mark K.
2011-01-01
Structural health monitoring (SHM) for the detection of damage in aerospace materials is an important area of research at NASA. Ultrasonic guided Lamb waves are a promising SHM damage detection technique since the waves can propagate long distances. For complicated flaw geometries experimental signals can be difficult to interpret. High performance computing can now handle full 3-dimensional (3D) simulations of elastic wave propagation in materials. We have developed and implemented parallel 3D elastodynamic finite integration technique (3D EFIT) code to investigate ultrasound scattering from flaws in materials. EFIT results have been compared to experimental data and the simulations provide unique insight into details of the wave behavior. This type of insight is useful for developing optimized experimental SHM techniques. 3D EFIT can also be expanded to model wave propagation and scattering in anisotropic composite materials.
Nuclear fuel management optimization using genetic algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeChaine, M.D.; Feltus, M.A.
1995-07-01
The code independent genetic algorithm reactor optimization (CIGARO) system has been developed to optimize nuclear reactor loading patterns. It uses genetic algorithms (GAs) and a code-independent interface, so any reactor physics code (e.g., CASMO-3/SIMULATE-3) can be used to evaluate the loading patterns. The system is compared to other GA-based loading pattern optimizers. Tests were carried out to maximize the beginning of cycle k{sub eff} for a pressurized water reactor core loading with a penalty function to limit power peaking. The CIGARO system performed well, increasing the k{sub eff} after lowering the peak power. Tests of a prototype parallel evaluation methodmore » showed the potential for a significant speedup.« less
1981-12-01
file.library-unit{.subunit).SYMAP Statement Map: library-file. library-unit.subunit).SMAP Type Map: 1 ibrary.fi le. 1 ibrary-unit{.subunit). TMAP The library...generator SYMAP Symbol Map code generator SMAP Updated Statement Map code generator TMAP Type Map code generator A.3.5 The PUNIT Command The P UNIT...Core.Stmtmap) NAME Tmap (Core.Typemap) END Example A-3 Compiler Command Stream for the Code Generator Texas Instruments A-5 Ada Optimizing Compiler
NASA Astrophysics Data System (ADS)
Athaudage, Chandranath R. N.; Bradley, Alan B.; Lech, Margaret
2003-12-01
A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech and its application to low-rate speech coding in storage and broadcasting is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, the event localization was performed based on a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance on the optimality of the event locations. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that an improved TD model accuracy can be achieved. A methodology of incorporating the optimized TD algorithm within the standard MELP speech coder for the efficient compression of speech spectral information is also presented. The performance evaluation results revealed that the proposed speech coding scheme achieves 50%-60% compression of speech spectral information with negligible degradation in the decoded speech quality.
Heuristic rules embedded genetic algorithm for in-core fuel management optimization
NASA Astrophysics Data System (ADS)
Alim, Fatih
The objective of this study was to develop a unique methodology and a practical tool for designing loading pattern (LP) and burnable poison (BP) pattern for a given Pressurized Water Reactor (PWR) core. Because of the large number of possible combinations for the fuel assembly (FA) loading in the core, the design of the core configuration is a complex optimization problem. It requires finding an optimal FA arrangement and BP placement in order to achieve maximum cycle length while satisfying the safety constraints. Genetic Algorithms (GA) have been already used to solve this problem for LP optimization for both PWR and Boiling Water Reactor (BWR). The GA, which is a stochastic method works with a group of solutions and uses random variables to make decisions. Based on the theories of evaluation, the GA involves natural selection and reproduction of the individuals in the population for the next generation. The GA works by creating an initial population, evaluating it, and then improving the population by using the evaluation operators. To solve this optimization problem, a LP optimization package, GARCO (Genetic Algorithm Reactor Code Optimization) code is developed in the framework of this thesis. This code is applicable for all types of PWR cores having different geometries and structures with an unlimited number of FA types in the inventory. To reach this goal, an innovative GA is developed by modifying the classical representation of the genotype. To obtain the best result in a shorter time, not only the representation is changed but also the algorithm is changed to use in-core fuel management heuristics rules. The improved GA code was tested to demonstrate and verify the advantages of the new enhancements. The developed methodology is explained in this thesis and preliminary results are shown for the VVER-1000 reactor hexagonal geometry core and the TMI-1 PWR. The improved GA code was tested to verify the advantages of new enhancements. The core physics code used for VVER in this research is Moby-Dick, which was developed to analyze the VVER by SKODA Inc. The SIMULATE-3 code, which is an advanced two-group nodal code, is used to analyze the TMI-1.
Optimized scalar promotion with load and splat SIMD instructions
Eichenberger, Alexander E; Gschwind, Michael K; Gunnels, John A
2013-10-29
Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
Optimized scalar promotion with load and splat SIMD instructions
Eichenberger, Alexandre E [Chappaqua, NY; Gschwind, Michael K [Chappaqua, NY; Gunnels, John A [Yorktown Heights, NY
2012-08-28
Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
Locality Aware Concurrent Start for Stencil Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, Sunil; Gao, Guang R.; Manzano Franco, Joseph B.
Stencil computations are at the heart of many physical simulations used in scientific codes. Thus, there exists a plethora of optimization efforts for this family of computations. Among these techniques, tiling techniques that allow concurrent start have proven to be very efficient in providing better performance for these critical kernels. Nevertheless, with many core designs being the norm, these optimization techniques might not be able to fully exploit locality (both spatial and temporal) on multiple levels of the memory hierarchy without compromising parallelism. It is no longer true that the machine can be seen as a homogeneous collection of nodesmore » with caches, main memory and an interconnect network. New architectural designs exhibit complex grouping of nodes, cores, threads, caches and memory connected by an ever evolving network-on-chip design. These new designs may benefit greatly from carefully crafted schedules and groupings that encourage parallel actors (i.e. threads, cores or nodes) to be aware of the computational history of other actors in close proximity. In this paper, we provide an efficient tiling technique that allows hierarchical concurrent start for memory hierarchy aware tile groups. Each execution schedule and tile shape exploit the available parallelism, load balance and locality present in the given applications. We demonstrate our technique on the Intel Xeon Phi architecture with selected and representative stencil kernels. We show improvement ranging from 5.58% to 31.17% over existing state-of-the-art techniques.« less
NASA Astrophysics Data System (ADS)
Luo, Qiankun; Wu, Jianfeng; Yang, Yun; Qian, Jiazhong; Wu, Jichun
2014-11-01
This study develops a new probabilistic multi-objective fast harmony search algorithm (PMOFHS) for optimal design of groundwater remediation systems under uncertainty associated with the hydraulic conductivity (K) of aquifers. The PMOFHS integrates the previously developed deterministic multi-objective optimization method, namely multi-objective fast harmony search algorithm (MOFHS) with a probabilistic sorting technique to search for Pareto-optimal solutions to multi-objective optimization problems in a noisy hydrogeological environment arising from insufficient K data. The PMOFHS is then coupled with the commonly used flow and transport codes, MODFLOW and MT3DMS, to identify the optimal design of groundwater remediation systems for a two-dimensional hypothetical test problem and a three-dimensional Indiana field application involving two objectives: (i) minimization of the total remediation cost through the engineering planning horizon, and (ii) minimization of the mass remaining in the aquifer at the end of the operational period, whereby the pump-and-treat (PAT) technology is used to clean up contaminated groundwater. Also, Monte Carlo (MC) analysis is employed to evaluate the effectiveness of the proposed methodology. Comprehensive analysis indicates that the proposed PMOFHS can find Pareto-optimal solutions with low variability and high reliability and is a potentially effective tool for optimizing multi-objective groundwater remediation problems under uncertainty.
Some practical universal noiseless coding techniques, part 3, module PSl14,K+
NASA Technical Reports Server (NTRS)
Rice, Robert F.
1991-01-01
The algorithmic definitions, performance characterizations, and application notes for a high-performance adaptive noiseless coding module are provided. Subsets of these algorithms are currently under development in custom very large scale integration (VLSI) at three NASA centers. The generality of coding algorithms recently reported is extended. The module incorporates a powerful adaptive noiseless coder for Standard Data Sources (i.e., sources whose symbols can be represented by uncorrelated non-negative integers, where smaller integers are more likely than the larger ones). Coders can be specified to provide performance close to the data entropy over any desired dynamic range (of entropy) above 0.75 bit/sample. This is accomplished by adaptively choosing the best of many efficient variable-length coding options to use on each short block of data (e.g., 16 samples) All code options used for entropies above 1.5 bits/sample are 'Huffman Equivalent', but they require no table lookups to implement. The coding can be performed directly on data that have been preprocessed to exhibit the characteristics of a standard source. Alternatively, a built-in predictive preprocessor can be used where applicable. This built-in preprocessor includes the familiar 1-D predictor followed by a function that maps the prediction error sequences into the desired standard form. Additionally, an external prediction can be substituted if desired. A broad range of issues dealing with the interface between the coding module and the data systems it might serve are further addressed. These issues include: multidimensional prediction, archival access, sensor noise, rate control, code rate improvements outside the module, and the optimality of certain internal code options.
A novel bit-wise adaptable entropy coding technique
NASA Technical Reports Server (NTRS)
Kiely, A.; Klimesh, M.
2001-01-01
We present a novel entropy coding technique which is adaptable in that each bit to be encoded may have an associated probability esitmate which depends on previously encoded bits. The technique may have advantages over arithmetic coding. The technique can achieve arbitrarily small redundancy and admits a simple and fast decoder.
Optimal block cosine transform image coding for noisy channels
NASA Technical Reports Server (NTRS)
Vaishampayan, V.; Farvardin, N.
1986-01-01
The two dimensional block transform coding scheme based on the discrete cosine transform was studied extensively for image coding applications. While this scheme has proven to be efficient in the absence of channel errors, its performance degrades rapidly over noisy channels. A method is presented for the joint source channel coding optimization of a scheme based on the 2-D block cosine transform when the output of the encoder is to be transmitted via a memoryless design of the quantizers used for encoding the transform coefficients. This algorithm produces a set of locally optimum quantizers and the corresponding binary code assignment for the assumed transform coefficient statistics. To determine the optimum bit assignment among the transform coefficients, an algorithm was used based on the steepest descent method, which under certain convexity conditions on the performance of the channel optimized quantizers, yields the optimal bit allocation. Comprehensive simulation results for the performance of this locally optimum system over noisy channels were obtained and appropriate comparisons against a reference system designed for no channel error were rendered.
Numerical optimization of three-dimensional coils for NSTX-U
Lazerson, S. A.; Park, J. -K.; Logan, N.; ...
2015-09-03
A tool for the calculation of optimal three-dimensional (3D) perturbative magnetic fields in tokamaks has been developed. The IPECOPT code builds upon the stellarator optimization code STELLOPT to allow for optimization of linear ideal magnetohydrodynamic perturbed equilibrium (IPEC). This tool has been applied to NSTX-U equilibria, addressing which fields are the most effective at driving NTV torques. The NTV torque calculation is performed by the PENT code. Optimization of the normal field spectrum shows that fields with n = 1 character can drive a large core torque. It is also shown that fields with n = 3 features are capablemore » of driving edge torque and some core torque. Coil current optimization (using the planned in-vessel and existing RWM coils) on NSTX-U suggest the planned coils set is adequate for core and edge torque control. In conclusion, comparison between error field correction experiments on DIII-D and the optimizer show good agreement.« less
NASA Technical Reports Server (NTRS)
Baecher, Juergen; Bandte, Oliver; DeLaurentis, Dan; Lewis, Kemper; Sicilia, Jose; Soboleski, Craig
1995-01-01
This report documents the efforts of a Georgia Tech High Speed Civil Transport (HSCT) aerospace student design team in completing a design methodology demonstration under NASA's Advanced Design Program (ADP). Aerodynamic and propulsion analyses are integrated into the synthesis code FLOPS in order to improve its prediction accuracy. Executing the integrated product and process development (IPPD) methodology proposed at the Aerospace Systems Design Laboratory (ASDL), an improved sizing process is described followed by a combined aero-propulsion optimization, where the objective function, average yield per revenue passenger mile ($/RPM), is constrained by flight stability, noise, approach speed, and field length restrictions. Primary goals include successful demonstration of the application of the response surface methodolgy (RSM) to parameter design, introduction to higher fidelity disciplinary analysis than normally feasible at the conceptual and early preliminary level, and investigations of relationships between aerodynamic and propulsion design parameters and their effect on the objective function, $/RPM. A unique approach to aircraft synthesis is developed in which statistical methods, specifically design of experiments and the RSM, are used to more efficiently search the design space for optimum configurations. In particular, two uses of these techniques are demonstrated. First, response model equations are formed which represent complex analysis in the form of a regression polynomial. Next, a second regression equation is constructed, not for modeling purposes, but instead for the purpose of optimization at the system level. Such an optimization problem with the given tools normally would be difficult due to the need for hard connections between the various complex codes involved. The statistical methodology presents an alternative and is demonstrated via an example of aerodynamic modeling and planform optimization for a HSCT.
Optimal periodic binary codes of lengths 28 to 64
NASA Technical Reports Server (NTRS)
Tyler, S.; Keston, R.
1980-01-01
Results from computer searches performed to find repeated binary phase coded waveforms with optimal periodic autocorrelation functions are discussed. The best results for lengths 28 to 64 are given. The code features of major concern are where (1) the peak sidelobe in the autocorrelation function is small and (2) the sum of the squares of the sidelobes in the autocorrelation function is small.
Two Classes of New Optimal Asymmetric Quantum Codes
NASA Astrophysics Data System (ADS)
Chen, Xiaojing; Zhu, Shixin; Kai, Xiaoshan
2018-03-01
Let q be an even prime power and ω be a primitive element of F_{q2}. By analyzing the structure of cyclotomic cosets, we determine a sufficient condition for ω q- 1-constacyclic codes over F_{q2} to be Hermitian dual-containing codes. By the CSS construction, two classes of new optimal AQECCs are obtained according to the Singleton bound for AQECCs.
A Subsonic Aircraft Design Optimization With Neural Network and Regression Approximators
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.; Haller, William J.
2004-01-01
The Flight-Optimization-System (FLOPS) code encountered difficulty in analyzing a subsonic aircraft. The limitation made the design optimization problematic. The deficiencies have been alleviated through use of neural network and regression approximations. The insight gained from using the approximators is discussed in this paper. The FLOPS code is reviewed. Analysis models are developed and validated for each approximator. The regression method appears to hug the data points, while the neural network approximation follows a mean path. For an analysis cycle, the approximate model required milliseconds of central processing unit (CPU) time versus seconds by the FLOPS code. Performance of the approximators was satisfactory for aircraft analysis. A design optimization capability has been created by coupling the derived analyzers to the optimization test bed CometBoards. The approximators were efficient reanalysis tools in the aircraft design optimization. Instability encountered in the FLOPS analyzer was eliminated. The convergence characteristics were improved for the design optimization. The CPU time required to calculate the optimum solution, measured in hours with the FLOPS code was reduced to minutes with the neural network approximation and to seconds with the regression method. Generation of the approximators required the manipulation of a very large quantity of data. Design sensitivity with respect to the bounds of aircraft constraints is easily generated.
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.
2000-01-01
The NASA Engine Performance Program (NEPP) can configure and analyze almost any type of gas turbine engine that can be generated through the interconnection of a set of standard physical components. In addition, the code can optimize engine performance by changing adjustable variables under a set of constraints. However, for engine cycle problems at certain operating points, the NEPP code can encounter difficulties: nonconvergence in the currently implemented Powell's optimization algorithm and deficiencies in the Newton-Raphson solver during engine balancing. A project was undertaken to correct these deficiencies. Nonconvergence was avoided through a cascade optimization strategy, and deficiencies associated with engine balancing were eliminated through neural network and linear regression methods. An approximation-interspersed cascade strategy was used to optimize the engine's operation over its flight envelope. Replacement of Powell's algorithm by the cascade strategy improved the optimization segment of the NEPP code. The performance of the linear regression and neural network methods as alternative engine analyzers was found to be satisfactory. This report considers two examples-a supersonic mixed-flow turbofan engine and a subsonic waverotor-topped engine-to illustrate the results, and it discusses insights gained from the improved version of the NEPP code.
Novel Integration of Frame Rate Up Conversion and HEVC Coding Based on Rate-Distortion Optimization.
Guo Lu; Xiaoyun Zhang; Li Chen; Zhiyong Gao
2018-02-01
Frame rate up conversion (FRUC) can improve the visual quality by interpolating new intermediate frames. However, high frame rate videos by FRUC are confronted with more bitrate consumption or annoying artifacts of interpolated frames. In this paper, a novel integration framework of FRUC and high efficiency video coding (HEVC) is proposed based on rate-distortion optimization, and the interpolated frames can be reconstructed at encoder side with low bitrate cost and high visual quality. First, joint motion estimation (JME) algorithm is proposed to obtain robust motion vectors, which are shared between FRUC and video coding. What's more, JME is embedded into the coding loop and employs the original motion search strategy in HEVC coding. Then, the frame interpolation is formulated as a rate-distortion optimization problem, where both the coding bitrate consumption and visual quality are taken into account. Due to the absence of original frames, the distortion model for interpolated frames is established according to the motion vector reliability and coding quantization error. Experimental results demonstrate that the proposed framework can achieve 21% ~ 42% reduction in BDBR, when compared with the traditional methods of FRUC cascaded with coding.
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Blazewicz, Marek; Hinder, Ian; Koppelman, David M.; ...
2013-01-01
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization ismore » based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.« less
Wing Weight Optimization Under Aeroelastic Loads Subject to Stress Constraints
NASA Technical Reports Server (NTRS)
Kapania, Rakesh K.; Issac, J.; Macmurdy, D.; Guruswamy, Guru P.
1997-01-01
A minimum weight optimization of the wing under aeroelastic loads subject to stress constraints is carried out. The loads for the optimization are based on aeroelastic trim. The design variables are the thickness of the wing skins and planform variables. The composite plate structural model incorporates first-order shear deformation theory, the wing deflections are expressed using Chebyshev polynomials and a Rayleigh-Ritz procedure is adopted for the structural formulation. The aerodynamic pressures provided by the aerodynamic code at a discrete number of grid points is represented as a bilinear distribution on the composite plate code to solve for the deflections and stresses in the wing. The lifting-surface aerodynamic code FAST is presently being used to generate the pressure distribution over the wing. The envisioned ENSAERO/Plate is an aeroelastic analysis code which combines ENSAERO version 3.0 (for analysis of wing-body configurations) with the composite plate code.
NASA Technical Reports Server (NTRS)
Lyons, J. T.
1993-01-01
The Minimum Hamiltonian Ascent Trajectory Evaluation (MASTRE) program and its predecessors, the ROBOT and the RAGMOP programs, have had a long history of supporting MSFC in the simulation of space boosters for the purpose of performance evaluation. The ROBOT program was used in the simulation of the Saturn 1B and Saturn 5 vehicles in the 1960's and provided the first utilization of the minimum Hamiltonian (or min-H) methodology and the steepest ascent technique to solve the optimum trajectory problem. The advent of the Space Shuttle in the 1970's and its complex airplane design required a redesign of the trajectory simulation code since aerodynamic flight and controllability were required for proper simulation. The RAGMOP program was the first attempt to incorporate the complex equations of the Space Shuttle into an optimization tool by using an optimization method based on steepest ascent techniques (but without the min-H methodology). Development of the complex partial derivatives associated with the Space Shuttle configuration and using techniques from the RAGMOP program, the ROBOT program was redesigned to incorporate these additional complexities. This redesign created the MASTRE program, which was referred to as the Minimum Hamiltonian Ascent Shuttle TRajectory Evaluation program at that time. Unique to this program were first-stage (or booster) nonlinear aerodynamics, upper-stage linear aerodynamics, engine control via moment balance, liquid and solid thrust forces, variable liquid throttling to maintain constant acceleration limits, and a total upgrade of the equations used in the forward and backward integration segments of the program. This modification of the MASTRE code has been used to simulate the new space vehicles associated with the National Launch Systems (NLS). Although not as complicated as the Space Shuttle, the simulation and analysis of the NLS vehicles required additional modifications to the MASTRE program in the areas of providing additional flexibility in the use of the program, allowing additional optimization options, and providing special options for the NLS configuration.
Optimized iterative decoding method for TPC coded CPM
NASA Astrophysics Data System (ADS)
Ma, Yanmin; Lai, Penghui; Wang, Shilian; Xie, Shunqin; Zhang, Wei
2018-05-01
Turbo Product Code (TPC) coded Continuous Phase Modulation (CPM) system (TPC-CPM) has been widely used in aeronautical telemetry and satellite communication. This paper mainly investigates the improvement and optimization on the TPC-CPM system. We first add the interleaver and deinterleaver to the TPC-CPM system, and then establish an iterative system to iteratively decode. However, the improved system has a poor convergence ability. To overcome this issue, we use the Extrinsic Information Transfer (EXIT) analysis to find the optimal factors for the system. The experiments show our method is efficient to improve the convergence performance.
Nuclear Energy Knowledge and Validation Center (NEKVaC) Needs Workshop Summary Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gougar, Hans
2015-02-01
The Department of Energy (DOE) has made significant progress developing simulation tools to predict the behavior of nuclear systems with greater accuracy and of increasing our capability to predict the behavior of these systems outside of the standard range of applications. These analytical tools require a more complex array of validation tests to accurately simulate the physics and multiple length and time scales. Results from modern simulations will allow experiment designers to narrow the range of conditions needed to bound system behavior and to optimize the deployment of instrumentation to limit the breadth and cost of the campaign. Modern validation,more » verification and uncertainty quantification (VVUQ) techniques enable analysts to extract information from experiments in a systematic manner and provide the users with a quantified uncertainty estimate. Unfortunately, the capability to perform experiments that would enable taking full advantage of the formalisms of these modern codes has progressed relatively little (with some notable exceptions in fuels and thermal-hydraulics); the majority of the experimental data available today is the "historic" data accumulated over the last decades of nuclear systems R&D. A validated code-model is a tool for users. An unvalidated code-model is useful for code developers to gain understanding, publish research results, attract funding, etc. As nuclear analysis codes have become more sophisticated, so have the measurement and validation methods and the challenges that confront them. A successful yet cost-effective validation effort requires expertise possessed only by a few, resources possessed only by the well-capitalized (or a willing collective), and a clear, well-defined objective (validating a code that is developed to satisfy the need(s) of an actual user). To that end, the Idaho National Laboratory established the Nuclear Energy Knowledge and Validation Center to address the challenges of modern code validation and to manage the knowledge from past, current, and future experimental campaigns. By pulling together the best minds involved in code development, experiment design, and validation to establish and disseminate best practices and new techniques, the Nuclear Energy Knowledge and Validation Center (NEKVaC or the ‘Center’) will be a resource for industry, DOE Programs, and academia validation efforts.« less
NASA Technical Reports Server (NTRS)
Fishbach, L. H.
1979-01-01
The paper describes the computational techniques employed in determining the optimal propulsion systems for future aircraft applications and to identify system tradeoffs and technology requirements. The computer programs used to perform calculations for all the factors that enter into the selection process of determining the optimum combinations of airplanes and engines are examined. Attention is given to the description of the computer codes including NNEP, WATE, LIFCYC, INSTAL, and POD DRG. A process is illustrated by which turbine engines can be evaluated as to fuel consumption, engine weight, cost and installation effects. Examples are shown as to the benefits of variable geometry and of the tradeoff between fuel burned and engine weights. Future plans for further improvements in the analytical modeling of engine systems are also described.
Newton-based optimization for Kullback-Leibler nonnegative tensor factorizations
Plantenga, Todd; Kolda, Tamara G.; Hansen, Samantha
2015-04-30
Tensor factorizations with nonnegativity constraints have found application in analysing data from cyber traffic, social networks, and other areas. We consider application data best described as being generated by a Poisson process (e.g. count data), which leads to sparse tensors that can be modelled by sparse factor matrices. In this paper, we investigate efficient techniques for computing an appropriate canonical polyadic tensor factorization based on the Kullback–Leibler divergence function. We propose novel subproblem solvers within the standard alternating block variable approach. Our new methods exploit structure and reformulate the optimization problem as small independent subproblems. We employ bound-constrained Newton andmore » quasi-Newton methods. Finally, we compare our algorithms against other codes, demonstrating superior speed for high accuracy results and the ability to quickly find sparse solutions.« less
Using Intel Xeon Phi to accelerate the WRF TEMF planetary boundary layer scheme
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen
2014-05-01
The Weather Research and Forecasting (WRF) model is designed for numerical weather prediction and atmospheric research. The WRF software infrastructure consists of several components such as dynamic solvers and physics schemes. Numerical models are used to resolve the large-scale flow. However, subgrid-scale parameterizations are for an estimation of small-scale properties (e.g., boundary layer turbulence and convection, clouds, radiation). Those have a significant influence on the resolved scale due to the complex nonlinear nature of the atmosphere. For the cloudy planetary boundary layer (PBL), it is fundamental to parameterize vertical turbulent fluxes and subgrid-scale condensation in a realistic manner. A parameterization based on the Total Energy - Mass Flux (TEMF) that unifies turbulence and moist convection components produces a better result that the other PBL schemes. For that reason, the TEMF scheme is chosen as the PBL scheme we optimized for Intel Many Integrated Core (MIC), which ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our optimization results for TEMF planetary boundary layer scheme. The optimizations that were performed were quite generic in nature. Those optimizations included vectorization of the code to utilize vector units inside each CPU. Furthermore, memory access was improved by scalarizing some of the intermediate arrays. The results show that the optimization improved MIC performance by 14.8x. Furthermore, the optimizations increased CPU performance by 2.6x compared to the original multi-threaded code on quad core Intel Xeon E5-2603 running at 1.8 GHz. Compared to the optimized code running on a single CPU socket the optimized MIC code is 6.2x faster.
NASA Astrophysics Data System (ADS)
Tian, Lei; Waller, Laura
2017-05-01
Microscope lenses can have either large field of view (FOV) or high resolution, not both. Computational microscopy based on illumination coding circumvents this limit by fusing images from different illumination angles using nonlinear optimization algorithms. The result is a Gigapixel-scale image having both wide FOV and high resolution. We demonstrate an experimentally robust reconstruction algorithm based on a 2nd order quasi-Newton's method, combined with a novel phase initialization scheme. To further extend the Gigapixel imaging capability to 3D, we develop a reconstruction method to process the 4D light field measurements from sequential illumination scanning. The algorithm is based on a 'multislice' forward model that incorporates both 3D phase and diffraction effects, as well as multiple forward scatterings. To solve the inverse problem, an iterative update procedure that combines both phase retrieval and 'error back-propagation' is developed. To avoid local minimum solutions, we further develop a novel physical model-based initialization technique that accounts for both the geometric-optic and 1st order phase effects. The result is robust reconstructions of Gigapixel 3D phase images having both wide FOV and super resolution in all three dimensions. Experimental results from an LED array microscope were demonstrated.
PuReMD-GPU: A reactive molecular dynamics simulation package for GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kylasa, S.B., E-mail: skylasa@purdue.edu; Aktulga, H.M., E-mail: hmaktulga@lbl.gov; Grama, A.Y., E-mail: ayg@cs.purdue.edu
2014-09-01
We present an efficient and highly accurate GP-GPU implementation of our community code, PuReMD, for reactive molecular dynamics simulations using the ReaxFF force field. PuReMD and its incorporation into LAMMPS (Reax/C) is used by a large number of research groups worldwide for simulating diverse systems ranging from biomembranes to explosives (RDX) at atomistic level of detail. The sub-femtosecond time-steps associated with ReaxFF strongly motivate significant improvements to per-timestep simulation time through effective use of GPUs. This paper presents, in detail, the design and implementation of PuReMD-GPU, which enables ReaxFF simulations on GPUs, as well as various performance optimization techniques wemore » developed to obtain high performance on state-of-the-art hardware. Comprehensive experiments on model systems (bulk water and amorphous silica) are presented to quantify the performance improvements achieved by PuReMD-GPU and to verify its accuracy. In particular, our experiments show up to 16× improvement in runtime compared to our highly optimized CPU-only single-core ReaxFF implementation. PuReMD-GPU is a unique production code, and is currently available on request from the authors.« less
Super-resolution processing for multi-functional LPI waveforms
NASA Astrophysics Data System (ADS)
Li, Zhengzheng; Zhang, Yan; Wang, Shang; Cai, Jingxiao
2014-05-01
Super-resolution (SR) is a radar processing technique closely related to the pulse compression (or correlation receiver). There are many super-resolution algorithms developed for the improved range resolution and reduced sidelobe contaminations. Traditionally, the waveforms used for the SR have been either phase-coding (such as LKP3 code, Barker code) or the frequency modulation (chirp, or nonlinear frequency modulation). There are, however, an important class of waveforms which are either random in nature (such as random noise waveform), or randomly modulated for multiple function operations (such as the ADS-B radar signals in [1]). These waveforms have the advantages of low-probability-of-intercept (LPI). If the existing SR techniques can be applied to these waveforms, there will be much more flexibility for using these waveforms in actual sensing missions. Also, SR usually has great advantage that the final output (as estimation of ground truth) is largely independent of the waveform. Such benefits are attractive to many important primary radar applications. In this paper the general introduction of the SR algorithms are provided first, and some implementation considerations are discussed. The selected algorithms are applied to the typical LPI waveforms, and the results are discussed. It is observed that SR algorithms can be reliably used for LPI waveforms, on the other hand, practical considerations should be kept in mind in order to obtain the optimal estimation results.
Dai, Wenrui; Xiong, Hongkai; Jiang, Xiaoqian; Chen, Chang Wen
2014-01-01
This paper proposes a novel model on intra coding for High Efficiency Video Coding (HEVC), which simultaneously predicts blocks of pixels with optimal rate distortion. It utilizes the spatial statistical correlation for the optimal prediction based on 2-D contexts, in addition to formulating the data-driven structural interdependences to make the prediction error coherent with the probability distribution, which is desirable for successful transform and coding. The structured set prediction model incorporates a max-margin Markov network (M3N) to regulate and optimize multiple block predictions. The model parameters are learned by discriminating the actual pixel value from other possible estimates to maximize the margin (i.e., decision boundary bandwidth). Compared to existing methods that focus on minimizing prediction error, the M3N-based model adaptively maintains the coherence for a set of predictions. Specifically, the proposed model concurrently optimizes a set of predictions by associating the loss for individual blocks to the joint distribution of succeeding discrete cosine transform coefficients. When the sample size grows, the prediction error is asymptotically upper bounded by the training error under the decomposable loss function. As an internal step, we optimize the underlying Markov network structure to find states that achieve the maximal energy using expectation propagation. For validation, we integrate the proposed model into HEVC for optimal mode selection on rate-distortion optimization. The proposed prediction model obtains up to 2.85% bit rate reduction and achieves better visual quality in comparison to the HEVC intra coding. PMID:25505829
NASA Astrophysics Data System (ADS)
Gomes, J. M.; Papaderos, P.
2017-07-01
The goal of population spectral synthesis (pss; also referred to as inverse, semi-empirical evolutionary- or fossil record approach) is to decipher from the spectrum of a galaxy the mass, age and metallicity of its constituent stellar populations. This technique, which is the reverse of but complementary to evolutionary synthesis, has been established as fundamental tool in extragalactic research. It has been extensively applied to large spectroscopic data sets, notably the SDSS, leading to important insights into the galaxy assembly history. However, despite significant improvements over the past decade, all current pss codes suffer from two major deficiencies that inhibit us from gaining sharp insights into the star-formation history (SFH) of galaxies and potentially introduce substantial biases in studies of their physical properties (e.g., stellar mass, mass-weighted stellar age and specific star formation rate). These are I) the neglect of nebular emission in spectral fits, consequently; II) the lack of a mechanism that ensures consistency between the best-fitting SFH and the observed nebular emission characteristics of a star-forming (SF) galaxy (e.g., hydrogen Balmer-line luminosities and equivalent widths-EWs, shape of the continuum in the region around the Balmer and Paschen jump). In this article, we present fado (Fitting Analysis using Differential evolution Optimization) - a conceptually novel, publicly available pss tool with the distinctive capability of permitting identification of the SFH that reproduces the observed nebular characteristics of a SF galaxy. This so-far unique self-consistency concept allows us to significantly alleviate degeneracies in current spectral synthesis, thereby opening a new avenue to the exploration of the assembly history of galaxies. The innovative character of fado is further augmented by its mathematical foundation: fado is the first pss code employing genetic differential evolution optimization. This, in conjunction with various other currently unique elements in its mathematical concept and numerical realization (e.g., mid-analysis optimization of the spectral library using artificial intelligence, test for convergence through a procedure inspired by Markov chain Monte Carlo techniques, quasi-parallelization embedded within a modular architecture) results in key improvements with respect to computational efficiency and uniqueness of the best-fitting SFHs. Furthermore, fado incorporates within a single code the entire chain of pre-processing, modeling, post-processing, storage and graphical representation of the relevant output from pss, including emission-line measurements and estimates of uncertainties for all primary and secondary products from spectral synthesis (e.g., mass contributions of individual stellar populations, mass- and luminosity-weighted stellar ages and metallicities). This integrated concept greatly simplifies and accelerates a lengthy sequence of individual time-consuming steps that are generally involved in pss modeling, further enhancing the overall efficiency of the code and inviting to its automated application to large spectroscopic data sets. The distribution package of the FADO v.1 tool contains the binary and its auxiliary files. FADO v.1 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/603/A63
Chirp- and random-based coded ultrasonic excitation for localized blood-brain barrier opening
Kamimura, HAS; Wang, S; Wu, S-Y; Karakatsani, ME; Acosta, C; Carneiro, AAO; Konofagou, EE
2015-01-01
Chirp- and random-based coded excitation methods have been proposed to reduce standing wave formation and improve focusing of transcranial ultrasound. However, no clear evidence has been shown to support the benefits of these ultrasonic excitation sequences in vivo. This study evaluates the chirp and periodic selection of random frequency (PSRF) coded-excitation methods for opening the blood-brain barrier (BBB) in mice. Three groups of mice (n=15) were injected with polydisperse microbubbles and sonicated in the caudate putamen using the chirp/PSRF coded (bandwidth: 1.5-1.9 MHz, peak negative pressure: 0.52 MPa, duration: 30 s) or standard ultrasound (frequency: 1.5 MHz, pressure: 0.52 MPa, burst duration: 20 ms, duration: 5 min) sequences. T1-weighted contrast-enhanced MRI scans were performed to quantitatively analyze focused ultrasound induced BBB opening. The mean opening volumes evaluated from the MRI were 9.38±5.71 mm3, 8.91±3.91 mm3 and 35.47 ± 5.10 mm3 for the chirp, random and regular sonications, respectively. The mean cavitation levels were 55.40±28.43 V.s, 63.87±29.97 V.s and 356.52±257.15 V.s for the chirp, random and regular sonications, respectively. The chirp and PSRF coded pulsing sequences improved the BBB opening localization by inducing lower cavitation levels and smaller opening volumes compared to results of the regular sonication technique. Larger bandwidths were associated with more focused targeting but were limited by the frequency response of the transducer, the skull attenuation and the microbubbles optimal frequency range. The coded methods could therefore facilitate highly localized drug delivery as well as benefit other transcranial ultrasound techniques that use higher pressure levels and higher precision to induce the necessary bioeffects in a brain region while avoiding damage to the surrounding healthy tissue. PMID:26394091
NASA Astrophysics Data System (ADS)
Veerraju, R. P. S. P.; Rao, A. Srinivasa; Murali, G.
2010-10-01
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. It improves internal code structure without altering its external functionality by transforming functions and rethinking algorithms. It is an iterative process. Refactoring include reducing scope, replacing complex instructions with simpler or built-in instructions, and combining multiple statements into one statement. By transforming the code with refactoring techniques it will be faster to change, execute, and download. It is an excellent best practice to adopt for programmers wanting to improve their productivity. Refactoring is similar to things like performance optimizations, which are also behavior- preserving transformations. It also helps us find bugs when we are trying to fix a bug in difficult-to-understand code. By cleaning things up, we make it easier to expose the bug. Refactoring improves the quality of application design and implementation. In general, three cases concerning refactoring. Iterative refactoring, Refactoring when is necessary, Not refactor. Mr. Martin Fowler identifies four key reasons to refractor. Refactoring improves the design of software, makes software easier to understand, helps us find bugs and also helps in executing the program faster. There is an additional benefit of refactoring. It changes the way a developer thinks about the implementation when not refactoring. There are the three types of refactorings. 1) Code refactoring: It often referred to simply as refactoring. This is the refactoring of programming source code. 2) Database refactoring: It is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. 3) User interface (UI) refactoring: It is a simple change to the UI which retains its semantics. Finally, we conclude the benefits of Refactoring are: Improves the design of software, Makes software easier to understand, Software gets cleaned up and Helps us to find bugs and Helps us to program faster.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Veerraju, R. P. S. P.; Rao, A. Srinivasa; Murali, G.
2010-10-26
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. It improves internal code structure without altering its external functionality by transforming functions and rethinking algorithms. It is an iterative process. Refactoring include reducing scope, replacing complex instructions with simpler or built-in instructions, and combining multiple statements into one statement. By transforming the code with refactoring techniques it will be faster to change, execute, and download. It is an excellent best practice to adopt for programmers wanting to improve their productivity. Refactoring is similar to things like performance optimizations,more » which are also behavior- preserving transformations. It also helps us find bugs when we are trying to fix a bug in difficult-to-understand code. By cleaning things up, we make it easier to expose the bug. Refactoring improves the quality of application design and implementation. In general, three cases concerning refactoring. Iterative refactoring, Refactoring when is necessary, Not refactor.Mr. Martin Fowler identifies four key reasons to refractor. Refactoring improves the design of software, makes software easier to understand, helps us find bugs and also helps in executing the program faster. There is an additional benefit of refactoring. It changes the way a developer thinks about the implementation when not refactoring. There are the three types of refactorings. 1) Code refactoring: It often referred to simply as refactoring. This is the refactoring of programming source code. 2) Database refactoring: It is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. 3) User interface (UI) refactoring: It is a simple change to the UI which retains its semantics. Finally, we conclude the benefits of Refactoring are: Improves the design of software, Makes software easier to understand, Software gets cleaned up and Helps us to find bugs and Helps us to program faster.« less
Recent advances in lossless coding techniques
NASA Astrophysics Data System (ADS)
Yovanof, Gregory S.
Current lossless techniques are reviewed with reference to both sequential data files and still images. Two major groups of sequential algorithms, dictionary and statistical techniques, are discussed. In particular, attention is given to Lempel-Ziv coding, Huffman coding, and arithmewtic coding. The subject of lossless compression of imagery is briefly discussed. Finally, examples of practical implementations of lossless algorithms and some simulation results are given.
Self-adaptive multimethod optimization applied to a tailored heating forging process
NASA Astrophysics Data System (ADS)
Baldan, M.; Steinberg, T.; Baake, E.
2018-05-01
The presented paper describes an innovative self-adaptive multi-objective optimization code. Investigation goals concern proving the superiority of this code compared to NGSA-II and applying it to an inductor’s design case study addressed to a “tailored” heating forging application. The choice of the frequency and the heating time are followed by the determination of the turns number and their positions. Finally, a straightforward optimization is performed in order to minimize energy consumption using “optimal control”.
Olama, Mohammed M.; Ma, Xiao; Killough, Stephen M.; ...
2015-03-12
In recent years, there has been great interest in using hybrid spread-spectrum (HSS) techniques for commercial applications, particularly in the Smart Grid, in addition to their inherent uses in military communications. This is because HSS can accommodate high data rates with high link integrity, even in the presence of significant multipath effects and interfering signals. A highly useful form of this transmission technique for many types of command, control, and sensing applications is the specific code-related combination of standard direct sequence modulation with fast frequency hopping, denoted hybrid DS/FFH, wherein multiple frequency hops occur within a single data-bit time. Inmore » this paper, error-probability analyses are performed for a hybrid DS/FFH system over standard Gaussian and fading-type channels, progressively including the effects from wide- and partial-band jamming, multi-user interference, and varying degrees of Rayleigh and Rician fading. In addition, an optimization approach is formulated that minimizes the bit-error performance of a hybrid DS/FFH communication system and solves for the resulting system design parameters. The optimization objective function is non-convex and can be solved by applying the Karush-Kuhn-Tucker conditions. We also present our efforts toward exploring the design, implementation, and evaluation of a hybrid DS/FFH radio transceiver using a single FPGA. Numerical and experimental results are presented under widely varying design parameters to demonstrate the adaptability of the waveform for varied harsh smart grid RF signal environments.« less
ToTem: a tool for variant calling pipeline optimization.
Tom, Nikola; Tom, Ondrej; Malcikova, Jitka; Pavlova, Sarka; Kubesova, Blanka; Rausch, Tobias; Kolarik, Miroslav; Benes, Vladimir; Bystry, Vojtech; Pospisilova, Sarka
2018-06-26
High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software .
Wireless visual sensor network resource allocation using cross-layer optimization
NASA Astrophysics Data System (ADS)
Bentley, Elizabeth S.; Matyjas, John D.; Medley, Michael J.; Kondi, Lisimachos P.
2009-01-01
In this paper, we propose an approach to manage network resources for a Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network where nodes monitor scenes with varying levels of motion. It uses cross-layer optimization across the physical layer, the link layer and the application layer. Our technique simultaneously assigns a source coding rate, a channel coding rate, and a power level to all nodes in the network based on one of two criteria that maximize the quality of video of the entire network as a whole, subject to a constraint on the total chip rate. One criterion results in the minimal average end-to-end distortion amongst all nodes, while the other criterion minimizes the maximum distortion of the network. Our approach allows one to determine the capacity of the visual sensor network based on the number of nodes and the quality of video that must be transmitted. For bandwidth-limited applications, one can also determine the minimum bandwidth needed to accommodate a number of nodes with a specific target chip rate. Video captured by a sensor node camera is encoded and decoded using the H.264 video codec by a centralized control unit at the network layer. To reduce the computational complexity of the solution, Universal Rate-Distortion Characteristics (URDCs) are obtained experimentally to relate bit error probabilities to the distortion of corrupted video. Bit error rates are found first by using Viterbi's upper bounds on the bit error probability and second, by simulating nodes transmitting data spread by Total Square Correlation (TSC) codes over a Rayleigh-faded DS-CDMA channel and receiving that data using Auxiliary Vector (AV) filtering.
NASA Technical Reports Server (NTRS)
Lee, P. J.
1984-01-01
For rate 1/N convolutional codes, a recursive algorithm for finding the transfer function bound on bit error rate (BER) at the output of a Viterbi decoder is described. This technique is very fast and requires very little storage since all the unnecessary operations are eliminated. Using this technique, we find and plot bounds on the BER performance of known codes of rate 1/2 with K 18, rate 1/3 with K 14. When more than one reported code with the same parameter is known, we select the code that minimizes the required signal to noise ratio for a desired bit error rate of 0.000001. This criterion of determining goodness of a code had previously been found to be more useful than the maximum free distance criterion and was used in the code search procedures of very short constraint length codes. This very efficient technique can also be used for searches of longer constraint length codes.
Optimal simultaneous superpositioning of multiple structures with missing data.
Theobald, Douglas L; Steindel, Phillip A
2012-08-01
Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually 'missing' from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation-maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. dtheobald@brandeis.edu Supplementary data are available at Bioinformatics online.
Super-linear Precision in Simple Neural Population Codes
NASA Astrophysics Data System (ADS)
Schwab, David; Fiete, Ila
2015-03-01
A widely used tool for quantifying the precision with which a population of noisy sensory neurons encodes the value of an external stimulus is the Fisher Information (FI). Maximizing the FI is also a commonly used objective for constructing optimal neural codes. The primary utility and importance of the FI arises because it gives, through the Cramer-Rao bound, the smallest mean-squared error achievable by any unbiased stimulus estimator. However, it is well-known that when neural firing is sparse, optimizing the FI can result in codes that perform very poorly when considering the resulting mean-squared error, a measure with direct biological relevance. Here we construct optimal population codes by minimizing mean-squared error directly and study the scaling properties of the resulting network, focusing on the optimal tuning curve width. We then extend our results to continuous attractor networks that maintain short-term memory of external stimuli in their dynamics. Here we find similar scaling properties in the structure of the interactions that minimize diffusive information loss.
Constellation labeling optimization for bit-interleaved coded APSK
NASA Astrophysics Data System (ADS)
Xiang, Xingyu; Mo, Zijian; Wang, Zhonghai; Pham, Khanh; Blasch, Erik; Chen, Genshe
2016-05-01
This paper investigates the constellation and mapping optimization for amplitude phase shift keying (APSK) modulation, which is deployed in Digital Video Broadcasting Satellite - Second Generation (DVB-S2) and Digital Video Broadcasting - Satellite services to Handhelds (DVB-SH) broadcasting standards due to its merits of power and spectral efficiency together with the robustness against nonlinear distortion. The mapping optimization is performed for 32-APSK according to combined cost functions related to Euclidean distance and mutual information. A Binary switching algorithm and its modified version are used to minimize the cost function and the estimated error between the original and received data. The optimized constellation mapping is tested by combining DVB-S2 standard Low-Density Parity-Check (LDPC) codes in both Bit-Interleaved Coded Modulation (BICM) and BICM with iterative decoding (BICM-ID) systems. The simulated results validate the proposed constellation labeling optimization scheme which yields better performance against conventional 32-APSK constellation defined in DVB-S2 standard.
Optimizing a liquid propellant rocket engine with an automated combustor design code (AUTOCOM)
NASA Technical Reports Server (NTRS)
Hague, D. S.; Reichel, R. H.; Jones, R. T.; Glatt, C. R.
1972-01-01
A procedure for automatically designing a liquid propellant rocket engine combustion chamber in an optimal fashion is outlined. The procedure is contained in a digital computer code, AUTOCOM. The code is applied to an existing engine, and design modifications are generated which provide a substantial potential payload improvement over the existing design. Computer time requirements for this payload improvement were small, approximately four minutes in the CDC 6600 computer.
A stimulus-dependent spike threshold is an optimal neural coder
Jones, Douglas L.; Johnson, Erik C.; Ratnam, Rama
2015-01-01
A neural code based on sequences of spikes can consume a significant portion of the brain's energy budget. Thus, energy considerations would dictate that spiking activity be kept as low as possible. However, a high spike-rate improves the coding and representation of signals in spike trains, particularly in sensory systems. These are competing demands, and selective pressure has presumably worked to optimize coding by apportioning a minimum number of spikes so as to maximize coding fidelity. The mechanisms by which a neuron generates spikes while maintaining a fidelity criterion are not known. Here, we show that a signal-dependent neural threshold, similar to a dynamic or adapting threshold, optimizes the trade-off between spike generation (encoding) and fidelity (decoding). The threshold mimics a post-synaptic membrane (a low-pass filter) and serves as an internal decoder. Further, it sets the average firing rate (the energy constraint). The decoding process provides an internal copy of the coding error to the spike-generator which emits a spike when the error equals or exceeds a spike threshold. When optimized, the trade-off leads to a deterministic spike firing-rule that generates optimally timed spikes so as to maximize fidelity. The optimal coder is derived in closed-form in the limit of high spike-rates, when the signal can be approximated as a piece-wise constant signal. The predicted spike-times are close to those obtained experimentally in the primary electrosensory afferent neurons of weakly electric fish (Apteronotus leptorhynchus) and pyramidal neurons from the somatosensory cortex of the rat. We suggest that KCNQ/Kv7 channels (underlying the M-current) are good candidates for the decoder. They are widely coupled to metabolic processes and do not inactivate. We conclude that the neural threshold is optimized to generate an energy-efficient and high-fidelity neural code. PMID:26082710
Engineering calculations for communications satellite systems planning
NASA Technical Reports Server (NTRS)
Reilly, C. H.; Levis, C. A.; Mount-Campbell, C.; Gonsalvez, D. J.; Wang, C. W.; Yamamura, Y.
1985-01-01
Computer-based techniques for optimizing communications-satellite orbit and frequency assignments are discussed. A gradient-search code was tested against a BSS scenario derived from the RARC-83 data. Improvement was obtained, but each iteration requires about 50 minutes of IBM-3081 CPU time. Gradient-search experiments on a small FSS test problem, consisting of a single service area served by 8 satellites, showed quickest convergence when the satellites were all initially placed near the center of the available orbital arc with moderate spacing. A transformation technique is proposed for investigating the surface topography of the objective function used in the gradient-search method. A new synthesis approach is based on transforming single-entry interference constraints into corresponding constraints on satellite spacings. These constraints are used with linear objective functions to formulate the co-channel orbital assignment task as a linear-programming (LP) problem or mixed integer programming (MIP) problem. Globally optimal solutions are always found with the MIP problems, but not necessarily with the LP problems. The MIP solutions can be used to evaluate the quality of the LP solutions. The initial results are very encouraging.
Yurtkuran, Alkın; Emel, Erdal
2014-01-01
The traveling salesman problem with time windows (TSPTW) is a variant of the traveling salesman problem in which each customer should be visited within a given time window. In this paper, we propose an electromagnetism-like algorithm (EMA) that uses a new constraint handling technique to minimize the travel cost in TSPTW problems. The EMA utilizes the attraction-repulsion mechanism between charged particles in a multidimensional space for global optimization. This paper investigates the problem-specific constraint handling capability of the EMA framework using a new variable bounding strategy, in which real-coded particle's boundary constraints associated with the corresponding time windows of customers, is introduced and combined with the penalty approach to eliminate infeasibilities regarding time window violations. The performance of the proposed algorithm and the effectiveness of the constraint handling technique have been studied extensively, comparing it to that of state-of-the-art metaheuristics using several sets of benchmark problems reported in the literature. The results of the numerical experiments show that the EMA generates feasible and near-optimal results within shorter computational times compared to the test algorithms.
Yurtkuran, Alkın
2014-01-01
The traveling salesman problem with time windows (TSPTW) is a variant of the traveling salesman problem in which each customer should be visited within a given time window. In this paper, we propose an electromagnetism-like algorithm (EMA) that uses a new constraint handling technique to minimize the travel cost in TSPTW problems. The EMA utilizes the attraction-repulsion mechanism between charged particles in a multidimensional space for global optimization. This paper investigates the problem-specific constraint handling capability of the EMA framework using a new variable bounding strategy, in which real-coded particle's boundary constraints associated with the corresponding time windows of customers, is introduced and combined with the penalty approach to eliminate infeasibilities regarding time window violations. The performance of the proposed algorithm and the effectiveness of the constraint handling technique have been studied extensively, comparing it to that of state-of-the-art metaheuristics using several sets of benchmark problems reported in the literature. The results of the numerical experiments show that the EMA generates feasible and near-optimal results within shorter computational times compared to the test algorithms. PMID:24723834
Wind Farm Turbine Type and Placement Optimization
NASA Astrophysics Data System (ADS)
Graf, Peter; Dykes, Katherine; Scott, George; Fields, Jason; Lunacek, Monte; Quick, Julian; Rethore, Pierre-Elouan
2016-09-01
The layout of turbines in a wind farm is already a challenging nonlinear, nonconvex, nonlinearly constrained continuous global optimization problem. Here we begin to address the next generation of wind farm optimization problems by adding the complexity that there is more than one turbine type to choose from. The optimization becomes a nonlinear constrained mixed integer problem, which is a very difficult class of problems to solve. This document briefly summarizes the algorithm and code we have developed, the code validation steps we have performed, and the initial results for multi-turbine type and placement optimization (TTP_OPT) we have run.
Wind farm turbine type and placement optimization
Graf, Peter; Dykes, Katherine; Scott, George; ...
2016-10-03
The layout of turbines in a wind farm is already a challenging nonlinear, nonconvex, nonlinearly constrained continuous global optimization problem. Here we begin to address the next generation of wind farm optimization problems by adding the complexity that there is more than one turbine type to choose from. The optimization becomes a nonlinear constrained mixed integer problem, which is a very difficult class of problems to solve. Furthermore, this document briefly summarizes the algorithm and code we have developed, the code validation steps we have performed, and the initial results for multi-turbine type and placement optimization (TTP_OPT) we have run.
Optimal wavelets for biomedical signal compression.
Nielsen, Mogens; Kamavuako, Ernest Nlandu; Andersen, Michael Midtgaard; Lucas, Marie-Françoise; Farina, Dario
2006-07-01
Signal compression is gaining importance in biomedical engineering due to the potential applications in telemedicine. In this work, we propose a novel scheme of signal compression based on signal-dependent wavelets. To adapt the mother wavelet to the signal for the purpose of compression, it is necessary to define (1) a family of wavelets that depend on a set of parameters and (2) a quality criterion for wavelet selection (i.e., wavelet parameter optimization). We propose the use of an unconstrained parameterization of the wavelet for wavelet optimization. A natural performance criterion for compression is the minimization of the signal distortion rate given the desired compression rate. For coding the wavelet coefficients, we adopted the embedded zerotree wavelet coding algorithm, although any coding scheme may be used with the proposed wavelet optimization. As a representative example of application, the coding/encoding scheme was applied to surface electromyographic signals recorded from ten subjects. The distortion rate strongly depended on the mother wavelet (for example, for 50% compression rate, optimal wavelet, mean+/-SD, 5.46+/-1.01%; worst wavelet 12.76+/-2.73%). Thus, optimization significantly improved performance with respect to previous approaches based on classic wavelets. The algorithm can be applied to any signal type since the optimal wavelet is selected on a signal-by-signal basis. Examples of application to ECG and EEG signals are also reported.
NASA Astrophysics Data System (ADS)
Bezan, Scott; Shirani, Shahram
2006-12-01
To reliably transmit video over error-prone channels, the data should be both source and channel coded. When multiple channels are available for transmission, the problem extends to that of partitioning the data across these channels. The condition of transmission channels, however, varies with time. Therefore, the error protection added to the data at one instant of time may not be optimal at the next. In this paper, we propose a method for adaptively adding error correction code in a rate-distortion (RD) optimized manner using rate-compatible punctured convolutional codes to an MJPEG2000 constant rate-coded frame of video. We perform an analysis on the rate-distortion tradeoff of each of the coding units (tiles and packets) in each frame and adapt the error correction code assigned to the unit taking into account the bandwidth and error characteristics of the channels. This method is applied to both single and multiple time-varying channel environments. We compare our method with a basic protection method in which data is either not transmitted, transmitted with no protection, or transmitted with a fixed amount of protection. Simulation results show promising performance for our proposed method.
NASA Astrophysics Data System (ADS)
Yahampath, Pradeepa
2017-12-01
Consider communicating a correlated Gaussian source over a Rayleigh fading channel with no knowledge of the channel signal-to-noise ratio (CSNR) at the transmitter. In this case, a digital system cannot be optimal for a range of CSNRs. Analog transmission however is optimal at all CSNRs, if the source and channel are memoryless and bandwidth matched. This paper presents new hybrid digital-analog (HDA) systems for sources with memory and channels with bandwidth expansion, which outperform both digital-only and analog-only systems over a wide range of CSNRs. The digital part is either a predictive quantizer or a transform code, used to achieve a coding gain. Analog part uses linear encoding to transmit the quantization error which improves the performance under CSNR variations. The hybrid encoder is optimized to achieve the minimum AMMSE (average minimum mean square error) over the CSNR distribution. To this end, analytical expressions are derived for the AMMSE of asymptotically optimal systems. It is shown that the outage CSNR of the channel code and the analog-digital power allocation must be jointly optimized to achieve the minimum AMMSE. In the case of HDA predictive quantization, a simple algorithm is presented to solve the optimization problem. Experimental results are presented for both Gauss-Markov sources and speech signals.
Hard decoding algorithm for optimizing thresholds under general Markovian noise
NASA Astrophysics Data System (ADS)
Chamberland, Christopher; Wallman, Joel; Beale, Stefanie; Laflamme, Raymond
2017-04-01
Quantum error correction is instrumental in protecting quantum systems from noise in quantum computing and communication settings. Pauli channels can be efficiently simulated and threshold values for Pauli error rates under a variety of error-correcting codes have been obtained. However, realistic quantum systems can undergo noise processes that differ significantly from Pauli noise. In this paper, we present an efficient hard decoding algorithm for optimizing thresholds and lowering failure rates of an error-correcting code under general completely positive and trace-preserving (i.e., Markovian) noise. We use our hard decoding algorithm to study the performance of several error-correcting codes under various non-Pauli noise models by computing threshold values and failure rates for these codes. We compare the performance of our hard decoding algorithm to decoders optimized for depolarizing noise and show improvements in thresholds and reductions in failure rates by several orders of magnitude. Our hard decoding algorithm can also be adapted to take advantage of a code's non-Pauli transversal gates to further suppress noise. For example, we show that using the transversal gates of the 5-qubit code allows arbitrary rotations around certain axes to be perfectly corrected. Furthermore, we show that Pauli twirling can increase or decrease the threshold depending upon the code properties. Lastly, we show that even if the physical noise model differs slightly from the hypothesized noise model used to determine an optimized decoder, failure rates can still be reduced by applying our hard decoding algorithm.
A Spherical Active Coded Aperture for 4π Gamma-ray Imaging
Hellfeld, Daniel; Barton, Paul; Gunter, Donald; ...
2017-09-22
Gamma-ray imaging facilitates the efficient detection, characterization, and localization of compact radioactive sources in cluttered environments. Fieldable detector systems employing active planar coded apertures have demonstrated broad energy sensitivity via both coded aperture and Compton imaging modalities. But, planar configurations suffer from a limited field-of-view, especially in the coded aperture mode. In order to improve upon this limitation, we introduce a novel design by rearranging the detectors into an active coded spherical configuration, resulting in a 4pi isotropic field-of-view for both coded aperture and Compton imaging. This work focuses on the low- energy coded aperture modality and the optimization techniquesmore » used to determine the optimal number and configuration of 1 cm 3 CdZnTe coplanar grid detectors on a 14 cm diameter sphere with 192 available detector locations.« less
Towards a generalized computational fluid dynamics technique for all Mach numbers
NASA Technical Reports Server (NTRS)
Walters, R. W.; Slack, D. C.; Godfrey, A. G.
1993-01-01
Currently there exists no single unified approach for efficiently and accurately solving computational fluid dynamics (CFD) problems across the Mach number regime, from truly low speed incompressible flows to hypersonic speeds. There are several CFD codes that have evolved into sophisticated prediction tools with a wide variety of features including multiblock capabilities, generalized chemistry and thermodynamics models among other features. However, as these codes evolve, the demand placed on the end user also increases simply because of the myriad of features that are incorporated into these codes. In order for a user to be able to solve a wide range of problems, several codes may be needed requiring the user to be familiar with the intricacies of each code and their rather complicated input files. Moreover, the cost of training users and maintaining several codes becomes prohibitive. The objective of the current work is to extend the compressible, characteristic-based, thermochemical nonequilibrium Navier-Stokes code GASP to very low speed flows and simultaneously improve convergence at all speeds. Before this work began, the practical speed range of GASP was Mach numbers on the order of 0.1 and higher. In addition, a number of new techniques have been developed for more accurate physical and numerical modeling. The primary focus has been on the development of optimal preconditioning techniques for the Euler and the Navier-Stokes equations with general finite-rate chemistry models and both equilibrium and nonequilibrium thermodynamics models. We began with the work of Van Leer, Lee, and Roe for inviscid, one-dimensional perfect gases and extended their approach to include three-dimensional reacting flows. The basic steps required to accomplish this task were a transformation to stream-aligned coordinates, the formulation of the preconditioning matrix, incorporation into both explicit and implicit temporal integration schemes, and modification of the numerical flux formulae. In addition, we improved the convergence rate of the implicit time integration schemes in GASP through the use of inner iteration strategies and the use of the GMRES (General Minimized Resisual) which belongs to the class of algorithms referred to as Krylov subspace iteration. Finally, we significantly improved the practical utility of GASP through the addition of mesh sequencing, a technique in which computations begin on a coarse grid and get interpolated onto successively finer grids. The fluid dynamic problems of interest to the propulsion community involve complex flow physics spanning different velocity regimes and possibly involving chemical reactions. This class of problems results in widely disparate time scales causing numerical stiffness. Even in the absence of chemical reactions, eigenvalue stiffness manifests itself at transonic and very low speed flows which can be quantified by the large condition number of the system and evidenced by slow convergence rates. This results in the need for thorough numerical analysis and subsequent implementation of sophisticated numerical techniques for these difficult yet practical problems. As a result of this work, we have been able to extend the range of applicability of compressible codes to very low speed inviscid flows (M = .001) and reacting flows.
NASA Technical Reports Server (NTRS)
Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization,3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts 3. Development of a code generator for performance prediction 4. Automated partitioning 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.
NASA Astrophysics Data System (ADS)
Wang, W.; Liu, J.
2016-12-01
Forward modelling is the general way to obtain responses of geoelectrical structures. Field investigators might find it useful for planning surveys and choosing optimal electrode configurations with respect to their targets. During the past few decades much effort has been put into the development of numerical forward codes, such as integral equation method, finite difference method and finite element method. Nowadays, most researchers prefer the finite element method (FEM) for its flexible meshing scheme, which can handle models with complex geometry. Resistivity Modelling with commercial sofewares such as ANSYS and COMSOL is convenient, but like working with a black box. Modifying the existed codes or developing new codes is somehow a long period. We present a new way to obtain resistivity forward modelling codes quickly, which is based on the commercial sofeware FEPG (Finite element Program Generator). Just with several demanding scripts, FEPG could generate FORTRAN program framework which can easily be altered to adjust our targets. By supposing the electric potential is quadratic in each element of a two-layer model, we obtain quite accurate results with errors less than 1%, while more than 5% errors could appear by linear FE codes. The anisotropic half-space model is supposed to concern vertical distributed fractures. The measured apparent resistivities along the fractures are bigger than results from its orthogonal direction, which are opposite of the true resistivities. Interpretation could be misunderstood if this anisotropic paradox is ignored. The technique we used can obtain scientific codes in a short time. The generated powerful FORTRAN codes could reach accurate results by higher-order assumption and can handle anisotropy to make better interpretations. The method we used could be expand easily to other domain where FE codes are needed.
Topology optimization of natural convection: Flow in a differentially heated cavity
NASA Astrophysics Data System (ADS)
Saglietti, Clio; Schlatter, Philipp; Berggren, Martin; Henningson, Dan
2017-11-01
The goal of the present work is to develop methods for optimization of the design of natural convection cooled heat sinks, using resolved simulation of both fluid flow and heat transfer. We rely on mathematical programming techniques combined with direct numerical simulations in order to iteratively update the topology of a solid structure towards optimality, i.e. until the design yielding the best performance is found, while satisfying a specific set of constraints. The investigated test case is a two-dimensional differentially heated cavity, in which the two vertical walls are held at different temperatures. The buoyancy force induces a swirling convective flow around a solid structure, whose topology is optimized to maximize the heat flux through the cavity. We rely on the spectral-element code Nek5000 to compute a high-order accurate solution of the natural convection flow arising from the conjugate heat transfer in the cavity. The laminar, steady-state solution of the problem is evaluated with a time-marching scheme that has an increased convergence rate; the actual iterative optimization is obtained using a steepest-decent algorithm, and the gradients are conveniently computed using the continuous adjoint equations for convective heat transfer.
Exploiting variability for energy optimization of parallel programs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lavrijsen, Wim; Iancu, Costin; de Jong, Wibe
2016-04-18
Here in this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage in scientific applications. Our main insight is that noise is intrinsic to large scale parallel executions and it appears whenever shared resources are contended. The presence of noise allows us to identify and manipulate any program regions amenable to DVFS. When compared to previous energy optimizations that make per core decisions using predictions of the running time, our scheme uses a qualitative approach to recognize the signature of executions amenable to DVFS. By recognizing the "shape of variability" we can optimize codes withmore » highly dynamic behavior, which pose challenges to all existing DVFS techniques. We validate our approach using offline and online analyses for one-sided and two-sided communication paradigms. We have applied our methods to NWChem, and we show best case improvements in energy use of 12% at no loss in performance when using online optimizations running on 720 Haswell cores with one-sided communication. With NWChem on MPI two-sided and offline analysis, capturing the initialization, we find energy savings of up to 20%, with less than 1% performance cost.« less
Particle Swarm Optimization approach to defect detection in armour ceramics.
Kesharaju, Manasa; Nagarajah, Romesh
2017-03-01
In this research, various extracted features were used in the development of an automated ultrasonic sensor based inspection system that enables defect classification in each ceramic component prior to despatch to the field. Classification is an important task and large number of irrelevant, redundant features commonly introduced to a dataset reduces the classifiers performance. Feature selection aims to reduce the dimensionality of the dataset while improving the performance of a classification system. In the context of a multi-criteria optimization problem (i.e. to minimize classification error rate and reduce number of features) such as one discussed in this research, the literature suggests that evolutionary algorithms offer good results. Besides, it is noted that Particle Swarm Optimization (PSO) has not been explored especially in the field of classification of high frequency ultrasonic signals. Hence, a binary coded Particle Swarm Optimization (BPSO) technique is investigated in the implementation of feature subset selection and to optimize the classification error rate. In the proposed method, the population data is used as input to an Artificial Neural Network (ANN) based classification system to obtain the error rate, as ANN serves as an evaluator of PSO fitness function. Copyright © 2016. Published by Elsevier B.V.
Optimum Design of High-Speed Prop-Rotors
NASA Technical Reports Server (NTRS)
Chattopadhyay, Aditi; McCarthy, Thomas Robert
1993-01-01
An integrated multidisciplinary optimization procedure is developed for application to rotary wing aircraft design. The necessary disciplines such as dynamics, aerodynamics, aeroelasticity, and structures are coupled within a closed-loop optimization process. The procedure developed is applied to address two different problems. The first problem considers the optimization of a helicopter rotor blade and the second problem addresses the optimum design of a high-speed tilting proprotor. In the helicopter blade problem, the objective is to reduce the critical vibratory shear forces and moments at the blade root, without degrading rotor aerodynamic performance and aeroelastic stability. In the case of the high-speed proprotor, the goal is to maximize the propulsive efficiency in high-speed cruise without deteriorating the aeroelastic stability in cruise and the aerodynamic performance in hover. The problems studied involve multiple design objectives; therefore, the optimization problems are formulated using multiobjective design procedures. A comprehensive helicopter analysis code is used for the rotary wing aerodynamic, dynamic and aeroelastic stability analyses and an algorithm developed specifically for these purposes is used for the structural analysis. A nonlinear programming technique coupled with an approximate analysis procedure is used to perform the optimization. The optimum blade designs obtained in each case are compared to corresponding reference designs.
Bilayer Protograph Codes for Half-Duplex Relay Channels
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; VanNguyen, Thuy; Nosratinia, Aria
2013-01-01
Direct to Earth return links are limited by the size and power of lander devices. A standard alternative is provided by a two-hops return link: a proximity link (from lander to orbiter relay) and a deep-space link (from orbiter relay to Earth). Although direct to Earth return links are limited by the size and power of lander devices, using an additional link and a proposed coding for relay channels, one can obtain a more reliable signal. Although significant progress has been made in the relay coding problem, existing codes must be painstakingly optimized to match to a single set of channel conditions, many of them do not offer easy encoding, and most of them do not have structured design. A high-performing LDPC (low-density parity-check) code for the relay channel addresses simultaneously two important issues: a code structure that allows low encoding complexity, and a flexible rate-compatible code that allows matching to various channel conditions. Most of the previous high-performance LDPC codes for the relay channel are tightly optimized for a given channel quality, and are not easily adapted without extensive re-optimization for various channel conditions. This code for the relay channel combines structured design and easy encoding with rate compatibility to allow adaptation to the three links involved in the relay channel, and furthermore offers very good performance. The proposed code is constructed by synthesizing a bilayer structure with a pro to graph. In addition to the contribution to relay encoding, an improved family of protograph codes was produced for the point-to-point AWGN (additive white Gaussian noise) channel whose high-rate members enjoy thresholds that are within 0.07 dB of capacity. These LDPC relay codes address three important issues in an integrative manner: low encoding complexity, modular structure allowing for easy design, and rate compatibility so that the code can be easily matched to a variety of channel conditions without extensive re-optimization. The main problem of half-duplex relay coding can be reduced to the simultaneous design of two codes at two rates and two SNRs (signal-to-noise ratios), such that one is a subset of the other. This problem can be addressed by forceful optimization, but a clever method of addressing this problem is via the bilayer lengthened (BL) LDPC structure. This method uses a bilayer Tanner graph to make the two codes while using a concept of "parity forwarding" with subsequent successive decoding that removes the need to directly address the issue of uneven SNRs among the symbols of a given codeword. This method is attractive in that it addresses some of the main issues in the design of relay codes, but it does not by itself give rise to highly structured codes with simple encoding, nor does it give rate-compatible codes. The main contribution of this work is to construct a class of codes that simultaneously possess a bilayer parity- forwarding mechanism, while also benefiting from the properties of protograph codes having an easy encoding, a modular design, and being a rate-compatible code.
Deductive Glue Code Synthesis for Embedded Software Systems Based on Code Patterns
NASA Technical Reports Server (NTRS)
Liu, Jian; Fu, Jicheng; Zhang, Yansheng; Bastani, Farokh; Yen, I-Ling; Tai, Ann; Chau, Savio N.
2006-01-01
Automated code synthesis is a constructive process that can be used to generate programs from specifications. It can, thus, greatly reduce the software development cost and time. The use of formal code synthesis approach for software generation further increases the dependability of the system. Though code synthesis has many potential benefits, the synthesis techniques are still limited. Meanwhile, components are widely used in embedded system development. Applying code synthesis to component based software development (CBSD) process can greatly enhance the capability of code synthesis while reducing the component composition efforts. In this paper, we discuss the issues and techniques for applying deductive code synthesis techniques to CBSD. For deductive synthesis in CBSD, a rule base is the key for inferring appropriate component composition. We use the code patterns to guide the development of rules. Code patterns have been proposed to capture the typical usages of the components. Several general composition operations have been identified to facilitate systematic composition. We present the technique for rule development and automated generation of new patterns from existing code patterns. A case study of using this method in building a real-time control system is also presented.
Power optimization of wireless media systems with space-time block codes.
Yousefi'zadeh, Homayoun; Jafarkhani, Hamid; Moshfeghi, Mehran
2004-07-01
We present analytical and numerical solutions to the problem of power control in wireless media systems with multiple antennas. We formulate a set of optimization problems aimed at minimizing total power consumption of wireless media systems subject to a given level of QoS and an available bit rate. Our formulation takes into consideration the power consumption related to source coding, channel coding, and transmission of multiple-transmit antennas. In our study, we consider Gauss-Markov and video source models, Rayleigh fading channels along with the Bernoulli/Gilbert-Elliott loss models, and space-time block codes.
Survey Of Lossless Image Coding Techniques
NASA Astrophysics Data System (ADS)
Melnychuck, Paul W.; Rabbani, Majid
1989-04-01
Many image transmission/storage applications requiring some form of data compression additionally require that the decoded image be an exact replica of the original. Lossless image coding algorithms meet this requirement by generating a decoded image that is numerically identical to the original. Several lossless coding techniques are modifications of well-known lossy schemes, whereas others are new. Traditional Markov-based models and newer arithmetic coding techniques are applied to predictive coding, bit plane processing, and lossy plus residual coding. Generally speaking, the compression ratio offered by these techniques are in the area of 1.6:1 to 3:1 for 8-bit pictorial images. Compression ratios for 12-bit radiological images approach 3:1, as these images have less detailed structure, and hence, their higher pel correlation leads to a greater removal of image redundancy.
NASA Astrophysics Data System (ADS)
Roshanian, Jafar; Jodei, Jahangir; Mirshams, Mehran; Ebrahimi, Reza; Mirzaee, Masood
A new automated multi-level of fidelity Multi-Disciplinary Design Optimization (MDO) methodology has been developed at the MDO Laboratory of K.N. Toosi University of Technology. This paper explains a new design approach by formulation of developed disciplinary modules. A conceptual design for a small, solid-propellant launch vehicle was considered at two levels of fidelity structure. Low and medium level of fidelity disciplinary codes were developed and linked. Appropriate design and analysis codes were defined according to their effect on the conceptual design process. Simultaneous optimization of the launch vehicle was performed at the discipline level and system level. Propulsion, aerodynamics, structure and trajectory disciplinary codes were used. To reach the minimum launch weight, the Low LoF code first searches the whole design space to achieve the mission requirements. Then the medium LoF code receives the output of the low LoF and gives a value near the optimum launch weight with more details and higher fidelity.
Improving soft FEC performance for higher-order modulations via optimized bit channel mappings.
Häger, Christian; Amat, Alexandre Graell I; Brännström, Fredrik; Alvarado, Alex; Agrell, Erik
2014-06-16
Soft forward error correction with higher-order modulations is often implemented in practice via the pragmatic bit-interleaved coded modulation paradigm, where a single binary code is mapped to a nonbinary modulation. In this paper, we study the optimization of the mapping of the coded bits to the modulation bits for a polarization-multiplexed fiber-optical system without optical inline dispersion compensation. Our focus is on protograph-based low-density parity-check (LDPC) codes which allow for an efficient hardware implementation, suitable for high-speed optical communications. The optimization is applied to the AR4JA protograph family, and further extended to protograph-based spatially coupled LDPC codes assuming a windowed decoder. Full field simulations via the split-step Fourier method are used to verify the analysis. The results show performance gains of up to 0.25 dB, which translate into a possible extension of the transmission reach by roughly up to 8%, without significantly increasing the system complexity.
Design optimization of steel frames using an enhanced firefly algorithm
NASA Astrophysics Data System (ADS)
Carbas, Serdar
2016-12-01
Mathematical modelling of real-world-sized steel frames under the Load and Resistance Factor Design-American Institute of Steel Construction (LRFD-AISC) steel design code provisions, where the steel profiles for the members are selected from a table of steel sections, turns out to be a discrete nonlinear programming problem. Finding the optimum design of such design optimization problems using classical optimization techniques is difficult. Metaheuristic algorithms provide an alternative way of solving such problems. The firefly algorithm (FFA) belongs to the swarm intelligence group of metaheuristics. The standard FFA has the drawback of being caught up in local optima in large-sized steel frame design problems. This study attempts to enhance the performance of the FFA by suggesting two new expressions for the attractiveness and randomness parameters of the algorithm. Two real-world-sized design examples are designed by the enhanced FFA and its performance is compared with standard FFA as well as with particle swarm and cuckoo search algorithms.
The Optimization Design of An AC-Electroosmotic Micro mixer
NASA Astrophysics Data System (ADS)
Wang, Yangyang; Suh, Yongkweon; Kang, Sangmo
2007-11-01
We propose the optimization design of an AC-electroosmotic micro-mixer, which is composed of a channel and a series of pairs of electrodes attached on the bottom wall in zigzag patterns. The AC electric field is applied to the electrodes so that a fluid flow takes place around the electrodes across the channel, thus contributing to the mixing of the fluid within the channel. We have performed numerical simulations by using a commercial code (CFX 10) to optimize the shape and pattern of the electrodes via the concept of mixing index. It is found that the best combination of two kinds of electrodes, which leads to good mixing performance, is not simply harmonic one. When the length ratio of the two kinds of electrodes closes to 2:1, we can get the best mixing effect. Furthermore, we will visualize the flow pattern and measure the velocity field with a PTV technique to validate the numerical simulations. In addition, the mixing pattern will be visualized via the experiment.
Development of an adaptive hp-version finite element method for computational optimal control
NASA Technical Reports Server (NTRS)
Hodges, Dewey H.; Warner, Michael S.
1994-01-01
In this research effort, the usefulness of hp-version finite elements and adaptive solution-refinement techniques in generating numerical solutions to optimal control problems has been investigated. Under NAG-939, a general FORTRAN code was developed which approximated solutions to optimal control problems with control constraints and state constraints. Within that methodology, to get high-order accuracy in solutions, the finite element mesh would have to be refined repeatedly through bisection of the entire mesh in a given phase. In the current research effort, the order of the shape functions in each element has been made a variable, giving more flexibility in error reduction and smoothing. Similarly, individual elements can each be subdivided into many pieces, depending on the local error indicator, while other parts of the mesh remain coarsely discretized. The problem remains to reduce and smooth the error while still keeping computational effort reasonable enough to calculate time histories in a short enough time for on-board applications.
Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations
Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W.
2016-01-01
Within the recent years clock rates of modern processors stagnated while the demand for computing power continued to grow. This applied particularly for the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding for a certain techniques for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094
Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.
Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W
2016-01-01
Within the recent years clock rates of modern processors stagnated while the demand for computing power continued to grow. This applied particularly for the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding for a certain techniques for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures.
IESIP - AN IMPROVED EXPLORATORY SEARCH TECHNIQUE FOR PURE INTEGER LINEAR PROGRAMMING PROBLEMS
NASA Technical Reports Server (NTRS)
Fogle, F. R.
1994-01-01
IESIP, an Improved Exploratory Search Technique for Pure Integer Linear Programming Problems, addresses the problem of optimizing an objective function of one or more variables subject to a set of confining functions or constraints by a method called discrete optimization or integer programming. Integer programming is based on a specific form of the general linear programming problem in which all variables in the objective function and all variables in the constraints are integers. While more difficult, integer programming is required for accuracy when modeling systems with small numbers of components such as the distribution of goods, machine scheduling, and production scheduling. IESIP establishes a new methodology for solving pure integer programming problems by utilizing a modified version of the univariate exploratory move developed by Robert Hooke and T.A. Jeeves. IESIP also takes some of its technique from the greedy procedure and the idea of unit neighborhoods. A rounding scheme uses the continuous solution found by traditional methods (simplex or other suitable technique) and creates a feasible integer starting point. The Hook and Jeeves exploratory search is modified to accommodate integers and constraints and is then employed to determine an optimal integer solution from the feasible starting solution. The user-friendly IESIP allows for rapid solution of problems up to 10 variables in size (limited by DOS allocation). Sample problems compare IESIP solutions with the traditional branch-and-bound approach. IESIP is written in Borland's TURBO Pascal for IBM PC series computers and compatibles running DOS. Source code and an executable are provided. The main memory requirement for execution is 25K. This program is available on a 5.25 inch 360K MS DOS format diskette. IESIP was developed in 1990. IBM is a trademark of International Business Machines. TURBO Pascal is registered by Borland International.
Knowledge-Based Reinforcement Learning for Data Mining
NASA Astrophysics Data System (ADS)
Kudenko, Daniel; Grzes, Marek
Data Mining is the process of extracting patterns from data. Two general avenues of research in the intersecting areas of agents and data mining can be distinguished. The first approach is concerned with mining an agent’s observation data in order to extract patterns, categorize environment states, and/or make predictions of future states. In this setting, data is normally available as a batch, and the agent’s actions and goals are often independent of the data mining task. The data collection is mainly considered as a side effect of the agent’s activities. Machine learning techniques applied in such situations fall into the class of supervised learning. In contrast, the second scenario occurs where an agent is actively performing the data mining, and is responsible for the data collection itself. For example, a mobile network agent is acquiring and processing data (where the acquisition may incur a certain cost), or a mobile sensor agent is moving in a (perhaps hostile) environment, collecting and processing sensor readings. In these settings, the tasks of the agent and the data mining are highly intertwined and interdependent (or even identical). Supervised learning is not a suitable technique for these cases. Reinforcement Learning (RL) enables an agent to learn from experience (in form of reward and punishment for explorative actions) and adapt to new situations, without a teacher. RL is an ideal learning technique for these data mining scenarios, because it fits the agent paradigm of continuous sensing and acting, and the RL agent is able to learn to make decisions on the sampling of the environment which provides the data. Nevertheless, RL still suffers from scalability problems, which have prevented its successful use in many complex real-world domains. The more complex the tasks, the longer it takes a reinforcement learning algorithm to converge to a good solution. For many real-world tasks, human expert knowledge is available. For example, human experts have developed heuristics that help them in planning and scheduling resources in their work place. However, this domain knowledge is often rough and incomplete. When the domain knowledge is used directly by an automated expert system, the solutions are often sub-optimal, due to the incompleteness of the knowledge, the uncertainty of environments, and the possibility to encounter unexpected situations. RL, on the other hand, can overcome the weaknesses of the heuristic domain knowledge and produce optimal solutions. In the talk we propose two techniques, which represent first steps in the area of knowledge-based RL (KBRL). The first technique [1] uses high-level STRIPS operator knowledge in reward shaping to focus the search for the optimal policy. Empirical results show that the plan-based reward shaping approach outperforms other RL techniques, including alternative manual and MDP-based reward shaping when it is used in its basic form. We showed that MDP-based reward shaping may fail and successful experiments with STRIPS-based shaping suggest modifications which can overcome encountered problems. The STRIPSbased method we propose allows expressing the same domain knowledge in a different way and the domain expert can choose whether to define an MDP or STRIPS planning task. We also evaluated the robustness of the proposed STRIPS-based technique to errors in the plan knowledge. In case that STRIPS knowledge is not available, we propose a second technique [2] that shapes the reward with hierarchical tile coding. Where the Q-function is represented with low-level tile coding, a V-function with coarser tile coding can be learned in parallel and used to approximate the potential for ground states. In the context of data mining, our KBRL approaches can also be used for any data collection task where the acquisition of data may incur considerable cost. In addition, observing the data collection agent in specific scenarios may lead to new insights into optimal data collection behaviour in the respective domains. In future work, we intend to demonstrate and evaluate our techniques on concrete real-world data mining applications.
Functional Programming with C++ Template Metaprograms
NASA Astrophysics Data System (ADS)
Porkoláb, Zoltán
Template metaprogramming is an emerging new direction of generative programming. With the clever definitions of templates we can force the C++ compiler to execute algorithms at compilation time. Among the application areas of template metaprograms are the expression templates, static interface checking, code optimization with adaption, language embedding and active libraries. However, as template metaprogramming was not an original design goal, the C++ language is not capable of elegant expression of metaprograms. The complicated syntax leads to the creation of code that is hard to write, understand and maintain. Although template metaprogramming has a strong relationship with functional programming, this is not reflected in the language syntax and existing libraries. In this paper we give a short and incomplete introduction to C++ templates and the basics of template metaprogramming. We will enlight the role of template metaprograms, and some important and widely used idioms. We give an overview of the possible application areas as well as debugging and profiling techniques. We suggest a pure functional style programming interface for C++ template metaprograms in the form of embedded Haskell code which is transformed to standard compliant C++ source.
Zucchelli, Silvia; Patrucco, Laura; Persichetti, Francesca; Gustincich, Stefano; Cotella, Diego
2016-01-01
Mammalian cells are an indispensable tool for the production of recombinant proteins in contexts where function depends on post-translational modifications. Among them, Chinese Hamster Ovary (CHO) cells are the primary factories for the production of therapeutic proteins, including monoclonal antibodies (MAbs). To improve expression and stability, several methodologies have been adopted, including methods based on media formulation, selective pressure and cell- or vector engineering. This review presents current approaches aimed at improving mammalian cell factories that are based on the enhancement of translation. Among well-established techniques (codon optimization and improvement of mRNA secondary structure), we describe SINEUPs, a family of antisense long non-coding RNAs that are able to increase translation of partially overlapping protein-coding mRNAs. By exploiting their modular structure, SINEUP molecules can be designed to target virtually any mRNA of interest, and thus to increase the production of secreted proteins. Thus, synthetic SINEUPs represent a new versatile tool to improve the production of secreted proteins in biomanufacturing processes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hellfeld, Daniel; Barton, Paul; Gunter, Donald
Gamma-ray imaging facilitates the efficient detection, characterization, and localization of compact radioactive sources in cluttered environments. Fieldable detector systems employing active planar coded apertures have demonstrated broad energy sensitivity via both coded aperture and Compton imaging modalities. But, planar configurations suffer from a limited field-of-view, especially in the coded aperture mode. In order to improve upon this limitation, we introduce a novel design by rearranging the detectors into an active coded spherical configuration, resulting in a 4pi isotropic field-of-view for both coded aperture and Compton imaging. This work focuses on the low- energy coded aperture modality and the optimization techniquesmore » used to determine the optimal number and configuration of 1 cm 3 CdZnTe coplanar grid detectors on a 14 cm diameter sphere with 192 available detector locations.« less
Vector processing efficiency of plasma MHD codes by use of the FACOM 230-75 APU
NASA Astrophysics Data System (ADS)
Matsuura, T.; Tanaka, Y.; Naraoka, K.; Takizuka, T.; Tsunematsu, T.; Tokuda, S.; Azumi, M.; Kurita, G.; Takeda, T.
1982-06-01
In the framework of pipelined vector architecture, the efficiency of vector processing is assessed with respect to plasma MHD codes in nuclear fusion research. By using a vector processor, the FACOM 230-75 APU, the limit of the enhancement factor due to parallelism of current vector machines is examined for three numerical codes based on a fluid model. Reasonable speed-up factors of approximately 6,6 and 4 times faster than the highly optimized scalar version are obtained for ERATO (linear stability code), AEOLUS-R1 (nonlinear stability code) and APOLLO (1-1/2D transport code), respectively. Problems of the pipelined vector processors are discussed from the viewpoint of restructuring, optimization and choice of algorithms. In conclusion, the important concept of "concurrency within pipelined parallelism" is emphasized.
Klann, Jeffrey G; Phillips, Lori C; Turchin, Alexander; Weiler, Sarah; Mandl, Kenneth D; Murphy, Shawn N
2015-12-11
Interoperable phenotyping algorithms, needed to identify patient cohorts meeting eligibility criteria for observational studies or clinical trials, require medical data in a consistent structured, coded format. Data heterogeneity limits such algorithms' applicability. Existing approaches are often: not widely interoperable; or, have low sensitivity due to reliance on the lowest common denominator (ICD-9 diagnoses). In the Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS) we endeavor to use the widely-available Current Procedural Terminology (CPT) procedure codes with ICD-9. Unfortunately, CPT changes drastically year-to-year - codes are retired/replaced. Longitudinal analysis requires grouping retired and current codes. BioPortal provides a navigable CPT hierarchy, which we imported into the Informatics for Integrating Biology and the Bedside (i2b2) data warehouse and analytics platform. However, this hierarchy does not include retired codes. We compared BioPortal's 2014AA CPT hierarchy with Partners Healthcare's SCILHS datamart, comprising three-million patients' data over 15 years. 573 CPT codes were not present in 2014AA (6.5 million occurrences). No existing terminology provided hierarchical linkages for these missing codes, so we developed a method that automatically places missing codes in the most specific "grouper" category, using the numerical similarity of CPT codes. Two informaticians reviewed the results. We incorporated the final table into our i2b2 SCILHS/PCORnet ontology, deployed it at seven sites, and performed a gap analysis and an evaluation against several phenotyping algorithms. The reviewers found the method placed the code correctly with 97 % precision when considering only miscategorizations ("correctness precision") and 52 % precision using a gold-standard of optimal placement ("optimality precision"). High correctness precision meant that codes were placed in a reasonable hierarchal position that a reviewer can quickly validate. Lower optimality precision meant that codes were not often placed in the optimal hierarchical subfolder. The seven sites encountered few occurrences of codes outside our ontology, 93 % of which comprised just four codes. Our hierarchical approach correctly grouped retired and non-retired codes in most cases and extended the temporal reach of several important phenotyping algorithms. We developed a simple, easily-validated, automated method to place retired CPT codes into the BioPortal CPT hierarchy. This complements existing hierarchical terminologies, which do not include retired codes. The approach's utility is confirmed by the high correctness precision and successful grouping of retired with non-retired codes.
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc
1998-01-01
For long linear block codes, maximum likelihood decoding based on full code trellises would be very hard to implement if not impossible. In this case, we may wish to trade error performance for the reduction in decoding complexity. Sub-optimum soft-decision decoding of a linear block code based on a low-weight sub-trellis can be devised to provide an effective trade-off between error performance and decoding complexity. This chapter presents such a suboptimal decoding algorithm for linear block codes. This decoding algorithm is iterative in nature and based on an optimality test. It has the following important features: (1) a simple method to generate a sequence of candidate code-words, one at a time, for test; (2) a sufficient condition for testing a candidate code-word for optimality; and (3) a low-weight sub-trellis search for finding the most likely (ML) code-word.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graf, Peter; Dykes, Katherine; Scott, George
The layout of turbines in a wind farm is already a challenging nonlinear, nonconvex, nonlinearly constrained continuous global optimization problem. Here we begin to address the next generation of wind farm optimization problems by adding the complexity that there is more than one turbine type to choose from. The optimization becomes a nonlinear constrained mixed integer problem, which is a very difficult class of problems to solve. Furthermore, this document briefly summarizes the algorithm and code we have developed, the code validation steps we have performed, and the initial results for multi-turbine type and placement optimization (TTP_OPT) we have run.
Manual of phosphoric acid fuel cell power plant optimization model and computer program
NASA Technical Reports Server (NTRS)
Lu, C. Y.; Alkasab, K. A.
1984-01-01
An optimized cost and performance model for a phosphoric acid fuel cell power plant system was derived and developed into a modular FORTRAN computer code. Cost, energy, mass, and electrochemical analyses were combined to develop a mathematical model for optimizing the steam to methane ratio in the reformer, hydrogen utilization in the PAFC plates per stack. The nonlinear programming code, COMPUTE, was used to solve this model, in which the method of mixed penalty function combined with Hooke and Jeeves pattern search was chosen to evaluate this specific optimization problem.
Design optimization studies using COSMIC NASTRAN
NASA Technical Reports Server (NTRS)
Pitrof, Stephen M.; Bharatram, G.; Venkayya, Vipperla B.
1993-01-01
The purpose of this study is to create, test and document a procedure to integrate mathematical optimization algorithms with COSMIC NASTRAN. This procedure is very important to structural design engineers who wish to capitalize on optimization methods to ensure that their design is optimized for its intended application. The OPTNAST computer program was created to link NASTRAN and design optimization codes into one package. This implementation was tested using two truss structure models and optimizing their designs for minimum weight, subject to multiple loading conditions and displacement and stress constraints. However, the process is generalized so that an engineer could design other types of elements by adding to or modifying some parts of the code.
NASA Technical Reports Server (NTRS)
Lewis, Michael
1994-01-01
Statistical encoding techniques enable the reduction of the number of bits required to encode a set of symbols, and are derived from their probabilities. Huffman encoding is an example of statistical encoding that has been used for error-free data compression. The degree of compression given by Huffman encoding in this application can be improved by the use of prediction methods. These replace the set of elevations by a set of corrections that have a more advantageous probability distribution. In particular, the method of Lagrange Multipliers for minimization of the mean square error has been applied to local geometrical predictors. Using this technique, an 8-point predictor achieved about a 7 percent improvement over an existing simple triangular predictor.
Balancing Particle and Mesh Computation in a Particle-In-Cell Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Worley, Patrick H; D'Azevedo, Eduardo; Hager, Robert
2016-01-01
The XGC1 plasma microturbulence particle-in-cell simulation code has both particle-based and mesh-based computational kernels that dominate performance. Both of these are subject to load imbalances that can degrade performance and that evolve during a simulation. Each separately can be addressed adequately, but optimizing just for one can introduce significant load imbalances in the other, degrading overall performance. A technique has been developed based on Golden Section Search that minimizes wallclock time given prior information on wallclock time, and on current particle distribution and mesh cost per cell, and also adapts to evolution in load imbalance in both particle and meshmore » work. In problems of interest this doubled the performance on full system runs on the XK7 at the Oak Ridge Leadership Computing Facility compared to load balancing only one of the kernels.« less
A high-throughput exploration of magnetic materials by using structure predicting methods
NASA Astrophysics Data System (ADS)
Arapan, S.; Nieves, P.; Cuesta-López, S.
2018-02-01
We study the capability of a structure predicting method based on genetic/evolutionary algorithm for a high-throughput exploration of magnetic materials. We use the USPEX and VASP codes to predict stable and generate low-energy meta-stable structures for a set of representative magnetic structures comprising intermetallic alloys, oxides, interstitial compounds, and systems containing rare-earths elements, and for both types of ferromagnetic and antiferromagnetic ordering. We have modified the interface between USPEX and VASP codes to improve the performance of structural optimization as well as to perform calculations in a high-throughput manner. We show that exploring the structure phase space with a structure predicting technique reveals large sets of low-energy metastable structures, which not only improve currently exiting databases, but also may provide understanding and solutions to stabilize and synthesize magnetic materials suitable for permanent magnet applications.
Conditional Entropy-Constrained Residual VQ with Application to Image Coding
NASA Technical Reports Server (NTRS)
Kossentini, Faouzi; Chung, Wilson C.; Smith, Mark J. T.
1996-01-01
This paper introduces an extension of entropy-constrained residual vector quantization (VQ) where intervector dependencies are exploited. The method, which we call conditional entropy-constrained residual VQ, employs a high-order entropy conditioning strategy that captures local information in the neighboring vectors. When applied to coding images, the proposed method is shown to achieve better rate-distortion performance than that of entropy-constrained residual vector quantization with less computational complexity and lower memory requirements. Moreover, it can be designed to support progressive transmission in a natural way. It is also shown to outperform some of the best predictive and finite-state VQ techniques reported in the literature. This is due partly to the joint optimization between the residual vector quantizer and a high-order conditional entropy coder as well as the efficiency of the multistage residual VQ structure and the dynamic nature of the prediction.
Application of CFD to the analysis and design of high-speed inlets
NASA Technical Reports Server (NTRS)
Rose, William C.
1995-01-01
Over the past seven years, efforts under the present Grant have been aimed at being able to apply modern Computational Fluid Dynamics to the design of high-speed engine inlets. In this report, a review of previous design capabilities (prior to the advent of functioning CFD) was presented and the example of the NASA 'Mach 5 inlet' design was given as the premier example of the historical approach to inlet design. The philosophy used in the Mach 5 inlet design was carried forward in the present study, in which CFD was used to design a new Mach 10 inlet. An example of an inlet redesign was also shown. These latter efforts were carried out using today's state-of-the-art, full computational fluid dynamics codes applied in an iterative man-in-the-loop technique. The potential usefulness of an automated machine design capability using an optimizer code was also discussed.
Symmetry-Based Variance Reduction Applied to 60Co Teletherapy Unit Monte Carlo Simulations
NASA Astrophysics Data System (ADS)
Sheikh-Bagheri, D.
A new variance reduction technique (VRT) is implemented in the BEAM code [1] to specifically improve the efficiency of calculating penumbral distributions of in-air fluence profiles calculated for isotopic sources. The simulations focus on 60Co teletherapy units. The VRT includes splitting of photons exiting the source capsule of a 60Co teletherapy source according to a splitting recipe and distributing the split photons randomly on the periphery of a circle, preserving the direction cosine along the beam axis, in addition to the energy of the photon. It is shown that the use of the VRT developed in this work can lead to a 6-9 fold improvement in the efficiency of the penumbral photon fluence of a 60Co beam compared to that calculated using the standard optimized BEAM code [1] (i.e., one with the proper selection of electron transport parameters).
Gregarious Data Re-structuring in a Many Core Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, Sunil; Manzano Franco, Joseph B.; Marquez, Andres
this paper, we have developed a new methodology that takes in consideration the access patterns from a single parallel actor (e.g. a thread), as well as, the access patterns of “grouped” parallel actors that share a resource (e.g. a distributed Level 3 cache). We start with a hierarchical tile code for our target machine and apply a series of transformations at the tile level to improve data residence in a given memory hierarchy level. The contribution of this paper includes (a) collaborative data restructuring for group reuse and (b) low overhead transformation technique to improve access pattern and bring closelymore » connected data elements together. Preliminary results in a many core architecture, Tilera TileGX, shows promising improvements over optimized OpenMP code (up to 31% increase in GFLOPS) and over our own previous work on fine grained runtimes (up to 16%) for selected kernels« less
NASA Astrophysics Data System (ADS)
Sanan, P.; Tackley, P. J.; Gerya, T.; Kaus, B. J. P.; May, D.
2017-12-01
StagBL is an open-source parallel solver and discretization library for geodynamic simulation,encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers.It provides a parallel staggered-grid abstraction with a high-level interface in C and Fortran.On top of this abstraction, tools are available to define boundary conditions and interact with particle systems.Tools and examples to efficiently solve Stokes systems defined on the grid are provided in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes.By working directly with leading application codes (StagYY, I3ELVIS, and LaMEM) and providing an API and examples to integrate with others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance toward novel science in regional- and planet-scale geodynamics and planetary science.By implementing kernels used by many research groups beneath a uniform abstraction layer, the library will enable optimization for modern hardware, thus reducing community barriers to large- or extreme-scale parallel simulation on modern architectures. In particular, the library will include CPU-, Manycore-, and GPU-optimized variants of matrix-free operators and multigrid components.The common layer provides a framework upon which to introduce innovative new tools.StagBL will leverage p4est to provide distributed adaptive meshes, and incorporate a multigrid convergence analysis tool.These options, in addition to a wealth of solver options provided by an interface to PETSc, will make the most modern solution techniques available from a common interface. StagBL in turn provides a PETSc interface, DMStag, to its central staggered grid abstraction.We present public version 0.5 of StagBL, including preliminary integration with application codes and demonstrations with its own demonstration application, StagBLDemo. Central to StagBL is the notion of an uninterrupted pipeline from toy/teaching codes to high-performance, extreme-scale solves. StagBLDemo replicates the functionality of an advanced MATLAB-style regional geodynamics code, thus providing users with a concrete procedure to exceed the performance and scalability limitations of smaller-scale tools.
NASA Technical Reports Server (NTRS)
Cheng, Michael K.; Lyubarev, Mark; Nakashima, Michael A.; Andrews, Kenneth S.; Lee, Dennis
2008-01-01
Low-density parity-check (LDPC) codes are the state-of-the-art in forward error correction (FEC) technology that exhibits capacity approaching performance. The Jet Propulsion Laboratory (JPL) has designed a family of LDPC codes that are similar in structure and therefore, leads to a single decoder implementation. The Accumulate-Repeat-by-4-Jagged- Accumulate (AR4JA) code design offers a family of codes with rates 1/2, 2/3, 4/5 and lengths 1024, 4096, 16384 information bits. Performance is less than one dB from capacity for all combinations.Integrating a stand-alone LDPC decoder with a commercial-off-the-shelf (COTS) receiver faces additional challenges than building a single receiver-decoder unit from scratch. In this work, we outline the issues and show that these additional challenges can be over-come by simple solutions. To demonstrate that an LDPC decoder can be made to work seamlessly with a COTS receiver, we interface an AR4JA LDPC decoder developed on a field-programmable gate array (FPGA) with a modern high data rate receiver and mea- sure the combined receiver-decoder performance. Through optimizations that include an improved frame synchronizer and different soft-symbol scaling algorithms, we show that a combined implementation loss of less than one dB is possible and therefore, most of the coding gain evidence in theory can also be obtained in practice. Our techniques can benefit any modem that utilizes an advanced FEC code.
Joint Source-Channel Decoding of Variable-Length Codes with Soft Information: A Survey
NASA Astrophysics Data System (ADS)
Guillemot, Christine; Siohan, Pierre
2005-12-01
Multimedia transmission over time-varying wireless channels presents a number of challenges beyond existing capabilities conceived so far for third-generation networks. Efficient quality-of-service (QoS) provisioning for multimedia on these channels may in particular require a loosening and a rethinking of the layer separation principle. In that context, joint source-channel decoding (JSCD) strategies have gained attention as viable alternatives to separate decoding of source and channel codes. A statistical framework based on hidden Markov models (HMM) capturing dependencies between the source and channel coding components sets the foundation for optimal design of techniques of joint decoding of source and channel codes. The problem has been largely addressed in the research community, by considering both fixed-length codes (FLC) and variable-length source codes (VLC) widely used in compression standards. Joint source-channel decoding of VLC raises specific difficulties due to the fact that the segmentation of the received bitstream into source symbols is random. This paper makes a survey of recent theoretical and practical advances in the area of JSCD with soft information of VLC-encoded sources. It first describes the main paths followed for designing efficient estimators for VLC-encoded sources, the key component of the JSCD iterative structure. It then presents the main issues involved in the application of the turbo principle to JSCD of VLC-encoded sources as well as the main approaches to source-controlled channel decoding. This survey terminates by performance illustrations with real image and video decoding systems.
Non-isotopic Method for In Situ LncRNA Visualization and Quantitation.
Maqsodi, Botoul; Nikoloff, Corina
2016-01-01
In mammals and other eukaryotes, most of the genome is transcribed in a developmentally regulated manner to produce large numbers of long noncoding RNAs (lncRNAs). Genome-wide studies have identified thousands of lncRNAs lacking protein-coding capacity. RNA in situ hybridization technique is especially beneficial for the visualization of RNA (mRNA and lncRNA) expression in a heterogeneous population of cells/tissues; however its utility has been hampered by complicated procedures typically developed and optimized for the detection of a specific gene and therefore not amenable to a wide variety of genes and tissues.Recently, bDNA has revolutionized RNA in situ detection with fully optimized, robust assays for the detection of any mRNA and lncRNA targets in formalin-fixed paraffin-embedded (FFPE) and fresh frozen tissue sections using manual processing.
Efficacy of Code Optimization on Cache-Based Processors
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Saphir, William C.; Chancellor, Marisa K. (Technical Monitor)
1997-01-01
In this paper a number of techniques for improving the cache performance of a representative piece of numerical software is presented. Target machines are popular processors from several vendors: MIPS R5000 (SGI Indy), MIPS R8000 (SGI PowerChallenge), MIPS R10000 (SGI Origin), DEC Alpha EV4 + EV5 (Cray T3D & T3E), IBM RS6000 (SP Wide-node), Intel PentiumPro (Ames' Whitney), Sun UltraSparc (NERSC's NOW). The optimizations all attempt to increase the locality of memory accesses. But they meet with rather varied and often counterintuitive success on the different computing platforms. We conclude that it may be genuinely impossible to obtain portable performance on the current generation of cache-based machines. At the least, it appears that the performance of modern commodity processors cannot be described with parameters defining the cache alone.
Multivariable optimization of liquid rocket engines using particle swarm algorithms
NASA Astrophysics Data System (ADS)
Jones, Daniel Ray
Liquid rocket engines are highly reliable, controllable, and efficient compared to other conventional forms of rocket propulsion. As such, they have seen wide use in the space industry and have become the standard propulsion system for launch vehicles, orbit insertion, and orbital maneuvering. Though these systems are well understood, historical optimization techniques are often inadequate due to the highly non-linear nature of the engine performance problem. In this thesis, a Particle Swarm Optimization (PSO) variant was applied to maximize the specific impulse of a finite-area combustion chamber (FAC) equilibrium flow rocket performance model by controlling the engine's oxidizer-to-fuel ratio and de Laval nozzle expansion and contraction ratios. In addition to the PSO-controlled parameters, engine performance was calculated based on propellant chemistry, combustion chamber pressure, and ambient pressure, which are provided as inputs to the program. The performance code was validated by comparison with NASA's Chemical Equilibrium with Applications (CEA) and the commercially available Rocket Propulsion Analysis (RPA) tool. Similarly, the PSO algorithm was validated by comparison with brute-force optimization, which calculates all possible solutions and subsequently determines which is the optimum. Particle Swarm Optimization was shown to be an effective optimizer capable of quick and reliable convergence for complex functions of multiple non-linear variables.
Simulation of prompt gamma-ray emission during proton radiotherapy.
Verburg, Joost M; Shih, Helen A; Seco, Joao
2012-09-07
The measurement of prompt gamma rays emitted from proton-induced nuclear reactions has been proposed as a method to verify in vivo the range of a clinical proton radiotherapy beam. A good understanding of the prompt gamma-ray emission during proton therapy is key to develop a clinically feasible technique, as it can facilitate accurate simulations and uncertainty analysis of gamma detector designs. Also, the gamma production cross-sections may be incorporated as prior knowledge in the reconstruction of the proton range from the measurements. In this work, we performed simulations of proton-induced nuclear reactions with the main elements of human tissue, carbon-12, oxygen-16 and nitrogen-14, using the nuclear reaction models of the GEANT4 and MCNP6 Monte Carlo codes and the dedicated nuclear reaction codes TALYS and EMPIRE. For each code, we made an effort to optimize the input parameters and model selection. The results of the models were compared to available experimental data of discrete gamma line cross-sections. Overall, the dedicated nuclear reaction codes reproduced the experimental data more consistently, while the Monte Carlo codes showed larger discrepancies for a number of gamma lines. The model differences lead to a variation of the total gamma production near the end of the proton range by a factor of about 2. These results indicate a need for additional theoretical and experimental study of proton-induced gamma emission in human tissue.
NASA Technical Reports Server (NTRS)
Meyn, Larry A.
2018-01-01
One of the goals of NASA's Revolutionary Vertical Lift Technology Project (RVLT) is to provide validated tools for multidisciplinary design, analysis and optimization (MDAO) of vertical lift vehicles. As part of this effort, the software package, RotorCraft Optimization Tools (RCOTOOLS), is being developed to facilitate incorporating key rotorcraft conceptual design codes into optimizations using the OpenMDAO multi-disciplinary optimization framework written in Python. RCOTOOLS, also written in Python, currently supports the incorporation of the NASA Design and Analysis of RotorCraft (NDARC) vehicle sizing tool and the Comprehensive Analytical Model of Rotorcraft Aerodynamics and Dynamics II (CAMRAD II) analysis tool into OpenMDAO-driven optimizations. Both of these tools use detailed, file-based inputs and outputs, so RCOTOOLS provides software wrappers to update input files with new design variable values, execute these codes and then extract specific response variable values from the file outputs. These wrappers are designed to be flexible and easy to use. RCOTOOLS also provides several utilities to aid in optimization model development, including Graphical User Interface (GUI) tools for browsing input and output files in order to identify text strings that are used to identify specific variables as optimization input and response variables. This paper provides an overview of RCOTOOLS and its use
NASA Astrophysics Data System (ADS)
Cai, Huai-yu; Dong, Xiao-tong; Zhu, Meng; Huang, Zhan-hua
2018-01-01
Wavefront coding for athermal technique can effectively ensure the stability of the optical system imaging in large temperature range, as well as the advantages of compact structure and low cost. Using simulation method to analyze the properties such as PSF and MTF of wavefront coding athermal system under several typical temperature gradient distributions has directive function to characterize the working state of non-ideal temperature environment, and can effectively realize the system design indicators as well. In this paper, we utilize the interoperability of data between Solidworks and ZEMAX to simplify the traditional process of structure/thermal/optical integrated analysis. Besides, we design and build the optical model and corresponding mechanical model of the infrared imaging wavefront coding athermal system. The axial and radial temperature gradients of different degrees are applied to the whole system by using SolidWorks software, thus the changes of curvature, refractive index and the distance between the lenses are obtained. Then, we import the deformation model to ZEMAX for ray tracing, and obtain the changes of PSF and MTF in optical system. Finally, we discuss and evaluate the consistency of the PSF (MTF) of the wavefront coding athermal system and the image restorability, which provides the basis and reference for the optimal design of the wavefront coding athermal system. The results show that the adaptability of single material infrared wavefront coding athermal system to axial temperature gradient can reach the upper limit of temperature fluctuation of 60°C, which is much higher than that of radial temperature gradient.
Optimal Codes for the Burst Erasure Channel
NASA Technical Reports Server (NTRS)
Hamkins, Jon
2010-01-01
Deep space communications over noisy channels lead to certain packets that are not decodable. These packets leave gaps, or bursts of erasures, in the data stream. Burst erasure correcting codes overcome this problem. These are forward erasure correcting codes that allow one to recover the missing gaps of data. Much of the recent work on this topic concentrated on Low-Density Parity-Check (LDPC) codes. These are more complicated to encode and decode than Single Parity Check (SPC) codes or Reed-Solomon (RS) codes, and so far have not been able to achieve the theoretical limit for burst erasure protection. A block interleaved maximum distance separable (MDS) code (e.g., an SPC or RS code) offers near-optimal burst erasure protection, in the sense that no other scheme of equal total transmission length and code rate could improve the guaranteed correctible burst erasure length by more than one symbol. The optimality does not depend on the length of the code, i.e., a short MDS code block interleaved to a given length would perform as well as a longer MDS code interleaved to the same overall length. As a result, this approach offers lower decoding complexity with better burst erasure protection compared to other recent designs for the burst erasure channel (e.g., LDPC codes). A limitation of the design is its lack of robustness to channels that have impairments other than burst erasures (e.g., additive white Gaussian noise), making its application best suited for correcting data erasures in layers above the physical layer. The efficiency of a burst erasure code is the length of its burst erasure correction capability divided by the theoretical upper limit on this length. The inefficiency is one minus the efficiency. The illustration compares the inefficiency of interleaved RS codes to Quasi-Cyclic (QC) LDPC codes, Euclidean Geometry (EG) LDPC codes, extended Irregular Repeat Accumulate (eIRA) codes, array codes, and random LDPC codes previously proposed for burst erasure protection. As can be seen, the simple interleaved RS codes have substantially lower inefficiency over a wide range of transmission lengths.
NASA Technical Reports Server (NTRS)
Rajpal, Sandeep; Rhee, DoJun; Lin, Shu
1997-01-01
In this paper, we will use the construction technique proposed in to construct multidimensional trellis coded modulation (TCM) codes for both the additive white Gaussian noise (AWGN) and the fading channels. Analytical performance bounds and simulation results show that these codes perform very well and achieve significant coding gains over uncoded reference modulation systems. In addition, the proposed technique can be used to construct codes which have a performance/decoding complexity advantage over the codes listed in literature.
Winter, Sandra J; Sheats, Jylana L; King, Abby C
2016-01-01
This review examined the use of health behavior change techniques and theory in technology-enabled interventions targeting risk factors and indicators for cardiovascular disease (CVD) prevention and treatment. Articles targeting physical activity, weight loss, smoking cessation and management of hypertension, lipids and blood glucose were sourced from PubMed (November 2010-2015) and coded for use of 1) technology, 2) health behavior change techniques (using the CALO-RE taxonomy), and 3) health behavior theories. Of the 984 articles reviewed, 304 were relevant (240=intervention, 64=review). Twenty-two different technologies were used (M=1.45, SD=+/−0.719). The most frequently used behavior change techniques were self-monitoring and feedback on performance (M=5.4, SD=+/−2.9). Half (52%) of the intervention studies named a theory/model - most frequently Social Cognitive Theory, the Trans-theoretical Model, and the Theory of Planned Behavior/Reasoned Action. To optimize technology-enabled interventions targeting CVD risk factors, integrated behavior change theories that incorporate a variety of evidence-based health behavior change techniques are needed. PMID:26902519
A finite element code for electric motor design
NASA Technical Reports Server (NTRS)
Campbell, C. Warren
1994-01-01
FEMOT is a finite element program for solving the nonlinear magnetostatic problem. This version uses nonlinear, Newton first order elements. The code can be used for electric motor design and analysis. FEMOT can be embedded within an optimization code that will vary nodal coordinates to optimize the motor design. The output from FEMOT can be used to determine motor back EMF, torque, cogging, and magnet saturation. It will run on a PC and will be available to anyone who wants to use it.
Practical adaptive quantum tomography
NASA Astrophysics Data System (ADS)
Granade, Christopher; Ferrie, Christopher; Flammia, Steven T.
2017-11-01
We introduce a fast and accurate heuristic for adaptive tomography that addresses many of the limitations of prior methods. Previous approaches were either too computationally intensive or tailored to handle special cases such as single qubits or pure states. By contrast, our approach combines the efficiency of online optimization with generally applicable and well-motivated data-processing techniques. We numerically demonstrate these advantages in several scenarios including mixed states, higher-dimensional systems, and restricted measurements. http://cgranade.com complete data and source code for this work are available online [1], and can be previewed at https://goo.gl/koiWxR.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spentzouris, Panagiotis; Amundson, J.; /Fermilab
2005-01-01
During the past year, the Fermilab Booster has been pushed to record intensities in order to satisfy the needs of the Tevatron collider and neutrino programs. This high intensity makes the study of space-charge effects and halo formation highly relevant to optimizing Booster performance. We present measurements of beam width evolution, halo formation, and coherent tune shifts, emphasizing the experimental techniques used and the calibration of the measuring devices. We also use simulations utilizing the 3D space-charge code Synergia to study the physical origins of these effects.
Near-field phase-change recording using a GaN laser diode
NASA Astrophysics Data System (ADS)
Kishima, Koichiro; Ichimura, Isao; Yamamoto, Kenji; Osato, Kiyoshi; Kuroda, Yuji; Iida, Atsushi; Saito, Kimihiro
2000-09-01
We developed a 1.5-Numerical-Aperture optical setup using a GaN blue-violet laser diode. We used a 1.0 mm-diameter super-hemispherical solid immersion lens, and optimized a phase-change disk structure including the cover layer by the method of MTF simulation. The disk surface was polished by tape burnishing technique. An eye-pattern of (1-7)-coded data at the linear density of 80 nm/bit was demonstrated on the phase-change disk below a 50 nm gap height, which was realized through our air-gap servo mechanism.
Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.
Chen, S; Samingan, A K; Hanzo, L
2001-01-01
The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
Supersonic second order analysis and optimization program user's manual
NASA Technical Reports Server (NTRS)
Clever, W. C.
1984-01-01
Approximate nonlinear inviscid theoretical techniques for predicting aerodynamic characteristics and surface pressures for relatively slender vehicles at supersonic and moderate hypersonic speeds were developed. Emphasis was placed on approaches that would be responsive to conceptual configuration design level of effort. Second order small disturbance theory was utilized to meet this objective. Numerical codes were developed for analysis and design of relatively general three dimensional geometries. Results from the computations indicate good agreement with experimental results for a variety of wing, body, and wing-body shapes. Case computational time of one minute on a CDC 176 are typical for practical aircraft arrangement.
Design of a setup for 252Cf neutron source for storage and analysis purpose
NASA Astrophysics Data System (ADS)
Hei, Daqian; Zhuang, Haocheng; Jia, Wenbao; Cheng, Can; Jiang, Zhou; Wang, Hongtao; Chen, Da
2016-11-01
252Cf is a reliable isotopic neutron source and widely used in the prompt gamma ray neutron activation analysis (PGNAA) technique. A cylindrical barrel made by polymethyl methacrylate contained with the boric acid solution was designed for storage and application of a 5 μg 252Cf neutron source. The size of the setup was optimized with Monte Carlo code. The experiments were performed and the results showed the doses were reduced with the setup and less than the allowable limit. The intensity and collimating radius of the neutron beam could also be adjusted through different collimator.
Serenity: A subsystem quantum chemistry program.
Unsleber, Jan P; Dresselhaus, Thomas; Klahr, Kevin; Schnieders, David; Böckers, Michael; Barton, Dennis; Neugebauer, Johannes
2018-05-15
We present the new quantum chemistry program Serenity. It implements a wide variety of functionalities with a focus on subsystem methodology. The modular code structure in combination with publicly available external tools and particular design concepts ensures extensibility and robustness with a focus on the needs of a subsystem program. Several important features of the program are exemplified with sample calculations with subsystem density-functional theory, potential reconstruction techniques, a projection-based embedding approach and combinations thereof with geometry optimization, semi-numerical frequency calculations and linear-response time-dependent density-functional theory. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.
Survey of adaptive image coding techniques
NASA Technical Reports Server (NTRS)
Habibi, A.
1977-01-01
The general problem of image data compression is discussed briefly with attention given to the use of Karhunen-Loeve transforms, suboptimal systems, and block quantization. A survey is then conducted encompassing the four categories of adaptive systems: (1) adaptive transform coding (adaptive sampling, adaptive quantization, etc.), (2) adaptive predictive coding (adaptive delta modulation, adaptive DPCM encoding, etc.), (3) adaptive cluster coding (blob algorithms and the multispectral cluster coding technique), and (4) adaptive entropy coding.
NASA Astrophysics Data System (ADS)
Brandelik, Andreas
2009-07-01
CALCMIN, an open source Visual Basic program, was implemented in EXCEL™. The program was primarily developed to support geoscientists in their routine task of calculating structural formulae of minerals on the basis of chemical analysis mainly obtained by electron microprobe (EMP) techniques. Calculation programs for various minerals are already included in the form of sub-routines. These routines are arranged in separate modules containing a minimum of code. The architecture of CALCMIN allows the user to easily develop new calculation routines or modify existing routines with little knowledge of programming techniques. By means of a simple mouse-click, the program automatically generates a rudimentary framework of code using the object model of the Visual Basic Editor (VBE). Within this framework simple commands and functions, which are provided by the program, can be used, for example, to perform various normalization procedures or to output the results of the computations. For the clarity of the code, element symbols are used as variables initialized by the program automatically. CALCMIN does not set any boundaries in complexity of the code used, resulting in a wide range of possible applications. Thus, matrix and optimization methods can be included, for instance, to determine end member contents for subsequent thermodynamic calculations. Diverse input procedures are provided, such as the automated read-in of output files created by the EMP. Furthermore, a subsequent filter routine enables the user to extract specific analyses in order to use them for a corresponding calculation routine. An event-driven, interactive operating mode was selected for easy application of the program. CALCMIN leads the user from the beginning to the end of the calculation process.
Noussa-Yao, Joseph; Heudes, Didier; Escudie, Jean-Baptiste; Degoulet, Patrice
2016-01-01
Short-stay MSO (Medicine, Surgery, Obstetrics) hospitalization activities in public and private hospitals providing public services are funded through charges for the services provided (T2A in French). Coding must be well matched to the severity of the patient's condition, to ensure that appropriate funding is provided to the hospital. We propose the use of an autocompletion process and multidimensional matrix, to help physicians to improve the expression of information and to optimize clinical coding. With this approach, physicians without knowledge of the encoding rules begin from a rough concept, which is gradually refined through semantic proximity and uses information on the associated codes stemming of optimized knowledge bases of diagnosis code.
Multidisciplinary High-Fidelity Analysis and Optimization of Aerospace Vehicles. Part 1; Formulation
NASA Technical Reports Server (NTRS)
Walsh, J. L.; Townsend, J. C.; Salas, A. O.; Samareh, J. A.; Mukhopadhyay, V.; Barthelemy, J.-F.
2000-01-01
An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity, finite element structural analysis and computational fluid dynamics aerodynamic analysis in a distributed, heterogeneous computing environment that includes high performance parallel computing. A software system has been designed and implemented to integrate a set of existing discipline analysis codes, some of them computationally intensive, into a distributed computational environment for the design of a highspeed civil transport configuration. The paper describes the engineering aspects of formulating the optimization by integrating these analysis codes and associated interface codes into the system. The discipline codes are integrated by using the Java programming language and a Common Object Request Broker Architecture (CORBA) compliant software product. A companion paper presents currently available results.
ERIC Educational Resources Information Center
Lee, Seong-Soo
1982-01-01
Tenth-grade students (n=144) received training on one of three processing methods: coding-mapping (simultaneous), coding only, or decision tree (sequential). The induced simultaneous processing strategy worked optimally under rule learning, while the sequential strategy was difficult to induce and/or not optimal for rule-learning operations.…
Memory-efficient decoding of LDPC codes
NASA Technical Reports Server (NTRS)
Kwok-San Lee, Jason; Thorpe, Jeremy; Hawkins, Jon
2005-01-01
We present a low-complexity quantization scheme for the implementation of regular (3,6) LDPC codes. The quantization parameters are optimized to maximize the mutual information between the source and the quantized messages. Using this non-uniform quantized belief propagation algorithm, we have simulated that an optimized 3-bit quantizer operates with 0.2dB implementation loss relative to a floating point decoder, and an optimized 4-bit quantizer operates less than 0.1dB quantization loss.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koskela, Tuomas S.; Lobet, Mathieu; Deslippe, Jack
In this session we show, in two case studies, how the roofline feature of Intel Advisor has been utilized to optimize the performance of kernels of the XGC1 and PICSAR codes in preparation for Intel Knights Landing architecture. The impact of the implemented optimizations and the benefits of using the automatic roofline feature of Intel Advisor to study performance of large applications will be presented. This demonstrates an effective optimization strategy that has enabled these science applications to achieve up to 4.6 times speed-up and prepare for future exascale architectures. # Goal/Relevance of Session The roofline model [1,2] is amore » powerful tool for analyzing the performance of applications with respect to the theoretical peak achievable on a given computer architecture. It allows one to graphically represent the performance of an application in terms of operational intensity, i.e. the ratio of flops performed and bytes moved from memory in order to guide optimization efforts. Given the scale and complexity of modern science applications, it can often be a tedious task for the user to perform the analysis on the level of functions or loops to identify where performance gains can be made. With new Intel tools, it is now possible to automate this task, as well as base the estimates of peak performance on measurements rather than vendor specifications. The goal of this session is to demonstrate how the roofline feature of Intel Advisor can be used to balance memory vs. computation related optimization efforts and effectively identify performance bottlenecks. A series of typical optimization techniques: cache blocking, structure refactoring, data alignment, and vectorization illustrated by the kernel cases will be addressed. # Description of the codes ## XGC1 The XGC1 code [3] is a magnetic fusion Particle-In-Cell code that uses an unstructured mesh for its Poisson solver that allows it to accurately resolve the edge plasma of a magnetic fusion device. After recent optimizations to its collision kernel [4], most of the computing time is spent in the electron push (pushe) kernel, where these optimization efforts have been focused. The kernel code scaled well with MPI+OpenMP but had almost no automatic compiler vectorization, in part due to indirect memory addresses and in part due to low trip counts of low-level loops that would be candidates for vectorization. Particle blocking and sorting have been implemented to increase trip counts of low-level loops and improve memory locality, and OpenMP directives have been added to vectorize compute-intensive loops that were identified by Advisor. The optimizations have improved the performance of the pushe kernel 2x on Haswell processors and 1.7x on KNL. The KNL node-for-node performance has been brought to within 30% of a NERSC Cori phase I Haswell node and we expect to bridge this gap by reducing the memory footprint of compute intensive routines to improve cache reuse. ## PICSAR is a Fortran/Python high-performance Particle-In-Cell library targeting at MIC architectures first designed to be coupled with the PIC code WARP for the simulation of laser-matter interaction and particle accelerators. PICSAR also contains a FORTRAN stand-alone kernel for performance studies and benchmarks. A MPI domain decomposition is used between NUMA domains and a tile decomposition (cache-blocking) handled by OpenMP has been added for shared-memory parallelism and better cache management. The so-called current deposition and field gathering steps that compose the PIC time loop constitute major hotspots that have been rewritten to enable more efficient vectorization. Particle communications between tiles and MPI domain has been merged and parallelized. All considered, these improvements provide speedups of 3.1 for order 1 and 4.6 for order 3 interpolation shape factors on KNL configured in SNC4 quadrant flat mode. Performance is similar between a node of cori phase 1 and KNL at order 1 and better on KNL by a factor 1.6 at order 3 with the considered test case (homogeneous thermal plasma).« less
Shaping electromagnetic waves using software-automatically-designed metasurfaces.
Zhang, Qian; Wan, Xiang; Liu, Shuo; Yuan Yin, Jia; Zhang, Lei; Jun Cui, Tie
2017-06-15
We present a fully digital procedure of designing reflective coding metasurfaces to shape reflected electromagnetic waves. The design procedure is completely automatic, controlled by a personal computer. In details, the macro coding units of metasurface are automatically divided into several types (e.g. two types for 1-bit coding, four types for 2-bit coding, etc.), and each type of the macro coding units is formed by discretely random arrangement of micro coding units. By combining an optimization algorithm and commercial electromagnetic software, the digital patterns of the macro coding units are optimized to possess constant phase difference for the reflected waves. The apertures of the designed reflective metasurfaces are formed by arranging the macro coding units with certain coding sequence. To experimentally verify the performance, a coding metasurface is fabricated by automatically designing two digital 1-bit unit cells, which are arranged in array to constitute a periodic coding metasurface to generate the required four-beam radiations with specific directions. Two complicated functional metasurfaces with circularly- and elliptically-shaped radiation beams are realized by automatically designing 4-bit macro coding units, showing excellent performance of the automatic designs by software. The proposed method provides a smart tool to realize various functional devices and systems automatically.
On the optimality of code options for a universal noiseless coder
NASA Technical Reports Server (NTRS)
Yeh, Pen-Shu; Rice, Robert F.; Miller, Warner
1991-01-01
A universal noiseless coding structure was developed that provides efficient performance over an extremely broad range of source entropy. This is accomplished by adaptively selecting the best of several easily implemented variable length coding algorithms. Custom VLSI coder and decoder modules capable of processing over 20 million samples per second are currently under development. The first of the code options used in this module development is shown to be equivalent to a class of Huffman code under the Humblet condition, other options are shown to be equivalent to the Huffman codes of a modified Laplacian symbol set, at specified symbol entropy values. Simulation results are obtained on actual aerial imagery, and they confirm the optimality of the scheme. On sources having Gaussian or Poisson distributions, coder performance is also projected through analysis and simulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, T; Lin, H; Xu, X
Purpose: To develop a nuclear medicine dosimetry module for the GPU-based Monte Carlo code ARCHER. Methods: We have developed a nuclear medicine dosimetry module for the fast Monte Carlo code ARCHER. The coupled electron-photon Monte Carlo transport kernel included in ARCHER is built upon the Dose Planning Method code (DPM). The developed module manages the radioactive decay simulation by consecutively tracking several types of radiation on a per disintegration basis using the statistical sampling method. Optimization techniques such as persistent threads and prefetching are studied and implemented. The developed module is verified against the VIDA code, which is based onmore » Geant4 toolkit and has previously been verified against OLINDA/EXM. A voxelized geometry is used in the preliminary test: a sphere made of ICRP soft tissue is surrounded by a box filled with water. Uniform activity distribution of I-131 is assumed in the sphere. Results: The self-absorption dose factors (mGy/MBqs) of the sphere with varying diameters are calculated by ARCHER and VIDA respectively. ARCHER’s result is in agreement with VIDA’s that are obtained from a previous publication. VIDA takes hours of CPU time to finish the computation, while it takes ARCHER 4.31 seconds for the 12.4-cm uniform activity sphere case. For a fairer CPU-GPU comparison, more effort will be made to eliminate the algorithmic differences. Conclusion: The coupled electron-photon Monte Carlo code ARCHER has been extended to radioactive decay simulation for nuclear medicine dosimetry. The developed code exhibits good performance in our preliminary test. The GPU-based Monte Carlo code is developed with grant support from the National Institute of Biomedical Imaging and Bioengineering through an R01 grant (R01EB015478)« less
Optimal lightpath placement on a metropolitan-area network linked with optical CDMA local nets
NASA Astrophysics Data System (ADS)
Wang, Yih-Fuh; Huang, Jen-Fa
2008-01-01
A flexible optical metropolitan-area network (OMAN) [J.F. Huang, Y.F. Wang, C.Y. Yeh, Optimal configuration of OCDMA-based MAN with multimedia services, in: 23rd Biennial Symposium on Communications, Queen's University, Kingston, Canada, May 29-June 2, 2006, pp. 144-148] structured with OCDMA linkage is proposed to support multimedia services with multi-rate or various qualities of service. To prioritize transmissions in OCDMA, the orthogonal variable spreading factor (OVSF) codes widely used in wireless CDMA are adopted. In addition, for feasible multiplexing, unipolar OCDMA modulation [L. Nguyen, B. Aazhang, J.F. Young, All-optical CDMA with bipolar codes, IEEE Electron. Lett. 31 (6) (1995) 469-470] is used to generate the code selector of multi-rate OMAN, and a flexible fiber-grating-based system is used for the equipment on OCDMA-OVSF code. These enable an OMAN to assign suitable OVSF codes when creating different-rate lightpaths. How to optimally configure a multi-rate OMAN is a challenge because of displaced lightpaths. In this paper, a genetically modified genetic algorithm (GMGA) [L.R. Chen, Flexible fiber Bragg grating encoder/decoder for hybrid wavelength-time optical CDMA, IEEE Photon. Technol. Lett. 13 (11) (2001) 1233-1235] is used to preplan lightpaths in order to optimally configure an OMAN. To evaluate the performance of the GMGA, we compared it with different preplanning optimization algorithms. Simulation results revealed that the GMGA very efficiently solved the problem.
Aeroelastic Tailoring Study of N+2 Low Boom Supersonic Commerical Transport Aircraft
NASA Technical Reports Server (NTRS)
Pak, Chan-Gi
2015-01-01
The Lockheed Martin N+2 Low - boom Supersonic Commercial Transport (LSCT) aircraft was optimized in this study through the use of a multidisciplinary design optimization tool developed at the National Aeronautics and S pace Administration Armstrong Flight Research Center. A total of 111 design variables we re used in the first optimization run. Total structural weight was the objective function in this optimization run. Design requirements for strength, buckling, and flutter we re selected as constraint functions during the first optimization run. The MSC Nastran code was used to obtain the modal, strength, and buckling characteristics. Flutter and trim analyses we re based on ZAERO code, and landing and ground control loads were computed using an in - house code. The w eight penalty to satisfy all the design requirement s during the first optimization run was 31,367 lb, a 9.4% increase from the baseline configuration. The second optimization run was prepared and based on the big-bang big-crunch algorithm. Six composite ply angles for the second and fourth composite layers were selected as discrete design variables for the second optimization run. Composite ply angle changes can't improve the weight configuration of the N+2 LSCT aircraft. However, this second optimization run can create more tolerance for the active and near active strength constraint values for future weight optimization runs.
Calvello, Simone; Piccardo, Matteo; Rao, Shashank Vittal; Soncini, Alessandro
2018-03-05
We have developed and implemented a new ab initio code, Ceres (Computational Emulator of Rare Earth Systems), completely written in C++11, which is dedicated to the efficient calculation of the electronic structure and magnetic properties of the crystal field states arising from the splitting of the ground state spin-orbit multiplet in lanthanide complexes. The new code gains efficiency via an optimized implementation of a direct configurational averaged Hartree-Fock (CAHF) algorithm for the determination of 4f quasi-atomic active orbitals common to all multi-electron spin manifolds contributing to the ground spin-orbit multiplet of the lanthanide ion. The new CAHF implementation is based on quasi-Newton convergence acceleration techniques coupled to an efficient library for the direct evaluation of molecular integrals, and problem-specific density matrix guess strategies. After describing the main features of the new code, we compare its efficiency with the current state-of-the-art ab initio strategy to determine crystal field levels and properties, and show that our methodology, as implemented in Ceres, represents a more time-efficient computational strategy for the evaluation of the magnetic properties of lanthanide complexes, also allowing a full representation of non-perturbative spin-orbit coupling effects. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Bird, Robert; Nystrom, David; Albright, Brian
2017-10-01
The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive, and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable-performance, and cache oblivious algorithms the problem still remains largely unsolved. In this work we demonstrate how a focus on platform agnostic modern code-development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic based vectorisation with compile generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU capable OpenMP variant of VPIC. Finally we include a lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
Coding for Efficient Image Transmission
NASA Technical Reports Server (NTRS)
Rice, R. F.; Lee, J. J.
1986-01-01
NASA publication second in series on data-coding techniques for noiseless channels. Techniques used even in noisy channels, provided data further processed with Reed-Solomon or other error-correcting code. Techniques discussed in context of transmission of monochrome imagery from Voyager II spacecraft but applicable to other streams of data. Objective of this type coding to "compress" data; that is, to transmit using as few bits as possible by omitting as much as possible of portion of information repeated in subsequent samples (or picture elements).
Optimizing the use of a sensor resource for opponent polarization coding
Heras, Francisco J.H.
2017-01-01
Flies use specialized photoreceptors R7 and R8 in the dorsal rim area (DRA) to detect skylight polarization. R7 and R8 form a tiered waveguide (central rhabdomere pair, CRP) with R7 on top, filtering light delivered to R8. We examine how the division of a given resource, CRP length, between R7 and R8 affects their ability to code polarization angle. We model optical absorption to show how the length fractions allotted to R7 and R8 determine the rates at which they transduce photons, and correct these rates for transduction unit saturation. The rates give polarization signal and photon noise in R7, and in R8. Their signals are combined in an opponent unit, intrinsic noise added, and the unit’s output analysed to extract two measures of coding ability, number of discriminable polarization angles and mutual information. A very long R7 maximizes opponent signal amplitude, but codes inefficiently due to photon noise in the very short R8. Discriminability and mutual information are optimized by maximizing signal to noise ratio, SNR. At lower light levels approximately equal lengths of R7 and R8 are optimal because photon noise dominates. At higher light levels intrinsic noise comes to dominate and a shorter R8 is optimum. The optimum R8 length fractions falls to one third. This intensity dependent range of optimal length fractions corresponds to the range observed in different fly species and is not affected by transduction unit saturation. We conclude that a limited resource, rhabdom length, can be divided between two polarization sensors, R7 and R8, to optimize opponent coding. We also find that coding ability increases sub-linearly with total rhabdom length, according to the law of diminishing returns. Consequently, the specialized shorter central rhabdom in the DRA codes polarization twice as efficiently with respect to rhabdom length than the longer rhabdom used in the rest of the eye. PMID:28316880
System, methods and apparatus for program optimization for multi-threaded processor architectures
Bastoul, Cedric; Lethin, Richard A; Leung, Allen K; Meister, Benoit J; Szilagyi, Peter; Vasilache, Nicolas T; Wohlford, David E
2015-01-06
Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deslippe, Jack; da Jornada, Felipe H.; Vigil-Fowler, Derek
2016-10-06
We profile and optimize calculations performed with the BerkeleyGW code on the Xeon-Phi architecture. BerkeleyGW depends both on hand-tuned critical kernels as well as on BLAS and FFT libraries. We describe the optimization process and performance improvements achieved. We discuss a layered parallelization strategy to take advantage of vector, thread and node-level parallelism. We discuss locality changes (including the consequence of the lack of L3 cache) and effective use of the on-package high-bandwidth memory. We show preliminary results on Knights-Landing including a roofline study of code performance before and after a number of optimizations. We find that the GW methodmore » is particularly well-suited for many-core architectures due to the ability to exploit a large amount of parallelism over plane-wave components, band-pairs, and frequencies.« less
Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas
2016-01-01
Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time consuming effort that only allows for hypothesis driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up for entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode labelled, plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up for a multitude of in vivo assessments from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090
Optimization of lightweight structure and supporting bipod flexure for a space mirror.
Chen, Yi-Cheng; Huang, Bo-Kai; You, Zhen-Ting; Chan, Chia-Yen; Huang, Ting-Ming
2016-12-20
This article presents an optimization process for integrated optomechanical design. The proposed optimization process for integrated optomechanical design comprises computer-aided drafting, finite element analysis (FEA), optomechanical transfer codes, and an optimization solver. The FEA was conducted to determine mirror surface deformation; then, deformed surface nodal data were transferred into Zernike polynomials through MATLAB optomechanical transfer codes to calculate the resulting optical path difference (OPD) and optical aberrations. To achieve an optimum design, the optimization iterations of the FEA, optomechanical transfer codes, and optimization solver were automatically connected through a self-developed Tcl script. Two examples of optimization design were illustrated in this research, namely, an optimum lightweight design of a Zerodur primary mirror with an outer diameter of 566 mm that is used in a spaceborne telescope and an optimum bipod flexure design that supports the optimum lightweight primary mirror. Finally, optimum designs were successfully accomplished in both examples, achieving a minimum peak-to-valley (PV) value for the OPD of the deformed optical surface. The simulated optimization results showed that (1) the lightweight ratio of the primary mirror increased from 56% to 66%; and (2) the PV value of the mirror supported by optimum bipod flexures in the horizontal position effectively decreased from 228 to 61 nm.
Wireless Visual Sensor Network Resource Allocation using Cross-Layer Optimization
2009-01-01
Rate Compatible Punctured Convolutional (RCPC) codes for channel...vol. 44, pp. 2943–2959, November 1998. [22] J. Hagenauer, “ Rate - compatible punctured convolutional codes (RCPC codes ) and their applications,” IEEE... coding rate for H.264/AVC video compression is determined. At the data link layer, the Rate - Compatible Puctured Convolutional (RCPC) channel coding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellano, T.; De Palma, L.; Laneve, D.
2015-07-01
A homemade computer code for designing a Side- Coupled Linear Accelerator (SCL) is written. It integrates a simplified model of SCL tanks with the Particle Swarm Optimization (PSO) algorithm. The computer code main aim is to obtain useful guidelines for the design of Linear Accelerator (LINAC) resonant cavities. The design procedure, assisted via the aforesaid approach seems very promising, allowing future improvements towards the optimization of actual accelerating geometries. (authors)
Plain Language to Communicate Physical Activity Information: A Website Content Analysis.
Paige, Samantha R; Black, David R; Mattson, Marifran; Coster, Daniel C; Stellefson, Michael
2018-04-01
Plain language techniques are health literacy universal precautions intended to enhance health care system navigation and health outcomes. Physical activity (PA) is a popular topic on the Internet, yet it is unknown if information is communicated in plain language. This study examined how plain language techniques are included in PA websites, and if the use of plain language techniques varies according to search procedures (keyword, search engine) and website host source (government, commercial, educational/organizational). Three keywords ("physical activity," "fitness," and "exercise") were independently entered into three search engines (Google, Bing, and Yahoo) to locate a nonprobability sample of websites ( N = 61). Fourteen plain language techniques were coded within each website to examine content formatting, clarity and conciseness, and multimedia use. Approximately half ( M = 6.59; SD = 1.68) of the plain language techniques were included in each website. Keyword physical activity resulted in websites with fewer clear and concise plain language techniques ( p < .05), whereas fitness resulted in websites with more clear and concise techniques ( p < .01). Plain language techniques did not vary by search engine or the website host source. Accessing PA information that is easy to understand and behaviorally oriented may remain a challenge for users. Transdisciplinary collaborations are needed to optimize plain language techniques while communicating online PA information.
FRANOPP: Framework for analysis and optimization problems user's guide
NASA Technical Reports Server (NTRS)
Riley, K. M.
1981-01-01
Framework for analysis and optimization problems (FRANOPP) is a software aid for the study and solution of design (optimization) problems which provides the driving program and plotting capability for a user generated programming system. In addition to FRANOPP, the programming system also contains the optimization code CONMIN, and two user supplied codes, one for analysis and one for output. With FRANOPP the user is provided with five options for studying a design problem. Three of the options utilize the plot capability and present an indepth study of the design problem. The study can be focused on a history of the optimization process or on the interaction of variables within the design problem.
Relay selection in energy harvesting cooperative networks with rateless codes
NASA Astrophysics Data System (ADS)
Zhu, Kaiyan; Wang, Fei
2018-04-01
This paper investigates the relay selection in energy harvesting cooperative networks, where the relays harvests energy from the radio frequency (RF) signals transmitted by a source, and the optimal relay is selected and uses the harvested energy to assist the information transmission from the source to its destination. Both source and the selected relay transmit information using rateless code, which allows the destination recover original information after collecting codes bits marginally surpass the entropy of original information. In order to improve transmission performance and efficiently utilize the harvested power, the optimal relay is selected. The optimization problem are formulated to maximize the achievable information rates of the system. Simulation results demonstrate that our proposed relay selection scheme outperform other strategies.
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Kwok, R.; Curlander, J. C.
1987-01-01
Five coding techniques in the spatial and transform domains have been evaluated for SAR image compression: linear three-point predictor (LTPP), block truncation coding (BTC), microadaptive picture sequencing (MAPS), adaptive discrete cosine transform (ADCT), and adaptive Hadamard transform (AHT). These techniques have been tested with Seasat data. Both LTPP and BTC spatial domain coding techniques provide very good performance at rates of 1-2 bits/pixel. The two transform techniques, ADCT and AHT, demonstrate the capability to compress the SAR imagery to less than 0.5 bits/pixel without visible artifacts. Tradeoffs such as the rate distortion performance, the computational complexity, the algorithm flexibility, and the controllability of compression ratios are also discussed.
NASA Technical Reports Server (NTRS)
Rasmussen, John
1990-01-01
Structural optimization has attracted the attention since the days of Galileo. Olhoff and Taylor have produced an excellent overview of the classical research within this field. However, the interest in structural optimization has increased greatly during the last decade due to the advent of reliable general numerical analysis methods and the computer power necessary to use them efficiently. This has created the possibility of developing general numerical systems for shape optimization. Several authors, eg., Esping; Braibant & Fleury; Bennet & Botkin; Botkin, Yang, and Bennet; and Stanton have published practical and successful applications of general optimization systems. Ding and Homlein have produced extensive overviews of available systems. Furthermore, a number of commercial optimization systems based on well-established finite element codes have been introduced. Systems like ANSYS, IDEAS, OASIS, and NISAOPT are widely known examples. In parallel to this development, the technology of computer aided design (CAD) has gained a large influence on the design process of mechanical engineering. The CAD technology has already lived through a rapid development driven by the drastically growing capabilities of digital computers. However, the systems of today are still considered as being only the first generation of a long row of computer integrated manufacturing (CIM) systems. These systems to come will offer an integrated environment for design, analysis, and fabrication of products of almost any character. Thus, the CAD system could be regarded as simply a database for geometrical information equipped with a number of tools with the purpose of helping the user in the design process. Among these tools are facilities for structural analysis and optimization as well as present standard CAD features like drawing, modeling, and visualization tools. The state of the art of structural optimization is that a large amount of mathematical and mechanical techniques are available for the solution of single problems. By implementing collections of the available techniques into general software systems, operational environments for structural optimization have been created. The forthcoming years must bring solutions to the problem of integrating such systems into more general design environments. The result of this work should be CAD systems for rational design in which structural optimization is one important design tool among many others.
Bulk Data Dissemination in Low Power Sensor Networks: Present and Future Directions
Xu, Zhirong; Hu, Tianlei; Song, Qianshu
2017-01-01
Wireless sensor network-based (WSN-based) applications need an efficient and reliable data dissemination service to facilitate maintenance, management and data distribution tasks. As WSNs nowadays are becoming pervasive and data intensive, bulk data dissemination protocols have been extensively studied recently. This paper provides a comprehensive survey of the state-of-the-art bulk data dissemination protocols. The large number of papers available in the literature propose various techniques to optimize the dissemination protocols. Different from the existing survey works which separately explores the building blocks of dissemination, our work categorizes the literature according to the optimization purposes: Reliability, Scalability and Transmission/Energy efficiency. By summarizing and reviewing the key insights and techniques, we further discuss on the future directions for each category. Our survey helps unveil three key findings for future direction: (1) The recent advances in wireless communications (e.g., study on cross-technology interference, error estimating codes, constructive interference, capture effect) can be potentially exploited to support further optimization on the reliability and energy efficiency of dissemination protocols; (2) Dissemination in multi-channel, multi-task and opportunistic networks requires more efforts to fully exploit the spatial-temporal network resources to enhance the data propagation; (3) Since many designs incur changes on MAC layer protocols, the co-existence of dissemination with other network protocols is another problem left to be addressed. PMID:28098830
Optimal simultaneous superpositioning of multiple structures with missing data
Theobald, Douglas L.; Steindel, Phillip A.
2012-01-01
Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually ‘missing’ from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Results: Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation–maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. Availability and implementation: The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. Contact: dtheobald@brandeis.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22543369
Fast decoding techniques for extended single-and-double-error-correcting Reed Solomon codes
NASA Technical Reports Server (NTRS)
Costello, D. J., Jr.; Deng, H.; Lin, S.
1984-01-01
A problem in designing semiconductor memories is to provide some measure of error control without requiring excessive coding overhead or decoding time. For example, some 256K-bit dynamic random access memories are organized as 32K x 8 bit-bytes. Byte-oriented codes such as Reed Solomon (RS) codes provide efficient low overhead error control for such memories. However, the standard iterative algorithm for decoding RS codes is too slow for these applications. Some special high speed decoding techniques for extended single and double error correcting RS codes. These techniques are designed to find the error locations and the error values directly from the syndrome without having to form the error locator polynomial and solve for its roots.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.
2015-10-01
Next-generation mesoscale numerical weather prediction system, the Weather Research and Forecasting (WRF) model, is a designed for dual use for forecasting and research. WRF offers multiple physics options that can be combined in any way. One of the physics options is radiance computation. The major source for energy for the earth's climate is solar radiation. Thus, it is imperative to accurately model horizontal and vertical distribution of the heating. Goddard solar radiative transfer model includes the absorption duo to water vapor,ozone, ozygen, carbon dioxide, clouds and aerosols. The model computes the interactions among the absorption and scattering by clouds, aerosols, molecules and surface. Finally, fluxes are integrated over the entire longwave spectrum.In this paper, we present our results of optimizing the Goddard longwave radiative transfer scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The optimizations improved the performance of the original Goddard longwave radiative transfer scheme on Xeon Phi 7120P by a factor of 2.2x. Furthermore, the same optimizations improved the performance of the Goddard longwave radiative transfer scheme on a dual socket configuration of eight core Intel Xeon E5-2670 CPUs by a factor of 2.1x compared to the original Goddard longwave radiative transfer scheme code.
MULTI-SCALE MODELING AND APPROXIMATION ASSISTED OPTIMIZATION OF BARE TUBE HEAT EXCHANGERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bacellar, Daniel; Ling, Jiazhen; Aute, Vikrant
2014-01-01
Air-to-refrigerant heat exchangers are very common in air-conditioning, heat pump and refrigeration applications. In these heat exchangers, there is a great benefit in terms of size, weight, refrigerant charge and heat transfer coefficient, by moving from conventional channel sizes (~ 9mm) to smaller channel sizes (< 5mm). This work investigates new designs for air-to-refrigerant heat exchangers with tube outer diameter ranging from 0.5 to 2.0mm. The goal of this research is to develop and optimize the design of these heat exchangers and compare their performance with existing state of the art designs. The air-side performance of various tube bundle configurationsmore » are analyzed using a Parallel Parameterized CFD (PPCFD) technique. PPCFD allows for fast-parametric CFD analyses of various geometries with topology change. Approximation techniques drastically reduce the number of CFD evaluations required during optimization. Maximum Entropy Design method is used for sampling and Kriging method is used for metamodeling. Metamodels are developed for the air-side heat transfer coefficients and pressure drop as a function of tube-bundle dimensions and air velocity. The metamodels are then integrated with an air-to-refrigerant heat exchanger design code. This integration allows a multi-scale analysis of air-side performance heat exchangers including air-to-refrigerant heat transfer and phase change. Overall optimization is carried out using a multi-objective genetic algorithm. The optimal designs found can exhibit 50 percent size reduction, 75 percent decrease in air side pressure drop and doubled air heat transfer coefficients compared to a high performance compact micro channel heat exchanger with same capacity and flow rates.« less
Maximum-likelihood soft-decision decoding of block codes using the A* algorithm
NASA Technical Reports Server (NTRS)
Ekroot, L.; Dolinar, S.
1994-01-01
The A* algorithm finds the path in a finite depth binary tree that optimizes a function. Here, it is applied to maximum-likelihood soft-decision decoding of block codes where the function optimized over the codewords is the likelihood function of the received sequence given each codeword. The algorithm considers codewords one bit at a time, making use of the most reliable received symbols first and pursuing only the partially expanded codewords that might be maximally likely. A version of the A* algorithm for maximum-likelihood decoding of block codes has been implemented for block codes up to 64 bits in length. The efficiency of this algorithm makes simulations of codes up to length 64 feasible. This article details the implementation currently in use, compares the decoding complexity with that of exhaustive search and Viterbi decoding algorithms, and presents performance curves obtained with this implementation of the A* algorithm for several codes.
NASA Technical Reports Server (NTRS)
Rajpal, Sandeep; Rhee, Do Jun; Lin, Shu
1997-01-01
The first part of this paper presents a simple and systematic technique for constructing multidimensional M-ary phase shift keying (MMK) trellis coded modulation (TCM) codes. The construction is based on a multilevel concatenation approach in which binary convolutional codes with good free branch distances are used as the outer codes and block MPSK modulation codes are used as the inner codes (or the signal spaces). Conditions on phase invariance of these codes are derived and a multistage decoding scheme for these codes is proposed. The proposed technique can be used to construct good codes for both the additive white Gaussian noise (AWGN) and fading channels as is shown in the second part of this paper.
DROP: Detecting Return-Oriented Programming Malicious Code
NASA Astrophysics Data System (ADS)
Chen, Ping; Xiao, Hai; Shen, Xiaobin; Yin, Xinchun; Mao, Bing; Xie, Li
Return-Oriented Programming (ROP) is a new technique that helps the attacker construct malicious code mounted on x86/SPARC executables without any function call at all. Such technique makes the ROP malicious code contain no instruction, which is different from existing attacks. Moreover, it hides the malicious code in benign code. Thus, it circumvents the approaches that prevent control flow diversion outside legitimate regions (such as W ⊕ X ) and most malicious code scanning techniques (such as anti-virus scanners). However, ROP has its own intrinsic feature which is different from normal program design: (1) uses short instruction sequence ending in "ret", which is called gadget, and (2) executes the gadgets contiguously in specific memory space, such as standard GNU libc. Based on the features of the ROP malicious code, in this paper, we present a tool DROP, which is focused on dynamically detecting ROP malicious code. Preliminary experimental results show that DROP can efficiently detect ROP malicious code, and have no false positives and negatives.
3D tomographic reconstruction using geometrical models
NASA Astrophysics Data System (ADS)
Battle, Xavier L.; Cunningham, Gregory S.; Hanson, Kenneth M.
1997-04-01
We address the issue of reconstructing an object of constant interior density in the context of 3D tomography where there is prior knowledge about the unknown shape. We explore the direct estimation of the parameters of a chosen geometrical model from a set of radiographic measurements, rather than performing operations (segmentation for example) on a reconstructed volume. The inverse problem is posed in the Bayesian framework. A triangulated surface describes the unknown shape and the reconstruction is computed with a maximum a posteriori (MAP) estimate. The adjoint differentiation technique computes the derivatives needed for the optimization of the model parameters. We demonstrate the usefulness of the approach and emphasize the techniques of designing forward and adjoint codes. We use the system response of the University of Arizona Fast SPECT imager to illustrate this method by reconstructing the shape of a heart phantom.
Data Sciences Summer Institute Topology Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Watts, Seth
DSSI_TOPOPT is a 2D topology optimization code that designs stiff structures made of a single linear elastic material and void space. The code generates a finite element mesh of a rectangular design domain on which the user specifies displacement and load boundary conditions. The code iteratively designs a structure that minimizes the compliance (maximizes the stiffness) of the structure under the given loading, subject to an upper bound on the amount of material used. Depending on user options, the code can evaluate the performance of a user-designed structure, or create a design from scratch. Output includes the finite element mesh,more » design, and visualizations of the design.« less
Cooperative optimization and their application in LDPC codes
NASA Astrophysics Data System (ADS)
Chen, Ke; Rong, Jian; Zhong, Xiaochun
2008-10-01
Cooperative optimization is a new way for finding global optima of complicated functions of many variables. The proposed algorithm is a class of message passing algorithms and has solid theory foundations. It can achieve good coding gains over the sum-product algorithm for LDPC codes. For (6561, 4096) LDPC codes, the proposed algorithm can achieve 2.0 dB gains over the sum-product algorithm at BER of 4×10-7. The decoding complexity of the proposed algorithm is lower than the sum-product algorithm can do; furthermore, the former can achieve much lower error floor than the latter can do after the Eb / No is higher than 1.8 dB.
MILC Code Performance on High End CPU and GPU Supercomputer Clusters
NASA Astrophysics Data System (ADS)
DeTar, Carleton; Gottlieb, Steven; Li, Ruizi; Toussaint, Doug
2018-03-01
With recent developments in parallel supercomputing architecture, many core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors starting with NVIDIA GPUs, and more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.
Fast H.264/AVC FRExt intra coding using belief propagation.
Milani, Simone
2011-01-01
In the H.264/AVC FRExt coder, the coding performance of Intra coding significantly overcomes the previous still image coding standards, like JPEG2000, thanks to a massive use of spatial prediction. Unfortunately, the adoption of an extensive set of predictors induces a significant increase of the computational complexity required by the rate-distortion optimization routine. The paper presents a complexity reduction strategy that aims at reducing the computational load of the Intra coding with a small loss in the compression performance. The proposed algorithm relies on selecting a reduced set of prediction modes according to their probabilities, which are estimated adopting a belief-propagation procedure. Experimental results show that the proposed method permits saving up to 60 % of the coding time required by an exhaustive rate-distortion optimization method with a negligible loss in performance. Moreover, it permits an accurate control of the computational complexity unlike other methods where the computational complexity depends upon the coded sequence.
Haylen, Bernard T; Lee, Joseph; Maher, Chris; Deprest, Jan; Freeman, Robert
2014-06-01
Results of interobserver reliability studies for the International Urogynecological Association-International Continence Society (IUGA-ICS) Complication Classification coding can be greatly influenced by study design factors such as participant instruction, motivation, and test-question clarity. We attempted to optimize these factors. After a 15-min instructional lecture with eight clinical case examples (including images) and with classification/coding charts available, those clinicians attending an IUGA Surgical Complications workshop were presented with eight similar-style test cases over 10 min and asked to code them using the Category, Time and Site classification. Answers were compared to predetermined correct codes obtained by five instigators of the IUGA-ICS prostheses and grafts complications classification. Prelecture and postquiz participant confidence levels using a five-step Likert scale were assessed. Complete sets of answers to the questions (24 codings) were provided by 34 respondents, only three of whom reported prior use of the charts. Average score [n (%)] out of eight, as well as median score (range) for each coding category were: (i) Category: 7.3 (91 %); 7 (4-8); (ii) Time: 7.8 (98 %); 7 (6-8); (iii) Site: 7.2 (90 %); 7 (5-8). Overall, the equivalent calculations (out of 24) were 22.3 (93 %) and 22 (18-24). Mean prelecture confidence was 1.37 (out of 5), rising to 3.85 postquiz. Urogynecologists had the highest correlation with correct coding, followed closely by fellows and general gynecologists. Optimizing training and study design can lead to excellent results for interobserver reliability of the IUGA-ICS Complication Classification coding, with increased participant confidence in complication-coding ability.
Reliable Adaptive Video Streaming Driven by Perceptual Semantics for Situational Awareness
Pimentel-Niño, M. A.; Saxena, Paresh; Vazquez-Castro, M. A.
2015-01-01
A novel cross-layer optimized video adaptation driven by perceptual semantics is presented. The design target is streamed live video to enhance situational awareness in challenging communications conditions. Conventional solutions for recreational applications are inadequate and novel quality of experience (QoE) framework is proposed which allows fully controlled adaptation and enables perceptual semantic feedback. The framework relies on temporal/spatial abstraction for video applications serving beyond recreational purposes. An underlying cross-layer optimization technique takes into account feedback on network congestion (time) and erasures (space) to best distribute available (scarce) bandwidth. Systematic random linear network coding (SRNC) adds reliability while preserving perceptual semantics. Objective metrics of the perceptual features in QoE show homogeneous high performance when using the proposed scheme. Finally, the proposed scheme is in line with content-aware trends, by complying with information-centric-networking philosophy and architecture. PMID:26247057
Parallel performance optimizations on unstructured mesh-based simulations
Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; ...
2015-06-01
This paper addresses two key parallelization challenges the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns, that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intranode data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches.more » We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.« less
SOL - SIZING AND OPTIMIZATION LANGUAGE COMPILER
NASA Technical Reports Server (NTRS)
Scotti, S. J.
1994-01-01
SOL is a computer language which is geared to solving design problems. SOL includes the mathematical modeling and logical capabilities of a computer language like FORTRAN but also includes the additional power of non-linear mathematical programming methods (i.e. numerical optimization) at the language level (as opposed to the subroutine level). The language-level use of optimization has several advantages over the traditional, subroutine-calling method of using an optimizer: first, the optimization problem is described in a concise and clear manner which closely parallels the mathematical description of optimization; second, a seamless interface is automatically established between the optimizer subroutines and the mathematical model of the system being optimized; third, the results of an optimization (objective, design variables, constraints, termination criteria, and some or all of the optimization history) are output in a form directly related to the optimization description; and finally, automatic error checking and recovery from an ill-defined system model or optimization description is facilitated by the language-level specification of the optimization problem. Thus, SOL enables rapid generation of models and solutions for optimum design problems with greater confidence that the problem is posed correctly. The SOL compiler takes SOL-language statements and generates the equivalent FORTRAN code and system calls. Because of this approach, the modeling capabilities of SOL are extended by the ability to incorporate existing FORTRAN code into a SOL program. In addition, SOL has a powerful MACRO capability. The MACRO capability of the SOL compiler effectively gives the user the ability to extend the SOL language and can be used to develop easy-to-use shorthand methods of generating complex models and solution strategies. The SOL compiler provides syntactic and semantic error-checking, error recovery, and detailed reports containing cross-references to show where each variable was used. The listings summarize all optimizations, listing the objective functions, design variables, and constraints. The compiler offers error-checking specific to optimization problems, so that simple mistakes will not cost hours of debugging time. The optimization engine used by and included with the SOL compiler is a version of Vanderplatt's ADS system (Version 1.1) modified specifically to work with the SOL compiler. SOL allows the use of the over 100 ADS optimization choices such as Sequential Quadratic Programming, Modified Feasible Directions, interior and exterior penalty function and variable metric methods. Default choices of the many control parameters of ADS are made for the user, however, the user can override any of the ADS control parameters desired for each individual optimization. The SOL language and compiler were developed with an advanced compiler-generation system to ensure correctness and simplify program maintenance. Thus, SOL's syntax was defined precisely by a LALR(1) grammar and the SOL compiler's parser was generated automatically from the LALR(1) grammar with a parser-generator. Hence unlike ad hoc, manually coded interfaces, the SOL compiler's lexical analysis insures that the SOL compiler recognizes all legal SOL programs, can recover from and correct for many errors and report the location of errors to the user. This version of the SOL compiler has been implemented on VAX/VMS computer systems and requires 204 KB of virtual memory to execute. Since the SOL compiler produces FORTRAN code, it requires the VAX FORTRAN compiler to produce an executable program. The SOL compiler consists of 13,000 lines of Pascal code. It was developed in 1986 and last updated in 1988. The ADS and other utility subroutines amount to 14,000 lines of FORTRAN code and were also updated in 1988.
ODECS -- A computer code for the optimal design of S.I. engine control strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arsie, I.; Pianese, C.; Rizzo, G.
1996-09-01
The computer code ODECS (Optimal Design of Engine Control Strategies) for the design of Spark Ignition engine control strategies is presented. This code has been developed starting from the author`s activity in this field, availing of some original contributions about engine stochastic optimization and dynamical models. This code has a modular structure and is composed of a user interface for the definition, the execution and the analysis of different computations performed with 4 independent modules. These modules allow the following calculations: (1) definition of the engine mathematical model from steady-state experimental data; (2) engine cycle test trajectory corresponding to amore » vehicle transient simulation test such as ECE15 or FTP drive test schedule; (3) evaluation of the optimal engine control maps with a steady-state approach; (4) engine dynamic cycle simulation and optimization of static control maps and/or dynamic compensation strategies, taking into account dynamical effects due to the unsteady fluxes of air and fuel and the influences of combustion chamber wall thermal inertia on fuel consumption and emissions. Moreover, in the last two modules it is possible to account for errors generated by a non-deterministic behavior of sensors and actuators and the related influences on global engine performances, and compute robust strategies, less sensitive to stochastic effects. In the paper the four models are described together with significant results corresponding to the simulation and the calculation of optimal control strategies for dynamic transient tests.« less
NASA Astrophysics Data System (ADS)
Doi, Masafumi; Tokutomi, Tsukasa; Hachiya, Shogo; Kobayashi, Atsuro; Tanakamaru, Shuhei; Ning, Sheyang; Ogura Iwasaki, Tomoko; Takeuchi, Ken
2016-08-01
NAND flash memory’s reliability degrades with increasing endurance, retention-time and/or temperature. After a comprehensive evaluation of 1X nm triple-level cell (TLC) NAND flash, two highly reliable techniques are proposed. The first proposal, quick low-density parity check (Quick-LDPC), requires only one cell read in order to accurately estimate a bit-error rate (BER) that includes the effects of temperature, write and erase (W/E) cycles and retention-time. As a result, 83% read latency reduction is achieved compared to conventional AEP-LDPC. Also, W/E cycling is extended by 100% compared with conventional Bose-Chaudhuri-Hocquenghem (BCH) error-correcting code (ECC). The second proposal, dynamic threshold voltage optimization (DVO) has two parts, adaptive V Ref shift (AVS) and V TH space control (VSC). AVS reduces read error and latency by adaptively optimizing the reference voltage (V Ref) based on temperature, W/E cycles and retention-time. AVS stores the optimal V Ref’s in a table in order to enable one cell read. VSC further improves AVS by optimizing the voltage margins between V TH states. DVO reduces BER by 80%.
Visual pattern image sequence coding
NASA Technical Reports Server (NTRS)
Silsbee, Peter; Bovik, Alan C.; Chen, Dapang
1990-01-01
The visual pattern image coding (VPIC) configurable digital image-coding process is capable of coding with visual fidelity comparable to the best available techniques, at compressions which (at 30-40:1) exceed all other technologies. These capabilities are associated with unprecedented coding efficiencies; coding and decoding operations are entirely linear with respect to image size and entail a complexity that is 1-2 orders of magnitude faster than any previous high-compression technique. The visual pattern image sequence coding to which attention is presently given exploits all the advantages of the static VPIC in the reduction of information from an additional, temporal dimension, to achieve unprecedented image sequence coding performance.
Decoder synchronization for deep space missions
NASA Technical Reports Server (NTRS)
Statman, J. I.; Cheung, K.-M.; Chauvin, T. H.; Rabkin, J.; Belongie, M. L.
1994-01-01
The Consultative Committee for Space Data Standards (CCSDS) recommends that space communication links employ a concatenated, error-correcting, channel-coding system in which the inner code is a convolutional (7,1/2) code and the outer code is a (255,223) Reed-Solomon code. The traditional implementation is to perform the node synchronization for the Viterbi decoder and the frame synchronization for the Reed-Solomon decoder as separate, sequential operations. This article discusses a unified synchronization technique that is required for deep space missions that have data rates and signal-to-noise ratios (SNR's) that are extremely low. This technique combines frame synchronization in the bit and symbol domains and traditional accumulated-metric growth techniques to establish a joint frame and node synchronization. A variation on this technique is used for the Galileo spacecraft on its Jupiter-bound mission.
Optimizing zonal advection of the Advanced Research WRF (ARW) dynamics for Intel MIC
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.
2014-10-01
The Weather Research and Forecast (WRF) model is the most widely used community weather forecast and research model in the world. There are two distinct varieties of WRF. The Advanced Research WRF (ARW) is an experimental, advanced research version featuring very high resolution. The WRF Nonhydrostatic Mesoscale Model (WRF-NMM) has been designed for forecasting operations. WRF consists of dynamics code and several physics modules. The WRF-ARW core is based on an Eulerian solver for the fully compressible nonhydrostatic equations. In the paper, we will use Intel Intel Many Integrated Core (MIC) architecture to substantially increase the performance of a zonal advection subroutine for optimization. It is of the most time consuming routines in the ARW dynamics core. Advection advances the explicit perturbation horizontal momentum equations by adding in the large-timestep tendency along with the small timestep pressure gradient tendency. We will describe the challenges we met during the development of a high-speed dynamics code subroutine for MIC architecture. Furthermore, lessons learned from the code optimization process will be discussed. The results show that the optimizations improved performance of the original code on Xeon Phi 5110P by a factor of 2.4x.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.-L.
2015-05-01
The most widely used community weather forecast and research model in the world is the Weather Research and Forecast (WRF) model. Two distinct varieties of WRF exist. The one we are interested is the Advanced Research WRF (ARW) is an experimental, advanced research version featuring very high resolution. The WRF Nonhydrostatic Mesoscale Model (WRF-NMM) has been designed for forecasting operations. WRF consists of dynamics code and several physics modules. The WRF-ARW core is based on an Eulerian solver for the fully compressible nonhydrostatic equations. In the paper, we optimize a meridional (north-south direction) advection subroutine for Intel Xeon Phi coprocessor. Advection is of the most time consuming routines in the ARW dynamics core. It advances the explicit perturbation horizontal momentum equations by adding in the large-timestep tendency along with the small timestep pressure gradient tendency. We will describe the challenges we met during the development of a high-speed dynamics code subroutine for MIC architecture. Furthermore, lessons learned from the code optimization process will be discussed. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 1.2x.
Improved Speech Coding Based on Open-Loop Parameter Estimation
NASA Technical Reports Server (NTRS)
Juang, Jer-Nan; Chen, Ya-Chin; Longman, Richard W.
2000-01-01
A nonlinear optimization algorithm for linear predictive speech coding was developed early that not only optimizes the linear model coefficients for the open loop predictor, but does the optimization including the effects of quantization of the transmitted residual. It also simultaneously optimizes the quantization levels used for each speech segment. In this paper, we present an improved method for initialization of this nonlinear algorithm, and demonstrate substantial improvements in performance. In addition, the new procedure produces monotonically improving speech quality with increasing numbers of bits used in the transmitted error residual. Examples of speech encoding and decoding are given for 8 speech segments and signal to noise levels as high as 47 dB are produced. As in typical linear predictive coding, the optimization is done on the open loop speech analysis model. Here we demonstrate that minimizing the error of the closed loop speech reconstruction, instead of the simpler open loop optimization, is likely to produce negligible improvement in speech quality. The examples suggest that the algorithm here is close to giving the best performance obtainable from a linear model, for the chosen order with the chosen number of bits for the codebook.
Circular codes revisited: a statistical approach.
Gonzalez, D L; Giannerini, S; Rosa, R
2011-04-21
In 1996 Arquès and Michel [1996. A complementary circular code in the protein coding genes. J. Theor. Biol. 182, 45-58] discovered the existence of a common circular code in eukaryote and prokaryote genomes. Since then, circular code theory has provoked great interest and underwent a rapid development. In this paper we discuss some theoretical issues related to the synchronization properties of coding sequences and circular codes with particular emphasis on the problem of retrieval and maintenance of the reading frame. Motivated by the theoretical discussion, we adopt a rigorous statistical approach in order to try to answer different questions. First, we investigate the covering capability of the whole class of 216 self-complementary, C(3) maximal codes with respect to a large set of coding sequences. The results indicate that, on average, the code proposed by Arquès and Michel has the best covering capability but, still, there exists a great variability among sequences. Second, we focus on such code and explore the role played by the proportion of the bases by means of a hierarchy of permutation tests. The results show the existence of a sort of optimization mechanism such that coding sequences are tailored as to maximize or minimize the coverage of circular codes on specific reading frames. Such optimization clearly relates the function of circular codes with reading frame synchronization. Copyright © 2011 Elsevier Ltd. All rights reserved.
Continuous welding of unidirectional fiber reinforced thermoplastic tape material
NASA Astrophysics Data System (ADS)
Schledjewski, Ralf
2017-10-01
Continuous welding techniques like thermoplastic tape placement with in situ consolidation offer several advantages over traditional manufacturing processes like autoclave consolidation, thermoforming, etc. However, still there is a need to solve several important processing issues before it becomes a viable economic process. Intensive process analysis and optimization has been carried out in the past through experimental investigation, model definition and simulation development. Today process simulation is capable to predict resulting consolidation quality. Effects of material imperfections or process parameter variations are well known. But using this knowledge to control the process based on online process monitoring and according adaption of the process parameters is still challenging. Solving inverse problems and using methods for automated code generation allowing fast implementation of algorithms on targets are required. The paper explains the placement technique in general. Process-material-property-relationships and typical material imperfections are described. Furthermore, online monitoring techniques and how to use them for a model based process control system are presented.
3D Modeling of Ultrasonic Wave Interaction with Disbonds and Weak Bonds
NASA Technical Reports Server (NTRS)
Leckey, C.; Hinders, M.
2011-01-01
Ultrasonic techniques, such as the use of guided waves, can be ideal for finding damage in the plate and pipe-like structures used in aerospace applications. However, the interaction of waves with real flaw types and geometries can lead to experimental signals that are difficult to interpret. 3-dimensional (3D) elastic wave simulations can be a powerful tool in understanding the complicated wave scattering involved in flaw detection and for optimizing experimental techniques. We have developed and implemented parallel 3D elastodynamic finite integration technique (3D EFIT) code to investigate Lamb wave scattering from realistic flaws. This paper discusses simulation results for an aluminum-aluminum diffusion disbond and an aluminum-epoxy disbond and compares results from the disbond case to the common artificial flaw type of a flat-bottom hole. The paper also discusses the potential for extending the 3D EFIT equations to incorporate physics-based weak bond models for simulating wave scattering from weak adhesive bonds.
NASA Astrophysics Data System (ADS)
Ciaramello, Francis M.; Hemami, Sheila S.
2007-02-01
For members of the Deaf Community in the United States, current communication tools include TTY/TTD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted and pooled across all frames to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
On entanglement-assisted quantum codes achieving the entanglement-assisted Griesmer bound
NASA Astrophysics Data System (ADS)
Li, Ruihu; Li, Xueliang; Guo, Luobin
2015-12-01
The theory of entanglement-assisted quantum error-correcting codes (EAQECCs) is a generalization of the standard stabilizer formalism. Any quaternary (or binary) linear code can be used to construct EAQECCs under the entanglement-assisted (EA) formalism. We derive an EA-Griesmer bound for linear EAQECCs, which is a quantum analog of the Griesmer bound for classical codes. This EA-Griesmer bound is tighter than known bounds for EAQECCs in the literature. For a given quaternary linear code {C}, we show that the parameters of the EAQECC that EA-stabilized by the dual of {C} can be determined by a zero radical quaternary code induced from {C}, and a necessary condition under which a linear EAQECC may achieve the EA-Griesmer bound is also presented. We construct four families of optimal EAQECCs and then show the necessary condition for existence of EAQECCs is also sufficient for some low-dimensional linear EAQECCs. The four families of optimal EAQECCs are degenerate codes and go beyond earlier constructions. What is more, except four codes, our [[n,k,d_{ea};c
Advanced GF(32) nonbinary LDPC coded modulation with non-uniform 9-QAM outperforming star 8-QAM.
Liu, Tao; Lin, Changyu; Djordjevic, Ivan B
2016-06-27
In this paper, we first describe a 9-symbol non-uniform signaling scheme based on Huffman code, in which different symbols are transmitted with different probabilities. By using the Huffman procedure, prefix code is designed to approach the optimal performance. Then, we introduce an algorithm to determine the optimal signal constellation sets for our proposed non-uniform scheme with the criterion of maximizing constellation figure of merit (CFM). The proposed nonuniform polarization multiplexed signaling 9-QAM scheme has the same spectral efficiency as the conventional 8-QAM. Additionally, we propose a specially designed GF(32) nonbinary quasi-cyclic LDPC code for the coded modulation system based on the 9-QAM non-uniform scheme. Further, we study the efficiency of our proposed non-uniform 9-QAM, combined with nonbinary LDPC coding, and demonstrate by Monte Carlo simulation that the proposed GF(23) nonbinary LDPC coded 9-QAM scheme outperforms nonbinary LDPC coded uniform 8-QAM by at least 0.8dB.
Some Practical Universal Noiseless Coding Techniques
NASA Technical Reports Server (NTRS)
Rice, Robert F.
1994-01-01
Report discusses noiseless data-compression-coding algorithms, performance characteristics and practical consideration in implementation of algorithms in coding modules composed of very-large-scale integrated circuits. Report also has value as tutorial document on data-compression-coding concepts. Coding techniques and concepts in question "universal" in sense that, in principle, applicable to streams of data from variety of sources. However, discussion oriented toward compression of high-rate data generated by spaceborne sensors for lower-rate transmission back to earth.
Network Coding in Relay-based Device-to-Device Communications
Huang, Jun; Gharavi, Hamid; Yan, Huifang; Xing, Cong-cong
2018-01-01
Device-to-Device (D2D) communications has been realized as an effective means to improve network throughput, reduce transmission latency, and extend cellular coverage in 5G systems. Network coding is a well-established technique known for its capability to reduce the number of retransmissions. In this article, we review state-of-the-art network coding in relay-based D2D communications, in terms of application scenarios and network coding techniques. We then apply two representative network coding techniques to dual-hop D2D communications and present an efficient relay node selecting mechanism as a case study. We also outline potential future research directions, according to the current research challenges. Our intention is to provide researchers and practitioners with a comprehensive overview of the current research status in this area and hope that this article may motivate more researchers to participate in developing network coding techniques for different relay-based D2D communications scenarios. PMID:29503504
Haji-Saeed, B; Sengupta, S K; Testorf, M; Goodhue, W; Khoury, J; Woods, C L; Kierstead, J
2006-05-10
We propose and demonstrate a new photorefractive real-time holographic deconvolution technique for adaptive one-way image transmission through aberrating media by means of four-wave mixing. In contrast with earlier methods, which typically required various codings of the exact phase or two-way image transmission for correcting phase distortion, our technique relies on one-way image transmission through the use of exact phase information. Our technique can simultaneously correct both amplitude and phase distortions. We include several forms of image degradation, various test cases, and experimental results. We characterize the performance as a function of the input beam ratios for four metrics: signal-to-noise ratio, normalized root-mean-square error, edge restoration, and peak-to-total energy ratio. In our characterization we use false-color graphic images to display the best beam-intensity ratio two-dimensional region(s) for each of these metrics. Test cases are simulated at the optimal values of the beam-intensity ratios. We demonstrate our results through both experiment and computer simulation.
Parallelisation study of a three-dimensional environmental flow model
NASA Astrophysics Data System (ADS)
O'Donncha, Fearghal; Ragnoli, Emanuele; Suits, Frank
2014-03-01
There are many simulation codes in the geosciences that are serial and cannot take advantage of the parallel computational resources commonly available today. One model important for our work in coastal ocean current modelling is EFDC, a Fortran 77 code configured for optimal deployment on vector computers. In order to take advantage of our cache-based, blade computing system we restructured EFDC from serial to parallel, thereby allowing us to run existing models more quickly, and to simulate larger and more detailed models that were previously impractical. Since the source code for EFDC is extensive and involves detailed computation, it is important to do such a port in a manner that limits changes to the files, while achieving the desired speedup. We describe a parallelisation strategy involving surgical changes to the source files to minimise error-prone alteration of the underlying computations, while allowing load-balanced domain decomposition for efficient execution on a commodity cluster. The use of conjugate gradient posed particular challenges due to implicit non-local communication posing a hindrance to standard domain partitioning schemes; a number of techniques are discussed to address this in a feasible, computationally efficient manner. The parallel implementation demonstrates good scalability in combination with a novel domain partitioning scheme that specifically handles mixed water/land regions commonly found in coastal simulations. The approach presented here represents a practical methodology to rejuvenate legacy code on a commodity blade cluster with reasonable effort; our solution has direct application to other similar codes in the geosciences.
Spatial coding-based approach for partitioning big spatial data in Hadoop
NASA Astrophysics Data System (ADS)
Yao, Xiaochuang; Mokbel, Mohamed F.; Alarabi, Louai; Eldawy, Ahmed; Yang, Jianyu; Yun, Wenju; Li, Lin; Ye, Sijing; Zhu, Dehai
2017-09-01
Spatial data partitioning (SDP) plays a powerful role in distributed storage and parallel computing for spatial data. However, due to skew distribution of spatial data and varying volume of spatial vector objects, it leads to a significant challenge to ensure both optimal performance of spatial operation and data balance in the cluster. To tackle this problem, we proposed a spatial coding-based approach for partitioning big spatial data in Hadoop. This approach, firstly, compressed the whole big spatial data based on spatial coding matrix to create a sensing information set (SIS), including spatial code, size, count and other information. SIS was then employed to build spatial partitioning matrix, which was used to spilt all spatial objects into different partitions in the cluster finally. Based on our approach, the neighbouring spatial objects can be partitioned into the same block. At the same time, it also can minimize the data skew in Hadoop distributed file system (HDFS). The presented approach with a case study in this paper is compared against random sampling based partitioning, with three measurement standards, namely, the spatial index quality, data skew in HDFS, and range query performance. The experimental results show that our method based on spatial coding technique can improve the query performance of big spatial data, as well as the data balance in HDFS. We implemented and deployed this approach in Hadoop, and it is also able to support efficiently any other distributed big spatial data systems.
NASA Astrophysics Data System (ADS)
Weiss, Armin; Geisler, Reinhard; Schwermer, Till; Yorita, Daisuke; Henne, Ulrich; Klein, Christian; Raffel, Markus
2017-09-01
A pressure-sensitive paint (PSP) system is presented to measure global surface pressures on fast rotating blades. It is dedicated to solve the problem of blurred image data employing the single-shot lifetime method. The efficient blur reduction capability of an optimized double-shutter imaging technique is demonstrated omitting error-prone post-processing or laborious de-rotation setups. The system is applied on Mach-scaled DSA-9A helicopter blades in climb at various collective pitch settings and blade tip Mach and chord Reynolds numbers (M_{ {tip}} = 0.29-0.57; Re_{ {tip}} = 4.63-9.26 × 10^5). Temperature effects in the PSP are corrected by a theoretical approximation validated against measured temperatures using temperature-sensitive paint (TSP) on a separate blade. Ensemble-averaged PSP results are comparable to pressure-tap data on the same blade to within 250 Pa. Resulting pressure maps on the blade suction side reveal spatially high resolved flow features such as the leading edge suction peak, footprints of blade-tip vortices and evidence of laminar-turbulent boundary-layer (BL) transition. The findings are validated by a separately conducted BL transition measurement by means of TSP and numerical simulations using a 2D coupled Euler/boundary-layer code. Moreover, the principal ability of the single-shot technique to capture unsteady flow phenomena is stressed revealing three-dimensional pressure fluctuations at stall.
Language Recognition via Sparse Coding
2016-09-08
a posteriori (MAP) adaptation scheme that further optimizes the discriminative quality of sparse-coded speech fea - tures. We empirically validate the...significantly improve the discriminative quality of sparse-coded speech fea - tures. In Section 4, we evaluate the proposed approaches against an i-vector
Reliability enhancement of Navier-Stokes codes through convergence enhancement
NASA Technical Reports Server (NTRS)
Choi, K.-Y.; Dulikravich, G. S.
1993-01-01
Reduction of total computing time required by an iterative algorithm for solving Navier-Stokes equations is an important aspect of making the existing and future analysis codes more cost effective. Several attempts have been made to accelerate the convergence of an explicit Runge-Kutta time-stepping algorithm. These acceleration methods are based on local time stepping, implicit residual smoothing, enthalpy damping, and multigrid techniques. Also, an extrapolation procedure based on the power method and the Minimal Residual Method (MRM) were applied to the Jameson's multigrid algorithm. The MRM uses same values of optimal weights for the corrections to every equation in a system and has not been shown to accelerate the scheme without multigriding. Our Distributed Minimal Residual (DMR) method based on our General Nonlinear Minimal Residual (GNLMR) method allows each component of the solution vector in a system of equations to have its own convergence speed. The DMR method was found capable of reducing the computation time by 10-75 percent depending on the test case and grid used. Recently, we have developed and tested a new method termed Sensitivity Based DMR or SBMR method that is easier to implement in different codes and is even more robust and computationally efficient than our DMR method.
Reliability enhancement of Navier-Stokes codes through convergence enhancement
NASA Astrophysics Data System (ADS)
Choi, K.-Y.; Dulikravich, G. S.
1993-11-01
Reduction of total computing time required by an iterative algorithm for solving Navier-Stokes equations is an important aspect of making the existing and future analysis codes more cost effective. Several attempts have been made to accelerate the convergence of an explicit Runge-Kutta time-stepping algorithm. These acceleration methods are based on local time stepping, implicit residual smoothing, enthalpy damping, and multigrid techniques. Also, an extrapolation procedure based on the power method and the Minimal Residual Method (MRM) were applied to the Jameson's multigrid algorithm. The MRM uses same values of optimal weights for the corrections to every equation in a system and has not been shown to accelerate the scheme without multigriding. Our Distributed Minimal Residual (DMR) method based on our General Nonlinear Minimal Residual (GNLMR) method allows each component of the solution vector in a system of equations to have its own convergence speed. The DMR method was found capable of reducing the computation time by 10-75 percent depending on the test case and grid used. Recently, we have developed and tested a new method termed Sensitivity Based DMR or SBMR method that is easier to implement in different codes and is even more robust and computationally efficient than our DMR method.
Guo, Qi; Tran, An V
2012-12-17
In this paper, we investigate the transmission impairments in a high-speed single-feeder wavelength-division-multiplexed passive optical network (WDM-PON) employing low-bandwidth upstream transmitter. A 1-GHz reflective semiconductor optical amplifier (RSOA) is operated at the rates of 10 Gb/s and 20 Gb/s in the proposed WDM-PON. Since the system performance is seriously limited by its uplink in both capacity and reach owing to inter-symbol interference and reflection noise, we present a novel technique with simultaneous capability of spectral efficiency enhancement and transmission distance extension in the uplink via coding and equalization that exploit the principles of partial-response (PR) signal. It is experimentally demonstrated that the proposed system supports the delivery of 10 Gb/s and 20 Gb/s upstream signals over 75-km and 25-km bidirectional fiber, respectively. The configuration of PR equalizer is optimized for its best performance-complexity trade-off. The reflection tolerance of 10 Gb/s and 20 Gb/s channels is improved by 8 dB and 6 dB, respectively, with PR coding. The proposed cost-effective signal processing scheme has great potential for the next-generation access networks.
Improved image decompression for reduced transform coding artifacts
NASA Technical Reports Server (NTRS)
Orourke, Thomas P.; Stevenson, Robert L.
1994-01-01
The perceived quality of images reconstructed from low bit rate compression is severely degraded by the appearance of transform coding artifacts. This paper proposes a method for producing higher quality reconstructed images based on a stochastic model for the image data. Quantization (scalar or vector) partitions the transform coefficient space and maps all points in a partition cell to a representative reconstruction point, usually taken as the centroid of the cell. The proposed image estimation technique selects the reconstruction point within the quantization partition cell which results in a reconstructed image which best fits a non-Gaussian Markov random field (MRF) image model. This approach results in a convex constrained optimization problem which can be solved iteratively. At each iteration, the gradient projection method is used to update the estimate based on the image model. In the transform domain, the resulting coefficient reconstruction points are projected to the particular quantization partition cells defined by the compressed image. Experimental results will be shown for images compressed using scalar quantization of block DCT and using vector quantization of subband wavelet transform. The proposed image decompression provides a reconstructed image with reduced visibility of transform coding artifacts and superior perceived quality.
Throughput Optimization Via Adaptive MIMO Communications
2006-05-30
End-to-end matlab packet simulation platform. * Low density parity check code (LDPCC). * Field trials with Silvus DSP MIMO testbed. * High mobility...incorporate advanced LDPC (low density parity check) codes . Realizing that the power of LDPC codes come at the price of decoder complexity, we also...Channel Coding Binary Convolution Code or LDPC Packet Length 0 - 216-1, bytes Coding Rate 1/2, 2/3, 3/4, 5/6 MIMO Channel Training Length 0 - 4, symbols
Analysis of a hypersonic waverider research vehicle with a hydrocarbon scramjet engine
NASA Technical Reports Server (NTRS)
Molvik, Gregory A.; Bowles, Jeffrey V.; Huynh, Loc C.
1993-01-01
The results of a feasibility study of a hypersonic waverider research vehicle with a hydrocarbon scramjet engine are presented. The integrated waverider/scramjet geometry is first optimized with a vehicle synthesis code to produce a maximum product of the lift-to-drag ratio and the cycle specific impulse, hence cruise range. Computational fluid dynamics (CFD) is then employed to provide a nose-to-tail analysis of the system at the on-design conditions. Some differences are noted between the results of the two analysis techniques. A comparison of experimental, engineering analysis and CFD results on a waverider forebody are also included for validation.
NASA Astrophysics Data System (ADS)
Li, Yan; Li, Lin; Huang, Yi-Fan; Du, Bao-Lin
2009-07-01
This paper analyses the dynamic residual aberrations of a conformal optical system and introduces adaptive optics (AO) correction technology to this system. The image sharpening AO system is chosen as the correction scheme. Communication between MATLAB and Code V is established via ActiveX technique in computer simulation. The SPGD algorithm is operated at seven zoom positions to calculate the optimized surface shape of the deformable mirror. After comparison of performance of the corrected system with the baseline system, AO technology is proved to be a good way of correcting the dynamic residual aberration in conformal optical design.
Filtered epithermal quasi-monoenergetic neutron beams at research reactor facilities.
Mansy, M S; Bashter, I I; El-Mesiry, M S; Habib, N; Adib, M
2015-03-01
Filtered neutron techniques were applied to produce quasi-monoenergetic neutron beams in the energy range of 1.5-133keV at research reactors. A simulation study was performed to characterize the filter components and transmitted beam lines. The filtered beams were characterized in terms of the optimal thickness of the main and additive components. The filtered neutron beams had high purity and intensity, with low contamination from the accompanying thermal emission, fast neutrons and γ-rays. A computer code named "QMNB" was developed in the "MATLAB" programming language to perform the required calculations. Copyright © 2014 Elsevier Ltd. All rights reserved.
Optimal Divergence-Free Hatch Filter for GNSS Single-Frequency Measurement.
Park, Byungwoon; Lim, Cheolsoon; Yun, Youngsun; Kim, Euiho; Kee, Changdon
2017-02-24
The Hatch filter is a code-smoothing technique that uses the variation of the carrier phase. It can effectively reduce the noise of a pseudo-range with a very simple filter construction, but it occasionally causes an ionosphere-induced error for low-lying satellites. Herein, we propose an optimal single-frequency (SF) divergence-free Hatch filter that uses a satellite-based augmentation system (SBAS) message to reduce the ionospheric divergence and applies the optimal smoothing constant for its smoothing window width. According to the data-processing results, the overall performance of the proposed filter is comparable to that of the dual frequency (DF) divergence-free Hatch filter. Moreover, it can reduce the horizontal error of 57 cm to 37 cm and improve the vertical accuracy of the conventional Hatch filter by 25%. Considering that SF receivers dominate the global navigation satellite system (GNSS) market and that most of these receivers include the SBAS function, the filter suggested in this paper is of great value in that it can make the differential GPS (DGPS) performance of the low-cost SF receivers comparable to that of DF receivers.
Optimal Divergence-Free Hatch Filter for GNSS Single-Frequency Measurement
Park, Byungwoon; Lim, Cheolsoon; Yun, Youngsun; Kim, Euiho; Kee, Changdon
2017-01-01
The Hatch filter is a code-smoothing technique that uses the variation of the carrier phase. It can effectively reduce the noise of a pseudo-range with a very simple filter construction, but it occasionally causes an ionosphere-induced error for low-lying satellites. Herein, we propose an optimal single-frequency (SF) divergence-free Hatch filter that uses a satellite-based augmentation system (SBAS) message to reduce the ionospheric divergence and applies the optimal smoothing constant for its smoothing window width. According to the data-processing results, the overall performance of the proposed filter is comparable to that of the dual frequency (DF) divergence-free Hatch filter. Moreover, it can reduce the horizontal error of 57 cm to 37 cm and improve the vertical accuracy of the conventional Hatch filter by 25%. Considering that SF receivers dominate the global navigation satellite system (GNSS) market and that most of these receivers include the SBAS function, the filter suggested in this paper is of great value in that it can make the differential GPS (DGPS) performance of the low-cost SF receivers comparable to that of DF receivers. PMID:28245584
Design Optimization and Residual Strength Assessment of a Cylindrical Composite Shell Structure
NASA Technical Reports Server (NTRS)
Rais-Rohani, Masoud
2000-01-01
A summary of research conducted during the specified period is presented. The research objectives included the investigation of an efficient technique for the design optimization and residual strength assessment of a semi-monocoque cylindrical shell structure made of composite materials. The response surface methodology is used in modeling the buckling response of individual skin panels under the combined axial compression and shear loading. These models are inserted into the MSC/NASTRAN code for design optimization of the cylindrical structure under a combined bending-torsion loading condition. The comparison between the monolithic and sandwich skin design cases indicated a 35% weight saving in using sandwich skin panels. In addition, the residual strength of the optimum design was obtained by identifying the most critical region of the structure and introducing a damage in the form of skin-stringer and skin-stringer-frame detachment. The comparison between the two skin design concepts indicated that the sandwich skin design is capable of retaining a higher residual strength than its monolithic counterpart. The results of this investigation are presented and discussed in this report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ortiz-Rodriguez, J. M.; Reyes Alfaro, A.; Reyes Haro, A.
In this work a neutron spectrum unfolding code, based on artificial intelligence technology is presented. The code called ''Neutron Spectrometry and Dosimetry with Artificial Neural Networks and two Bonner spheres'', (NSDann2BS), was designed in a graphical user interface under the LabVIEW programming environment. The main features of this code are to use an embedded artificial neural network architecture optimized with the ''Robust design of artificial neural networks methodology'' and to use two Bonner spheres as the only piece of information. In order to build the code here presented, once the net topology was optimized and properly trained, knowledge stored atmore » synaptic weights was extracted and using a graphical framework build on the LabVIEW programming environment, the NSDann2BS code was designed. This code is friendly, intuitive and easy to use for the end user. The code is freely available upon request to authors. To demonstrate the use of the neural net embedded in the NSDann2BS code, the rate counts of {sup 252}Cf, {sup 241}AmBe and {sup 239}PuBe neutron sources measured with a Bonner spheres system.« less
NASA Astrophysics Data System (ADS)
Ortiz-Rodríguez, J. M.; Reyes Alfaro, A.; Reyes Haro, A.; Solís Sánches, L. O.; Miranda, R. Castañeda; Cervantes Viramontes, J. M.; Vega-Carrillo, H. R.
2013-07-01
In this work a neutron spectrum unfolding code, based on artificial intelligence technology is presented. The code called "Neutron Spectrometry and Dosimetry with Artificial Neural Networks and two Bonner spheres", (NSDann2BS), was designed in a graphical user interface under the LabVIEW programming environment. The main features of this code are to use an embedded artificial neural network architecture optimized with the "Robust design of artificial neural networks methodology" and to use two Bonner spheres as the only piece of information. In order to build the code here presented, once the net topology was optimized and properly trained, knowledge stored at synaptic weights was extracted and using a graphical framework build on the LabVIEW programming environment, the NSDann2BS code was designed. This code is friendly, intuitive and easy to use for the end user. The code is freely available upon request to authors. To demonstrate the use of the neural net embedded in the NSDann2BS code, the rate counts of 252Cf, 241AmBe and 239PuBe neutron sources measured with a Bonner spheres system.
NASA Astrophysics Data System (ADS)
Neji, N.; Jridi, M.; Alfalou, A.; Masmoudi, N.
2016-02-01
The double random phase encryption (DRPE) method is a well-known all-optical architecture which has many advantages especially in terms of encryption efficiency. However, the method presents some vulnerabilities against attacks and requires a large quantity of information to encode the complex output plane. In this paper, we present an innovative hybrid technique to enhance the performance of DRPE method in terms of compression and encryption. An optimized simultaneous compression and encryption method is applied simultaneously on the real and imaginary components of the DRPE output plane. The compression and encryption technique consists in using an innovative randomized arithmetic coder (RAC) that can well compress the DRPE output planes and at the same time enhance the encryption. The RAC is obtained by an appropriate selection of some conditions in the binary arithmetic coding (BAC) process and by using a pseudo-random number to encrypt the corresponding outputs. The proposed technique has the capabilities to process video content and to be standard compliant with modern video coding standards such as H264 and HEVC. Simulations demonstrate that the proposed crypto-compression system has presented the drawbacks of the DRPE method. The cryptographic properties of DRPE have been enhanced while a compression rate of one-sixth can be achieved. FPGA implementation results show the high performance of the proposed method in terms of maximum operating frequency, hardware occupation, and dynamic power consumption.
Cai, Yao; Hu, Huasi; Pan, Ziheng; Hu, Guang; Zhang, Tao
2018-05-17
To optimize the shield for neutrons and gamma rays compact and lightweight, a method combining the structure and components together was established employing genetic algorithms and MCNP code. As a typical case, the fission energy spectrum of 235 U which mixed neutrons and gamma rays was adopted in this study. Six types of materials were presented and optimized by the method. Spherical geometry was adopted in the optimization after checking the geometry effect. Simulations have made to verify the reliability of the optimization method and the efficiency of the optimized materials. To compare the materials visually and conveniently, the volume and weight needed to build a shield are employed. The results showed that, the composite multilayer material has the best performance. Copyright © 2018 Elsevier Ltd. All rights reserved.
Methods of alleviation of ionospheric scintillation effects on digital communications
NASA Technical Reports Server (NTRS)
Massey, J. L.
1974-01-01
The degradation of the performance of digital communication systems because of ionospheric scintillation effects can be reduced either by diversity techniques or by coding. The effectiveness of traditional space-diversity, frequency-diversity and time-diversity techniques is reviewed and design considerations isolated. Time-diversity signaling is then treated as an extremely simple form of coding. More advanced coding methods, such as diffuse threshold decoding and burst-trapping decoding, which appear attractive in combatting scintillation effects are discussed and design considerations noted. Finally, adaptive coding techniques appropriate when the general state of the channel is known are discussed.
Parallel and Portable Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
Lee, S. R.; Cummings, J. C.; Nolen, S. D.; Keen, N. D.
1997-08-01
We have developed a multi-group, Monte Carlo neutron transport code in C++ using object-oriented methods and the Parallel Object-Oriented Methods and Applications (POOMA) class library. This transport code, called MC++, currently computes k and α eigenvalues of the neutron transport equation on a rectilinear computational mesh. It is portable to and runs in parallel on a wide variety of platforms, including MPPs, clustered SMPs, and individual workstations. It contains appropriate classes and abstractions for particle transport and, through the use of POOMA, for portable parallelism. Current capabilities are discussed, along with physics and performance results for several test problems on a variety of hardware, including all three Accelerated Strategic Computing Initiative (ASCI) platforms. Current parallel performance indicates the ability to compute α-eigenvalues in seconds or minutes rather than days or weeks. Current and future work on the implementation of a general transport physics framework (TPF) is also described. This TPF employs modern C++ programming techniques to provide simplified user interfaces, generic STL-style programming, and compile-time performance optimization. Physics capabilities of the TPF will be extended to include continuous energy treatments, implicit Monte Carlo algorithms, and a variety of convergence acceleration techniques such as importance combing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lilienthal, P.
1997-12-01
This paper describes three different computer codes which have been written to model village power applications. The reasons which have driven the development of these codes include: the existance of limited field data; diverse applications can be modeled; models allow cost and performance comparisons; simulations generate insights into cost structures. The models which are discussed are: Hybrid2, a public code which provides detailed engineering simulations to analyze the performance of a particular configuration; HOMER - the hybrid optimization model for electric renewables - which provides economic screening for sensitivity analyses; and VIPOR the village power model - which is amore » network optimization model for comparing mini-grids to individual systems. Examples of the output of these codes are presented for specific applications.« less
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
NASA Astrophysics Data System (ADS)
Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.
2017-02-01
We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Performance and structure of single-mode bosonic codes
NASA Astrophysics Data System (ADS)
Albert, Victor V.; Noh, Kyungjoo; Duivenvoorden, Kasper; Young, Dylan J.; Brierley, R. T.; Reinhold, Philip; Vuillot, Christophe; Li, Linshu; Shen, Chao; Girvin, S. M.; Terhal, Barbara M.; Jiang, Liang
2018-03-01
The early Gottesman, Kitaev, and Preskill (GKP) proposal for encoding a qubit in an oscillator has recently been followed by cat- and binomial-code proposals. Numerically optimized codes have also been proposed, and we introduce codes of this type here. These codes have yet to be compared using the same error model; we provide such a comparison by determining the entanglement fidelity of all codes with respect to the bosonic pure-loss channel (i.e., photon loss) after the optimal recovery operation. We then compare achievable communication rates of the combined encoding-error-recovery channel by calculating the channel's hashing bound for each code. Cat and binomial codes perform similarly, with binomial codes outperforming cat codes at small loss rates. Despite not being designed to protect against the pure-loss channel, GKP codes significantly outperform all other codes for most values of the loss rate. We show that the performance of GKP and some binomial codes increases monotonically with increasing average photon number of the codes. In order to corroborate our numerical evidence of the cat-binomial-GKP order of performance occurring at small loss rates, we analytically evaluate the quantum error-correction conditions of those codes. For GKP codes, we find an essential singularity in the entanglement fidelity in the limit of vanishing loss rate. In addition to comparing the codes, we draw parallels between binomial codes and discrete-variable systems. First, we characterize one- and two-mode binomial as well as multiqubit permutation-invariant codes in terms of spin-coherent states. Such a characterization allows us to introduce check operators and error-correction procedures for binomial codes. Second, we introduce a generalization of spin-coherent states, extending our characterization to qudit binomial codes and yielding a multiqudit code.