Implementing the Deutsch-Jozsa algorithm with macroscopic ensembles
NASA Astrophysics Data System (ADS)
Semenenko, Henry; Byrnes, Tim
2016-05-01
Quantum computing implementations under consideration today typically deal with systems with microscopic degrees of freedom such as photons, ions, cold atoms, and superconducting circuits. The quantum information is stored typically in low-dimensional Hilbert spaces such as qubits, as quantum effects are strongest in such systems. It has, however, been demonstrated that quantum effects can be observed in mesoscopic and macroscopic systems, such as nanomechanical systems and gas ensembles. While few-qubit quantum information demonstrations have been performed with such macroscopic systems, a quantum algorithm showing exponential speedup over classical algorithms is yet to be shown. Here, we show that the Deutsch-Jozsa algorithm can be implemented with macroscopic ensembles. The encoding that we use avoids the detrimental effects of decoherence that normally plagues macroscopic implementations. We discuss two mapping procedures which can be chosen depending upon the constraints of the oracle and the experiment. Both methods have an exponential speedup over the classical case, and only require control of the ensembles at the level of the total spin of the ensembles. It is shown that both approaches reproduce the qubit Deutsch-Jozsa algorithm, and are robust under decoherence.
Graphene-based room-temperature implementation of a modified Deutsch-Jozsa quantum algorithm.
Dragoman, Daniela; Dragoman, Mircea
2015-12-01
We present an implementation of a one-qubit and two-qubit modified Deutsch-Jozsa quantum algorithm based on graphene ballistic devices working at room temperature. The modified Deutsch-Jozsa algorithm decides whether a function, equivalent to the effect of an energy potential distribution on the wave function of ballistic charge carriers, is constant or not, without measuring the output wave function. The function need not be Boolean. Simulations confirm that the algorithm works properly, opening the way toward quantum computing at room temperature based on the same clean-room technologies as those used for fabrication of very-large-scale integrated circuits. PMID:26541203
Graphene-based room-temperature implementation of a modified Deutsch-Jozsa quantum algorithm
NASA Astrophysics Data System (ADS)
Dragoman, Daniela; Dragoman, Mircea
2015-12-01
We present an implementation of a one-qubit and two-qubit modified Deutsch-Jozsa quantum algorithm based on graphene ballistic devices working at room temperature. The modified Deutsch-Jozsa algorithm decides whether a function, equivalent to the effect of an energy potential distribution on the wave function of ballistic charge carriers, is constant or not, without measuring the output wave function. The function need not be Boolean. Simulations confirm that the algorithm works properly, opening the way toward quantum computing at room temperature based on the same clean-room technologies as those used for fabrication of very-large-scale integrated circuits.
Quantum computation with classical light: Implementation of the Deutsch-Jozsa algorithm
NASA Astrophysics Data System (ADS)
Perez-Garcia, Benjamin; McLaren, Melanie; Goyal, Sandeep K.; Hernandez-Aranda, Raul I.; Forbes, Andrew; Konrad, Thomas
2016-05-01
We propose an optical implementation of the Deutsch-Jozsa Algorithm using classical light in a binary decision-tree scheme. Our approach uses a ring cavity and linear optical devices in order to efficiently query the oracle functional values. In addition, we take advantage of the intrinsic Fourier transforming properties of a lens to read out whether the function given by the oracle is balanced or constant.
Using the Deutsch-Jozsa algorithm to partition arrays
NASA Astrophysics Data System (ADS)
Lipovaca, Samir
2010-03-01
Using the Deutsch-Jozsa algorithm, we will develop a method for solving a class of problems in which we need to determine parts of an array and then apply a specified function to each independent part. Since present quantum computers are not robust enough for code writing and execution, we will build a model of a vector quantum computer that implements the Deutsch-Jozsa algorithm from a machine language view using the APL2 programming language. The core of the method is an operator (DJBOX) which allows evaluation of an arbitrary function f by the Deutsch-Jozsa algorithm. Two key functions of the method are GET/PARTITION and CALC/WITH/PARTITIONS. The GET/PARTITION function determines parts of an array based on the function f. The CALC/WITH/PARTITIONS function determines parts of an array based on the function f and then applies another function to each independent part. We will imagine the method is implemented on the above vector quantum computer. We will show that the method can be successfully executed.
Vala, Jiri; Kosloff, Ronnie; Amitay, Zohar; Zhang Bo; Leone, Stephen R.
2002-12-01
The Deutsch-Jozsa algorithm is experimentally demonstrated for three-qubit functions using pure coherent superpositions of Li{sub 2} rovibrational eigenstates. The function's character, either constant or balanced, is evaluated by first imprinting the function, using a phase-shaped femtosecond pulse, on a coherent superposition of the molecular states, and then projecting the superposition onto an ionic final state, using a second femtosecond pulse at a specific time delay.
Experimental demonstration of the Deutsch-Jozsa algorithm in homonuclear multispin systems
Wu Zhen; Luo Jun; Feng Mang; Li Jun; Zheng Wenqiang; Peng Xinhua
2011-10-15
Despite early experimental tests of the Deutsch-Jozsa (DJ) algorithm, there have been only a very few nontrivial balanced functions tested for register number n>3. In this paper, we experimentally demonstrate the DJ algorithm in four- and five-qubit homonuclear spin systems by the nuclear-magnetic-resonance technique, by which we encode the one function evaluation into a long shaped pulse with the application of the gradient ascent algorithm. Our work, dramatically reducing the accumulated errors due to gate imperfections and relaxation, demonstrates a better implementation of the DJ algorithm.
Tame, M. S.; Kim, M. S.
2010-09-15
We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate general n-qubit versions and a scalable method to produce them. For this purpose, we propose a versatile photonic on-chip setup.
Tesch, Carmen M; de Vivie-Riedle, Regina
2004-12-22
The phase of quantum gates is one key issue for the implementation of quantum algorithms. In this paper we first investigate the phase evolution of global molecular quantum gates, which are realized by optimally shaped femtosecond laser pulses. The specific laser fields are calculated using the multitarget optimal control algorithm, our modification of the optimal control theory relevant for application in quantum computing. As qubit system we use vibrational modes of polyatomic molecules, here the two IR-active modes of acetylene. Exemplarily, we present our results for a Pi gate, which shows a strong dependence on the phase, leading to a significant decrease in quantum yield. To correct for this unwanted behavior we include pressure on the quantum phase in our multitarget approach. In addition the accuracy of these phase corrected global quantum gates is enhanced. Furthermore we could show that in our molecular approach phase corrected quantum gates and basis set independence are directly linked. Basis set independence is also another property highly required for the performance of quantum algorithms. By realizing the Deutsch-Jozsa algorithm in our two qubit molecular model system, we demonstrate the good performance of our phase corrected and basis set independent quantum gates.
Optical simulation of quantum algorithms using programmable liquid-crystal displays
Puentes, Graciana; La Mela, Cecilia; Ledesma, Silvia; Iemmi, Claudio; Paz, Juan Pablo; Saraceno, Marcos
2004-04-01
We present a scheme to perform an all optical simulation of quantum algorithms and maps. The main components are lenses to efficiently implement the Fourier transform and programmable liquid-crystal displays to introduce space dependent phase changes on a classical optical beam. We show how to simulate Deutsch-Jozsa and Grover's quantum algorithms using essentially the same optical array programmed in two different ways.
Experimental application of decoherence-free subspaces in an optical quantum-computing algorithm.
Mohseni, M; Lundeen, J S; Resch, K J; Steinberg, A M
2003-10-31
For a practical quantum computer to operate, it is essential to properly manage decoherence. One important technique for doing this is the use of "decoherence-free subspaces" (DFSs), which have recently been demonstrated. Here we present the first use of DFSs to improve the performance of a quantum algorithm. An optical implementation of the Deutsch-Jozsa algorithm can be made insensitive to a particular class of phase noise by encoding information in the appropriate subspaces; we observe a reduction of the error rate from 35% to 7%, essentially its value in the absence of noise.
Multipartite entanglement in quantum algorithms
Bruss, D.; Macchiavello, C.
2011-05-15
We investigate the entanglement features of the quantum states employed in quantum algorithms. In particular, we analyze the multipartite entanglement properties in the Deutsch-Jozsa, Grover, and Simon algorithms. Our results show that for these algorithms most instances involve multipartite entanglement.
Realization of Simple Quantum Algorithms with Circuit Quantum Electrodynamics
NASA Astrophysics Data System (ADS)
Dicarlo, Leonardo
2010-03-01
Superconducting circuits have made considerable progress in the requirements of quantum coherence, universal gate operations and qubit readout necessary to realize a quantum computer. However, simultaneously meeting these requirements makes the solid-state realization of few-qubit processors, as previously implemented in nuclear magnetic resonance, ion-trap and optical systems, an exciting challenge. We present the realization of a two-qubit superconducting processor based on circuit quantum electrodynamics (cQED), and report progress by the Yale cQED team towards a four-qubit upgrade. The architecture employs a microwave transmission-line cavity as a quantum bus coupling multiple transmon qubits. Unitary control is achieved by concatenation of high-fidelity single-qubit rotations induced via resonant microwave tones, and multi-qubit adiabatic phase gates realized by local flux control of qubit frequencies. Qubit readout uses the cavity as a quadratic detector, such that a single, calibrated measurement channel gives direct access to multi-qubit correlations. We present generation of Bell states; entanglement quantification by strong violation of Clauser-Horne-Shimony-Holt inequalities; and implementations of the Grover search and Deutsch-Jozsa algorithms. We report experimental progress in extending adiabatic phase gates and joint readout to four qubits, and improving qubit coherence on the road to realizing more complex quantum algorithms. Research done in collaboration with J. M. Chow, J. M. Gambetta, Lev S. Bishop, B. R. Johnson, D. I. Schuster, A. Nunnenkamp, J. Majer, A. Blais, L. Frunzio, M. H. Devoret, S. M. Girvin, and R. J. Schoelkopf.
Systolic algorithms and their implementation
Kung, H.T.
1984-01-01
Very high performance computer systems must rely heavily on parallelism since there are severe physical and technological limits on the ultimate speed of any single processor. The systolic array concept developed in the last several years allows effective use of a very large number of processors in parallel. This article illustrates the basic ideas by reviewing a systolic array design for matrix triangularization and describing its use in the on-the-fly updating of Cholesky decomposition of covariance matrices-a crucial computation in adaptive signal processing. Following this are discussions on issues related to the hardware implementation of systolic algorithms in general, and some guidelines for designing systolic algorithms that will be convenient for implementation. 33 references.
Linear Bregman algorithm implemented in parallel GPU
NASA Astrophysics Data System (ADS)
Li, Pengyan; Ke, Jue; Sui, Dong; Wei, Ping
2015-08-01
At present, most compressed sensing (CS) algorithms have poor converging speed, thus are difficult to run on PC. To deal with this issue, we use a parallel GPU, to implement a broadly used compressed sensing algorithm, the Linear Bregman algorithm. Linear iterative Bregman algorithm is a reconstruction algorithm proposed by Osher and Cai. Compared with other CS reconstruction algorithms, the linear Bregman algorithm only involves the vector and matrix multiplication and thresholding operation, and is simpler and more efficient for programming. We use C as a development language and adopt CUDA (Compute Unified Device Architecture) as parallel computing architectures. In this paper, we compared the parallel Bregman algorithm with traditional CPU realized Bregaman algorithm. In addition, we also compared the parallel Bregman algorithm with other CS reconstruction algorithms, such as OMP and TwIST algorithms. Compared with these two algorithms, the result of this paper shows that, the parallel Bregman algorithm needs shorter time, and thus is more convenient for real-time object reconstruction, which is important to people's fast growing demand to information technology.
Deterministic quantum computation with one photonic qubit
NASA Astrophysics Data System (ADS)
Hor-Meyll, M.; Tasca, D. S.; Walborn, S. P.; Ribeiro, P. H. Souto; Santos, M. M.; Duzzioni, E. I.
2015-07-01
We show that deterministic quantum computing with one qubit (DQC1) can be experimentally implemented with a spatial light modulator, using the polarization and the transverse spatial degrees of freedom of light. The scheme allows the computation of the trace of a high-dimension matrix, being limited by the resolution of the modulator panel and the technical imperfections. In order to illustrate the method, we compute the normalized trace of unitary matrices and implement the Deutsch-Jozsa algorithm. The largest matrix that can be manipulated with our setup is 1080 ×1920 , which is able to represent a system with approximately 21 qubits.
OpenAD : algorithm implementation user guide.
Utke, J.
2004-05-13
Research in automatic differentiation has led to a number of tools that implement various approaches and algorithms for the most important programming languages. While all these tools have the same mathematical underpinnings, the actual implementations have little in common and mostly are specialized for a particular programming language, compiler internal representation, or purpose. This specialization does not promote an open test bed for experimentation with new algorithms that arise from exploiting structural properties of numerical codes in a source transformation context. OpenAD is being designed to fill this need by providing a framework that allows for relative ease in the implementation of algorithms that operate on a representation of the numerical kernel of a program. Language independence is achieved by using an intermediate XML format and the abstraction of common compiler analyses in Open-Analysis. The intermediate format is mapped to concrete programming languages via two front/back end combinations. The design allows for reuse and combination of already implemented algorithms. We describe the set of algorithms and basic functionality currently implemented in OpenAD and explain the necessary steps to add a new algorithm to the framework.
Java implementation of Class Association Rule algorithms
2007-08-30
Java implementation of three Class Association Rule mining algorithms, NETCAR, CARapriori, and clustering based rule mining. NETCAR algorithm is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper: UCRL-JRNL-232466-DRAFT, and would be published in a peer review scientific journal. The software is used to extract combinations of genes relevant with a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profiles is represented by a binary matrix andmore » a phenotype profile is represented by a binary vector. The present application of this software will be in genome analysis, however, it could be applied more generally.« less
Java implementation of Class Association Rule algorithms
Tamura, Makio
2007-08-30
Java implementation of three Class Association Rule mining algorithms, NETCAR, CARapriori, and clustering based rule mining. NETCAR algorithm is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper: UCRL-JRNL-232466-DRAFT, and would be published in a peer review scientific journal. The software is used to extract combinations of genes relevant with a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profiles is represented by a binary matrix and a phenotype profile is represented by a binary vector. The present application of this software will be in genome analysis, however, it could be applied more generally.
Quantum optical implementation of Grover's algorithm
Scully, Marlan O.; Zubairy, M. S.
2001-01-01
We present a scheme for a quantum optical implementation of Grover's algorithm based on resonant atomic interactions with classical fields and dispersive couplings with quantized cavity fields. The proposed scheme depends on preparation of entangled states and is within current state-of-the-art technology. PMID:11504938
A Fast Implementation of the ISOCLUS Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2003-01-01
Unsupervised clustering is a fundamental building block in numerous image processing applications. One of the most popular and widely used clustering schemes for remote sensing applications is the ISOCLUS algorithm, which is based on the ISODATA method. The algorithm is given a set of n data points in d-dimensional space, an integer k indicating the initial number of clusters, and a number of additional parameters. The general goal is to compute the coordinates of a set of cluster centers in d-space, such that those centers minimize the mean squared distance from each data point to its nearest center. This clustering algorithm is similar to another well-known clustering method, called k-means. One significant feature of ISOCLUS over k-means is that the actual number of clusters reported might be fewer or more than the number supplied as part of the input. The algorithm uses different heuristics to determine whether to merge lor split clusters. As ISOCLUS can run very slowly, particularly on large data sets, there has been a growing .interest in the remote sensing community in computing it efficiently. We have developed a faster implementation of the ISOCLUS algorithm. Our improvement is based on a recent acceleration to the k-means algorithm of Kanungo, et al. They showed that, by using a kd-tree data structure for storing the data, it is possible to reduce the running time of k-means. We have adapted this method for the ISOCLUS algorithm, and we show that it is possible to achieve essentially the same results as ISOCLUS on large data sets, but with significantly lower running times. This adaptation involves computing a number of cluster statistics that are needed for ISOCLUS but not for k-means. Both the k-means and ISOCLUS algorithms are based on iterative schemes, in which nearest neighbors are calculated until some convergence criterion is satisfied. Each iteration requires that the nearest center for each data point be computed. Naively, this requires O
Categorizing Variations of Student-Implemented Sorting Algorithms
ERIC Educational Resources Information Center
Taherkhani, Ahmad; Korhonen, Ari; Malmi, Lauri
2012-01-01
In this study, we examined freshmen students' sorting algorithm implementations in data structures and algorithms' course in two phases: at the beginning of the course before the students received any instruction on sorting algorithms, and after taking a lecture on sorting algorithms. The analysis revealed that many students have insufficient…
Implementation of a Wavefront-Sensing Algorithm
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce; Aronstein, David
2013-01-01
A computer program has been written as a unique implementation of an image-based wavefront-sensing algorithm reported in "Iterative-Transform Phase Retrieval Using Adaptive Diversity" (GSC-14879-1), NASA Tech Briefs, Vol. 31, No. 4 (April 2007), page 32. This software was originally intended for application to the James Webb Space Telescope, but is also applicable to other segmented-mirror telescopes. The software is capable of determining optical-wavefront information using, as input, a variable number of irradiance measurements collected in defocus planes about the best focal position. The software also uses input of the geometrical definition of the telescope exit pupil (otherwise denoted the pupil mask) to identify the locations of the segments of the primary telescope mirror. From the irradiance data and mask information, the software calculates an estimate of the optical wavefront (a measure of performance) of the telescope generally and across each primary mirror segment specifically. The software is capable of generating irradiance data, wavefront estimates, and basis functions for the full telescope and for each primary-mirror segment. Optionally, each of these pieces of information can be measured or computed outside of the software and incorporated during execution of the software.
Parallel optimization algorithms and their implementation in VLSI design
NASA Technical Reports Server (NTRS)
Lee, G.; Feeley, J. J.
1991-01-01
Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.
Design and implementation of parallel multigrid algorithms
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Tuminaro, Ray S.
1988-01-01
Techniques for mapping multigrid algorithms to solve elliptic PDEs on hypercube parallel computers are described and demonstrated. The need for proper data mapping to minimize communication distances is stressed, and an execution-time model is developed to show how algorithm efficiency is affected by changes in the machine and algorithm parameters. Particular attention is then given to the case of coarse computational grids, which can lead to idle processors, load imbalances, and inefficient performance. It is shown that convergence can be improved by using idle processors to solve a new problem concurrently on the fine grid defined by a splitting.
Parallel implementation of an algorithm for Delaunay triangulation
NASA Technical Reports Server (NTRS)
Merriam, Marshal L.
1992-01-01
The theory and practice of implementing Tanemura's algorithm for 3D Delaunay triangulation on Intel's Gamma prototype, a 128 processor MIMD computer, is described. Efficient implementation of Tanemura's algorithm on a conventional, vector processing supercomputer is problematic. It does not vectorize to any significant degree and requires indirect addressing. Efficient implementation on a parallel architecture is possible, however. Speeds in excess of 20 times a single processor Cray Y-MP are realized on 128 processors of the Intel Gamma prototype.
Parallel implementation of an algorithm for Delaunay triangulation
NASA Technical Reports Server (NTRS)
Merriam, Marshall L.
1992-01-01
This work concerns the theory and practice of implementing Tanemura's algorithm for 3D Delaunay triangulation on Intel's Gamma prototype, a 128 processor MIMD computer. Tanemura's algorithm does not vectorize to any significant degree and requires indirect addressing. Efficient implementation on a conventional, vector processing, supercomputer is problematic. Efficient implementation on a parallel architecture is possible, however. In this work, speeds in excess of 8 times a single processor Cray Y-mp are realized on 128 processors of the Intel Gamma prototype.
New algorithms for phase unwrapping: implementation and testing
NASA Astrophysics Data System (ADS)
Kotlicki, Krzysztof
1998-11-01
In this paper it is shown how the regularization theory was used for the new noise immune algorithm for phase unwrapping. The algorithm were developed by M. Servin, J.L. Marroquin and F.J. Cuevas in Centro de Investigaciones en Optica A.C. and Centro de Investigacion en Matematicas A.C. in Mexico. The theory was presented. The objective of the work was to implement the algorithm into the software able to perform the off-line unwrapping on the fringe pattern. The algorithms are present as well as the result and also the software developed for the implementation.
The SAPHIRE server: a new algorithm and implementation.
Hersh, W.; Leone, T. J.
1995-01-01
SAPHIRE is an experimental information retrieval system implemented to test new approaches to automated indexing and retrieval of medical documents. Due to limitations in its original concept-matching algorithm, a modified algorithm has been implemented which allows greater flexibility in partial matching and different word order within concepts. With the concomitant growth in client-server applications and the Internet in general, the new algorithm has been implemented as a server that can be accessed via other applications on the Internet. PMID:8563413
An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm
NASA Technical Reports Server (NTRS)
Weir, John M.; Wells, B. Earl
2003-01-01
Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modem FPGA's. In this paper, a distributed genetic algorithm that can be applied to the agent based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.
Multiple Lookup Table-Based AES Encryption Algorithm Implementation
NASA Astrophysics Data System (ADS)
Gong, Jin; Liu, Wenyi; Zhang, Huixin
Anew AES (Advanced Encryption Standard) encryption algorithm implementation was proposed in this paper. It is based on five lookup tables, which are generated from S-box(the substitution table in AES). The obvious advantages are reducing the code-size, improving the implementation efficiency, and helping new learners to understand the AES encryption algorithm and GF(28) multiplication which are necessary to correctly implement AES[1]. This method can be applied on processors with word length 32 or above, FPGA and others. And correspondingly we can implement it by VHDL, Verilog, VB and other languages.
Algorithm for predictive control implementation on fiber optic transmission lines
NASA Astrophysics Data System (ADS)
Andreev, Vladimir A.; Burdin, Vladimir A.; Voronkov, Andrey A.
2014-04-01
This paper presents the algorithm for predictive control implementation on fiber-optic transmission lines. In order to improve the maintenance of fiber optic communication lines, the algorithm prediction uptime optic communication cables have been worked out. It considers the results of scheduled preventive maintenance and database of various works on the track cable line during maintenance.
A fast portable implementation of the Secure Hash Algorithm, III.
McCurley, Kevin S.
1992-10-01
In 1992, NIST announced a proposed standard for a collision-free hash function. The algorithm for producing the hash value is known as the Secure Hash Algorithm (SHA), and the standard using the algorithm in known as the Secure Hash Standard (SHS). Later, an announcement was made that a scientist at NSA had discovered a weakness in the original algorithm. A revision to this standard was then announced as FIPS 180-1, and includes a slight change to the algorithm that eliminates the weakness. This new algorithm is called SHA-1. In this report we describe a portable and efficient implementation of SHA-1 in the C language. Performance information is given, as well as tips for porting the code to other architectures. We conclude with some observations on the efficiency of the algorithm, and a discussion of how the efficiency of SHA might be improved.
Algorithm implementation on the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Krist, Steven E.; Zang, Thomas A.
1987-01-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Implementing a self-structuring data learning algorithm
NASA Astrophysics Data System (ADS)
Graham, James; Carson, Daniel; Ternovskiy, Igor
2016-05-01
In this paper, we elaborate on what we did to implement our self-structuring data learning algorithm. To recap, we are working to develop a data learning algorithm that will eventually be capable of goal driven pattern learning and extrapolation of more complex patterns from less complex ones. At this point we have developed a conceptual framework for the algorithm, but have yet to discuss our actual implementation and the consideration and shortcuts we needed to take to create said implementation. We will elaborate on our initial setup of the algorithm and the scenarios we used to test our early stage algorithm. While we want this to be a general algorithm, it is necessary to start with a simple scenario or two to provide a viable development and testing environment. To that end, our discussion will be geared toward what we include in our initial implementation and why, as well as what concerns we may have. In the future, we expect to be able to apply our algorithm to a more general approach, but to do so within a reasonable time, we needed to pick a place to start.
The simplification of fuzzy control algorithm and hardware implementation
NASA Technical Reports Server (NTRS)
Wu, Z. Q.; Wang, P. Z.; Teh, H. H.
1991-01-01
The conventional interface composition algorithm of a fuzzy controller is very time and memory consuming. As a result, it is difficult to do real time fuzzy inference, and most fuzzy controllers are realized by look-up tables. Here, researchers derive a simplified algorithm using the defuzzification mean of maximum. This algorithm takes shorter computation time and needs less memory usage, thus making it possible to compute the fuzzy inference on real time and easy to tune the control rules on line. A hardware implementation based on a simplified fuzzy inference algorithm is described.
Rapid algorithm prototyping and implementation for power quality measurement
NASA Astrophysics Data System (ADS)
Kołek, Krzysztof; Piątek, Krzysztof
2015-12-01
This article presents a Model-Based Design (MBD) approach to rapidly implement power quality (PQ) metering algorithms. Power supply quality is a very important aspect of modern power systems and will become even more important in future smart grids. In this case, maintaining the PQ parameters at the desired level will require efficient implementation methods of the metering algorithms. Currently, the development of new, advanced PQ metering algorithms requires new hardware with adequate computational capability and time intensive, cost-ineffective manual implementations. An alternative, considered here, is an MBD approach. The MBD approach focuses on the modelling and validation of the model by simulation, which is well-supported by a Computer-Aided Engineering (CAE) packages. This paper presents two algorithms utilized in modern PQ meters: a phase-locked loop based on an Enhanced Phase Locked Loop (EPLL), and the flicker measurement according to the IEC 61000-4-15 standard. The algorithms were chosen because of their complexity and non-trivial development. They were first modelled in the MATLAB/Simulink package, then tested and validated in a simulation environment. The models, in the form of Simulink diagrams, were next used to automatically generate C code. The code was compiled and executed in real-time on the Zynq Xilinx platform that combines a reconfigurable Field Programmable Gate Array (FPGA) with a dual-core processor. The MBD development of PQ algorithms, automatic code generation, and compilation form a rapid algorithm prototyping and implementation path for PQ measurements. The main advantage of this approach is the ability to focus on the design, validation, and testing stages while skipping over implementation issues. The code generation process renders production-ready code that can be easily used on the target hardware. This is especially important when standards for PQ measurement are in constant development, and the PQ issues in emerging smart
Efficient Implementation of the Backpropagation Algorithm in FPGAs and Microcontrollers.
Ortega-Zamorano, Francisco; Jerez, Jose M; Urda Munoz, Daniel; Luque-Baena, Rafael M; Franco, Leonardo
2016-09-01
The well-known backpropagation learning algorithm is implemented in a field-programmable gate array (FPGA) board and a microcontroller, focusing in obtaining efficient implementations in terms of a resource usage and computational speed. The algorithm was implemented in both cases using a training/validation/testing scheme in order to avoid overfitting problems. For the case of the FPGA implementation, a new neuron representation that reduces drastically the resource usage was introduced by combining the input and first hidden layer units in a single module. Further, a time-division multiplexing scheme was implemented for carrying out product computations taking advantage of the built-in digital signal processor cores. In both implementations, the floating-point data type representation normally used in a personal computer (PC) has been changed to a more efficient one based on a fixed-point scheme, reducing system memory variable usage and leading to an increase in computation speed. The results show that the modifications proposed produced a clear increase in computation speed in comparison with the standard PC-based implementation, demonstrating the usefulness of the intrinsic parallelism of FPGAs in neurocomputational tasks and the suitability of both implementations of the algorithm for its application to the real world problems. PMID:26277004
Outline of a fast hardware implementation of Winograd's DFT algorithm
NASA Technical Reports Server (NTRS)
Zohar, S.
1980-01-01
The main characteristics of the discrete Fourier transform (DFT) algorithm considered by Winograd (1976) is a significant reduction in the number of multiplications. Its primary disadvantage is a higher structural complexity. It is, therefore, difficult to translate the reduced number of multiplications into faster execution of the DFT by means of a software implementation of the algorithm. For this reason, a hardware implementation is considered in the current study, taking into account a design based on the algorithm prescription discussed by Zohar (1979). The hardware implementation of a FORTRAN subroutine is proposed, giving attention to a pipelining scheme in which 5 consecutive data batches are being operated on simultaneously, each batch undergoing one of 5 processing phases.
Experimental implementation of the quantum random-walk algorithm
Du Jiangfeng; Li Hui; Shi Mingjun; Zhou Xianyi; Han Rongdian; Xu Xiaodong; Wu Jihui
2003-04-01
The quantum random walk is a possible approach to construct quantum algorithms. Several groups have investigated the quantum random walk and experimental schemes were proposed. In this paper, we present the experimental implementation of the quantum random-walk algorithm on a nuclear-magnetic-resonance quantum computer. We observe that the quantum walk is in sharp contrast to its classical counterpart. In particular, the properties of the quantum walk strongly depends on the quantum entanglement.
On the design, analysis, and implementation of efficient parallel algorithms
Sohn, S.M.
1989-01-01
There is considerable interest in developing algorithms for a variety of parallel computer architectures. This is not a trivial problem, although for certain models great progress has been made. Recently, general-purpose parallel machines have become available commercially. These machines possess widely varying interconnection topologies and data/instruction access schemes. It is important, therefore, to develop methodologies and design paradigms for not only synthesizing parallel algorithms from initial problem specifications, but also for mapping algorithms between different architectures. This work has considered both of these problems. A systolic array consists of a large collection of simple processors that are interconnected in a uniform pattern. The author has studied in detain the problem of mapping systolic algorithms onto more general-purpose parallel architectures such as the hypercube. The hypercube architecture is notable due to its symmetry and high connectivity, characteristics which are conducive to the efficient embedding of parallel algorithms. Although the parallel-to-parallel mapping techniques have yielded efficient target algorithms, it is not surprising that an algorithm designed directly for a particular parallel model would achieve superior performance. In this context, the author has developed hypercube algorithms for some important problems in speech and signal processing, text processing, language processing and artificial intelligence. These algorithms were implemented on a 64-node NCUBE/7 hypercube machine in order to evaluate their performance.
Efficient implementation of the adaptive scale pixel decomposition algorithm
NASA Astrophysics Data System (ADS)
Zhang, L.; Bhatnagar, S.; Rau, U.; Zhang, M.
2016-08-01
Context. Most popular algorithms in use to remove the effects of a telescope's point spread function (PSF) in radio astronomy are variants of the CLEAN algorithm. Most of these algorithms model the sky brightness using the delta-function basis, which results in undesired artefacts when used to image extended emission. The adaptive scale pixel decomposition (Asp-Clean) algorithm models the sky brightness on a scale-sensitive basis and thus gives a significantly better imaging performance when imaging fields that contain both resolved and unresolved emission. Aims: However, the runtime cost of Asp-Clean is higher than that of scale-insensitive algorithms. In this paper, we identify the most expensive step in the original Asp-Clean algorithm and present an efficient implementation of it, which significantly reduces the computational cost while keeping the imaging performance comparable to the original algorithm. The PSF sidelobe levels of modern wide-band telescopes are significantly reduced, allowing us to make approximations to reduce the computational cost, which in turn allows for the deconvolution of larger images on reasonable timescales. Methods: As in the original algorithm, scales in the image are estimated through function fitting. Here we introduce an analytical method to model extended emission, and a modified method for estimating the initial values used for the fitting procedure, which ultimately leads to a lower computational cost. Results: The new implementation was tested with simulated EVLA data and the imaging performance compared well with the original Asp-Clean algorithm. Tests show that the current algorithm can recover features at different scales with lower computational cost.
NMR implementation of adiabatic SAT algorithm using strongly modulated pulses.
Mitra, Avik; Mahesh, T S; Kumar, Anil
2008-03-28
NMR implementation of adiabatic algorithms face severe problems in homonuclear spin systems since the qubit selective pulses are long and during this period, evolution under the Hamiltonian and decoherence cause errors. The decoherence destroys the answer as it causes the final state to evolve to mixed state and in homonuclear systems, evolution under the internal Hamiltonian causes phase errors preventing the initial state to converge to the solution state. The resolution of these issues is necessary before one can proceed to implement an adiabatic algorithm in a large system where homonuclear coupled spins will become a necessity. In the present work, we demonstrate that by using "strongly modulated pulses" (SMPs) for the creation of interpolating Hamiltonian, one can circumvent both the problems and successfully implement the adiabatic SAT algorithm in a homonuclear three qubit system. This work also demonstrates that the SMPs tremendously reduce the time taken for the implementation of the algorithm, can overcome problems associated with decoherence, and will be the modality in future implementation of quantum information processing by NMR. PMID:18376911
NMR implementation of adiabatic SAT algorithm using strongly modulated pulses
NASA Astrophysics Data System (ADS)
Mitra, Avik; Mahesh, T. S.; Kumar, Anil
2008-03-01
NMR implementation of adiabatic algorithms face severe problems in homonuclear spin systems since the qubit selective pulses are long and during this period, evolution under the Hamiltonian and decoherence cause errors. The decoherence destroys the answer as it causes the final state to evolve to mixed state and in homonuclear systems, evolution under the internal Hamiltonian causes phase errors preventing the initial state to converge to the solution state. The resolution of these issues is necessary before one can proceed to implement an adiabatic algorithm in a large system where homonuclear coupled spins will become a necessity. In the present work, we demonstrate that by using "strongly modulated pulses" (SMPs) for the creation of interpolating Hamiltonian, one can circumvent both the problems and successfully implement the adiabatic SAT algorithm in a homonuclear three qubit system. This work also demonstrates that the SMPs tremendously reduce the time taken for the implementation of the algorithm, can overcome problems associated with decoherence, and will be the modality in future implementation of quantum information processing by NMR.
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is showed to be more efficient in terms of storage space requirement and query-processing timing over the other sparse matrix algorithms for information retrieval application. Although inverted index algorithm is treated as the de facto standard for information retrieval for years, an alternative approach to store the index of text collection in a sparse matrix structure gains more attention. This approach performs query processing using sparse matrix-vector multiplication and due to parallelization achieves a substantial efficiency over the sequential inverted index. The parallel implementations of information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Arrays (FPGAs) board from Xilinx. A recent development in scientific applications is the use of FPGA to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within processing unit.
A novel pipeline based FPGA implementation of a genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2014-05-01
To solve problems when an analytical solution is not available, more and more bio-inspired computation techniques have been applied in the last years. Thus, an efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong has higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). GA performs several processes with the population individuals to produce a new population, like in the biological evolution. To provide a high speed solution, pipelined based FPGA hardware implementations are used, with a nstages pipeline for a n-phases genetic algorithm. The FPGA pipeline implementations are constraints by the different execution time of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique to modify the crossover step by using non identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children) not only one as in classical GA. We analyze the contribution of this method to reduce the execution time in the asynchronous and synchronous pipelines and also the possibility to a cheaper FPGA implementation, by using smaller populations. The full hardware architecture for a FPGA implementation to our target ALTERA development card is presented and analyzed.
Implementing Linear Algebra Related Algorithms on the TI-92+ Calculator.
ERIC Educational Resources Information Center
Alexopoulos, John; Abraham, Paul
2001-01-01
Demonstrates a less utilized feature of the TI-92+: its natural and powerful programming language. Shows how to implement several linear algebra related algorithms including the Gram-Schmidt process, Least Squares Approximations, Wronskians, Cholesky Decompositions, and Generalized Linear Least Square Approximations with QR Decompositions.…
Efficient implementations of hyperspectral chemical-detection algorithms
NASA Astrophysics Data System (ADS)
Brett, Cory J. C.; DiPietro, Robert S.; Manolakis, Dimitris G.; Ingle, Vinay K.
2013-10-01
Many military and civilian applications depend on the ability to remotely sense chemical clouds using hyperspectral imagers, from detecting small but lethal concentrations of chemical warfare agents to mapping plumes in the aftermath of natural disasters. Real-time operation is critical in these applications but becomes diffcult to achieve as the number of chemicals we search for increases. In this paper, we present efficient CPU and GPU implementations of matched-filter based algorithms so that real-time operation can be maintained with higher chemical-signature counts. The optimized C++ implementations show between 3x and 9x speedup over vectorized MATLAB implementations.
A distributed Canny edge detector: algorithm and FPGA implementation.
Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J
2014-07-01
The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
A distributed Canny edge detector: algorithm and FPGA implementation.
Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J
2014-07-01
The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
Experimental implementation of Grover's search algorithm with neutral atom qubits
NASA Astrophysics Data System (ADS)
Sun, Yuan; Lichtman, Martin; Baker, Kevin; Saffman, Mark
2016-05-01
Grover's algorithm for searching an unsorted data base provides a provable speedup over the best possible classical search and is therefore a test bed for demonstrating the power of quantum computation. The algorithm has been demonstrated with NMR, trapped ion, photonic, and superconducting hardware, but only with two qubits encoding a four element database. We report on progress towards experimental demonstration of Grover's algorithm using two and three neutral atom qubits encoding a database with up to eight elements. Our approach uses a Rydberg blockade Ck NOT gate for efficient implementation of the Grover iterations. Quantum Monte Carlo simulations of the algorithm performance that account for gate errors and decoherence rates are compared with experimental results. Work supported by the IARPA MQCO program.
Efficient Hardware Implementation of the Lightweight Block Encryption Algorithm LEA
Lee, Donggeon; Kim, Dong-Chan; Kwon, Daesung; Kim, Howon
2014-01-01
Recently, due to the advent of resource-constrained trends, such as smartphones and smart devices, the computing environment is changing. Because our daily life is deeply intertwined with ubiquitous networks, the importance of security is growing. A lightweight encryption algorithm is essential for secure communication between these kinds of resource-constrained devices, and many researchers have been investigating this field. Recently, a lightweight block cipher called LEA was proposed. LEA was originally targeted for efficient implementation on microprocessors, as it is fast when implemented in software and furthermore, it has a small memory footprint. To reflect on recent technology, all required calculations utilize 32-bit wide operations. In addition, the algorithm is comprised of not complex S-Box-like structures but simple Addition, Rotation, and XOR operations. To the best of our knowledge, this paper is the first report on a comprehensive hardware implementation of LEA. We present various hardware structures and their implementation results according to key sizes. Even though LEA was originally targeted at software efficiency, it also shows high efficiency when implemented as hardware. PMID:24406859
VIRTEX-5 Fpga Implementation of Advanced Encryption Standard Algorithm
NASA Astrophysics Data System (ADS)
Rais, Muhammad H.; Qasim, Syed M.
2010-06-01
In this paper, we present an implementation of Advanced Encryption Standard (AES) cryptographic algorithm using state-of-the-art Virtex-5 Field Programmable Gate Array (FPGA). The design is coded in Very High Speed Integrated Circuit Hardware Description Language (VHDL). Timing simulation is performed to verify the functionality of the designed circuit. Performance evaluation is also done in terms of throughput and area. The design implemented on Virtex-5 (XC5VLX50FFG676-3) FPGA achieves a maximum throughput of 4.34 Gbps utilizing a total of 399 slices.
A parallel implementation of the Wang Landau algorithm
NASA Astrophysics Data System (ADS)
Zhan, Lixin
2008-09-01
The Wang-Landau algorithm is a flat-histogram Monte Carlo method that performs random walks in the configuration space of a system to obtain a close estimation of the density of states iteratively. It has been applied successfully to many research fields. In this paper, we propose a parallel implementation of the Wang-Landau algorithm on computers of shared memory architectures by utilizing the OpenMP API for distributed computing. This implementation is applied to Ising model systems with promising speedups. We also examine the effects on the running speed when using different strategies in accessing the shared memory space during the updating procedure. The allowance of data race is recommended in consideration of the simulation efficiency. Such treatment does not affect the accuracy of the final density of states obtained.
Implementing wide baseline matching algorithms on a graphics processing unit.
Rothganger, Fredrick H.; Larson, Kurt W.; Gonzales, Antonio Ignacio; Myers, Daniel S.
2007-10-01
Wide baseline matching is the state of the art for object recognition and image registration problems in computer vision. Though effective, the computational expense of these algorithms limits their application to many real-world problems. The performance of wide baseline matching algorithms may be improved by using a graphical processing unit as a fast multithreaded co-processor. In this paper, we present an implementation of the difference of Gaussian feature extractor, based on the CUDA system of GPU programming developed by NVIDIA, and implemented on their hardware. For a 2000x2000 pixel image, the GPU-based method executes nearly thirteen times faster than a comparable CPU-based method, with no significant loss of accuracy.
A high performance hardware implementation image encryption with AES algorithm
NASA Astrophysics Data System (ADS)
Farmani, Ali; Jafari, Mohamad; Miremadi, Seyed Sohrab
2011-06-01
This paper describes implementation of a high-speed encryption algorithm with high throughput for encrypting the image. Therefore, we select a highly secured symmetric key encryption algorithm AES(Advanced Encryption Standard), in order to increase the speed and throughput using pipeline technique in four stages, control unit based on logic gates, optimal design of multiplier blocks in mixcolumn phase and simultaneous production keys and rounds. Such procedure makes AES suitable for fast image encryption. Implementation of a 128-bit AES on FPGA of Altra company has been done and the results are as follow: throughput, 6 Gbps in 471MHz. The time of encrypting in tested image with 32*32 size is 1.15ms.
Control algorithm implementation for a redundant degree of freedom manipulator
NASA Technical Reports Server (NTRS)
Cohan, Steve
1991-01-01
This project's purpose is to develop and implement control algorithms for a kinematically redundant robotic manipulator. The manipulator is being developed concurrently by Odetics Inc., under internal research and development funding. This SBIR contract supports algorithm conception, development, and simulation, as well as software implementation and integration with the manipulator hardware. The Odetics Dexterous Manipulator is a lightweight, high strength, modular manipulator being developed for space and commercial applications. It has seven fully active degrees of freedom, is electrically powered, and is fully operational in 1 G. The manipulator consists of five self-contained modules. These modules join via simple quick-disconnect couplings and self-mating connectors which allow rapid assembly/disassembly for reconfiguration, transport, or servicing. Each joint incorporates a unique drive train design which provides zero backlash operation, is insensitive to wear, and is single fault tolerant to motor or servo amplifier failure. The sensing system is also designed to be single fault tolerant. Although the initial prototype is not space qualified, the design is well-suited to meeting space qualification requirements. The control algorithm design approach is to develop a hierarchical system with well defined access and interfaces at each level. The high level endpoint/configuration control algorithm transforms manipulator endpoint position/orientation commands to joint angle commands, providing task space motion. At the same time, the kinematic redundancy is resolved by controlling the configuration (pose) of the manipulator, using several different optimizing criteria. The center level of the hierarchy servos the joints to their commanded trajectories using both linear feedback and model-based nonlinear control techniques. The lowest control level uses sensed joint torque to close torque servo loops, with the goal of improving the manipulator dynamic behavior
Decoding the brain's algorithm for categorization from its neural implementation.
Mack, Michael L; Preston, Alison R; Love, Bradley C
2013-10-21
Acts of cognition can be described at different levels of analysis: what behavior should characterize the act, what algorithms and representations underlie the behavior, and how the algorithms are physically realized in neural activity [1]. Theories that bridge levels of analysis offer more complete explanations by leveraging the constraints present at each level [2-4]. Despite the great potential for theoretical advances, few studies of cognition bridge levels of analysis. For example, formal cognitive models of category decisions accurately predict human decision making [5, 6], but whether model algorithms and representations supporting category decisions are consistent with underlying neural implementation remains unknown. This uncertainty is largely due to the hurdle of forging links between theory and brain [7-9]. Here, we tackle this critical problem by using brain response to characterize the nature of mental computations that support category decisions to evaluate two dominant, and opposing, models of categorization. We found that brain states during category decisions were significantly more consistent with latent model representations from exemplar [5] rather than prototype theory [10, 11]. Representations of individual experiences, not the abstraction of experiences, are critical for category decision making. Holding models accountable for behavior and neural implementation provides a means for advancing more complete descriptions of the algorithms of cognition. PMID:24094852
Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer
NASA Technical Reports Server (NTRS)
Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw
2000-01-01
Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions comparing to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
Implementation of several mathematical algorithms to breast tissue density classification
NASA Astrophysics Data System (ADS)
Quintana, C.; Redondo, M.; Tirao, G.
2014-02-01
The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where a dense breast tissue can hide lesions causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images, according to the American College of Radiology classifications. These mathematical techniques are based on intrinsic properties calculations and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithms evaluation was performed on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnostics, showing a good performance. The implemented algorithms revealed a high potentiality to classify breasts into tissue density categories.
Developing and Implementing the Data Mining Algorithms in RAVEN
Sen, Ramazan Sonat; Maljovec, Daniel Patrick; Alfonsi, Andrea; Rabiti, Cristian
2015-09-01
The RAVEN code is becoming a comprehensive tool to perform probabilistic risk assessment, uncertainty quantification, and verification and validation. The RAVEN code is being developed to support many programs and to provide a set of methodologies and algorithms for advanced analysis. Scientific computer codes can generate enormous amounts of data. To post-process and analyze such data might, in some cases, take longer than the initial software runtime. Data mining algorithms/methods help in recognizing and understanding patterns in the data, and thus discover knowledge in databases. The methodologies used in the dynamic probabilistic risk assessment or in uncertainty and error quantification analysis couple system/physics codes with simulation controller codes, such as RAVEN. RAVEN introduces both deterministic and stochastic elements into the simulation while the system/physics code model the dynamics deterministically. A typical analysis is performed by sampling values of a set of parameter values. A major challenge in using dynamic probabilistic risk assessment or uncertainty and error quantification analysis for a complex system is to analyze the large number of scenarios generated. Data mining techniques are typically used to better organize and understand data, i.e. recognizing patterns in the data. This report focuses on development and implementation of Application Programming Interfaces (APIs) for different data mining algorithms, and the application of these algorithms to different databases.
Kodiak: An Implementation Framework for Branch and Bound Algorithms
NASA Technical Reports Server (NTRS)
Smith, Andrew P.; Munoz, Cesar A.; Narkawicz, Anthony J.; Markevicius, Mantas
2015-01-01
Recursive branch and bound algorithms are often used to refine and isolate solutions to several classes of global optimization problems. A rigorous computation framework for the solution of systems of equations and inequalities involving nonlinear real arithmetic over hyper-rectangular variable and parameter domains is presented. It is derived from a generic branch and bound algorithm that has been formally verified, and utilizes self-validating enclosure methods, namely interval arithmetic and, for polynomials and rational functions, Bernstein expansion. Since bounds computed by these enclosure methods are sound, this approach may be used reliably in software verification tools. Advantage is taken of the partial derivatives of the constraint functions involved in the system, firstly to reduce the branching factor by the use of bisection heuristics and secondly to permit the computation of bifurcation sets for systems of ordinary differential equations. The associated software development, Kodiak, is presented, along with examples of three different branch and bound problem types it implements.
Neural network implementations of data association algorithms for sensor fusion
NASA Technical Reports Server (NTRS)
Brown, Donald E.; Pittard, Clarence L.; Martin, Worthy N.
1989-01-01
The paper is concerned with locating a time varying set of entities in a fixed field when the entities are sensed at discrete time instances. At a given time instant a collection of bivariate Gaussian sensor reports is produced, and these reports estimate the location of a subset of the entities present in the field. A database of reports is maintained, which ideally should contain one report for each entity sensed. Whenever a collection of sensor reports is received, the database must be updated to reflect the new information. This updating requires association processing between the database reports and the new sensor reports to determine which pairs of sensor and database reports correspond to the same entity. Algorithms for performing this association processing are presented. Neural network implementation of the algorithms, along with simulation results comparing the approaches are provided.
A pipelined FPGA implementation of an encryption algorithm based on genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2013-05-01
With the evolution of digital data storage and exchange, it is essential to protect the confidential information from every unauthorized access. High performance encryption algorithms were developed and implemented by software and hardware. Also many methods to attack the cipher text were developed. In the last years, the genetic algorithm has gained much interest in cryptanalysis of cipher texts and also in encryption ciphers. This paper analyses the possibility to use the genetic algorithm as a multiple key sequence generator for an AES (Advanced Encryption Standard) cryptographic system, and also to use a three stages pipeline (with four main blocks: Input data, AES Core, Key generator, Output data) to provide a fast encryption and storage/transmission of a large amount of data.
DSP Implementation of the Retinex Image Enhancement Algorithm
NASA Technical Reports Server (NTRS)
Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn
2004-01-01
The Retinex is a general-purpose image enhancement algorithm that is used to produce good visual representations of scenes. It performs a non-linear spatial/spectral transform that synthesizes strong local contrast enhancement and color constancy. A real-time, video frame rate implementation of the Retinex is required to meet the needs of various potential users. Retinex processing contains a relatively large number of complex computations, thus to achieve real-time performance using current technologies requires specialized hardware and software. In this paper we discuss the design and development of a digital signal processor (DSP) implementation of the Retinex. The target processor is a Texas Instruments TMS320C6711 floating point DSP. NTSC video is captured using a dedicated frame-grabber card, Retinex processed, and displayed on a standard monitor. We discuss the optimizations used to achieve real-time performance of the Retinex and also describe our future plans on using alternative architectures.
Purgatorio - A new implementation of the Inferno algorithm
Wilson, B; Sonnad, V; Sterne, P; Isaacs, W
2005-03-29
For astrophysical applications, as well as modeling laser-produced plasmas, there is a continual need for equation-of-state data over a wide domain of physical conditions. This paper presents algorithmic aspects for computing the Helmholtz free energy of plasma electrons for temperatures spanning from a few Kelvin to several KeV, and densities ranging from essentially isolated ion conditions to such large compressions that most bound orbitals become delocalized. The objective is high precision results in order to compute pressure and other thermodynamic quantities by numerical differentiation. This approach has the advantage that internal thermodynamic self-consistency is ensured, regardless of the specific physical model, but at the cost of very stringent numerical tolerances for each operation. The computational aspects we address in this paper are faced by any model that relies on input from the quantum mechanical spectrum of a spherically symmetric Hamiltonian operator. The particular physical model we employ is that of INFERNO; of a spherically averaged ion embedded in jellium. An overview of PURGATORIO, a new implementation of the INFERNO equation of state model, is presented. The new algorithm emphasizes a novel decimation scheme for automatically resolving the structure of the continuum density of states, circumventing limitations of the pseudo-R matrix algorithm previously utilized.
Neuromorphic implementations of neurobiological learning algorithms for spiking neural networks.
Walter, Florian; Röhrbein, Florian; Knoll, Alois
2015-12-01
The application of biologically inspired methods in design and control has a long tradition in robotics. Unlike previous approaches in this direction, the emerging field of neurorobotics not only mimics biological mechanisms at a relatively high level of abstraction but employs highly realistic simulations of actual biological nervous systems. Even today, carrying out these simulations efficiently at appropriate timescales is challenging. Neuromorphic chip designs specially tailored to this task therefore offer an interesting perspective for neurorobotics. Unlike Von Neumann CPUs, these chips cannot be simply programmed with a standard programming language. Like real brains, their functionality is determined by the structure of neural connectivity and synaptic efficacies. Enabling higher cognitive functions for neurorobotics consequently requires the application of neurobiological learning algorithms to adjust synaptic weights in a biologically plausible way. In this paper, we therefore investigate how to program neuromorphic chips by means of learning. First, we provide an overview over selected neuromorphic chip designs and analyze them in terms of neural computation, communication systems and software infrastructure. On the theoretical side, we review neurobiological learning techniques. Based on this overview, we then examine on-die implementations of these learning algorithms on the considered neuromorphic chips. A final discussion puts the findings of this work into context and highlights how neuromorphic hardware can potentially advance the field of autonomous robot systems. The paper thus gives an in-depth overview of neuromorphic implementations of basic mechanisms of synaptic plasticity which are required to realize advanced cognitive capabilities with spiking neural networks. PMID:26422422
Neuromorphic implementations of neurobiological learning algorithms for spiking neural networks.
Walter, Florian; Röhrbein, Florian; Knoll, Alois
2015-12-01
The application of biologically inspired methods in design and control has a long tradition in robotics. Unlike previous approaches in this direction, the emerging field of neurorobotics not only mimics biological mechanisms at a relatively high level of abstraction but employs highly realistic simulations of actual biological nervous systems. Even today, carrying out these simulations efficiently at appropriate timescales is challenging. Neuromorphic chip designs specially tailored to this task therefore offer an interesting perspective for neurorobotics. Unlike Von Neumann CPUs, these chips cannot be simply programmed with a standard programming language. Like real brains, their functionality is determined by the structure of neural connectivity and synaptic efficacies. Enabling higher cognitive functions for neurorobotics consequently requires the application of neurobiological learning algorithms to adjust synaptic weights in a biologically plausible way. In this paper, we therefore investigate how to program neuromorphic chips by means of learning. First, we provide an overview over selected neuromorphic chip designs and analyze them in terms of neural computation, communication systems and software infrastructure. On the theoretical side, we review neurobiological learning techniques. Based on this overview, we then examine on-die implementations of these learning algorithms on the considered neuromorphic chips. A final discussion puts the findings of this work into context and highlights how neuromorphic hardware can potentially advance the field of autonomous robot systems. The paper thus gives an in-depth overview of neuromorphic implementations of basic mechanisms of synaptic plasticity which are required to realize advanced cognitive capabilities with spiking neural networks.
Cascade Error Projection: A Learning Algorithm for Hardware Implementation
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Daud, Taher
1996-01-01
In this paper, we workout a detailed mathematical analysis for a new learning algorithm termed Cascade Error Projection (CEP) and a general learning frame work. This frame work can be used to obtain the cascade correlation learning algorithm by choosing a particular set of parameters. Furthermore, CEP learning algorithm is operated only on one layer, whereas the other set of weights can be calculated deterministically. In association with the dynamical stepsize change concept to convert the weight update from infinite space into a finite space, the relation between the current stepsize and the previous energy level is also given and the estimation procedure for optimal stepsize is used for validation of our proposed technique. The weight values of zero are used for starting the learning for every layer, and a single hidden unit is applied instead of using a pool of candidate hidden units similar to cascade correlation scheme. Therefore, simplicity in hardware implementation is also obtained. Furthermore, this analysis allows us to select from other methods (such as the conjugate gradient descent or the Newton's second order) one of which will be a good candidate for the learning technique. The choice of learning technique depends on the constraints of the problem (e.g., speed, performance, and hardware implementation); one technique may be more suitable than others. Moreover, for a discrete weight space, the theoretical analysis presents the capability of learning with limited weight quantization. Finally, 5- to 8-bit parity and chaotic time series prediction problems are investigated; the simulation results demonstrate that 4-bit or more weight quantization is sufficient for learning neural network using CEP. In addition, it is demonstrated that this technique is able to compensate for less bit weight resolution by incorporating additional hidden units. However, generation result may suffer somewhat with lower bit weight quantization.
Comparison of cyclic correlation algorithm implemented in matlab and python
NASA Astrophysics Data System (ADS)
Carr, Richard; Whitney, James
Simulation is a necessary step for all engineering projects. Simulation gives the engineers an approximation of how their devices will perform under different circumstances, without hav-ing to build, or before building a physical prototype. This is especially true for space bound devices, i.e., space communication systems, where the impact of system malfunction or failure is several orders of magnitude over that of terrestrial applications. Therefore having a reliable simulation tool is key in developing these devices and systems. Math Works Matrix Laboratory (MATLAB) is a matrix based software used by scientists and engineers to solve problems and perform complex simulations. MATLAB has a number of applications in a wide variety of fields which include communications, signal processing, image processing, mathematics, eco-nomics and physics. Because of its many uses MATLAB has become the preferred software for many engineers; it is also very expensive, especially for students and startups. One alternative to MATLAB is Python. The Python is a powerful, easy to use, open source programming environment that can be used to perform many of the same functions as MATLAB. Python programming environment has been steadily gaining popularity in niche programming circles. While there are not as many function included in the software as MATLAB, there are many open source functions that have been developed that are available to be downloaded for free. This paper illustrates how Python can implement the cyclic correlation algorithm and com-pares the results to the cyclic correlation algorithm implemented in the MATLAB environment. Some of the characteristics to be compared are the accuracy and precision of the results, and the length of the programs. The paper will demonstrate that Python is capable of performing simulations of complex algorithms such cyclic correlation.
NASA Astrophysics Data System (ADS)
López-Medina, Mario E.; Vázquez-Montiel, Sergio; Herrera-Vázquez, Joel
2008-04-01
The Genetic Algorithms, GAs, are a method of global optimization that we use in the stage of optimization in the design of optical systems. In the case of optical design and optimization, the efficiency and convergence speed of GAs are related with merit function, crossover operator, and mutation operator. In this study we present a comparison between several genetic algorithms implementations using different optical systems, like achromatic cemented doublet, air spaced doublet and telescopes. We do the comparison varying the type of design parameters and the number of parameters to be optimized. We also implement the GAs using discreet parameters with binary chains and with continuous parameter using real numbers in the chromosome; analyzing the differences in the time taken to find the solution and the precision in the results between discreet and continuous parameters. Additionally, we use different merit function to optimize the same optical system. We present the obtained results in tables, graphics and a detailed example; and of the comparison we conclude which is the best way to implement GAs for design and optimization optical system. The programs developed for this work were made using the C programming language and OSLO for the simulation of the optical systems.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
All-Optical Implementation of the Ant Colony Optimization Algorithm.
Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I; Soci, Cesare
2016-01-01
We report all-optical implementation of the optimization algorithm for the famous "ant colony" problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems. PMID:27222098
An overview of SuperLU: Algorithms, implementation, and userinterface
Li, Xiaoye S.
2003-09-30
We give an overview of the algorithms, design philosophy,and implementation techniques in the software SuperLU, for solving sparseunsymmetric linear systems. In particular, we highlight the differencesbetween the sequential SuperLU (including its multithreaded extension)and parallel SuperLU_DIST. These include the numerical pivoting strategy,the ordering strategy for preserving sparsity, the ordering in which theupdating tasks are performed, the numerical kernel, and theparallelization strategy. Because of the scalability concern, theparallel code is drastically different from the sequential one. Wedescribe the user interfaces ofthe libraries, and illustrate how to usethe libraries most efficiently depending on some matrix characteristics.Finally, we give some examples of how the solver has been used inlarge-scale scientific applications, and the performance.
All-Optical Implementation of the Ant Colony Optimization Algorithm
Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I.; Soci, Cesare
2016-01-01
We report all-optical implementation of the optimization algorithm for the famous “ant colony” problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems. PMID:27222098
All-Optical Implementation of the Ant Colony Optimization Algorithm
NASA Astrophysics Data System (ADS)
Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I.; Soci, Cesare
2016-05-01
We report all-optical implementation of the optimization algorithm for the famous “ant colony” problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems.
The maximum clique enumeration problem: algorithms, applications, and implementations
2012-01-01
Background The maximum clique enumeration (MCE) problem asks that we identify all maximum cliques in a finite, simple graph. MCE is closely related to two other well-known and widely-studied problems: the maximum clique optimization problem, which asks us to determine the size of a largest clique, and the maximal clique enumeration problem, which asks that we compile a listing of all maximal cliques. Naturally, these three problems are NP-hard, given that they subsume the classic version of the NP-complete clique decision problem. MCE can be solved in principle with standard enumeration methods due to Bron, Kerbosch, Kose and others. Unfortunately, these techniques are ill-suited to graphs encountered in our applications. We must solve MCE on instances deeply seeded in data mining and computational biology, where high-throughput data capture often creates graphs of extreme size and density. MCE can also be solved in principle using more modern algorithms based in part on vertex cover and the theory of fixed-parameter tractability (FPT). While FPT is an improvement, these algorithms too can fail to scale sufficiently well as the sizes and densities of our datasets grow. Results An extensive testbed of benchmark graphs are created using publicly available transcriptomic datasets from the Gene Expression Omnibus (GEO). Empirical testing reveals crucial but latent features of such high-throughput biological data. In turn, it is shown that these features distinguish real data from random data intended to reproduce salient topological features. In particular, with real data there tends to be an unusually high degree of maximum clique overlap. Armed with this knowledge, novel decomposition strategies are tuned to the data and coupled with the best FPT MCE implementations. Conclusions Several algorithmic improvements to MCE are made which progressively decrease the run time on graphs in the testbed. Frequently the final runtime improvement is several orders of magnitude
An implementation of continuous genetic algorithm in parameter estimation of predator-prey model
NASA Astrophysics Data System (ADS)
Windarto
2016-03-01
Genetic algorithm is an optimization method based on the principles of genetics and natural selection in life organisms. The main components of this algorithm are chromosomes population (individuals population), parent selection, crossover to produce new offspring, and random mutation. In this paper, continuous genetic algorithm was implemented to estimate parameters in a predator-prey model of Lotka-Volterra type. For simplicity, all genetic algorithm parameters (selection rate and mutation rate) are set to be constant along implementation of the algorithm. It was found that by selecting suitable mutation rate, the algorithms can estimate these parameters well.
An Object-Oriented Collection of Minimum Degree Algorithms: Design, Implementation, and Experiences
NASA Technical Reports Server (NTRS)
Kumfert, Gary; Pothen, Alex
1999-01-01
The multiple minimum degree (MMD) algorithm and its variants have enjoyed 20+ years of research and progress in generating fill-reducing orderings for sparse, symmetric positive definite matrices. Although conceptually simple, efficient implementations of these algorithms are deceptively complex and highly specialized. In this case study, we present an object-oriented library that implements several recent minimum degree-like algorithms. We discuss how object-oriented design forces us to decompose these algorithms in a different manner than earlier codes and demonstrate how this impacts the flexibility and efficiency of our C++ implementation. We compare the performance of our code against other implementations in C or Fortran.
Systolic VLSI array for implementing the Kalman filter algorithm
NASA Technical Reports Server (NTRS)
Chang, Jaw J. (Inventor); Yeh, Hen-Geul (Inventor)
1989-01-01
A method and apparatus for processing signals representative of a complex matrix/vector equation is disclosed and claimed. More particularly, signals representing an orderly sequence of the combined matrix and vector equation, known as a Kalman filter algorithm, is processed in real-time in accordance with the principles of this invention. The Kalman filter algorithm is converted into a Faddeev algorithm, which is a matrix-only algorithm. The Faddeev algorithm is modified to represent both the matrix and vector portions of the Kalman filter algorithm. The modified Faddeev algorithm is embodied into electrical signals which are applied as inputs to a systolic array processor, which performs triangulation and nullification on the input signals, and delivers an output signal to a real-time utilization circuit.
Error tolerance in an NMR implementation of Grover's fixed-point quantum search algorithm
Xiao Li; Jones, Jonathan A.
2005-09-15
We describe an implementation of Grover's fixed-point quantum search algorithm on a nuclear magnetic resonance quantum computer, searching for either one or two matching items in an unsorted database of four items. In this algorithm the target state (an equally weighted superposition of the matching states) is a fixed point of the recursive search operator, so that the algorithm always moves towards the desired state. The effects of systematic errors in the implementation are briefly explored.
Improvement and implementation for Canny edge detection algorithm
NASA Astrophysics Data System (ADS)
Yang, Tao; Qiu, Yue-hong
2015-07-01
Edge detection is necessary for image segmentation and pattern recognition. In this paper, an improved Canny edge detection approach is proposed due to the defect of traditional algorithm. A modified bilateral filter with a compensation function based on pixel intensity similarity judgment was used to smooth image instead of Gaussian filter, which could preserve edge feature and remove noise effectively. In order to solve the problems of sensitivity to the noise in gradient calculating, the algorithm used 4 directions gradient templates. Finally, Otsu algorithm adaptively obtain the dual-threshold. All of the algorithm simulated with OpenCV 2.4.0 library in the environments of vs2010, and through the experimental analysis, the improved algorithm has been proved to detect edge details more effectively and with more adaptability.
A modified density-based clustering algorithm and its implementation
NASA Astrophysics Data System (ADS)
Ban, Zhihua; Liu, Jianguo; Yuan, Lulu; Yang, Hua
2015-12-01
This paper presents an improved density-based clustering algorithm based on the paper of clustering by fast search and find of density peaks. A distance threshold is introduced for the purpose of economizing memory. In order to reduce the probability that two points share the same density value, similarity is utilized to define proximity measure. We have tested the modified algorithm on a large data set, several small data sets and shape data sets. It turns out that the proposed algorithm can obtain acceptable results and can be applied more wildly.
Algorithm Implementation for a Prototype Time-Encoded Signature Detector
Mercier, Theresa M.; Runkle, Robert C.; Stephens, Daniel L.; Hyronimus, Brian J.; Morris, Scott J.; Seifert, Allen; Wyatt, Cory R.
2007-12-31
The authors constructed a prototype Time-Encoded Signature (TES) system, complete with automated detection algorithms, useful for the detection of point-like, gamma-ray sources in search applications where detectors observe large variability in background count rates beyond statistical (Poisson) noise. The person-carried TES instrument consists of two Cesium Iodide scintillators placed on opposite sides of a Tungsten shield. This geometry mitigates systematic background variation, and induces a unique signature upon encountering point-like sources. This manuscript focuses on the development of detection algorithms that both identify point-source signatures and are computationally simple. The latter constraint derives from the instrument’s mobile (and thus low power) operation. The authors evaluated algorithms on both simulated and field data. The results of this analysis demonstrate the ability to detect sources at a wide range of source-detector distances using computationally simple algorithms.
An Improved QRS Wave Group Detection Algorithm and Matlab Implementation
NASA Astrophysics Data System (ADS)
Zhang, Hongjun
This paper presents an algorithm using Matlab software to detect QRS wave group of MIT-BIH ECG database. First of all the noise in ECG be Butterworth filtered, and then analysis the ECG signal based on wavelet transform to detect the parameters of the principle of singularity, more accurate detection of the QRS wave group was achieved.
Implementations of back propagation algorithm in ecosystems applications
NASA Astrophysics Data System (ADS)
Ali, Khalda F.; Sulaiman, Riza; Elamir, Amir Mohamed
2015-05-01
Artificial Neural Networks (ANNs) have been applied to an increasing number of real world problems of considerable complexity. Their most important advantage is in solving problems which are too complex for conventional technologies, that do not have an algorithmic solutions or their algorithmic Solutions is too complex to be found. In general, because of their abstraction from the biological brain, ANNs are developed from concept that evolved in the late twentieth century neuro-physiological experiments on the cells of the human brain to overcome the perceived inadequacies with conventional ecological data analysis methods. ANNs have gained increasing attention in ecosystems applications, because of ANN's capacity to detect patterns in data through non-linear relationships, this characteristic confers them a superior predictive ability. In this research, ANNs is applied in an ecological system analysis. The neural networks use the well known Back Propagation (BP) Algorithm with the Delta Rule for adaptation of the system. The Back Propagation (BP) training Algorithm is an effective analytical method for adaptation of the ecosystems applications, the main reason because of their capacity to detect patterns in data through non-linear relationships. This characteristic confers them a superior predicting ability. The BP algorithm uses supervised learning, which means that we provide the algorithm with examples of the inputs and outputs we want the network to compute, and then the error is calculated. The idea of the back propagation algorithm is to reduce this error, until the ANNs learns the training data. The training begins with random weights, and the goal is to adjust them so that the error will be minimal. This research evaluated the use of artificial neural networks (ANNs) techniques in an ecological system analysis and modeling. The experimental results from this research demonstrate that an artificial neural network system can be trained to act as an expert
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 2
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noel M.
1990-01-01
It is shown how the look-ahead Lanczos process (combined with a quasi-minimal residual QMR) approach) can be used to develop a robust black box solver for large sparse non-Hermitian linear systems. Details of an implementation of the resulting QMR algorithm are presented. It is demonstrated that the QMR method is closely related to the biconjugate gradient (BCG) algorithm; however, unlike BCG, the QMR algorithm has smooth convergence curves and good numerical properties. We report numerical experiments with our implementation of the look-ahead Lanczos algorithm, both for eigenvalue problem and linear systems. Also, program listings of FORTRAN implementations of the look-ahead algorithm and the QMR method are included.
Comparative study of fusion algorithms and implementation of new efficient solution
NASA Astrophysics Data System (ADS)
Besrour, Amine; Snoussi, Hichem; Siala, Mohamed; Abdelkefi, Fatma
2014-05-01
High Dynamic Range (HDR) imaging has been the subject of significant researches over the past years, the goal of acquiring the best cinema-quality HDR images of fast-moving scenes using an efficient merging algorithm has not yet been achieved. In fact, through the years, many efficient algorithms have been implemented and developed. However, they are not yet efficient since they don't treat all the situations and they have not enough speed to ensure fast HDR image reconstitution. In this paper, we will present a full comparative analyze and study of the available fusion algorithms. Also, we will implement our personal algorithm which can be more optimized and faster than the existed ones. We will also present our investigated algorithm that has the advantage to be more optimized than the existing ones. This merging algorithm is related to our hardware solution allowing us to obtain four pictures with different exposures.
Constant-time parallel sorting algorithm and its optical implementation using smart pixels
NASA Astrophysics Data System (ADS)
Louri, Ahmed; Hatch, James A., Jr.; Na, Jongwhoa
1995-06-01
Sorting is a fundamental operation that has important implications in a vast number of areas. For instance, sorting is heavily utilized in applications such as database machines, in which hashing techniques are used to accelerate data-processing algorithms. It is also the basis for interprocessor message routing and has strong implications in video telecommunications. However, high-speed electronic sorting networks are difficult to implement with VLSI technology because of the dense, global connectivity required. Optics eliminates this bottleneck by offering global interconnects, massive parallelism, and noninterfering communications. We present a parallel sorting algorithm and its efficient optical implementation. The algorithm sorts n data elements in few steps, independent of the number of elements to be sorted. Thus it is a constant-time sorting algorithm [i.e., O(1) time]. We also estimate the system's performance to show that the proposed sorting algorithm can provide at least 2 orders of magnitude improvement in execution time over conventional electronic algorithms.
A simple implementation of the Viterbi algorithm on the Motorola DSP56001
NASA Technical Reports Server (NTRS)
Messer, Dion D.; Park, Sangil
1990-01-01
As systems designers design communication systems with digital rather than analog components to reduce noise and increase channel capacity, they must have the ability to perform traditional communication algorithms digitally. The use of trellis coded modulation as well as the extensive use of convolutional encoding for error detection and correction requires an efficient digital implementation of the Viterbi Algorithm for real time demodulation and decoding. Digital signal processors are now fast enough to implement Viterbi decoding in conjunction with the normal receiver/transmitter functions for lower speed channels on a single chip as well as performing fast decoding for higher speed channels, if the algorithm is implemented efficiently. The purpose of this paper is to identify a good way to implement the Viterbi Algorithm (VA) on the Motorola DSP56001, balancing performance considerations with speed and memory efficiency.
A Fast Implementation of the ISODATA Clustering Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2005-01-01
Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
Implementation of Real-Time Feedback Flow Control Algorithms on a Canonical Testbed
NASA Technical Reports Server (NTRS)
Tian, Ye; Song, Qi; Cattafesta, Louis
2005-01-01
This report summarizes the activities on "Implementation of Real-Time Feedback Flow Control Algorithms on a Canonical Testbed." The work summarized consists primarily of two parts. The first part summarizes our previous work and the extensions to adaptive ID and control algorithms. The second part concentrates on the validation of adaptive algorithms by applying them to a vibration beam test bed. Extensions to flow control problems are discussed.
NASA Technical Reports Server (NTRS)
Delaat, J. C.; Merrill, W. C.
1983-01-01
A sensor failure detection, isolation, and accommodation algorithm was developed which incorporates analytic sensor redundancy through software. This algorithm was implemented in a high level language on a microprocessor based controls computer. Parallel processing and state-of-the-art 16-bit microprocessors are used along with efficient programming practices to achieve real-time operation.
A Novel Implementation of Efficient Algorithms for Quantum Circuit Synthesis
NASA Astrophysics Data System (ADS)
Zeller, Luke
In this project, we design and develop a computer program to effectively approximate arbitrary quantum gates using the discrete set of Clifford Gates together with the T gate (π/8 gate). Employing recent results from Mosca et. al. and Giles and Selinger, we implement a decomposition scheme that outputs a sequence of Clifford, T, and Tt gates that approximate the input to within a specified error range ɛ. Specifically, the given gate is first rounded to an element of Z[1/2, i] with a precision determined by ɛ, and then exact synthesis is employed to produce the resulting gate. It is known that this procedure is optimal in approximating an arbitrary single qubit gate. Our program, written in Matlab and Python, can complete both approximate and exact synthesis of qubits. It can be used to assist in the experimental implementation of an arbitrary fault-tolerant single qubit gate, for which direct implementation isn't feasible.
Demonstration of a small programmable quantum computer with atomic qubits.
Debnath, S; Linke, N M; Figgatt, C; Landsman, K A; Wright, K; Monroe, C
2016-08-01
Quantum computers can solve certain problems more efficiently than any possible conventional computer. Small quantum algorithms have been demonstrated on multiple quantum computing platforms, many specifically tailored in hardware to implement a particular algorithm or execute a limited number of computational paths. Here we demonstrate a five-qubit trapped-ion quantum computer that can be programmed in software to implement arbitrary quantum algorithms by executing any sequence of universal quantum logic gates. We compile algorithms into a fully connected set of gate operations that are native to the hardware and have a mean fidelity of 98 per cent. Reconfiguring these gate sequences provides the flexibility to implement a variety of algorithms without altering the hardware. As examples, we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms with average success rates of 95 and 90 per cent, respectively. We also perform a coherent quantum Fourier transform on five trapped-ion qubits for phase estimation and period finding with average fidelities of 62 and 84 per cent, respectively. This small quantum computer can be scaled to larger numbers of qubits within a single register, and can be further expanded by connecting several such modules through ion shuttling or photonic quantum channels. PMID:27488798
Demonstration of a small programmable quantum computer with atomic qubits.
Debnath, S; Linke, N M; Figgatt, C; Landsman, K A; Wright, K; Monroe, C
2016-08-03
Quantum computers can solve certain problems more efficiently than any possible conventional computer. Small quantum algorithms have been demonstrated on multiple quantum computing platforms, many specifically tailored in hardware to implement a particular algorithm or execute a limited number of computational paths. Here we demonstrate a five-qubit trapped-ion quantum computer that can be programmed in software to implement arbitrary quantum algorithms by executing any sequence of universal quantum logic gates. We compile algorithms into a fully connected set of gate operations that are native to the hardware and have a mean fidelity of 98 per cent. Reconfiguring these gate sequences provides the flexibility to implement a variety of algorithms without altering the hardware. As examples, we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms with average success rates of 95 and 90 per cent, respectively. We also perform a coherent quantum Fourier transform on five trapped-ion qubits for phase estimation and period finding with average fidelities of 62 and 84 per cent, respectively. This small quantum computer can be scaled to larger numbers of qubits within a single register, and can be further expanded by connecting several such modules through ion shuttling or photonic quantum channels.
Demonstration of a small programmable quantum computer with atomic qubits
NASA Astrophysics Data System (ADS)
Debnath, S.; Linke, N. M.; Figgatt, C.; Landsman, K. A.; Wright, K.; Monroe, C.
2016-08-01
Quantum computers can solve certain problems more efficiently than any possible conventional computer. Small quantum algorithms have been demonstrated on multiple quantum computing platforms, many specifically tailored in hardware to implement a particular algorithm or execute a limited number of computational paths. Here we demonstrate a five-qubit trapped-ion quantum computer that can be programmed in software to implement arbitrary quantum algorithms by executing any sequence of universal quantum logic gates. We compile algorithms into a fully connected set of gate operations that are native to the hardware and have a mean fidelity of 98 per cent. Reconfiguring these gate sequences provides the flexibility to implement a variety of algorithms without altering the hardware. As examples, we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms with average success rates of 95 and 90 per cent, respectively. We also perform a coherent quantum Fourier transform on five trapped-ion qubits for phase estimation and period finding with average fidelities of 62 and 84 per cent, respectively. This small quantum computer can be scaled to larger numbers of qubits within a single register, and can be further expanded by connecting several such modules through ion shuttling or photonic quantum channels.
Evaluation of a segmentation algorithm designed for an FPGA implementation
NASA Astrophysics Data System (ADS)
Schwenk, Kurt; Schönermark, Maria; Huber, Felix
2013-10-01
The present work has to be seen in the context of real-time on-board image evaluation of optical satellite data. With on board image evaluation more useful data can be acquired, the time to get requested information can be decreased and new real-time applications are possible. Because of the relative high processing power in comparison to the low power consumption, Field Programmable Gate Array (FPGA) technology has been chosen as an adequate hardware platform for image processing tasks. One fundamental part for image evaluation is image segmentation. It is a basic tool to extract spatial image information which is very important for many applications such as object detection. Therefore a special segmentation algorithm using the advantages of FPGA technology has been developed. The aim of this work is the evaluation of this algorithm. Segmentation evaluation is a difficult task. The most common way for evaluating the performance of a segmentation method is still subjective evaluation, in which human experts determine the quality of a segmentation. This way is not in compliance with our needs. The evaluation process has to provide a reasonable quality assessment, should be objective, easy to interpret and simple to execute. To reach these requirements a so called Segmentation Accuracy Equality norm (SA EQ) was created, which compares the difference of two segmentation results. It can be shown that this norm is capable as a first quality measurement. Due to its objectivity and simplicity the algorithm has been tested on a specially chosen synthetic test model. In this work the most important results of the quality assessment will be presented.
FPGA implementation of vision algorithms for small autonomous robots
NASA Astrophysics Data System (ADS)
Anderson, J. D.; Lee, D. J.; Archibald, J. K.
2005-10-01
The use of on-board vision with small autonomous robots has been made possible by the advances in the field of Field Programmable Gate Array (FPGA) technology. By connecting a CMOS camera to an FPGA board, on-board vision has been used to reduce the computation time inherent in vision algorithms. The FPGA board allows the user to create custom hardware in a faster, safer, and more easily verifiable manner that decreases the computation time and allows the vision to be done in real-time. Real-time vision tasks for small autonomous robots include object tracking, obstacle detection and avoidance, and path planning. Competitions were created to demonstrate that our algorithms work with our small autonomous vehicles in dealing with these problems. These competitions include Mouse-Trapped-in-a-Box, where the robot has to detect the edges of a box that it is trapped in and move towards them without touching them; Obstacle Avoidance, where an obstacle is placed at any arbitrary point in front of the robot and the robot has to navigate itself around the obstacle; Canyon Following, where the robot has to move to the center of a canyon and follow the canyon walls trying to stay in the center; the Grand Challenge, where the robot had to navigate a hallway and return to its original position in a given amount of time; and Stereo Vision, where a separate robot had to catch tennis balls launched from an air powered cannon. Teams competed on each of these competitions that were designed for a graduate-level robotic vision class, and each team had to develop their own algorithm and hardware components. This paper discusses one team's approach to each of these problems.
Motion Cueing Algorithm Development: New Motion Cueing Program Implementation and Tuning
NASA Technical Reports Server (NTRS)
Houck, Jacob A. (Technical Monitor); Telban, Robert J.; Cardullo, Frank M.; Kelly, Lon C.
2005-01-01
A computer program has been developed for the purpose of driving the NASA Langley Research Center Visual Motion Simulator (VMS). This program includes two new motion cueing algorithms, the optimal algorithm and the nonlinear algorithm. A general description of the program is given along with a description and flowcharts for each cueing algorithm, and also descriptions and flowcharts for subroutines used with the algorithms. Common block variable listings and a program listing are also provided. The new cueing algorithms have a nonlinear gain algorithm implemented that scales each aircraft degree-of-freedom input with a third-order polynomial. A description of the nonlinear gain algorithm is given along with past tuning experience and procedures for tuning the gain coefficient sets for each degree-of-freedom to produce the desired piloted performance. This algorithm tuning will be needed when the nonlinear motion cueing algorithm is implemented on a new motion system in the Cockpit Motion Facility (CMF) at the NASA Langley Research Center.
A Linac Simulation Code for Macro-Particles Tracking and Steering Algorithm Implementation
sun, yipeng
2012-05-03
In this paper, a linac simulation code written in Fortran90 is presented and several simulation examples are given. This code is optimized to implement linac alignment and steering algorithms, and evaluate the accelerator errors such as RF phase and acceleration gradient, quadrupole and BPM misalignment. It can track a single particle or a bunch of particles through normal linear accelerator elements such as quadrupole, RF cavity, dipole corrector and drift space. One-to-one steering algorithm and a global alignment (steering) algorithm are implemented in this code.
Implementation of perceptual aspects in a face recognition algorithm
NASA Astrophysics Data System (ADS)
Crenna, F.; Zappa, E.; Bovio, L.; Testa, R.; Gasparetto, M.; Rossi, G. B.
2013-09-01
Automatic face recognition is a biometric technique particularly appreciated in security applications. In fact face recognition presents the opportunity to operate at a low invasive level without the collaboration of the subjects under tests, with face images gathered either from surveillance systems or from specific cameras located in strategic points. The automatic recognition algorithms perform a measurement, on the face images, of a set of specific characteristics of the subject and provide a recognition decision based on the measurement results. Unfortunately several quantities may influence the measurement of the face geometry such as its orientation, the lighting conditions, the expression and so on, affecting the recognition rate. On the other hand human recognition of face is a very robust process far less influenced by the surrounding conditions. For this reason it may be interesting to insert perceptual aspects in an automatic facial-based recognition algorithm to improve its robustness. This paper presents a first study in this direction investigating the correlation between the results of a perception experiment and the facial geometry, estimated by means of the position of a set of repere points.
Synchronous multi-microprocessor system for implementing digital signal processing algorithms
Barnwell, T.P. III; Hodges, C.J.M.
1982-01-01
This paper discusses the details of a multi-microprocessor system design as a research facility for studying multiprocessor implementation of digital signal processing algorithms. The overall system, which consists of a control microprocessor, eight satellite microprocessors, a control minicomputer, and extensive distributed software, has proven to be an effect tool in the study of multiprocessor implementations. 5 references.
Implementation and evaluation of ILLIAC 4 algorithms for multispectral image processing
NASA Technical Reports Server (NTRS)
Swain, P. H.
1974-01-01
Data concerning a multidisciplinary and multi-organizational effort to implement multispectral data analysis algorithms on a revolutionary computer, the Illiac 4, are reported. The effectiveness and efficiency of implementing the digital multispectral data analysis techniques for producing useful land use classifications from satellite collected data were demonstrated.
Casson, Alexander J; Rodriguez-Villegas, Esther
2008-01-01
This paper quantifies the performance difference between custom and generic hardware algorithm implementations, illustrating the challenges that are involved in Body Area Network signal processing implementations. The potential use of analogue signal processing to improve the power performance is also demonstrated.
On Distribution Reduction and Algorithm Implementation in Inconsistent Ordered Information Systems
Zhang, Yanqin
2014-01-01
As one part of our work in ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties on distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in dominance relations based information systems. Matrix algorithm for distribution reduction acquisition is stepped. And program is implemented by the algorithm. The approach provides an effective tool for the theoretical research and the applications for ordered information systems in practices. For more detailed and valid illustrations, cases are employed to explain and verify the algorithm and the program which shows the effectiveness of the algorithm in complicated information systems. PMID:25258721
NASA Technical Reports Server (NTRS)
Delaat, J. C.
1984-01-01
An advanced, sensor failure detection, isolation, and accomodation algorithm has been developed by NASA for the F100 turbofan engine. The algorithm takes advantage of the analytical redundancy of the sensors to improve the reliability of the sensor set. The method requires the controls computer, to determine when a sensor failure has occurred without the help of redundant hardware sensors in the control system. The controls computer provides an estimate of the correct value of the output of the failed sensor. The algorithm has been programmed in FORTRAN using a real-time microprocessor-based controls computer. A detailed description of the algorithm and its implementation on a microprocessor is given.
Implementation and performance of a domain decomposition algorithm in Sisal
DeBoni, T.; Feo, J.; Rodrigue, G.; Muller, J.
1993-09-23
Sisal is a general-purpose functional language that hides the complexity of parallel processing, expedites parallel program development, and guarantees determinacy. Parallelism and management of concurrent tasks are realized automatically by the compiler and runtime system. Spatial domain decomposition is a widely-used method that focuses computational resources on the most active, or important, areas of a domain. Many complex programming issues are introduced in paralleling this method including: dynamic spatial refinement, dynamic grid partitioning and fusion, task distribution, data distribution, and load balancing. In this paper, we describe a spatial domain decomposition algorithm programmed in Sisal. We explain the compilation process, and present the execution performance of the resultant code on two different multiprocessor systems: a multiprocessor vector supercomputer, and cache-coherent scalar multiprocessor.
Implementation of jump-diffusion algorithms for understanding FLIR scenes
NASA Astrophysics Data System (ADS)
Lanterman, Aaron D.; Miller, Michael I.; Snyder, Donald L.
1995-07-01
Our pattern theoretic approach to the automated understanding of forward-looking infrared (FLIR) images brings the traditionally separate endeavors of detection, tracking, and recognition together into a unified jump-diffusion process. New objects are detected and object types are recognized through discrete jump moves. Between jumps, the location and orientation of objects are estimated via continuous diffusions. An hypothesized scene, simulated from the emissive characteristics of the hypothesized scene elements, is compared with the collected data by a likelihood function based on sensor statistics. This likelihood is combined with a prior distribution defined over the set of possible scenes to form a posterior distribution. The jump-diffusion process empirically generates the posterior distribution. Both the diffusion and jump operations involve the simulation of a scene produced by a hypothesized configuration. Scene simulation is most effectively accomplished by pipelined rendering engines such as silicon graphics. We demonstrate the execution of our algorithm on a silicon graphics onyx/reality engine.
Farber, R.M.; Lapedes, A.S.; Rico-Martinez, R.; Kevrekidis, I.G.
1993-12-31
Time-delay mappings constructed using neural networks have proven successful in performing nonlinear system identification; however, because of their discrete nature, their use in bifurcation analysis of continuous-time systems is limited. This shortcoming can be avoided by embedding the neural networks in a training algorithm that mimics a numerical integrator. Both explicit and implicit integrators can be used. The former case is based on repeated evaluations of the network in a feedforward implementation; the latter relies on a recurrent network implementation. Here the algorithms and their implementation on parallel machines (SIMD and MIMD architectures) are discussed.
Farber, R.M.; Lapedes, A.S. ); Rico-Martinez, R.; Kevrekidis, I.G. . Dept. of Chemical Engineering)
1993-01-01
Time-delay mappings constructed using neural networks have proven successful performing nonlinear system identification; however, because of their discrete nature, their use in bifurcation analysis of continuous-tune systems is limited. This shortcoming can be avoided by embedding the neural networks in a training algorithm that mimics a numerical integrator. Both explicit and implicit integrators can be used. The former case is based on repeated evaluations of the network in a feedforward implementation; the latter relies on a recurrent network implementation. Here the algorithms and their implementation on parallel machines (SIMD and MIMD architectures) are discussed.
Farber, R.M.; Lapedes, A.S.; Rico-Martinez, R.; Kevrekidis, I.G.
1993-06-01
Time-delay mappings constructed using neural networks have proven successful performing nonlinear system identification; however, because of their discrete nature, their use in bifurcation analysis of continuous-tune systems is limited. This shortcoming can be avoided by embedding the neural networks in a training algorithm that mimics a numerical integrator. Both explicit and implicit integrators can be used. The former case is based on repeated evaluations of the network in a feedforward implementation; the latter relies on a recurrent network implementation. Here the algorithms and their implementation on parallel machines (SIMD and MIMD architectures) are discussed.
NASA Astrophysics Data System (ADS)
Jin, Minglei; Jin, Weiqi; Li, Yiyang; Li, Shuo
2015-08-01
In this paper, we propose a novel scene-based non-uniformity correction algorithm for infrared image processing-temporal high-pass non-uniformity correction algorithm based on grayscale mapping (THP and GM). The main sources of non-uniformity are: (1) detector fabrication inaccuracies; (2) non-linearity and variations in the read-out electronics and (3) optical path effects. The non-uniformity will be reduced by non-uniformity correction (NUC) algorithms. The NUC algorithms are often divided into calibration-based non-uniformity correction (CBNUC) algorithms and scene-based non-uniformity correction (SBNUC) algorithms. As non-uniformity drifts temporally, CBNUC algorithms must be repeated by inserting a uniform radiation source which SBNUC algorithms do not need into the view, so the SBNUC algorithm becomes an essential part of infrared imaging system. The SBNUC algorithms' poor robustness often leads two defects: artifacts and over-correction, meanwhile due to complicated calculation process and large storage consumption, hardware implementation of the SBNUC algorithms is difficult, especially in Field Programmable Gate Array (FPGA) platform. The THP and GM algorithm proposed in this paper can eliminate the non-uniformity without causing defects. The hardware implementation of the algorithm only based on FPGA has two advantages: (1) low resources consumption, and (2) small hardware delay: less than 20 lines, it can be transplanted to a variety of infrared detectors equipped with FPGA image processing module, it can reduce the stripe non-uniformity and the ripple non-uniformity.
Implementation of a partitioned algorithm for simulation of large CSI problems
NASA Technical Reports Server (NTRS)
Alvin, Kenneth F.; Park, K. C.
1991-01-01
The implementation of a partitioned numerical algorithm for determining the dynamic response of coupled structure/controller/estimator finite-dimensional systems is reviewed. The partitioned approach leads to a set of coupled first and second-order linear differential equations which are numerically integrated with extrapolation and implicit step methods. The present software implementation, ACSIS, utilizes parallel processing techniques at various levels to optimize performance on a shared-memory concurrent/vector processing system. A general procedure for the design of controller and filter gains is also implemented, which utilizes the vibration characteristics of the structure to be solved. Also presented are: example problems; a user's guide to the software; the procedures and algorithm scripts; a stability analysis for the algorithm; and the source code for the parallel implementation.
NASA Astrophysics Data System (ADS)
Xiong, Yuting; Gu, Zhaojun; Jin, Wei
In this paper, a novel practical algorithmic solution for automatic discovering the physical topology of switched Ethernet was proposed. Our algorithm collects standard SNMP MIB information that is widely supported in modern IP networks and then builds the physical topology of the active network. We described the relative definitions, system model and proved the correctness of the algorithm. Practically, the algorithm was implemented in our visualization network monitoring system. We also presented the main steps of the algorithm, core codes and running results on the lab network. The experimental results clearly validate our approach, demonstrating that our algorithm is simple and effective which can discover the accurate up-to-date physical network topology.
NASA Astrophysics Data System (ADS)
Xu, Dexiang
This dissertation presents a novel method of designing finite word length Finite Impulse Response (FIR) digital filters using a Real Parameter Parallel Genetic Algorithm (RPPGA). This algorithm is derived from basic Genetic Algorithms which are inspired by natural genetics principles. Both experimental results and theoretical studies in this work reveal that the RPPGA is a suitable method for determining the optimal or near optimal discrete coefficients of finite word length FIR digital filters. Performance of RPPGA is evaluated by comparing specifications of filters designed by other methods with filters designed by RPPGA. The parallel and spatial structures of the algorithm result in faster and more robust optimization than basic genetic algorithms. A filter designed by RPPGA is implemented in hardware to attenuate high frequency noise in a data acquisition system for collecting seismic signals. These studies may lead to more applications of the Real Parameter Parallel Genetic Algorithms in Electrical Engineering.
Image segmentation with genetic algorithms: a formulation and implementation
NASA Astrophysics Data System (ADS)
Seetharaman, Gunasekaran; Narasimhan, Amruthur; Sathe, Anand; Storc, Lisa
1991-10-01
Image segmentation is an important step in any computer vision system. Segmentation refers to the partitioning of the image plane into several regions, such that each region corresponds to a logical entity present in the scene. The problem is inherently NP, and the theory on the existence and uniqueness of the ideal segmentation is not yet established. Several methods have been proposed in literature for image segmentation. With the exception of the state-space approach to segmentation, other methods lack generality. The state-space approach, however, amounts to searching for the solution in a large search space of 22n(2) possibilities for a n X n image. In this paper, a classic approach based on state-space techniques for segmentation due to Brice and Fennema is reformulated using genetic algorithms. The state space representation of a partially segmented image lends itself to binary strings, in which the dominant substrings are easily explained in terms of chromosomes. Also the operations such as crossover and mutations are easily abstracted. In particular, when multiple images are segmented from an image sequence, fusion of constraints from one to the other becomes clear under this formulation.
Modular architecture for high performance implementation of the FFT algorithm
Spiecha, K. ); Jarocki, R. )
1990-12-01
This paper presents a new VLSI-oriented architecture to compute discrete Fourier transform. It consists of a homogeneous structure of processing elements. The structure has a performance equal to 1/{ital t} transforms per second, where {ital t} is the time needed for the execution of a single butterfly computation or the time needed for the collection of a complete vector of samples, whichever occurs to be longer. Although the system is not optimal (it achieves {ital O(N}{sup 3} log{sup 4} {ital N)} area time{sup 2} performance), the architecture is modular and makes it possible to design a system which performs FFT of any size without any extra circuitry. Moreover, the system can provide a built-in self-test and self-restructuring. The system consists of only one type of integrated circuit, its structure being irrespective of the transform size, which considerably reduces the cost of implementation.
Implementing embedded artificial intelligence rules within algorithmic programming languages
NASA Technical Reports Server (NTRS)
Feyock, Stefan
1988-01-01
Most integrations of artificial intelligence (AI) capabilities with non-AI (usually FORTRAN-based) application programs require the latter to execute separately to run as a subprogram or, at best, as a coroutine, of the AI system. In many cases, this organization is unacceptable; instead, the requirement is for an AI facility that runs in embedded mode; i.e., is called as subprogram by the application program. The design and implementation of a Prolog-based AI capability that can be invoked in embedded mode are described. The significance of this system is twofold: Provision of Prolog-based symbol-manipulation and deduction facilities makes a powerful symbolic reasoning mechanism available to applications programs written in non-AI languages. The power of the deductive and non-procedural descriptive capabilities of Prolog, which allow the user to describe the problem to be solved, rather than the solution, is to a large extent vitiated by the absence of the standard control structures provided by other languages. Embedding invocations of Prolog rule bases in programs written in non-AI languages makes it possible to put Prolog calls inside DO loops and similar control constructs. The resulting merger of non-AI and AI languages thus results in a symbiotic system in which the advantages of both programming systems are retained, and their deficiencies largely remedied.
Quantum computation: algorithms and implementation in quantum dot devices
NASA Astrophysics Data System (ADS)
Gamble, John King
In this thesis, we explore several aspects of both the software and hardware of quantum computation. First, we examine the computational power of multi-particle quantum random walks in terms of distinguishing mathematical graphs. We study both interacting and non-interacting multi-particle walks on strongly regular graphs, proving some limitations on distinguishing powers and presenting extensive numerical evidence indicative of interactions providing more distinguishing power. We then study the recently proposed adiabatic quantum algorithm for Google PageRank, and show that it exhibits power-law scaling for realistic WWW-like graphs. Turning to hardware, we next analyze the thermal physics of two nearby 2D electron gas (2DEG), and show that an analogue of the Coulomb drag effect exists for heat transfer. In some distance and temperature, this heat transfer is more significant than phonon dissipation channels. After that, we study the dephasing of two-electron states in a single silicon quantum dot. Specifically, we consider dephasing due to the electron-phonon coupling and charge noise, separately treating orbital and valley excitations. In an ideal system, dephasing due to charge noise is strongly suppressed due to a vanishing dipole moment. However, introduction of disorder or anharmonicity leads to large effective dipole moments, and hence possibly strong dephasing. Building on this work, we next consider more realistic systems, including structural disorder systems. We present experiment and theory, which demonstrate energy levels that vary with quantum dot translation, implying a structurally disordered system. Finally, we turn to the issues of valley mixing and valley-orbit hybridization, which occurs due to atomic-scale disorder at quantum well interfaces. We develop a new theoretical approach to study these effects, which we name the disorder-expansion technique. We demonstrate that this method successfully reproduces atomistic tight-binding techniques
NASA Astrophysics Data System (ADS)
Trigueros-Espinosa, Blas; Vélez-Reyes, Miguel; Santiago-Santiago, Nayda G.; Rosario-Torres, Samuel
2011-06-01
Hyperspectral sensors can collect hundreds of images taken at different narrow and contiguously spaced spectral bands. This high-resolution spectral information can be used to identify materials and objects within the field of view of the sensor by their spectral signature, but this process may be computationally intensive due to the large data sizes generated by the hyperspectral sensors, typically hundreds of megabytes. This can be an important limitation for some applications where the detection process must be performed in real time (surveillance, explosive detection, etc.). In this work, we developed a parallel implementation of three state-ofthe- art target detection algorithms (RX algorithm, matched filter and adaptive matched subspace detector) using a graphics processing unit (GPU) based on the NVIDIA® CUDA™ architecture. In addition, a multi-core CPUbased implementation of each algorithm was developed to be used as a baseline for the speedups estimation. We evaluated the performance of the GPU-based implementations using an NVIDIA ® Tesla® C1060 GPU card, and the detection accuracy of the implemented algorithms was evaluated using a set of phantom images simulating traces of different materials on clothing. We achieved a maximum speedup in the GPU implementations of around 20x over a multicore CPU-based implementation, which suggests that applications for real-time detection of targets in HSI can greatly benefit from the performance of GPUs as processing hardware.
Implementing EM and Viterbi algorithms for Hidden Markov Model in linear memory
Churbanov, Alexander; Winters-Hilt, Stephen
2008-01-01
Background The Baum-Welch learning procedure for Hidden Markov Models (HMMs) provides a powerful tool for tailoring HMM topologies to data for use in knowledge discovery and clustering. A linear memory procedure recently proposed by Miklós, I. and Meyer, I.M. describes a memory sparse version of the Baum-Welch algorithm with modifications to the original probabilistic table topologies to make memory use independent of sequence length (and linearly dependent on state number). The original description of the technique has some errors that we amend. We then compare the corrected implementation on a variety of data sets with conventional and checkpointing implementations. Results We provide a correct recurrence relation for the emission parameter estimate and extend it to parameter estimates of the Normal distribution. To accelerate estimation of the prior state probabilities, and decrease memory use, we reverse the originally proposed forward sweep. We describe different scaling strategies necessary in all real implementations of the algorithm to prevent underflow. In this paper we also describe our approach to a linear memory implementation of the Viterbi decoding algorithm (with linearity in the sequence length, while memory use is approximately independent of state number). We demonstrate the use of the linear memory implementation on an extended Duration Hidden Markov Model (DHMM) and on an HMM with a spike detection topology. Comparing the various implementations of the Baum-Welch procedure we find that the checkpointing algorithm produces the best overall tradeoff between memory use and speed. In cases where sequence length is very large (for Baum-Welch), or state number is very large (for Viterbi), the linear memory methods outlined may offer some utility. Conclusion Our performance-optimized Java implementations of Baum-Welch algorithm are available at . The described method and implementations will aid sequence alignment, gene structure prediction, HMM
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 1
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1990-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. We present an implementation of a look-ahead version of the Lanczos algorithm which overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and is not restricted to steps of length 2, as earlier implementations are. Also, our implementation has the feature that it requires roughly the same number of inner products as the standard Lanczos process without look-ahead.
NASA Astrophysics Data System (ADS)
Kershaw, James; Hamlyn, Garry
1994-06-01
This paper describes the implementation of algorithms for automatically extracting and matching features in stereo pairs of images. The implementation has been designed to be as modular as possible to allow different algorithms for each stage in the matching process to be combined in the most appropriate manner for each particular problem. The modules have been implemented in the AVS environment but are designed to be portable to any platform. This work has been undertaken as part of task DEF 93/1 63 'Intelligence Analysis of Imagery', and forms part of ITD's contribution to the Visual Processing research program in the Centre for Sensor System and Information Processing. A major aim of both the task and the research program is to produce software to assist intelligence analysts in extracting three dimensional shape from imagery: the algorithms and software described here will form the first part of a module for automatically extracting depth information from stereo image pairs.
Star-field identification algorithm. [for implementation on CCD-based imaging camera
NASA Technical Reports Server (NTRS)
Scholl, M. S.
1993-01-01
A description of a new star-field identification algorithm that is suitable for implementation on CCD-based imaging cameras is presented. The minimum identifiable star pattern element consists of an oriented star triplet defined by three stars, their celestial coordinates, and their visual magnitudes. The algorithm incorporates tolerance to faulty input data, errors in the reference catalog, and instrument-induced systematic errors.
NASA Astrophysics Data System (ADS)
Tsukahara, Hiroshi; Iwano, Kaoru; Mitsumata, Chiharu; Ishikawa, Tadashi; Ono, Kanta
2016-10-01
We implement low communication frequency three-dimensional fast Fourier transform algorithms on micromagnetics simulator for calculations of a magnetostatic field which occupies a significant portion of large-scale micromagnetics simulation. This fast Fourier transform algorithm reduces the frequency of all-to-all communications from six to two times. Simulation times with our simulator show high scalability in parallelization, even if we perform the micromagnetics simulation using 32 768 physical computing cores. This low communication frequency fast Fourier transform algorithm enables world largest class micromagnetics simulations to be carried out with over one billion calculation cells.
Lu Dawei; Peng Xinhua; Du Jiangfeng; Zhu Jing; Zou Ping; Yu Yihua; Zhang Shanmin; Chen Qun
2010-02-15
An important quantum search algorithm based on the quantum random walk performs an oracle search on a database of N items with O({radical}(phN)) calls, yielding a speedup similar to the Grover quantum search algorithm. The algorithm was implemented on a quantum information processor of three-qubit liquid-crystal nuclear magnetic resonance (NMR) in the case of finding 1 out of 4, and the diagonal elements' tomography of all the final density matrices was completed with comprehensible one-dimensional NMR spectra. The experimental results agree well with the theoretical predictions.
Subspace scheduling and parallel implementation of non-systolic regular iterative algorithms
Roychowdhury, V.P.; Kailath, T.
1989-01-01
The study of Regular Iterative Algorithms (RIAs) was introduced in a seminal paper by Karp, Miller, and Winograd in 1967. In more recent years, the study of systolic architectures has led to a renewed interest in this class of algorithms, and the class of algorithms implementable on systolic arrays (as commonly understood) has been identified as a precise subclass of RIAs include matrix pivoting algorithms and certain forms of numerically stable two-dimensional filtering algorithms. It has been shown that the so-called hyperplanar scheduling for systolic algorithms can no longer be used to schedule and implement non-systolic RIAs. Based on the analysis of a so-called computability tree we generalize the concept of hyperplanar scheduling and determine linear subspaces in the index space of a given RIA such that all variables lying on the same subspace can be scheduled at the same time. This subspace scheduling technique is shown to be asymptotically optimal, and formal procedures are developed for designing processor arrays that will be compatible with our scheduling schemes. Explicit formulas for the schedule of a given variable are determined whenever possible; subspace scheduling is also applied to obtain lower dimensional processor arrays for systolic algorithms.
Implementation of C* Boundary Conditions in the Hybrid Monte Carlo Algorithm
NASA Astrophysics Data System (ADS)
Carmona, José Manuel; D'elia, Massimo; Di Giacomo, Adriano; Lucini, Biagio
In the study of QCD dynamics, C* boundary conditions are physically relevant in certain cases. In this paper, we study the implementation of these boundary conditions in the lattice formulation of full QCD with staggered fermions. In particular, we show that the usual even-odd partition trick to avoid the redoubling of the fermion matrix is still valid in this case. We give an explicit implementation of these boundary conditions for the Hybrid Monte Carlo algorithm.
Electromagnetic Interactions GEneRalized (EIGER): Algorithm abstraction and HPC implementation
Sharpe, R.M.; Grant, J.B.; Champagne, N.J.; Wilton, D.R.; Jackson, D.R.; Johnson, W.A.; Jorgensen, R.E.; Rockway, J.W.; Manry, C.W.
1998-06-01
Modern software development methods combined with key generalizations of standard computational algorithms enable the development of a new class of electromagnetic modeling tools. This paper describes current and anticipated capabilities of a frequency domain modeling code, EIGER, which has an extremely wide range of applicability. In addition, software implementation methods and high performance computing issues are discussed.
Electromagnetic interactions GEneRalized (EIGER): algorithm abstraction and HPC implementation
Sharpe, R.M., LLNL
1998-04-21
Modern software development methods combined with key generalizations of standard computational algorithms enable the development of a new class of electromagnetic modeling tools. This paper describes current and anticipated capabilities of a frequency domain modeling code, EIGER, which has an extremely wide range of applicability. In addition, software implementation methods and high performance computing issues are discussed.
Design and Implementation of Numerical Linear Algebra Algorithms on Fixed Point DSPs
NASA Astrophysics Data System (ADS)
Nikolić, Zoran; Nguyen, Ha Thai; Frantz, Gene
2007-12-01
Numerical linear algebra algorithms use the inherent elegance of matrix formulations and are usually implemented using C/C++ floating point representation. The system implementation is faced with practical constraints because these algorithms usually need to run in real time on fixed point digital signal processors (DSPs) to reduce total hardware costs. Converting the simulation model to fixed point arithmetic and then porting it to a target DSP device is a difficult and time-consuming process. In this paper, we analyze the conversion process. We transformed selected linear algebra algorithms from floating point to fixed point arithmetic, and compared real-time requirements and performance between the fixed point DSP and floating point DSP algorithm implementations. We also introduce an advanced code optimization and an implementation by DSP-specific, fixed point C code generation. By using the techniques described in the paper, speed can be increased by a factor of up to 10 compared to floating point emulation on fixed point hardware.
An implementation of a data-transmission pipelining algorithm on Imote2 platforms
NASA Astrophysics Data System (ADS)
Li, Xu; Dorvash, Siavash; Cheng, Liang; Pakzad, Shamim
2011-04-01
Over the past several years, wireless network systems and sensing technologies have been developed significantly. This has resulted in the broad application of wireless sensor networks (WSNs) in many engineering fields and in particular structural health monitoring (SHM). The movement of traditional SHM toward the new generation of SHM, which utilizes WSNs, relies on the advantages of this new approach such as relatively low costs, ease of implementation and the capability of onboard data processing and management. In the particular case of long span bridge monitoring, a WSN should be capable of transmitting commands and measurement data over long network geometry in a reliable manner. While using single-hop data transmission in such geometry requires a long radio range and consequently a high level of power supply, multi-hop communication may offer an effective and reliable way for data transmissions across the network. Using a multi-hop communication protocol, the network relays data from a remote node to the base station via intermediary nodes. We have proposed a data-transmission pipelining algorithm to enable an effective use of the available bandwidth and minimize the energy consumption and the delay performance by the multi-hop communication protocol. This paper focuses on the implementation aspect of the pipelining algorithm on Imote2 platforms for SHM applications, describes its interaction with underlying routing protocols, and presents the solutions to various implementation issues of the proposed pipelining algorithm. Finally, the performance of the algorithm is evaluated based on the results of an experimental implementation.
Implementation and evaluation of the new wind algorithm in NASA's 50 MHz doppler radar wind profiler
NASA Technical Reports Server (NTRS)
Taylor, Gregory E.; Manobianco, John T.; Schumann, Robin S.; Wheeler, Mark M.; Yersavich, Ann M.
1993-01-01
The purpose of this report is to document the Applied Meteorology Unit's implementation and evaluation of the wind algorithm developed by Marshall Space Flight Center (MSFC) on the data analysis processor (DAP) of NASA's 50 MHz doppler radar wind profiler (DRWP). The report also includes a summary of the 50 MHz DRWP characteristics and performance and a proposed concept of operations for the DRWP.
Towards the Implementation of an Autonomous Camera Algorithm on the da Vinci Platform.
Eslamian, Shahab; Reisner, Luke A; King, Brady W; Pandya, Abhilash K
2016-01-01
Camera positioning is critical for all telerobotic surgical systems. Inadequate visualization of the remote site can lead to serious errors that can jeopardize the patient. An autonomous camera algorithm has been developed on a medical robot (da Vinci) simulator. It is found to be robust in key scenarios of operation. This system behaves with predictable and expected actions for the camera arm with respect to the tool positions. The implementation of this system is described herein. The simulation closely models the methodology needed to implement autonomous camera control in a real hardware system. The camera control algorithm follows three rules: (1) keep the view centered on the tools, (2) keep the zoom level optimized such that the tools never leave the field of view, and (3) avoid unnecessary movement of the camera that may distract/disorient the surgeon. Our future work will apply this algorithm to the real da Vinci hardware.
Towards the Implementation of an Autonomous Camera Algorithm on the da Vinci Platform.
Eslamian, Shahab; Reisner, Luke A; King, Brady W; Pandya, Abhilash K
2016-01-01
Camera positioning is critical for all telerobotic surgical systems. Inadequate visualization of the remote site can lead to serious errors that can jeopardize the patient. An autonomous camera algorithm has been developed on a medical robot (da Vinci) simulator. It is found to be robust in key scenarios of operation. This system behaves with predictable and expected actions for the camera arm with respect to the tool positions. The implementation of this system is described herein. The simulation closely models the methodology needed to implement autonomous camera control in a real hardware system. The camera control algorithm follows three rules: (1) keep the view centered on the tools, (2) keep the zoom level optimized such that the tools never leave the field of view, and (3) avoid unnecessary movement of the camera that may distract/disorient the surgeon. Our future work will apply this algorithm to the real da Vinci hardware. PMID:27046563
Evaluation of mass spectral library search algorithms implemented in commercial software.
Samokhin, Andrey; Sotnezova, Ksenia; Lashin, Vitaly; Revelsky, Igor
2015-06-01
Performance of several library search algorithms (against EI mass spectral databases) implemented in commercial software products ( acd/specdb, chemstation, gc/ms solution and ms search) was estimated. Test set contained 1000 mass spectra, which were randomly selected from NIST'08 (RepLib) mass spectral database. It was shown that composite (also known as identity) algorithm implemented in ms search (NIST) software gives statistically the best results: the correct compound occupied the first position in the list of possible candidates in 81% of cases; the correct compound was within the list of top ten candidates in 98% of cases. It was found that use of presearch option can lead to rejection of the correct answer from the list of possible candidates (therefore presearch option should not be used, if possible). Overall performance of library search algorithms was estimated using receiver operating characteristic curves.
Design and Implementation of Hybrid CORDIC Algorithm Based on Phase Rotation Estimation for NCO
Zhang, Chaozhu; Han, Jinan; Li, Ke
2014-01-01
The numerical controlled oscillator has wide application in radar, digital receiver, and software radio system. Firstly, this paper introduces the traditional CORDIC algorithm. Then in order to improve computing speed and save resources, this paper proposes a kind of hybrid CORDIC algorithm based on phase rotation estimation applied in numerical controlled oscillator (NCO). Through estimating the direction of part phase rotation, the algorithm reduces part phase rotation and add-subtract unit, so that it decreases delay. Furthermore, the paper simulates and implements the numerical controlled oscillator by Quartus II software and Modelsim software. Finally, simulation results indicate that the improvement over traditional CORDIC algorithm is achieved in terms of ease of computation, resource utilization, and computing speed/delay while maintaining the precision. It is suitable for high speed and precision digital modulation and demodulation. PMID:25110750
Design and Implementation of Broadcast Algorithms for Extreme-Scale Systems
Shamis, Pavel; Graham, Richard L; Gorentla Venkata, Manjunath; Ladd, Joshua
2011-01-01
The scalability and performance of collective communication operations limit the scalability and performance of many scientific applications. This paper presents two new blocking and nonblocking Broadcast algorithms for communicators with arbitrary communication topology, and studies their performance. These algorithms benefit from increased concurrency and a reduced memory footprint, making them suitable for use on large-scale systems. Measuring small, medium, and large data Broadcasts on a Cray-XT5, using 24,576 MPI processes, the Cheetah algorithms outperform the native MPI on that system by 51%, 69%, and 9%, respectively, at the same process count. These results demonstrate an algorithmic approach to the implementation of the important class of collective communications, which is high performing, scalable, and also uses resources in a scalable manner.
NASA Astrophysics Data System (ADS)
Dai, Liyi
2016-05-01
Stochastic optimization is a fundamental problem that finds applications in many areas including biological and cognitive sciences. The classical stochastic approximation algorithm for iterative stochastic optimization requires gradient information of the sample object function that is typically difficult to obtain in practice. Recently there has been renewed interests in derivative free approaches to stochastic optimization. In this paper, we examine the rates of convergence for the Kiefer-Wolfowitz algorithm and the mirror descent algorithm, by approximating gradient using finite differences generated through common random numbers. It is shown that the convergence of these algorithms can be accelerated by controlling the implementation of the finite differences. Particularly, it is shown that the rate can be increased to n-2/5 in general and to n-1/2, the best possible rate of stochastic approximation, in Monte Carlo optimization for a broad class of problems, in the iteration number n.
A complete implementation of the conjugate gradient algorithm on a reconfigurable supercomputer
Dubois, David H; Dubois, Andrew J; Connor, Carolyn M; Boorman, Thomas M; Poole, Stephen W
2008-01-01
The conjugate gradient is a prominent iterative method for solving systems of sparse linear equations. Large-scale scientific applications often utilize a conjugate gradient solver at their computational core. In this paper we present a field programmable gate array (FPGA) based implementation of a double precision, non-preconditioned, conjugate gradient solver for fmite-element or finite-difference methods. OUf work utilizes the SRC Computers, Inc. MAPStation hardware platform along with the 'Carte' software programming environment to ease the programming workload when working with the hybrid (CPUIFPGA) environment. The implementation is designed to handle large sparse matrices of up to order N x N where N <= 116,394, with up to 7 non-zero, 64-bit elements per sparse row. This implementation utilizes an optimized sparse matrix-vector multiply operation which is critical for obtaining high performance. Direct parallel implementations of loop unrolling and loop fusion are utilized to extract performance from the various vector/matrix operations. Rather than utilize the FPGA devices as function off-load accelerators, our implementation uses the FPGAs to implement the core conjugate gradient algorithm. Measured run-time performance data is presented comparing the FPGA implementation to a software-only version showing that the FPGA can outperform processors running up to 30x the clock rate. In conclusion we take a look at the new SRC-7 system and estimate the performance of this algorithm on that architecture.
On recursive least-squares filtering algorithms and implementations. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Hsieh, Shih-Fu
1990-01-01
In many real-time signal processing applications, fast and numerically stable algorithms for solving least-squares problems are necessary and important. In particular, under non-stationary conditions, these algorithms must be able to adapt themselves to reflect the changes in the system and take appropriate adjustments to achieve optimum performances. Among existing algorithms, the QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective for adaptive signal processing. In order to increase the speed of processing and achieve high throughput rate, many algorithms are being vectorized and/or pipelined to facilitate high degrees of parallelism. A time-recursive formulation of RLS filtering employing block QRD will be considered first. Several methods, including a new non-continuous windowing scheme based on selectively rejecting contaminated data, were investigated for adaptive processing. Based on systolic triarrays, many other forms of systolic arrays are shown to be capable of implementing different algorithms. Various updating and downdating systolic algorithms and architectures for RLS filtering are examined and compared in details, which include Householder reflector, Gram-Schmidt procedure, and Givens rotation. A unified approach encompassing existing square-root-free algorithms is also proposed. For the sinusoidal spectrum estimation problem, a judicious method of separating the noise from the signal is of great interest. Various truncated QR methods are proposed for this purpose and compared to the truncated SVD method. Computer simulations provided for detailed comparisons show the effectiveness of these methods. This thesis deals with fundamental issues of numerical stability, computational efficiency, adaptivity, and VLSI implementation for the RLS filtering problems. In all, various new and modified algorithms and architectures are proposed and analyzed; the significance of any of the new method depends
Roche-Lima, Abiel; Thulasiram, Ruppa K.
2016-01-01
Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.
Realization of a scalable coherent quantum Fourier transform
NASA Astrophysics Data System (ADS)
Debnath, Shantanu; Linke, Norbert; Figgatt, Caroline; Landsman, Kevin; Wright, Ken; Monroe, Chris
2016-05-01
The exponential speed-up in some quantum algorithms is a direct result of parallel function-evaluation paths that interfere through a quantum Fourier transform (QFT). We report the implementation of a fully coherent QFT on five trapped Yb+ atomic qubits using sequences of fundamental quantum logic gates. These modular gates can be used to program arbitrary sequences nearly independent of system size and distance between qubits. We use this capability to first perform a Deutsch-Jozsa algorithm where several instances of three-qubit balanced and constant functions are implemented and then examined using single qubit QFTs. Secondly, we apply a fully coherent five-qubit QFT as a part of a quantum phase estimation protocol. Here, the QFT operates on a five-qubit superposition state with a particular phase modulation of its coefficients and directly produces the corresponding phase to five-bit precision. Finally, we examine the performance of the QFT in the period finding problem in the context of Shor's factorization algorithm. This work is supported by the ARO with funding from the IARPA MQCO program and the AFOSR MURI on Quantum Measurement and Verification.
Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarría-Miranda, Daniel
2009-05-29
We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
A dual-processor multi-frequency implementation of the FINDS algorithm
NASA Technical Reports Server (NTRS)
Godiwala, Pankaj M.; Caglayan, Alper K.
1987-01-01
This report presents a parallel processing implementation of the FINDS (Fault Inferring Nonlinear Detection System) algorithm on a dual processor configured target flight computer. First, a filter initialization scheme is presented which allows the no-fail filter (NFF) states to be initialized using the first iteration of the flight data. A modified failure isolation strategy, compatible with the new failure detection strategy reported earlier, is discussed and the performance of the new FDI algorithm is analyzed using flight recorded data from the NASA ATOPS B-737 aircraft in a Microwave Landing System (MLS) environment. The results show that low level MLS, IMU, and IAS sensor failures are detected and isolated instantaneously, while accelerometer and rate gyro failures continue to take comparatively longer to detect and isolate. The parallel implementation is accomplished by partitioning the FINDS algorithm into two parts: one based on the translational dynamics and the other based on the rotational kinematics. Finally, a multi-rate implementation of the algorithm is presented yielding significantly low execution times with acceptable estimation and FDI performance.
Implementation and analysis of a Navier-Stokes algorithm on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1988-01-01
The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
Hardware implementation of Corner2 lossless compression algorithm for maskless lithography systems
NASA Astrophysics Data System (ADS)
Yang, Jeehong; Li, Xiaohui; Savari, Serap A.
2012-03-01
The data delivery throughput of maskless lithography systems can be improved by applying a lossless image compression algorithm to the layout images and using a lithography writer that contains a decoding circuit packed in single silicon to decode the compressed image on-the-fly. In our past research we have introduced Corner2, a layout image compression algorithm which achieved significantly better performance in all aspects (compression ratio, encoding/decoding speed, decoder memory requirement) than Block C4. In this paper, we present the synthesis results of the Corner2 decoder for FPGA implementation.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1991-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.
NASA Technical Reports Server (NTRS)
Tilton, James C.; Plaza, Antonio J. (Editor); Chang, Chein-I. (Editor)
2008-01-01
The hierarchical image segmentation algorithm (referred to as HSEG) is a hybrid of hierarchical step-wise optimization (HSWO) and constrained spectral clustering that produces a hierarchical set of image segmentations. HSWO is an iterative approach to region grooving segmentation in which the optimal image segmentation is found at N(sub R) regions, given a segmentation at N(sub R+1) regions. HSEG's addition of constrained spectral clustering makes it a computationally intensive algorithm, for all but, the smallest of images. To counteract this, a computationally efficient recursive approximation of HSEG (called RHSEG) has been devised. Further improvements in processing speed are obtained through a parallel implementation of RHSEG. This chapter describes this parallel implementation and demonstrates its computational efficiency on a Landsat Thematic Mapper test scene.
An Optional Threshold with Svm Cloud Detection Algorithm and Dsp Implementation
NASA Astrophysics Data System (ADS)
Zhou, Guoqing; Zhou, Xiang; Yue, Tao; Liu, Yilong
2016-06-01
This paper presents a method which combines the traditional threshold method and SVM method, to detect the cloud of Landsat-8 images. The proposed method is implemented using DSP for real-time cloud detection. The DSP platform connects with emulator and personal computer. The threshold method is firstly utilized to obtain a coarse cloud detection result, and then the SVM classifier is used to obtain high accuracy of cloud detection. More than 200 cloudy images from Lansat-8 were experimented to test the proposed method. Comparing the proposed method with SVM method, it is demonstrated that the cloud detection accuracy of each image using the proposed algorithm is higher than those of SVM algorithm. The results of the experiment demonstrate that the implementation of the proposed method on DSP can effectively realize the real-time cloud detection accurately.
NASA Technical Reports Server (NTRS)
Jaggi, S.; Quattrochi, Dale A.; Lam, Nina S.-N.
1993-01-01
Fractal geometry is increasingly becoming a useful tool for modeling natural phenomena. As an alternative to Euclidean concepts, fractals allow for a more accurate representation of the nature of complexity in natural boundaries and surfaces. The purpose of this paper is to introduce and implement three algorithms in C code for deriving fractal measurement from remotely sensed data. These three methods are: the line-divider method, the variogram method, and the triangular prism method. Remote-sensing data acquired by NASA's Calibrated Airborne Multispectral Scanner (CAMS) are used to compute the fractal dimension using each of the three methods. These data were obtained as a 30 m pixel spatial resolution over a portion of western Puerto Rico in January 1990. A description of the three methods, their implementation in PC-compatible environment, and some results of applying these algorithms to remotely sensed image data are presented.
NASA Astrophysics Data System (ADS)
Karabiber, Fethullah; Vecchio, Pietro; Grassi, Giuseppe
2011-12-01
The Bio-inspired (Bi-i) Cellular Vision System is a computing platform consisting of sensing, array sensing-processing, and digital signal processing. The platform is based on the Cellular Neural/Nonlinear Network (CNN) paradigm. This article presents the implementation of a novel CNN-based segmentation algorithm onto the Bi-i system. Each part of the algorithm, along with the corresponding implementation on the hardware platform, is carefully described through the article. The experimental results, carried out for Foreman and Car-phone video sequences, highlight the feasibility of the approach, which provides a frame rate of about 26 frames/s. Comparisons with existing CNN-based methods show that the conceived approach is more accurate, thus representing a good trade-off between real-time requirements and accuracy.
NASA Astrophysics Data System (ADS)
Gupta, Atul; Bayraktar, Harun H.; Fox, Julia C.; Keaveny, Tony M.; Papadopoulos, Panayiotis
2007-06-01
Trabecular bone is a highly porous orthotropic cellular solid material present inside human bones such as the femur (hip bone) and vertebra (spine). In this study, an infinitesimal plasticity-like model with isotropic/kinematic hardening is developed to describe yielding of trabecular bone at the continuum level. One of the unique features of this formulation is the development of the plasticity-like model in strain space for a yield envelope expressed in terms of principal strains having asymmetric yield behavior. An implicit return-mapping approach is adopted to obtain a symmetric algorithmic tangent modulus and a step-by-step procedure of algorithmic implementation is derived. To investigate the performance of this approach in a full-scale finite element simulation, the model is implemented in a non-linear finite element analysis program and several test problems including the simulation of loading of the human femur structures are analyzed. The results show good agreement with the experimental data.
A new morphological anomaly detection algorithm for hyperspectral images and its GPU implementation
NASA Astrophysics Data System (ADS)
Paz, Abel; Plaza, Antonio
2011-10-01
Anomaly detection is considered a very important task for hyperspectral data exploitation. It is now routinely applied in many application domains, including defence and intelligence, public safety, precision agriculture, geology, or forestry. Many of these applications require timely responses for swift decisions which depend upon high computing performance of algorithm analysis. However, with the recent explosion in the amount and dimensionality of hyperspectral imagery, this problem calls for the incorporation of parallel computing techniques. In the past, clusters of computers have offered an attractive solution for fast anomaly detection in hyperspectral data sets already transmitted to Earth. However, these systems are expensive and difficult to adapt to on-board data processing scenarios, in which low-weight and low-power integrated components are essential to reduce mission payload and obtain analysis results in (near) real-time, i.e., at the same time as the data is collected by the sensor. An exciting new development in the field of commodity computing is the emergence of commodity graphics processing units (GPUs), which can now bridge the gap towards on-board processing of remotely sensed hyperspectral data. In this paper, we develop a new morphological algorithm for anomaly detection in hyperspectral images along with an efficient GPU implementation of the algorithm. The algorithm is implemented on latest-generation GPU architectures, and evaluated with regards to other anomaly detection algorithms using hyperspectral data collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the World Trade Center (WTC) in New York, five days after the terrorist attacks that collapsed the two main towers in the WTC complex. The proposed GPU implementation achieves real-time performance in the considered case study.
Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen
2002-12-10
Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the
NASA Astrophysics Data System (ADS)
Sundareshan, Malur K.; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen
2002-12-01
Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the
NASA Astrophysics Data System (ADS)
Besrour, Amine; Abdelkefi, Fatma; Siala, Mohamed; Snoussi, Hichem
2015-09-01
Nowadays, the high dynamic range (HDR) imaging represents the subject of the most researches. The major problem lies in the implementation of the best algorithm to acquire the best video quality. In fact, the major constraint is to conceive an optimal fusion which must meet the rapid movement of video frames. The implemented merging algorithms were not quick enough to reconstitute the HDR video. In this paper, we detail each of the previous existing works before detailing our algorithm and presenting results from the acquired HDR images, tone mapped with various techniques. Our proposed algorithm guarantees a more enhanced and faster solution compared to the existing ones. In fact, it has the ability to calculate the saturation matrix related to the saturation rate of the neighboring pixels. The computed coefficients are affected respectively to each picture from the tested ones. This analysis provides faster and efficient results in terms of quality and brightness. The originality of our work remains on its processing method including the pixels saturation in the totality of the captured pictures and their combination in order to obtain the best pictures illustrating all the possible details. These parameters are computed for each zone depending on the contrast and the luminosity of the current pixel and its neighboring. The final HDR image's coefficients are calculated dynamically ensuring the best image quality equilibrating the brightness and contrast values and making the perfect final image.
Implementation of a media synchronization algorithm for multistandard IP set-top box systems
NASA Astrophysics Data System (ADS)
Estévez, Esther; Samper, David; Pescador, Fernando; Juárez, Eduardo; Sanz, César
2009-05-01
Media synchronization at network context minimizes the effects of the network jitter and the skew between the emitter and receiver clocks. Theoretical algorithms cannot always be implemented on real systems for the architecture differences between a real and a theoretical system. In this paper an implementation for an intra-medium and an inter-media synchronization algorithm for a real multistandard IP set-top box is presented. For intra-medium synchronization, the proposed technique is based on controlling the receiver buffer. However for inter-media synchronization, the proposed technique is based on controlling the video playback according the Presentation Time Stamp (PTS) of the media units (audio and video). The proposed synchronizations algorithms has been integrated in an IP-STB and tested in a real environment using DVD movies and TV channels with excellent results. Those results show that the proposed algorithm can achieve media synchronization and meet the requirements of perceived quality of service (P-QoS).
Design And Implementation Of A Multi-Sensor Fusion Algorithm On A Hypercube Computer Architecture
NASA Astrophysics Data System (ADS)
Glover, Charles W.
1990-03-01
A multi-sensor integration (MSI) algorithm written for sequential single processor computer architecture has been transformed into a concurrent algorithm and implemented in parallel on a multi-processor hypercube computer architecture. This paper will present the philosophy and methodologies used in the decomposition of the sequential MSI algorithm, and its transformation into a parallel MSI algorithm. The parallel MSI algorithm was implemented on a NCUBETM hypercube computer. The performance of the parallel MSI algorithm has been measured and compared against its sequential counterpart by running test case scenarios through a simulation program. The simulation program allows the user to define the trajectories of all players in the scenario, and to pick the sensor suites of the players and their operating characteristics. For example, an air-to-air engagement scenario was used as one of the test cases. In this scenario, two friend aircrafts were being attacked by six foe aircraft in a pincer maneuver. Both the friend and foe aircrafts launch missiles at several different time points in the engagement. The sensor suites on each aircraft are dual mode RADAR, dual mode IRST, and ESM sensors. The modes of the sensors are switched as needed throughout the scenario. The RADAR sensor is used only intermittently, thus most of the MSI information is obtained from passive sensing. The maneuvers in this scenario caused aircraft and missile to constantly fly in and out of sensors field-of-view (F0V). This resulted in the MSI algorithm to constantly reacquire, initiate, and delete new tracks as it tracked all objects in the scenario. The objective was to determine performance of the parallel MSI algorithm in such a complex environment, and to determine how many multi-processors (nodes) of the hypercube could be effectively used by an aircraft in such an environment. For the scenario just discussed, a 4-node hypercube was found to be the optimal size and a factor two in speedup
Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel
2009-02-15
We present a new lock-free parallel algorithm for computing betweenness centralityof massive small-world networks. With minor changes to the data structures, ouralgorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
Implementation and evaluation of two helical CT reconstruction algorithms in CIVA
NASA Astrophysics Data System (ADS)
Banjak, H.; Costin, M.; Vienne, C.; Kaftandjian, V.
2016-02-01
The large majority of industrial CT systems reconstruct the 3D volume by using an acquisition on a circular trajec-tory. However, when inspecting long objects which are highly anisotropic, this scanning geometry creates severe artifacts in the reconstruction. For this reason, the use of an advanced CT scanning method like helical data acquisition is an efficient way to address this aspect known as the long-object problem. Recently, several analytically exact and quasi-exact inversion formulas for helical cone-beam reconstruction have been proposed. Among them, we identified two algorithms of interest for our case. These algorithms are exact and of filtered back-projection structure. In this work we implemented the filtered-backprojection (FBP) and backprojection-filtration (BPF) algorithms of Zou and Pan (2004). For performance evaluation, we present a numerical compari-son of the two selected algorithms with the helical FDK algorithm using both complete (noiseless and noisy) and truncated data generated by CIVA (the simulation platform for non-destructive testing techniques developed at CEA).
Implementing a shadow detection algorithm for synthetic vision systems in reconfigurable hardware
NASA Astrophysics Data System (ADS)
Ladeji-Osias, Jumoke; Theobalds, Andre; Nare, Otsebele; Wandji, Theirry; Scott, Craig; Nyarko, Kofi
2006-05-01
The integrity monitor for synthetic vision systems provides pilots with a consistency check between stored Digital Elevation Models (DEM) and real-time sensor data. This paper discusses the implementation of the Shadow Detection and Extraction (SHADE) algorithm in reconfigurable hardware to increase the efficiency of the design. The SHADE algorithm correlates data from a weather radar and DEM to determine occluded regions of the flight path terrain. This process of correlating the weather radar and DEM data occurs in two parallel threads which are then fed into a disparity checker. The DEM thread is broken up into four main sub-functions: 1) synchronization and translation of GPS coordinates of aircraft to the weather radar, 2) mapping range bins to coordinates and computing depression angles, 3) mapping state assignments to range bins, and 4) shadow region edge detection. This correlation must be done in realtime; therefore, a hardware implementation is ideal due to the amount of data that is to be processed. The hardware of choice is the field programmable gate array because of programmability, reusability, and computational ability. Assigning states to each range bin is the most computationally intensive process and it is implemented as a finite state machine (FSM). Results of this work are focused on the implementation of the FSM.
NASA Astrophysics Data System (ADS)
Cho, Woong; Suh, Tae-Suk; Park, Jeong-Hoon; Xing, Lei; Lee, Jeong-Woo
2012-12-01
A collapsed cone convolution algorithm was applied to a treatment planning system for the calculation of dose distributions. The distribution of beam fluences was determined using a three-source model by considering the source strengths of the primary beam, the beam scattered from the primary collimators, and an extra beam scattered from extra structures in the gantry head of the radiotherapy treatment machine. The distribution of the total energy released per unit mass (TERMA) was calculated from the distribution of the fluence by considering several physical effects such as the emission of poly-energetic photon spectra, the attenuation of the beam fluence in a medium, the horn effect, the beam-softening effect, and beam transmission through collimators or multi-leaf collimators. The distribution of the doses was calculated by using the convolution of the distribution of the TERMA and the poly-energetic kernel. The distribution of the kernel was approximated to several tens of collapsed cone lines to express the energies transferred by the electrons that originated from the interactions between the photons and the medium. The implemented algorithm was validated by comparing the calculated percentage depth doses (PDDs) and dose profiles with the measured PDDs and relevant profiles. In addition, the dose distribution for an irregular-shaped radiation field was verified by comparing the calculated doses with the measured doses obtained via EDR2 film dosimetry and with the calculated doses obtained using a different treatment planning system based on the pencil beam algorithm (Eclipse, Varian, Palo Alto, USA). The majority of the calculated doses for the PDDs, the profiles, and the irregular-shaped field showed good agreement with the measured doses to within a 2% dose difference, except in the build-up regions. The implemented algorithm was proven to be efficient and accurate for clinical purposes in radiation therapy, and it was found to be easily implementable in
An implementation of differential search algorithm (DSA) for inversion of surface wave data
NASA Astrophysics Data System (ADS)
Song, Xianhai; Li, Lei; Zhang, Xueqiang; Shi, Xinchun; Huang, Jianquan; Cai, Jianchao; Jin, Si; Ding, Jianping
2014-12-01
Surface wave dispersion analysis is widely used in geophysics to infer near-surface shear (S)-wave velocity profiles for a wide variety of applications. However, inversion of surface wave data is challenging for most local-search methods due to its high nonlinearity and to its multimodality. In this work, we proposed and implemented a new Rayleigh wave dispersion curve inversion scheme based on differential search algorithm (DSA), one of recently developed swarm intelligence-based algorithms. DSA is inspired from seasonal migration behavior of species of the living beings throughout the year for solving highly nonlinear, multivariable, and multimodal optimization problems. The proposed inverse procedure is applied to nonlinear inversion of fundamental-mode Rayleigh wave dispersion curves for near-surface S-wave velocity profiles. To evaluate calculation efficiency and stability of DSA, four noise-free and four noisy synthetic data sets are firstly inverted. Then, the performance of DSA is compared with that of genetic algorithms (GA) by two noise-free synthetic data sets. Finally, a real-world example from a waste disposal site in NE Italy is inverted to examine the applicability and robustness of the proposed approach on surface wave data. Furthermore, the performance of DSA is compared against that of GA by real data to further evaluate scores of the inverse procedure described here. Simulation results from both synthetic and actual field data demonstrate that differential search algorithm (DSA) applied to nonlinear inversion of surface wave data should be considered good not only in terms of the accuracy but also in terms of the convergence speed. The great advantages of DSA are that the algorithm is simple, robust and easy to implement. Also there are fewer control parameters to tune.
NASA Astrophysics Data System (ADS)
Akec, John A.; Steiner, Simon J.
1996-10-01
Fuzzy logic has been promoted recently by many researchers for the design of navigational algorithms for mobile robots. The new approach fits in well with a behavior-based autonomous systems framework, where common-sense rules can naturally be formulated to create rule-based navigational algorithms, and conflicts between behaviors may be resolved by assigning weights to different rules in the rule base. The applicability of the techniques has been demonstrated for robots that have used sensor devices such as ultrasonics and infrared detectors. However, the implementation issues relating to the development of vision-based, fuzzy-logic navigation algorithms do not appear, as yet, to have been fully explored. The salient features that need to be extracted from an image for recognition or collision avoidance purposes are very much application dependent; however, the needs of an autonomous mobile vehicle cannot be known fully 'a priori'. Similarly, the issues relating to the understanding of a vision generated image which is based on geometric models of the observed objects have an important role to play; however, these issues have not as yet been either addressed or incorporated into the current fuzzy logic-based algorithms that have been purported for navigational control. This paper attempts to address these issues, and attempts to come up with a suitable framework which may clarify the implementation of navigation algorithms for mobile robots that use vision sensor/s and fuzzy logic for map building, target location, and collision avoidance. The scope for application of this approach is demonstrated.
Optical implementation of a second-order translation-invariant network algorithm
NASA Astrophysics Data System (ADS)
Horan, Paul; Jennings, Andrew; Kelly, Brian; Hegarty, John
1993-03-01
Higher-order networks, particularly second-order translation-invariant networks, are introduced, and their suitability for optical implementation is outlined. The algorithm is implemented with a conventional liquid-crystal display, permitting on-line learning and updating of weights. The basic operation of the optical system is demonstrated, and the ability of the system to adapt to system nonuniformities is illustrated. The implementation with an integrated optoelectronic array of asymmetric Fabry-Perot modulators containing a GaAs/AlGaAs multiple-quantum-well active region is described. The principles of operation and operating characteristics of the device array are outlined. The use of the array in an optical system to calculate the autocorrelation matrix necessary for a second-order network is demonstrated.
Software for implementing trigger algorithms on the upgraded CMS Global Trigger System
NASA Astrophysics Data System (ADS)
Matsushita, Takashi; Arnold, Bernhard
2015-12-01
The Global Trigger is the final step of the CMS Level-1 Trigger and implements a trigger menu, a set of selection requirements applied to the final list of trigger objects. The conditions for trigger object selection, with possible topological requirements on multiobject triggers, are combined by simple combinatorial logic to form the algorithms. The LHC has resumed its operation in 2015, the collision-energy will be increased to 13 TeV with the luminosity expected to go up to 2x1034 cm-2s-1. The CMS Level-1 trigger system will be upgraded to improve its performance for selecting interesting physics events and to operate within the predefined data-acquisition rate in the challenging environment expected at LHC Run 2. The Global Trigger will be re-implemented on modern FPGAs on an Advanced Mezzanine Card in MicroTCA crate. The upgraded system will benefit from the ability to process complex algorithms with DSP slices and increased processing resources with optical links running at 10 Gbit/s, enabling more algorithms at a time than previously possible and allowing CMS to be more flexible in how it handles the trigger bandwidth. In order to handle the increased complexity of the trigger menu implemented on the upgraded Global Trigger, a set of new software has been developed. The software allows a physicist to define a menu with analysis-like triggers using intuitive user interface. The menu is then realised on FPGAs with further software processing, instantiating predefined firmware blocks. The design and implementation of the software for preparing a menu for the upgraded CMS Global Trigger system are presented.
Li, Haisen S.; Romeijn, H. Edwin; Fox, Christopher; Palta, Jatinder R.; Dempsey, James F.
2008-03-15
The authors present a comparative study of intensity modulated proton therapy (IMPT) treatment planning employing algorithms of three-dimensional (3D) modulation, and 2.5-dimensional (2.5D) modulation, and intensity modulated distal edge tracking (DET) [A. Lomax, Phys. Med. Biol. 44, 185-205 (1999)] applied to the treatment of head-and-neck cancer radiotherapy. These three approaches were also compared with 6 MV photon intensity modulated radiation therapy (IMRT). All algorithms were implemented in the University of Florida Optimized Radiation Therapy system using a finite sized pencil beam dose model and a convex fluence map optimization model. The 3D IMPT and the DET algorithms showed considerable advantages over the photon IMRT in terms of dose conformity and sparing of organs at risk when the beam number was not constrained. The 2.5D algorithm did not show an advantage over the photon IMRT except in the dose reduction to the distant healthy tissues, which is inherent in proton beam delivery. The influences of proton beam number and pencil beam size on the IMPT plan quality were also studied. Out of 24 cases studied, three cases could be adequately planned with one beam and 12 cases could be adequately planned with two beams, but the dose uniformity was often marginally acceptable. Adding one or two more beams in each case dramatically improved the dose uniformity. The finite pencil beam size had more influence on the plan quality of the 2.5D and DET algorithms than that of the 3D IMPT. To obtain a satisfactory plan quality, a 0.5 cm pencil beam size was required for the 3D IMPT and a 0.3 cm size was required for the 2.5D and the DET algorithms. Delivery of the IMPT plans produced in this study would require a proton beam spot scanning technique that has yet to be developed clinically.
An implementation of differential evolution algorithm for inversion of geoelectrical data
NASA Astrophysics Data System (ADS)
Balkaya, Çağlayan
2013-11-01
Differential evolution (DE), a population-based evolutionary algorithm (EA) has been implemented to invert self-potential (SP) and vertical electrical sounding (VES) data sets. The algorithm uses three operators including mutation, crossover and selection similar to genetic algorithm (GA). Mutation is the most important operator for the success of DE. Three commonly used mutation strategies including DE/best/1 (strategy 1), DE/rand/1 (strategy 2) and DE/rand-to-best/1 (strategy 3) were applied together with a binomial type crossover. Evolution cycle of DE was realized without boundary constraints. For the test studies performed with SP data, in addition to both noise-free and noisy synthetic data sets two field data sets observed over the sulfide ore body in the Malachite mine (Colorado) and over the ore bodies in the Neem-Ka Thana cooper belt (India) were considered. VES test studies were carried out using synthetically produced resistivity data representing a three-layered earth model and a field data set example from Gökçeada (Turkey), which displays a seawater infiltration problem. Mutation strategies mentioned above were also extensively tested on both synthetic and field data sets in consideration. Of these, strategy 1 was found to be the most effective strategy for the parameter estimation by providing less computational cost together with a good accuracy. The solutions obtained by DE for the synthetic cases of SP were quite consistent with particle swarm optimization (PSO) which is a more widely used population-based optimization algorithm than DE in geophysics. Estimated parameters of SP and VES data were also compared with those obtained from Metropolis-Hastings (M-H) sampling algorithm based on simulated annealing (SA) without cooling to clarify uncertainties in the solutions. Comparison to the M-H algorithm shows that DE performs a fast approximate posterior sampling for the case of low-dimensional inverse geophysical problems.
PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta
Chaudhury, Sidhartha; Lyskov, Sergey; Gray, Jeffrey J.
2010-01-01
Summary: PyRosetta is a stand-alone Python-based implementation of the Rosetta molecular modeling package that allows users to write custom structure prediction and design algorithms using the major Rosetta sampling and scoring functions. PyRosetta contains Python bindings to libraries that define Rosetta functions including those for accessing and manipulating protein structure, calculating energies and running Monte Carlo-based simulations. PyRosetta can be used in two ways: (i) interactively, using iPython and (ii) script-based, using Python scripting. Interactive mode contains a number of help features and is ideal for beginners while script-mode is best suited for algorithm development. PyRosetta has similar computational performance to Rosetta, can be easily scaled up for cluster applications and has been implemented for algorithms demonstrating protein docking, protein folding, loop modeling and design. Availability: PyRosetta is a stand-alone package available at http://www.pyrosetta.org under the Rosetta license which is free for academic and non-profit users. A tutorial, user's manual and sample scripts demonstrating usage are also available on the web site. Contact: pyrosetta@graylab.jhu.edu PMID:20061306
Implementation of Complex Signal Processing Algorithms for Position-Sensitive Microcalorimeters
NASA Technical Reports Server (NTRS)
Smith, Stephen J.
2008-01-01
We have recently reported on a theoretical digital signal-processing algorithm for improved energy and position resolution in position-sensitive, transition-edge sensor (POST) X-ray detectors [Smith et al., Nucl, lnstr and Meth. A 556 (2006) 2371. PoST's consists of one or more transition-edge sensors (TES's) on a large continuous or pixellated X-ray absorber and are under development as an alternative to arrays of single pixel TES's. PoST's provide a means to increase the field-of-view for the fewest number of read-out channels. In this contribution we extend the theoretical correlated energy position optimal filter (CEPOF) algorithm (originally developed for 2-TES continuous absorber PoST's) to investigate the practical implementation on multi-pixel single TES PoST's or Hydras. We use numerically simulated data for a nine absorber device, which includes realistic detector noise, to demonstrate an iterative scheme that enables convergence on the correct photon absorption position and energy without any a priori assumptions. The position sensitivity of the CEPOF implemented on simulated data agrees very well with the theoretically predicted resolution. We discuss practical issues such as the impact of random arrival phase of the measured data on the performance of the CEPOF. The CEPOF algorithm demonstrates that full-width-at- half-maximum energy resolution of < 8 eV coupled with position-sensitivity down to a few 100 eV should be achievable for a fully optimized device.
SU-E-T-67: Clinical Implementation and Evaluation of the Acuros Dose Calculation Algorithm
Yan, C; Combine, T; Dickens, K; Wynn, R; Pavord, D; Huq, M
2014-06-01
Purpose: The main aim of the current study is to present a detailed description of the implementation of the Acuros XB Dose Calculation Algorithm, and subsequently evaluate its clinical impacts by comparing it with AAA algorithm. Methods: The source models for both Acuros XB and AAA were configured by importing the same measured beam data into Eclipse treatment planning system. Both algorithms were evaluated by comparing calculated dose with measured dose on a homogeneous water phantom for field sizes ranging from 6cm × 6cm to 40cm × 40cm. Central axis and off-axis points with different depths were chosen for the comparison. Similarly, wedge fields with wedge angles from 15 to 60 degree were used. In addition, variable field sizes for a heterogeneous phantom were used to evaluate the Acuros algorithm. Finally, both Acuros and AAA were tested on VMAT patient plans for various sites. Does distributions and calculation time were compared. Results: On average, computation time is reduced by at least 50% by Acuros XB compared with AAA on single fields and VMAT plans. When used for open 6MV photon beams on homogeneous water phantom, both Acuros XB and AAA calculated doses were within 1% of measurement. For 23 MV photon beams, the calculated doses were within 1.5% of measured doses for Acuros XB and 2% for AAA. When heterogeneous phantom was used, Acuros XB also improved on accuracy. Conclusion: Compared with AAA, Acuros XB can improve accuracy while significantly reduce computation time for VMAT plans.
The mGA1.0: A common LISP implementation of a messy genetic algorithm
NASA Technical Reports Server (NTRS)
Goldberg, David E.; Kerzic, Travis
1990-01-01
Genetic algorithms (GAs) are finding increased application in difficult search, optimization, and machine learning problems in science and engineering. Increasing demands are being placed on algorithm performance, and the remaining challenges of genetic algorithm theory and practice are becoming increasingly unavoidable. Perhaps the most difficult of these challenges is the so-called linkage problem. Messy GAs were created to overcome the linkage problem of simple genetic algorithms by combining variable-length strings, gene expression, messy operators, and a nonhomogeneous phasing of evolutionary processing. Results on a number of difficult deceptive test functions are encouraging with the mGA always finding global optima in a polynomial number of function evaluations. Theoretical and empirical studies are continuing, and a first version of a messy GA is ready for testing by others. A Common LISP implementation called mGA1.0 is documented and related to the basic principles and operators developed by Goldberg et. al. (1989, 1990). Although the code was prepared with care, it is not a general-purpose code, only a research version. Important data structures and global variations are described. Thereafter brief function descriptions are given, and sample input data are presented together with sample program output. A source listing with comments is also included.
Next Generation Aura-OMI SO2 Retrieval Algorithm: Introduction and Implementation Status
NASA Technical Reports Server (NTRS)
Li, Can; Joiner, Joanna; Krotkov, Nickolay A.; Bhartia, Pawan K.
2014-01-01
We introduce our next generation algorithm to retrieve SO2 using radiance measurements from the Aura Ozone Monitoring Instrument (OMI). We employ a principal component analysis technique to analyze OMI radiance spectral in 310.5-340 nm acquired over regions with no significant SO2. The resulting principal components (PCs) capture radiance variability caused by both physical processes (e.g., Rayleigh and Raman scattering, and ozone absorption) and measurement artifacts, enabling us to account for these various interferences in SO2 retrievals. By fitting these PCs along with SO2 Jacobians calculated with a radiative transfer model to OMI-measured radiance spectra, we directly estimate SO2 vertical column density in one step. As compared with the previous generation operational OMSO2 PBL (Planetary Boundary Layer) SO2 product, our new algorithm greatly reduces unphysical biases and decreases the noise by a factor of two, providing greater sensitivity to anthropogenic emissions. The new algorithm is fast, eliminates the need for instrument-specific radiance correction schemes, and can be easily adapted to other sensors. These attributes make it a promising technique for producing long-term, consistent SO2 records for air quality and climate research. We have operationally implemented this new algorithm on OMI SIPS for producing the new generation standard OMI SO2 products.
Movie approximation technique for the implementation of fast bandwidth-smoothing algorithms
NASA Astrophysics Data System (ADS)
Feng, Wu-chi; Lam, Chi C.; Liu, Ming
1997-12-01
Bandwidth smoothing algorithms can effectively reduce the network resource requirements for the delivery of compressed video streams. For stored video, a large number of bandwidth smoothing algorithms have been introduced that are optimal under certain constraints but require access to all the frame size data in order to achieve their optimal properties. This constraint, however, can be both resource and computationally expensive, especially for moderately priced set-top-boxes. In this paper, we introduce a movie approximation technique for the representation of the frame sizes of a video, reducing the complexity of the bandwidth smoothing algorithms and the amount of frame data that must be transmitted prior to the start of playback. Our results show that the proposed approximation technique can accurately approximate the frame data with a small number of piece-wise linear segments without affecting the performance measures that the bandwidth soothing algorithms are attempting to achieve by more than 1%. In addition, we show that implementations of this technique can speed up execution times by 100 to 400 times, allowing the bandwidth plan calculation times to be reduced to tens of milliseconds. Evaluation using a compressed full-length motion-JPEG video is provided.
Kawakami, Tomoya; Fujita, Naotaka; Yoshihisa, Tomoki; Tsukamoto, Masahiko
2014-01-01
In recent years, sensors become popular and Home Energy Management System (HEMS) takes an important role in saving energy without decrease in QoL (Quality of Life). Currently, many rule-based HEMSs have been proposed and almost all of them assume "IF-THEN" rules. The Rete algorithm is a typical pattern matching algorithm for IF-THEN rules. Currently, we have proposed a rule-based Home Energy Management System (HEMS) using the Rete algorithm. In the proposed system, rules for managing energy are processed by smart taps in network, and the loads for processing rules and collecting data are distributed to smart taps. In addition, the number of processes and collecting data are reduced by processing rules based on the Rete algorithm. In this paper, we evaluated the proposed system by simulation. In the simulation environment, rules are processed by a smart tap that relates to the action part of each rule. In addition, we implemented the proposed system as HEMS using smart taps.
Kawakami, Tomoya; Fujita, Naotaka; Yoshihisa, Tomoki; Tsukamoto, Masahiko
2014-01-01
In recent years, sensors become popular and Home Energy Management System (HEMS) takes an important role in saving energy without decrease in QoL (Quality of Life). Currently, many rule-based HEMSs have been proposed and almost all of them assume "IF-THEN" rules. The Rete algorithm is a typical pattern matching algorithm for IF-THEN rules. Currently, we have proposed a rule-based Home Energy Management System (HEMS) using the Rete algorithm. In the proposed system, rules for managing energy are processed by smart taps in network, and the loads for processing rules and collecting data are distributed to smart taps. In addition, the number of processes and collecting data are reduced by processing rules based on the Rete algorithm. In this paper, we evaluated the proposed system by simulation. In the simulation environment, rules are processed by a smart tap that relates to the action part of each rule. In addition, we implemented the proposed system as HEMS using smart taps. PMID:25136672
Supercomputer implementation of finite element algorithms for high speed compressible flows
NASA Technical Reports Server (NTRS)
Thornton, E. A.; Ramakrishnan, R.
1986-01-01
Prediction of compressible flow phenomena using the finite element method is of recent origin and considerable interest. Two shock capturing finite element formulations for high speed compressible flows are described. A Taylor-Galerkin formulation uses a Taylor series expansion in time coupled with a Galerkin weighted residual statement. The Taylor-Galerkin algorithms use explicit artificial dissipation, and the performance of three dissipation models are compared. A Petrov-Galerkin algorithm has as its basis the concepts of streamline upwinding. Vectorization strategies are developed to implement the finite element formulations on the NASA Langley VPS-32. The vectorization scheme results in finite element programs that use vectors of length of the order of the number of nodes or elements. The use of the vectorization procedure speeds up processing rates by over two orders of magnitude. The Taylor-Galerkin and Petrov-Galerkin algorithms are evaluated for 2D inviscid flows on criteria such as solution accuracy, shock resolution, computational speed and storage requirements. The convergence rates for both algorithms are enhanced by local time-stepping schemes. Extension of the vectorization procedure for predicting 2D viscous and 3D inviscid flows are demonstrated. Conclusions are drawn regarding the applicability of the finite element procedures for realistic problems that require hundreds of thousands of nodes.
A Survey on GPU-Based Implementation of Swarm Intelligence Algorithms.
Tan, Ying; Ding, Ke
2016-09-01
Inspired by the collective behavior of natural swarm, swarm intelligence algorithms (SIAs) have been developed and widely used for solving optimization problems. When applied to complex problems, a large number of fitness function evaluations are needed to obtain an acceptable solution. To tackle this vital issue, graphical processing units (GPUs) have been used to accelerate the optimization procedure of SIAs. Thanks to their inherent parallelism, SIAs are very suitable for parallel implementation under the GPU platform which have achieved a great success in recent years. This paper presents a comprehensive review of GPU-based parallel SIAs in accordance with a newly proposed taxonomy. Critical concerns for the efficient parallel implementation of SIAs are also described in detail. Moreover, novel criteria are also proposed to evaluate and compare the parallel implementation and algorithm performance universally. The rationality and practicability of the proposed optimization methodology and criteria are verified by careful case study. Finally, our opinions and perspectives on the trends and prospects on the relatively new research domain are also presented for future development. PMID:26571543
Zhou, Ning; Huang, Zhenyu; Tuffner, Francis K.; Jin, Shuangshuang; Lin, Jenglung; Hauer, Matthew L.
2010-02-28
Small signal stability problems are one of the major threats to grid stability and reliability. Prony analysis has been successfully applied on ringdown data to monitor electromechanical modes of a power system using phasor measurement unit (PMU) data. To facilitate an on-line application of mode estimation, this paper develops a recursive algorithm for implementing Prony analysis and proposed an oscillation detection method to detect ringdown data in real time. By automatically detecting ringdown data, the proposed method helps guarantee that Prony analysis is applied properly and timely on the ringdown data. Thus, the mode estimation results can be performed reliably and timely. The proposed method is tested using Monte Carlo simulations based on a 17-machine model and is shown to be able to properly identify the oscillation data for on-line application of Prony analysis. In addition, the proposed method is applied to field measurement data from WECC to show the performance of the proposed algorithm.
Barazzetti, Luigi; Scaioni, Marco
2010-01-01
This paper presents the development and implementation of three image-based methods used to detect and measure the displacements of a vast number of points in the case of laboratory testing on construction materials. Starting from the needs of structural engineers, three ad hoc tools for crack measurement in fibre-reinforced specimens and 2D or 3D deformation analysis through digital images were implemented and tested. These tools make use of advanced image processing algorithms and can integrate or even substitute some traditional sensors employed today in most laboratories. In addition, the automation provided by the implemented software, the limited cost of the instruments and the possibility to operate with an indefinite number of points offer new and more extensive analysis in the field of material testing. Several comparisons with other traditional sensors widely adopted inside most laboratories were carried out in order to demonstrate the accuracy of the implemented software. Implementation details, simulations and real applications are reported and discussed in this paper. PMID:22163612
Morita, Kenji; Jitsev, Jenia; Morrison, Abigail
2016-09-15
Value-based action selection has been suggested to be realized in the corticostriatal local circuits through competition among neural populations. In this article, we review theoretical and experimental studies that have constructed and verified this notion, and provide new perspectives on how the local-circuit selection mechanisms implement reinforcement learning (RL) algorithms and computations beyond them. The striatal neurons are mostly inhibitory, and lateral inhibition among them has been classically proposed to realize "Winner-Take-All (WTA)" selection of the maximum-valued action (i.e., 'max' operation). Although this view has been challenged by the revealed weakness, sparseness, and asymmetry of lateral inhibition, which suggest more complex dynamics, WTA-like competition could still occur on short time scales. Unlike the striatal circuit, the cortical circuit contains recurrent excitation, which may enable retention or temporal integration of information and probabilistic "soft-max" selection. The striatal "max" circuit and the cortical "soft-max" circuit might co-implement an RL algorithm called Q-learning; the cortical circuit might also similarly serve for other algorithms such as SARSA. In these implementations, the cortical circuit presumably sustains activity representing the executed action, which negatively impacts dopamine neurons so that they can calculate reward-prediction-error. Regarding the suggested more complex dynamics of striatal, as well as cortical, circuits on long time scales, which could be viewed as a sequence of short WTA fragments, computational roles remain open: such a sequence might represent (1) sequential state-action-state transitions, constituting replay or simulation of the internal model, (2) a single state/action by the whole trajectory, or (3) probabilistic sampling of state/action.
Implementation of a new iterative learning control algorithm on real data.
Zamanian, Hamed; Koohi, Ardavan
2016-02-01
In this paper, a newly presented approach is proposed for closed-loop automatic tuning of a proportional integral derivative (PID) controller based on iterative learning control (ILC) algorithm. A modified ILC scheme iteratively changes the control signal by adjusting it. Once a satisfactory performance is achieved, a linear compensator is identified in the ILC behavior using casual relationship between the closed loop signals. This compensator is approximated by a PD controller which is used to tune the original PID controller. Results of implementing this approach presented on the experimental data of Damavand tokamak and are consistent with simulation outcome. PMID:26931852
Implementation of a quantum adiabatic algorithm for factorization on two qudits
Zobov, V. E. Ermilov, A. S.
2012-06-15
Implementation of an adiabatic quantum algorithm for factorization on two qudits with the number of levels d{sub 1} and d{sub 2} is considered. A method is proposed for obtaining a time-dependent effective Hamiltonian by means of a sequence of rotation operators that are selective with respect to the transitions between neighboring levels of a qudit. A sequence of RF magnetic field pulses is obtained, and a factorization of the numbers 35, 21, and 15 is numerically simulated on two quadrupole nuclei with spins 3/2 (d{sub 1} = 4) and 1 (d{sub 2} = 3).
Parallel implementation of the time-evolving block decimation algorithm for the Bose-Hubbard model
NASA Astrophysics Data System (ADS)
Urbanek, Miroslav; Soldán, Pavel
2016-02-01
A system of ultracold atoms in an optical lattice represents a powerful experimental setup for testing the fundamentals of quantum mechanics. While its microscopic interaction mechanisms are well understood, the system behavior for a moderate number of particles is difficult to simulate due to a high dimension of its many-body space. This article presents TEBDOL, a parallel implementation of the time-evolving block decimation (TEBD) algorithm that can efficiently simulate time evolution of a one-dimensional chain of atoms in optical lattices. We investigate the parallelization strategy and the strong and weak scaling with the number of processes.
Implementation of a new iterative learning control algorithm on real data.
Zamanian, Hamed; Koohi, Ardavan
2016-02-01
In this paper, a newly presented approach is proposed for closed-loop automatic tuning of a proportional integral derivative (PID) controller based on iterative learning control (ILC) algorithm. A modified ILC scheme iteratively changes the control signal by adjusting it. Once a satisfactory performance is achieved, a linear compensator is identified in the ILC behavior using casual relationship between the closed loop signals. This compensator is approximated by a PD controller which is used to tune the original PID controller. Results of implementing this approach presented on the experimental data of Damavand tokamak and are consistent with simulation outcome.
AVR microcontroller simulator for software implemented hardware fault tolerance algorithms research
NASA Astrophysics Data System (ADS)
Piotrowski, Adam; Tarnowski, Szymon; Napieralski, Andrzej
2008-01-01
Reliability of new, advanced electronic systems becomes a serious problem especially in places like accelerators and synchrotrons, where sophisticated digital devices operate closely to radiation sources. One of the possible solutions to harden the microprocessor-based system is a strict programming approach known as the Software Implemented Hardware Fault Tolerance. Unfortunately, in real environments it is not possible to perform precise and accurate tests of the new algorithms due to hardware limitation. This paper highlights the AVR-family microcontroller simulator project equipped with an appropriate monitoring and the SEU injection systems.
The Sustainable Technology Division has recently completed an implementation of the U.S. EPA's Waste Reduction (WAR) Algorithm that can be directly accessed from a Cape-Open compliant process modeling environment. The WAR Algorithm add-in can be used in AmsterChem's COFE (Cape-Op...
Implementation of QR-decomposition based on CORDIC for unitary MUSIC algorithm
NASA Astrophysics Data System (ADS)
Lounici, Merwan; Luan, Xiaoming; Saadi, Wahab
2013-07-01
The DOA (Direction Of Arrival) estimation with subspace methods such as MUSIC (MUltiple SIgnal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Technique) is based on an accurate estimation of the eigenvalues and eigenvectors of covariance matrix. QR decomposition is implemented with the Coordinate Rotation DIgital Computer (CORDIC) algorithm. QRD requires only additions and shifts [6], so it is faster and more regular than other methods. In this article the hardware architecture of an EVD (Eigen Value Decomposition) processor based on TSA (triangular systolic array) for QR decomposition is proposed. Using Xilinx System Generator (XSG), the design is implemented and the estimated logic device resource values are presented for different matrix sizes.
Implementation of a Point Algorithm for Real-Time Convex Optimization
NASA Technical Reports Server (NTRS)
Acikmese, Behcet; Motaghedi, Shui; Carson, John
2007-01-01
The primal-dual interior-point algorithm implemented in G-OPT is a relatively new and efficient way of solving convex optimization problems. Given a prescribed level of accuracy, the convergence to the optimal solution is guaranteed in a predetermined, finite number of iterations. G-OPT Version 1.0 is a flight software implementation written in C. Onboard application of the software enables autonomous, real-time guidance and control that explicitly incorporates mission constraints such as control authority (e.g. maximum thrust limits), hazard avoidance, and fuel limitations. This software can be used in planetary landing missions (Mars pinpoint landing and lunar landing), as well as in proximity operations around small celestial bodies (moons, asteroids, and comets). It also can be used in any spacecraft mission for thrust allocation in six-degrees-of-freedom control.
NASA Astrophysics Data System (ADS)
Botwicz, Jakub; Buciak, Piotr; Sapiecha, Piotr
2006-03-01
Intrusion Prevention Systems (IPSs) have become widely recognized as a powerful tool and an important element of IT security safeguards. The essential feature of network IPSs is searching through network packets and matching multiple strings, that are fingerprints of known attacks. String matching is highly resource consuming and also the most significant bottleneck of IPSs. In this article an extension of the classical Karp-Rabin algorithm and its implementation architectures were examined. The result is a software, which generates a source code of a string matching module in hardware description language, that could be easily used to create an Intrusion Prevention System implemented in reconfigurable hardware. The prepared module matches the complete set of Snort IPS signatures achieving throughput of over 2 Gbps on an Altera Stratix I1 evaluation board. The most significant advantage of the proposed architecture is that the update of the patterns database does not require reconfiguration of the circuitry.
NASA Technical Reports Server (NTRS)
Doxley, Charles A.
2016-01-01
In the current world of applications that use reconfigurable technology implemented on field programmable gate arrays (FPGAs), there is a need for flexible architectures that can grow as the systems evolve. A project has limited resources and a fixed set of requirements that development efforts are tasked to meet. Designers must develop robust solutions that practically meet the current customer demands and also have the ability to grow for future performance. This paper describes the development of a high speed serial data streaming algorithm that allows for transmission of multiple data channels over a single serial link. The technique has the ability to change to meet new applications developed for future design considerations. This approach uses the Xilinx Serial RapidIO LOGICORE Solution to implement a flexible infrastructure to meet the current project requirements with the ability to adapt future system designs.
Multi-GPU implementation of a VMAT treatment plan optimization algorithm
Tian, Zhen E-mail: Xun.Jia@UTSouthwestern.edu Folkerts, Michael; Tan, Jun; Jia, Xun E-mail: Xun.Jia@UTSouthwestern.edu Jiang, Steve B. E-mail: Xun.Jia@UTSouthwestern.edu; Peng, Fei
2015-06-15
Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, GPU’s relatively small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix in cases of, e.g., those with a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors’ group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example problem to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors’ method, the sparse DDC matrix is first stored on a CPU in coordinate list format (COO). On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of beamlet price, the first step in PP, is accomplished using multi-GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of PP and MP problems are implemented on CPU or a single GPU due to their modest problem scale and computational loads. Barzilai and Borwein algorithm with a subspace step scheme is adopted here to solve the MP problem. A head and neck (H and N) cancer case is
Implementation of the FDK algorithm for cone-beam CT on the cell broadband engine architecture
NASA Astrophysics Data System (ADS)
Scherl, Holger; Koerner, Mario; Hofmann, Hannes; Eckert, Wieland; Kowarschik, Markus; Hornegger, Joachim
2007-03-01
In most of today's commercially available cone-beam CT scanners, the well known FDK method is used for solving the 3D reconstruction task. The computational complexity of this algorithm prohibits its use for many medical applications without hardware acceleration. The brand-new Cell Broadband Engine Architecture (CBEA) with its high level of parallelism is a cost-efficient processor for performing the FDK reconstruction according to the medical requirements. The programming scheme, however, is quite different to any standard personal computer hardware. In this paper, we present an innovative implementation of the most time-consuming parts of the FDK algorithm: filtering and back-projection. We also explain the required transformations to parallelize the algorithm for the CBEA. Our software framework allows to compute the filtering and back-projection in parallel, making it possible to do an on-the-fly-reconstruction. The achieved results demonstrate that a complete FDK reconstruction is computed with the CBEA in less than seven seconds for a standard clinical scenario. Given the fact that scan times are usually much higher, we conclude that reconstruction is finished right after the end of data acquisition. This enables us to present the reconstructed volume to the physician in real-time, immediately after the last projection image has been acquired by the scanning device.
McNunn, Gabriel S; Bryden, Kenneth M
2013-01-01
Tarjan's algorithm schedules the solution of systems of equations by noting the coupling and grouping between the equations. Simulating complex systems, e.g., advanced power plants, aerodynamic systems, or the multi-scale design of components, requires the linkage of large groups of coupled models. Currently, this is handled manually in systems modeling packages. That is, the analyst explicitly defines both the method and solution sequence necessary to couple the models. In small systems of models and equations this works well. However, as additional detail is needed across systems and across scales, the number of models grows rapidly. This precludes the manual assembly of large systems of federated models, particularly in systems composed of high fidelity models. This paper examines extending Tarjan's algorithm from sets of equations to sets of models. The proposed implementation of the algorithm is demonstrated using a small one-dimensional system of federated models representing the heat transfer and thermal stress in a gas turbine blade with thermal barrier coating. Enabling the rapid assembly and substitution of different models permits the rapid turnaround needed to support the “what-if” kinds of questions that arise in engineering design.
Santi, Peter Angelo; Cutler, Theresa Elizabeth; Favalli, Andrea; Koehler, Katrina Elizabeth; Henzl, Vladimir; Henzlova, Daniela; Parker, Robert Francis; Croft, Stephen
2015-12-01
In order to improve the accuracy and capabilities of neutron multiplicity counting, additional quantifiable information is needed in order to address the assumptions that are present in the point model. Extracting and utilizing higher order moments (Quads and Pents) from the neutron pulse train represents the most direct way of extracting additional information from the measurement data to allow for an improved determination of the physical properties of the item of interest. The extraction of higher order moments from a neutron pulse train required the development of advanced dead time correction algorithms which could correct for dead time effects in all of the measurement moments in a self-consistent manner. In addition, advanced analysis algorithms have been developed to address specific assumptions that are made within the current analysis model, namely that all neutrons are created at a single point within the item of interest, and that all neutrons that are produced within an item are created with the same energy distribution. This report will discuss the current status of implementation and initial testing of the advanced dead time correction and analysis algorithms that have been developed in an attempt to utilize higher order moments to improve the capabilities of correlated neutron measurement techniques.
Implementing Quantum Algorithms with Modular Gates in a Trapped Ion Chain
NASA Astrophysics Data System (ADS)
Figgatt, Caroline; Debnath, Shantanu; Linke, Norbert; Landsman, Kevin; Wright, Ken; Monroe, Chris
2016-05-01
We present experimental results on quantum algorithms performed using fully modular one- and two-qubit gates in a linear chain of 5 Yb + ions. This is accomplished through arbitrary qubit addressing and manipulation from stimulated Raman transitions driven by a beat note between counter-propagating beams from a pulsed laser. The Raman beam pairs consist of one global beam and a set of counter-propagating individual addressing beams, one for each ion. This provides arbitrary single-qubit rotations as well as arbitrary selection of ion pairs for a fully-connected system of two-qubit modular XX-entangling gates implemented using a pulse-segmentation scheme. We execute controlled-NOT gates with an average fidelity of 97.0% for all 10 possible pairs. Programming arbitrary sequences of gates allows us to construct any quantum algorithm, making this system a universal quantum computer. As an example, we present experimental results for the Bernstein-Vazirani algorithm using 4 control qubits and 1 ancilla, performed with concatenated gates that can be reconfigured to construct all 16 possible oracles, and obtain a process fidelity of 90.3%. This work is supported by the ARO with funding from the IARPA MQCO program and the AFOSR MURI on Quantum Measurement and Verification.
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader
2004-01-01
One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other image processing representative algorithm, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a night project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing technologies (RC) such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low cost compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithm, such as FFT2, image folding and phase diversity for the NASA Solar Viewing Interferometer Prototype (SVIP) using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with the control algorithms pre-hardware optimization, or the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on-board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study greenhouse gases CO2, C2H, H2O, O3, O2, N2O from Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Petrick, David J.; Day, John H. (Technical Monitor)
2001-01-01
Spacecraft telemetry rates have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image processing application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms and re-configurable computing hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processing (DSP). It has been shown in [1] and [2] that this configuration can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft. However, since this technology is still maturing, intensive pre-hardware steps are necessary to achieve the benefits of hardware implementation. This paper describes these steps for the GOES-8 application, a software project developed using Interactive Data Language (IDL) (Trademark of Research Systems, Inc.) on a Workstation/UNIX platform. The solution involves converting the application to a PC/Windows/RC platform, selected mainly by the availability of low cost, adaptable high-speed RC hardware. In order for the hybrid system to run, the IDL software was modified to account for platform differences. It was interesting to examine the gains and losses in performance on the new platform, as well as unexpected observations before implementing hardware. After substantial pre-hardware optimization steps, the necessity of hardware implementation for bottleneck code in the PC environment became evident and solvable beginning with the methodology described in [1], [2], and implementing a novel methodology for this specific application [6]. The PC-RC interface bandwidth problem for the
NASA Astrophysics Data System (ADS)
Paz, Abel; Plaza, Antonio
2010-08-01
Automatic target and anomaly detection are considered very important tasks for hyperspectral data exploitation. These techniques are now routinely applied in many application domains, including defence and intelligence, public safety, precision agriculture, geology, or forestry. Many of these applications require timely responses for swift decisions which depend upon high computing performance of algorithm analysis. However, with the recent explosion in the amount and dimensionality of hyperspectral imagery, this problem calls for the incorporation of parallel computing techniques. In the past, clusters of computers have offered an attractive solution for fast anomaly and target detection in hyperspectral data sets already transmitted to Earth. However, these systems are expensive and difficult to adapt to on-board data processing scenarios, in which low-weight and low-power integrated components are essential to reduce mission payload and obtain analysis results in (near) real-time, i.e., at the same time as the data is collected by the sensor. An exciting new development in the field of commodity computing is the emergence of commodity graphics processing units (GPUs), which can now bridge the gap towards on-board processing of remotely sensed hyperspectral data. In this paper, we describe several new GPU-based implementations of target and anomaly detection algorithms for hyperspectral data exploitation. The parallel algorithms are implemented on latest-generation Tesla C1060 GPU architectures, and quantitatively evaluated using hyperspectral data collected by NASA's AVIRIS system over the World Trade Center (WTC) in New York, five days after the terrorist attacks that collapsed the two main towers in the WTC complex.
NASA Astrophysics Data System (ADS)
Bernabe, Sergio; Igual, Francisco D.; Botella, Guillermo; Prieto-Matias, Manuel; Plaza, Antonio
2015-10-01
In the last decade, the issue of endmember variability has received considerable attention, particularly when each pixel is modeled as a linear combination of endmembers or pure materials. As a result, several models and algorithms have been developed for considering the effect of endmember variability in spectral unmixing and possibly include multiple endmembers in the spectral unmixing stage. One of the most popular approach for this purpose is the multiple endmember spectral mixture analysis (MESMA) algorithm. The procedure executed by MESMA can be summarized as follows: (i) First, a standard linear spectral unmixing (LSU) or fully constrained linear spectral unmixing (FCLSU) algorithm is run in an iterative fashion; (ii) Then, we use different endmember combinations, randomly selected from a spectral library, to decompose each mixed pixel; (iii) Finally, the model with the best fit, i.e., with the lowest root mean square error (RMSE) in the reconstruction of the original pixel, is adopted. However, this procedure can be computationally very expensive due to the fact that several endmember combinations need to be tested and several abundance estimation steps need to be conducted, a fact that compromises the use of MESMA in applications under real-time constraints. In this paper we develop (for the first time in the literature) an efficient implementation of MESMA on different platforms using OpenCL, an open standard for parallel programing on heterogeneous systems. Our experiments have been conducted using a simulated data set and the clMAGMA mathematical library. This kind of implementations with the same descriptive language on different architectures are very important in order to actually calibrate the possibility of using heterogeneous platforms for efficient hyperspectral imaging processing in real remote sensing missions.
A time-efficient algorithm for implementing the Catmull-Clark subdivision method
NASA Astrophysics Data System (ADS)
Ioannou, G.; Savva, A.; Stylianou, V.
2015-10-01
Splines are the most popular methods in Figure Modeling and CAGD (Computer Aided Geometric Design) in generating smooth surfaces from a number of control points. The control points define the shape of a figure and splines calculate the required number of points which when displayed on a computer screen the result is a smooth surface. However, spline methods are based on a rectangular topological structure of points, i.e., a two-dimensional table of vertices, and thus cannot generate complex figures, such as the human and animal bodies that their complex structure does not allow them to be defined by a regular rectangular grid. On the other hand surface subdivision methods, which are derived by splines, generate surfaces which are defined by an arbitrary topology of control points. This is the reason that during the last fifteen years subdivision methods have taken the lead over regular spline methods in all areas of modeling in both industry and research. The cost of executing computer software developed to read control points and calculate the surface is run-time, due to the fact that the surface-structure required for handling arbitrary topological grids is very complicate. There are many software programs that have been developed related to the implementation of subdivision surfaces however, not many algorithms are documented in the literature, to support developers for writing efficient code. This paper aims to assist programmers by presenting a time-efficient algorithm for implementing subdivision splines. The Catmull-Clark which is the most popular of the subdivision methods has been employed to illustrate the algorithm.
2014-01-01
Background The huge quantity of data produced in Biomedical research needs sophisticated algorithmic methodologies for its storage, analysis, and processing. High Performance Computing (HPC) appears as a magic bullet in this challenge. However, several hard to solve parallelization and load balancing problems arise in this context. Here we discuss the HPC-oriented implementation of a general purpose learning algorithm, originally conceived for DNA analysis and recently extended to treat uncertainty on data (U-BRAIN). The U-BRAIN algorithm is a learning algorithm that finds a Boolean formula in disjunctive normal form (DNF), of approximately minimum complexity, that is consistent with a set of data (instances) which may have missing bits. The conjunctive terms of the formula are computed in an iterative way by identifying, from the given data, a family of sets of conditions that must be satisfied by all the positive instances and violated by all the negative ones; such conditions allow the computation of a set of coefficients (relevances) for each attribute (literal), that form a probability distribution, allowing the selection of the term literals. The great versatility that characterizes it, makes U-BRAIN applicable in many of the fields in which there are data to be analyzed. However the memory and the execution time required by the running are of O(n3) and of O(n5) order, respectively, and so, the algorithm is unaffordable for huge data sets. Results We find mathematical and programming solutions able to lead us towards the implementation of the algorithm U-BRAIN on parallel computers. First we give a Dynamic Programming model of the U-BRAIN algorithm, then we minimize the representation of the relevances. When the data are of great size we are forced to use the mass memory, and depending on where the data are actually stored, the access times can be quite different. According to the evaluation of algorithmic efficiency based on the Disk Model, in order to
Dongarra, J.J.; Hewitt, T.
1985-08-01
This note describes some experiments on simple, dense linear algebra algorithms. These experiments show that the CRAY X-MP is capable of small-grain multitasking arising from standard implementations of LU and Cholesky decomposition. The implementation described here provides the ''fastest'' execution rate for LU decomposition, 718 MFLOPS for a matrix of order 1000.
FPGA-based implementation for steganalysis: a JPEG-compatibility algorithm
NASA Astrophysics Data System (ADS)
Gutierrez-Fernandez, E.; Portela-García, M.; Lopez-Ongil, C.; Garcia-Valderas, M.
2013-05-01
Steganalysis is a process to detect hidden data in cover documents, like digital images, videos, audio files, etc. This is the inverse process of steganography, which is the used method to hide secret messages. The widely use of computers and network technologies make digital files very easy-to-use means for storing secret data or transmitting secret messages through the Internet. Depending on the cover medium used to embed the data, there are different steganalysis methods. In case of images, many of the steganalysis and steganographic methods are focused on JPEG image formats, since JPEG is one of the most common formats. One of the main important handicaps of steganalysis methods is the processing speed, since it is usually necessary to process huge amount of data or it can be necessary to process the on-going internet traffic in real-time. In this paper, a JPEG steganalysis system is implemented in an FPGA in order to speed-up the detection process with respect to software-based implementations and to increase the throughput. In particular, the implemented method is the JPEG-compatibility detection algorithm that is based on the fact that when a JPEG image is modified, the resulting image is incompatible with the JPEG compression process.
Ma, Yong; Li, Juan; Lu, Bin
2016-02-01
In order to monitor the emotional state changes of audience on real-time and to adjust the music playlist, we proposed an algorithm framework of an electroencephalogram (EEG) driven personalized affective music recommendation system based on the portable dry electrode shown in this paper. We also further finished a preliminary implementation on the Android platform. We used a two-dimensional emotional model of arousal and valence as the reference, and mapped the EEG data and the corresponding seed songs to the emotional coordinate quadrant in order to establish the matching relationship. Then, Mel frequency cepstrum coefficients were applied to evaluate the similarity between the seed songs and the songs in music library. In the end, during the music playing state, we used the EEG data to identify the audience's emotional state, and played and adjusted the corresponding song playlist based on the established matching relationship. PMID:27382737
Dragonfly: an implementation of the expand–maximize–compress algorithm for single-particle imaging1
Ayyer, Kartik; Lan, Ti-Yen; Elser, Veit; Loh, N. Duane
2016-01-01
Single-particle imaging (SPI) with X-ray free-electron lasers has the potential to change fundamentally how biomacromolecules are imaged. The structure would be derived from millions of diffraction patterns, each from a different copy of the macromolecule before it is torn apart by radiation damage. The challenges posed by the resultant data stream are staggering: millions of incomplete, noisy and un-oriented patterns have to be computationally assembled into a three-dimensional intensity map and then phase reconstructed. In this paper, the Dragonfly software package is described, based on a parallel implementation of the expand–maximize–compress reconstruction algorithm that is well suited for this task. Auxiliary modules to simulate SPI data streams are also included to assess the feasibility of proposed SPI experiments at the Linac Coherent Light Source, Stanford, California, USA. PMID:27504078
Recursive multiport schemes for implementing quantum algorithms with photonic integrated circuits
NASA Astrophysics Data System (ADS)
Tabia, Gelo Noel M.
2016-01-01
We present recursive multiport schemes for implementing quantum Fourier transforms and the inversion step in Grover's algorithm on an integrated linear optics device. In particular, each scheme shows how to execute a quantum operation on 2 d modes using a pair of circuits for the same operation on d modes. The circuits operate on path-encoded qudits and realize d -dimensional unitary transformations on these states using linear optical networks with O (d2) optical elements. To evaluate the schemes against realistic errors, we ran simulations of proof-of-principle experiments using a simple fabrication model of silicon-based photonic integrated devices that employ directional couplers and thermo-optic modulators for beam splitters and phase shifters, respectively. We find that high-fidelity performance is achievable with our multiport circuits for 2-qubit and 3-qubit quantum Fourier transforms, and for quantum search on four-item and eight-item databases.
Ma, Yong; Li, Juan; Lu, Bin
2016-02-01
In order to monitor the emotional state changes of audience on real-time and to adjust the music playlist, we proposed an algorithm framework of an electroencephalogram (EEG) driven personalized affective music recommendation system based on the portable dry electrode shown in this paper. We also further finished a preliminary implementation on the Android platform. We used a two-dimensional emotional model of arousal and valence as the reference, and mapped the EEG data and the corresponding seed songs to the emotional coordinate quadrant in order to establish the matching relationship. Then, Mel frequency cepstrum coefficients were applied to evaluate the similarity between the seed songs and the songs in music library. In the end, during the music playing state, we used the EEG data to identify the audience's emotional state, and played and adjusted the corresponding song playlist based on the established matching relationship.
FPGA implementation of dynamic channel assignment algorithm for cognitive wireless sensor networks
NASA Astrophysics Data System (ADS)
Martínez, Daniela M.; Andrade, Ángel G.
2015-07-01
The reliability of wireless sensor networks (WSNs) in industrial applications can be thwarted due to multipath fading, noise generated by industrial equipment or heavy machinery and particularly by the interference generated from other wireless devices operating in the same spectrum band. Recently, cognitive WSNs (CWSNs) were proposed to improve the performance and reliability of WSNs in highly interfered and noisy environments. In this class of WSN, the nodes are spectrum aware, that is, they monitor the radio spectrum to find channels available for data transmission and dynamically assign and reassign nodes to low-interference condition channels. In this work, we present the implementation of a channel assignment algorithm in a field-programmable gate array, which dynamically assigns channels to sensor nodes based on the interference and noise levels experimented in the network. From the results obtained from the performance evaluation of the CWSN when the channel assignment algorithm is considered, it is possible to identify how many channels should be available in the network in order to achieve a desired percentage of successful transmissions, subject to constraints on the signal-to-interference plus noise ratio on each active link.
NASA Astrophysics Data System (ADS)
Rajaram, Vignesh; Subramanian, Shankar C.
2016-07-01
An important aspect from the perspective of operational safety of heavy road vehicles is the detection and avoidance of collisions, particularly at high speeds. The development of a collision avoidance system is the overall focus of the research presented in this paper. The collision avoidance algorithm was developed using a sliding mode controller (SMC) and compared to one developed using linear full state feedback in terms of performance and controller effort. Important dynamic characteristics such as load transfer during braking, tyre-road interaction, dynamic brake force distribution and pneumatic brake system response were considered. The effect of aerodynamic drag on the controller performance was also studied. The developed control algorithms have been implemented on a Hardware-in-Loop experimental set-up equipped with the vehicle dynamic simulation software, IPG/TruckMaker®. The evaluation has been performed for realistic traffic scenarios with different loading and road conditions. The Hardware-in-Loop experimental results showed that the SMC and full state feedback controller were able to prevent the collision. However, when the discrepancies in the form of parametric variations were included, the SMC provided better results in terms of reduced stopping distance and lower controller effort compared to the full state feedback controller.
NASA Astrophysics Data System (ADS)
Bernabé, Sergio; Martin, Gabriel; Botella, Guillermo; Prieto-Matias, Manuel; Plaza, Antonio
2016-04-01
In the last years, hyperspectral analysis have been applied in many remote sensing applications. In fact, hyperspectral unmixing has been a challenging task in hyperspectral data exploitation. This process consists of three stages: (i) estimation of the number of pure spectral signatures or endmembers, (ii) automatic identification of the estimated endmembers, and (iii) estimation of the fractional abundance of each endmember in each pixel of the scene. However, unmixing algorithms can be computationally very expensive, a fact that compromises their use in applications under real-time constraints. In recent years, several techniques have been proposed to solve the aforementioned problem but until now, most works have focused on the second and third stages. The execution cost of the first stage is usually lower than the other stages. Indeed, it can be optional if we known a priori this estimation. However, its acceleration on parallel architectures is still an interesting and open problem. In this paper we have addressed this issue focusing on the GENE algorithm, a promising geometry-based proposal introduced in.1 We have evaluated our parallel implementation in terms of both accuracy and computational performance through Monte Carlo simulations for real and synthetic data experiments. Performance results on a modern GPU shows satisfactory 16x speedup factors, which allow us to expect that this method could meet real-time requirements on a fully operational unmixing chain.
Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.
Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania
2015-01-01
This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes. PMID:26510289
NASA Astrophysics Data System (ADS)
Yu, Biin; Peng, Xiang; Tian, Jindong; Niu, Hanben
2006-01-01
A cascaded iterative angular spectrum approach (CIASA) based on the methodology of virtual optics is presented for optical security applications. The technique encodes the target image into two different phase only masks (POM) using a concept of free-space angular spectrum propagation. The two phase-masks are designed and located in any two arbitrary planes interrelated through the free space propagation domain in order to implement the optical encryption or authenticity verification. And both phase masks can serve as enciphered texts. Compared with previous methods, the proposed algorithm employs an improved searching strategy: modifying the phase-distributions of both masks synchronously as well as enlarging the searching space. And with such a scheme, we make use of a high performance floating-point Digital Signal Processor (DSP) to accomplish a design of multiple-locks and multiple-keys optical image encryption system. An evaluation of the system performance is made and it is shown that the algorithm results in much faster convergence and better image quality for the recovered image. And two masks and system parameters can be used to design keys for image encryption, therefore the decrypted image can be obtained only when all these keys are under authorization. This key-assignment strategy may reduce the risk of being intruded and show a high security level. These characters may introduce a high level security that makes the encrypted image more difficult to be decrypted by an unauthorized person.
Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.
Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania
2015-01-01
This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.
Real-time implementation of camera positioning algorithm based on FPGA & SOPC
NASA Astrophysics Data System (ADS)
Yang, Mingcao; Qiu, Yuehong
2014-09-01
In recent years, with the development of positioning algorithm and FPGA, to achieve the camera positioning based on real-time implementation, rapidity, accuracy of FPGA has become a possibility by way of in-depth study of embedded hardware and dual camera positioning system, this thesis set up an infrared optical positioning system based on FPGA and SOPC system, which enables real-time positioning to mark points in space. Thesis completion include: (1) uses a CMOS sensor to extract the pixel of three objects with total feet, implemented through FPGA hardware driver, visible-light LED, used here as the target point of the instrument. (2) prior to extraction of the feature point coordinates, the image needs to be filtered to avoid affecting the physical properties of the system to bring the platform, where the median filtering. (3) Coordinate signs point to FPGA hardware circuit extraction, a new iterative threshold selection method for segmentation of images. Binary image is then segmented image tags, which calculates the coordinates of the feature points of the needle through the center of gravity method. (4) direct linear transformation (DLT) and extreme constraints method is applied to three-dimensional reconstruction of the plane array CMOS system space coordinates. using SOPC system on a chip here, taking advantage of dual-core computing systems, which let match and coordinate operations separately, thus increase processing speed.
Implementing and Evaluating Multithreaded Triad Census Algorithms on the Cray XMT
Chin, George; Marquez, Andres; Choudhury, Sutanay; Maschhoff, Kristyn
2009-05-29
Commonly represented as directed graphs, social networks depict relationships and behaviors among social entities such as people, groups, and organizations. Social network analysis denotes a class of mathematical and statistical methods designed to study and measure social networks. Beyond sociolo-gy, social network analysis methods are being applied to other types of data in other domains such as bioinformatics, computer networks, national security, and economics. For particular problems, the size of a social network can grow to millions of nodes and tens of millions of edges or more. In such cases, researchers could benefit from the application of social network analysis algorithms on high-performance architectures and systems. The Cray XMT is a third generation multithreaded system based on the Cray XT-3/4 platform. Like most other multithreaded architectures, the Cray XMT is designed to tolerate memory access latencies by switching context between threads. The processors maintain multiple threads of execution and util-ize hardware-based context switching to overlap the memory latency incurred by any thread with the computations from other threads. Due to its memory latency tolerance, the Cray XMT has the poten-tial of significantly improving the execution speed of irregular data-intensive applications such as those found in social network analysis. In this paper, we describe our experiences in developing and optimizing three implementations of a social network analysis method known as triadic analysis to execute on the Cray XMT. The three im-plementations possess different execution complexities, qualities, and characteristics. We evaluate how the various attributes of the codes affect their performance on the Cray XMT. We also explore the effects of different compiler options and execution strategies on the different triadic analysis im-plementations and identify general XMT programming issues and lessons learned.
NASA Astrophysics Data System (ADS)
Hadia, Sarman K.; Thakker, R. A.; Bhatt, Kirit R.
2016-05-01
The study proposes an application of evolutionary algorithms, specifically an artificial bee colony (ABC), variant ABC and particle swarm optimisation (PSO), to extract the parameters of metal oxide semiconductor field effect transistor (MOSFET) model. These algorithms are applied for the MOSFET parameter extraction problem using a Pennsylvania surface potential model. MOSFET parameter extraction procedures involve reducing the error between measured and modelled data. This study shows that ABC algorithm optimises the parameter values based on intelligent activities of honey bee swarms. Some modifications have also been applied to the basic ABC algorithm. Particle swarm optimisation is a population-based stochastic optimisation method that is based on bird flocking activities. The performances of these algorithms are compared with respect to the quality of the solutions. The simulation results of this study show that the PSO algorithm performs better than the variant ABC and basic ABC algorithm for the parameter extraction of the MOSFET model; also the implementation of the ABC algorithm is shown to be simpler than that of the PSO algorithm.
NASA Astrophysics Data System (ADS)
Vela, Adan Ernesto
2011-12-01
From 2010 to 2030, the number of instrument flight rules aircraft operations handled by Federal Aviation Administration en route traffic centers is predicted to increase from approximately 39 million flights to 64 million flights. The projected growth in air transportation demand is likely to result in traffic levels that exceed the abilities of the unaided air traffic controller in managing, separating, and providing services to aircraft. Consequently, the Federal Aviation Administration, and other air navigation service providers around the world, are making several efforts to improve the capacity and throughput of existing airspaces. Ultimately, the stated goal of the Federal Aviation Administration is to triple the available capacity of the National Airspace System by 2025. In an effort to satisfy air traffic demand through the increase of airspace capacity, air navigation service providers are considering the inclusion of advisory conflict-detection and resolution systems. In a human-in-the-loop framework, advisory conflict-detection and resolution decision-support tools identify potential conflicts and propose resolution commands for the air traffic controller to verify and issue to aircraft. A number of researchers and air navigation service providers hypothesize that the inclusion of combined conflict-detection and resolution tools into air traffic control systems will reduce or transform controller workload and enable the required increases in airspace capacity. In an effort to understand the potential workload implications of introducing advisory conflict-detection and resolution tools, this thesis provides a detailed study of the conflict event process and the implementation of conflict-detection and resolution algorithms. Specifically, the research presented here examines a metric of controller taskload: how many resolution commands an air traffic controller issues under the guidance of a conflict-detection and resolution decision-support tool. The goal
NASA Astrophysics Data System (ADS)
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G.
2011-07-01
We describe and evaluate a fast implementation of a classical block-matching motion estimation algorithm for multiple graphical processing units (GPUs) using the compute unified device architecture computing engine. The implemented block-matching algorithm uses summed absolute difference error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation, we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and noninteger search grids. The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a noninteger search grid. The additional speedup for a noninteger search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable. In addition, we compared the execution time of the proposed FS GPU implementation with two existing, highly optimized nonfull grid search CPU-based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and simplified unsymmetrical multi-hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation. We also demonstrated that for an image sequence of 720 × 480 pixels in resolution commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards.
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G
2011-07-01
In this paper we describe and evaluate a fast implementation of a classical block matching motion estimation algorithm for multiple Graphical Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) computing engine. The implemented block matching algorithm (BMA) uses summed absolute difference (SAD) error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and non-integer search grids.The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a non-integer search grid. The additional speedup for non-integer search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable.In addition we compared execution time of the proposed FS GPU implementation with two existing, highly optimized non-full grid search CPU based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and Simplified Unsymmetrical multi-Hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation.We also demonstrated that for an image sequence of 720×480 pixels in resolution, commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards.
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G
2011-07-01
In this paper we describe and evaluate a fast implementation of a classical block matching motion estimation algorithm for multiple Graphical Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) computing engine. The implemented block matching algorithm (BMA) uses summed absolute difference (SAD) error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and non-integer search grids.The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a non-integer search grid. The additional speedup for non-integer search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable.In addition we compared execution time of the proposed FS GPU implementation with two existing, highly optimized non-full grid search CPU based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and Simplified Unsymmetrical multi-Hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation.We also demonstrated that for an image sequence of 720×480 pixels in resolution, commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards. PMID:22347787
Position Accuracy Improvement by Implementing the DGNSS-CP Algorithm in Smartphones.
Yoon, Donghwan; Kee, Changdon; Seo, Jiwon; Park, Byungwoon
2016-01-01
The position accuracy of Global Navigation Satellite System (GNSS) modules is one of the most significant factors in determining the feasibility of new location-based services for smartphones. Considering the structure of current smartphones, it is impossible to apply the ordinary range-domain Differential GNSS (DGNSS) method. Therefore, this paper describes and applies a DGNSS-correction projection method to a commercial smartphone. First, the local line-of-sight unit vector is calculated using the elevation and azimuth angle provided in the position-related output of Android's LocationManager, and this is transformed to Earth-centered, Earth-fixed coordinates for use. To achieve position-domain correction for satellite systems other than GPS, such as GLONASS and BeiDou, the relevant line-of-sight unit vectors are used to construct an observation matrix suitable for multiple constellations. The results of static and dynamic tests show that the standalone GNSS accuracy is improved by about 30%-60%, thereby reducing the existing error of 3-4 m to just 1 m. The proposed algorithm enables the position error to be directly corrected via software, without the need to alter the hardware and infrastructure of the smartphone. This method of implementation and the subsequent improvement in performance are expected to be highly effective to portability and cost saving. PMID:27322284
Position Accuracy Improvement by Implementing the DGNSS-CP Algorithm in Smartphones.
Yoon, Donghwan; Kee, Changdon; Seo, Jiwon; Park, Byungwoon
2016-06-18
The position accuracy of Global Navigation Satellite System (GNSS) modules is one of the most significant factors in determining the feasibility of new location-based services for smartphones. Considering the structure of current smartphones, it is impossible to apply the ordinary range-domain Differential GNSS (DGNSS) method. Therefore, this paper describes and applies a DGNSS-correction projection method to a commercial smartphone. First, the local line-of-sight unit vector is calculated using the elevation and azimuth angle provided in the position-related output of Android's LocationManager, and this is transformed to Earth-centered, Earth-fixed coordinates for use. To achieve position-domain correction for satellite systems other than GPS, such as GLONASS and BeiDou, the relevant line-of-sight unit vectors are used to construct an observation matrix suitable for multiple constellations. The results of static and dynamic tests show that the standalone GNSS accuracy is improved by about 30%-60%, thereby reducing the existing error of 3-4 m to just 1 m. The proposed algorithm enables the position error to be directly corrected via software, without the need to alter the hardware and infrastructure of the smartphone. This method of implementation and the subsequent improvement in performance are expected to be highly effective to portability and cost saving.
NASA Astrophysics Data System (ADS)
Pietrzyk, Mariusz W.; Rannou, Didier; Brennan, Patrick C.
2012-02-01
This pilot study examines the effect of a novel decision support system in medical image interpretation. This system is based on combining image spatial frequency properties and eye-tracking data in order to recognize over and under calling errors. Thus, before it can be implemented as a detection aided schema, training is required during which SVMbased algorithm learns to recognize FP from all reported outcomes, and, FN from all unreported prolonged dwelled regions. Eight radiologists inspected 50 PA chest radiographs with the specific task of identifying lung nodules. Twentyfive cases contained CT proven subtle malignant lesions (5-20mm), but prevalence was not known by the subjects, who took part in two sequential reading sessions, the second, without and with support system feedback. MCMR ROC DBM and JAFROC analyses were conducted and demonstrated significantly higher scores following feedback with p values of 0.04, and 0.03 respectively, highlighting significant improvements in radiology performance once feedback was used. This positive effect on radiologists' performance might have important implications for future CAD-system development.
Position Accuracy Improvement by Implementing the DGNSS-CP Algorithm in Smartphones
Yoon, Donghwan; Kee, Changdon; Seo, Jiwon; Park, Byungwoon
2016-01-01
The position accuracy of Global Navigation Satellite System (GNSS) modules is one of the most significant factors in determining the feasibility of new location-based services for smartphones. Considering the structure of current smartphones, it is impossible to apply the ordinary range-domain Differential GNSS (DGNSS) method. Therefore, this paper describes and applies a DGNSS-correction projection method to a commercial smartphone. First, the local line-of-sight unit vector is calculated using the elevation and azimuth angle provided in the position-related output of Android’s LocationManager, and this is transformed to Earth-centered, Earth-fixed coordinates for use. To achieve position-domain correction for satellite systems other than GPS, such as GLONASS and BeiDou, the relevant line-of-sight unit vectors are used to construct an observation matrix suitable for multiple constellations. The results of static and dynamic tests show that the standalone GNSS accuracy is improved by about 30%–60%, thereby reducing the existing error of 3–4 m to just 1 m. The proposed algorithm enables the position error to be directly corrected via software, without the need to alter the hardware and infrastructure of the smartphone. This method of implementation and the subsequent improvement in performance are expected to be highly effective to portability and cost saving. PMID:27322284
Liu, Ju-Chi; Chou, Hung-Chyun; Chen, Chien-Hsiu; Lin, Yi-Tseng
2016-01-01
A high efficient time-shift correlation algorithm was proposed to deal with the peak time uncertainty of P300 evoked potential for a P300-based brain-computer interface (BCI). The time-shift correlation series data were collected as the input nodes of an artificial neural network (ANN), and the classification of four LED visual stimuli was selected as the output node. Two operating modes, including fast-recognition mode (FM) and accuracy-recognition mode (AM), were realized. The proposed BCI system was implemented on an embedded system for commanding an adult-size humanoid robot to evaluate the performance from investigating the ground truth trajectories of the humanoid robot. When the humanoid robot walked in a spacious area, the FM was used to control the robot with a higher information transfer rate (ITR). When the robot walked in a crowded area, the AM was used for high accuracy of recognition to reduce the risk of collision. The experimental results showed that, in 100 trials, the accuracy rate of FM was 87.8% and the average ITR was 52.73 bits/min. In addition, the accuracy rate was improved to 92% for the AM, and the average ITR decreased to 31.27 bits/min. due to strict recognition constraints. PMID:27579033
Liu, Ju-Chi; Chou, Hung-Chyun; Chen, Chien-Hsiu; Lin, Yi-Tseng; Kuo, Chung-Hsien
2016-01-01
A high efficient time-shift correlation algorithm was proposed to deal with the peak time uncertainty of P300 evoked potential for a P300-based brain-computer interface (BCI). The time-shift correlation series data were collected as the input nodes of an artificial neural network (ANN), and the classification of four LED visual stimuli was selected as the output node. Two operating modes, including fast-recognition mode (FM) and accuracy-recognition mode (AM), were realized. The proposed BCI system was implemented on an embedded system for commanding an adult-size humanoid robot to evaluate the performance from investigating the ground truth trajectories of the humanoid robot. When the humanoid robot walked in a spacious area, the FM was used to control the robot with a higher information transfer rate (ITR). When the robot walked in a crowded area, the AM was used for high accuracy of recognition to reduce the risk of collision. The experimental results showed that, in 100 trials, the accuracy rate of FM was 87.8% and the average ITR was 52.73 bits/min. In addition, the accuracy rate was improved to 92% for the AM, and the average ITR decreased to 31.27 bits/min. due to strict recognition constraints.
Liu, Ju-Chi; Chou, Hung-Chyun; Chen, Chien-Hsiu; Lin, Yi-Tseng; Kuo, Chung-Hsien
2016-01-01
A high efficient time-shift correlation algorithm was proposed to deal with the peak time uncertainty of P300 evoked potential for a P300-based brain-computer interface (BCI). The time-shift correlation series data were collected as the input nodes of an artificial neural network (ANN), and the classification of four LED visual stimuli was selected as the output node. Two operating modes, including fast-recognition mode (FM) and accuracy-recognition mode (AM), were realized. The proposed BCI system was implemented on an embedded system for commanding an adult-size humanoid robot to evaluate the performance from investigating the ground truth trajectories of the humanoid robot. When the humanoid robot walked in a spacious area, the FM was used to control the robot with a higher information transfer rate (ITR). When the robot walked in a crowded area, the AM was used for high accuracy of recognition to reduce the risk of collision. The experimental results showed that, in 100 trials, the accuracy rate of FM was 87.8% and the average ITR was 52.73 bits/min. In addition, the accuracy rate was improved to 92% for the AM, and the average ITR decreased to 31.27 bits/min. due to strict recognition constraints. PMID:27579033
Albert, Carlo; Vogel, Sören
2016-01-01
The General Unified Threshold model of Survival (GUTS) provides a consistent mathematical framework for survival analysis. However, the calibration of GUTS models is computationally challenging. We present a novel algorithm and its fast implementation in our R package, GUTS, that help to overcome these challenges. We show a step-by-step application example consisting of model calibration and uncertainty estimation as well as making probabilistic predictions and validating the model with new data. Using self-defined wrapper functions, we show how to produce informative text printouts and plots without effort, for the inexperienced as well as the advanced user. The complete ready-to-run script is available as supplemental material. We expect that our software facilitates novel re-analysis of existing survival data as well as asking new research questions in a wide range of sciences. In particular the ability to quickly quantify stressor thresholds in conjunction with dynamic compensating processes, and their uncertainty, is an improvement that complements current survival analysis methods. PMID:27340823
Gong, Zongyi; Klanian, Kelly; Patel, Tushita; Sullivan, Olivia; Williams, Mark B.
2012-01-01
Purpose: We are developing a dual modality tomosynthesis breast scanner in which x-ray transmission tomosynthesis and gamma emission tomosynthesis are performed sequentially with the breast in a common configuration. In both modalities projection data are obtained over an angular range of less than 180° from one side of the mildly compressed breast resulting in incomplete and asymmetrical sampling. The objective of this work is to implement and evaluate a maximum likelihood expectation maximization (MLEM) reconstruction algorithm for gamma emission breast tomosynthesis (GEBT). Methods: A combination of Monte Carlo simulations and phantom experiments was used to test the MLEM algorithm for GEBT. The algorithm utilizes prior information obtained from the x-ray breast tomosynthesis scan to partially compensate for the incomplete angular sampling and to perform attenuation correction (AC) and resolution recovery (RR). System spatial resolution, image artifacts, lesion contrast, and signal to noise ratio (SNR) were measured as image quality figures of merit. To test the robustness of the reconstruction algorithm and to assess the relative impacts of correction techniques with changing angular range, simulations and experiments were both performed using acquisition angular ranges of 45°, 90° and 135°. For comparison, a single projection containing the same total number of counts as the full GEBT scan was also obtained to simulate planar breast scintigraphy. Results: The in-plane spatial resolution of the reconstructed GEBT images is independent of source position within the reconstructed volume and independent of acquisition angular range. For 45° acquisitions, spatial resolution in the depth dimension (the direction of breast compression) is degraded with increasing source depth (increasing distance from the collimator surface). Increasing the acquisition angular range from 45° to 135° both greatly reduces this depth dependence and improves the average depth
Schneider, Nadine; Sayle, Roger A; Landrum, Gregory A
2015-10-26
Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated in any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI.
NASA Astrophysics Data System (ADS)
Grossu, I. V.; El-Shamali, S. A.
2013-07-01
This work presents a new version of a Visual Basic 6.0 application for estimating the fractal dimension of images and 4D objects (Grossu et al. 2013 [1]). Following our attempt of investigating artistic works by fractal analysis of craquelure, we encountered important difficulties in filtering real information from noise. In this context, trying to avoid a sharp delimitation of "black" and "white" pixels, we implemented a fuzzy box-counting algorithm. Hyper-Fractal Analysis v04 example of use. Fuzzy fractal dimension of painting craquelure. Running time: In a first approximation, the algorithm is linear [2].
Implementation of FFT Algorithm using DSP TMS320F28335 for Shunt Active Power Filter
NASA Astrophysics Data System (ADS)
Patel, Pinkal Jashvantbhai; Patel, Rajesh M.; Patel, Vinod
2016-07-01
This work presents simulation, analysis and experimental verification of Fast Fourier Transform (FFT) algorithm for shunt active power filter based on three-level inverter. Different types of filters can be used for elimination of harmonics in the power system. In this work, FFT algorithm for reference current generation is discussed. FFT control algorithm is verified using PSIM simulation results with DLL block and C-code. Simulation results are compared with experimental results for FFT algorithm using DSP TMS320F28335 for shunt active power filter application.
NASA Astrophysics Data System (ADS)
Moliner, L.; Correcher, C.; González, A. J.; Conde, P.; Hernández, L.; Orero, A.; Rodríguez-Álvarez, M. J.; Sánchez, F.; Soriano, A.; Vidal, L. F.; Benlloch, J. M.
2013-02-01
In this work we present an innovative algorithm for the reconstruction of PET images based on the List-Mode (LM) technique which improves their spatial resolution compared to results obtained with current MLEM algorithms. This study appears as a part of a large project with the aim of improving diagnosis in early Alzheimer disease stages by means of a newly developed hybrid PET-MR insert. At the present, Alzheimer is the most relevant neurodegenerative disease and the best way to apply an effective treatment is its early diagnosis. The PET device will consist of several monolithic LYSO crystals coupled to SiPM detectors. Monolithic crystals can reduce scanner costs with the advantage to enable implementation of very small virtual pixels in their geometry. This is especially useful for LM reconstruction algorithms, since they do not need a pre-calculated system matrix. We have developed an LM algorithm which has been initially tested with a large aperture (186 mm) breast PET system. Such an algorithm instead of using the common lines of response, incorporates a novel calculation of tubes of response. The new approach improves the volumetric spatial resolution about a factor 2 at the border of the field of view when compared with traditionally used MLEM algorithm. Moreover, it has also shown to decrease the image noise, thus increasing the image quality.
NASA Astrophysics Data System (ADS)
Jo, Sunhwan; Jiang, Wei
2015-12-01
Replica Exchange with Solute Tempering (REST2) is a powerful sampling enhancement algorithm of molecular dynamics (MD) in that it needs significantly smaller number of replicas but achieves higher sampling efficiency relative to standard temperature exchange algorithm. In this paper, we extend the applicability of REST2 for quantitative biophysical simulations through a robust and generic implementation in greatly scalable MD software NAMD. The rescaling procedure of force field parameters controlling REST2 "hot region" is implemented into NAMD at the source code level. A user can conveniently select hot region through VMD and write the selection information into a PDB file. The rescaling keyword/parameter is written in NAMD Tcl script interface that enables an on-the-fly simulation parameter change. Our implementation of REST2 is within communication-enabled Tcl script built on top of Charm++, thus communication overhead of an exchange attempt is vanishingly small. Such a generic implementation facilitates seamless cooperation between REST2 and other modules of NAMD to provide enhanced sampling for complex biomolecular simulations. Three challenging applications including native REST2 simulation for peptide folding-unfolding transition, free energy perturbation/REST2 for absolute binding affinity of protein-ligand complex and umbrella sampling/REST2 Hamiltonian exchange for free energy landscape calculation were carried out on IBM Blue Gene/Q supercomputer to demonstrate efficacy of REST2 based on the present implementation.
Strout, Michelle
2015-08-15
Programming parallel machines is fraught with difficulties: the obfuscation of algorithms due to implementation details such as communication and synchronization, the need for transparency between language constructs and performance, the difficulty of performing program analysis to enable automatic parallelization techniques, and the existence of important "dusty deck" codes. The SAIMI project developed abstractions that enable the orthogonal specification of algorithms and implementation details within the context of existing DOE applications. The main idea is to enable the injection of small programming models such as expressions involving transcendental functions, polyhedral iteration spaces with sparse constraints, and task graphs into full programs through the use of pragmas. These smaller, more restricted programming models enable orthogonal specification of many implementation details such as how to map the computation on to parallel processors, how to schedule the computation, and how to allocation storage for the computation. At the same time, these small programming models enable the expression of the most computationally intense and communication heavy portions in many scientific simulations. The ability to orthogonally manipulate the implementation for such computations will significantly ease performance programming efforts and expose transformation possibilities and parameter to automated approaches such as autotuning. At Colorado State University, the SAIMI project was supported through DOE grant DE-SC3956 from April 2010 through August 2015. The SAIMI project has contributed a number of important results to programming abstractions that enable the orthogonal specification of implementation details in scientific codes. This final report summarizes the research that was funded by the SAIMI project.
Li, Bin; Tian, Lianfang; Ou, Shanxing
2010-01-01
In order to efficiently and effectively reconstruct 3D medical images and clearly display the detailed information of inner structures and the inner hidden interfaces between different media, an Improved Volume Rendering Optical Model (IVROM) for medical translucent volume rendering and its implementation using the preintegrated Shear-Warp Volume Rendering algorithm are proposed in this paper, which can be readily applied on a commodity PC. Based on the classical absorption and emission model, effects of volumetric shadows and direct and indirect scattering are also considered in the proposed model IVROM. Moreover, the implementation of the Improved Translucent Volume Rendering Method (ITVRM) integrating the IVROM model, Shear-Warp and preintegrated volume rendering algorithm is described, in which the aliasing and staircase effects resulting from under-sampling in Shear-Warp, are avoided by the preintegrated volume rendering technique. This study demonstrates the superiority of the proposed method.
NASA Astrophysics Data System (ADS)
Gruber, Thomas; Grim, Larry; Fauth, Ryan; Tercha, Brian; Powell, Chris; Steinhardt, Kristin
2011-05-01
Large networks of disparate chemical/biological (C/B) sensors, MET sensors, and intelligence, surveillance, and reconnaissance (ISR) sensors reporting to various command/display locations can lead to conflicting threat information, questions of alarm confidence, and a confused situational awareness. Sensor netting algorithms (SNA) are being developed to resolve these conflicts and to report high confidence consensus threat map data products on a common operating picture (COP) display. A data fusion algorithm design was completed in a Phase I SBIR effort and development continues in the Phase II SBIR effort. The initial implementation and testing of the algorithm has produced some performance results. The algorithm accepts point and/or standoff sensor data, and event detection data (e.g., the location of an explosion) from various ISR sensors (e.g., acoustic, infrared cameras, etc.). These input data are preprocessed to assign estimated uncertainty to each incoming piece of data. The data are then sent to a weighted tomography process to obtain a consensus threat map, including estimated threat concentration level uncertainty. The threat map is then tested for consistency and the overall confidence for the map result is estimated. The map and confidence results are displayed on a COP. The benefits of a modular implementation of the algorithm and comparisons of fused / un-fused data results will be presented. The metrics for judging the sensor-netting algorithm performance are warning time, threat map accuracy (as compared to ground truth), false alarm rate, and false alarm rate v. reported threat confidence level.
On implementation of EM-type algorithms in the stochastic models for a matrix computing on GPU
Gorshenin, Andrey K.
2015-03-10
The paper discusses the main ideas of an implementation of EM-type algorithms for computing on the graphics processors and the application for the probabilistic models based on the Cox processes. An example of the GPU’s adapted MATLAB source code for the finite normal mixtures with the expectation-maximization matrix formulas is given. The testing of computational efficiency for GPU vs CPU is illustrated for the different sample sizes.
Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias
2014-06-10
We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion. PMID:26580754
Optical pattern recognition architecture implementing the mean-square error correlation algorithm
Molley, Perry A.
1991-01-01
An optical architecture implementing the mean-square error correlation algorithm, MSE=.SIGMA.[I-R].sup.2 for discriminating the presence of a reference image R in an input image scene I by computing the mean-square-error between a time-varying reference image signal s.sub.1 (t) and a time-varying input image signal s.sub.2 (t) includes a laser diode light source which is temporally modulated by a double-sideband suppressed-carrier source modulation signal I.sub.1 (t) having the form I.sub.1 (t)=A.sub.1 [1+.sqroot.2m.sub.1 s.sub.1 (t)cos (2.pi.f.sub.o t)] and the modulated light output from the laser diode source is diffracted by an acousto-optic deflector. The resultant intensity of the +1 diffracted order from the acousto-optic device is given by: I.sub.2 (t)=A.sub.2 [+2m.sub.2.sup.2 s.sub.2.sup.2 (t)-2.sqroot.2m.sub.2 (t) cos (2.pi.f.sub.o t] The time integration of the two signals I.sub.1 (t) and I.sub.2 (t) on the CCD deflector plane produces the result R(.tau.) of the mean-square error having the form: R(.tau.)=A.sub.1 A.sub.2 {[T]+[2m.sub.2.sup.2.multidot..intg.s.sub.2.sup.2 (t-.tau.)dt]-[2m.sub.1 m.sub.2 cos (2.tau.f.sub.o .tau.).multidot..intg.s.sub.1 (t)s.sub.2 (t-.tau.)dt]} where: s.sub.1 (t) is the signal input to the diode modulation source: s.sub.2 (t) is the signal input to the AOD modulation source; A.sub.1 is the light intensity; A.sub.2 is the diffraction efficiency; m.sub.1 and m.sub.2 are constants that determine the signal-to-bias ratio; f.sub.o is the frequency offset between the oscillator at f.sub.c and the modulation at f.sub.c +f.sub.o ; and a.sub.o and a.sub.1 are constant chosen to bias the diode source and the acousto-optic deflector into their respective linear operating regions so that the diode source exhibits a linear intensity characteristic and the AOD exhibits a linear amplitude characteristic.
Terra, Ricardo Mingarini; Waisberg, Daniel Reis; de Almeida, José Luiz Jesus; Devido, Marcela Santana; Pêgo-Fernandes, Paulo Manuel; Jatene, Fabio Biscegli
2012-01-01
OBJECTIVE: We aimed to evaluate whether the inclusion of videothoracoscopy in a pleural empyema treatment algorithm would change the clinical outcome of such patients. METHODS: This study performed quality-improvement research. We conducted a retrospective review of patients who underwent pleural decortication for pleural empyema at our institution from 2002 to 2008. With the old algorithm (January 2002 to September 2005), open decortication was the procedure of choice, and videothoracoscopy was only performed in certain sporadic mid-stage cases. With the new algorithm (October 2005 to December 2008), videothoracoscopy became the first-line treatment option, whereas open decortication was only performed in patients with a thick pleural peel (>2 cm) observed by chest scan. The patients were divided into an old algorithm (n = 93) and new algorithm (n = 113) group and compared. The main outcome variables assessed included treatment failure (pleural space reintervention or death up to 60 days after medical discharge) and the occurrence of complications. RESULTS: Videothoracoscopy and open decortication were performed in 13 and 80 patients from the old algorithm group and in 81 and 32 patients from the new algorithm group, respectively (p<0.01). The patients in the new algorithm group were older (41±1 vs. 46.3±16.7 years, p = 0.014) and had higher Charlson Comorbidity Index scores [0(0-3) vs. 2(0-4), p = 0.032]. The occurrence of treatment failure was similar in both groups (19.35% vs. 24.77%, p = 0.35), although the complication rate was lower in the new algorithm group (48.3% vs. 33.6%, p = 0.04). CONCLUSIONS: The wider use of videothoracoscopy in pleural empyema treatment was associated with fewer complications and unaltered rates of mortality and reoperation even though more severely ill patients were subjected to videothoracoscopic surgery. PMID:22760892
NASA Technical Reports Server (NTRS)
Vicroy, D. D.
1984-01-01
A simplified flight management descent algorithm was developed and programmed on a small programmable calculator. It was designed to aid the pilot in planning and executing a fuel conservative descent to arrive at a metering fix at a time designated by the air traffic control system. The algorithm may also be used for planning fuel conservative descents when time is not a consideration. The descent path was calculated for a constant Mach/airspeed schedule from linear approximations of airplane performance with considerations given for gross weight, wind, and nonstandard temperature effects. An explanation and examples of how the algorithm is used, as well as a detailed flow chart and listing of the algorithm are contained.
Gisdon, Florian J; Culka, Martin; Ullmann, G Matthias
2016-10-01
Conjugate peak refinement (CPR) is a powerful and robust method to search transition states on a molecular potential energy surface. Nevertheless, the method was to the best of our knowledge so far only implemented in CHARMM. In this paper, we present PyCPR, a new Python-based implementation of the CPR algorithm within the pDynamo framework. We provide a detailed description of the theory underlying our implementation and discuss the different parts of the implementation. The method is applied to two different problems. First, we illustrate the method by analyzing the gauche to anti-periplanar transition of butane using a semiempirical QM method. Second, we reanalyze the mechanism of a glycyl-radical enzyme, namely of 4-hydroxyphenylacetate decarboxylase (HPD) using QM/MM calculations. In the end, we suggest a strategy how to use our implementation of the CPR algorithm. The integration of PyCPR into the framework pDynamo allows the combination of CPR with the large variety of methods implemented in pDynamo. PyCPR can be used in combination with quantum mechanical and molecular mechanical methods (and hybrid methods) implemented directly in pDynamo, but also in combination with external programs such as ORCA using pDynamo as interface. PyCPR is distributed as free, open source software and can be downloaded from http://www.bisb.uni-bayreuth.de/index.php?page=downloads . Graphical Abstract PyCPR is a search tool for finding saddle points on the potential energy landscape of a molecular system. PMID:27651280
Gisdon, Florian J; Culka, Martin; Ullmann, G Matthias
2016-10-01
Conjugate peak refinement (CPR) is a powerful and robust method to search transition states on a molecular potential energy surface. Nevertheless, the method was to the best of our knowledge so far only implemented in CHARMM. In this paper, we present PyCPR, a new Python-based implementation of the CPR algorithm within the pDynamo framework. We provide a detailed description of the theory underlying our implementation and discuss the different parts of the implementation. The method is applied to two different problems. First, we illustrate the method by analyzing the gauche to anti-periplanar transition of butane using a semiempirical QM method. Second, we reanalyze the mechanism of a glycyl-radical enzyme, namely of 4-hydroxyphenylacetate decarboxylase (HPD) using QM/MM calculations. In the end, we suggest a strategy how to use our implementation of the CPR algorithm. The integration of PyCPR into the framework pDynamo allows the combination of CPR with the large variety of methods implemented in pDynamo. PyCPR can be used in combination with quantum mechanical and molecular mechanical methods (and hybrid methods) implemented directly in pDynamo, but also in combination with external programs such as ORCA using pDynamo as interface. PyCPR is distributed as free, open source software and can be downloaded from http://www.bisb.uni-bayreuth.de/index.php?page=downloads . Graphical Abstract PyCPR is a search tool for finding saddle points on the potential energy landscape of a molecular system.
Xie, Dexuan; Dash, Ranjan K.; Beard, Daniel A.
2009-01-01
Fast algorithms for simulating mathematical models of coupled blood-tissue transport and metabolism are critical for the analysis of data on transport and reaction in tissues. Here, by combining the method of characteristics with the standard grid discretization technique, a novel algorithm is introduced for solving a general blood-tissue transport and metabolism model governed by a large system of one-dimensional semilinear first order partial differential equations. The key part of the algorithm is to approximate the model as a group of independent ordinary differential equation (ODE) systems such that each ODE system has the same size as the model and can be integrated independently. Thus the method can be easily implemented in parallel on a large scale multiprocessor computer. The accuracy of the algorithm is demonstrated for solving a simple blood-tissue exchange model introduced by Sangren and Sheppard (Bull. Math. Biophys. 15:387–394, 1953), which has an analytical solution. Numerical experiments made on a distributed-memory parallel computer (an HP Linux cluster) and a shared-memory parallel computer (a SGI Origin 2000) demonstrate the parallel efficiency of the algorithm. PMID:20161089
Fu, Haohao; Shao, Xueguang; Chipot, Christophe; Cai, Wensheng
2016-08-01
Proper use of the adaptive biasing force (ABF) algorithm in free-energy calculations needs certain prerequisites to be met, namely, that the Jacobian for the metric transformation and its first derivative be available and the coarse variables be independent and fully decoupled from any holonomic constraint or geometric restraint, thereby limiting singularly the field of application of the approach. The extended ABF (eABF) algorithm circumvents these intrinsic limitations by applying the time-dependent bias onto a fictitious particle coupled to the coarse variable of interest by means of a stiff spring. However, with the current implementation of eABF in the popular molecular dynamics engine NAMD, a trajectory-based post-treatment is necessary to derive the underlying free-energy change. Usually, such a posthoc analysis leads to a decrease in the reliability of the free-energy estimates due to the inevitable loss of information, as well as to a drop in efficiency, which stems from substantial read-write accesses to file systems. We have developed a user-friendly, on-the-fly code for performing eABF simulations within NAMD. In the present contribution, this code is probed in eight illustrative examples. The performance of the algorithm is compared with traditional ABF, on the one hand, and the original eABF implementation combined with a posthoc analysis, on the other hand. Our results indicate that the on-the-fly eABF algorithm (i) supplies the correct free-energy landscape in those critical cases where the coarse variables at play are coupled to either each other or to geometric restraints or holonomic constraints, (ii) greatly improves the reliability of the free-energy change, compared to the outcome of a posthoc analysis, and (iii) represents a negligible additional computational effort compared to regular ABF. Moreover, in the proposed implementation, guidelines for choosing two parameters of the eABF algorithm, namely the stiffness of the spring and the mass
Fu, Haohao; Shao, Xueguang; Chipot, Christophe; Cai, Wensheng
2016-08-01
Proper use of the adaptive biasing force (ABF) algorithm in free-energy calculations needs certain prerequisites to be met, namely, that the Jacobian for the metric transformation and its first derivative be available and the coarse variables be independent and fully decoupled from any holonomic constraint or geometric restraint, thereby limiting singularly the field of application of the approach. The extended ABF (eABF) algorithm circumvents these intrinsic limitations by applying the time-dependent bias onto a fictitious particle coupled to the coarse variable of interest by means of a stiff spring. However, with the current implementation of eABF in the popular molecular dynamics engine NAMD, a trajectory-based post-treatment is necessary to derive the underlying free-energy change. Usually, such a posthoc analysis leads to a decrease in the reliability of the free-energy estimates due to the inevitable loss of information, as well as to a drop in efficiency, which stems from substantial read-write accesses to file systems. We have developed a user-friendly, on-the-fly code for performing eABF simulations within NAMD. In the present contribution, this code is probed in eight illustrative examples. The performance of the algorithm is compared with traditional ABF, on the one hand, and the original eABF implementation combined with a posthoc analysis, on the other hand. Our results indicate that the on-the-fly eABF algorithm (i) supplies the correct free-energy landscape in those critical cases where the coarse variables at play are coupled to either each other or to geometric restraints or holonomic constraints, (ii) greatly improves the reliability of the free-energy change, compared to the outcome of a posthoc analysis, and (iii) represents a negligible additional computational effort compared to regular ABF. Moreover, in the proposed implementation, guidelines for choosing two parameters of the eABF algorithm, namely the stiffness of the spring and the mass
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David
2006-05-01
The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has soon introduced new processing challenges. The price paid for the wealth spatial and spectral information available from hyperspectral sensors is the enormous amounts of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed-up computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, and (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in selection of parallel hyperspectral algorithms for specific applications.
Implementation on Landsat Data of a Simple Cloud Mask Algorithm Developed for MODIS Land Bands
NASA Technical Reports Server (NTRS)
Oreopoulos, Lazaros; Wilson, Michael J.; Varnai, Tamas
2010-01-01
This letter assesses the performance on Landsat-7 images of a modified version of a cloud masking algorithm originally developed for clear-sky compositing of Moderate Resolution Imaging Spectroradiometer (MODIS) images at northern mid-latitudes. While data from recent Landsat missions include measurements at thermal wavelengths, and such measurements are also planned for the next mission, thermal tests are not included in the suggested algorithm in its present form to maintain greater versatility and ease of use. To evaluate the masking algorithm we take advantage of the availability of manual (visual) cloud masks developed at USGS for the collection of Landsat scenes used here. As part of our evaluation we also include the Automated Cloud Cover Assesment (ACCA) algorithm that includes thermal tests and is used operationally by the Landsat-7 mission to provide scene cloud fractions, but no cloud masks. We show that the suggested algorithm can perform about as well as ACCA both in terms of scene cloud fraction and pixel-level cloud identification. Specifically, we find that the algorithm gives an error of 1.3% for the scene cloud fraction of 156 scenes, and a root mean square error of 7.2%, while it agrees with the manual mask for 93% of the pixels, figures very similar to those from ACCA (1.2%, 7.1%, 93.7%).
Egan, A; Laub, W
2014-06-15
Purpose: Several shortcomings of the current implementation of the analytic anisotropic algorithm (AAA) may lead to dose calculation errors in highly modulated treatments delivered to highly heterogeneous geometries. Here we introduce a set of dosimetric error predictors that can be applied to a clinical treatment plan and patient geometry in order to identify high risk plans. Once a problematic plan is identified, the treatment can be recalculated with more accurate algorithm in order to better assess its viability. Methods: Here we focus on three distinct sources dosimetric error in the AAA algorithm. First, due to a combination of discrepancies in smallfield beam modeling as well as volume averaging effects, dose calculated through small MLC apertures can be underestimated, while that behind small MLC blocks can overestimated. Second, due the rectilinear scaling of the Monte Carlo generated pencil beam kernel, energy is not properly transported through heterogeneities near, but not impeding, the central axis of the beamlet. And third, AAA overestimates dose in regions very low density (< 0.2 g/cm{sup 3}). We have developed an algorithm to detect the location and magnitude of each scenario within the patient geometry, namely the field-size index (FSI), the heterogeneous scatter index (HSI), and the lowdensity index (LDI) respectively. Results: Error indices successfully identify deviations between AAA and Monte Carlo dose distributions in simple phantom geometries. Algorithms are currently implemented in the MATLAB computing environment and are able to run on a typical RapidArc head and neck geometry in less than an hour. Conclusion: Because these error indices successfully identify each type of error in contrived cases, with sufficient benchmarking, this method can be developed into a clinical tool that may be able to help estimate AAA dose calculation errors and when it might be advisable to use Monte Carlo calculations.
NASA Astrophysics Data System (ADS)
Rueda, Antonio J.; Noguera, José M.; Luque, Adrián
2016-02-01
In recent years GPU computing has gained wide acceptance as a simple low-cost solution for speeding up computationally expensive processing in many scientific and engineering applications. However, in most cases accelerating a traditional CPU implementation for a GPU is a non-trivial task that requires a thorough refactorization of the code and specific optimizations that depend on the architecture of the device. OpenACC is a promising technology that aims at reducing the effort required to accelerate C/C++/Fortran code on an attached multicore device. Virtually with this technology the CPU code only has to be augmented with a few compiler directives to identify the areas to be accelerated and the way in which data has to be moved between the CPU and GPU. Its potential benefits are multiple: better code readability, less development time, lower risk of errors and less dependency on the underlying architecture and future evolution of the GPU technology. Our aim with this work is to evaluate the pros and cons of using OpenACC against native GPU implementations in computationally expensive hydrological applications, using the classic D8 algorithm of O'Callaghan and Mark for river network extraction as case-study. We implemented the flow accumulation step of this algorithm in CPU, using OpenACC and two different CUDA versions, comparing the length and complexity of the code and its performance with different datasets. We advance that although OpenACC can not match the performance of a CUDA optimized implementation (×3.5 slower in average), it provides a significant performance improvement against a CPU implementation (×2-6) with by far a simpler code and less implementation effort.
NASA Astrophysics Data System (ADS)
von Rudorff, Guido Falk; Wehmeyer, Christoph; Sebastiani, Daniel
2014-06-01
We adapt a swarm-intelligence-based optimization method (the artificial bee colony algorithm, ABC) to enhance its parallel scaling properties and to improve the escaping behavior from deep local minima. Specifically, we apply the approach to the geometry optimization of Lennard-Jones clusters. We illustrate the performance and the scaling properties of the parallelization scheme for several system sizes (5-20 particles). Our main findings are specific recommendations for ranges of the parameters of the ABC algorithm which yield maximal performance for Lennard-Jones clusters and Morse clusters. The suggested parameter ranges for these different interaction potentials turn out to be very similar; thus, we believe that our reported values are fairly general for the ABC algorithm applied to chemical optimization problems.
NASA Technical Reports Server (NTRS)
Whitmore, S. A.
1985-01-01
The dynamics model and data sources used to perform air-data reconstruction are discussed, as well as the Kalman filter. The need for adaptive determination of the noise statistics of the process is indicated. The filter innovations are presented as a means of developing the adaptive criterion, which is based on the true mean and covariance of the filter innovations. A method for the numerical approximation of the mean and covariance of the filter innovations is presented. The algorithm as developed is applied to air-data reconstruction for the space shuttle, and data obtained from the third landing are presented. To verify the performance of the adaptive algorithm, the reconstruction is also performed using a constant covariance Kalman filter. The results of the reconstructions are compared, and the adaptive algorithm exhibits better performance.
Implementation of a sensor guided flight algorithm for target tracking by small UAS
NASA Astrophysics Data System (ADS)
Collins, Gaemus E.; Stankevitz, Chris; Liese, Jeffrey
2011-06-01
Small xed-wing UAS (SUAS) such as Raven and Unicorn have limited power, speed, and maneuverability. Their missions can be dramatically hindered by environmental conditions (wind, terrain), obstructions (buildings, trees) blocking clear line of sight to a target, and/or sensor hardware limitations (xed stare, limited gimbal motion, lack of zoom). Toyon's Sensor Guided Flight (SGF) algorithm was designed to account for SUAS hardware shortcomings and enable long-term tracking of maneuvering targets by maintaining persistent eyes-on-target. SGF was successfully tested in simulation with high-delity UAS, sensor, and environment models, but real- world ight testing with 60 Unicorn UAS revealed surprising second order challenges that were not highlighted by the simulations. This paper describes the SGF algorithm, our rst round simulation results, our second order discoveries from ight testing, and subsequent improvements that were made to the algorithm.
Implementation of Future Climate Satellite Cloud Algorithms: Case of the GCOM-C/SGLI
NASA Astrophysics Data System (ADS)
Dim, J. R.; Murakami, H.; Nakajima, T. Y.; Takamura, T.
2012-12-01
The Global Change Observation Mission-Climate/Second Generation GLobal Imager (GCOM-C/SGLI) is a future Earth observation satellite to be launched in 2015. Its major objective is the monitoring of long-term climate changes. A major factor of these changes is the cloud impact. A new cloud algorithm adapted to the spectral characteristics of the GCOM-C/SGLI and the products derived are currently tested. The tests consist of evaluating the performance of the cloud optical thickness (COT) and the cloud particle effective radius (CLER) against simulation data, and equivalent products derived from a compatible satellite, the Terra/MODerate resolution Image Spectrometer (Terra/MODIS). In addition to these tests, the sensitivity of the products derived from this algorithm, to external and internal cloud related parameters, is analyzed. The base-map of the initial data input for this algorithm is made of geometrically corrected radiances of the Advanced Earth Observation Satellite II/GLobal Imager (ADEOS-II/GLI) and the GCOM-C/SGLI simulated radiances. The results of these performance tests, based on timely matching products, show that the GCOM-C/SGLI algorithm performs relatively well for averagely overcast scenes, with an agreement rate of ±20% with the satellite simulation products and the Terra/MODIS COT and CLER. A negative bias is however frequently observed, with the GCOM-C/SGLI retrieved parameters showing higher values at high COT levels. The algorithm also seems less reactive to thin and small particles' clouds mainly in land areas, compared to Terra/MODIS data and the satellite simulation products. Sensitivity to varying ground albedo, cloud phase, cloud structure and cloud location are analyzed to understand the influence of these parameters on the results obtained. Possible consequences of these influences on long-term climate variations and the bases for the improvement of the present algorithm in various cloud types' conditions are discussed.
Transport implementation of the Bernstein–Vazirani algorithm with ion qubits
NASA Astrophysics Data System (ADS)
Fallek, S. D.; Herold, C. D.; McMahon, B. J.; Maller, K. M.; Brown, K. R.; Amini, J. M.
2016-08-01
Using trapped ion quantum bits in a scalable microfabricated surface trap, we perform the Bernstein–Vazirani algorithm. Our architecture takes advantage of the ion transport capabilities of such a trap. The algorithm is demonstrated using two- and three-ion chains. For three ions, an improvement is achieved compared to a classical system using the same number of oracle queries. For two ions and one query, we correctly determine an unknown bit string with probability 97.6(8)%. For three ions, we succeed with probability 80.9(3)%.
NASA Astrophysics Data System (ADS)
Viau, C. R.
2012-06-01
The use of the intensity change and line-of-sight (LOS) change concepts have previously been documented in the open-literature as techniques used by non-imaging infrared (IR) seekers to reject expendable IR countermeasures (IRCM). The purpose of this project was to implement IR counter-countermeasure (IRCCM) algorithms based on target intensity and kinematic behavior for a generic imaging IR (IIR) seeker model with the underlying goal of obtaining a better understanding of how expendable IRCM can be used to defeat the latest generation of seekers. The report describes the Intensity Ratio Change (IRC) and LOS Rate Change (LRC) discrimination techniques. The algorithms and the seeker model are implemented in a physics-based simulation product called Tactical Engagement Simulation Software (TESS™). TESS is developed in the MATLAB®/Simulink® environment and is a suite of RF/IR missile software simulators used to evaluate and analyze the effectiveness of countermeasures against various classes of guided threats. The investigation evaluates the algorithm and tests their robustness by presenting the results of batch simulation runs of surface-to-air (SAM) and air-to-air (AAM) IIR missiles engaging a non-maneuvering target platform equipped with expendable IRCM as self-protection. The report discusses how varying critical parameters such track memory time, ratio thresholds and hold time can influence the outcome of an engagement.
NASA Astrophysics Data System (ADS)
Orjuela-Vargas, S. A.; Triana-Martinez, J.; Yañez, J. P.; Philips, W.
2014-03-01
Video analysis in real time requires fast and efficient algorithms to extract relevant information from a considerable number, commonly 25, of frames per second. Furthermore, robust algorithms for outdoor visual scenes may retrieve correspondent features along the day where a challenge is to deal with lighting changes. Currently, Local Binary Pattern (LBP) techniques are widely used for extracting features due to their robustness to illumination changes and the low requirements for implementation. We propose to compute an automatic threshold based on the distribution of the intensity residuals resulting from the pairwise comparisons when using LBP techniques. The intensity residuals distribution can be modelled by a Generalized Gaussian Distribution (GGD). In this paper we compute the adaptive threshold using the parameters of the GGD. We present a CUDA implementation of our proposed algorithm. We use the LBPSYM technique. Our approach is tested on videos of four different urban scenes with mobilities captured during day and night. The extracted features can be used in a further step to determine patterns, identify objects or detect background. However, further research must be conducted for blurring correction since the scenes at night are commonly blurred due to artificial lighting.
A Reduced-Complexity Fast Algorithm for Software Implementation of the IFFT/FFT in DMT Systems
NASA Astrophysics Data System (ADS)
Chan, Tsun-Shan; Kuo, Jen-Chih; Wu, An-Yeu (Andy)
2002-12-01
The discrete multitone (DMT) modulation/demodulation scheme is the standard transmission technique in the application of asymmetric digital subscriber lines (ADSL) and very-high-speed digital subscriber lines (VDSL). Although the DMT can achieve higher data rate compared with other modulation/demodulation schemes, its computational complexity is too high for cost-efficient implementations. For example, it requires 512-point IFFT/FFT as the modulation/demodulation kernel in the ADSL systems and even higher in the VDSL systems. The large block size results in heavy computational load in running programmable digital signal processors (DSPs). In this paper, we derive computationally efficient fast algorithm for the IFFT/FFT. The proposed algorithm can avoid complex-domain operations that are inevitable in conventional IFFT/FFT computation. The resulting software function requires less computational complexity. We show that it acquires only 17% number of multiplications to compute the IFFT and FFT compared with the Cooly-Tukey algorithm. Hence, the proposed fast algorithm is very suitable for firmware development in reducing the MIPS count in programmable DSPs.
Two-chip implementation of the RSA public-key encryption algorithm
Rieden, R.F.; Snyder, J.B.; Widman, R.J.; Barnard, W.J.
1982-01-01
A system has been developed which employs two identical integrated circuits to perform the encryption algorithm developed by Rivest, Shamir, and Adleman (RSA) on a 336-bit message. The integrated circuit used in the system employs the 3-micron polysilicon gate, radiation-hard, CMOS technology developed at Sandia National Laboratories.
The algorithm and implementation of EMCCD automatic gain adjustment based on fixed gray level
NASA Astrophysics Data System (ADS)
Luo, Le; Chen, Qian; He, Wei-Ji; Lu, Zhen-Xi
2015-10-01
The image quality and resolution will be affected, if the multiplication gain value of EMCCD imaging system is too low or too high. This paper presents the algorithm of EMCCD automatic gain adjustment based on fixed gray level. The algorithm takes the average brightness of the image as a measure of image quality. It calculates the multiplication gain adjustment value through the calculation of average brightness of current frame image, combining the two gray level threshold values and the system exposure function, so that the next frame image will achieve the ideal brightness value. On the basis of the algorithm, this paper builds a multiplication gain adjustment control circuit and a multiplication gain resistor lookup table. Then, we can achieve automatic adjustment of multiplication gain by changing the resistance value of the digital potentiometer in the circuit. Experimental results show that the image quality can be effectively improved, by using the algorithm of automatic gain adjustment based on fixed gray level. After adjustment, the brightness of the image is moderate, contrast enhanced, and the details more clear.
NASA Technical Reports Server (NTRS)
Springer, P.
1993-01-01
This paper discusses the method in which the Cascade-Correlation algorithm was parallelized in such a way that it could be run using the Time Warp Operating System (TWOS). TWOS is a special purpose operating system designed to run parellel discrete event simulations with maximum efficiency on parallel or distributed computers.
Kumar, M.; Hirschberg, D.S.
1983-03-01
An algorithm is presented to merge two subfiles of size n/2 each, stored in the left and the right halves of a linearly connected processor array, in 3n/2 route steps and log n compare-exchange steps. This algorithm is extended to merge two horizontally adjacent subfiles of size m*n/2 each, stored in an m*n mesh-connected processor array in row-major order, in m+2n route steps and log mn compare-exchange steps. Also an algorithm is presented to merge two vertically aligned subfiles, stored in a mesh-connected processor array in row-major order. Finally, a sorting scheme is proposed that requires 11n route steps and 2 log/sup 2/ n compare-exchange steps to sort n/sup 2/ elements stored in an n*n mesh-connected processor array. The merge algorithms for the mesh-connected processor array use wrap around connections. These connections can be avoided at the expense of some extra hardware. 12 references.
NASA Technical Reports Server (NTRS)
Soeder, J. F.
1983-01-01
As turbofan engines become more complex, the development of controls necessitate the use of multivariable control techniques. A control developed for the F100-PW-100(3) turbofan engine by using linear quadratic regulator theory and other modern multivariable control synthesis techniques is described. The assembly language implementation of this control on an SEL 810B minicomputer is described. This implementation was then evaluated by using a real-time hybrid simulation of the engine. The control software was modified to run with a real engine. These modifications, in the form of sensor and actuator failure checks and control executive sequencing, are discussed. Finally recommendations for control software implementations are presented.
Passive microwave remote sensing of rainfall with SSM/I: Algorithm development and implementation
NASA Technical Reports Server (NTRS)
Ferriday, James G.; Avery, Susan K.
1994-01-01
A physically based algorithm sensitive to emission and scattering is used to estimate rainfall using the Special Sensor Microwave/Imager (SSM/I). The algorithm is derived from radiative transfer calculations through an atmospheric cloud model specifying vertical distributions of ice and liquid hydrometeors as a function of rain rate. The algorithm is structured in two parts: SSM/I brightness temperatures are screened to detect rainfall and are then used in rain-rate calculation. The screening process distinguishes between nonraining background conditions and emission and scattering associated with hydrometeors. Thermometric temperature and polarization thresholds determined from the radiative transfer calculations are used to detect rain, whereas the rain-rate calculation is based on a linear function fit to a linear combination of channels. Separate calculations for ocean and land account for different background conditions. The rain-rate calculation is constructed to respond to both emission and scattering, reduce extraneous atmospheric and surface effects, and to correct for beam filling. The resulting SSM/I rain-rate estimates are compared to three precipitation radars as well as to a dynamically simulated rainfall event. Global estimates from the SSM/I algorithm are also compared to continental and shipboard measurements over a 4-month period. The algorithm is found to accurately describe both localized instantaneous rainfall events and global monthly patterns over both land and ovean. Over land the 4-month mean difference between SSM/I and the Global Precipitation Climatology Center continental rain gauge database is less than 10%. Over the ocean, the mean difference between SSM/I and the Legates and Willmott global shipboard rain gauge climatology is less than 20%.
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.
1989-01-01
Several techniques to perform static and dynamic load balancing techniques for vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.
Facchinetti, Andrea; Sparacino, Giovanni; Cobelli, Claudio
2013-01-01
Glucose readings provided by current continuous glucose monitoring (CGM) devices still suffer from accuracy and precision issues. In April 2013, we proposed a new conceptual architecture to deal with these problems and render CGM sensors algorithmically smarter, which consists of three modules for denoising, enhancement, and prediction placed in cascade to a commercial CGM sensor. The architecture was assessed on a data set consisting of 24 type 1 diabetes patients collected in four clinical centers of the AP@home Consortium (a European project of 7th Framework Programme funded by the European Committee). This article, as a companion to our prior publication, illustrates the technical details of the algorithms and of the implementation issues. PMID:24124959
NASA Astrophysics Data System (ADS)
Melsa, J. L.; Mills, J. D.; Arora, A. A.
1983-06-01
This report describes the results of a fifteen-month study of the real-time implementation of algorithm combining time-domain harmonic scaling and Adaptive Residual Coding at a transmission bit rate of 16 kb/s. The modifications of this encoding algorithm as originally presented by Melsa and Pande to allow real-time implementation are described in detail. A non real-time FORTRAN simulation using a sixteen-bit word length was developed and tested to establish feasibility. The hardware implementation of a full-duplex, real-time system has demonstrated that this algorithm is capable of producing toll quality speech digitization. This report has been divided into two volumes. The first volume discusses the algorithm modifications and FORTRAN simulation. The details of the hardware implementation, schematics for the system and operating instructions are included in Volume 2 of this final report.
Stephen-Haynes, J
2013-12-01
This article outlines a strategic process for the evaluation of wound management products and the development of an algorithm as an implementation model for wound management. Wound management is an increasingly complex process given the variety of interactive dressings and other devices available. This article discusses the procurement process, access to wound management dressings and the use of wound management formularies within the UK. We conclude that the current commissioners of tissue viability within healthcare organisations need to adopt a proactive approach to ensure appropriate formulary evaluation and product selection, in order to achieve the most beneficial clinical and financial outcomes.
NASA Astrophysics Data System (ADS)
Ayazi, S. M.; Mashhorroudi, M. F.; Ghorbani, M.
2014-10-01
Among the main issues in the theory of geometric grids on spatial information systems, is the problem of finding the shortest path routing between two points. In this paper tried to using the graph theory and A* algorithms in transport management, the optimal method to find the shortest path with shortest time condition to be reviewed. In order to construct a graph that consists of a network of pathways and modelling of physical and phasing area, the shortest path routes, elected with the use of the algorithm is modified A*.At of the proposed method node selection Examining angle nodes the desired destination node and the next node is done. The advantage of this method is that due to the elimination of some routes, time of route calculation is reduced.
NASA Technical Reports Server (NTRS)
Schallhorn, Paul; Majumdar, Alok
2012-01-01
This paper describes a finite volume based numerical algorithm that allows multi-dimensional computation of fluid flow within a system level network flow analysis. There are several thermo-fluid engineering problems where higher fidelity solutions are needed that are not within the capacity of system level codes. The proposed algorithm will allow NASA's Generalized Fluid System Simulation Program (GFSSP) to perform multi-dimensional flow calculation within the framework of GFSSP s typical system level flow network consisting of fluid nodes and branches. The paper presents several classical two-dimensional fluid dynamics problems that have been solved by GFSSP's multi-dimensional flow solver. The numerical solutions are compared with the analytical and benchmark solution of Poiseulle, Couette and flow in a driven cavity.
Cantor network, control algorithm, two-dimensional compact structure and its optical implementation
NASA Astrophysics Data System (ADS)
Wang, Ning; Liu, Liren; Yin, Yaozu
1995-12-01
A compact integrating module technique for packaging a optical multistage Cantor network with a polarization multiplex technique is suggested. The modules have a unique configuration, which is the solid-state combination of a polarization rotator, double birefringent slabs, and a 2 \\times 2 switch array. The design and the fabrication of an eight-channel optical nonblocking Cantor network are demonstrated, and a fast-setup control algorithm is developed. The network systems are easy to assemble and insensitive to environment disturbance.
NASA Technical Reports Server (NTRS)
Chen, Shu-Po
1999-01-01
This paper presents software for solving the non-conforming fluid structure interfaces in aeroelastic simulation. It reviews the algorithm of interpolation and integration, highlights the flexibility and the user-friendly feature that allows the user to select the existing structure and fluid package, like NASTRAN and CLF3D, to perform the simulation. The presented software is validated by computing the High Speed Civil Transport model.
Schalkoff, R.J.; Shaaban, K.M.; Carver, A.E.
1996-12-31
The ARIES {number_sign}1 (Autonomous Robotic Inspection Experimental System) vision system is used to acquire drum surface images under controlled conditions and subsequently perform autonomous visual inspection leading to a classification as `acceptable` or `suspect`. Specific topics described include vision system design methodology, algorithmic structure,hardware processing structure, and image acquisition hardware. Most of these capabilities were demonstrated at the ARIES Phase II Demo held on Nov. 30, 1995. Finally, Phase III efforts are briefly addressed.
Implementation of a Landing Footprint Algorithm for the HTV-2 and Trajectory Simulations
NASA Technical Reports Server (NTRS)
Clark, Casie M.
2012-01-01
This presentation details work performed during the Fall 2011 term in the Research Controls and Dynamics Branch at NASA Dryden Flight Research Center. Included is a study on a possible landing footprint algorithm, with direct application to the HTV-2. Also discussed is work in support of the MIPCC effort, which includes optimal trajectory solutions for the F-15A Streak Eagle aircraft and theoretical performance of an F-15A with a MIPCC propulsion system.
NASA Astrophysics Data System (ADS)
Valencia, David; Plaza, Antonio; Vega-Rodríguez, Miguel A.; Pérez, Rosa M.
2005-11-01
Hyperspectral imagery is a class of image data which is used in many scientific areas, most notably, medical imaging and remote sensing. It is characterized by a wealth of spatial and spectral information. Over the last years, many algorithms have been developed with the purpose of finding "spectral endmembers," which are assumed to be pure signatures in remotely sensed hyperspectral data sets. Such pure signatures can then be used to estimate the abundance or concentration of materials in mixed pixels, thus allowing sub-pixel analysis which is crucial in many remote sensing applications due to current sensor optics and configuration. One of the most popular endmember extraction algorithms has been the pixel purity index (PPI), available from Kodak's Research Systems ENVI software package. This algorithm is very time consuming, a fact that has generally prevented its exploitation in valid response times in a wide range of applications, including environmental monitoring, military applications or hazard and threat assessment/tracking (including wildland fire detection, oil spill mapping and chemical and biological standoff detection). Field programmable gate arrays (FPGAs) are hardware components with millions of gates. Their reprogrammability and high computational power makes them particularly attractive in remote sensing applications which require a response in near real-time. In this paper, we present an FPGA design for implementation of PPI algorithm which takes advantage of a recently developed fast PPI (FPPI) algorithm that relies on software-based optimization. The proposed FPGA design represents our first step toward the development of a new reconfigurable system for fast, onboard analysis of remotely sensed hyperspectral imagery.
Ravari, Alireza Norouzzadeh; Taghirad, Hamid D
2014-10-01
In this paper the problem of loop closing from depth or camera image information in an unknown environment is investigated. A sparse model is constructed from a parametric dictionary for every range or camera image as mobile robot observations. In contrast to high-dimensional feature-based representations, in this model, the dimension of the sensor measurements' representations is reduced. Considering the loop closure detection as a clustering problem in high-dimensional space, little attention has been paid to the curse of dimensionality in the existing state-of-the-art algorithms. In this paper, a representation is developed from a sparse model of images, with a lower dimension than original sensor observations. Exploiting the algorithmic information theory, the representation is developed such that it has the geometrically transformation invariant property in the sense of Kolmogorov complexity. A universal normalized metric is used for comparison of complexity based representations of image models. Finally, a distinctive property of normalized compression distance is exploited for detecting similar places and rejecting incorrect loop closure candidates. Experimental results show efficiency and accuracy of the proposed method in comparison to the state-of-the-art algorithms and some recently proposed methods.
Ravari, Alireza Norouzzadeh; Taghirad, Hamid D
2014-10-01
In this paper the problem of loop closing from depth or camera image information in an unknown environment is investigated. A sparse model is constructed from a parametric dictionary for every range or camera image as mobile robot observations. In contrast to high-dimensional feature-based representations, in this model, the dimension of the sensor measurements' representations is reduced. Considering the loop closure detection as a clustering problem in high-dimensional space, little attention has been paid to the curse of dimensionality in the existing state-of-the-art algorithms. In this paper, a representation is developed from a sparse model of images, with a lower dimension than original sensor observations. Exploiting the algorithmic information theory, the representation is developed such that it has the geometrically transformation invariant property in the sense of Kolmogorov complexity. A universal normalized metric is used for comparison of complexity based representations of image models. Finally, a distinctive property of normalized compression distance is exploited for detecting similar places and rejecting incorrect loop closure candidates. Experimental results show efficiency and accuracy of the proposed method in comparison to the state-of-the-art algorithms and some recently proposed methods. PMID:24968363
NASA Technical Reports Server (NTRS)
Patten, William Neff
1989-01-01
There is an evident need to discover a means of establishing reliable, implementable controls for systems that are plagued by nonlinear and, or uncertain, model dynamics. The development of a generic controller design tool for tough-to-control systems is reported. The method utilizes a moving grid, time infinite element based solution of the necessary conditions that describe an optimal controller for a system. The technique produces a discrete feedback controller. Real time laboratory experiments are now being conducted to demonstrate the viability of the method. The algorithm that results is being implemented in a microprocessor environment. Critical computational tasks are accomplished using a low cost, on-board, multiprocessor (INMOS T800 Transputers) and parallel processing. Progress to date validates the methodology presented. Applications of the technique to the control of highly flexible robotic appendages are suggested.
NASA Astrophysics Data System (ADS)
Ham, Woonchul; Song, Chulgyu; Lee, Kangsan; Badarch, Luubaatar
2015-05-01
In this paper, we introduce the mobile embedded system implemented for capturing stereo image based on two CMOS camera module. We use WinCE as an operating system and capture the stereo image by using device driver for CMOS camera interface and Direct Draw API functions. We aslo comments on the GPU hardware and CUDA programming for implementation of 3D exaggeraion algorithm for ROI by adjusting and synthesizing the disparity value of ROI (region of interest) in real time. We comment on the pattern of aperture for deblurring of CMOS camera module based on the Kirchhoff diffraction formula and clarify the reason why we can get more sharp and clear image by blocking some portion of aperture or geometric sampling. Synthesized stereo image is real time monitored on the shutter glass type three-dimensional LCD monitor and disparity values of each segment are analyzed to prove the validness of emphasizing effect of ROI.
NASA Astrophysics Data System (ADS)
Zabolotny, W. M.; Byszuk, A.
2016-03-01
The CMS experiment Level-1 trigger system is undergoing an upgrade. In the barrel-endcap transition region, it is necessary to merge data from 3 types of muon detectors—RPC, DT and CSC. The Overlap Muon Track Finder (OMTF) uses the novel approach to concentrate and process those data in a uniform manner to identify muons and their transversal momentum. The paper presents the algorithm and FPGA firmware implementation of the OMTF and its data transmission system in CMS. It is foreseen that the OMTF will be subject to significant changes resulting from optimization which will be done with the aid of physics simulations. Therefore, a special, high-level, parameterized HDL implementation is necessary.
A Synchronization Algorithm and Implementation for High-Speed Block Codes Applications. Part 4
NASA Technical Reports Server (NTRS)
Lin, Shu; Zhang, Yu; Nakamura, Eric B.; Uehara, Gregory T.
1998-01-01
Block codes have trellis structures and decoders amenable to high speed CMOS VLSI implementation. For a given CMOS technology, these structures enable operating speeds higher than those achievable using convolutional codes for only modest reductions in coding gain. As a result, block codes have tremendous potential for satellite trunk and other future high-speed communication applications. This paper describes a new approach for implementation of the synchronization function for block codes. The approach utilizes the output of the Viterbi decoder and therefore employs the strength of the decoder. Its operation requires no knowledge of the signal-to-noise ratio of the received signal, has a simple implementation, adds no overhead to the transmitted data, and has been shown to be effective in simulation for received SNR greater than 2 dB.
NASA Astrophysics Data System (ADS)
Giacometto, F. J.; Vilardy, J. M.; Torres, C. O.; Mattos, L.
2011-01-01
Currently addressing problems related to security in access control, as a consequence, have been developed applications that work under unique characteristics in individuals, such as biometric features. In the world becomes important working with biometric images such as the liveliness of the iris which are for both the pattern of retinal images as your blood vessels. This paper presents an implementation of an algorithm for creating templates for biometric authentication with ocular features for FPGA, in which the object of study is that the texture pattern of iris is unique to each individual. The authentication will be based in processes such as edge extraction methods, segmentation principle of John Daugman and Libor Masek's, and standardization to obtain necessary templates for the search of matches in a database and then get the expected results of authentication.
Implementation of a Fractional Model-Based Fault Detection Algorithm into a PLC Controller
NASA Astrophysics Data System (ADS)
Kopka, Ryszard
2014-12-01
This paper presents results related to the implementation of model-based fault detection and diagnosis procedures into a typical PLC controller. To construct the mathematical model and to implement the PID regulator, a non-integer order differential/integral calculation was used. Such an approach allows for more exact control of the process and more precise modelling. This is very crucial in model-based diagnostic methods. The theoretical results were verified on a real object in the form of a supercapacitor connected to a PLC controller by a dedicated electronic circuit controlled directly from the PLC outputs.
Parrallel Implementation of Fast Randomized Algorithms for Low Rank Matrix Decomposition
Lucas, Andrew J.; Stalizer, Mark; Feo, John T.
2014-03-01
We analyze the parallel performance of randomized interpolative decomposition by de- composing low rank complex-valued Gaussian random matrices larger than 100 GB. We chose a Cray XMT supercomputer as it provides an almost ideal PRAM model permitting quick investigation of parallel algorithms without obfuscation from hardware idiosyncrasies. We obtain that on non-square matrices performance scales almost linearly with runtime about 100 times faster on 128 processors. We also verify that numerically discovered error bounds still hold on matrices two orders of magnitude larger than those previously tested.
The SFXC software correlator for very long baseline interferometry: algorithms and implementation
NASA Astrophysics Data System (ADS)
Keimpema, A.; Kettenis, M. M.; Pogrebenko, S. V.; Campbell, R. M.; Cimó, G.; Duev, D. A.; Eldering, B.; Kruithof, N.; van Langevelde, H. J.; Marchal, D.; Molera Calvés, G.; Ozdemir, H.; Paragi, Z.; Pidopryhora, Y.; Szomoru, A.; Yang, J.
2015-06-01
In this paper a description is given of the SFXC software correlator, developed and maintained at the Joint Institute for VLBI in Europe (JIVE). The software is designed to run on generic Linux-based computing clusters. The correlation algorithm is explained in detail, as are some of the novel modes that software correlation has enabled, such as wide-field VLBI imaging through the use of multiple phase centres and pulsar gating and binning. This is followed by an overview of the software architecture. Finally, the performance of the correlator as a function of number of CPU cores, telescopes and spectral channels is shown.
NASA Astrophysics Data System (ADS)
de Lorenzi, Flavio; Debattista, Victor P.; Gerhard, Ortwin; Sambhus, Niranjan
2007-03-01
We describe a made-to-measure (M2M) algorithm for constructing N-particle models of stellar systems from observational data (χ2M2M), extending earlier ideas by Syer & Tremaine. The algorithm properly accounts for observational errors, is flexible, and can be applied to various systems and geometries. We implement this algorithm in a parallel code NMAGIC and carry out a sequence of tests to illustrate its power and performance. (i) We reconstruct an isotropic Hernquist model from density moments and projected kinematics and recover the correct differential energy distribution and intrinsic kinematics. (ii) We build a self-consistent oblate three-integral maximum rotator model and compare how the distribution function is recovered from integral field and slit kinematic data. (iii) We create a non-rotating and a figure rotating triaxial stellar particle model, reproduce the projected kinematics of the figure rotating system by a non-rotating system of the same intrinsic shape, and illustrate the signature of pattern rotation in this model. From these tests, we comment on the dependence of the results from χ2M2M on the initial model, the geometry, and the amount of available data.
NASA Astrophysics Data System (ADS)
Skorupski, Krzysztof; Mroczka, Janusz; Wriedt, Thomas; Riefler, Norbert
2014-06-01
In many branches of science experiments are expensive, require specialist equipment or are very time consuming. Studying the light scattering phenomenon by fractal aggregates can serve as an example. Light scattering simulations can overcome these problems and provide us with theoretical, additional data which complete our study. For this reason a fractal-like aggregate model as well as fast aggregation codes are needed. Until now various computer models, that try to mimic the physics behind this phenomenon, have been developed. However, their implementations are mostly based on a trial-and-error procedure. Such approach is very time consuming and the morphological parameters of resulting aggregates are not exact because the postconditions (e.g. the position error) cannot be very strict. In this paper we present a very fast and accurate implementation of a tunable aggregation algorithm based on the work of Filippov et al. (2000). Randomization is reduced to its necessary minimum (our technique can be more than 1000 times faster than standard algorithms) and the position of a new particle, or a cluster, is calculated with algebraic methods. Therefore, the postconditions can be extremely strict and the resulting errors negligible (e.g. the position error can be recognized as non-existent). In our paper two different methods, which are based on the particle-cluster (PC) and the cluster-cluster (CC) aggregation processes, are presented.
NASA Technical Reports Server (NTRS)
Li, Wei; Saleeb, Atef F.
1995-01-01
This two-part report is concerned with the development of a general framework for the implicit time-stepping integrators for the flow and evolution equations in generalized viscoplastic models. The primary goal is to present a complete theoretical formulation, and to address in detail the algorithmic and numerical analysis aspects involved in its finite element implementation, as well as to critically assess the numerical performance of the developed schemes in a comprehensive set of test cases. On the theoretical side, the general framework is developed on the basis of the unconditionally-stable, backward-Euler difference scheme as a starting point. Its mathematical structure is of sufficient generality to allow a unified treatment of different classes of viscoplastic models with internal variables. In particular, two specific models of this type, which are representative of the present start-of-art in metal viscoplasticity, are considered in applications reported here; i.e., fully associative (GVIPS) and non-associative (NAV) models. The matrix forms developed for both these models are directly applicable for both initially isotropic and anisotropic materials, in general (three-dimensional) situations as well as subspace applications (i.e., plane stress/strain, axisymmetric, generalized plane stress in shells). On the computational side, issues related to efficiency and robustness are emphasized in developing the (local) interative algorithm. In particular, closed-form expressions for residual vectors and (consistent) material tangent stiffness arrays are given explicitly for both GVIPS and NAV models, with their maximum sizes 'optimized' to depend only on the number of independent stress components (but independent of the number of viscoplastic internal state parameters). Significant robustness of the local iterative solution is provided by complementing the basic Newton-Raphson scheme with a line-search strategy for convergence. In the present second part of
Simulation of Propellant Loading System Senior Design Implement in Computer Algorithm
NASA Technical Reports Server (NTRS)
Bandyopadhyay, Alak
2010-01-01
Propellant loading from the Storage Tank to the External Tank is one of the very important and time consuming pre-launch ground operations for the launch vehicle. The propellant loading system is a complex integrated system involving many physical components such as the storage tank filled with cryogenic fluid at a very low temperature, the long pipe line connecting the storage tank with the external tank, the external tank along with the flare stack, and vent systems for releasing the excess fuel. Some of the very important parameters useful for design purpose are the prediction of pre-chill time, loading time, amount of fuel lost, the maximum pressure rise etc. The physics involved for mathematical modeling is quite complex due to the fact the process is unsteady, there is phase change as some of the fuel changes from liquid to gas state, then conjugate heat transfer in the pipe walls as well as between solid-to-fluid region. The simulation is very tedious and time consuming too. So overall, this is a complex system and the objective of the work is student's involvement and work in the parametric study and optimization of numerical modeling towards the design of such system. The students have to first become familiar and understand the physical process, the related mathematics and the numerical algorithm. The work involves exploring (i) improved algorithm to make the transient simulation computationally effective (reduced CPU time) and (ii) Parametric study to evaluate design parameters by changing the operational conditions
NASA Astrophysics Data System (ADS)
Lohmann, Timo
Electric sector models are powerful tools that guide policy makers and stakeholders. Long-term power generation expansion planning models are a prominent example and determine a capacity expansion for an existing power system over a long planning horizon. With the changes in the power industry away from monopolies and regulation, the focus of these models has shifted to competing electric companies maximizing their profit in a deregulated electricity market. In recent years, consumers have started to participate in demand response programs, actively influencing electricity load and price in the power system. We introduce a model that features investment and retirement decisions over a long planning horizon of more than 20 years, as well as an hourly representation of day-ahead electricity markets in which sellers of electricity face buyers. This combination makes our model both unique and challenging to solve. Decomposition algorithms, and especially Benders decomposition, can exploit the model structure. We present a novel method that can be seen as an alternative to generalized Benders decomposition and relies on dynamic linear overestimation. We prove its finite convergence and present computational results, demonstrating its superiority over traditional approaches. In certain special cases of our model, all necessary solution values in the decomposition algorithms can be directly calculated and solving mathematical programming problems becomes entirely obsolete. This leads to highly efficient algorithms that drastically outperform their programming problem-based counterparts. Furthermore, we discuss the implementation of all tailored algorithms and the challenges from a modeling software developer's standpoint, providing an insider's look into the modeling language GAMS. Finally, we apply our model to the Texas power system and design two electricity policies motivated by the U.S. Environment Protection Agency's recently proposed CO2 emissions targets for the
NASA Astrophysics Data System (ADS)
Sengupta, Anand S.; Dhurandhar, Sanjeev; Lazzarini, Albert
2003-04-01
The first scientific runs of kilometer scale laser interferometric detectors such as LIGO are under way. Data from these detectors will be used to look for signatures of gravitational waves from astrophysical objects such as inspiraling neutron-star black-hole binaries using matched filtering. The computational resources required for online flat-search implementation of the matched filtering are large if searches are carried out for a small total mass. A flat search is implemented by constructing a single discrete grid of densely populated template waveforms spanning the dynamical parameters—masses, spins—which are correlated with the interferometer data. The correlations over the kinematical parameters can be maximized a prioriwithout constructing a template bank over them. Mohanty and Dhurandhar showed that a significant reduction in computational resources can be accomplished by using a hierarchy of such template banks where candidate events triggered by a sparsely populated grid are followed up by the regular, dense flat-search grid. The estimated speedup in this method was a factor ˜25 over the flat search. In this paper we report an improved implementation of the hierarchical search, wherein we extend the domain of hierarchy to an extra dimension—namely, the time of arrival of the signal in the bandwidth of the interferometer. This is accomplished by lowering the Nyquist sampling rate of the signal in the trigger stage. We show that this leads to further improvement in the efficiency of data analysis and speeds up the online computation by a factor of ˜65 70 over the flat search. We also take into account and discuss issues related to template placement, trigger thresholds, and other peculiar problems that do not arise in earlier implementation schemes of the hierarchical search. We present simulation results for 2PN waveforms embedded in the noise expected for initial LIGO detectors.
VLSI implementation of a new LMS-based algorithm for noise removal in ECG signal
NASA Astrophysics Data System (ADS)
Satheeskumaran, S.; Sabrigiriraj, M.
2016-06-01
Least mean square (LMS)-based adaptive filters are widely deployed for removing artefacts in electrocardiogram (ECG) due to less number of computations. But they posses high mean square error (MSE) under noisy environment. The transform domain variable step-size LMS algorithm reduces the MSE at the cost of computational complexity. In this paper, a variable step-size delayed LMS adaptive filter is used to remove the artefacts from the ECG signal for improved feature extraction. The dedicated digital Signal processors provide fast processing, but they are not flexible. By using field programmable gate arrays, the pipelined architectures can be used to enhance the system performance. The pipelined architecture can enhance the operation efficiency of the adaptive filter and save the power consumption. This technique provides high signal-to-noise ratio and low MSE with reduced computational complexity; hence, it is a useful method for monitoring patients with heart-related problem.
NASA Technical Reports Server (NTRS)
Mandy, Christophe P.; Sakamoto, Hiraku; Saenz-Otero, Alvar; Miller, David W.
2007-01-01
The MIT's Space Systems Laboratory developed the Synchronized Position Hold Engage and Reorient Experimental Satellites (SPHERES) as a risk-tolerant spaceborne facility to develop and mature control, estimation, and autonomy algorithms for distributed satellite systems for applications such as satellite formation flight. Tests performed study interferometric mission-type formation flight maneuvers in deep space. These tests consist of having the satellites trace a coordinated trajectory under tight control that would allow simulated apertures to constructively interfere observed light and measure the resulting increase in angular resolution. This paper focuses on formation initialization (establishment of a formation using limited field of view relative sensors), formation coordination (synchronization of the different satellite s motion) and fuel-balancing among the different satellites.
Algorithmic complexity for psychology: a user-friendly implementation of the coding theorem method.
Gauvrit, Nicolas; Singmann, Henrik; Soler-Toscano, Fernando; Zenil, Hector
2016-03-01
Kolmogorov-Chaitin complexity has long been believed to be impossible to approximate when it comes to short sequences (e.g. of length 5-50). However, with the newly developed coding theorem method the complexity of strings of length 2-11 can now be numerically estimated. We present the theoretical basis of algorithmic complexity for short strings (ACSS) and describe an R-package providing functions based on ACSS that will cover psychologists' needs and improve upon previous methods in three ways: (1) ACSS is now available not only for binary strings, but for strings based on up to 9 different symbols, (2) ACSS no longer requires time-consuming computing, and (3) a new approach based on ACSS gives access to an estimation of the complexity of strings of any length. Finally, three illustrative examples show how these tools can be applied to psychology.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
Alteration of Box-Jenkins methodology by implementing genetic algorithm method
NASA Astrophysics Data System (ADS)
Ismail, Zuhaimy; Maarof, Mohd Zulariffin Md; Fadzli, Mohammad
2015-02-01
A time series is a set of values sequentially observed through time. The Box-Jenkins methodology is a systematic method of identifying, fitting, checking and using integrated autoregressive moving average time series model for forecasting. Box-Jenkins method is an appropriate for a medium to a long length (at least 50) time series data observation. When modeling a medium to a long length (at least 50), the difficulty arose in choosing the accurate order of model identification level and to discover the right parameter estimation. This presents the development of Genetic Algorithm heuristic method in solving the identification and estimation models problems in Box-Jenkins. Data on International Tourist arrivals to Malaysia were used to illustrate the effectiveness of this proposed method. The forecast results that generated from this proposed model outperformed single traditional Box-Jenkins model.
Algorithmic complexity for psychology: a user-friendly implementation of the coding theorem method.
Gauvrit, Nicolas; Singmann, Henrik; Soler-Toscano, Fernando; Zenil, Hector
2016-03-01
Kolmogorov-Chaitin complexity has long been believed to be impossible to approximate when it comes to short sequences (e.g. of length 5-50). However, with the newly developed coding theorem method the complexity of strings of length 2-11 can now be numerically estimated. We present the theoretical basis of algorithmic complexity for short strings (ACSS) and describe an R-package providing functions based on ACSS that will cover psychologists' needs and improve upon previous methods in three ways: (1) ACSS is now available not only for binary strings, but for strings based on up to 9 different symbols, (2) ACSS no longer requires time-consuming computing, and (3) a new approach based on ACSS gives access to an estimation of the complexity of strings of any length. Finally, three illustrative examples show how these tools can be applied to psychology. PMID:25761393
NASA Astrophysics Data System (ADS)
Sun, Hong; Wu, Qian-zhong
2013-09-01
In order to improve the precision of optical-electric tracking device, proposing a kind of improved optical-electric tracking device based on MEMS, in allusion to the tracking error of gyroscope senor and the random drift, According to the principles of time series analysis of random sequence, establish AR model of gyro random error based on Kalman filter algorithm, then the output signals of gyro are multiple filtered with Kalman filter. And use ARM as micro controller servo motor is controlled by fuzzy PID full closed loop control algorithm, and add advanced correction and feed-forward links to improve response lag of angle input, Free-forward can make output perfectly follow input. The function of lead compensation link is to shorten the response of input signals, so as to reduce errors. Use the wireless video monitor module and remote monitoring software (Visual Basic 6.0) to monitor servo motor state in real time, the video monitor module gathers video signals, and the wireless video module will sent these signals to upper computer, so that show the motor running state in the window of Visual Basic 6.0. At the same time, take a detailed analysis to the main error source. Through the quantitative analysis of the errors from bandwidth and gyro sensor, it makes the proportion of each error in the whole error more intuitive, consequently, decrease the error of the system. Through the simulation and experiment results shows the system has good following characteristic, and it is very valuable for engineering application.
A new algorithm for determining 3D biplane imaging geometry: theory and implementation
NASA Astrophysics Data System (ADS)
Singh, Vikas; Xu, Jinhui; Hoffmann, Kenneth R.; Xu, Guang; Chen, Zhenming; Gopal, Anant
2005-04-01
Biplane imaging is a primary method for visual and quantitative assessment of the vasculature. A key problem called Imaging Geometry Determination problem (IGD for short) in this method is to determine the rotation-matrix R and the translation-vector t which relate the two coordinate systems. In this paper, we propose a new approach, called IG-Sieving, to calculate R and t using corresponding points in the two images. Our technique first generates an initial estimate of R and t from the gantry angles of the imaging system, and then optimizes them by solving an optimal-cell-search problem in a 6-D parametric space (three variables defining R plus the three variables of t). To efficiently find the optimal imaging geometry (IG) in 6-D, our approach divides the high dimensional search domain into a set of lower-dimensional regions, thereby reducing the optimal-cell-search problem to a set of optimization problems in 3D sub-spaces. For each such sub-space, our approach first applies efficient computational geometry techniques to identify "possibly-feasible"" IG"s, and then uses a criterion we call fall-in-number to sieve out good IGs. We show that in a bounded number of optimization steps, a (possibly infinite) set of near-optimal IGs can be determined. Simulation results indicate that our method can reconstruct 3D points with average 3D center-of-mass errors of about 0.8cm for input image-data errors as high as 0.1cm. More importantly, our algorithm provides a novel insight into the geometric structure of the solution-space, which could be exploited to significantly improve the accuracy of other biplane algorithms.
Implementation and optimization of ultrasound signal processing algorithms on mobile GPU
NASA Astrophysics Data System (ADS)
Kong, Woo Kyu; Lee, Wooyoul; Kim, Kyu Cheol; Yoo, Yangmo; Song, Tai-Kyong
2014-03-01
A general-purpose graphics processing unit (GPGPU) has been used for improving computing power in medical ultrasound imaging systems. Recently, a mobile GPU becomes powerful to deal with 3D games and videos at high frame rates on Full HD or HD resolution displays. This paper proposes the method to implement ultrasound signal processing on a mobile GPU available in the high-end smartphone (Galaxy S4, Samsung Electronics, Seoul, Korea) with programmable shaders on the OpenGL ES 2.0 platform. To maximize the performance of the mobile GPU, the optimization of shader design and load sharing between vertex and fragment shader was performed. The beamformed data were captured from a tissue mimicking phantom (Model 539 Multipurpose Phantom, ATS Laboratories, Inc., Bridgeport, CT, USA) by using a commercial ultrasound imaging system equipped with a research package (Ultrasonix Touch, Ultrasonix, Richmond, BC, Canada). The real-time performance is evaluated by frame rates while varying the range of signal processing blocks. The implementation method of ultrasound signal processing on OpenGL ES 2.0 was verified by analyzing PSNR with MATLAB gold standard that has the same signal path. CNR was also analyzed to verify the method. From the evaluations, the proposed mobile GPU-based processing method has no significant difference with the processing using MATLAB (i.e., PSNR<52.51 dB). The comparable results of CNR were obtained from both processing methods (i.e., 11.31). From the mobile GPU implementation, the frame rates of 57.6 Hz were achieved. The total execution time was 17.4 ms that was faster than the acquisition time (i.e., 34.4 ms). These results indicate that the mobile GPU-based processing method can support real-time ultrasound B-mode processing on the smartphone.
Implementation of ILLIAC 4 algorithms for multispectral image interpretation. [earth resources data
NASA Technical Reports Server (NTRS)
Ray, R. M.; Thomas, J. D.; Donovan, W. E.; Swain, P. H.
1974-01-01
Research has focused on the design and partial implementation of a comprehensive ILLIAC software system for computer-assisted interpretation of multispectral earth resources data such as that now collected by the Earth Resources Technology Satellite. Research suggests generally that the ILLIAC 4 should be as much as two orders of magnitude more cost effective than serial processing computers for digital interpretation of ERTS imagery via multivariate statistical classification techniques. The potential of the ARPA Network as a mechanism for interfacing geographically-dispersed users to an ILLIAC 4 image processing facility is discussed.
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1980-01-01
A new formulation is proposed for the problem of parameter estimation of dynamic systems with both process and measurement noise. The formulation gives estimates that are maximum likelihood asymptotically in time. The means used to overcome the difficulties encountered by previous formulations are discussed. It is then shown how the proposed formulation can be efficiently implemented in a computer program. A computer program using the proposed formulation is available in a form suitable for routine application. Examples with simulated and real data are given to illustrate that the program works well.
NASA Astrophysics Data System (ADS)
Melsa, J. L.; Mills, J. D.; Arora, A. A.
1983-06-01
This report describes the results of a fifteen month study of the real-time implementation of an algorithm combining time-domain harmonic scaling and Adaptive Residual Coding at a transmission bit rate of 16 kb/s. The modifications of this encoding algorithm as originally presented by Melso and Pande to allow real-time implementation are described in detail. A non real-time FORTRAN simulation using a sixteen-bit word length was developed and tested to establish feasibility. The hardware implementation of a full-duplex, real-time system has demonstrated that this algorithm is capable of producing toll quality speech digitization. This report has been divided into two volumes. The second volume discusses details of the hardware implementation, schematics for the system and operating instructions.
G0W0 implementation using Lanczos algorithm and Sternheimer equation
NASA Astrophysics Data System (ADS)
Laflamme Janssen, Jonathan; Berube, Nicolas; Antonius, Gabriel; Cote, Michel
2012-02-01
The G0W0 approach is an accurate method to give a physical meaning to the eigenvalues obtained in adensity-functional theory (DFT) calculation.However, the calculation of such corrections with plane wave codes is currently prohibitive for systems with more than a few hundreds of electrons. What limits calculations to this system size is the need in current implementations to invert the dielectric matrix and the need to carry out summations over conduction bands. This talk presents a strategy to avoid both of these bottlenecks. In traditional plane wave implementations of G0W0, the dielectric matrix is expressed in a plane wave basis, which needs to be relatively big to properly describe the matrix. Here, we will explain how a Lanczos basis can be generated to substantially reduce the size of the matrix. Also, the number of conduction bands needed to reach convergence in the summations is usually an order of magnitude larger than the number of valence bands. Here, the calculation of the conduction states is avoided by reformulating the summations into linear equation problems (Sternheimer equations), which also substantially reduces the computation time. Preliminary results will be presented.
NASA Astrophysics Data System (ADS)
Baba, Toshitaka; Takahashi, Narumi; Kaneda, Yoshiyuki; Ando, Kazuto; Matsuoka, Daisuke; Kato, Toshihiro
2015-12-01
Because of improvements in offshore tsunami observation technology, dispersion phenomena during tsunami propagation have often been observed in recent tsunamis, for example the 2004 Indian Ocean and 2011 Tohoku tsunamis. The dispersive propagation of tsunamis can be simulated by use of the Boussinesq model, but the model demands many computational resources. However, rapid progress has been made in parallel computing technology. In this study, we investigated a parallelized approach for dispersive tsunami wave modeling. Our new parallel software solves the nonlinear Boussinesq dispersive equations in spherical coordinates. A variable nested algorithm was used to increase spatial resolution in the target region. The software can also be used to predict tsunami inundation on land. We used the dispersive tsunami model to simulate the 2011 Tohoku earthquake on the Supercomputer K. Good agreement was apparent between the dispersive wave model results and the tsunami waveforms observed offshore. The finest bathymetric grid interval was 2/9 arcsec (approx. 5 m) along longitude and latitude lines. Use of this grid simulated tsunami soliton fission near the Sendai coast. Incorporating the three-dimensional shape of buildings and structures led to improved modeling of tsunami inundation.
NASA Astrophysics Data System (ADS)
Hatzaki, Maria; Flocas, Elena A.; Simmonds, Ian; Kouroutzoglou, John; Keay, Kevin; Rudeva, Irina
2013-04-01
Migratory cyclones and anticyclones mainly account for the short-term weather variations in extra-tropical regions. By contrast to cyclones that have drawn major scientific attention due to their direct link to active weather and precipitation, climatological studies on anticyclones are limited, even though they also are associated with extreme weather phenomena and play an important role in global and regional climate. This is especially true for the Mediterranean, a region particularly vulnerable to climate change, and the little research which has been done is essentially confined to the manual analysis of synoptic charts. For the construction of a comprehensive climatology of migratory anticyclonic systems in the Mediterranean using an objective methodology, the Melbourne University automatic tracking algorithm is applied, based to the ERA-Interim reanalysis mean sea level pressure database. The algorithm's reliability in accurately capturing the weather patterns and synoptic climatology of the transient activity has been widely proven. This algorithm has been extensively applied for cyclone studies worldwide and it has been also successfully applied for the Mediterranean, though its use for anticyclone tracking is limited to the Southern Hemisphere. In this study the performance of the tracking algorithm under different data resolutions and different choices of parameter settings in the scheme is examined. Our focus is on the appropriate modification of the algorithm in order to efficiently capture the individual characteristics of the anticyclonic tracks in the Mediterranean, a closed basin with complex topography. We show that the number of the detected anticyclonic centers and the resulting tracks largely depend upon the data resolution and the search radius. We also find that different scale anticyclones and secondary centers that lie within larger anticyclone structures can be adequately represented; this is important, since the extensions of major
NASA Astrophysics Data System (ADS)
Beeson, Brett; Barnes, David G.; Bourke, Paul D.
We describe the first distributed data implementation of the perspective shear-warp volume rendering algorithm and explore its applications to large astronomical data cubes and simulation realisations. Our system distributes sub-volumes of 3-dimensional images to leaf nodes of a Beowulf-class cluster, where the rendering takes place. Junction nodes composite the sub-volume renderings together and pass the combined images upwards for further compositing or display. We demonstrate that our system out-performs other software solutions and can render a `worst-case' 512 × 512 × 512 data volume in less than four seconds using 16 rendering and 15 compositing nodes. Our system also performs very well compared with much more expensive hardware systems. With appropriate commodity hardware, such as Swinburne's Virtual Reality Theatre or a 3Dlabs Wildcat graphics card, stereoscopic display is possible.
Implementation of the SU(2) Hamiltonian symmetry for the DMRG algorithm
NASA Astrophysics Data System (ADS)
Alvarez, Gonzalo
2012-10-01
In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992, 1993) [1,2], Hamiltonian symmetries play an important rôle. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This paper explains how the the DMRG++ code (Alvarez, 2009) [3] has been extended to handle the non-local SU(2) symmetry in a model independent way. Improvements in CPU times compared to runs with only local symmetries are discussed for the one-orbital Hubbard model, and for a two-orbital Hubbard model for iron-based superconductors. The computational bottleneck of the algorithm and the use of shared memory parallelization are also addressed. Program summary Program title: DMRG++ Catalog identifier: AEDJ_v2_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDJ_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Special license. See http://cpc.cs.qub.ac.uk/licence/AEDJ_v2_0.html No. of lines in distributed program, including test data, etc.: 211560 No. of bytes in distributed program, including test data, etc.: 10572185 Distribution format: tar.gz Programming language: C++. Computer: PC. Operating system: Multiplatform, tested on Linux. Has the code been vectorized or parallelized?: Yes. 1 to 8 processors with MPI, 2 to 4 cores with pthreads. RAM: 1GB (256MB is enough to run the included test) Classification: 23. Catalog identifier of previous version: AEDJ_v1_0 Journal reference of previous version: Comput. Phys. Comm. 180(2009)1572 External routines: BLAS and LAPACK Nature of problem: Strongly correlated electrons systems, display a broad range of important phenomena, and their study is a major area of research in condensed matter physics. In this context, model Hamiltonians are used to simulate the relevant interactions of a given compound, and the relevant degrees of freedom. These studies
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-01-01
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed, that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification allows to achieve data throughput of 175 MPixels/s and makes processing of Full HD video stream (1, 920 × 1, 080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in details. Three versions of optical flow modules, with different numerical precision, working frequency and obtained results accuracy are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from an optical flow dataset of the Middlebury University. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
NASA Astrophysics Data System (ADS)
Baker, B. I.; Friberg, P. A.
2014-12-01
Modern seismic networks typically deploy three component (3C) sensors, but still fail to utilize all of the information available in the seismograms when performing automated phase picking for real-time event location. In most cases a variation on a short term over long term average threshold detector is used for picking and then an association program is used to assign phase types to the picks. However, the 3C waveforms from an earthquake contain an abundance of information related to the P and S phases in both their polarization and energy partitioning. An approach that has been overlooked and has demonstrated encouraging results is the Component Energy Comparison Method (CECM) by Nagano et al. as published in Geophysics 1989. CECM is well suited to being used in real-time because the calculation is not computationally intensive. Furthermore, the CECM method has fewer tuning variables (3) than traditional pickers in Earthworm such as the Rex Allen algorithm (N=18) or even the Anthony Lomax Filter Picker module (N=5). In addition to computing the CECM detector we study the detector sensitivity by rotating the signal into principle components as well as estimating the P phase onset from a curvature function describing the CECM as opposed to the CECM itself. We present our results implementing this algorithm in a real-time module for Earthworm and show the improved phase picks as compared to the traditional single component pickers using Earthworm.
McDonnell, Mark D; Mohan, Ashutosh; Stricker, Christian
2013-01-01
The release of neurotransmitter vesicles after arrival of a pre-synaptic action potential (AP) at cortical synapses is known to be a stochastic process, as is the availability of vesicles for release. These processes are known to also depend on the recent history of AP arrivals, and this can be described in terms of time-varying probabilities of vesicle release. Mathematical models of such synaptic dynamics frequently are based only on the mean number of vesicles released by each pre-synaptic AP, since if it is assumed there are sufficiently many vesicle sites, then variance is small. However, it has been shown recently that variance across sites can be significant for neuron and network dynamics, and this suggests the potential importance of studying short-term plasticity using simulations that do generate trial-to-trial variability. Therefore, in this paper we study several well-known conceptual models for stochastic availability and release. We state explicitly the random variables that these models describe and propose efficient algorithms for accurately implementing stochastic simulations of these random variables in software or hardware. Our results are complemented by mathematical analysis and statement of pseudo-code algorithms.
NASA Astrophysics Data System (ADS)
Hyer, E. J.; Schmidt, C. C.; Hoffman, J.; Giglio, L.; Peterson, D. A.
2013-12-01
Polar and geostationary satellites are used operationally for fire detection and smoke source estimation by many near-real-time operational users, including operational forecast centers around the globe. The input satellite radiance data are processed by data providers to produce Level-2 and Level -3 fire detection products, but processing these data into spatially and temporally consistent estimates of fire activity requires a substantial amount of additional processing. The most significant processing steps are correction for variable coverage of the satellite observations, and correction for conditions that affect the detection efficiency of the satellite sensors. We describe a system developed by the Naval Research Laboratory (NRL) that uses the full raster information from the entire constellation to diagnose detection opportunities, calculate corrections for factors such as angular dependence of detection efficiency, and generate global estimates of fire activity at spatial and temporal scales suitable for atmospheric modeling. By incorporating these improved fire observations, smoke emissions products, such as NRL's FLAMBE, are able to produce improved estimates of global emissions. This talk provides an overview of the system, demonstrates the achievable improvement over older methods, and describes challenges for near-real-time implementation.
Van Wart, Adam T; Durrant, Jacob; Votapka, Lane; Amaro, Rommie E
2014-02-11
Allostery can occur by way of subtle cooperation among protein residues (e.g., amino acids) even in the absence of large conformational shifts. Dynamical network analysis has been used to model this cooperation, helping to computationally explain how binding to an allosteric site can impact the behavior of a primary site many ångstroms away. Traditionally, computational efforts have focused on the most optimal path of correlated motions leading from the allosteric to the primary active site. We present a program called Weighted Implementation of Suboptimal Paths (WISP) capable of rapidly identifying additional suboptimal pathways that may also play important roles in the transmission of allosteric signals. Aside from providing signal redundancy, suboptimal paths traverse residues that, if disrupted through pharmacological or mutational means, could modulate the allosteric regulation of important drug targets. To demonstrate the utility of our program, we present a case study describing the allostery of HisH-HisF, an amidotransferase from T. maritima thermotiga. WISP and its VMD-based graphical user interface (GUI) can be downloaded from http://nbcr.ucsd.edu/wisp.
An implementation of the TFQMR - algorithm on a distributed memory machine
Buecker, M.
1994-12-31
A fundamental task of numerical computing is to solve linear systems A{rvec x} = {rvec b}, which can be labeled as (1). Such systems are essential parts of many scientific problems, e.g. finite element methods contain linear systems to approximate partial differential equations. Linear systems can be solved by direct as well as iterative methods. Using a direct method means to factorize the coefficient matrix. A typical and well known direct method is the Gaussian elimination. Direct methods work well as long as the systems remain small. Unfortunately, there is need for the solution of large linear systems. When systems get large direct methods result in enormous computing time because of their complexity. As an example take (1) with a N x N matrix A. Then the solution of (1) can be computed by Gaussian elimination requiring O(N{sup 3}) operations. Additionally, an implementation of a direct method has to take care of the high storage requirement. In contrast to direct methods, iterative methods use successive approximations to obtain more accurate solutions to a linear system at each step. The iteration process generates a sequence of iterates x{sub n} converging to the solution x of (1). The iteration process ends if either x{sub n} fulfills a chosen convergence criterion or breakdowns, i.e. division by zero, occur. Possible breakdowns can be avoided by look-ahead techniques.
Raghunath, N; Faber, T L; Suryanarayanan, S; Votaw, J R
2009-02-01
Image quality is significantly degraded even by small amounts of patient motion in very high-resolution PET scanners. When patient motion is known, deconvolution methods can be used to correct the reconstructed image and reduce motion blur. This paper describes the implementation and optimization of an iterative deconvolution method that uses an ordered subset approach to make it practical and clinically viable. We performed ten separate FDG PET scans using the Hoffman brain phantom and simultaneously measured its motion using the Polaris Vicra tracking system (Northern Digital Inc., Ontario, Canada). The feasibility and effectiveness of the technique was studied by performing scans with different motion and deconvolution parameters. Deconvolution resulted in visually better images and significant improvement as quantified by the Universal Quality Index (UQI) and contrast measures. Finally, the technique was applied to human studies to demonstrate marked improvement. Thus, the deconvolution technique presented here appears promising as a valid alternative to existing motion correction methods for PET. It has the potential for deblurring an image from any modality if the causative motion is known and its effect can be represented in a system matrix.
Automated infrasound signal detection algorithms implemented in MatSeis - Infra Tool.
Hart, Darren
2004-07-01
MatSeis's infrasound analysis tool, Infra Tool, uses frequency slowness processing to deconstruct the array data into three outputs per processing step: correlation, azimuth and slowness. Until now, an experienced analyst trained to recognize a pattern observed in outputs from signal processing manually accomplished infrasound signal detection. Our goal was to automate the process of infrasound signal detection. The critical aspect of infrasound signal detection is to identify consecutive processing steps where the azimuth is constant (flat) while the time-lag correlation of the windowed waveform is above background value. These two statements describe the arrival of a correlated set of wavefronts at an array. The Hough Transform and Inverse Slope methods are used to determine the representative slope for a specified number of azimuth data points. The representative slope is then used in conjunction with associated correlation value and azimuth data variance to determine if and when an infrasound signal was detected. A format for an infrasound signal detection output file is also proposed. The detection output file will list the processed array element names, followed by detection characteristics for each method. Each detection is supplied with a listing of frequency slowness processing characteristics: human time (YYYY/MM/DD HH:MM:SS.SSS), epochal time, correlation, fstat, azimuth (deg) and trace velocity (km/s). As an example, a ground truth event was processed using the four-element DLIAR infrasound array located in New Mexico. The event is known as the Watusi chemical explosion, which occurred on 2002/09/28 at 21:25:17 with an explosive yield of 38,000 lb TNT equivalent. Knowing the source and array location, the array-to-event distance was computed to be approximately 890 km. This test determined the station-to-event azimuth (281.8 and 282.1 degrees) to within 1.6 and 1.4 degrees for the Inverse Slope and Hough Transform detection algorithms, respectively, and
Thimmaiah, Tim; Voje, William E; Carothers, James M
2015-01-01
With progress toward inexpensive, large-scale DNA assembly, the demand for simulation tools that allow the rapid construction of synthetic biological devices with predictable behaviors continues to increase. By combining engineered transcript components, such as ribosome binding sites, transcriptional terminators, ligand-binding aptamers, catalytic ribozymes, and aptamer-controlled ribozymes (aptazymes), gene expression in bacteria can be fine-tuned, with many corollaries and applications in yeast and mammalian cells. The successful design of genetic constructs that implement these kinds of RNA-based control mechanisms requires modeling and analyzing kinetically determined co-transcriptional folding pathways. Transcript design methods using stochastic kinetic folding simulations to search spacer sequence libraries for motifs enabling the assembly of RNA component parts into static ribozyme- and dynamic aptazyme-regulated expression devices with quantitatively predictable functions (rREDs and aREDs, respectively) have been described (Carothers et al., Science 334:1716-1719, 2011). Here, we provide a detailed practical procedure for computational transcript design by illustrating a high throughput, multiprocessor approach for evaluating spacer sequences and generating functional rREDs. This chapter is written as a tutorial, complete with pseudo-code and step-by-step instructions for setting up a computational cluster with an Amazon, Inc. web server and performing the large numbers of kinefold-based stochastic kinetic co-transcriptional folding simulations needed to design functional rREDs and aREDs. The method described here should be broadly applicable for designing and analyzing a variety of synthetic RNA parts, devices and transcripts.
NASA Astrophysics Data System (ADS)
Cui, Ganglong; Yang, Weitao
2011-05-01
The significance of conical intersections in photophysics, photochemistry, and photodissociation of polyatomic molecules in gas phase has been demonstrated by numerous experimental and theoretical studies. Optimization of conical intersections of small- and medium-size molecules in gas phase has currently become a routine optimization process, as it has been implemented in many electronic structure packages. However, optimization of conical intersections of small- and medium-size molecules in solution or macromolecules remains inefficient, even poorly defined, due to large number of degrees of freedom and costly evaluations of gradient difference and nonadiabatic coupling vectors. In this work, based on the sequential quantum mechanics and molecular mechanics (QM/MM) and QM/MM-minimum free energy path methods, we have designed two conical intersection optimization methods for small- and medium-size molecules in solution or macromolecules. The first one is sequential QM conical intersection optimization and MM minimization for potential energy surfaces; the second one is sequential QM conical intersection optimization and MM sampling for potential of mean force surfaces, i.e., free energy surfaces. In such methods, the region where electronic structures change remarkably is placed into the QM subsystem, while the rest of the system is placed into the MM subsystem; thus, dimensionalities of gradient difference and nonadiabatic coupling vectors are decreased due to the relatively small QM subsystem. Furthermore, in comparison with the concurrent optimization scheme, sequential QM conical intersection optimization and MM minimization or sampling reduce the number of evaluations of gradient difference and nonadiabatic coupling vectors because these vectors need to be calculated only when the QM subsystem moves, independent of the MM minimization or sampling. Taken together, costly evaluations of gradient difference and nonadiabatic coupling vectors in solution or
Implementation of a parallel algorithm for thermo-chemical nonequilibrium flow simulations
Wong, C.C.; Blottner, F.G.; Payne, J.L.; Soetrisno, M.
1995-01-01
Massively parallel (MP) computing is considered to be the future direction of high performance computing. When engineers apply this new MP computing technology to solve large-scale problems, one major interest is what is the maximum problem size that a MP computer can handle. To determine the maximum size, it is important to address the code scalability issue. Scalability implies whether the code can provide an increase in performance proportional to an increase in problem size. If the size of the problem increases, by utilizing more computer nodes, the ideal elapsed time to simulate a problem should not increase much. Hence one important task in the development of the MP computing technology is to ensure scalability. A scalable code is an efficient code. In order to obtain good scaled performance, it is necessary to first have the code optimized for a single node performance before proceeding to a large-scale simulation with a large number of computer nodes. This paper will discuss the implementation of a massively parallel computing strategy and the process of optimization to improve the scaled performance. Specifically, we will look at domain decomposition, resource management in the code, communication overhead, and problem mapping. By incorporating these improvements and adopting an efficient MP computing strategy, an efficiency of about 85% and 96%, respectively, has been achieved using 64 nodes on MP computers for both perfect gas and chemically reactive gas problems. A comparison of the performance between MP computers and a vectorized computer, such as Cray-YMP, will also be presented.
Cickovski, Trevor; Flor, Tiffany; Irving-Sachs, Galen; Novikov, Philip; Parda, James; Narasimhan, Giri
2015-01-01
In order to make multiple copies of a target sequence in the laboratory, the technique of Polymerase Chain Reaction (PCR) requires the design of "primers", which are short fragments of nucleotides complementary to the flanking regions of the target sequence. If the same primer is to amplify multiple closely related target sequences, then it is necessary to make the primers "degenerate", which would allow it to hybridize to target sequences with a limited amount of variability that may have been caused by mutations. However, the PCR technique can only allow a limited amount of degeneracy, and therefore the design of degenerate primers requires the identification of reasonably well-conserved regions in the input sequences. We take an existing algorithm for designing degenerate primers that is based on clustering and parallelize it in a web-accessible software package GPUDePiCt, using a shared memory model and the computing power of Graphics Processing Units (GPUs). We test our implementation on large sets of aligned sequences from the human genome and show a multi-fold speedup for clustering using our hybrid GPU/CPU implementation over a pure CPU approach for these sequences, which consist of more than 7,500 nucleotides. We also demonstrate that this speedup is consistent over larger numbers and longer lengths of aligned sequences.
Cickovski, Trevor; Flor, Tiffany; Irving-Sachs, Galen; Novikov, Philip; Parda, James; Narasimhan, Giri
2015-01-01
In order to make multiple copies of a target sequence in the laboratory, the technique of Polymerase Chain Reaction (PCR) requires the design of "primers", which are short fragments of nucleotides complementary to the flanking regions of the target sequence. If the same primer is to amplify multiple closely related target sequences, then it is necessary to make the primers "degenerate", which would allow it to hybridize to target sequences with a limited amount of variability that may have been caused by mutations. However, the PCR technique can only allow a limited amount of degeneracy, and therefore the design of degenerate primers requires the identification of reasonably well-conserved regions in the input sequences. We take an existing algorithm for designing degenerate primers that is based on clustering and parallelize it in a web-accessible software package GPUDePiCt, using a shared memory model and the computing power of Graphics Processing Units (GPUs). We test our implementation on large sets of aligned sequences from the human genome and show a multi-fold speedup for clustering using our hybrid GPU/CPU implementation over a pure CPU approach for these sequences, which consist of more than 7,500 nucleotides. We also demonstrate that this speedup is consistent over larger numbers and longer lengths of aligned sequences. PMID:26357230
NASA Astrophysics Data System (ADS)
Ullah, Muhammed Zafar
Neural Network and Fuzzy Logic are the two key technologies that have recently received growing attention in solving real world, nonlinear, time variant problems. Because of their learning and/or reasoning capabilities, these techniques do not need a mathematical model of the system, which may be difficult, if not impossible, to obtain for complex systems. One of the major problems in portable or electric vehicle world is secondary cell charging, which shows non-linear characteristics. Portable-electronic equipment, such as notebook computers, cordless and cellular telephones and cordless-electric lawn tools use batteries in increasing numbers. These consumers demand fast charging times, increased battery lifetime and fuel gauge capabilities. All of these demands require that the state-of-charge within a battery be known. Charging secondary cells Fast is a problem, which is difficult to solve using conventional techniques. Charge control is important in fast charging, preventing overcharging and improving battery life. This research work provides a quick and reliable approach to charger design using Neural-Fuzzy technology, which learns the exact battery charging characteristics. Neural-Fuzzy technology is an intelligent combination of neural net with fuzzy logic that learns system behavior by using system input-output data rather than mathematical modeling. The primary objective of this research is to improve the secondary cell charging algorithm and to have faster charging time based on neural network and fuzzy logic technique. Also a new architecture of a controller will be developed for implementing the charging algorithm for the secondary battery.
Breuer, Christian; Lucas, Martin; Schütze, Frank-Walter; Claus, Peter
2007-01-01
A multi-criteria optimisation procedure based on genetic algorithms is carried out in search of advanced heterogeneous catalysts for total oxidation. Simple but flexible software routines have been created to be applied within a search space of more then 150,000 individuals. The general catalyst design includes mono-, bi- and trimetallic compositions assembled out of 49 different metals and depleted on an Al2O3 support in up to nine amount levels. As an efficient tool for high-throughput screening and perfectly matched to the requirements of heterogeneous gas phase catalysis - especially for applications technically run in honeycomb structures - the multi-channel monolith reactor is implemented to evaluate the catalyst performances. Out of a multi-component feed-gas, the conversion rates of carbon monoxide (CO) and a model hydrocarbon (HC) are monitored in parallel. In combination with further restrictions to preparation and pre-treatment a primary screening can be conducted, promising to provide results close to technically applied catalysts. Presented are the resulting performances of the optimisation process for the first catalyst generations and the prospect of its auto-adaptation to specified optimisation goals. PMID:17266517
Ezra, Elishai; Maor, Idan; Bavli, Danny; Shalom, Itai; Levy, Gahl; Prill, Sebastian; Jaeger, Magnus S; Nahmias, Yaakov
2015-08-01
Microfluidic applications range from combinatorial synthesis to high throughput screening, with platforms integrating analog perfusion components, digitally controlled micro-valves and a range of sensors that demand a variety of communication protocols. Currently, discrete control units are used to regulate and monitor each component, resulting in scattered control interfaces that limit data integration and synchronization. Here, we present a microprocessor-based control unit, utilizing the MS Gadgeteer open framework that integrates all aspects of microfluidics through a high-current electronic circuit that supports and synchronizes digital and analog signals for perfusion components, pressure elements, and arbitrary sensor communication protocols using a plug-and-play interface. The control unit supports an integrated touch screen and TCP/IP interface that provides local and remote control of flow and data acquisition. To establish the ability of our control unit to integrate and synchronize complex microfluidic circuits we developed an equi-pressure combinatorial mixer. We demonstrate the generation of complex perfusion sequences, allowing the automated sampling, washing, and calibrating of an electrochemical lactate sensor continuously monitoring hepatocyte viability following exposure to the pesticide rotenone. Importantly, integration of an optical sensor allowed us to implement automated optimization protocols that require different computational challenges including: prioritized data structures in a genetic algorithm, distributed computational efforts in multiple-hill climbing searches and real-time realization of probabilistic models in simulated annealing. Our system offers a comprehensive solution for establishing optimization protocols and perfusion sequences in complex microfluidic circuits. PMID:26227212
Ezra, Elishai; Maor, Idan; Bavli, Danny; Shalom, Itai; Levy, Gahl; Prill, Sebastian; Jaeger, Magnus S; Nahmias, Yaakov
2015-08-01
Microfluidic applications range from combinatorial synthesis to high throughput screening, with platforms integrating analog perfusion components, digitally controlled micro-valves and a range of sensors that demand a variety of communication protocols. Currently, discrete control units are used to regulate and monitor each component, resulting in scattered control interfaces that limit data integration and synchronization. Here, we present a microprocessor-based control unit, utilizing the MS Gadgeteer open framework that integrates all aspects of microfluidics through a high-current electronic circuit that supports and synchronizes digital and analog signals for perfusion components, pressure elements, and arbitrary sensor communication protocols using a plug-and-play interface. The control unit supports an integrated touch screen and TCP/IP interface that provides local and remote control of flow and data acquisition. To establish the ability of our control unit to integrate and synchronize complex microfluidic circuits we developed an equi-pressure combinatorial mixer. We demonstrate the generation of complex perfusion sequences, allowing the automated sampling, washing, and calibrating of an electrochemical lactate sensor continuously monitoring hepatocyte viability following exposure to the pesticide rotenone. Importantly, integration of an optical sensor allowed us to implement automated optimization protocols that require different computational challenges including: prioritized data structures in a genetic algorithm, distributed computational efforts in multiple-hill climbing searches and real-time realization of probabilistic models in simulated annealing. Our system offers a comprehensive solution for establishing optimization protocols and perfusion sequences in complex microfluidic circuits.
NASA Astrophysics Data System (ADS)
Vanderstraeten, Barbara; DeGersem, Werner; Duthoy, Wim; DeNeve, Wilfried; Thierens, Hubert
2006-08-01
The development of new biological imaging technologies offers the opportunity to further individualize radiotherapy. Biologically conformal radiation therapy (BCRT) implies the use of the spatial distribution of one or more radiobiological parameters to guide the IMRT dose prescription. Our aim was to implement BCRT in an algorithmic segmentation-based planning approach. A biology-based segmentation tool was developed to generate initial beam segments that reflect the biological signal intensity pattern. The weights and shapes of the initial segments are optimized by means of an objective function that minimizes the root mean square deviation between the actual and intended dose values within the PTV. As proof of principle, [18F]FDG-PET-guided BCRT plans for two different levels of dose escalation were created for an oropharyngeal cancer patient. Both plans proved to be dosimetrically feasible without violating the planning constraints for the expanded spinal cord and the contralateral parotid gland as organs at risk. The obtained biological conformity was better for the first (2.5 Gy per fraction) than for the second (3 Gy per fraction) dose escalation level.
Alhassan, Sulaiman; Sayf, Alaa Abu; Arsene, Camelia; Krayem, Hicham
2016-01-01
BACKGROUND: Majority of our computed tomography-pulmonary angiography (CTPA) scans report negative findings. We hypothesized that suboptimal reliance on diagnostic algorithms contributes to apparent overuse of this test. METHODS: A retrospective review was performed on 2031 CTPA cases in a large hospital system. Investigators retrospectively calculated pretest probability (PTP). Use of CTPA was considered as inappropriate when it was ordered for patients with low PTP without checking D-dimer (DD) or following negative DD. RESULTS: Among the 2031 cases, pulmonary embolism (PE) was found in 7.4% (151 cases). About 1784 patients (88%) were considered “PE unlikely” based on Wells score. Out of those patients, 1084 cases (61%) did not have DD test prior to CTPA. In addition, 78 patients with negative DD underwent unnecessary CTPA; none of them had PE. CONCLUSIONS: The suboptimal implementation of PTP assessment tools can result in the overuse of CTPA, contributing to ineffective utilization of hospital resources, increased cost, and potential harm to patients. PMID:27803751
Implementation of a New DTSTEP Algorithm for use in RELAP5-3D and PVMEXEC Completion Report
Dr. George L Mesina
2010-12-01
The PVM Coupling methodology for decomposing a complex model into domains onto which individual programs may be applied has proven effective for solving many multi-physics problems. There have been, from the outset, some detailed and/or long-running models that cause the process to fail. This project addressed the PVM coupling issues surrounding the DTSTEP subroutines on RELAP5-3D and PVMEXEC. Some 25 errors are listed in Tables 1 and 18 and in Section 11. These arise from deficiencies in the floating point calculation and testing of time steps, cumulative time, and time targets. The algorithmic replacement of floating point control of these items with integer based timestepping was developed and implemented. The result of the first phase, undertaken by the INL was that all but three of these issues were resolved. Moreover, two conceptual errors in DTSTEP that were not PVM coupling related were discovered and solved. The final, and most difficult three PVM Bettis User Problems, were solved during the Bettis phase of development and debugging. In 8 months since the conclusion of the project, no further DTSTEP related PVM Coupling errors have been reported.
NASA Astrophysics Data System (ADS)
Abdelazim, S.; Santoro, D.; Arend, M.; Moshary, F.; Ahmed, S.
2015-05-01
In this paper, we present two signal processing algorithms implemented using the FPGA. The first algorithm involves explicate time gating of received signals that correspond to a desired spatial resolution, performing a Fast Fourier Transform (FFT) calculation on each individual time gate, taking the square modulus of the FFT to form a power spectrum and then accumulating these power spectra for 10k return signals. The second algorithm involves calculating the autocorrelation of the backscattered signals and then accumulating the autocorrelation for 10k pulses. Efficient implementation of each of these two signal processing algorithms on an FPGA is challenging because it requires there to be tradeoffs between retaining the full data word width, managing the amount of on chip memory used and respecting the constraints imposed by the data width of the FPGA. A description of the approach used to manage these tradeoffs for each of the two signal processing algorithms are presented and explained in this article. Results of atmospheric measurements obtained through these two embedded programming techniques are also presented.
Continuous-Variable Quantum Computation of Oracle Decision Problems
NASA Astrophysics Data System (ADS)
Adcock, Mark R. A.
Quantum information processing is appealing due its ability to solve certain problems quantitatively faster than classical information processing. Most quantum algorithms have been studied in discretely parameterized systems, but many quantum systems are continuously parameterized. The field of quantum optics in particular has sophisticated techniques for manipulating continuously parameterized quantum states of light, but the lack of a code-state formalism has hindered the study of quantum algorithms in these systems. To address this situation, a code-state formalism for the solution of oracle decision problems in continuously-parameterized quantum systems is developed. Quantum information processing is appealing due its ability to solve certain problems quantitatively faster than classical information processing. Most quantum algorithms have been studied in discretely parameterized systems, but many quantum systems are continuously parameterized. The field of quantum optics in particular has sophisticated techniques for manipulating continuously parameterized quantum states of light, but the lack of a code-state formalism has hindered the study of quantum algorithms in these systems. To address this situation, a code-state formalism for the solution of oracle decision problems in continuously-parameterized quantum systems is developed. In the infinite-dimensional case, we study continuous-variable quantum algorithms for the solution of the Deutsch--Jozsa oracle decision problem implemented within a single harmonic-oscillator. Orthogonal states are used as the computational bases, and we show that, contrary to a previous claim in the literature, this implementation of quantum information processing has limitations due to a position-momentum trade-off of the Fourier transform. We further demonstrate that orthogonal encoding bases are not unique, and using the coherent states of the harmonic oscillator as the computational bases, our formalism enables quantifying
NASA Astrophysics Data System (ADS)
Smith, D. E.; Felizardo, C.; Minson, S. E.; Boese, M.; Langbein, J. O.; Guillemot, C.; Murray, J. R.
2015-12-01
The earthquake early warning (EEW) systems in California and elsewhere can greatly benefit from algorithms that generate estimates of finite-fault parameters. These estimates could significantly improve real-time shaking calculations and yield important information for immediate disaster response. Minson et al. (2015) determined that combining FinDer's seismic-based algorithm (Böse et al., 2012) with BEFORES' geodetic-based algorithm (Minson et al., 2014) yields a more robust and informative joint solution than using either algorithm alone. FinDer examines the distribution of peak ground accelerations from seismic stations and determines the best finite-fault extent and strike from template matching. BEFORES employs a Bayesian framework to search for the best slip inversion over all possible fault geometries in terms of strike and dip. Using FinDer and BEFORES together generates estimates of finite-fault extent, strike, dip, preferred slip, and magnitude. To yield the quickest, most flexible, and open-source version of the joint algorithm, we translated BEFORES and FinDer from Matlab into C++. We are now developing a C++ Application Protocol Interface for these two algorithms to be connected to the seismic and geodetic data flowing from the EEW system. The interface that is being developed will also enable communication between the two algorithms to generate the joint solution of finite-fault parameters. Once this interface is developed and implemented, the next step will be to run test seismic and geodetic data through the system via the Earthworm module, Tank Player. This will allow us to examine algorithm performance on simulated data and past real events.
Kochilas, Lazaros K; Menk, Jeremiah S; Saarinen, Annamarie; Gaviglio, Amy; Lohr, Jamie L
2015-03-01
Prior to state-wide implementation of newborn screening for critical congenital heart disease (CCHD) in Minnesota, a pilot program was completed using the protocol recommended by the Secretary's Advisory Committee on Heritable Disorders in Newborns and Children (SACHDNC). This report compares the retesting rates for newborn screening for CCHDs using the SACHDNC protocol and four alternative algorithms used in large published CCHD screening studies. Data from the original Minnesota study were reanalyzed using the passing values from these four alternative protocols. The retesting rate for the first pulse oximeter measurement ranged from 1.1 % in the SACHDNC protocol to 9.6 % in the Ewer protocol. The SACHDNC protocol generated the lowest rate of retesting among all tested algorithms. Our data suggest that even minor modifications of CCHD screening protocol would significantly impact screening retesting rate. In addition, we provide support for including lower extremity oxygen saturations in the screening algorithm.
Kress, R.L.; Jansen, J.F.; Noakes, M.W.
1994-05-01
When suspended payloads are moved with an overhead crane, pendulum like oscillations are naturally introduced. This presents a problem any time a crane is used, especially when expensive and/or delicate objects are moved, when moving in a cluttered an or hazardous environment, and when objects are to be placed in tight locations. Damped-oscillation control algorithms have been demonstrated over the past several years for laboratory-scale robotic systems on dc motor-driven overhead cranes. Most overhead cranes presently in use in industry are driven by ac induction motors; consequently, Oak Ridge National Laboratory has implemented damped-oscillation crane control on one of its existing facility ac induction motor-driven overhead cranes. The purpose of this test was to determine feasibility, to work out control and interfacing specifications, and to establish the capability of newly available ac motor control hardware with respect to use in damped-oscillation-controlled systems. Flux vector inverter drives are used to investigate their acceptability for damped-oscillation crane control. The purpose of this paper is to describe the experimental implementation of a control algorithm on a full-sized, two-degree-of-freedom, industrial crane; describe the experimental evaluation of the controller including robustness to payload length changes; explain the results of experiments designed to determine the hardware required for implementation of the control algorithms; and to provide a theoretical description of the controller.
NASA Technical Reports Server (NTRS)
Nyangweso, Emmanuel; Bole, Brian
2014-01-01
Successful prediction and management of battery life using prognostic algorithms through ground and flight tests is important for performance evaluation of electrical systems. This paper details the design of test beds suitable for replicating loading profiles that would be encountered in deployed electrical systems. The test bed data will be used to develop and validate prognostic algorithms for predicting battery discharge time and battery failure time. Online battery prognostic algorithms will enable health management strategies. The platform used for algorithm demonstration is the EDGE 540T electric unmanned aerial vehicle (UAV). The fully designed test beds developed and detailed in this paper can be used to conduct battery life tests by controlling current and recording voltage and temperature to develop a model that makes a prediction of end-of-charge and end-of-life of the system based on rapid state of health (SOH) assessment.
Crane, R L; Garbow, B S; Hillstrom, K E; Minkoff, M
1980-11-01
This report describes the implementation of an algorithm of Stoer and Schittkowski for solving linearly constrained linear least-squares problems. These problems arise in many areas, particularly in data fitting where a model is provided and parameters in the model are selected to be a best least-squares fit to known experimental observations. By adding constraints to the least-squares fit, one can force user-specified properties on the parameters selected. The algorithm used applies a numerically stable implementation of the Gram-Schmidt orthogonalization procedure to deal with a factorization approach for solving the constrained least-squares problem. The software developed allows for either a user-supplied feasible starting point or the automatic generation of a feasible starting point, redecomposition after solving the problem to improve numerical accuracy, and diagnostic printout to follow the computations in the algorithm. In addition to a description of the actual method used to solve the problem, a description of the software structure and the user interfaces is provided, along with a numerical example. 3 figures, 1 table.
Link, W.A.; Barker, R.J.
2008-01-01
Judicious choice of candidate generating distributions improves efficiency of the Metropolis-Hastings algorithm. In Bayesian applications, it is sometimes possible to identify an approximation to the target posterior distribution; this approximate posterior distribution is a good choice for candidate generation. These observations are applied to analysis of the Cormack-Jolly-Seber model and its extensions. ?? Springer Science+Business Media, LLC 2007.
Link, W.A.; Barker, R.J.
2008-01-01
Judicious choice of candidate generating distributions improves efficiency of the Metropolis-Hastings algorithm. In Bayesian applications, it is sometimes possible to identify an approximation to the target posterior distribution; this approximate posterior distribution is a good choice for candidate generation. These observations are applied to analysis of the Cormack?Jolly?Seber model and its extensions.
Ewing, R.E.; Saevareid, O.; Shen, J.
1994-12-31
A multigrid algorithm for the cell-centered finite difference on equilateral triangular grids for solving second-order elliptic problems is proposed. This finite difference is a four-point star stencil in a two-dimensional domain and a five-point star stencil in a three dimensional domain. According to the authors analysis, the advantages of this finite difference are that it is an O(h{sup 2})-order accurate numerical scheme for both the solution and derivatives on equilateral triangular grids, the structure of the scheme is perhaps the simplest, and its corresponding multigrid algorithm is easily constructed with an optimal convergence rate. They are interested in relaxation of the equilateral triangular grid condition to certain general triangular grids and the application of this multigrid algorithm as a numerically reasonable preconditioner for the lowest-order Raviart-Thomas mixed triangular finite element method. Numerical test results are presented to demonstrate their analytical results and to investigate the applications of this multigrid algorithm on general triangular grids.
NASA Astrophysics Data System (ADS)
Paz, Abel; Plaza, Antonio; Plaza, Javier
2009-08-01
Automatic target detection in hyperspectral images is a task that has attracted a lot of attention recently. In the last few years, several algoritms have been developed for this purpose, including the well-known RX algorithm for anomaly detection, or the automatic target detection and classification algorithm (ATDCA), which uses an orthogonal subspace projection (OSP) approach to extract a set of spectrally distinct targets automatically from the input hyperspectral data. Depending on the complexity and dimensionality of the analyzed image scene, the target/anomaly detection process may be computationally very expensive, a fact that limits the possibility of utilizing this process in time-critical applications. In this paper, we develop computationally efficient parallel versions of both the RX and ATDCA algorithms for near real-time exploitation of these algorithms. In the case of ATGP, we use several distance metrics in addition to the OSP approach. The parallel versions are quantitatively compared in terms of target detection accuracy, using hyperspectral data collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the World Trade Center in New York, five days after the terrorist attack of September 11th, 2001, and also in terms of parallel performance, using a massively Beowulf cluster available at NASA's Goddard Space Flight Center in Maryland.
Rapisarda, E; Bettinardi, V; Thielemans, K; Gilardi, M C
2010-07-21
The interest in positron emission tomography (PET) and particularly in hybrid integrated PET/CT systems has significantly increased in the last few years due to the improved quality of the obtained images. Nevertheless, one of the most important limits of the PET imaging technique is still its poor spatial resolution due to several physical factors originating both at the emission (e.g. positron range, photon non-collinearity) and at detection levels (e.g. scatter inside the scintillating crystals, finite dimensions of the crystals and depth of interaction). To improve the spatial resolution of the images, a possible way consists of measuring the point spread function (PSF) of the system and then accounting for it inside the reconstruction algorithm. In this work, the system response of the GE Discovery STE operating in 3D mode has been characterized by acquiring (22)Na point sources in different positions of the scanner field of view. An image-based model of the PSF was then obtained by fitting asymmetric two-dimensional Gaussians on the (22)Na images reconstructed with small pixel sizes. The PSF was then incorporated, at the image level, in a three-dimensional ordered subset maximum likelihood expectation maximization (OS-MLEM) reconstruction algorithm. A qualitative and quantitative validation of the algorithm accounting for the PSF has been performed on phantom and clinical data, showing improved spatial resolution, higher contrast and lower noise compared with the corresponding images obtained using the standard OS-MLEM algorithm.
Image-based point spread function implementation in a fully 3D OSEM reconstruction algorithm for PET
NASA Astrophysics Data System (ADS)
Rapisarda, E.; Bettinardi, V.; Thielemans, K.; Gilardi, M. C.
2010-07-01
The interest in positron emission tomography (PET) and particularly in hybrid integrated PET/CT systems has significantly increased in the last few years due to the improved quality of the obtained images. Nevertheless, one of the most important limits of the PET imaging technique is still its poor spatial resolution due to several physical factors originating both at the emission (e.g. positron range, photon non-collinearity) and at detection levels (e.g. scatter inside the scintillating crystals, finite dimensions of the crystals and depth of interaction). To improve the spatial resolution of the images, a possible way consists of measuring the point spread function (PSF) of the system and then accounting for it inside the reconstruction algorithm. In this work, the system response of the GE Discovery STE operating in 3D mode has been characterized by acquiring 22Na point sources in different positions of the scanner field of view. An image-based model of the PSF was then obtained by fitting asymmetric two-dimensional Gaussians on the 22Na images reconstructed with small pixel sizes. The PSF was then incorporated, at the image level, in a three-dimensional ordered subset maximum likelihood expectation maximization (OS-MLEM) reconstruction algorithm. A qualitative and quantitative validation of the algorithm accounting for the PSF has been performed on phantom and clinical data, showing improved spatial resolution, higher contrast and lower noise compared with the corresponding images obtained using the standard OS-MLEM algorithm.
NASA Astrophysics Data System (ADS)
Giacometto, F. J.; Vilardy, J. M.; Torres, C. O.; Mattos, L.
2011-01-01
Among the most used biometric signals to set personal security permissions, taker increasingly importance biometric iris recognition based on their textures and images of blood vessels due to the rich in these two unique characteristics that are unique to each individual. This paper presents an implementation of an algorithm characterization and correlation of templates created for biometric authentication based on iris texture analysis programmed on a FPGA (Field Programmable Gate Array), authentication is based on processes like characterization methods based on frequency analysis of the sample, and frequency correlation to obtain the expected results of authentication.
A 0.13-µm implementation of 5 Gb/s and 3-mW folded parallel architecture for AES algorithm
NASA Astrophysics Data System (ADS)
Rahimunnisa, K.; Karthigaikumar, P.; Kirubavathy, J.; Jayakumar, J.; Kumar, S. Suresh
2014-02-01
A new architecture for encrypting and decrypting the confidential data using Advanced Encryption Standard algorithm is presented in this article. This structure combines the folded structure with parallel architecture to increase the throughput. The whole architecture achieved high throughput with less power. The proposed architecture is implemented in 0.13-µm Complementary metal-oxide-semiconductor (CMOS) technology. The proposed structure is compared with different existing structures, and from the result it is proved that the proposed structure gives higher throughput and less power compared to existing works.
NASA Technical Reports Server (NTRS)
Whiffen, Gregory J.
2006-01-01
Mystic software is designed to compute, analyze, and visualize optimal high-fidelity, low-thrust trajectories, The software can be used to analyze inter-planetary, planetocentric, and combination trajectories, Mystic also provides utilities to assist in the operation and navigation of low-thrust spacecraft. Mystic will be used to design and navigate the NASA's Dawn Discovery mission to orbit the two largest asteroids, The underlying optimization algorithm used in the Mystic software is called Static/Dynamic Optimal Control (SDC). SDC is a nonlinear optimal control method designed to optimize both 'static variables' (parameters) and dynamic variables (functions of time) simultaneously. SDC is a general nonlinear optimal control algorithm based on Bellman's principal.
NASA Astrophysics Data System (ADS)
Siswantyo, Sepha; Susanti, Bety Hayat
2016-02-01
Preneel-Govaerts-Vandewalle (PGV) schemes consist of 64 possible single-block-length schemes that can be used to build a hash function based on block ciphers. For those 64 schemes, Preneel claimed that 4 schemes are secure. In this paper, we apply length extension attack on those 4 secure PGV schemes which use RC5 algorithm in its basic construction to test their collision resistance property. The attack result shows that the collision occurred on those 4 secure PGV schemes. Based on the analysis, we indicate that Feistel structure and data dependent rotation operation in RC5 algorithm, XOR operations on the scheme, along with selection of additional message block value also give impact on the collision to occur.
NASA Astrophysics Data System (ADS)
Frassinetti, L.; Olofsson, K. E. J.; Brunsell, P. R.; Drake, J. R.
2011-06-01
The EXTRAP T2R feedback system (active coils, sensor coils and controller) is used to study and develop new tools for advanced control of the MHD instabilities in fusion plasmas. New feedback algorithms developed in EXTRAP T2R reversed-field pinch allow flexible and independent control of each magnetic harmonic. Methods developed in control theory and applied to EXTRAP T2R allow a closed-loop identification of the machine plant and of the resistive wall modes growth rates. The plant identification is the starting point for the development of output-tracking algorithms which enable the generation of external magnetic perturbations. These algorithms will then be used to study the effect of a resonant magnetic perturbation (RMP) on the tearing mode (TM) dynamics. It will be shown that the stationary RMP can induce oscillations in the amplitude and jumps in the phase of the rotating TM. It will be shown that the RMP strongly affects the magnetic island position.
NASA Astrophysics Data System (ADS)
Liu, Qianshun; Liu, Yan; Yu, Feihong
2013-08-01
As a kind of film device, band-pass filter is widely used in pattern recognition, infrared detection, optical fiber communication, etc. In this paper, an algorithm for automatic measurement of band-pass filter quality criterion is proposed based on the proven theory calculation of derivate spectral transmittance of filter formula. Firstly, wavelet transform to reduce spectrum data noises is used. Secondly, combining with the Gaussian curve fitting and least squares method, the algorithm fits spectrum curve and searches the peak. Finally, some parameters for judging band-pass filter quality are figure out. Based on the algorithm, a pipeline for band-pass filters automatic measurement system has been designed that can scan the filter array automatically and display spectral transmittance of each filter. At the same time, the system compares the measuring result with the user defined standards to determine if the filter is qualified or not. The qualified product will be market with green color, and the unqualified product will be marked with red color. With the experiments verification, the automatic measurement system basically realized comprehensive, accurate and rapid measurement of band-pass filter quality and achieved the expected results.
Fox, Andrew; Williams, Mathew; Richardson, Andrew D.; Cameron, David; Gove, Jeffrey H.; Quaife, Tristan; Ricciuto, Daniel M; Reichstein, Markus; Tomelleri, Enrico; Trudinger, Cathy; Van Wijk, Mark T.
2009-10-01
We describe a model-data fusion (MDF) inter-comparison project (REFLEX), which compared various algorithms for estimating carbon (C) model parameters consistent with both measured carbon fluxes and states and a simple C model. Participants were provided with the model and with both synthetic net ecosystem exchange (NEE) ofCO2 and leaf area index (LAI) data, generated from the model with added noise, and observed NEE and LAI data from two eddy covariance sites. Participants endeavoured to estimate model parameters and states consistent with the model for all cases over the two years for which data were provided, and generate predictions for one additional year without observations. Nine participants contributed results using Metropolis algorithms, Kalman filters and a genetic algorithm. For the synthetic data case, parameter estimates compared well with the true values. The results of the analyses indicated that parameters linked directly to gross primary production (GPP) and ecosystem respiration, such as those related to foliage allocation and turnover, or temperature sensitivity of heterotrophic respiration,were best constrained and characterised. Poorly estimated parameters were those related to the allocation to and turnover of fine root/wood pools. Estimates of confidence intervals varied among algorithms, but several algorithms successfully located the true values of annual fluxes from synthetic experiments within relatively narrow 90% confidence intervals, achieving>80% success rate and mean NEE confidence intervals <110 gCm-2 year-1 for the synthetic case. Annual C flux estimates generated by participants generally agreed with gap-filling approaches using half-hourly data. The estimation of ecosystem respiration and GPP through MDF agreed well with outputs from partitioning studies using half-hourly data. Confidence limits on annual NEE increased by an average of 88% in the prediction year compared to the previous year, when data were available. Confidence
Miller, L.F.
1980-11-01
A brief description of the Oak Ridge Reactor Pool Side Facility (ORR-PSF) and of the associated control system is given. The ORR-PSF capsule temperatures are controlled by a digital computer which regulates the percent power delivered to electrical heaters. The total electrical power which can be input to a particular heater is determined by the setting of an associated variac. This report concentrates on the description of the ORR-PSF irradiation experiment computer control algorithm. The algorithm is an implementation of a discrete-time, state variable, optimal control approach. The Riccati equation is solved for a discretized system model to determine the control law. Experiments performed to obtain system model parameters are described. Results of performance evaluation experiments are also presented. The control algorithm maintains both capsule temperatures within a 288/sup 0/C +-10/sup 0/C band as required. The pressure vessel capsule temperatures are effectively maintained within a 288/sup 0/C +-5/sup 0/C band.
NASA Astrophysics Data System (ADS)
Lee, Dah-Jye; Anderson, Jonathan D.; Archibald, James K.
2008-12-01
Many image and signal processing techniques have been applied to medical and health care applications in recent years. In this paper, we present a robust signal processing approach that can be used to solve the correspondence problem for an embedded stereo vision sensor to provide real-time visual guidance to the visually impaired. This approach is based on our new one-dimensional (1D) spline-based genetic algorithm to match signals. The algorithm processes image data lines as 1D signals to generate a dense disparity map, from which 3D information can be extracted. With recent advances in electronics technology, this 1D signal matching technique can be implemented and executed in parallel in hardware such as field-programmable gate arrays (FPGAs) to provide real-time feedback about the environment to the user. In order to complement (not replace) traditional aids for the visually impaired such as canes and Seeing Eyes dogs, vision systems that provide guidance to the visually impaired must be affordable, easy to use, compact, and free from attributes that are awkward or embarrassing to the user. "Seeing Eye Glasses," an embedded stereo vision system utilizing our new algorithm, meets all these requirements.
Chen, Guangye; Chacon, Luis; Barnes, Daniel C
2012-01-01
Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been developed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and is capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle orbit integrations from the field solver, while remaining fully self-consistent. This provides great flexibility, and dramatically improves the solver efficiency by reducing the degrees of freedom of the associated nonlinear system. However, it requires a particle push per nonlinear residual evaluation, which makes the particle push the most time-consuming operation in the algorithm. This paper describes a very efficient mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm. The JFNK solver is kept on the CPU (in double precision), while the inherent data parallelism of the particle mover is exploited by implementing it in single-precision on a graphics processing unit (GPU) using CUDA. Performance-oriented optimizations, with the aid of an analytical performance model, the roofline model, are employed. Despite being highly dynamic, the adaptive, charge-conserving particle mover algorithm achieves up to 300 400 GOp/s (including single-precision floating-point, integer, and logic operations) on a Nvidia GeForce GTX580, corresponding to 20 25% absolute GPU efficiency (against the peak theoretical performance) and 50-70% intrinsic efficiency (against the algorithm s maximum operational throughput, which neglects all latencies). This is about 200-300 times faster than an equivalent serial CPU implementation. When the single-precision GPU particle mover is combined with a double-precision CPU JFNK field solver, overall performance gains 100 vs. the double-precision CPU-only serial version are obtained, with no apparent loss of
NASA Astrophysics Data System (ADS)
Guo, Z.; Xiong, S. M.
2015-06-01
To efficiently solve the coupled phase field equations in 3-D, an algorithm comprising of adaptive mesh refinement (AMR) and parallel (Para-) computing capabilities was developed. Dendritic growth and subsequent coarsening were studied by employing the model to simulate multi-dendrite growth under isothermal conditions. Quantitative comparison including decrease of interface area (S) and nonlinear growing of the characteristic length (ratio between solid volume V and surface area S i.e. V/S) as time was performed between the simulation results and these predicted by the existing theories. In particular, various mechanisms including growth of lower curvature area in expense of higher curvature one, coalescence of neighbouring dendrite arms and groove advancement at the root of higher order arms for dendritic coarsening were identified and successfully revealed via the 3-D phase field simulation. In addition, results showed that the proposed algorithm could greatly shorten the computing time for 3-D phase field simulation and enable one to gain much more insight in understanding the underlying physics during dendrite growth in solidification.
NASA Astrophysics Data System (ADS)
Frolov, Vladimir; Backhaus, Scott; Chertkov, Misha
2014-10-01
We explore optimization methods for planning the placement, sizing and operations of flexible alternating current transmission system (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to series compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l1 norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this non-convex optimization to a succession of linear programs (LP) that are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polish Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop—a speed that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically sized networks that suffer congestion from a range of causes, including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small test system subject to congestion from uniform load growth, and demonstrate computational efficiency on a realistically sized network.
Frolov, Vladimir; Backhaus, Scott; Chertkov, Misha
2014-10-24
We explore optimization methods for planning the placement, sizing and operations of Flexible Alternating Current Transmission System (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to Series Compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l_{1} norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this non-convex optimization to a succession of Linear Programs (LP) which are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polish Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop—a speed up that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically-sized networks that suffer congestion from a range of causes including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small test system subject to congestion from uniform load growth, and demonstrate computational efficiency on a realistically-sized network.
Frolov, Vladimir; Backhaus, Scott N.; Chertkov, Michael
2014-01-14
We explore optimization methods for planning the placement, sizing and operations of Flexible Alternating Current Transmission System (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to Series Compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l_{1} norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this non-convex optimization to a succession of Linear Programs (LP) which are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polish Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop—a speed up that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically-sized networks that suffer congestion from a range of causes including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small test system subject to congestion from uniform load growth, and demonstrate computational efficiency on a realistically-sized network.
Frolov, Vladimir; Backhaus, Scott; Chertkov, Misha
2014-10-24
We explore optimization methods for planning the placement, sizing and operations of Flexible Alternating Current Transmission System (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to Series Compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l1 norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this non-convex optimization to a succession of Linear Programs (LP) which are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polishmore » Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop—a speed up that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically-sized networks that suffer congestion from a range of causes including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small test system subject to congestion from uniform load growth, and demonstrate computational efficiency on a realistically-sized network.« less
Stone, Jonathan W.; Bleckley, Samuel; Lavelle, Sean; Schroeder, Susan J.
2015-01-01
We present new modifications to the Wuchty algorithm in order to better define and explore possible conformations for an RNA sequence. The new features, including parallelization, energy-independent lonely pair constraints, context-dependent chemical probing constraints, helix filters, and optional multibranch loops, provide useful tools for exploring the landscape of RNA folding. Chemical probing alone may not necessarily define a single unique structure. The helix filters and optional multibranch loops are global constraints on RNA structure that are an especially useful tool for generating models of encapsidated viral RNA for which cryoelectron microscopy or crystallography data may be available. The computations generate a combinatorially complete set of structures near a free energy minimum and thus provide data on the density and diversity of structures near the bottom of a folding funnel for an RNA sequence. The conformational landscapes for some RNA sequences may resemble a low, wide basin rather than a steep funnel that converges to a single structure. PMID:25695434
NASA Technical Reports Server (NTRS)
Hinton, David A.
2001-01-01
A ground-based system has been developed to demonstrate the feasibility of automating the process of collecting relevant weather data, predicting wake vortex behavior from a data base of aircraft, prescribing safe wake vortex spacing criteria, estimating system benefit, and comparing predicted and observed wake vortex behavior. This report describes many of the system algorithms, features, limitations, and lessons learned, as well as suggested system improvements. The system has demonstrated concept feasibility and the potential for airport benefit. Significant opportunities exist however for improved system robustness and optimization. A condensed version of the development lab book is provided along with samples of key input and output file types. This report is intended to document the technical development process and system architecture, and to augment archived internal documents that provide detailed descriptions of software and file formats.
Koleva, M. N.
2008-10-30
This study is proposing a two-grid algorithm for solving a fully conservative difference schemes (FCDS) for gas dynamics equations. In this two-level scheme, the fully nonlinear problem is solved on a coarse grid with space mesh step H. The nonlinearities are expanded about the coarse grid solution and an appropriate interpolation operator is used to provide values of the coarse grid solution on the fine grid. The resulting linear system is solved on a fine mesh with step size h. Applications in the computation of shock and rarefaction waves model problems illustrate the method's effectiveness. It is shown numerically, that the coarse grid can be much coarser than the fine grid and achieve optimal approximation as long as the mesh sizes satisfy H = O({radical}(h))
NASA Astrophysics Data System (ADS)
Renner, A.; Furtado, H.; Seppenwoolde, Y.; Birkfellner, W.; Georg, D.
2016-03-01
A radiotherapy (RT) treatment can last for several weeks. In that time organ motion and shape changes introduce uncertainty in dose application. Monitoring and quantifying the change can yield a more precise irradiation margin definition and thereby reduce dose delivery to healthy tissue and adjust tumor targeting. Deformable image registration (DIR) has the potential to fulfill this task by calculating a deformation field (DF) between a planning CT and a repeated CT of the altered anatomy. Application of the DF on the original contours yields new contours that can be used for an adapted treatment plan. DIR is a challenging method and therefore needs careful user interaction. Without a proper graphical user interface (GUI) a misregistration cannot be easily detected by visual inspection and the results cannot be fine-tuned by changing registration parameters. To provide a DIR algorithm with such a GUI available for everyone, we created the extension Featurelet-Registration for the open source software platform 3D Slicer. The registration logic is an upgrade of an in-house-developed DIR method, which is a featurelet-based piecewise rigid registration. The so called "featurelets" are equally sized rectangular subvolumes of the moving image which are rigidly registered to rectangular search regions on the fixed image. The output is a deformed image and a deformation field. Both can be visualized directly in 3D Slicer facilitating the interpretation and quantification of the results. For validation of the registration accuracy two deformable phantoms were used. The performance was benchmarked against a demons algorithm with comparable results.
Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil
2012-01-01
The Gauss-Seidel method is a standard iterative numerical method widely used to solve a system of equations and, in general, is more efficient comparing to other iterative methods, such as the Jacobi method. However, standard implementation of the Gauss-Seidel method restricts its utilization in parallel computing due to its requirement of using updated neighboring values (i.e., in current iteration) as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear/nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable for solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite difference Poisson-Boltzmann equation solver to model electrostatics in molecular biology. This development makes the iterative procedure on obtaining the electrostatic potential distribution in the parallelized DelPhi several folds faster than that in the serial code. Further we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
NASA Astrophysics Data System (ADS)
Anderson, Richard P.
An algorithm for precision approach guidance using GPS and a MicroElectroMechanical Systems/Inertial Navigation System (MEMS/INS) has been developed to meet the Required Navigational Performance (RNP) at a cost that is suitable for General Aviation (GA) applications. This scheme allows for accurate approach guidance (Category I) using Wide Area Augmentation System (WAAS) at locations not served by ILS, MLS or other types of precision landing guidance, thereby greatly expanding the number of useable airports in poor weather. At locations served by a Local Area Augmentation System (LAAS), Category III-like navigation is possible with the novel idea of a Missed Approach Time (MAT) that is similar to a Missed Approach Point (MAP) but not fixed in space. Though certain augmented types of GPS have sufficient precision for approach navigation, its use alone is insufficient to meet RNP due to an inability to monitor loss, degradation or intentional spoofing and meaconing of the GPS signal. A redundant navigation system and a health monitoring system must be added to acquire sufficient reliability, safety and time-to-alert as stated by required navigation performance. An inertial navigation system is the best choice, as it requires no external radio signals and its errors are complementary to GPS. An aiding Kalman filter is used to derive parameters that monitor the correlation between the GPS and MEMS/INS. These approach guidance parameters determines the MAT for a given RNP and provide the pilot or autopilot with proceed/do-not-proceed decision in real time. The enabling technology used to derive the guidance program is a MEMS gyroscope and accelerometer package in conjunction with a single-antenna pseudo-attitude algorithm. To be viable for most GA applications, the hardware must be reasonably priced. The MEMS gyros allows for the first cost-effective INS package to be developed. With lower cost, however, comes higher drift rates and a more dependence on GPS aiding. In
Usón, Isabel; Stevenson, Clare E M; Lawson, David M; Sheldrick, George M
2007-10-01
NovP is an S-adenosyl-L-methionine-dependent O-methyltransferase from Streptomyces spheroides (subunit MW = 29 967 Da). Recombinant N-terminally His-tagged NovP crystallizes in space group P2, with approximate unit-cell parameters a = 51.81, b = 46.04, c = 61.22 A, beta = 105.0 degrees , giving a solvent content of 44% for a single copy of the His-tagged protomer per asymmetric unit. Native synchrotron data to a resolution of 1.35 A were combined with three other native data sets collected at lower resolution (both in-house and at the synchrotron) for the sake of completeness and better scaling. Data to 2.45 A resolution were subsequently recorded in-house from a single mercury derivative. Three partial mercury sites could be located with SHELXD, but the resulting phases had a mean error of about 81 degrees and in our hands did not yield an interpretable map using standard automated software. Nevertheless, the structure of NovP could be solved by first tracing a small part of the structure by hand and then extrapolating within and beyond the experimental resolution limit using the ;free lunch algorithm' in SHELXE. The resulting phases have a mean phase error of 17 degrees relative to a refined model.
Garcia-Pareja, S.; Galan, P.; Manzano, F.; Brualla, L.; Lallena, A. M.
2010-07-15
Purpose: In this work, the authors describe an approach which has been developed to drive the application of different variance-reduction techniques to the Monte Carlo simulation of photon and electron transport in clinical accelerators. Methods: The new approach considers the following techniques: Russian roulette, splitting, a modified version of the directional bremsstrahlung splitting, and the azimuthal particle redistribution. Their application is controlled by an ant colony algorithm based on an importance map. Results: The procedure has been applied to radiosurgery beams. Specifically, the authors have calculated depth-dose profiles, off-axis ratios, and output factors, quantities usually considered in the commissioning of these beams. The agreement between Monte Carlo results and the corresponding measurements is within {approx}3%/0.3 mm for the central axis percentage depth dose and the dose profiles. The importance map generated in the calculation can be used to discuss simulation details in the different parts of the geometry in a simple way. The simulation CPU times are comparable to those needed within other approaches common in this field. Conclusions: The new approach is competitive with those previously used in this kind of problems (PSF generation or source models) and has some practical advantages that make it to be a good tool to simulate the radiation transport in problems where the quantities of interest are difficult to obtain because of low statistics.
Sung, Wen-Tsai; Lin, Jia-Syun
2013-01-01
This work aims to develop a smart LED lighting system, which is remotely controlled by Android apps via handheld devices, e.g., smartphones, tablets, and so forth. The status of energy use is reflected by readings displayed on a handheld device, and it is treated as a criterion in the lighting mode design of a system. A multimeter, a wireless light dimmer, an IR learning remote module, etc. are connected to a server by means of RS 232/485 and a human computer interface on a touch screen. The wireless data communication is designed to operate in compliance with the ZigBee standard, and signal processing on sensed data is made through a self adaptive weighted data fusion algorithm. A low variation in data fusion together with a high stability is experimentally demonstrated in this work. The wireless light dimmer as well as the IR learning remote module can be instructed directly by command given on the human computer interface, and the reading on a multimeter can be displayed thereon via the server. This proposed smart LED lighting system can be remotely controlled and self learning mode can be enabled by a single handheld device via WiFi transmission. Hence, this proposal is validated as an approach to power monitoring for home appliances, and is demonstrated as a digital home network in consideration of energy efficiency.
D'Azevedo, Ed F; Nintcheu Fata, Sylvain
2012-01-01
A collocation boundary element code for solving the three-dimensional Laplace equation, publicly available from \\url{http://www.intetec.org}, has been adapted to run on an Nvidia Tesla general purpose graphics processing unit (GPU). Global matrix assembly and LU factorization of the resulting dense matrix were performed on the GPU. Out-of-core techniques were used to solve problems larger than available GPU memory. The code achieved over eight times speedup in matrix assembly and about 56~Gflops/sec in the LU factorization using only 512~Mbytes of GPU memory. Details of the GPU implementation and comparisons with the standard sequential algorithm are included to illustrate the performance of the GPU code.
NASA Astrophysics Data System (ADS)
Hyatt, A. W.; Welander, A. S.; Eidietis, N. W.; Lanctot, M. J.; Humphreys, D. A.
2014-10-01
The DIII-D tokamak has 18 independent poloidal field (PF) shaping coils and an independent Ohmic transformer coil system. This gives great plasma shaping flexibility and freedom but requires a complex control capability that imposes some form of constraint so that a given plasma shape and specification leads to uniquely determined PF shaping currents. One such constraint used is to connect most PF coils in parallel to a common bus, forcing the sum of those PF current to be zero. This constraint has many benefits, but also leads to instability where adjacent PF coils of opposite current can mutually increase, leading to local shape distortion when using the standard shape control algorithms. We will give examples of improved control algorithms that were extensively tested using the TokSys simulation suite available at DIII-D and then successfully implemented in practice on DIII-D. In one case using TokSys simulations to develop a control solution for a long sought plasma equilibrium saved several days of expensive tokamak operation time. Work supported by the US Department of Energy under DE-FC02-04ER54698.
Brown, Douglas (Sandia National Laboratories, Livermore, CA); Gray, Genetha Anne (Sandia National Laboratories, Livermore, CA)
2005-10-01
Due to the nature of many infectious agents, such as anthrax, symptoms may either take several days to manifest or resemble those of less serious illnesses leading to misdiagnosis. Thus, bioterrorism attacks that include the release of such agents are particularly dangerous and potentially deadly. For this reason, a system is needed for the quick and correct identification of disease outbreaks. The Real-time Outbreak Disease Surveillance System (RODS), initially developed by Carnegie Mellon University and the University of Pittsburgh, was created to meet this need. The RODS software implements different classifiers for pertinent health surveillance data in order to determine whether or not an outbreak has occurred. In an effort to improve the capability of RODS at detecting outbreaks, we incorporate a data fusion method. Data fusion is used to improve the results of a single classification by combining the output of multiple classifiers. This paper documents the first stages of the development of a data fusion system that can combine the output of the classifiers included in RODS.
Algorithms and Algorithmic Languages.
ERIC Educational Resources Information Center
Veselov, V. M.; Koprov, V. M.
This paper is intended as an introduction to a number of problems connected with the description of algorithms and algorithmic languages, particularly the syntaxes and semantics of algorithmic languages. The terms "letter, word, alphabet" are defined and described. The concept of the algorithm is defined and the relation between the algorithm and…
Reasoning about systolic algorithms
Purushothaman, S.
1986-01-01
Systolic algorithms are a class of parallel algorithms, with small grain concurrency, well suited for implementation in VLSI. They are intended to be implemented as high-performance, computation-bound back-end processors and are characterized by a tesselating interconnection of identical processing elements. This dissertation investigates the problem of providing correctness of systolic algorithms. The following are reported in this dissertation: (1) a methodology for verifying correctness of systolic algorithms based on solving the representation of an algorithm as recurrence equations. The methodology is demonstrated by proving the correctness of a systolic architecture for optimal parenthesization. (2) The implementation of mechanical proofs of correctness of two systolic algorithms, a convolution algorithm and an optimal parenthesization algorithm, using the Boyer-Moore theorem prover. (3) An induction principle for proving correctness of systolic arrays which are modular. Two attendant inference rules, weak equivalence and shift transformation, which capture equivalent behavior of systolic arrays, are also presented.
Chiesa, Maria; Rigoni, Federica; Paderno, Maria; Borghetti, Patrizia; Gagliotti, Giovanna; Bertoni, Maurizio; Ballarin Denti, Antonio; Schiavina, Lorenzo; Goldoni, Andrea; Sangaletti, Luigi
2012-05-01
The present study is focused on the implementation of a novel, low cost, urban grid of nanostructured chemresistor gas sensors for ammonia concentration ([NH(3)]) monitoring, with NH(3) being one of the main precursors of secondary fine particulate. Low-cost chemresistor gas sensors based on carbon nanotubes have been developed, their response to [NH(3)] in the 0.17-5.0 ppm range has been tested, and the devices have been properly calibrated under different relative humidity conditions in the 33-63% range. In order to improve the chemresistor selectivity towards [NH(3)], an Expert System, based on fuzzy logic and genetic algorithms, has been developed to extract the atmospheric [NH(3)] (with a sensitivity of a few ppb) from the output signal of a model chemresistor gas sensor exposed to an NO(2), NO(X) and O(3) gas mixture. The concentration of these pollutants that are known to be the most significant interfering compounds during ammonia detection with carbon nanotube gas sensors has been tracked by the ARPA monitoring network in the city of Milan and the historical dataset collected over one year has been used to train the Expert System.
Pareschi, Fabio; Albertini, Pierluigi; Frattini, Giovanni; Mangia, Mauro; Rovatti, Riccardo; Setti, Gianluca
2016-02-01
We report the design and implementation of an Analog-to-Information Converter (AIC) based on Compressed Sensing (CS). The system is realized in a CMOS 180 nm technology and targets the acquisition of bio-signals with Nyquist frequency up to 100 kHz. To maximize performance and reduce hardware complexity, we co-design hardware together with acquisition and reconstruction algorithms. The resulting AIC outperforms previously proposed solutions mainly thanks to two key features. First, we adopt a novel method to deal with saturations in the computation of CS measurements. This allows no loss in performance even when 60% of measurements saturate. Second, the system is able to adapt itself to the energy distribution of the input by exploiting the so-called rakeness to maximize the amount of information contained in the measurements. With this approach, the 16 measurement channels integrated into a single device are expected to allow the acquisition and the correct reconstruction of most biomedical signals. As a case study, measurements on real electrocardiograms (ECGs) and electromyograms (EMGs) show signals that these can be reconstructed without any noticeable degradation with a compression rate, respectively, of 8 and 10.
INSENS classification algorithm report
Hernandez, J.E.; Frerking, C.J.; Myers, D.W.
1993-07-28
This report describes a new algorithm developed for the Imigration and Naturalization Service (INS) in support of the INSENS project for classifying vehicles and pedestrians using seismic data. This algorithm is less sensitive to nuisance alarms due to environmental events than the previous algorithm. Furthermore, the algorithm is simple enough that it can be implemented in the 8-bit microprocessor used in the INSENS system.
Samba, A.S.
1985-01-01
The problem of solving banded linear systems by direct (non-iterative) techniques on the Vector Processor System (VPS) 32 supercomputer is considered. Two efficient direct methods for solving banded linear systems on the VPS 32 are described. The vector cyclic reduction (VCR) algorithm is discussed in detail. The performance of the VCR on a three parameter model problem is also illustrated. The VCR is an adaptation of the conventional point cyclic reduction algorithm. The second direct method is the Customized Reduction of Augmented Triangles' (CRAT). CRAT has the dominant characteristics of an efficient VPS 32 algorithm. CRAT is tailored to the pipeline architecture of the VPS 32 and as a consequence the algorithm is implicitly vectorizable.
2013-07-29
The OpenEIS Algorithm package seeks to provide a low-risk path for building owners, service providers and managers to explore analytical methods for improving building control and operational efficiency. Users of this software can analyze building data, and learn how commercial implementations would provide long-term value. The code also serves as a reference implementation for developers who wish to adapt the algorithms for use in commercial tools or service offerings.
Brunet-Benkhoucha, Malik; Verhaegen, Frank; Lassalle, Stephanie; Beliveau-Nadeau, Dominic; Reniers, Brigitte; Donath, David; Taussky, Daniel; Carrier, Jean-Francois
2009-11-15
Purpose: The low dose rate brachytherapy procedure would benefit from an intraoperative postimplant dosimetry verification technique to identify possible suboptimal dose coverage and suggest a potential reimplantation. The main objective of this project is to develop an efficient, operator-free, intraoperative seed detection technique using the imaging modalities available in a low dose rate brachytherapy treatment room. Methods: This intraoperative detection allows a complete dosimetry calculation that can be performed right after an I-125 prostate seed implantation, while the patient is still under anesthesia. To accomplish this, a digital tomosynthesis-based algorithm was developed. This automatic filtered reconstruction of the 3D volume requires seven projections acquired over a total angle of 60 deg. with an isocentric imaging system. Results: A phantom study was performed to validate the technique that was used in a retrospective clinical study involving 23 patients. In the patient study, the automatic tomosynthesis-based reconstruction yielded seed detection rates of 96.7% and 2.6% false positives. The seed localization error obtained with a phantom study is 0.4{+-}0.4 mm. The average time needed for reconstruction is below 1 min. The reconstruction algorithm also provides the seed orientation with an uncertainty of 10 deg. {+-}8 deg. The seed detection algorithm presented here is reliable and was efficiently used in the clinic. Conclusions: When combined with an appropriate coregistration technique to identify the organs in the seed coordinate system, this algorithm will offer new possibilities for a next generation of clinical brachytherapy systems.
Ojala, Jarkko; Kapanen, Mika; Hyödynmaa, Simo
2016-06-01
New version 13.6.23 of the electron Monte Carlo (eMC) algorithm in Varian Eclipse™ treatment planning system has a model for 4MeV electron beam and some general improvements for dose calculation. This study provides the first overall accuracy assessment of this algorithm against full Monte Carlo (MC) simulations for electron beams from 4MeV to 16MeV with most emphasis on the lower energy range. Beams in a homogeneous water phantom and clinical treatment plans were investigated including measurements in the water phantom. Two different material sets were used with full MC: (1) the one applied in the eMC algorithm and (2) the one included in the Eclipse™ for other algorithms. The results of clinical treatment plans were also compared to those of the older eMC version 11.0.31. In the water phantom the dose differences against the full MC were mostly less than 3% with distance-to-agreement (DTA) values within 2mm. Larger discrepancies were obtained in build-up regions, at depths near the maximum electron ranges and with small apertures. For the clinical treatment plans the overall dose differences were mostly within 3% or 2mm with the first material set. Larger differences were observed for a large 4MeV beam entering curved patient surface with extended SSD and also in regions of large dose gradients. Still the DTA values were within 3mm. The discrepancies between the eMC and the full MC were generally larger for the second material set. The version 11.0.31 performed always inferiorly, when compared to the 13.6.23. PMID:27189311
Algorithm Engineering - An Attempt at a Definition
NASA Astrophysics Data System (ADS)
Sanders, Peter
This paper defines algorithm engineering as a general methodology for algorithmic research. The main process in this methodology is a cycle consisting of algorithm design, analysis, implementation and experimental evaluation that resembles Popper’s scientific method. Important additional issues are realistic models, algorithm libraries, benchmarks with real-world problem instances, and a strong coupling to applications. Algorithm theory with its process of subsequent modelling, design, and analysis is not a competing approach to algorithmics but an important ingredient of algorithm engineering.
Algorithm Visualization System for Teaching Spatial Data Algorithms
ERIC Educational Resources Information Center
Nikander, Jussi; Helminen, Juha; Korhonen, Ari
2010-01-01
TRAKLA2 is a web-based learning environment for data structures and algorithms. The system delivers automatically assessed algorithm simulation exercises that are solved using a graphical user interface. In this work, we introduce a novel learning environment for spatial data algorithms, SDA-TRAKLA2, which has been implemented on top of the…
Competing Sudakov veto algorithms
NASA Astrophysics Data System (ADS)
Kleiss, Ronald; Verheyen, Rob
2016-07-01
We present a formalism to analyze the distribution produced by a Monte Carlo algorithm. We perform these analyses on several versions of the Sudakov veto algorithm, adding a cutoff, a second variable and competition between emission channels. The formal analysis allows us to prove that multiple, seemingly different competition algorithms, including those that are currently implemented in most parton showers, lead to the same result. Finally, we test their performance in a semi-realistic setting and show that there are significantly faster alternatives to the commonly used algorithms.
Pierobon, Elisa Sefora; Sefora, Pierobon Elisa; Sandrini, Silvio; Silvio, Sandrini; De Fazio, Nicola; Nicola, De Fazio; Rossini, Giuseppe; Giuseppe, Rossini; Fontana, Iris; Iris, Fontana; Boschiero, Luigino; Luigino, Boschiero; Gropuzzo, Maria; Maria, Gropuzzo; Gotti, Eliana; Eliana, Gotti; Donati, Donato; Donato, Donati; Minetti, Enrico; Enrico, Minetti; Gandolfo, Maria Teresa; Teresa, Gandolfo Maria; Brunello, Anna; Anna, Brunello; Libetta, Carmelo; Carmelo, Libetta; Secchi, Antonio; Antonio, Secchi; Chiaramonte, Stefano; Stefano, Chiaramonte; Rigotti, Paolo; Paolo, Rigotti
2013-08-01
This 5 year observational multicentre study conducted in the Nord Italian Transplant programme area evaluated outcomes in patients receiving kidneys from donors over 60 years allocated according to a combined clinical and histological algorithm. Low-risk donors 60-69 years without risk factors were allocated to single kidney transplant (LR-SKT) based on clinical criteria. Biopsy was performed in donors over 70 years or 60-69 years with risk factors, allocated to Single (HR-SKT) or Dual kidney transplant (HR-DKT) according to the severity of histological damage. Forty HR-DKTs, 41 HR-SKTs and 234 LR-SKTs were evaluated. Baseline differences generally reflected stratification and allocation criteria. Patient and graft (death censored) survival were 90% and 92% for HR-DKT, 85% and 89% for HR-SKT, 88% and 87% for LR-SKT. The algorithm appeared user-friendly in daily practice and was safe and efficient, as demonstrated by satisfactory outcomes in all groups at 5 years. Clinical criteria performed well in low-risk donors. The excellent outcomes observed in DKTs call for fine-tuning of cut-off scores for allocation to DKT or SKT in high-risk patients.
Semioptimal practicable algorithmic cooling
NASA Astrophysics Data System (ADS)
Elias, Yuval; Mor, Tal; Weinstein, Yossi
2011-04-01
Algorithmic cooling (AC) of spins applies entropy manipulation algorithms in open spin systems in order to cool spins far beyond Shannon’s entropy bound. Algorithmic cooling of nuclear spins was demonstrated experimentally and may contribute to nuclear magnetic resonance spectroscopy. Several cooling algorithms were suggested in recent years, including practicable algorithmic cooling (PAC) and exhaustive AC. Practicable algorithms have simple implementations, yet their level of cooling is far from optimal; exhaustive algorithms, on the other hand, cool much better, and some even reach (asymptotically) an optimal level of cooling, but they are not practicable. We introduce here semioptimal practicable AC (SOPAC), wherein a few cycles (typically two to six) are performed at each recursive level. Two classes of SOPAC algorithms are proposed and analyzed. Both attain cooling levels significantly better than PAC and are much more efficient than the exhaustive algorithms. These algorithms are shown to bridge the gap between PAC and exhaustive AC. In addition, we calculated the number of spins required by SOPAC in order to purify qubits for quantum computation. As few as 12 and 7 spins are required (in an ideal scenario) to yield a mildly pure spin (60% polarized) from initial polarizations of 1% and 10%, respectively. In the latter case, about five more spins are sufficient to produce a highly pure spin (99.99% polarized), which could be relevant for fault-tolerant quantum computing.
NASA Technical Reports Server (NTRS)
Wang, Lui; Bayer, Steven E.
1991-01-01
Genetic algorithms are mathematical, highly parallel, adaptive search procedures (i.e., problem solving methods) based loosely on the processes of natural genetics and Darwinian survival of the fittest. Basic genetic algorithms concepts are introduced, genetic algorithm applications are introduced, and results are presented from a project to develop a software tool that will enable the widespread use of genetic algorithm technology.
Williams, P. T.; Dickson, T. L.; Yin, S.
2007-12-01
The current regulations to insure that nuclear reactor pressure vessels (RPVs) maintain their structural integrity when subjected to transients such as pressurized thermal shock (PTS) events were derived from computational models developed in the early-to-mid 1980s. Since that time, advancements and refinements in relevant technologies that impact RPV integrity assessment have led to an effort by the NRC to re-evaluate its PTS regulations. Updated computational methodologies have been developed through interactions between experts in the relevant disciplines of thermal hydraulics, probabilistic risk assessment, materials embrittlement, fracture mechanics, and inspection (flaw characterization). Contributors to the development of these methodologies include the NRC staff, their contractors, and representatives from the nuclear industry. These updated methodologies have been integrated into the Fracture Analysis of Vessels -- Oak Ridge (FAVOR, v06.1) computer code developed for the NRC by the Heavy Section Steel Technology (HSST) program at Oak Ridge National Laboratory (ORNL). The FAVOR, v04.1, code represents the baseline NRC-selected applications tool for re-assessing the current PTS regulations. This report is intended to document the technical bases for the assumptions, algorithms, methods, and correlations employed in the development of the FAVOR, v06.1, code.
Totally parallel multilevel algorithms
NASA Technical Reports Server (NTRS)
Frederickson, Paul O.
1988-01-01
Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Jara, Maximilian; Reese, Tim; Malinowski, Maciej; Valle, Erika; Seehofer, Daniel; Puhl, Gero; Neuhaus, Peter; Pratschke, Johann; Stockmann, Martin
2015-01-01
Objectives Post-hepatectomy liver failure has a major impact on patient outcome. This study aims to explore the impact of the integration of a novel patient-centred evaluation, the LiMAx algorithm, on perioperative patient outcome after hepatectomy. Methods Trends in perioperative variables and morbidity and mortality rates in 1170 consecutive patients undergoing elective hepatectomy between January 2006 and December 2011 were analysed retrospectively. Propensity score matching was used to compare the effects on morbidity and mortality of the integration of the LiMAx algorithm into clinical practice. Results Over the study period, the proportion of complex hepatectomies increased from 29.1% in 2006 to 37.7% in 2011 (P = 0.034). Similarly, the proportion of patients with liver cirrhosis selected for hepatic surgery rose from 6.9% in 2006 to 11.3% in 2011 (P = 0.039). Despite these increases, rates of post-hepatectomy liver failure fell from 24.7% in 2006 to 9.0% in 2011 (P < 0.001) and liver failure-related postoperative mortality decreased from 4.0% in 2006 to 0.9% in 2011 (P = 0.014). Propensity score matching was associated with reduced rates of post-hepatectomy liver failure [24.7% (n = 77) versus 11.2% (n = 35); P < 0.001] and related mortality [3.8% (n = 12) versus 1.0% (n = 3); P = 0.035]. Conclusions Postoperative liver failure and postoperative liver failure-related mortality decreased in patients undergoing hepatectomy following the implementation of the LiMAx algorithm. PMID:26058324
Long-Boyle, Janel; Savic, Rada; Yan, Shirley; Bartelink, Imke; Musick, Lisa; French, Deborah; Law, Jason; Horn, Biljana; Cowan, Morton J.; Dvorak, Christopher C.
2014-01-01
Background Population pharmacokinetic (PK) studies of busulfan in children have shown that individualized model-based algorithms provide improved targeted busulfan therapy when compared to conventional dosing. The adoption of population PK models into routine clinical practice has been hampered by the tendency of pharmacologists to develop complex models too impractical for clinicians to use. The authors aimed to develop a population PK model for busulfan in children that can reliably achieve therapeutic exposure (concentration-at-steady-state, Css) and implement a simple, model-based tool for the initial dosing of busulfan in children undergoing HCT. Patients and Methods Model development was conducted using retrospective data available in 90 pediatric and young adult patients who had undergone HCT with busulfan conditioning. Busulfan drug levels and potential covariates influencing drug exposure were analyzed using the non-linear mixed effects modeling software, NONMEM. The final population PK model was implemented into a clinician-friendly, Microsoft Excel-based tool and used to recommend initial doses of busulfan in a group of 21 pediatric patients prospectively dosed based on the population PK model. Results Modeling of busulfan time-concentration data indicates busulfan CL displays non-linearity in children, decreasing up to approximately 20% between the concentrations of 250–2000 ng/mL. Important patient-specific covariates found to significantly impact busulfan CL were actual body weight and age. The percentage of individuals achieving a therapeutic Css was significantly higher in subjects receiving initial doses based on the population PK model (81%) versus historical controls dosed on conventional guidelines (52%) (p = 0.02). Conclusion When compared to the conventional dosing guidelines, the model-based algorithm demonstrates significant improvement for providing targeted busulfan therapy in children and young adults. PMID:25162216
Seamless Merging of Hypertext and Algorithm Animation
ERIC Educational Resources Information Center
Karavirta, Ville
2009-01-01
Online learning material that students use by themselves is one of the typical usages of algorithm animation (AA). Thus, the integration of algorithm animations into hypertext is seen as an important topic today to promote the usage of algorithm animation in teaching. This article presents an algorithm animation viewer implemented purely using…
NASA Technical Reports Server (NTRS)
Gregory, Kyle J.; Hill, Joanne E. (Editor); Black, J. Kevin; Baumgartner, Wayne H.; Jahoda, Keith
2016-01-01
A fundamental challenge in a spaceborne application of a gas-based Time Projection Chamber (TPC) for observation of X-ray polarization is handling the large amount of data collected. The TPC polarimeter described uses the APV-25 Application Specific Integrated Circuit (ASIC) to readout a strip detector. Two dimensional photoelectron track images are created with a time projection technique and used to determine the polarization of the incident X-rays. The detector produces a 128x30 pixel image per photon interaction with each pixel registering 12 bits of collected charge. This creates challenging requirements for data storage and downlink bandwidth with only a modest incidence of photons and can have a significant impact on the overall mission cost. An approach is described for locating and isolating the photoelectron track within the detector image, yielding a much smaller data product, typically between 8x8 pixels and 20x20 pixels. This approach is implemented using a Microsemi RT-ProASIC3-3000 Field-Programmable Gate Array (FPGA), clocked at 20 MHz and utilizing 10.7k logic gates (14% of FPGA), 20 Block RAMs (17% of FPGA), and no external RAM. Results will be presented, demonstrating successful photoelectron track cluster detection with minimal impact to detector dead-time.
NASA Astrophysics Data System (ADS)
Gregory, Kyle J.; Hill, Joanne E.; Black, J. Kevin; Baumgartner, Wayne H.; Jahoda, Keith
2016-05-01
A fundamental challenge in a spaceborne application of a gas-based Time Projection Chamber (TPC) for observation of X-ray polarization is handling the large amount of data collected. The TPC polarimeter described uses the APV-25 Application Specific Integrated Circuit (ASIC) to readout a strip detector. Two dimensional photo- electron track images are created with a time projection technique and used to determine the polarization of the incident X-rays. The detector produces a 128x30 pixel image per photon interaction with each pixel registering 12 bits of collected charge. This creates challenging requirements for data storage and downlink bandwidth with only a modest incidence of photons and can have a significant impact on the overall mission cost. An approach is described for locating and isolating the photoelectron track within the detector image, yielding a much smaller data product, typically between 8x8 pixels and 20x20 pixels. This approach is implemented using a Microsemi RT-ProASIC3-3000 Field-Programmable Gate Array (FPGA), clocked at 20 MHz and utilizing 10.7k logic gates (14% of FPGA), 20 Block RAMs (17% of FPGA), and no external RAM. Results will be presented, demonstrating successful photoelectron track cluster detection with minimal impact to detector dead-time.
An algorithm for generating abstract syntax trees
NASA Technical Reports Server (NTRS)
Noonan, R. E.
1985-01-01
The notion of an abstract syntax is discussed. An algorithm is presented for automatically deriving an abstract syntax directly from a BNF grammar. The implementation of this algorithm and its application to the grammar for Modula are discussed.
NASA Astrophysics Data System (ADS)
Szadkowski, Z.
2006-05-01
Extremely rare flux of UHERC requires sophisticated detection techniques. Standard methods oriented on the typical events may not be sensitive enough to capture rare events, crucial to fix a discrepancy in the current data or to confirm/reject some new hypothesis. Currently used triggers in water Cherenkov tanks in the Pierre Auger surface detector, which select events above some amplitude thresholds or investigate a length of traces are not optimized to the horizontal and very inclined showers, interesting as potentially generated by neutrinos. Those showers could be triggered using their signatures: i.e. a curvature of the shower front, transformed on the rise time of traces or muon component giving early peak for "old" showers. Currently available powerful and cost-effective FPGAs provide sufficient resources to implement new triggers not available in the past. The paper describes the implementation proposal of 16-point discrete Fourier transform based on the Radix-2 FFT algorithm into Altera Cyclone FPGA, used in the 3rd generation of the surface detector trigger. All complex coefficients are calculated online in heavy pipelined routines. The register performance ˜200 MHz and relatively low resources occupancy ˜2000 logic elements/channel for 10-bit resolution provide a powerful tool to trigger the events on the traces characteristic in the frequency domain. The FFT code has been successively merged to the code of the 1st surface selector level trigger of the Pierre Auger Observatory and is planned to be tested in real pampas environment.
The performance of asynchronous algorithms on hypercubes
Womble, D.E.
1988-12-01
Many asynchronous algorithms have been developed for parallel computers. Most implementations of asynchronous algorithms, however, have been for shared memory machines. In this paper, we study the implementation and performance of some common asynchronous algorithms on the NCUBE/ten, a 1024 node hypercube. In addition, we summarize existing theoretical work and discuss some classes of algorithms that can be made asynchronous and some that cannot. 16 refs., 3 figs.
Algorithm Calculates Cumulative Poisson Distribution
NASA Technical Reports Server (NTRS)
Bowerman, Paul N.; Nolty, Robert C.; Scheuer, Ernest M.
1992-01-01
Algorithm calculates accurate values of cumulative Poisson distribution under conditions where other algorithms fail because numbers are so small (underflow) or so large (overflow) that computer cannot process them. Factors inserted temporarily to prevent underflow and overflow. Implemented in CUMPOIS computer program described in "Cumulative Poisson Distribution Program" (NPO-17714).
Parallel algorithms for unconstrained optimizations by multisplitting
He, Qing
1994-12-31
In this paper a new parallel iterative algorithm for unconstrained optimization using the idea of multisplitting is proposed. This algorithm uses the existing sequential algorithms without any parallelization. Some convergence and numerical results for this algorithm are presented. The experiments are performed on an Intel iPSC/860 Hyper Cube with 64 nodes. It is interesting that the sequential implementation on one node shows that if the problem is split properly, the algorithm converges much faster than one without splitting.
Two Algorithms for Processing Electronic Nose Data
NASA Technical Reports Server (NTRS)
Young, Rebecca; Linnell, Bruce
2007-01-01
Two algorithms for processing the digitized readings of electronic noses, and computer programs to implement the algorithms, have been devised in a continuing effort to increase the utility of electronic noses as means of identifying airborne compounds and measuring their concentrations. One algorithm identifies the two vapors in a two-vapor mixture and estimates the concentration of each vapor (in principle, this algorithm could be extended to more than two vapors). The other algorithm identifies a single vapor and estimates its concentration.
An efficient algorithm for function optimization: modified stem cells algorithm
NASA Astrophysics Data System (ADS)
Taherdangkoo, Mohammad; Paziresh, Mahsa; Yazdi, Mehran; Bagheri, Mohammad
2013-03-01
In this paper, we propose an optimization algorithm based on the intelligent behavior of stem cell swarms in reproduction and self-organization. Optimization algorithms, such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) algorithm, Ant Colony Optimization (ACO) algorithm and Artificial Bee Colony (ABC) algorithm, can give solutions to linear and non-linear problems near to the optimum for many applications; however, in some case, they can suffer from becoming trapped in local optima. The Stem Cells Algorithm (SCA) is an optimization algorithm inspired by the natural behavior of stem cells in evolving themselves into new and improved cells. The SCA avoids the local optima problem successfully. In this paper, we have made small changes in the implementation of this algorithm to obtain improved performance over previous versions. Using a series of benchmark functions, we assess the performance of the proposed algorithm and compare it with that of the other aforementioned optimization algorithms. The obtained results prove the superiority of the Modified Stem Cells Algorithm (MSCA).
Linearization algorithms for line transfer
Scott, H.A.
1990-11-06
Complete linearization is a very powerful technique for solving multi-line transfer problems that can be used efficiently with a variety of transfer formalisms. The linearization algorithm we describe is computationally very similar to ETLA, but allows an effective treatment of strongly-interacting lines. This algorithm has been implemented (in several codes) with two different transfer formalisms in all three one-dimensional geometries. We also describe a variation of the algorithm that handles saturable laser transport. Finally, we present a combination of linearization with a local approximate operator formalism, which has been implemented in two dimensions and is being developed in three dimensions. 11 refs.
Parallelization of Edge Detection Algorithm using MPI on Beowulf Cluster
NASA Astrophysics Data System (ADS)
Haron, Nazleeni; Amir, Ruzaini; Aziz, Izzatdin A.; Jung, Low Tan; Shukri, Siti Rohkmah
In this paper, we present the design of parallel Sobel edge detection algorithm using Foster's methodology. The parallel algorithm is implemented using MPI message passing library and master/slave algorithm. Every processor performs the same sequential algorithm but on different part of the image. Experimental results conducted on Beowulf cluster are presented to demonstrate the performance of the parallel algorithm.
STAR Algorithm Integration Team - Facilitating operational algorithm development
NASA Astrophysics Data System (ADS)
Mikles, V. J.
2015-12-01
The NOAA/NESDIS Center for Satellite Research and Applications (STAR) provides technical support of the Joint Polar Satellite System (JPSS) algorithm development and integration tasks. Utilizing data from the S-NPP satellite, JPSS generates over thirty Environmental Data Records (EDRs) and Intermediate Products (IPs) spanning atmospheric, ocean, cryosphere, and land weather disciplines. The Algorithm Integration Team (AIT) brings technical expertise and support to product algorithms, specifically in testing and validating science algorithms in a pre-operational environment. The AIT verifies that new and updated algorithms function in the development environment, enforces established software development standards, and ensures that delivered packages are functional and complete. AIT facilitates the development of new JPSS-1 algorithms by implementing a review approach based on the Enterprise Product Lifecycle (EPL) process. Building on relationships established during the S-NPP algorithm development process and coordinating directly with science algorithm developers, the AIT has implemented structured reviews with self-contained document suites. The process has supported algorithm improvements for products such as ozone, active fire, vegetation index, and temperature and moisture profiles.
In-Trail Procedure (ITP) Algorithm Design
NASA Technical Reports Server (NTRS)
Munoz, Cesar A.; Siminiceanu, Radu I.
2007-01-01
The primary objective of this document is to provide a detailed description of the In-Trail Procedure (ITP) algorithm, which is part of the Airborne Traffic Situational Awareness In-Trail Procedure (ATSA-ITP) application. To this end, the document presents a high level description of the ITP Algorithm and a prototype implementation of this algorithm in the programming language C.
A Robustly Stabilizing Model Predictive Control Algorithm
NASA Technical Reports Server (NTRS)
Ackmece, A. Behcet; Carson, John M., III
2007-01-01
A model predictive control (MPC) algorithm that differs from prior MPC algorithms has been developed for controlling an uncertain nonlinear system. This algorithm guarantees the resolvability of an associated finite-horizon optimal-control problem in a receding-horizon implementation.
The Superior Lambert Algorithm
NASA Astrophysics Data System (ADS)
der, G.
2011-09-01
Lambert algorithms are used extensively for initial orbit determination, mission planning, space debris correlation, and missile targeting, just to name a few applications. Due to the significance of the Lambert problem in Astrodynamics, Gauss, Battin, Godal, Lancaster, Gooding, Sun and many others (References 1 to 15) have provided numerous formulations leading to various analytic solutions and iterative methods. Most Lambert algorithms and their computer programs can only work within one revolution, break down or converge slowly when the transfer angle is near zero or 180 degrees, and their multi-revolution limitations are either ignored or barely addressed. Despite claims of robustness, many Lambert algorithms fail without notice, and the users seldom have a clue why. The DerAstrodynamics lambert2 algorithm, which is based on the analytic solution formulated by Sun, works for any number of revolutions and converges rapidly at any transfer angle. It provides significant capability enhancements over every other Lambert algorithm in use today. These include improved speed, accuracy, robustness, and multirevolution capabilities as well as implementation simplicity. Additionally, the lambert2 algorithm provides a powerful tool for solving the angles-only problem without artificial singularities (pointed out by Gooding in Reference 16), which involves 3 lines of sight captured by optical sensors, or systems such as the Air Force Space Surveillance System (AFSSS). The analytic solution is derived from the extended Godal’s time equation by Sun, while the iterative method of solution is that of Laguerre, modified for robustness. The Keplerian solution of a Lambert algorithm can be extended to include the non-Keplerian terms of the Vinti algorithm via a simple targeting technique (References 17 to 19). Accurate analytic non-Keplerian trajectories can be predicted for satellites and ballistic missiles, while performing at least 100 times faster in speed than most
Power spectral estimation algorithms
NASA Technical Reports Server (NTRS)
Bhatia, Manjit S.
1989-01-01
Algorithms to estimate the power spectrum using Maximum Entropy Methods were developed. These algorithms were coded in FORTRAN 77 and were implemented on the VAX 780. The important considerations in this analysis are: (1) resolution, i.e., how close in frequency two spectral components can be spaced and still be identified; (2) dynamic range, i.e., how small a spectral peak can be, relative to the largest, and still be observed in the spectra; and (3) variance, i.e., how accurate the estimate of the spectra is to the actual spectra. The application of the algorithms based on Maximum Entropy Methods to a variety of data shows that these criteria are met quite well. Additional work in this direction would help confirm the findings. All of the software developed was turned over to the technical monitor. A copy of a typical program is included. Some of the actual data and graphs used on this data are also included.
Cobweb/3: A portable implementation
NASA Technical Reports Server (NTRS)
Mckusick, Kathleen; Thompson, Kevin
1990-01-01
An algorithm is examined for data clustering and incremental concept formation. An overview is given of the Cobweb/3 system and the algorithm on which it is based, as well as the practical details of obtaining and running the system code. The implementation features a flexible user interface which includes a graphical display of the concept hierarchies that the system constructs.
Basic cluster compression algorithm
NASA Technical Reports Server (NTRS)
Hilbert, E. E.; Lee, J.
1980-01-01
Feature extraction and data compression of LANDSAT data is accomplished by BCCA program which reduces costs associated with transmitting, storing, distributing, and interpreting multispectral image data. Algorithm uses spatially local clustering to extract features from image data to describe spectral characteristics of data set. Approach requires only simple repetitive computations, and parallel processing can be used for very high data rates. Program is written in FORTRAN IV for batch execution and has been implemented on SEL 32/55.
Algorithms and Requirements for Measuring Network Bandwidth
Jin, Guojun
2002-12-08
This report unveils new algorithms for actively measuring (not estimating) available bandwidths with very low intrusion, computing cross traffic, thus estimating the physical bandwidth, provides mathematical proof that the algorithms are accurate, and addresses conditions, requirements, and limitations for new and existing algorithms for measuring network bandwidths. The paper also discusses a number of important terminologies and issues for network bandwidth measurement, and introduces a fundamental parameter -Maximum Burst Size that is critical for implementing algorithms based on multiple packets.
TVFMCATS. Time Variant Floating Mean Counting Algorithm
Huffman, R.K.
1999-05-01
This software was written to test a time variant floating mean counting algorithm. The algorithm was developed by Westinghouse Savannah River Company and a provisional patent has been filed on the algorithm. The test software was developed to work with the Val Tech model IVB prototype version II count rate meter hardware. The test software was used to verify the algorithm developed by WSRC could be correctly implemented with the vendor`s hardware.
Time Variant Floating Mean Counting Algorithm
Huffman, Russell Kevin
1999-06-03
This software was written to test a time variant floating mean counting algorithm. The algorithm was developed by Westinghouse Savannah River Company and a provisional patent has been filed on the algorithm. The test software was developed to work with the Val Tech model IVB prototype version II count rate meter hardware. The test software was used to verify the algorithm developed by WSRC could be correctly implemented with the vendor''s hardware.
Dynamic Programming Algorithm vs. Genetic Algorithm: Which is Faster?
NASA Astrophysics Data System (ADS)
Petković, Dušan
The article compares two different approaches for the optimization problem of large join queries (LJQs). Almost all commercial database systems use a form of the dynamic programming algorithm to solve the ordering of join operations for large join queries, i.e. joins with more than dozen join operations. The property of the dynamic programming algorithm is that the execution time increases significantly in the case, where the number of join operations in a query is large. Genetic algorithms (GAs), as a data mining technique, have been shown as a promising technique in solving the ordering of join operations in LJQs. Using the existing implementation of GA, we compare the dynamic programming algorithm implemented in commercial database systems with the corresponding GA module. Our results show that the use of a genetic algorithm is a better solution for optimization of large join queries, i.e., that such a technique outperforms the implementations of the dynamic programming algorithm in conventional query optimization components for very large join queries.
NASA Astrophysics Data System (ADS)
Rao, Sailesh K.; Kollath, T.
1986-07-01
In this paper, we show that every systolic array executes a Regular Iterative Algorithm with a strongly separating hyperplane and conversely, that every such algorithm can be implemented on a systolic array. This characterization provides us with an unified framework for describing the contributions of other authors. It also exposes the relevance of many fundamental concepts that were introduced in the sixties by Hennie, Waite and Karp, Miller and Winograd, to the present day concern of systolic array
Chang, C.Y.
1986-01-01
New results on efficient forms of decoding convolutional codes based on Viterbi and stack algorithms using systolic array architecture are presented. Some theoretical aspects of systolic arrays are also investigated. First, systolic array implementation of Viterbi algorithm is considered, and various properties of convolutional codes are derived. A technique called strongly connected trellis decoding is introduced to increase the efficient utilization of all the systolic array processors. The issues dealing with the composite branch metric generation, survivor updating, overall system architecture, throughput rate, and computations overhead ratio are also investigated. Second, the existing stack algorithm is modified and restated in a more concise version so that it can be efficiently implemented by a special type of systolic array called systolic priority queue. Three general schemes of systolic priority queue based on random access memory, shift register, and ripple register are proposed. Finally, a systematic approach is presented to design systolic arrays for certain general classes of recursively formulated algorithms.
NASA Technical Reports Server (NTRS)
Barth, Timothy J.; Lomax, Harvard
1987-01-01
The past decade has seen considerable activity in algorithm development for the Navier-Stokes equations. This has resulted in a wide variety of useful new techniques. Some examples for the numerical solution of the Navier-Stokes equations are presented, divided into two parts. One is devoted to the incompressible Navier-Stokes equations, and the other to the compressible form.
Tomasz Plawski, J. Hovater
2010-09-01
A digital low level radio frequency (RF) system typically incorporates either a heterodyne or direct sampling technique, followed by fast ADCs, then an FPGA, and finally a transmitting DAC. This universal platform opens up the possibilities for a variety of control algorithm implementations. The foremost concern for an RF control system is cavity field stability, and to meet the required quality of regulation, the chosen control system needs to have sufficient feedback gain. In this paper we will investigate the effectiveness of the regulation for three basic control system algorithms: I&Q (In-phase and Quadrature), Amplitude & Phase and digital SEL (Self Exciting Loop) along with the example of the Jefferson Lab 12 GeV cavity field control system.
Reactive Collision Avoidance Algorithm
NASA Technical Reports Server (NTRS)
Scharf, Daniel; Acikmese, Behcet; Ploen, Scott; Hadaegh, Fred
2010-01-01
The reactive collision avoidance (RCA) algorithm allows a spacecraft to find a fuel-optimal trajectory for avoiding an arbitrary number of colliding spacecraft in real time while accounting for acceleration limits. In addition to spacecraft, the technology can be used for vehicles that can accelerate in any direction, such as helicopters and submersibles. In contrast to existing, passive algorithms that simultaneously design trajectories for a cluster of vehicles working to achieve a common goal, RCA is implemented onboard spacecraft only when an imminent collision is detected, and then plans a collision avoidance maneuver for only that host vehicle, thus preventing a collision in an off-nominal situation for which passive algorithms cannot. An example scenario for such a situation might be when a spacecraft in the cluster is approaching another one, but enters safe mode and begins to drift. Functionally, the RCA detects colliding spacecraft, plans an evasion trajectory by solving the Evasion Trajectory Problem (ETP), and then recovers after the collision is avoided. A direct optimization approach was used to develop the algorithm so it can run in real time. In this innovation, a parameterized class of avoidance trajectories is specified, and then the optimal trajectory is found by searching over the parameters. The class of trajectories is selected as bang-off-bang as motivated by optimal control theory. That is, an avoiding spacecraft first applies full acceleration in a constant direction, then coasts, and finally applies full acceleration to stop. The parameter optimization problem can be solved offline and stored as a look-up table of values. Using a look-up table allows the algorithm to run in real time. Given a colliding spacecraft, the properties of the collision geometry serve as indices of the look-up table that gives the optimal trajectory. For multiple colliding spacecraft, the set of trajectories that avoid all spacecraft is rapidly searched on
NASA Technical Reports Server (NTRS)
Vardi, A.
1984-01-01
The representation min t s.t. F(I)(x). - t less than or equal to 0 for all i is examined. An active set strategy is designed of functions: active, semi-active, and non-active. This technique will help in preventing zigzagging which often occurs when an active set strategy is used. Some of the inequality constraints are handled with slack variables. Also a trust region strategy is used in which at each iteration there is a sphere around the current point in which the local approximation of the function is trusted. The algorithm is implemented into a successful computer program. Numerical results are provided.
Threshold extended ID3 algorithm
NASA Astrophysics Data System (ADS)
Kumar, A. B. Rajesh; Ramesh, C. Phani; Madhusudhan, E.; Padmavathamma, M.
2012-04-01
Information exchange over insecure networks needs to provide authentication and confidentiality to the database in significant problem in datamining. In this paper we propose a novel authenticated multiparty ID3 Algorithm used to construct multiparty secret sharing decision tree for implementation in medical transactions.
ERIC Educational Resources Information Center
Trochim, William M. K.
Investigated is the topic of research implementation and how it can affect evaluation results. Even when evaluations are well planned, the obtained results can be misleading if the conscientiously-constructed research plan is not correctly implemented in practice. In virtually every research arena, one finds major difficulties in implementing the…
Empirical study of parallel LRU simulation algorithms
NASA Technical Reports Server (NTRS)
Carr, Eric; Nicol, David M.
1994-01-01
This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. Two other algorithms are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The two other SIMD algorithm are more complex, but have costs that are independent on the cache size. Both the second and third SIMD algorithms compute all stack distances; the second SIMD algorithm is completely general, whereas the third SIMD algorithm presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithm implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from execution of three SPEC benchmark programs.
Genetic algorithms and the immune system
Forrest, S. . Dept. of Computer Science); Perelson, A.S. )
1990-01-01
Using genetic algorithm techniques we introduce a model to examine the hypothesis that antibody and T cell receptor genes evolved so as to encode the information needed to recognize schemas that characterize common pathogens. We have implemented the algorithm on the Connection Machine for 16,384 64-bit antigens and 512 64-bit antibodies. 8 refs.
Efficient algorithms for the laboratory discovery of optimal quantum controls
NASA Astrophysics Data System (ADS)
Turinici, Gabriel; Le Bris, Claude; Rabitz, Herschel
2004-07-01
The laboratory closed-loop optimal control of quantum phenomena, expressed as minimizing a suitable cost functional, is currently implemented through an optimization algorithm coupled to the experimental apparatus. In practice, the most commonly used search algorithms are variants of genetic algorithms. As an alternative choice, a direct search deterministic algorithm is proposed in this paper. For the simple simulations studied here, it outperforms the existing approaches. An additional algorithm is introduced in order to reveal some properties of the cost functional landscape.
Mapping algorithms on regular parallel architectures
Lee, P.
1989-01-01
It is significant that many of time-intensive scientific algorithms are formulated as nested loops, which are inherently regularly structured. In this dissertation the relations between the mathematical structure of nested loop algorithms and the architectural capabilities required for their parallel execution are studied. The architectural model considered in depth is that of an arbitrary dimensional systolic array. The mathematical structure of the algorithm is characterized by classifying its data-dependence vectors according to the new ZERO-ONE-INFINITE property introduced. Using this classification, the first complete set of necessary and sufficient conditions for correct transformation of a nested loop algorithm onto a given systolic array of an arbitrary dimension by means of linear mappings is derived. Practical methods to derive optimal or suboptimal systolic array implementations are also provided. The techniques developed are used constructively to develop families of implementations satisfying various optimization criteria and to design programmable arrays efficiently executing classes of algorithms. In addition, a Computer-Aided Design system running on SUN workstations has been implemented to help in the design. The methodology, which deals with general algorithms, is illustrated by synthesizing linear and planar systolic array algorithms for matrix multiplication, a reindexed Warshall-Floyd transitive closure algorithm, and the longest common subsequence algorithm.
Parallel asynchronous systems and image processing algorithms
NASA Technical Reports Server (NTRS)
Coon, D. D.; Perera, A. G. U.
1989-01-01
A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to evey pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.
Portable Health Algorithms Test System
NASA Technical Reports Server (NTRS)
Melcher, Kevin J.; Wong, Edmond; Fulton, Christopher E.; Sowers, Thomas S.; Maul, William A.
2010-01-01
A document discusses the Portable Health Algorithms Test (PHALT) System, which has been designed as a means for evolving the maturity and credibility of algorithms developed to assess the health of aerospace systems. Comprising an integrated hardware-software environment, the PHALT system allows systems health management algorithms to be developed in a graphical programming environment, to be tested and refined using system simulation or test data playback, and to be evaluated in a real-time hardware-in-the-loop mode with a live test article. The integrated hardware and software development environment provides a seamless transition from algorithm development to real-time implementation. The portability of the hardware makes it quick and easy to transport between test facilities. This hard ware/software architecture is flexible enough to support a variety of diagnostic applications and test hardware, and the GUI-based rapid prototyping capability is sufficient to support development execution, and testing of custom diagnostic algorithms. The PHALT operating system supports execution of diagnostic algorithms under real-time constraints. PHALT can perform real-time capture and playback of test rig data with the ability to augment/ modify the data stream (e.g. inject simulated faults). It performs algorithm testing using a variety of data input sources, including real-time data acquisition, test data playback, and system simulations, and also provides system feedback to evaluate closed-loop diagnostic response and mitigation control.
Evaluating super resolution algorithms
NASA Astrophysics Data System (ADS)
Kim, Youn Jin; Park, Jong Hyun; Shin, Gun Shik; Lee, Hyun-Seung; Kim, Dong-Hyun; Park, Se Hyeok; Kim, Jaehyun
2011-01-01
This study intends to establish a sound testing and evaluation methodology based upon the human visual characteristics for appreciating the image restoration accuracy; in addition to comparing the subjective results with predictions by some objective evaluation methods. In total, six different super resolution (SR) algorithms - such as iterative back-projection (IBP), robust SR, maximum a posteriori (MAP), projections onto convex sets (POCS), a non-uniform interpolation, and frequency domain approach - were selected. The performance comparison between the SR algorithms in terms of their restoration accuracy was carried out through both subjectively and objectively. The former methodology relies upon the paired comparison method that involves the simultaneous scaling of two stimuli with respect to image restoration accuracy. For the latter, both conventional image quality metrics and color difference methods are implemented. Consequently, POCS and a non-uniform interpolation outperformed the others for an ideal situation, while restoration based methods appear more accurate to the HR image in a real world case where any prior information about the blur kernel is remained unknown. However, the noise-added-image could not be restored successfully by any of those methods. The latest International Commission on Illumination (CIE) standard color difference equation CIEDE2000 was found to predict the subjective results accurately and outperformed conventional methods for evaluating the restoration accuracy of those SR algorithms.
Algorithmic synthesis using Python compiler
NASA Astrophysics Data System (ADS)
Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej
2015-09-01
This paper presents a python to VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translate it to VHDL. FPGA combines many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible, and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. Using higher level of abstraction and High-Level Synthesis compiler implementation time can be reduced. The compiler has been implemented using the Python language. This article describes design, implementation and results of created tools.
Algorithms versus architectures for computational chemistry
NASA Technical Reports Server (NTRS)
Partridge, H.; Bauschlicher, C. W., Jr.
1986-01-01
The algorithms employed are computationally intensive and, as a result, increased performance (both algorithmic and architectural) is required to improve accuracy and to treat larger molecular systems. Several benchmark quantum chemistry codes are examined on a variety of architectures. While these codes are only a small portion of a typical quantum chemistry library, they illustrate many of the computationally intensive kernels and data manipulation requirements of some applications. Furthermore, understanding the performance of the existing algorithm on present and proposed supercomputers serves as a guide for future programs and algorithm development. The algorithms investigated are: (1) a sparse symmetric matrix vector product; (2) a four index integral transformation; and (3) the calculation of diatomic two electron Slater integrals. The vectorization strategies are examined for these algorithms for both the Cyber 205 and Cray XMP. In addition, multiprocessor implementations of the algorithms are looked at on the Cray XMP and on the MIT static data flow machine proposed by DENNIS.
Smell Detection Agent Based Optimization Algorithm
NASA Astrophysics Data System (ADS)
Vinod Chandra, S. S.
2016-09-01
In this paper, a novel nature-inspired optimization algorithm has been employed and the trained behaviour of dogs in detecting smell trails is adapted into computational agents for problem solving. The algorithm involves creation of a surface with smell trails and subsequent iteration of the agents in resolving a path. This algorithm can be applied in different computational constraints that incorporate path-based problems. Implementation of the algorithm can be treated as a shortest path problem for a variety of datasets. The simulated agents have been used to evolve the shortest path between two nodes in a graph. This algorithm is useful to solve NP-hard problems that are related to path discovery. This algorithm is also useful to solve many practical optimization problems. The extensive derivation of the algorithm can be enabled to solve shortest path problems.
Monte Carlo algorithm for free energy calculation.
Bi, Sheng; Tong, Ning-Hua
2015-07-01
We propose a Monte Carlo algorithm for the free energy calculation based on configuration space sampling. An upward or downward temperature scan can be used to produce F(T). We implement this algorithm for the Ising model on a square lattice and triangular lattice. Comparison with the exact free energy shows an excellent agreement. We analyze the properties of this algorithm and compare it with the Wang-Landau algorithm, which samples in energy space. This method is applicable to general classical statistical models. The possibility of extending it to quantum systems is discussed.
A parallel algorithm for global routing
NASA Technical Reports Server (NTRS)
Brouwer, Randall J.; Banerjee, Prithviraj
1990-01-01
A Parallel Hierarchical algorithm for Global Routing (PHIGURE) is presented. The router is based on the work of Burstein and Pelavin, but has many extensions for general global routing and parallel execution. Main features of the algorithm include structured hierarchical decomposition into separate independent tasks which are suitable for parallel execution and adaptive simplex solution for adding feedthroughs and adjusting channel heights for row-based layout. Alternative decomposition methods and the various levels of parallelism available in the algorithm are examined closely. The algorithm is described and results are presented for a shared-memory multiprocessor implementation.
Automatic ionospheric layers detection: Algorithms analysis
NASA Astrophysics Data System (ADS)
Molina, María G.; Zuccheretti, Enrico; Cabrera, Miguel A.; Bianchi, Cesidio; Sciacca, Umberto; Baskaradas, James
2016-03-01
Vertical sounding is a widely used technique to obtain ionosphere measurements, such as an estimation of virtual height versus frequency scanning. It is performed by high frequency radar for geophysical applications called "ionospheric sounder" (or "ionosonde"). Radar detection depends mainly on targets characteristics. While several targets behavior and correspondent echo detection algorithms have been studied, a survey to address a suitable algorithm for ionospheric sounder has to be carried out. This paper is focused on automatic echo detection algorithms implemented in particular for an ionospheric sounder, target specific characteristics were studied as well. Adaptive threshold detection algorithms are proposed, compared to the current implemented algorithm, and tested using actual data obtained from the Advanced Ionospheric Sounder (AIS-INGV) at Rome Ionospheric Observatory. Different cases of study have been selected according typical ionospheric and detection conditions.
Fontana, W.
1990-12-13
In this paper complex adaptive systems are defined by a self- referential loop in which objects encode functions that act back on these objects. A model for this loop is presented. It uses a simple recursive formal language, derived from the lambda-calculus, to provide a semantics that maps character strings into functions that manipulate symbols on strings. The interaction between two functions, or algorithms, is defined naturally within the language through function composition, and results in the production of a new function. An iterated map acting on sets of functions and a corresponding graph representation are defined. Their properties are useful to discuss the behavior of a fixed size ensemble of randomly interacting functions. This function gas'', or Turning gas'', is studied under various conditions, and evolves cooperative interaction patterns of considerable intricacy. These patterns adapt under the influence of perturbations consisting in the addition of new random functions to the system. Different organizations emerge depending on the availability of self-replicators.
Algorithms for Graphic Display of Sentence Dependency Structures.
ERIC Educational Resources Information Center
Craven, Timothy C.
1991-01-01
Discussion of graph-drawing algorithms compares and evaluates five algorithms for generated automatic graphic display of sentence dependency structures that have been implemented in the TEXTNET text structure management system. Evaluation criteria are discussed, and a test of the algorithms with a database of newspaper articles is described. (16…
Stereoscopic depth perception for robot vision: algorithms and architectures
Safranek, R.J.; Kak, A.C.
1983-01-01
The implementation of depth perception algorithms for computer vision is considered. In automated manufacturing, depth information is vital for tasks such as path planning and 3-d scene analysis. The presentation begins with a survey of computer algorithms for stereoscopic depth perception. The emphasis is on the Marr-Poggio paradigm of human stereo vision and its computer implementation. In addition, a stereo matching algorithm based on the relaxation labelling technique is examined. A computer architecture designed to efficiently implement stereo matching algorithms, an MIMD array interfaced to a global memory, is presented. 9 references.
Demiris, George
2014-01-01
This article provides a general introduction to implementation science—the discipline that studies the implementation process of research evidence—in the context of hospice and palliative care. By discussing how implementation science principles and frameworks can inform the design and implementation of intervention research, we aim to highlight how this approach can maximize the likelihood for translation and long-term adoption in clinical practice settings. We present 2 ongoing clinical trials in hospice that incorporate considerations for translation in their design and implementation as case studies for the implications of implementation science. This domain helps us better understand why established programs may lose their effectiveness over time or when transferred to other settings, why well-tested programs may exhibit unintended effects when introduced in new settings, or how an intervention can maximize cost-effectiveness with strategies for effective adoption. All these challenges are of significance to hospice and palliative care, where we seek to provide effective and efficient tools to improve care services. The emergence of this discipline calls for researchers and practitioners to carefully examine how to refine current and design new and innovative strategies to improve quality of care. PMID:23558847
Firefly Algorithm for Structural Search.
Avendaño-Franco, Guillermo; Romero, Aldo H
2016-07-12
The problem of computational structure prediction of materials is approached using the firefly (FF) algorithm. Starting from the chemical composition and optionally using prior knowledge of similar structures, the FF method is able to predict not only known stable structures but also a variety of novel competitive metastable structures. This article focuses on the strengths and limitations of the algorithm as a multimodal global searcher. The algorithm has been implemented in software package PyChemia ( https://github.com/MaterialsDiscovery/PyChemia ), an open source python library for materials analysis. We present applications of the method to van der Waals clusters and crystal structures. The FF method is shown to be competitive when compared to other population-based global searchers. PMID:27232694
On mapping systolic algorithms onto the hypercube
Ibarra, O.H.; Sohn, S.M. )
1990-01-01
Much effort has been devoted toward developing efficient algorithms for systolic arrays. Here the authors consider the problem of mapping these algorithms into efficient algorithms for a fixed-size hypercube architecture. They describe in detail several optimal implementations of algorithms given for one-way one and two-dimensional systolic arrays. Since interprocessor communication is many times slower than local computation in parallel computers built to date, the problem of efficient communication is specifically addressed for these mappings. In order to experimentally validate the technique, five systolic algorithms were mapped in various ways onto a 64-node NCUBE/7 MMD hypercube machine. The algorithms are for the following problems: the shuffle scheduling problem, finite impulse response filtering, linear context-free language recognition, matrix multiplication, and computing the Boolean transitive closure. Experimental evidence indicates that good performance is obtained for the mappings.
Fast training algorithms for multilayer neural nets.
Brent, R P
1991-01-01
An algorithm that is faster than back-propagation and for which it is not necessary to specify the number of hidden units in advance is described. The relationship with other fast pattern-recognition algorithms, such as algorithms based on k-d trees, is discussed. The algorithm has been implemented and tested on artificial problems, such as the parity problem, and on real problems arising in speech recognition. Experimental results, including training times and recognition accuracy, are given. Generally, the algorithm achieves accuracy as good as or better than nets trained using back-propagation. Accuracy is comparable to that for the nearest-neighbor algorithm, which is slower and requires more storage space.
Realization of a scalable Shor algorithm.
Monz, Thomas; Nigg, Daniel; Martinez, Esteban A; Brandl, Matthias F; Schindler, Philipp; Rines, Richard; Wang, Shannon X; Chuang, Isaac L; Blatt, Rainer
2016-03-01
Certain algorithms for quantum computers are able to outperform their classical counterparts. In 1994, Peter Shor came up with a quantum algorithm that calculates the prime factors of a large number vastly more efficiently than a classical computer. For general scalability of such algorithms, hardware, quantum error correction, and the algorithmic realization itself need to be extensible. Here we present the realization of a scalable Shor algorithm, as proposed by Kitaev. We factor the number 15 by effectively employing and controlling seven qubits and four "cache qubits" and by implementing generalized arithmetic operations, known as modular multipliers. This algorithm has been realized scalably within an ion-trap quantum computer and returns the correct factors with a confidence level exceeding 99%. PMID:26941315
Thomas, Kamishia L
2016-01-01
Catheter-associated urinary tract infections (CAUTIs) are the most common hospital-acquired infections. The purpose of this quality improvement (QI) project was to successfully implement a nurse-led evidence-based practice change designed to reduce CAUTIs in a cardiac intensive care and step-down unit. The QI project was implemented using a convenience sample of patients admitted to the cardiac intensive care and step-down unit.Evaluation data were collected 3 months preimplementation and 9 months postimplementation. We used Wick's Check-Plan-Do-Check-Act model of continuous QI to guide the project. A statistically significant change in the number of CAUTIs (P = .009) and CAUTI occurrences (P = .005) was observed following the intervention. The number of indwelling catheter days and indwelling catheter utilization did not significantly differ following implementation of the intervention. Nurse compliance with the intervention was computed for each month; the average compliance rate was 91%. Findings from this project indicate that a nurse-led evidence-based practice project exerted a positive influence on CAUTI occurrences.
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess
2011-01-01
More efficient versions of an interpolation method, called kriging, have been introduced in order to reduce its traditionally high computational cost. Written in C++, these approaches were tested on both synthetic and real data. Kriging is a best unbiased linear estimator and suitable for interpolation of scattered data points. Kriging has long been used in the geostatistic and mining communities, but is now being researched for use in the image fusion of remotely sensed data. This allows a combination of data from various locations to be used to fill in any missing data from any single location. To arrive at the faster algorithms, sparse SYMMLQ iterative solver, covariance tapering, Fast Multipole Methods (FMM), and nearest neighbor searching techniques were used. These implementations were used when the coefficient matrix in the linear system is symmetric, but not necessarily positive-definite.
Systolic systems: algorithms and complexity
Chang, J.H.
1986-01-01
This thesis has two main contributions. The first is the design of efficient systolic algorithms for solving recurrence equations, dynamic programming problems, scheduling problems, as well as new systolic implementation of data structures such as stacks, queues, priority queues, and dictionary machines. The second major contribution is the investigation of the computational power of systolic arrays in comparison to sequential models and other models of parallel computation.
NASA Technical Reports Server (NTRS)
Merceret, Francis; Lane, John; Immer, Christopher; Case, Jonathan; Manobianco, John
2005-01-01
The contour error map (CEM) algorithm and the software that implements the algorithm are means of quantifying correlations between sets of time-varying data that are binarized and registered on spatial grids. The present version of the software is intended for use in evaluating numerical weather forecasts against observational sea-breeze data. In cases in which observational data come from off-grid stations, it is necessary to preprocess the observational data to transform them into gridded data. First, the wind direction is gridded and binarized so that D(i,j;n) is the input to CEM based on forecast data and d(i,j;n) is the input to CEM based on gridded observational data. Here, i and j are spatial indices representing 1.25-km intervals along the west-to-east and south-to-north directions, respectively; and n is a time index representing 5-minute intervals. A binary value of D or d = 0 corresponds to an offshore wind, whereas a value of D or d = 1 corresponds to an onshore wind. CEM includes two notable subalgorithms: One identifies and verifies sea-breeze boundaries; the other, which can be invoked optionally, performs an image-erosion function for the purpose of attempting to eliminate river-breeze contributions in the wind fields.
An efficient parallel termination detection algorithm
Baker, A. H.; Crivelli, S.; Jessup, E. R.
2004-05-27
Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Of these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.
Carreno, Joseph J; Lomaestro, Ben M; Jacobs, Apryl L; Meyer, Rachel E; Evans, Ann; Montero, Clemente I
2016-08-01
OBJECTIVE To evaluate time to clinical response before and after implementation of rapid blood culture identification technologies. DESIGN Before-and-after trial. SETTING Large, tertiary, urban, academic health-sciences center. PATIENTS Patients >18 years old with sepsis and concurrent bacteremia or fungemia were included in the study; patients who were pregnant, had polymicrobial septicemia, or were transferred from an outside hospital were excluded. INTERVENTION Prior to the intervention, polymerase chain reaction was used to identify Staphylococcus species from positive blood cultures, and traditional laboratory techniques were used to identify non-staphylococcal species. After the intervention, matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) assay and FilmArray were also used to identify additional species. During both periods, the antimicrobial stewardship team provided prospective audit and feedback for all patients on antibiotics. RESULTS A total of 219 patients were enrolled in the study: 115 patients prior to the intervention and 104 after the intervention. The median time to clinical response was statistically significantly shorter in the postintervention group than in the preintervention group (2 days vs 4 days, respectively; P=.002). By Cox regression, the implementation of MALDI-TOF and FilmArray was associated with shorter time to clinical response (hazard ratio [HR], 1.360; 95% confidence interval [CI], 1.018-1.816). After controlling for potential confounders, the study group was not independently associated with clinical response (adjusted HR, 1.279; 95% CI, 0.955-1.713). Mortality was numerically, but not statistically significantly, lower in the postintervention group than in the preintervention group (7.6% vs 11.4%; P=.342). CONCLUSIONS In the setting of an existing antimicrobial stewardship program, implementation of MALDI-TOF and FilmArray was associated with improved time to clinical response. Further research is needed
Effects of visualization on algorithm comprehension
NASA Astrophysics Data System (ADS)
Mulvey, Matthew
Computer science students are expected to learn and apply a variety of core algorithms which are an essential part of the field. Any one of these algorithms by itself is not necessarily extremely complex, but remembering the large variety of algorithms and the differences between them is challenging. To address this challenge, we present a novel algorithm visualization tool designed to enhance students understanding of Dijkstra's algorithm by allowing them to discover the rules of the algorithm for themselves. It is hoped that a deeper understanding of the algorithm will help students correctly select, adapt and apply the appropriate algorithm when presented with a problem to solve, and that what is learned here will be applicable to the design of other visualization tools designed to teach different algorithms. Our visualization tool is currently in the prototype stage, and this thesis will discuss the pedagogical approach that informs its design, as well as the results of some initial usability testing. Finally, to clarify the direction for further development of the tool, four different variations of the prototype were implemented, and the instructional effectiveness of each was assessed by having a small sample participants use the different versions of the prototype and then take a quiz to assess their comprehension of the algorithm.
Annealed Importance Sampling Reversible Jump MCMC algorithms
Karagiannis, Georgios; Andrieu, Christophe
2013-03-20
It will soon be 20 years since reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithms have been proposed. They have significantly extended the scope of Markov chain Monte Carlo simulation methods, offering the promise to be able to routinely tackle transdimensional sampling problems, as encountered in Bayesian model selection problems for example, in a principled and flexible fashion. Their practical efficient implementation, however, still remains a challenge. A particular difficulty encountered in practice is in the choice of the dimension matching variables (both their nature and their distribution) and the reversible transformations which allow one to define the one-to-one mappings underpinning the design of these algorithms. Indeed, even seemingly sensible choices can lead to algorithms with very poor performance. The focus of this paper is the development and performance evaluation of a method, annealed importance sampling RJ-MCMC (aisRJ), which addresses this problem by mitigating the sensitivity of RJ-MCMC algorithms to the aforementioned poor design. As we shall see the algorithm can be understood as being an “exact approximation” of an idealized MCMC algorithm that would sample from the model probabilities directly in a model selection set-up. Such an idealized algorithm may have good theoretical convergence properties, but typically cannot be implemented, and our algorithms can approximate the performance of such idealized algorithms to an arbitrary degree while not introducing any bias for any degree of approximation. Our approach combines the dimension matching ideas of RJ-MCMC with annealed importance sampling and its Markov chain Monte Carlo implementation. We illustrate the performance of the algorithm with numerical simulations which indicate that, although the approach may at first appear computationally involved, it is in fact competitive.
Applications and accuracy of the parallel diagonal dominant algorithm
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1993-01-01
The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems, the symmetric, and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and the algorithm is a good candidate for the emerging massively parallel machines.
A garbage collection algorithm for shared memory parallel processors
Crammond, J. )
1988-12-01
This paper describes a technique for adapting the Morris sliding garbage collection algorithm to execute on parallel machines with shared memory. The algorithm is described within the framework of an implementation of the parallel logic language Parlog. However, the algorithm is a general one and can easily be adapted to parallel Prolog systems and to other languages. The performance of the algorithm executing a few simple Parlog benchmarks is analyzed. Finally, it is shown how the technique for parallelizing the sequential algorithm can be adapted for a semi-space copying algorithm.
Algorithmic Mechanism Design of Evolutionary Computation
Pei, Yan
2015-01-01
We consider algorithmic design, enhancement, and improvement of evolutionary computation as a mechanism design problem. All individuals or several groups of individuals can be considered as self-interested agents. The individuals in evolutionary computation can manipulate parameter settings and operations by satisfying their own preferences, which are defined by an evolutionary computation algorithm designer, rather than by following a fixed algorithm rule. Evolutionary computation algorithm designers or self-adaptive methods should construct proper rules and mechanisms for all agents (individuals) to conduct their evolution behaviour correctly in order to definitely achieve the desired and preset objective(s). As a case study, we propose a formal framework on parameter setting, strategy selection, and algorithmic design of evolutionary computation by considering the Nash strategy equilibrium of a mechanism design in the search process. The evaluation results present the efficiency of the framework. This primary principle can be implemented in any evolutionary computation algorithm that needs to consider strategy selection issues in its optimization process. The final objective of our work is to solve evolutionary computation design as an algorithmic mechanism design problem and establish its fundamental aspect by taking this perspective. This paper is the first step towards achieving this objective by implementing a strategy equilibrium solution (such as Nash equilibrium) in evolutionary computation algorithm. PMID:26257777
Ensemble algorithms in reinforcement learning.
Wiering, Marco A; van Hasselt, Hado
2008-08-01
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms. We designed and implemented four different ensemble methods combining the following five different RL algorithms: Q-learning, Sarsa, actor-critic (AC), QV-learning, and AC learning automaton. The intuitively designed ensemble methods, namely, majority voting (MV), rank voting, Boltzmann multiplication (BM), and Boltzmann addition, combine the policies derived from the value functions of the different RL algorithms, in contrast to previous work where ensemble methods have been used in RL for representing and learning a single value function. We show experiments on five maze problems of varying complexity; the first problem is simple, but the other four maze tasks are of a dynamic or partially observable nature. The results indicate that the BM and MV ensembles significantly outperform the single RL algorithms.
SDR Input Power Estimation Algorithms
NASA Technical Reports Server (NTRS)
Nappier, Jennifer M.; Briones, Janette C.
2013-01-01
The General Dynamics (GD) S-Band software defined radio (SDR) in the Space Communications and Navigation (SCAN) Testbed on the International Space Station (ISS) provides experimenters an opportunity to develop and demonstrate experimental waveforms in space. The SDR has an analog and a digital automatic gain control (AGC) and the response of the AGCs to changes in SDR input power and temperature was characterized prior to the launch and installation of the SCAN Testbed on the ISS. The AGCs were used to estimate the SDR input power and SNR of the received signal and the characterization results showed a nonlinear response to SDR input power and temperature. In order to estimate the SDR input from the AGCs, three algorithms were developed and implemented on the ground software of the SCAN Testbed. The algorithms include a linear straight line estimator, which used the digital AGC and the temperature to estimate the SDR input power over a narrower section of the SDR input power range. There is a linear adaptive filter algorithm that uses both AGCs and the temperature to estimate the SDR input power over a wide input power range. Finally, an algorithm that uses neural networks was designed to estimate the input power over a wide range. This paper describes the algorithms in detail and their associated performance in estimating the SDR input power.
Ensemble algorithms in reinforcement learning.
Wiering, Marco A; van Hasselt, Hado
2008-08-01
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms. We designed and implemented four different ensemble methods combining the following five different RL algorithms: Q-learning, Sarsa, actor-critic (AC), QV-learning, and AC learning automaton. The intuitively designed ensemble methods, namely, majority voting (MV), rank voting, Boltzmann multiplication (BM), and Boltzmann addition, combine the policies derived from the value functions of the different RL algorithms, in contrast to previous work where ensemble methods have been used in RL for representing and learning a single value function. We show experiments on five maze problems of varying complexity; the first problem is simple, but the other four maze tasks are of a dynamic or partially observable nature. The results indicate that the BM and MV ensembles significantly outperform the single RL algorithms. PMID:18632380
SDR input power estimation algorithms
NASA Astrophysics Data System (ADS)
Briones, J. C.; Nappier, J. M.
The General Dynamics (GD) S-Band software defined radio (SDR) in the Space Communications and Navigation (SCAN) Testbed on the International Space Station (ISS) provides experimenters an opportunity to develop and demonstrate experimental waveforms in space. The SDR has an analog and a digital automatic gain control (AGC) and the response of the AGCs to changes in SDR input power and temperature was characterized prior to the launch and installation of the SCAN Testbed on the ISS. The AGCs were used to estimate the SDR input power and SNR of the received signal and the characterization results showed a nonlinear response to SDR input power and temperature. In order to estimate the SDR input from the AGCs, three algorithms were developed and implemented on the ground software of the SCAN Testbed. The algorithms include a linear straight line estimator, which used the digital AGC and the temperature to estimate the SDR input power over a narrower section of the SDR input power range. There is a linear adaptive filter algorithm that uses both AGCs and the temperature to estimate the SDR input power over a wide input power range. Finally, an algorithm that uses neural networks was designed to estimate the input power over a wide range. This paper describes the algorithms in detail and their associated performance in estimating the SDR input power.
Bull, G; Maffetone, M A; Miller, S K
1992-01-01
Total quality management (TQM) is an organized, systematic approach to problem solving and continuous improvement. American corporations have found that TQM is an excellent way to improve competitiveness, lower operating costs, and improve productivity. Increasing numbers of laboratories are investigating the benefits of TQM. For this month's column, we asked our respondents: What steps has your laboratory taken to implement TQM?
NASA Technical Reports Server (NTRS)
Vo, San C.; Biegel, Bryan (Technical Monitor)
2001-01-01
Scalar multiplication is an essential operation in elliptic curve cryptosystems because its implementation determines the speed and the memory storage requirements. This paper discusses some improvements on two popular signed window algorithms for implementing scalar multiplications of an elliptic curve point - Morain-Olivos's algorithm and Koyarna-Tsuruoka's algorithm.
Accelerated Modular Multiplication Algorithm of Large Word Length Numbers with a Fixed Module
NASA Astrophysics Data System (ADS)
Bardis, N. G.; Drigas, A.; Markovskyy, A. P.; Vrettaros, I.
A new algorithm is proposed for the software implementation of modular multiplication, which uses pre-computations with a constant module. The developed modular multiplication algorithm provides high performance in comparison with the already known algorithms, and is oriented at the variable value of the module, especially with the software implementation on micro controllers and smart cards with a small number of bits.
Conjugate gradient algorithms using multiple recursions
Barth, T.; Manteuffel, T.
1996-12-31
Much is already known about when a conjugate gradient method can be implemented with short recursions for the direction vectors. The work done in 1984 by Faber and Manteuffel gave necessary and sufficient conditions on the iteration matrix A, in order for a conjugate gradient method to be implemented with a single recursion of a certain form. However, this form does not take into account all possible recursions. This became evident when Jagels and Reichel used an algorithm of Gragg for unitary matrices to demonstrate that the class of matrices for which a practical conjugate gradient algorithm exists can be extended to include unitary and shifted unitary matrices. The implementation uses short double recursions for the direction vectors. This motivates the study of multiple recursion algorithms.
Adaptive cuckoo search algorithm for unconstrained optimization.
Ong, Pauline
2014-01-01
Modification of the intensification and diversification approaches in the recently developed cuckoo search algorithm (CSA) is performed. The alteration involves the implementation of adaptive step size adjustment strategy, and thus enabling faster convergence to the global optimal solutions. The feasibility of the proposed algorithm is validated against benchmark optimization functions, where the obtained results demonstrate a marked improvement over the standard CSA, in all the cases. PMID:25298971
Introduction to systolic algorithms and architectures
Bentley, J.L.; Kung, H.T.
1983-01-01
The authors survey the class of systolic special-purpose computer architectures and algorithms, which are particularly well-suited for implementation in very large scale integrated circuitry (VLSI). They give a brief introduction to systolic arrays for a reader with a broad technical background and some experience in using a computer, but who is not necessarily a computer scientist. In addition they briefly survey the technological advances in VLSI that led to the development of systolic algorithms and architectures. 38 references.
Adaptive sensor fusion using genetic algorithms
Fitzgerald, D.S.; Adams, D.G.
1994-08-01
Past attempts at sensor fusion have used some form of Boolean logic to combine the sensor information. As an alteniative, an adaptive ``fuzzy`` sensor fusion technique is described in this paper. This technique exploits the robust capabilities of fuzzy logic in the decision process as well as the optimization features of the genetic algorithm. This paper presents a brief background on fuzzy logic and genetic algorithms and how they are used in an online implementation of adaptive sensor fusion.
Adaptive cuckoo search algorithm for unconstrained optimization.
Ong, Pauline
2014-01-01
Modification of the intensification and diversification approaches in the recently developed cuckoo search algorithm (CSA) is performed. The alteration involves the implementation of adaptive step size adjustment strategy, and thus enabling faster convergence to the global optimal solutions. The feasibility of the proposed algorithm is validated against benchmark optimization functions, where the obtained results demonstrate a marked improvement over the standard CSA, in all the cases.
FPGA-based Klystron linearization implementations in scope of ILC
Omet, M.; Michizono, S.; Matsumoto, T.; Miura, T.; Qiu, F.; Chase, B.; Varghese, P.; Schlarb, H.; Branlard, J.; Cichalewski, W.
2015-01-23
We report the development and implementation of four FPGA-based predistortion-type klystron linearization algorithms. Klystron linearization is essential for the realization of ILC, since it is required to operate the klystrons 7% in power below their saturation. The work presented was performed in international collaborations at the Fermi National Accelerator Laboratory (FNAL), USA and the Deutsches Elektronen Synchrotron (DESY), Germany. With the newly developed algorithms, the generation of correction factors on the FPGA was improved compared to past algorithms, avoiding quantization and decreasing memory requirements. At FNAL, three algorithms were tested at the Advanced Superconducting Test Accelerator (ASTA), demonstrating a successfulmore » implementation for one algorithm and a proof of principle for two algorithms. Furthermore, the functionality of the algorithm implemented at DESY was demonstrated successfully in a simulation.« less
FPGA-based Klystron linearization implementations in scope of ILC
Omet, M.; Michizono, S.; Varghese, P.; Schlarb, H.; Branlard, J.; Cichalewski, W.
2015-01-23
We report the development and implementation of four FPGA-based predistortion-type klystron linearization algorithms. Klystron linearization is essential for the realization of ILC, since it is required to operate the klystrons 7% in power below their saturation. The work presented was performed in international collaborations at the Fermi National Accelerator Laboratory (FNAL), USA and the Deutsches Elektronen Synchrotron (DESY), Germany. With the newly developed algorithms, the generation of correction factors on the FPGA was improved compared to past algorithms, avoiding quantization and decreasing memory requirements. At FNAL, three algorithms were tested at the Advanced Superconducting Test Accelerator (ASTA), demonstrating a successful implementation for one algorithm and a proof of principle for two algorithms. Furthermore, the functionality of the algorithm implemented at DESY was demonstrated successfully in a simulation.
Library of Continuation Algorithms
2005-03-01
LOCA (Library of Continuation Algorithms) is scientific software written in C++ that provides advanced analysis tools for nonlinear systems. In particular, it provides parameter continuation algorithms. bifurcation tracking algorithms, and drivers for linear stability analysis. The algorithms are aimed at large-scale applications that use Newtons method for their nonlinear solve.
Recursive Implementations of the Consider Filter
NASA Technical Reports Server (NTRS)
Zanetti, Renato; DSouza, Chris
2012-01-01
One method to account for parameters errors in the Kalman filter is to consider their effect in the so-called Schmidt-Kalman filter. This work addresses issues that arise when implementing a consider Kalman filter as a real-time, recursive algorithm. A favorite implementation of the Kalman filter as an onboard navigation subsystem is the UDU formulation. A new way to implement a UDU consider filter is proposed. The non-optimality of the recursive consider filter is also analyzed, and a modified algorithm is proposed to overcome this limitation.
Efficient graph algorithms for sequential and parallel computers. Doctoral thesis
Goldberg, A.V.
1987-02-01
This thesis studies graph algorithms, both in sequential and parallel contexts. In the outline of the thesis, algorithm complexities are stated in terms of the the number of vertices n, the number of edges m, the largest absolute value of capacities U, and the largest absolute value of costs C. Chapter 1 introduces a new approach to the maximum flow problem that leads to better algorithms for the problem. Chapter 2 is devoted to the minimum cost flow problem, which is a generalization of the maximum flow problem. Chapter 3 addresses implementation of parallel algorithms through a case study of an implementation of a parallel maximum flow algorithm. Parallel prefix operations play an important role in the implementation. Present experimental results achieved by the implementation are presented. Present parallel symmetry-breaking techniques are the main topic of Chapter 4.
Recursive Branching Simulated Annealing Algorithm
NASA Technical Reports Server (NTRS)
Bolcar, Matthew; Smith, J. Scott; Aronstein, David
2012-01-01
solution, and the region from which new configurations can be selected shrinks as the search continues. The key difference between these algorithms is that in the SA algorithm, a single path, or trajectory, is taken in parameter space, from the starting point to the globally optimal solution, while in the RBSA algorithm, many trajectories are taken; by exploring multiple regions of the parameter space simultaneously, the algorithm has been shown to converge on the globally optimal solution about an order of magnitude faster than when using conventional algorithms. Novel features of the RBSA algorithm include: 1. More efficient searching of the parameter space due to the branching structure, in which multiple random configurations are generated and multiple promising regions of the parameter space are explored; 2. The implementation of a trust region for each parameter in the parameter space, which provides a natural way of enforcing upper- and lower-bound constraints on the parameters; and 3. The optional use of a constrained gradient- search optimization, performed on the continuous variables around each branch s configuration in parameter space to improve search efficiency by allowing for fast fine-tuning of the continuous variables within the trust region at that configuration point.
Operational algorithm development and refinement approaches
NASA Astrophysics Data System (ADS)
Ardanuy, Philip E.
2003-11-01
Next-generation polar and geostationary systems, such as the National Polar-orbiting Operational Environmental Satellite System (NPOESS) and the Geostationary Operational Environmental Satellite (GOES)-R, will deploy new generations of electro-optical reflective and emissive capabilities. These will include low-radiometric-noise, improved spatial resolution multi-spectral and hyperspectral imagers and sounders. To achieve specified performances (e.g., measurement accuracy, precision, uncertainty, and stability), and best utilize the advanced space-borne sensing capabilities, a new generation of retrieval algorithms will be implemented. In most cases, these advanced algorithms benefit from ongoing testing and validation using heritage research mission algorithms and data [e.g., the Earth Observing System (EOS)] Moderate-resolution Imaging Spectroradiometer (MODIS) and Shuttle Ozone Limb Scattering Experiment (SOLSE)/Limb Ozone Retreival Experiment (LORE). In these instances, an algorithm's theoretical basis is not static, but rather improves with time. Once frozen, an operational algorithm can "lose ground" relative to research analogs. Cost/benefit analyses provide a basis for change management. The challenge is in reconciling and balancing the stability, and "comfort," that today"s generation of operational platforms provide (well-characterized, known, sensors and algorithms) with the greatly improved quality, opportunities, and risks, that the next generation of operational sensors and algorithms offer. By using the best practices and lessons learned from heritage/groundbreaking activities, it is possible to implement an agile process that enables change, while managing change. This approach combines a "known-risk" frozen baseline with preset completion schedules with insertion opportunities for algorithm advances as ongoing validation activities identify and repair areas of weak performance. This paper describes an objective, adaptive implementation roadmap that
NASA Technical Reports Server (NTRS)
1983-01-01
Meeting the identified needs of Earth science requires approaching EOS as an information system and not simply as one or more satellites with instruments. Six elements of strategy are outlined as follows: implementation of the individual discipline missions as currently planned; use of sustained observational capabilities offered by operational satellites without waiting for the launch of new mission; put first priority on the data system; deploy an Advanced Data Collection and Location System; put a substantial new observing capability in a low Earth orbit in such a way as to provide for sustained measurements; and group instruments to exploit their capabilities for synergism; maximize the scientific utility of the mission; and minimize the costs of implementation where possible.
J.A. Schmidt
2002-02-20
If a fusion DEMO reactor can be brought into operation during the first half of this century, fusion power production can have a significant impact on carbon dioxide production during the latter half of the century. An assessment of fusion implementation scenarios shows that the resource demands and waste production associated with these scenarios are manageable factors. If fusion is implemented during the latter half of this century it will be one element of a portfolio of (hopefully) carbon dioxide limiting sources of electrical power. It is time to assess the regional implications of fusion power implementation. An important attribute of fusion power is the wide range of possible regions of the country, or countries in the world, where power plants can be located. Unlike most renewable energy options, fusion energy will function within a local distribution system and not require costly, and difficult, long distance transmission systems. For example, the East Coast of the United States is a prime candidate for fusion power deployment by virtue of its distance from renewable energy sources. As fossil fuels become less and less available as an energy option, the transmission of energy across bodies of water will become very expensive. On a global scale, fusion power will be particularly attractive for regions separated from sources of renewable energy by oceans.
Intelligent perturbation algorithms for space scheduling optimization
NASA Technical Reports Server (NTRS)
Kurtzman, Clifford R.
1991-01-01
Intelligent perturbation algorithms for space scheduling optimization are presented in the form of the viewgraphs. The following subject areas are covered: optimization of planning, scheduling, and manifesting; searching a discrete configuration space; heuristic algorithms used for optimization; use of heuristic methods on a sample scheduling problem; intelligent perturbation algorithms are iterative refinement techniques; properties of a good iterative search operator; dispatching examples of intelligent perturbation algorithm and perturbation operator attributes; scheduling implementations using intelligent perturbation algorithms; major advances in scheduling capabilities; the prototype ISF (industrial Space Facility) experiment scheduler; optimized schedule (max revenue); multi-variable optimization; Space Station design reference mission scheduling; ISF-TDRSS command scheduling demonstration; and example task - communications check.
Algorithmic Perspectives on Problem Formulations in MDO
NASA Technical Reports Server (NTRS)
Alexandrov, Natalia M.; Lewis, Robert Michael
2000-01-01
This work is concerned with an approach to formulating the multidisciplinary optimization (MDO) problem that reflects an algorithmic perspective on MDO problem solution. The algorithmic perspective focuses on formulating the problem in light of the abilities and inabilities of optimization algorithms, so that the resulting nonlinear programming problem can be solved reliably and efficiently by conventional optimization techniques. We propose a modular approach to formulating MDO problems that takes advantage of the problem structure, maximizes the autonomy of implementation, and allows for multiple easily interchangeable problem statements to be used depending on the available resources and the characteristics of the application problem.
Complexity of the Quantum Adiabatic Algorithm
NASA Technical Reports Server (NTRS)
Hen, Itay
2013-01-01
The Quantum Adiabatic Algorithm (QAA) has been proposed as a mechanism for efficiently solving optimization problems on a quantum computer. Since adiabatic computation is analog in nature and does not require the design and use of quantum gates, it can be thought of as a simpler and perhaps more profound method for performing quantum computations that might also be easier to implement experimentally. While these features have generated substantial research in QAA, to date there is still a lack of solid evidence that the algorithm can outperform classical optimization algorithms.
Applying a Genetic Algorithm to Reconfigurable Hardware
NASA Technical Reports Server (NTRS)
Wells, B. Earl; Weir, John; Trevino, Luis; Patrick, Clint; Steincamp, Jim
2004-01-01
This paper investigates the feasibility of applying genetic algorithms to solve optimization problems that are implemented entirely in reconfgurable hardware. The paper highlights the pe$ormance/design space trade-offs that must be understood to effectively implement a standard genetic algorithm within a modem Field Programmable Gate Array, FPGA, reconfgurable hardware environment and presents a case-study where this stochastic search technique is applied to standard test-case problems taken from the technical literature. In this research, the targeted FPGA-based platform and high-level design environment was the Starbridge Hypercomputing platform, which incorporates multiple Xilinx Virtex II FPGAs, and the Viva TM graphical hardware description language.
Steele, Elliot M; Steele, Derek S
2014-02-01
Previous studies have used analysis of Ca(2+) sparks extensively to investigate both normal and pathological Ca(2+) regulation in cardiac myocytes. The great majority of these studies used line-scan confocal imaging. In part, this is because the development of open-source software for automatic detection of Ca(2+) sparks in line-scan images has greatly simplified data analysis. A disadvantage of line-scan imaging is that data are collected from a single row of pixels, representing only a small fraction of the cell, and in many instances x-y confocal imaging is preferable. However, the limited availability of software for Ca(2+) spark analysis in two-dimensional x-y image stacks presents an obstacle to its wider application. This study describes the development and characterization of software to enable automatic detection and analysis of Ca(2+) sparks within x-y image stacks, implemented as a plugin within the open-source image analysis platform ImageJ. The program includes methods to enable precise identification of cells within confocal fluorescence images, compensation for changes in background fluorescence, and options that allow exclusion of events based on spatial characteristics.
Cordic based algorithms for software defined radio (SDR) baseband processing
NASA Astrophysics Data System (ADS)
Heyne, B.; Götze, J.
2006-09-01
This paper presents two Cordic based algorithms which may be used for digital baseband processing in OFDM and/or CDMA based communication systems. The first one is a linear least squares based multiuser detector for CDMA incorporating descrambling and despreading. The second algorithm is a pure Cordic based FFT implementation. Both algorithms can be implemented using solely Cordic based architectures (e.g. coprocessors or ASIPs). The algorithms exactly fit the needs of a multistandard terminal as they both are freely parameterizable. This regards to the accuracy of the results as well as to the parameters of the performed function (e.g. size of the FFT).
The hierarchical algorithms--theory and applications
NASA Astrophysics Data System (ADS)
Su, Zheng-Yao
Monte Carlo simulations are one of the most important numerical techniques for investigating statistical physical systems. Among these systems, spin models are a typical example which also play an essential role in constructing the abstract mechanism for various complex systems. Unfortunately, traditional Monte Carlo algorithms are afflicted with "critical slowing down" near continuous phase transitions and the efficiency of the Monte Carlo simulation goes to zero as the size of the lattice is increased. To combat critical slowing down, a very different type of collective-mode algorithm, in contrast to the traditional single-spin-flipmode, was proposed by Swendsen and Wang in 1987 for Potts spin models. Since then, there has been an explosion of work attempting to understand, improve, or generalize it. In these so-called "cluster" algorithms, clusters of spin are regarded as one template and are updated at each step of the Monte Carlo procedure. In implementing these algorithms the cluster labeling is a major time-consuming bottleneck and is also isomorphic to the problem of computing connected components of an undirected graph seen in other application areas, such as pattern recognition.A number of cluster labeling algorithms for sequential computers have long existed. However, the dynamic irregular nature of clusters complicates the task of finding good parallel algorithms and this is particularly true on SIMD (single-instruction-multiple-data machines. Our design of the Hierarchical Cluster Labeling Algorithm aims at alleviating this problem by building a hierarchical structure on the problem domain and by incorporating local and nonlocal communication schemes. We present an estimate for the computational complexity of cluster labeling and prove the key features of this algorithm (such as lower computational complexity, data locality, and easy implementation) compared with the methods formerly known. In particular, this algorithm can be viewed as a generalized
IRST system: hardware implementation issues
NASA Astrophysics Data System (ADS)
Deshpande, Suyog D.; Chan, Philip; Ser, W.; Venkateswarlu, Ronda
1999-07-01
Generally, Infrared Search and Track systems use linear focal-plane-arrays with time-delay and integration, because of their high sensitivity. However, the readout is a cumbersome process and needs special effort. This paper describes signal processing and hardware (HW) implementation issues related to front-end electronics, non-uniformity compensation, signal formatting, target detection, tracking and display system. This paper proposes parallel pipeline architecture with dedicated HW for computationally intensive algorithms and SW intensive DSP HW for reconfigurable architecture.
Implementation of multigrid methods for solving Navier-Stokes equations on a multiprocessor system
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Taasan, Shlomo
1987-01-01
Presented are schemes for implementing multigrid algorithms on message based MIMD multiprocessor systems. To address the various issues involved, a nontrivial problem of solving the 2-D incompressible Navier-Stokes equations is considered as the model problem. Three different multigrid algorithms are considered. Results from implementing these algorithms on an Intel iPSC are presented.
Petri nets SM-cover-based on heuristic coloring algorithm
NASA Astrophysics Data System (ADS)
Tkacz, Jacek; Doligalski, Michał
2015-09-01
In the paper, coloring heuristic algorithm of interpreted Petri nets is presented. Coloring is used to determine the State Machines (SM) subnets. The present algorithm reduces the Petri net in order to reduce the computational complexity and finds one of its possible State Machines cover. The proposed algorithm uses elements of interpretation of Petri nets. The obtained result may not be the best, but it is sufficient for use in rapid prototyping of logic controllers. Found SM-cover will be also used in the development of algorithms for decomposition, and modular synthesis and implementation of parallel logic controllers. Correctness developed heuristic algorithm was verified using Gentzen formal reasoning system.
Efficient algorithms for the laboratory discovery of optimal quantum controls.
Turinici, Gabriel; Le Bris, Claude; Rabitz, Herschel
2004-01-01
The laboratory closed-loop optimal control of quantum phenomena, expressed as minimizing a suitable cost functional, is currently implemented through an optimization algorithm coupled to the experimental apparatus. In practice, the most commonly used search algorithms are variants of genetic algorithms. As an alternative choice, a direct search deterministic algorithm is proposed in this paper. For the simple simulations studied here, it outperforms the existing approaches. An additional algorithm is introduced in order to reveal some properties of the cost functional landscape. PMID:15324201
Fast proximity algorithm for MAP ECT reconstruction
NASA Astrophysics Data System (ADS)
Li, Si; Krol, Andrzej; Shen, Lixin; Xu, Yuesheng
2012-03-01
We arrived at the fixed-point formulation of the total variation maximum a posteriori (MAP) regularized emission computed tomography (ECT) reconstruction problem and we proposed an iterative alternating scheme to numerically calculate the fixed point. We theoretically proved that our algorithm converges to unique solutions. Because the obtained algorithm exhibits slow convergence speed, we further developed the proximity algorithm in the transformed image space, i.e. the preconditioned proximity algorithm. We used the bias-noise curve method to select optimal regularization hyperparameters for both our algorithm and expectation maximization with total variation regularization (EM-TV). We showed in the numerical experiments that our proposed algorithms, with an appropriately selected preconditioner, outperformed conventional EM-TV algorithm in many critical aspects, such as comparatively very low noise and bias for Shepp-Logan phantom. This has major ramification for nuclear medicine because clinical implementation of our preconditioned fixed-point algorithms might result in very significant radiation dose reduction in the medical applications of emission tomography.
QPSO-based adaptive DNA computing algorithm.
Karakose, Mehmet; Cigdem, Ugur
2013-01-01
DNA (deoxyribonucleic acid) computing that is a new computation model based on DNA molecules for information storage has been increasingly used for optimization and data analysis in recent years. However, DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improvement of DNA computing is proposed. This new approach aims to perform DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). Some contributions provided by the proposed QPSO based on adaptive DNA computing algorithm are as follows: (1) parameters of population size, crossover rate, maximum number of operations, enzyme and virus mutation rate, and fitness function of DNA computing algorithm are simultaneously tuned for adaptive process, (2) adaptive algorithm is performed using QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) numerical realization of DNA computing algorithm with proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate ability to provide effective optimization, considerable convergence speed, and high accuracy according to DNA computing algorithm.
A novel algorithm for Bluetooth ECG.
Pandya, Utpal T; Desai, Uday B
2012-11-01
In wireless transmission of ECG, data latency will be significant when battery power level and data transmission distance are not maintained. In applications like home monitoring or personalized care, to overcome the joint effect of previous issues of wireless transmission and other ECG measurement noises, a novel filtering strategy is required. Here, a novel algorithm, identified as peak rejection adaptive sampling modified moving average (PRASMMA) algorithm for wireless ECG is introduced. This algorithm first removes error in bit pattern of received data if occurred in wireless transmission and then removes baseline drift. Afterward, a modified moving average is implemented except in the region of each QRS complexes. The algorithm also sets its filtering parameters according to different sampling rate selected for acquisition of signals. To demonstrate the work, a prototyped Bluetooth-based ECG module is used to capture ECG with different sampling rate and in different position of patient. This module transmits ECG wirelessly to Bluetooth-enabled devices where the PRASMMA algorithm is applied on captured ECG. The performance of PRASMMA algorithm is compared with moving average and S-Golay algorithms visually as well as numerically. The results show that the PRASMMA algorithm can significantly improve the ECG reconstruction by efficiently removing the noise and its use can be extended to any parameters where peaks are importance for diagnostic purpose.
Digital signal processing algorithms for automatic voice recognition
NASA Technical Reports Server (NTRS)
Botros, Nazeih M.
1987-01-01
The current digital signal analysis algorithms are investigated that are implemented in automatic voice recognition algorithms. Automatic voice recognition means, the capability of a computer to recognize and interact with verbal commands. The digital signal is focused on, rather than the linguistic, analysis of speech signal. Several digital signal processing algorithms are available for voice recognition. Some of these algorithms are: Linear Predictive Coding (LPC), Short-time Fourier Analysis, and Cepstrum Analysis. Among these algorithms, the LPC is the most widely used. This algorithm has short execution time and do not require large memory storage. However, it has several limitations due to the assumptions used to develop it. The other 2 algorithms are frequency domain algorithms with not many assumptions, but they are not widely implemented or investigated. However, with the recent advances in the digital technology, namely signal processors, these 2 frequency domain algorithms may be investigated in order to implement them in voice recognition. This research is concerned with real time, microprocessor based recognition algorithms.
Algorithm-development activities
NASA Technical Reports Server (NTRS)
Carder, Kendall L.
1994-01-01
The task of algorithm-development activities at USF continues. The algorithm for determining chlorophyll alpha concentration, (Chl alpha) and gelbstoff absorption coefficient for SeaWiFS and MODIS-N radiance data is our current priority.
Accurate Finite Difference Algorithms
NASA Technical Reports Server (NTRS)
Goodrich, John W.
1996-01-01
Two families of finite difference algorithms for computational aeroacoustics are presented and compared. All of the algorithms are single step explicit methods, they have the same order of accuracy in both space and time, with examples up to eleventh order, and they have multidimensional extensions. One of the algorithm families has spectral like high resolution. Propagation with high order and high resolution algorithms can produce accurate results after O(10(exp 6)) periods of propagation with eight grid points per wavelength.
Gamma-ray Spectral Analysis Algorithm Library
1997-09-25
The routines of the Gauss Algorithm library are used to implement special purpose products that need to analyze gamma-ray spectra from GE semiconductor detectors as a part of their function. These routines provide the ability to calibrate energy, calibrate peakwidth, search for peaks, search for regions, and fit the spectral data in a given region to locate gamma rays.
Gamma-ray spectral analysis algorithm library
Egger, A. E.
2013-05-06
The routines of the Gauss Algorithms library are used to implement special purpose products that need to analyze gamma-ray spectra from Ge semiconductor detectors as a part of their function. These routines provide the ability to calibrate energy, calibrate peakwidth, search for peaks, search for regions, and fit the spectral data in a given region to locate gamma rays.
Algorithm Reveals Sinusoidal Component Of Noisy Signal
NASA Technical Reports Server (NTRS)
Kwok, Lloyd C.
1991-01-01
Algorithm performs simple statistical analysis of noisy signal to yield preliminary indication of whether or not signal contains sinusoidal component. Suitable for preprocessing or preliminary analysis of vibrations, fluctuations in pressure, and other signals that include large random components. Implemented on personal computer by easy-to-use program.
MULTIOBJECTIVE PARALLEL GENETIC ALGORITHM FOR WASTE MINIMIZATION
In this research we have developed an efficient multiobjective parallel genetic algorithm (MOPGA) for waste minimization problems. This MOPGA integrates PGAPack (Levine, 1996) and NSGA-II (Deb, 2000) with novel modifications. PGAPack is a master-slave parallel implementation of a...
Algorithm for fixed-range optimal trajectories
NASA Technical Reports Server (NTRS)
Lee, H. Q.; Erzberger, H.
1980-01-01
An algorithm for synthesizing optimal aircraft trajectories for specified range was developed and implemented in a computer program written in FORTRAN IV. The algorithm, its computer implementation, and a set of example optimum trajectories for the Boeing 727-100 aircraft are described. The algorithm optimizes trajectories with respect to a cost function that is the weighted sum of fuel cost and time cost. The optimum trajectory consists at most of a three segments: climb, cruise, and descent. The climb and descent profiles are generated by integrating a simplified set of kinematic and dynamic equations wherein the total energy of the aircraft is the independent or time like variable. At each energy level the optimum airspeeds and thrust settings are obtained as the values that minimize the variational Hamiltonian. Although the emphasis is on an off-line, open-loop computation, eventually the most important application will be in an on-board flight management system.
Experiments with conjugate gradient algorithms for homotopy curve tracking
NASA Technical Reports Server (NTRS)
Irani, Kashmira M.; Ribbens, Calvin J.; Watson, Layne T.; Kamat, Manohar P.; Walker, Homer F.
1991-01-01
There are algorithms for finding zeros or fixed points of nonlinear systems of equations that are globally convergent for almost all starting points, i.e., with probability one. The essence of all such algorithms is the construction of an appropriate homotopy map and then tracking some smooth curve in the zero set of this homotopy map. HOMPACK is a mathematical software package implementing globally convergent homotopy algorithms with three different techniques for tracking a homotopy zero curve, and has separate routines for dense and sparse Jacobian matrices. The HOMPACK algorithms for sparse Jacobian matrices use a preconditioned conjugate gradient algorithm for the computation of the kernel of the homotopy Jacobian matrix, a required linear algebra step for homotopy curve tracking. Here, variants of the conjugate gradient algorithm are implemented in the context of homotopy curve tracking and compared with Craig's preconditioned conjugate gradient method used in HOMPACK. The test problems used include actual large scale, sparse structural mechanics problems.
Workshop on algorithms for macromolecular modeling. Final project report, June 1, 1994--May 31, 1995
Leimkuhler, B.; Hermans, J.; Skeel, R.D.
1995-07-01
A workshop was held on algorithms and parallel implementations for macromolecular dynamics, protein folding, and structural refinement. This document contains abstracts and brief reports from that workshop.